-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Support Cluster Snapshot Backup: checkpoint and image backup (part3) #54695
base: main
Are you sure you want to change the base?
Conversation
4d88ab3
to
d723b6f
Compare
7ea70c7
to
5547bbd
Compare
File srcFile = new File(srcPath); | ||
FileUtil.copy(srcFile, fileSystem.getDFSFileSystem(), new Path(destPathUri.getPath()), false, new Configuration()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could use fileSystem.copyFromLocalFile()
directly
https://hadoop.apache.org/docs/r3.4.0/api/org/apache/hadoop/fs/FileSystem.html#copyFromLocalFile-boolean-boolean-org.apache.hadoop.fs.Path-org.apache.hadoop.fs.Path-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that copyFromLocalFile will failed if the src is not a file
LOG.error("Interrupted while copy from local {} to {} ", srcPath, destPath + " " + e.getMessage()); | ||
throw new StarRocksException("Failed to copy " + destPath + " from local " + srcPath + " " + e.getMessage()); | ||
} catch (Exception e) { | ||
LOG.error("Exception while copy from local {} to {} ", srcPath, destPath + " " + e.getMessage()); | ||
throw new StarRocksException("Failed to copy " + destPath + " from local " + srcPath + " " + e.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG.error("Interrupted while copy from local {} to {} ", srcPath, destPath + " " + e.getMessage()); | |
throw new StarRocksException("Failed to copy " + destPath + " from local " + srcPath + " " + e.getMessage()); | |
} catch (Exception e) { | |
LOG.error("Exception while copy from local {} to {} ", srcPath, destPath + " " + e.getMessage()); | |
throw new StarRocksException("Failed to copy " + destPath + " from local " + srcPath + " " + e.getMessage()); | |
LOG.error("Interrupted while copy local {} to {} ", srcPath, destPath, e); | |
throw new StarRocksException("Failed to copy local " + srcPath + " to " + destPath, e); | |
} catch (Exception e) { | |
LOG.error("Exception while copy local {} to {} ", srcPath, destPath, e); | |
throw new StarRocksException("Failed to copy local " + srcPath + " to " + destPath, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@@ -59,6 +59,11 @@ public void copyToLocal(String srcPath, String destPath, Map<String, String> pro | |||
LOG.info("Copied {} to local {}", srcPath, destPath); | |||
} | |||
|
|||
public void copyFromLocal(String srcPath, String destPath, Map<String, String> properties) throws StarRocksException { | |||
fileSystemManager.copyFromLocal(srcPath, destPath, properties); | |||
LOG.info("Copied {} from local {}", destPath, srcPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG.info("Copied {} from local {}", destPath, srcPath); | |
LOG.info("Copied local {} to {}", srcPath, destPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
CheckpointController controller = getCheckpointController(); | ||
if (CheckpointController.clusterSnapshotCheckpointRunning()) { | ||
controller = GlobalStateMgr.getServingState().getClusterSnapshotCheckpointController(); | ||
} else { | ||
controller = getCheckpointController(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line 137 and 141 are duplicated ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@SerializedName(value = "createTime") | ||
private long createTime; | ||
@SerializedName(value = "successTime") | ||
private long successTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make the veriable name same to the corresponding colume name in system table ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@SerializedName(value = "createTime") | ||
private long createTime; | ||
@SerializedName(value = "successTime") | ||
private long successTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same to the above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
Quality Gate failedFailed conditions See analysis details on SonarQube Cloud Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE |
changeRunningState(false); | ||
} | ||
|
||
private boolean selfIsRunning() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about put the waiting code into subClass?
BaseController
/ \
/ \
NormalController SnapshotController
The logic will be more clear.
Self and peer is a little obscure. And it is not reasonable to let the base class perceive the subclass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated, refactor code and introduce exclusive flag for checkpoint controller in base class
…p (part3) Signed-off-by: srlch <[email protected]>
9d7db2f
to
049527d
Compare
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[FE Incremental Coverage Report]❌ fail : 16 / 38 (42.11%) file detail
|
[BE Incremental Coverage Report]❌ fail : 0 / 14 (00.00%) file detail
|
What I'm doing:
This is part3 for supporting cluster snapshot backport for share data mode.
In this pr, we introduce a new checkpoint controller to finish the image backup:
ClusterSnapshotCheckpointController
to control the creation of the imgaefor FE and StarMgr in the same thread with consistent journal ids between two image.
ClusterSnapshotCheckpointController
should be exclusive with the normal checkpoint.This controller will responsible for create image in leader node and upload it into remote storage.
Fixes #53867
#53867
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: