[Misc] Add CRaC support #869
Merged
Summary: The CRaC Project researches coordination of Java programs with mechanisms to checkpoint (make an image of, snapshot) a Java instance while it is executing. Restoring from the image could be a solution to some of the problems with start-up and warm-up times. The project website is at https://openjdk.org/projects/crac/. Testing: All testcases in jdk/jdk/crac. These testcases require root privileges and an OS kernel version >= 4.19. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
Summary: 1. RMI TCPSocket does not support C/R (Checkpoint/Restore). 2. The cppath file was written after forking the criu process, but criu kills the JVM, so sometimes there is no chance to write cppath, even though cppath is mandatory for restore. 3. Comparing the output of System.nanoTime() is meaningful only within the same process. Testing: All testcases in jdk/jdk/crac. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
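Point 3 of the commit above can be illustrated with a small sketch. It is not code from this PR; the class and method names (`ElapsedTimeSketch`, `elapsedNanos`, `elapsedWallClock`) are hypothetical. It assumes the usual reading of the `System.nanoTime()` contract: differences between two readings are only meaningful when both come from the same JVM instance, so timestamps that must be compared across processes should come from the wall clock instead.

```java
import java.time.Duration;
import java.time.Instant;

public class ElapsedTimeSketch {
    /** Elapsed time inside one JVM process: the difference of two nanoTime readings. */
    public static long elapsedNanos(long startNanos, long endNanos) {
        // Subtracting (rather than comparing with <) stays correct on numeric overflow.
        return endNanos - startNanos;
    }

    /** Elapsed time across processes: wall-clock timestamps share a common origin. */
    public static Duration elapsedWallClock(Instant before, Instant after) {
        return Duration.between(before, after);
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        Instant wallStart = Instant.now();
        Thread.sleep(10);
        System.out.println("in-process nanos: " + elapsedNanos(start, System.nanoTime()));
        System.out.println("wall-clock: " + elapsedWallClock(wallStart, Instant.now()));
    }
}
```

A test that records a timestamp in one process and reads it in another would use the second helper; the first is only safe when both readings are taken in the same process.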
…ses lazily. Summary: Most of the CRaC testcases failed when run with the latest criu on Linux kernel 5.10. The first fix is to write to a file instead of stdout and stderr, because stdout and stderr depend on a pipe or tty, and criu cannot checkpoint and restore pipes and ttys correctly. The second fix is to use 'docker run' instead of 'docker exec' to run the Java application in Docker. Checkpointing an application that was launched by 'docker exec' fails with the error "Can't lookup mount=24 for fd=0 path=/dev/null". Testing: All testcases in jdk/jdk/crac. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
… before restore Summary: If validation fails before restore, the JVM falls back to normal startup. Testing: All testcases in jdk/jdk/crac. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
Summary: The interfaces in ProcessTools.java have changed. Testing: test/jdk/jdk/crac/MinimizeLoadedClass.java Reviewers: lingjun.cg, yulei.lx Issue: dragonwell-project#867
…hread to blocked when write data to client. Summary: AttachListener writes data to the client while the current thread's state is still "in VM", so other threads cannot enter a safepoint. Testing: runtime/SharedArchiveFile/DumpSymbolAndStringTable.java Reviewers: denghui.ddh, lei.yul Issue: dragonwell-project#867
Summary: To run CRaC on Flink successfully, add these features: pseudo-persistent files, running in unprivileged mode, and appending to the app class loader's classpath. Also remove the RMI Transport CRaC callback implementation, which is not solid. Testing: All crac testcases. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
Summary: Add a new option, CRaCRestoreInheritPipeFds, that specifies the pipe fds to restore. Restoring the stdout and stderr pipes is important when running in a container, because the container runtime reads logs from these pipes. Testing: All crac testcases. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
Summary: 1. Add to the set that cannot run concurrently. 2. Remove the unused TCPTransportTest.java testcase. Testing: All crac testcases. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
Summary: The root cause of the restore failure is the difference in base OS, not the kernel version. Testing: All crac testcases. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
…e/Inflator/Deflator Summary: The finalize methods of ZipFile, Inflater and Deflater have been deprecated for removal since JDK 9. They should be removed in JDK 12 as planned. Testing: test/jdk/java/util/zip/ZipFile/TestCleaner.java Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
Summary: JMX caches the local host name in sun.rmi.transport.tcp.TCPEndpoint; it should be resampled in afterRestore. Testing: All crac testcases. Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
Summary: The crac debug log format changed. Testing: jdk/crac/LazyProps.java Reviewers: yansendao.ysd, denghui.ddh Issue: dragonwell-project#867
Summary: There is no guarantee that the CRaC image dir exists before the WatchService is registered, so check that the image dir exists, with a proper timeout. Testing: jdk/crac/recursiveCheckpoint/Test.java Reviewers: yansendao.ysd, yueshi.zwj Issue: dragonwell-project#867
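The wait-then-register pattern described in this commit might look roughly like the sketch below. It is an illustration only, not the test code from the PR: the class name `ImageDirWatchSketch`, both method names, and the 50 ms polling interval are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchService;
import java.time.Duration;

public class ImageDirWatchSketch {
    /** Polls until dir exists as a directory; returns false if the timeout expires first. */
    public static boolean waitForDirectory(Path dir, Duration timeout) throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!Files.isDirectory(dir)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before the directory appeared
            }
            Thread.sleep(50); // polling interval, chosen arbitrarily for this sketch
        }
        return true;
    }

    /** Registers a WatchService only once the directory is known to exist. */
    public static WatchService watchWhenReady(Path dir, Duration timeout)
            throws IOException, InterruptedException {
        if (!waitForDirectory(dir, timeout)) {
            throw new NoSuchFileException(dir.toString(), null,
                    "directory did not appear within " + timeout);
        }
        WatchService watcher = dir.getFileSystem().newWatchService();
        try {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
        } catch (IOException e) {
            watcher.close(); // do not leak the watcher if registration fails
            throw e;
        }
        return watcher;
    }
}
```

Registering directly would throw if the directory had not been created yet, which is the race the commit describes.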
…ed docker. Summary: stdout and stderr are pipes when running in Docker, and restoring these pipes is rather tricky. The fds are written to a file named pipefds; then, after criu checkpoints successfully, criu executes criuengine as a post-dump callback. In the callback, the pipe info of the Java process is appended to the pipefds file. On restore, pipefds is read and the pipe info is passed to criu as --inherit-fd. The problem is that criuengine cannot get the pipefds file path when running unprivileged. To fix this, set the environment variable CRAC_IMAGE_DIR explicitly when checkpointing. Testing: jdk/jdk/crac/stdoutInDocker/TestStdoutInDocker.sh Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
Summary: If the image dir is a relative path, the criuengine process cannot write to the pipefds file in the image dir, so convert it to a real path before checkpointing. Testing: jdk/jdk/crac/AppendAppClassLoaderTest.java, jdk/jdk/crac/RestorePipeFdTest.java Reviewers: yansendao.ysd, lvfei.lv Issue: dragonwell-project#867
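The change itself lives inside the JVM and is not shown in this conversation. As a rough Java-level analogue of "convert to a real path before checkpointing", a sketch could look like this; `ImageDirPathSketch` and `resolveImageDir` are hypothetical names, and creating the directory first is an assumption made so that `toRealPath()` has something to resolve.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ImageDirPathSketch {
    /**
     * Turns a possibly relative image dir into an absolute, symlink-free path,
     * so that a helper process with a different working directory resolves
     * the same location.
     */
    public static Path resolveImageDir(String imageDir) throws IOException {
        Path dir = Paths.get(imageDir);
        Files.createDirectories(dir); // toRealPath() requires the path to exist
        return dir.toRealPath();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(resolveImageDir("crac-image"));
    }
}
```

A path resolved this way stays valid even when the process that consumes it was started from another directory.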
lingjun-cg force-pushed the crac-init branch 6 times, most recently from d97ddc1 to 65a9b8a (October 11, 2024 06:21)
…d mode when do checkpointing. Summary: Add a new option, CRaCAppendOnlyLogFiles, to configure files that can be ignored when checkpointing; on restore, an empty file is created if the file does not exist. Testing: jdk/crac/AppendOnlyFileTest.java Reviewers: lei.yul, denghui.ddh Issue: dragonwell-project#867
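The restore-side behaviour described here ("create an empty file if not exist") can be pictured with the following sketch. It is not the JVM implementation and says nothing about the option's value syntax; `AppendOnlyLogSketch` and `ensureExists` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AppendOnlyLogSketch {
    /**
     * Makes sure an append-only log file exists before the application resumes.
     * Returns true if an empty file was created, false if one was already there.
     */
    public static boolean ensureExists(Path logFile) throws IOException {
        Path parent = logFile.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        try {
            Files.createFile(logFile); // atomic: fails if the file already exists
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // keep existing content; the file is append-only
        }
    }
}
```

Using `Files.createFile` rather than opening with truncation means an existing log is never overwritten.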
D-D-H approved these changes Oct 15, 2024
yuleil approved these changes Oct 17, 2024
Summary: Add CRaC support
Testing: all CRaC testcases.
Reviewers: lei.yul,denghui.ddh
Issue: #867