Coordinated Restore at Checkpoint (CRaC) is an innovative JDK project designed to significantly reduce Java applications' startup time. By capturing a fully warmed-up snapshot of a Java process, CRaC enables the launch of one or more JVMs from this checkpoint. This results in faster time to the first transaction and improved overall code execution speed. Several projects, including Quarkus, Micronaut, and Spring, have recognized the potential of CRaC and are actively working on incorporating it into their frameworks to achieve lightning-fast application startup times, which will have a significant impact on how we run our applications on powerful (cloud) servers.
CRaC is the culmination of multiple client projects by Azul in the automotive and IoT industries. Infotainment systems, gateways, and similar use cases require ultra-fast startup while running on embedded ARM32 and ARM64 systems. Let's explore CRaC's applicability and impact on embedded devices with the Raspberry Pi, an inexpensive ARM-based board that makes an ideal playground. While testing and documenting this process, I learned much about how Java starts an application, compiles the code, and needs time to "warm up." It also gave me insight into how the OpenJDK project is organized and what information you can find in its sources.
Brace yourself for some exciting findings because, as it turns out, the OpenJDK project can be read as a history book and CRaC makes a remarkable difference in optimizing Java application performance, yes, even on an inexpensive Raspberry Pi!
Links from the presentation:
https://sdkman.io
https://github.com/openjdk/
https://webtechie.be/post/2020-10-21-build-openjdk-on-raspberry-pi/
https://docs.azul.com/core
https://azul.com/blog/time-zone-and-currency-database-in-jdk
https://github.com/openjdk/jdk/blob/master/src/java.base/share/data/tzdata/europe
https://azul.com/blog/jit-performance-ahead-of-time-versus-just-in-time/
https://webtechie.be/post/2023-10-16-crac-on-raspberry-pi-update/
https://foojay.io/today/foojay-podcast-28/
https://docs.azul.com/core/crac/crac-introduction
https://foojay.io/today/springboot-3-2-crac/
Questions after the presentation:
1/ How big is the checkpoint directory?
The Java application in my example project https://github.com/FDelporte/crac-example loads 7 zipped CSV files with a total compressed size of 80 MB, containing 1,611,000 records that are all loaded into memory. As you can see in the README of the project, the generated CR directory contains 843M of files. As mentioned during the demo, this is a "forced" load-into-memory application to illustrate that the state of a program is saved with the checkpoint, so it is not a real-life use case. But it gives you an idea of what is happening, and the project is free to use for further experiments.
2/ Is there a tool to validate a checkpoint? E.g. when creating in a build pipeline to check it before deploying.
You need to check whether the files are present in the checkpoint folder to make sure criu did its job, and that the application was killed after the creation of the checkpoint.
An ideal test would be to start the application in your build process and validate it's running as expected.
There are some ideas about how this could indeed be validated with some kind of tool, but nothing is available yet.
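Until such a tool exists, the two basic checks mentioned above can be scripted in a build pipeline. The following is only a minimal sketch, not an official CRaC validator: the class name CheckpointCheck and its method are hypothetical, and it assumes the pipeline knows the checkpoint directory and the PID of the process that was checkpointed. It does not prove that the checkpoint can actually be restored, so starting the application from the checkpoint remains the stronger test.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical helper for a build pipeline; not part of CRaC or CRIU. */
final class CheckpointCheck {

    /** Returns a list of problems; an empty list means both basic checks passed. */
    static List<String> problems(Path checkpointDir, long checkpointedPid) {
        List<String> problems = new ArrayList<>();

        // Check 1: the checkpoint directory exists and contains at least one regular file.
        if (!Files.isDirectory(checkpointDir)) {
            problems.add("checkpoint directory not found: " + checkpointDir);
        } else {
            try (Stream<Path> entries = Files.list(checkpointDir)) {
                if (entries.noneMatch(Files::isRegularFile)) {
                    problems.add("checkpoint directory contains no files: " + checkpointDir);
                }
            } catch (IOException e) {
                problems.add("cannot read checkpoint directory: " + e.getMessage());
            }
        }

        // Check 2: the checkpointed process is no longer running.
        boolean alive = ProcessHandle.of(checkpointedPid)
                .map(ProcessHandle::isAlive)
                .orElse(false);
        if (alive) {
            problems.add("process " + checkpointedPid + " is still running after the checkpoint");
        }
        return problems;
    }
}
```

A pipeline step could call CheckpointCheck.problems(Path.of("cr"), pid) and fail the build when the returned list is not empty; "cr" is just an assumed directory name here.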
3/ How to inject values at restore? E.g. a seed for a random generator, a new file location for the log, etc...
The best solution is to set environment variables before restoring the checkpoint and to read them in the afterRestore method(s).
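As an illustration, here is a small sketch of that approach. The class RestoreSettings and the variable names APP_RANDOM_SEED and APP_LOG_FILE are made up for this example. To keep the block compilable with only the JDK, the CRaC wiring is shown as a comment; it assumes the org.crac library with its Resource interface and Core.getGlobalContext().register(...).

```java
import java.nio.file.Path;
import java.util.Map;
import java.util.Random;

/**
 * Holds values that should differ per restored instance.
 * APP_RANDOM_SEED and APP_LOG_FILE are hypothetical variable names.
 */
final class RestoreSettings {
    private volatile Random random = new Random();
    private volatile Path logFile = Path.of("app.log");

    /** Re-reads the settings; intended to be called from afterRestore. */
    void reload(Map<String, String> env) {
        String seed = env.get("APP_RANDOM_SEED");
        // Without an explicit seed, create a fresh generator so that
        // instances restored from the same checkpoint do not share one.
        random = (seed != null) ? new Random(Long.parseLong(seed.trim())) : new Random();

        String log = env.get("APP_LOG_FILE");
        if (log != null && !log.isBlank()) {
            logFile = Path.of(log);
        }
    }

    Random random() { return random; }
    Path logFile() { return logFile; }
}

// Wiring sketch (needs the org.crac library on the classpath, so shown as a comment):
//
// class SettingsResource implements org.crac.Resource {
//     private final RestoreSettings settings;
//     SettingsResource(RestoreSettings settings) { this.settings = settings; }
//
//     @Override
//     public void beforeCheckpoint(org.crac.Context<? extends org.crac.Resource> context) { }
//
//     @Override
//     public void afterRestore(org.crac.Context<? extends org.crac.Resource> context) {
//         settings.reload(System.getenv()); // values set before the restore command
//     }
// }
//
// org.crac.Core.getGlobalContext().register(new SettingsResource(settings));
```

Passing the environment in as a Map keeps the reload logic testable without an actual checkpoint and restore cycle.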
4/ Is the memory reduced/cleaned/compacted at checkpoint to minimize the size of it?
a) There is always a full GC when a checkpoint is requested, and some other cleanup is executed to reduce the size.
b) Further compression (via lz or so) is optional. Whether it pays off depends on the relative speed of I/O and CPU, but it will make the restore take longer.