Skip to the content.

[Java][JVM Logs][GC Logs][G1GC] Monday with JVM logs - efficiency of the old generation cleanup - is my CPU wasted by the G1GC?

Basics of the G1GC

Here is a part of the documentation form Oracle site:

The Initiating Heap Occupancy Percent (IHOP) is the threshold at which an Initial Mark collection is triggered and it is defined as a percentage of the old generation size.

G1 by default automatically determines an optimal IHOP by observing how long marking takes and how much memory is typically allocated in the old generation during marking cycles. This feature is called Adaptive IHOP. If this feature is active, then the option -XX:InitiatingHeapOccupancyPercent determines the initial value as a percentage of the size of the current old generation as long as there aren’t enough observations to make a good prediction of the Initiating Heap Occupancy threshold. Turn off this behavior of G1 using the option-XX:-G1UseAdaptiveIHOP. In this case, the value of -XX:InitiatingHeapOccupancyPercent always determines this threshold.

Internally, Adaptive IHOP tries to set the Initiating Heap Occupancy so that the first mixed garbage collection of the space-reclamation phase starts when the old generation occupancy is at a current maximum old generation size minus the value of -XX:G1HeapReservePercent as the extra buffer.

After the IHOP is reached the G1GC starts the concurrent mark phase. When it finishes its work, the G1GC runs a set of mixed collections that clears the old generation. It can run from 0 to -XX:G1MixedGCCountTarget=8 mixed collections (8 is the default value).

At the end of the concurrent mark phase the G1GC runs the remark STW phase, which also can clean the old generation. That phase reclaims completely empty regions.

How many mixed collections should be run after a single concurrent mark phase?

This is a hard question. It completely depends on what your old generation contains. There is a value that is potentially bad for every application. It is 0. If in the concurrent cycle there were 0 mixed collections, and the remark phase didn’t clean any empty region, then G1GC didn’t collect anything from the old generation. The concurrent mark is done in separate threads, but it consumes the CPU.

GC logs - mixed collections

At the beginning of each GC cycle you can find the name of the collection at info level:

[gc,start             ] GC(2197) Pause Young (Concurrent Start) (G1 Evacuation Pause)
[gc                   ] GC(2198) Concurrent Cycle
[gc,start             ] GC(2198) Pause Remark
[gc,start             ] GC(2198) Pause Cleanup
[gc,start             ] GC(2199) Pause Young (Prepare Mixed) (G1 Evacuation Pause)
[gc,start             ] GC(2200) Pause Young (Mixed) (G1 Evacuation Pause)
[gc,start             ] GC(2201) Pause Young (Mixed) (G1 Evacuation Pause)
[gc,start             ] GC(2202) Pause Young (Mixed) (G1 Evacuation Pause)
[gc,start             ] GC(2203) Pause Young (Normal) (G1 Evacuation Pause)

The log above tells that there have been 3 mixed collections after the concurrent mark phase. Here is an example of 0 mixed collections:

[gc,start             ] GC(55650) Pause Young (Concurrent Start) (G1 Humongous Allocation)
[gc                   ] GC(55651) Concurrent Cycle
[gc,start             ] GC(55651) Pause Remark
[gc,start             ] GC(55651) Pause Cleanup
[gc,start             ] GC(55652) Pause Young (Normal) (G1 Evacuation Pause)

As usual, we can put such a value on a chart (all charts are from 1 day period):

alt text

alt text

GC logs - remark phase efficiency

At the end of each GC cycle (including the remark one) you can find such an entry in GC logs at info level:

GC(1204) Pause Remark 1400M->1376M(8192M) 23.501ms

You can find three sizes in such an entry A->B(C) that are:

Value A - B is the reclaimed space by this cycle.

alt text

alt text

GC logs - combined

Now we need to combine two information

If both values were equal to 0 then the CPU was wasted for a concurrent cycle that didn’t reclaim anything. We can present that at charts:

alt text

alt text

Or we can present that at table:

alt text

alt text

Let’s focus on the first table. The concurrent cycle consumed CPU for 7534441ms. 94% of that time was completely wasted. 7534440ms * 94% = 7082374,54ms = 118m. So the CPU was wasted for 118 minutes.

The number of wasted CPU cores depends on the -XX:ConcGCThreads argument. If you want to know the default value on your machine you can run:

java <your arguments> -XX:+PrintFlagsFinal -version | grep ConcGCThreads

Output:

pasq@pasq-Precision-7540:~$ java -XX:+PrintFlagsFinal -version | grep ConcGCThreads
     uint ConcGCThreads                            = 3                                         {product} {ergonomic}
openjdk version "11.0.6" 2020-01-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.6+10)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.6+10, mixed mode)

Causes

Java 8

The adaptive calculation of IHOP was introduced in Java 9, in older versions the starting old generation occupancy percent was fixed. The default value was -XX:InitiatingHeapOccupancyPercent=45.

Good old friend: humongous allocation

Formally the humongous regions are a part of the old generation. Since Java 8u40 there is a special feature called: eager reclamation of humongous objects. This feature clears humongous regions in the collection of the young generation.

Here is a part of the log of such a situation:

[gc,ergo              ] GC(52633) Initiate concurrent cycle (concurrent cycle initiation requested)
[gc,start             ] GC(52633) Pause Young (Concurrent Start) (G1 Humongous Allocation)
[gc,heap              ] GC(52633) Eden regions: 50->0(405)
[gc,heap              ] GC(52633) Survivor regions: 2->2(149)
[gc,heap              ] GC(52633) Old regions: 824->824
[gc,heap              ] GC(52633) Humongous regions: 419->17
[gc,metaspace         ] GC(52633) Metaspace: 195313K->195313K(1228800K)
[gc                   ] GC(52633) Pause Young (Concurrent Start) (G1 Humongous Allocation) 2588M->1684M(8192M) 10.861ms

For that application the region size was 2MB (-Xms=4G). The line Humongous regions: 419->17 means that at the beginning of the collection there were 419 humongous regions, at the end only 17 remained. It means that young generation collection cleared (419-17)*2MB - 804MB of the old generation.