G1 GC - A primer from performance engineering standpoint

Background

In a couple of my previous articles I not only tried to explain the fundamentals of JVM memory management, but also delved deeper into how garbage collection works and how it can be optimized. With this background, I am sure you would be convinced that the behavior of garbage collection may have ramifications on the performance of an application.

With newer versions of Java, garbage collection has also evolved: Serial -> Parallel -> CMS -> G1 GC -> Z GC. Hence it becomes extremely important to understand the basics of garbage collection. In this article we will try to gather a high-level overview of the Garbage First Garbage Collector.

What is G1 GC

Introduction

The Garbage First Garbage Collector (i.e. G1 GC) is a low-pause, regionalized and generational garbage collector of the JVM. It is primarily designed for multiprocessor machines with relatively large memories. It strives to meet a pause time goal with high probability, without compromising on throughput.

This is ONE of the DEFAULT garbage collection algorithms in JDK 11 :) Have patience, or you may jump directly to the trivia section to understand this statement.

Understanding G1 Heap Layout along with its regions

G1 partitions the heap into a set of equal-sized heap regions. Each region is a contiguous range of memory where allocation and reclamation happen. At any given point in time, each of these regions can either be empty or assigned to a particular generation:

  1. Young Generation
    • Eden
    • Survivor
  2. Old Generation
    • Humongous

Structure of a region

Each region mainly consists of -

  1. Space - Region size ranges from 1 MB to 32 MB, depending on the max heap size (i.e. -Xmx) and -XX:G1HeapRegionSize
  2. Alive - Some of the objects in the region will be alive
  3. Garbage - Some of the objects in the region will be garbage
  4. RSet - The Remembered Set is bookkeeping metadata that tracks references into the region from other regions, so that G1 can collect the region without scanning the whole heap. Together with marking information, this helps the JVM determine the liveness percentage for that region at a given point in time

Liveness % = (Live Size / Region Size) * 100
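
To make the region numbers concrete, here is a minimal Java sketch. `approxRegionSize` assumes the commonly described ergonomic of aiming for roughly 2048 regions, rounding down to a power of two and clamping to the 1 MB to 32 MB range. That target, and the class and method names, are assumptions made for illustration and are not stated above; the exact heuristic varies between JDK versions. `livenessPercent` simply applies the liveness formula.

```java
// Illustrative only: a simplified model, not HotSpot's actual implementation.
final class G1RegionMath {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION_BYTES = 1 * MB;   // lower bound from the text
    static final long MAX_REGION_BYTES = 32 * MB;  // upper bound from the text
    static final long TARGET_REGION_COUNT = 2048;  // assumption, see the lead-in

    /** Approximates the region size G1 might pick for a given max heap (-Xmx). */
    static long approxRegionSize(long maxHeapBytes) {
        if (maxHeapBytes <= 0) {
            throw new IllegalArgumentException("heap size must be positive");
        }
        long raw = Math.max(maxHeapBytes / TARGET_REGION_COUNT, MIN_REGION_BYTES);
        long powerOfTwo = Long.highestOneBit(raw); // round down to a power of two
        return Math.min(powerOfTwo, MAX_REGION_BYTES);
    }

    /** Liveness % = (live size / region size) * 100. */
    static double livenessPercent(long liveBytes, long regionBytes) {
        if (regionBytes <= 0 || liveBytes < 0 || liveBytes > regionBytes) {
            throw new IllegalArgumentException("need 0 <= liveBytes <= regionBytes, regionBytes > 0");
        }
        return 100.0 * liveBytes / regionBytes;
    }

    public static void main(String[] args) {
        long region = approxRegionSize(4096 * MB);
        System.out.println("Approx. region size for a 4 GB heap: " + (region / MB) + " MB");
        System.out.println("Liveness of a quarter-full region: " + livenessPercent(region / 4, region) + " %");
    }
}
```

Setting -XX:G1HeapRegionSize explicitly would bypass any such ergonomic, so treat the computed value as a rough expectation rather than a guarantee.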

G1 GC taxonomy

  1. RSet - Each young generation region has a Remembered Set. It mainly tracks pointers from tenured regions into young generation regions. Since it is an accounting data structure, it comes with some space overhead.

  2. CSet - The Collection Set contains the set of regions (young or old generation) that are candidates to be collected during a GC.
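
The idea of picking old regions for the CSet can be sketched in a few lines of Java. This is a deliberately simplified model, not G1's real policy: it assumes a region becomes a candidate when its liveness is at or below a threshold (in the spirit of G1MixedGCLiveThresholdPercent), prefers the regions with the most garbage first, and caps the number of regions taken. The `Region` class and `chooseOldCandidates` method are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a toy model of "garbage first" candidate selection.
final class CSetSketch {
    static final class Region {
        final String name;
        final long liveBytes;
        final long sizeBytes;

        Region(String name, long liveBytes, long sizeBytes) {
            this.name = name;
            this.liveBytes = liveBytes;
            this.sizeBytes = sizeBytes;
        }

        double livenessPercent() { return 100.0 * liveBytes / sizeBytes; }
        long garbageBytes() { return sizeBytes - liveBytes; }
    }

    /** Keeps regions at or below the liveness threshold, most garbage first, capped at maxRegions. */
    static List<Region> chooseOldCandidates(List<Region> oldRegions,
                                            double liveThresholdPercent,
                                            int maxRegions) {
        List<Region> candidates = new ArrayList<>();
        for (Region r : oldRegions) {
            if (r.livenessPercent() <= liveThresholdPercent) {
                candidates.add(r);
            }
        }
        Comparator<Region> byGarbage = Comparator.comparingLong(Region::garbageBytes);
        candidates.sort(byGarbage.reversed()); // most reclaimable space first
        int limit = Math.min(Math.max(maxRegions, 0), candidates.size());
        return new ArrayList<>(candidates.subList(0, limit));
    }

    public static void main(String[] args) {
        List<Region> old = new ArrayList<>();
        old.add(new Region("A", 10, 100));
        old.add(new Region("B", 90, 100));
        old.add(new Region("C", 50, 100));
        old.add(new Region("D", 30, 100));
        for (Region r : chooseOldCandidates(old, 65.0, 2)) {
            System.out.println(r.name + " liveness=" + r.livenessPercent() + "%");
        }
    }
}
```

In this toy run region B is skipped because it is 90% live, and the cap of two regions leaves out C even though it qualifies.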

G1 GC types and its phases

  1. Young only GC - A young GC promotes objects from Eden to Survivor regions, or from Survivor regions to Old generation regions. Young GCs are Stop The World (i.e. STW) events. The G1 collector performs the phases below as part of the young-only phase:

| Phase | Brief description | STW? |
| --- | --- | --- |
| Initial Mark | Starts the marking process along with a regular young-only collection. The concurrent marking that follows determines all the live objects in old generation regions. | Yes for the pause itself; the concurrent marking that follows is not STW |
| Remark | Finalizes marking, and thereby performs global reference processing and class unloading. Updates the RSet liveness percentage. | Yes |
| Cleanup | Reclaims empty regions, and determines whether a space-reclamation mixed collection will actually follow. | Yes |

  2. Old Generation Collection - G1 is primarily designed to be a low-pause collector for objects in the old generation. The G1 collector performs the phases below as part of old generation collection:

| Phase | Brief description | STW? |
| --- | --- | --- |
| Initial Mark | Piggybacked on a normal young GC. Marks survivor regions (root regions) which may have references to objects in the old generation. | Yes |
| Root Region Scanning | Scans survivor regions for references into the old generation. | No |
| Concurrent Marking | Finds live objects over the entire heap. This phase may be interrupted by young generation garbage collections. | No |
| Remark | Completes marking of live objects in the heap. Uses an algorithm called snapshot-at-the-beginning (SATB), which is much faster than what was used in the CMS collector. | Yes |
| Cleanup | Performs accounting on live objects and completely free regions (STW); scrubs the Remembered Sets (STW); resets the empty regions and returns them to the free list (concurrent). | Partially |
| Copying | G1 evacuates or copies live objects to new unused regions. This can be done with young generation regions, which is logged as [GC pause (young)], or with both young and old generation regions, which is logged as [GC Pause (mixed)]. | Yes |

Key configurations from GC optimization standpoint

Even though G1 GC is designed to be a low-pause collector, one needs to be mindful that it may still have ramifications on the application from a performance standpoint. Hence understanding the key G1 GC flags assumes a lot of significance:

| Option and default value | Brief description | Implication from performance standpoint |
| --- | --- | --- |
| -XX:MaxGCPauseMillis=200 | If pauses in any of the STW phases exceed this value, G1 GC will attempt to compensate by various means. Indicative list: adjusting the old to young ratio or heap size; initiating background processing much sooner; modifying the tenuring threshold; processing more or fewer old generation regions during mixed GCs. | Decreasing its value may lead to more frequent young GCs and a decrease in the number of old generation regions that can be collected during mixed GCs. Note: -XX:MaxGCPauseMillis should be less than -XX:GCPauseIntervalMillis. |
| -XX:GCPauseIntervalMillis=<> | Determines GC frequency. | Default value = -XX:MaxGCPauseMillis + 1. Recommendation: GCPauseIntervalMillis = 2 * MaxGCPauseMillis. |
| -XX:ConcGCThreads=<> | Maximum number of threads used for concurrent work. By default, this value is -XX:ParallelGCThreads / 4. | Increasing its value will make the concurrent cycle shorter. |
| -XX:InitiatingHeapOccupancyPercent=45 | Used by G1 GC to trigger a concurrent GC cycle based on the occupancy of the entire heap, not just one of the generations. | The higher the threshold, the fewer concurrent marking cycles there will be, which also means fewer mixed GC evacuations. The recommendation is to keep it just low enough that G1 GC can trigger mixed GCs promptly and thereby keep pruning the tenured heap. |
| -XX:G1MixedGCCountTarget=8 | Defines the number of mixed garbage collections that should be triggered after a marking cycle to collect old regions with at most G1MixedGCLiveThresholdPercent live data. | Reducing it ensures that: there are fewer mixed GCs between young-only GCs; young regions are purged regularly; the liveness percentage is updated regularly for tenured regions, which in turn increases GC efficiency. |
| -XX:G1MixedGCLiveThresholdPercent=65 | Threshold that determines whether a region should be added to the CSet or not. | The higher the threshold, the more regions will be added to the CSet, which will eventually lead to more mixed GCs. |
| -XX:G1OldCSetRegionThresholdPercent=10 | Defines the limit on the maximum number of old regions that can be included per mixed GC. | Increasing its value will ensure that more tenured regions are included in mixed GCs. |
| -XX:G1HeapWastePercent=10 | Defines the percentage of reclaimable space, relative to the total heap size, below which G1 will stop doing mixed GCs. | Reducing its value will potentially cause G1 to add more expensive region(s) to evacuate for space reclamation. |
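
As a small illustration of how a few of these flags fit together, the Java sketch below assembles a command-line fragment and enforces the constraint noted in the table (MaxGCPauseMillis must be less than GCPauseIntervalMillis). The `G1FlagSketch` class and its methods are hypothetical helpers written for this example, and `-XX:+UseG1GC` is added on the assumption that G1 should be selected explicitly; the values are placeholders to be tuned, not recommendations.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: builds JVM option strings, it does not launch or tune a JVM.
final class G1FlagSketch {
    /** The article's suggested starting point: interval = 2 * max pause. */
    static int recommendedInterval(int maxPauseMillis) {
        return 2 * maxPauseMillis;
    }

    static List<String> g1Flags(int maxPauseMillis, int pauseIntervalMillis, int ihopPercent) {
        if (maxPauseMillis <= 0) {
            throw new IllegalArgumentException("MaxGCPauseMillis must be positive");
        }
        if (maxPauseMillis >= pauseIntervalMillis) {
            throw new IllegalArgumentException("MaxGCPauseMillis must be less than GCPauseIntervalMillis");
        }
        if (ihopPercent < 0 || ihopPercent > 100) {
            throw new IllegalArgumentException("InitiatingHeapOccupancyPercent must be within 0..100");
        }
        List<String> flags = new ArrayList<>();
        flags.add("-XX:+UseG1GC"); // assumption: select G1 explicitly
        flags.add("-XX:MaxGCPauseMillis=" + maxPauseMillis);
        flags.add("-XX:GCPauseIntervalMillis=" + pauseIntervalMillis);
        flags.add("-XX:InitiatingHeapOccupancyPercent=" + ihopPercent);
        return flags;
    }

    public static void main(String[] args) {
        int maxPause = 200;
        List<String> flags = g1Flags(maxPause, recommendedInterval(maxPause), 45);
        System.out.println("java " + String.join(" ", flags) + " ...");
    }
}
```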

The main goal in tuning G1 GC is to make sure that evacuation failures do not end up in full GCs. Preventing full GCs in JDK 8 is of paramount importance, as a full GC in JDK 8 is single-threaded. Full GC in JDK 11 is multi-threaded, and this can be one of the compelling reasons to move from JDK 8 to JDK 11 :)

Conclusion

Even though a few of the important G1 GC flags have been listed above along with their performance implications, it would be prudent to monitor GC behavior with default or custom-configured flags and thereby determine the optimal configuration for a given application. Needless to say, the configuration of G1 GC flags will also depend on the nature of the application, i.e. whether it is throughput or latency sensitive.
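
One lightweight way to start monitoring is from inside the application, using the standard `java.lang.management` API. The sketch below lists each registered collector with its collection count and accumulated collection time. The `GcSnapshot` class name is made up for this example, and the collector names printed depend on the JVM and the GC in use, so nothing here should be read as G1-specific output.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a point-in-time snapshot, not a replacement for GC logs.
final class GcSnapshot {
    /** Returns one line per collector known to this JVM. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() and getCollectionTime() return -1 when undefined
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", totalTimeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Sampling these counters periodically and comparing successive snapshots gives a rough view of GC frequency and cost before and after a flag change.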

JDK 11 Trivia :)

WHAT IS THE DEFAULT GC IN JDK 11?

JDK 11 has 2 possible default collectors -

  1. Serial GC
  2. G1 GC

Default selection is decided based on the maximum memory and the number of processors available to an application -

  1. Default GC for an application having 4 GB maximum memory and 1 active processor - Serial GC

  2. Default GC for an application having 1 GB maximum memory and 1 active processor - Serial GC

  3. Default GC for an application having 4 GB maximum memory and 2 active processors - G1 GC
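
The three cases above can be captured in a tiny Java sketch. It assumes the selection follows HotSpot's "server-class machine" rule of thumb, taken here as at least 2 processors and at least 1792 MB of memory; those exact thresholds and the `DefaultGcGuess` class are assumptions made for illustration and are not stated in this article, so treat the result as a guess to be verified on the actual JVM.

```java
// Illustrative only: mirrors the three cases listed above under assumed thresholds.
final class DefaultGcGuess {
    static final int MIN_SERVER_PROCESSORS = 2;                       // assumption
    static final long MIN_SERVER_MEMORY_BYTES = 1792L * 1024 * 1024;  // assumption

    static String expectedDefaultGc(int processors, long memoryBytes) {
        boolean serverClass = processors >= MIN_SERVER_PROCESSORS
                && memoryBytes >= MIN_SERVER_MEMORY_BYTES;
        return serverClass ? "G1 GC" : "Serial GC";
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        System.out.println("4 GB, 1 processor  -> " + expectedDefaultGc(1, 4 * gb));
        System.out.println("1 GB, 1 processor  -> " + expectedDefaultGc(1, 1 * gb));
        System.out.println("4 GB, 2 processors -> " + expectedDefaultGc(2, 4 * gb));
    }
}
```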
