Java Garbage Collection

Garbage collection (GC) automatically reclaims heap memory occupied by objects that are no longer reachable from any GC root (thread stacks, static fields, JNI references).


1. Why GC exists

Manual memory management is hard and error-prone:

  • Dangling pointers: Free an object too early, then another part of code still uses it → crashes, corruption.
  • Memory leaks: Forget to free memory → heap usage grows until the process dies.
  • Complex ownership: In large object graphs (collections, caches, frameworks), it is hard to know who “owns” what.

The JVM uses GC to:

  • Simplify programming model: Java developers think in terms of references and reachability, not malloc/free.
  • Improve safety: No use-after-free, fewer ways to corrupt memory.
  • Enable optimizations: Moving objects and compacting the heap can improve cache locality and allocation speed.

Production impact:

  • GC is a core part of performance in Java. In a high-traffic REST service, GC behavior dictates:
    • Latency spikes (when stop-the-world pauses happen).
    • Throughput (if a lot of CPU is spent in GC instead of user code).
  • Understanding GC is about predicting and controlling these effects, not just knowing algorithms.

2. Heap structure (Young, Old, Metaspace)

Classic generational HotSpot layout (simplified view; G1, ZGC and Shenandoah divide the heap into regions instead, see section 7):

+---------------------------------------------------------+
|                        Java Heap                        |
|                                                         |
| +-------------------+  +------------------------------+ |
| |     Young Gen     |  |           Old Gen            | |
| |                   |  |                              | |
| | +------+ +------+ |  |  Long-lived objects,         | |
| | | Eden | |  S0  | |  |  big structures, caches      | |
| | +------+ +------+ |  |                              | |
| |          |  S1  | |  +------------------------------+ |
| |          +------+ |                                   |
| +-------------------+                                   |
+---------------------------------------------------------+

+---------------------------------------------------------+
|                   Metaspace (native)                    |
| Class metadata: methods, constant pools, vtables, etc.  |
+---------------------------------------------------------+

  • Young Generation (Young Gen)
    • Contains Eden and Survivor spaces (S0, S1).
    • Most new objects are allocated in Eden.
    • Objects surviving young GC cycles may move to survivor spaces and eventually promote to Old Gen.
  • Old Generation (Tenured)
    • Long-lived objects: singletons, caches, large graphs that survive multiple GCs.
    • Collected less frequently but usually with heavier algorithms.
  • Metaspace
    • Stores class metadata, not regular Java objects.
    • Lives in native memory (not in the Java heap).
    • Grows as classes are loaded; can cause OOM if there are classloader leaks.

Why this structure?

  • Generational hypothesis: “Most objects die young.”
    • E.g., request-scoped DTOs, temporary collections, intermediate strings — most exist only for one request.
  • Splitting heap into young/old allows:
    • Frequent, fast collections in young gen (cheap copying GC).
    • Infrequent, more expensive collections in old gen (where survival rate is high).
  • Metaspace separation isolates class metadata from regular object churn.
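
The layout above can be inspected on a running JVM. The following is a minimal sketch using the standard java.lang.management API; the class name MemoryPoolReport and its helper methods are illustrative, not part of any library. Pool names depend on the collector in use (for example "G1 Eden Space" or "PS Old Gen"), so the sketch only relies on the HEAP / NON_HEAP distinction.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical name): lists the JVM's memory pools.
class MemoryPoolReport {

    /** Returns one line per memory pool: type, name, and used bytes. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long used = (usage == null) ? -1 : usage.getUsed();
            lines.add(pool.getType() + " | " + pool.getName() + " | used=" + used);
        }
        return lines;
    }

    /** Counts pools of the given type (HEAP or NON_HEAP). */
    static long countPools(MemoryType type) {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .filter(p -> p.getType() == type)
                .count();
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

On HotSpot, the Eden, survivor and old gen pools are typically reported as heap memory, while Metaspace appears among the non-heap pools.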

3. Object allocation and object lifetime

3.1 Allocation

HotSpot makes allocation very fast:

  • Each thread gets a Thread-Local Allocation Buffer (TLAB) in Eden.
  • Allocation is just a bump-the-pointer in the TLAB:
    • No global locking, usually just increment a pointer.
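
To make bump-the-pointer concrete, here is a toy model of a single thread-local buffer. It is a sketch of the idea only: BumpAllocator is a hypothetical class, offsets stand in for addresses, and real TLABs also handle alignment, object headers and refills from Eden.

```java
// Toy model of a thread-local allocation buffer (hypothetical class, not HotSpot code).
class BumpAllocator {
    private final int capacity; // size of the buffer in bytes
    private int top = 0;        // offset of the next free byte

    BumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the start offset of the new block, or -1 if the buffer cannot fit it. */
    int allocate(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        if (size > capacity - top) {
            return -1; // a real JVM would request a new TLAB or allocate outside it
        }
        int start = top;
        top += size; // the "bump": allocation is a bounds check plus an addition
        return start;
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpAllocator tlab = new BumpAllocator(64);
        System.out.println(tlab.allocate(24)); // 0
        System.out.println(tlab.allocate(24)); // 24
        System.out.println(tlab.allocate(24)); // -1 (only 16 bytes left)
    }
}
```

Because the buffer belongs to one thread, no lock or atomic operation is needed on this path.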

Typical lifecycle in a high-traffic REST service:

  • For each HTTP request:
    • Framework stacks, controllers, DTOs, JSON (de)serialization objects, temporary lists/maps, etc., are created in Eden.
    • Many of them become unreachable by the time the request processing finishes.

3.2 Object lifetime categories

Roughly three categories:

  1. Very short-lived
    • Local variables in methods, temporary results, per-request objects.
    • Die in Eden; collected in minor GCs.
  2. Medium-lived
    • Objects that survive several requests but eventually become dead.
    • Survive some young GCs, go through survivor spaces, then promoted to old gen, eventually die in old gen GC.
  3. Long-lived
    • Application singletons, large caches, connection pools, threads, configuration graphs.
    • Quickly promoted to old gen and stay there almost forever.
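
The aging and promotion path described above can be sketched as a small simulation. Everything here is an assumption made for illustration: GenerationalSim, Obj and the liveness predicate are hypothetical, and the fixed tenuringThreshold stands in for HotSpot's adaptive tenuring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy simulation of object aging and promotion (hypothetical names throughout).
class GenerationalSim {
    /** A simulated object: an id plus the number of minor GCs it has survived. */
    static final class Obj {
        final String id;
        int age = 0;
        Obj(String id) { this.id = id; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    private final int tenuringThreshold;

    GenerationalSim(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    void allocate(String id) {
        young.add(new Obj(id));
    }

    /** Simulated minor GC: drop dead young objects, age survivors, promote old enough ones. */
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : young) {
            if (!isLive.test(o)) continue;       // unreachable: reclaimed
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o);                      // promoted to old gen
            } else {
                survivors.add(o);                // stays in a survivor space
            }
        }
        young.clear();
        young.addAll(survivors);
    }

    public static void main(String[] args) {
        GenerationalSim sim = new GenerationalSim(2);
        sim.allocate("cache");
        sim.allocate("dto1");
        sim.minorGc(o -> o.id.equals("cache")); // dto1 dies young
        sim.allocate("dto2");
        sim.minorGc(o -> o.id.equals("cache")); // cache reaches age 2 and is promoted
        System.out.println("young=" + sim.young.size() + " old=" + sim.old.size()); // young=0 old=1
    }
}
```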

Why JVM chose generational GC:

  • Empirical observation across many workloads:
    • Majority of objects die quickly.
    • Only a small fraction live “forever”.
  • Collecting only the young region frequently:
    • Minimizes work (scan small region).
    • Can use copying algorithms that are very fast and compact memory.
  • Old gen collections are rarer and more expensive, but:
    • They deal with fewer dead objects.
    • They can run in the background or concurrently in modern collectors.

Production implication:

  • If your app creates a lot of medium/long-lived garbage (e.g., growing caches, forgotten collections), old gen fills up and GC becomes painful (long pauses, promotion failures, full GCs).

4. Minor GC vs Major GC vs Full GC

Terminology is somewhat historical and collector-specific, but commonly:

4.1 Minor GC (Young GC)

  • Scope: Only the young generation.
  • Algorithm: typically copying (Eden → survivor space; survivor → survivor/promote).
  • Stop-the-world: yes, but typically short (milliseconds) if properly sized.
  • Triggers:
    • Eden fills up.
    • Young gen occupancy crosses a threshold.

Effect:

  • Reclaims most short-lived garbage.
  • Promotes survivors to survivor space or old gen.

In a high-traffic REST service:

  • You want frequent, short minor GCs.
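  • If minor GCs become long:
    • Too much data may be surviving each collection; copying cost grows with live data, not with the amount of garbage.
    • Young gen may be too large for the pause target.
    • Or your threads allocate many large objects in quick succession.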

4.2 Major GC (Old Gen GC)

  • Scope: Includes old generation (and usually young too in practice).
  • Older collectors distinguish:
    • “Young-only” vs “Old-only” or “Full-heap” GCs.
  • Old gen collections are typically more expensive:
    • More data to scan.
    • Fragmentation issues; often includes compaction.

4.3 Full GC

  • Scope: Entire heap (young + old), and often also triggers class unloading, which frees Metaspace.
  • STW: yes, and usually the longest pauses.
  • Triggers:
    • Old gen is full / promotion failure.
    • System GC (System.gc()) if not disabled.
    • Some collectors use full GC as a fallback when incremental schemes fail.

Production symptoms:

  • Long GC pauses logged as “Full GC” lasting hundreds of milliseconds to seconds.
  • All application threads frozen → latency spikes, timeouts, dropped connections, “node is unresponsive” in cluster.

5. Stop-the-World (why it happens)

Stop-the-world (STW) = all Java application threads are paused so GC can safely operate.

Why does STW exist?

  • Heap consistency:
    • GC needs a consistent snapshot of the object graph, starting from the GC roots, that does not change while it is scanning or moving objects.
  • Simplifies algorithms:
    • Marking reachable objects and moving them while application is mutating the heap is complex.
    • Even “concurrent” collectors (G1, ZGC, Shenandoah) still need short STW phases for certain steps (e.g., initial mark, final remark).

Conceptual timeline:

Time ----->

App Threads: | running | PAUSED (GC) | running | PAUSED (GC) | ...

GC Threads:  | idle    | WORKING     | idle    | WORKING     | ...

Production impact:

  • STW pauses directly translate to:
    • Increased 99th/99.9th percentile latency.
    • Timeouts and retries in upstream services.
    • If pauses are long enough, health checks fail; container orchestrators may kill or restart pods/instances.
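
One rough way to observe pauses from inside the application is a "hiccup" probe: sleep for a fixed interval and record how much longer the sleep actually took. The sketch below is an assumption-laden illustration (PauseProbe is a hypothetical class); the overshoot it reports includes OS scheduling jitter as well as GC pauses, so GC logs remain the authoritative source.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical probe: measures how far sleeps overshoot their requested duration.
class PauseProbe {

    /**
     * Sleeps for intervalMillis, samples times, and returns the worst overshoot
     * in milliseconds. Stops early if the thread is interrupted.
     */
    static long maxOvershootMillis(int samples, long intervalMillis) {
        long worst = 0;
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                break;
            }
            long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            worst = Math.max(worst, elapsedMillis - intervalMillis);
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("worst overshoot (ms): " + maxOvershootMillis(200, 5));
    }
}
```

Run in a background thread of a loaded service, a sudden overshoot of hundreds of milliseconds is a hint to look at the GC log for a matching pause.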

Goal of modern GC tuning/algorithms:

  • Reduce frequency and duration of STW pauses.
  • Spread work over time (concurrent phases) instead of long single pauses.

6. GC algorithms (Mark-Sweep, Mark-Compact, Copying)

These are building blocks used by real collectors.

6.1 Mark-Sweep

Two phases:

  1. Mark
    • Starting from GC roots (stacks, static fields, JNI refs, etc.), traverse object graph and mark reachable objects.
  2. Sweep
    • Scan heap linearly and free all unmarked objects.

Pros:

  • Simple, does not move objects.

Cons:

  • Leaves fragmentation (holes between live objects).
  • Fragmentation leads to allocation failures for large objects even if total free memory is enough.
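
The two phases can be sketched on a tiny object graph. This is a teaching model under stated assumptions: MarkSweepSim and Node are hypothetical, the "heap" is a plain list, and freeing means removing an entry from that list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy mark-sweep over an explicit object graph (hypothetical classes).
class MarkSweepSim {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    /** Mark phase: flag everything reachable from the roots (iterative depth-first search). */
    static void mark(List<Node> roots) {
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (n.marked) continue;
            n.marked = true;
            for (Node r : n.refs) stack.push(r);
        }
    }

    /** Sweep phase: remove unmarked nodes, reset marks, and return the number freed. */
    static int sweep(List<Node> heap) {
        int before = heap.size();
        heap.removeIf(n -> !n.marked);
        for (Node n : heap) n.marked = false;
        return before - heap.size();
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c); // c and d reference each other but are unreachable from the root
        List<Node> heap = new ArrayList<>(List.of(a, b, c, d));
        mark(List.of(a));
        System.out.println("freed=" + sweep(heap)); // freed=2
    }
}
```

Note that the unreachable cycle (c and d) is collected: tracing from roots does not suffer from the cycle problem that reference counting has.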

6.2 Mark-Compact

Again:

  1. Mark reachable objects.
  2. Compact:
    • Move all live objects together, typically towards one end of the heap.
    • Update references to moved objects.

Pros:

  • Eliminates fragmentation.
  • Improves locality (objects packed together).

Cons:

  • Moving objects is more expensive.
  • Requires updating all references (costly and often STW).
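
A minimal sketch of the compaction step, assuming marking has already happened and dead slots are represented as null. CompactSim is a hypothetical class, array indices stand in for addresses, and references are modeled as indices that must be rewritten through a forwarding table.

```java
import java.util.Arrays;

// Toy sliding compaction over an array "heap" (hypothetical class).
class CompactSim {

    /**
     * Slides live entries (non-null) to the front of the array, preserving order.
     * Returns a forwarding table: forwarding[old] = new index, or -1 for dead slots.
     */
    static int[] compact(String[] heap) {
        int[] forwarding = new int[heap.length];
        Arrays.fill(forwarding, -1);
        int free = 0; // next free slot at the compacted end
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) continue;
            forwarding[i] = free;
            heap[free] = heap[i];
            if (free != i) heap[i] = null; // vacate the old location
            free++;
        }
        return forwarding;
    }

    /** Rewrites reference indices so they point at the objects' new locations. */
    static void updateRefs(int[] refs, int[] forwarding) {
        for (int i = 0; i < refs.length; i++) {
            refs[i] = forwarding[refs[i]];
        }
    }

    public static void main(String[] args) {
        String[] heap = {"A", null, "B", null, "C"};
        int[] refs = {2, 4}; // references to "B" and "C"
        int[] forwarding = compact(heap);
        updateRefs(refs, forwarding);
        System.out.println(Arrays.toString(heap)); // [A, B, C, null, null]
        System.out.println(Arrays.toString(refs)); // [1, 2]
    }
}
```

The reference-update pass is the part that makes compaction expensive: every pointer to a moved object has to be found and rewritten.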

6.3 Copying

Divide region into two semi-spaces or use “from/to” regions:

  1. Allocate in from-space.
  2. When GC runs:
    • Copy live objects to to-space.
    • Clear from-space entirely.

Pros:

  • Very fast for regions where most objects are garbage (like young gen).
  • Naturally compacts memory (no fragmentation).
  • Allocation in a copy region is bump-the-pointer.

Cons:

  • Requires extra memory (you need space for to-space).
  • Not efficient if most objects survive (too much copying).
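
The copy step can be sketched in the style of Cheney's algorithm: copy the roots, then scan the copies in order and forward whatever they reference. CopyingSim, Node and the forwarding map are illustrative assumptions; a real collector stores forwarding pointers in object headers and copies raw memory.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy semi-space copy (hypothetical classes): to-space is a fresh list.
class CopyingSim {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Copies everything reachable from the roots into to-space and returns it. */
    static List<Node> evacuate(List<Node> roots) {
        List<Node> toSpace = new ArrayList<>();
        Map<Node, Node> forwarding = new IdentityHashMap<>();
        for (Node r : roots) forward(r, toSpace, forwarding);
        // Scan pointer: fix up the references of each copied object in order.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Node moved = toSpace.get(scan);
            for (int i = 0; i < moved.refs.size(); i++) {
                moved.refs.set(i, forward(moved.refs.get(i), toSpace, forwarding));
            }
        }
        return toSpace; // from-space can now be discarded wholesale
    }

    /** Returns the to-space copy of an object, creating it on first visit. */
    private static Node forward(Node original, List<Node> toSpace, Map<Node, Node> forwarding) {
        Node existing = forwarding.get(original);
        if (existing != null) return existing;
        Node clone = new Node(original.name);
        clone.refs.addAll(original.refs); // still from-space references; fixed during the scan
        forwarding.put(original, clone);
        toSpace.add(clone);
        return clone;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), garbage = new Node("garbage");
        a.refs.add(b);
        b.refs.add(a);
        garbage.refs.add(a); // unreachable, so never visited or copied
        System.out.println(evacuate(List.of(a)).size()); // 2
    }
}
```

The work done is proportional to the live objects only; garbage is never touched, which is why this approach suits a young generation where most objects are already dead.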

How they map to generational GC:

  • Young gen: copying collector (because most objects die).
  • Old gen: mark-sweep or mark-compact (because survival is high; copying would be expensive).

7. Modern collectors

7.1 Serial GC

  • Single-threaded GC (one GC thread).
  • Young: copying; Old: mark-compact.
  • All GC work is STW and done by a single thread.

Use cases:

  • Small heaps, client apps, single-core or low-core environments.
  • Not suitable for server-side, multi-CPU, high-traffic workloads.

Flags:

  • -XX:+UseSerialGC

7.2 Parallel GC (Throughput collector)

  • Uses multiple GC threads for young and old gen collections.
  • Still STW, but GC work is done in parallel.

Characteristics:

  • Targets high throughput: maximize amount of work done per unit CPU time.
  • Accepts longer pauses if that increases overall throughput.

Use cases:

  • Batch processing, off-line jobs, non-latency-sensitive services.

Flags:

  • -XX:+UseParallelGC (the default on server-class machines in JDK 8 and earlier).

7.3 CMS (Concurrent Mark-Sweep) – deprecated/removed

  • Goal: low pause times in old gen.
  • Old gen algorithm:
    • Concurrent mark-sweep with some STW phases.
    • Does not compact by default → fragmentation issues, can trigger stop-the-world compaction full GC.

Characteristics:

  • Many concurrent GC threads; reduces long pauses.
  • Complex tuning; can suffer from “concurrent mode failure” → full STW compaction as fallback.

Status:

  • Deprecated in JDK 9 and removed in JDK 14.
  • Conceptually replaced by G1 and newer low-latency collectors.

7.4 G1 (Garbage-First)

The default collector in HotSpot since JDK 9:

  • Heap is split into many regions (e.g., 1–32 MB each), not just contiguous young/old.

Conceptual:

+----+----+----+-----+----+-----+----+
| R1 | R2 | R3 | ... | Rk | ... | Rn |
+----+----+----+-----+----+-----+----+
Some regions: 'young', some 'old', some 'humongous'

Key ideas:

  • Generational & regional:
    • Some regions designated as young; others as old.
  • Garbage-First:
    • Track which regions contain the most garbage.
    • Collect (evacuate) regions with the most garbage first.
  • Mostly concurrent:
    • Most marking in old gen is done concurrently.
    • Still has STW phases for evacuation.

Tuning goals:

  • You set a target pause time (e.g., -XX:MaxGCPauseMillis=200).
  • G1 tries to choose how many regions to collect each time to respect this.
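
The "most garbage first, within a pause budget" idea can be sketched as a greedy selection. This is not G1's actual heuristic, which relies on measured statistics and a more detailed cost model; GarbageFirstSim, Region and the bytesPerMilli copy rate are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy collection-set selection (hypothetical classes and cost model).
class GarbageFirstSim {
    /** A region; evacuation cost is assumed to be proportional to its live bytes. */
    static final class Region {
        final String name;
        final long sizeBytes;
        final long liveBytes;
        Region(String name, long sizeBytes, long liveBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
            this.liveBytes = liveBytes;
        }
        long garbageBytes() { return sizeBytes - liveBytes; }
    }

    /**
     * Picks regions in order of most garbage first and stops at the first region
     * whose estimated copy cost would push the total over the pause budget.
     */
    static List<Region> chooseCollectionSet(List<Region> regions,
                                            long pauseBudgetMillis,
                                            long bytesPerMilli) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((x, y) -> Long.compare(y.garbageBytes(), x.garbageBytes()));
        List<Region> chosen = new ArrayList<>();
        long costMillis = 0;
        for (Region r : sorted) {
            long cost = r.liveBytes / bytesPerMilli;
            if (costMillis + cost > pauseBudgetMillis) break;
            chosen.add(r);
            costMillis += cost;
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region("R1", 100, 10),
                new Region("R2", 100, 80),
                new Region("R3", 100, 30),
                new Region("R4", 100, 50));
        for (Region r : chooseCollectionSet(regions, 5, 10)) {
            System.out.println(r.name); // R1 then R3
        }
    }
}
```

Regions that are mostly garbage are cheap to evacuate (little live data to copy) and free the most space, so they give the best return per millisecond of pause.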

Great for:

  • General-purpose server workloads.
  • High-traffic REST services with moderate pause requirements.

Flags:

  • -XX:+UseG1GC (default since JDK 9)
  • -XX:MaxGCPauseMillis=<n>

7.5 ZGC / Shenandoah (low-latency collectors)

Both target ultra-low pause times (single-digit ms or less), even on large heaps (tens/hundreds of GB).

Common characteristics:

  • Concurrent, region-based, moving collectors.
  • Use sophisticated techniques (colored pointers in ZGC, forwarding pointers in Shenandoah, load barriers) to allow:
    • Moving objects while application threads run.
    • Very short STW phases (e.g., only around the start and end of marking).

ZGC (Oracle/OpenJDK):

  • Aims for <10ms pause even for very large heaps.
  • Mostly concurrent; uses colored pointers.
  • Good when tail latency SLAs are strict (e.g., trading systems, highly latency-sensitive APIs).

Flags:

  • -XX:+UseZGC
  • -Xmx<size> (maximum heap size is the main tuning knob; ZGC does not use -XX:MaxGCPauseMillis as a pause target)

Shenandoah (Red Hat / OpenJDK):

  • Similar intent: “low-pause concurrent compacting collector.”
  • Uses region-based heap, concurrent evacuation.
  • Available in some OpenJDK builds and distributions.

Use cases:

  • You care more about consistent low latency than absolute throughput.
  • Willing to pay some extra CPU overhead for GC concurrency.

8. Throughput vs latency tradeoffs

At a high level:

  • Throughput: Maximize total work done (requests/sec, jobs/hour) at cost of occasional longer pauses.
  • Latency: Minimize worst-case response time, even if throughput is slightly lower.

Comparisons:

  • Serial / Parallel GC:
    • High throughput (Parallel on multi-core machines; Serial only on small heaps or single-core environments).
    • Pauses can be long (especially Full GCs on large heaps).
  • G1:
    • Balanced throughput and latency.
    • Tries to keep pauses within a target.
  • ZGC / Shenandoah:
    • Focus on extremely low pauses.
    • Some throughput overhead due to barriers and concurrent work.

Production decision:

  • High-traffic REST service with soft SLAs:
    • G1 often good default: predictable pauses, solid throughput.
  • Latency-sensitive real-time service (e.g., trading, real-time bidding):
    • ZGC/Shenandoah may be worth the extra complexity.
  • Batch pipeline / offline processing:
    • Parallel GC may be fine; long pauses are acceptable.

9. GC tuning basics you should actually know

9.1 Key JVM flags (modern JDKs)

Heap size:

  • -Xms<size>: Initial heap size.
  • -Xmx<size>: Maximum heap size.
    • Typically set equal (-Xms = -Xmx) in production to avoid dynamic resizing overhead.

Collector selection:

  • -XX:+UseG1GC (default since JDK 9 on server-class machines).
  • -XX:+UseParallelGC
  • -XX:+UseZGC
  • (Shenandoah): -XX:+UseShenandoahGC (on JDKs that include it).

GC logging (JDK 9+):

  • -Xlog:gc* – enable detailed GC logging (e.g., -Xlog:gc*:file=gc.log:time,uptime,level,tags to write to a file).
    • Analyze with tools like GCViewer, gceasy, etc.

G1 tuning examples:

  • -XX:MaxGCPauseMillis=200 – target max pause time.
  • -XX:InitiatingHeapOccupancyPercent=45 – when to start concurrent marking.

ZGC tuning examples:

  • -XX:+UseZGC
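  • Heap size (-Xmx) is the main knob; ZGC does not use -XX:MaxGCPauseMillis as a pause target.
  • -XX:SoftMaxHeapSize=<size> (JDK 13+) sets a soft limit that ZGC tries to stay under.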

Young/old sizing (less critical with G1/ZGC, more with Parallel/Serial):

  • -Xmn or -XX:NewSize / -XX:MaxNewSize: young gen sizing.
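
Before changing any of the flags above, it helps to confirm what the running JVM was actually started with. The sketch below uses the standard RuntimeMXBean and Runtime APIs; the class name JvmSettingsReport and its methods are illustrative.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: reports the -X / -XX options and the effective max heap.
class JvmSettingsReport {

    /** JVM options from the command line that start with -X (this includes -XX). */
    static List<String> xOptions() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .filter(arg -> arg.startsWith("-X"))
                .collect(Collectors.toList());
    }

    /** Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx or the default). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        xOptions().forEach(System.out::println);
    }
}
```

Running it with, say, java -Xms512m -Xmx512m -XX:+UseG1GC JvmSettingsReport.java should list those three options; with no options the list is empty and the max heap reflects the JVM's default ergonomics.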

9.2 When tuning helps

Tuning helps when:

  • GC is a significant CPU consumer (e.g., 20–40%+ of CPU time).
  • You see:
    • Frequent stop-the-world pauses.
    • Long Full GC events.
    • Promotion failures / to-space exhausted errors.
    • Heap usage pattern suggests GC is doing a lot of work.
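
A rough first estimate of GC cost can be read from the standard GarbageCollectorMXBean counters. This is a sketch with caveats: GcOverhead is a hypothetical class, the counters report accumulated elapsed time rather than CPU share, and for concurrent collectors the reported time may not correspond to application pause time. GC logs give the precise picture.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: approximates time spent in GC relative to JVM uptime.
class GcOverhead {

    /** Sum of accumulated collection time across all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if undefined for this collector
            if (t > 0) total += t;
        }
        return total;
    }

    /** GC time as a percentage of elapsed time; 0 if uptime is not positive. */
    static double percent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        System.out.printf("approx. GC share of uptime: %.2f%%%n", percent(totalGcMillis(), uptime));
    }
}
```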

Typical interventions:

  • Increase heap size if:
    • Live set is large, GC thrashes reclaiming small amounts.
  • Adjust pause target / collector type:
    • Move from Parallel → G1 or ZGC when latency matters.
  • Tune young gen size:
    • Too small: too many minor GCs.
    • Too large: less room is left for old gen, and each minor GC may have more live data to scan and copy.

9.3 When tuning does NOT help

GC tuning does not solve:

  • Memory leaks:
    • Strong references held in static fields, caches, unbounded queues, or thread-locals.
    • Symptoms: Old gen usage grows monotonically; full GCs reclaim little; eventually OOM.
    • Fix: profile memory (e.g., heap dumps), fix the code.
  • Bad allocation patterns:
    • Creating huge numbers of large objects unnecessarily.
    • Excessive boxing, copying large collections repeatedly.
  • Poor algorithmic complexity:
    • GC cannot fix an O(n²) hotspot.
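
As an example of a code-level fix that no GC flag can replace, an unbounded cache can be given a size limit. The sketch below uses LinkedHashMap's documented removeEldestEntry hook to build a small LRU cache; BoundedCache is a hypothetical class, it is not thread-safe, and a production service would more likely use a dedicated caching library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: evicts the least recently used entry once the limit is exceeded.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true: iteration order is least recently used first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // called by put(); true evicts the eldest entry
    }

    public static void main(String[] args) {
        BoundedCache<Integer, String> cache = new BoundedCache<>(2);
        cache.put(1, "one");
        cache.put(2, "two");
        cache.put(3, "three"); // evicts key 1
        System.out.println(cache.keySet()); // [2, 3]
    }
}
```

With a plain static HashMap in the same role, every entry would stay strongly reachable forever and old gen usage would grow until an OOM, regardless of collector or flags.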

Rule: Tune GC after ensuring the app’s memory behavior is reasonable.


10. How this is asked in interviews

Ten typical GC interview questions with conceptual answers:

  1. Q: Why does the JVM use a generational heap?
    A: Because most objects die young; focusing collection on the young generation allows cheap, frequent GCs using copying algorithms, while collecting the old generation less often saves work.
  2. Q: What is the difference between minor GC and full GC?
    A: Minor GC usually collects only the young gen (short pauses), while full GC collects the entire heap (young + old), often with class unloading and compaction, and causes much longer stop-the-world pauses.
  3. Q: Why do stop-the-world pauses exist even in “concurrent” collectors?
    A: GC needs short periods where application threads are paused to establish a consistent view (root scanning, initial mark, final remark). Some operations are too complex or unsafe to do fully concurrently.
  4. Q: Explain mark-sweep vs mark-compact vs copying collection.
    A: Mark-sweep marks live objects then frees unmarked ones (fragmentation). Mark-compact also compacts live objects to eliminate fragmentation. Copying collection copies live objects from one region to another, naturally compacting memory and working best when most objects are garbage (young gen).
  5. Q: How would a memory leak manifest in a production Java service?
    A: Old gen usage increases over time; GCs (even full GCs) reclaim little; GC frequency and pause times increase; eventually an OOM occurs. Heap dumps show growing data structures with strong references preventing reclamation.
  6. Q: Compare Parallel GC and G1 GC. When would you choose each?
    A: Parallel GC is throughput-oriented with STW collections, good for batch/offline jobs. G1 is regional and mostly concurrent with configurable pause targets, better for general-purpose servers and latency-sensitive services.
  7. Q: What problem do collectors like ZGC and Shenandoah solve?
    A: They provide very low pause times (often <10ms) even for large heaps by doing most marking and moving concurrently, which is important for highly latency-sensitive workloads.
  8. Q: How can GC cause high tail latencies in a high-traffic REST service?
    A: Stop-the-world pauses freeze all request handling threads. Long or frequent pauses directly inflate 99th/99.9th percentile latency, causing timeouts, retries, and cascading failures.
  9. Q: Name some basic GC tuning steps you would apply when you see long pauses.
    A: Ensure a modern collector (G1 or ZGC) is used; set a reasonable heap size (-Xms/-Xmx); with G1, adjust the pause target (MaxGCPauseMillis); analyze GC logs; check for leaks or large long-lived caches before further tuning.
  10. Q: How does object lifetime affect GC behavior and tuning?
    A: If most objects are short-lived, young gen GC is cheap and effective. If many objects are long-lived or medium-lived, they get promoted to old gen, increasing old gen pressure and full/mixed GCs. Tuning may involve right-sizing young/old and redesigning code to reduce medium-lived garbage.

11. Rapid revision summary (bullet points)

  • GC automatically reclaims memory of unreachable objects, preventing manual free/malloc errors.
  • JVM heap is split into young gen (Eden + survivors), old gen, and Metaspace (class metadata in native memory).
  • Generational GC: new objects in young gen; survivors promoted to old gen; based on “most objects die young”.
  • Allocation is fast via TLABs and bump-the-pointer; GC cost depends on object lifetime patterns.
  • Minor GC: young-only, frequent, usually short STW pauses; major/full GC: includes old gen and often entire heap, causing longer STW pauses.
  • STW pauses exist so GC can see a consistent heap snapshot and safely mark/move objects.
  • Core algorithms: mark-sweep (simple, fragmentation), mark-compact (no fragmentation), copying (fast for young where most objects die).
  • Modern collectors:
    • Serial: single-threaded, STW, small heaps.
    • Parallel: multi-threaded STW, high throughput.
    • CMS: old, concurrent mark-sweep, lower pauses but fragmentation and complexity.
    • G1: regional, mostly concurrent, pause-targeted, default collector since JDK 9.
    • ZGC/Shenandoah: concurrent, compacting, ultra-low pauses for large heaps.
  • Throughput vs latency: Parallel favors throughput (longer pauses okay); G1 balances; ZGC/Shenandoah favor very low pauses with some overhead.
  • GC tuning basics: choose appropriate collector; set -Xms/-Xmx (often equal); enable GC logs (-Xlog:gc*); use pause targets with G1 (MaxGCPauseMillis); tune only after addressing leaks/bad allocation patterns.
  • Memory leak symptoms: rising old gen, frequent full GCs reclaiming little, eventual OOM; fix requires code changes, not just GC flags.
  • In interviews, emphasize why generational GC exists, how GC pauses impact latency, and how different collectors and flags trade off throughput vs pause times.
