Garbage Collection Internals (Part 3 – Java)

In Part 1 and Part 2 we saw that C and C++ have no Garbage Collection at the language level, although in C++ we can use “smart” pointers to deallocate an object when the pointer goes out of scope.

The languages which came after C/C++ recognised the problem of memory leaks, and how much effort developers were putting into managing memory, so they implemented Garbage Collection at the platform/VM level.

Java equivalent of earlier C++ program
/*
 * File:      GCJava.java
 * Project:   CPP
 * Author:    Sanjay Vyas
 * 
 * Description:
 * 
 * Revision History:
 * 2020-Jun-21	[SV]: Created
 */

class AllocatedResource {
    private int data;

    public AllocatedResource(int x) {
        this.data = x;
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("Deallocating " + this.data);
    }
}

public class GCJava {
    public static void main(String[] args) {
        AllocatedResource resource = new AllocatedResource(5);
        System.out.println("End of program");
    }
}
And here is the output
End of program

That’s it!!
If we were expecting the finalize() method to blurt out something, well, it didn’t even get called. Not even when the program came to an end. In fact, there is no guarantee that finalize() will ever get called. Moreover, it has been deprecated since Java 9.

Let’s make our code put some pressure on the GC by creating a million objects and see what happens.

/*
 * File:      GCJava.java
 * Project:   CPP
 * Author:    Sanjay Vyas
 * 
 * Description:
 * 
 * Revision History:
 * 2020-Jun-21	[SV]: Created
 */

class AllocatedResource {
    private int data;

    public AllocatedResource(int x) {
        this.data = x;
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("Deallocating " + this.data);
    }
}

public class GCJava {
    public static void main(String[] args) {
        
        // Put pressure on GC
        for (int i = 0; i < 1000000; i++) {
            AllocatedResource resource = new AllocatedResource(i);
        }
        System.out.println("End of program");
    }
}
And what do we get?
:
Deallocating 822025
Deallocating 822024
Deallocating 822023
Deallocating 822022
Deallocating 822021
Deallocating 822020
Deallocating 822019
Deallocating 651277
Deallocating 822013
Deallocating 822012
Deallocating 822011
Deallocating 822010
:
End of program

Finally! The Garbage Collector ran… but only when it started to run out of heap because we allocated 1000000 objects.
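If you would rather count collections than infer them from finalize() output, the JVM exposes per-collector counters through the standard java.lang.management API. Here is a minimal sketch; the class name GCCounter and the helper totalCollections() are made up for this post, and the number printed depends on your JVM, heap size and collector (it may well be 0 if the JIT optimises the allocations away).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GCCounter {
    // Sums the collection counts reported by all collectors in this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();

        // Same idea as the loop above: lots of short-lived allocations.
        for (int i = 0; i < 1000000; i++) {
            Object garbage = new int[16]; // unreachable again on the next iteration
        }

        long after = totalCollections();
        System.out.println("Collections during the loop: " + (after - before));
    }
}
```

Running the original program with the -verbose:gc flag is another way to watch the collector at work without touching the code.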

So, don’t use finalize(): there is no guarantee that it will be called, and it’s deprecated now.
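So what do we use instead? For cleanup that has to happen at a known point, the usual approach is to implement AutoCloseable and use try-with-resources. The sketch below is my own variant of the earlier example, not a drop-in replacement for GC: ManagedResource and GCJavaClose are hypothetical names, and close() only runs because the try block ends, not because the object was collected.

```java
class GCJavaClose {
    public static void main(String[] args) {
        // close() is called automatically when the try block ends
        try (ManagedResource resource = new ManagedResource(5)) {
            System.out.println("Using resource");
        }
        System.out.println("End of program");
    }
}

// Hypothetical counterpart of AllocatedResource with deterministic cleanup
class ManagedResource implements AutoCloseable {
    private final int data;
    private boolean closed = false;

    ManagedResource(int x) {
        this.data = x;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
        System.out.println("Deallocating " + data);
    }
}
```

This prints "Using resource", then "Deallocating 5", then "End of program", every time. For cleanup that really must be tied to garbage collection, Java 9 also added java.lang.ref.Cleaner.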

So how does Java manage Garbage Collection?

Java has multiple heaps – Eden, Survivors, Tenured, Permanent Generation

Java, unlike C/C++, has multiple heaps like Eden, Survivors, Tenured and Permanent Generation (the Permanent Generation was replaced by Metaspace in Java 8). Its stack is also slightly different, with an operand stack part for computation (not shown in the Concept Visual, maybe a topic for another blog post).

Java’s Garbage Collector works across 4 of these heaps (Eden, the two Survivors and Tenured) to keep track of “unreachable” objects, which it marks, sweeps and moves across these segments.

Heap       | Purpose
Eden       | New object allocations are stored here
Survivor 1 | After GC, objects from Eden are moved here
Survivor 2 | After GC, objects from Survivor 1 are moved here
Tenured    | Long surviving objects are moved here
Permanent  | Not really object heap but keeps class artefacts
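You can ask a running JVM which of these areas it actually has, using the standard MemoryPoolMXBean API. A small sketch follows; the class name ListHeapPools and its helper poolDescriptions() are mine. The pool names are not fixed: they depend on the JVM and the collector in use (on a recent HotSpot JVM you would typically see something like an Eden space, a Survivor space and an old generation, plus Metaspace as a non-heap pool), so treat the output as informational.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class ListHeapPools {
    // Returns one line per memory pool, e.g. "<pool name> (heap)".
    static List<String> poolDescriptions() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = (pool.getType() == MemoryType.HEAP) ? "heap" : "non-heap";
            lines.add(pool.getName() + " (" + kind + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : poolDescriptions()) {
            System.out.println(line);
        }
    }
}
```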

So what happens when GC runs?
Let’s say we release two references, a and d, by setting them to null. As a result, the objects containing “5” and “8” are now “unreachable”, i.e. no one is pointing to them. So what happens? GC will come running and remove them? NO! From the first example in this post we know that GC doesn’t run as soon as a single object is orphaned. In fact, the GC won’t run unless it starts to run low on heap memory. So the objects will keep lying around in the heap.
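One way to peek at this "unreachable but still lying around" state is a WeakReference, which lets us look at an object without keeping it alive. The sketch below is only illustrative: UnreachableDemo and stillInHeap() are names I made up, System.gc() is merely a request that the JVM is free to ignore, and so the second line of output can differ between runs and JVMs.

```java
import java.lang.ref.WeakReference;

class UnreachableDemo {
    // True if the object behind the weak reference has not been cleared yet.
    static boolean stillInHeap(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object a = new Object();
        WeakReference<Object> watcher = new WeakReference<>(a);

        a = null; // no strong reference is left, the object is now unreachable
        System.out.println("Right after a = null: " + stillInHeap(watcher));

        System.gc(); // a hint only; the collector may or may not run now
        System.out.println("After System.gc(): " + stillInHeap(watcher));
    }
}
```

On a typical run the first line reports true (nothing has collected the object yet) and the second reports false, but neither is guaranteed by the language.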

However, let’s assume that the GC runs, then what happens? Well, GC will mark those objects which are still “reachable”, in this case “6” and “7”, move them to Survivor generation (S0) and then clear the Eden heap.

Mark-Sweep and Move From Eden to Survivor 0 heap

As a result, Eden now becomes empty and the surviving objects are “compacted” into Survivor 0. Why not let objects remain in just one heap and remove those which are unreachable? Well, if they remain where they are, the free space between them will be fragmented, and if a larger object is to be allocated, we may not have enough “contiguous” free space in the heap. That’s the reason existing objects are “moved” to a different heap and “compacted”.
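The mark-and-move step can be sketched as a toy model in plain Java. This is a simulation with lists, not how the JVM is implemented: ToyMinorGC, Obj and minorGC() are hypothetical names, and I am assuming that b and c are the references still pointing at “6” and “7”.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ToyMinorGC {
    // A toy heap object: a value plus references to other objects.
    static final class Obj {
        final int value;
        final List<Obj> refs = new ArrayList<>();
        Obj(int value) { this.value = value; }
    }

    // Marks everything reachable from the roots, copies the marked objects
    // side by side into the survivor space, then empties Eden.
    static void minorGC(List<Obj> roots, List<Obj> eden, List<Obj> survivor) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {
                pending.addAll(current.refs);
            }
        }
        for (Obj o : eden) {
            if (marked.contains(o)) {
                survivor.add(o); // "move": live objects end up contiguous
            }
        }
        eden.clear(); // everything left behind was unreachable
    }

    static List<Integer> values(List<Obj> space) {
        List<Integer> out = new ArrayList<>();
        for (Obj o : space) {
            out.add(o.value);
        }
        return out;
    }

    public static void main(String[] args) {
        Obj five = new Obj(5), six = new Obj(6), seven = new Obj(7), eight = new Obj(8);
        List<Obj> eden = new ArrayList<>(Arrays.asList(five, six, seven, eight));
        // a and d were set to null, so only b -> 6 and c -> 7 remain as roots
        List<Obj> roots = Arrays.asList(six, seven);
        List<Obj> survivor0 = new ArrayList<>();

        minorGC(roots, eden, survivor0);
        System.out.println("Eden: " + values(eden));            // Eden: []
        System.out.println("Survivor 0: " + values(survivor0)); // Survivor 0: [6, 7]
    }
}
```

Notice that nothing is ever "deleted" one object at a time: the survivors are copied out and the whole of Eden is wiped, which is why no gaps are left behind.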

So what does the Process Map look like now?

Post GC, objects are moved and compacted in Survivor 0

Efficient, isn’t it? Periodically, the Garbage Collector removes unreachable objects and moves the remaining ones, compacted, into the Survivor generation. So why 2 Survivor generations? Survivor 0 and Survivor 1?

Well, if memory gets fragmented in Survivor 0, the live objects are moved and compacted into Survivor 1, and vice versa. This way new (and maybe short-lived) objects can be allocated in Eden space while long-surviving objects oscillate between S0 and S1. Huh? They will oscillate forever? Hehe.. no. If an object survives past a threshold, it’s finally promoted to the Tenured generation, where it will no longer be moved around (old people, you see :-)).

Here is what it might look like in Tenured.

Tenured Generation

An object reaches the “Tenured” generation once it has survived a threshold of N GC passes. Interesting, isn’t it?
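The oscillate-then-promote behaviour can also be sketched as a toy simulation. Again, this is a model with lists and an age counter, not JVM code: ToyTenuring, Obj and collect() are hypothetical names, and the threshold of 3 is picked purely for illustration (on HotSpot the real limit is tunable with -XX:MaxTenuringThreshold).

```java
import java.util.ArrayList;
import java.util.List;

class ToyTenuring {
    static final class Obj {
        final String name;
        int age = 0; // number of collections survived so far
        Obj(String name) { this.name = name; }
    }

    // One simulated GC pass: every object in 'from' is assumed to be still
    // reachable. Each one ages by one pass; objects that reach the threshold
    // are promoted to tenured, the rest are copied to the 'to' space.
    static void collect(List<Obj> from, List<Obj> to, List<Obj> tenured, int threshold) {
        for (Obj o : from) {
            o.age++;
            if (o.age >= threshold) {
                tenured.add(o);
            } else {
                to.add(o);
            }
        }
        from.clear();
    }

    public static void main(String[] args) {
        int threshold = 3; // the "N" from the text, chosen for illustration
        List<Obj> eden = new ArrayList<>();
        List<Obj> s0 = new ArrayList<>();
        List<Obj> s1 = new ArrayList<>();
        List<Obj> tenured = new ArrayList<>();
        eden.add(new Obj("long-lived"));

        List<Obj> from = eden;
        List<Obj> to = s0;
        for (int pass = 1; tenured.isEmpty(); pass++) {
            collect(from, to, tenured, threshold);
            System.out.println("Pass " + pass + ": S0=" + s0.size()
                    + " S1=" + s1.size() + " Tenured=" + tenured.size());
            // After the first pass the two survivor spaces take turns.
            List<Obj> next = (to == s0) ? s1 : s0;
            from = to;
            to = next;
        }
    }
}
```

With a threshold of 3 the object goes Eden to S0, S0 to S1, and is then promoted:

Pass 1: S0=1 S1=0 Tenured=0
Pass 2: S0=0 S1=1 Tenured=0
Pass 3: S0=0 S1=0 Tenured=1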
We have now compared non-GC languages like C with library-based GC like C++ smart pointers. Java was one of the first commercial languages to bring in Garbage Collection (the other was Python), and .NET followed with its own Garbage Collection system. In fact, .NET has 8 heaps (compared to Java’s 4) and a 9th has just been added in .NET 5. So, another post?

Garbage Collection Internals Series

  1. Part 1 – C language
  2. Part 2 – C++ Language
  3. Part 3 – Java

4 thoughts on “Garbage Collection Internals (Part 3 – Java)”

    1. Java has Minor GC (Eden), Major GC (Tenured) and Full GC (the entire heap, including the method area and PermGen).
      However, all GC runs start with a Minor GC and scale up if required. So if a Minor GC is triggered, it may move Eden objects to S0/S1, which may lead to Survivors being moved to Tenured, which in turn may lead to Tenured being collected (but not compacted).

      So AFAIK (from the JVM documentation), it’s a chain reaction starting from a Minor GC.
