Friday, April 13, 2007

Garbage Collection...

This is a topic most of you already know about. My senior once asked me to read up on it, and yes, I did.

The Java Virtual Machine has 2 primary jobs:
# Execute Code
# Manage Memory

Memory Management involves 3 main goals:

# Allocate Memory from OS
# Manage Java Allocations
# Remove Garbage Objects

Garbage collection revolves around memory management, and it is one major advantage of Java over C or C++. One of our professors was very particular about freeing every bit of memory we used in C++, and I am happy that in Java there is no need to worry about it.
This is the second good reason for shifting to Java from C++. Not to forget the first reason: "no pointers in Java".

Objects in Java are allocated on the heap: there is a single heap, and every object gets its memory from it.

An object is created in the heap and is garbage-collected after there are no more references to it. Objects cannot be reclaimed or freed by explicit language directives.

Objects become garbage when there are no more references to them. So how do you tell that an object has no more references?

If no active thread can reach the object through a chain of references, it is said to be eligible for GC. (Every program runs as one or more threads, and execution starts from the main method.)
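Here is a minimal sketch of that rule, assuming a WeakReference is an acceptable way to peek at whether an object is still alive (the class and variable names are made up for illustration). As long as one strong reference remains, the object stays reachable, no matter how many other references were dropped.

```java
import java.lang.ref.WeakReference;

final class ReachabilityDemo {

    // Returns true if the tracked object is still alive after a GC request
    // while one strong reference ('second') still points to it.
    static boolean aliveWhileReferenced() {
        StringBuilder first = new StringBuilder("payload");
        StringBuilder second = first;              // second strong reference to the same object
        WeakReference<StringBuilder> tracker = new WeakReference<>(first);

        first = null;                              // drop one reference; 'second' still holds the object
        System.gc();                               // only a request; the JVM may ignore it

        boolean alive = tracker.get() != null;     // a strongly reachable object is not cleared
        // 'second' is used after the check, so the object stays reachable up to here.
        return alive && second.length() == 7;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + aliveWhileReferenced());
    }
}
```

The weak reference does not keep the object alive by itself; it is only there so the sketch can observe the object without counting as a reference.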

The garbage collector checks to see if there are any objects in the heap that are no longer being used by the application. If such objects exist, then the memory used by these objects can be reclaimed. (If no more memory is available for the heap even after collection, then the new operator throws an OutOfMemoryError.)
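A small sketch of the out-of-memory case. In Java the failure surfaces as java.lang.OutOfMemoryError, which is an Error rather than an Exception. The sketch asks for an array of Integer.MAX_VALUE longs (roughly 16 GB); I am assuming that this is more than the heap, or the JVM's array size limit, allows on an ordinary setup. The class and method names are hypothetical.

```java
final class OutOfMemoryDemo {

    // Requests an array far larger than a typical heap and reports whether
    // the JVM answered with OutOfMemoryError.
    static boolean allocationFails() {
        try {
            long[] huge = new long[Integer.MAX_VALUE]; // roughly 16 GB requested
            return huge.length < 0;                    // only reached if the allocation succeeded
        } catch (OutOfMemoryError e) {
            // OutOfMemoryError extends Error, so 'catch (Exception e)' would not see it.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("allocation failed with OutOfMemoryError: " + allocationFails());
    }
}
```

Catching the error like this is fine for a demo, but in a real application there is usually little that can safely be done once the heap is exhausted.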

There are several GC algorithms in use today. Each algorithm is fine-tuned for a particular environment in order to provide the best performance. Let us see one of them.

Mark, Sweep and Compact Algorithm

It is quite simple:
# Mark: identify garbage
# Sweep: Find garbage on heap, de-allocate it
# Compact: collect all empty memory together

Every application has a set of roots. Roots are storage locations that either refer to objects on the managed heap or are set to null. For example, all the global and static object references in an application are considered part of the application's roots. In addition, any local variable or parameter references on a thread's stack are part of the roots. Finally, any CPU registers containing pointers to objects in the managed heap are also part of the roots. The list of active roots is maintained by the runtime and made accessible to the garbage collector's algorithm.



When the garbage collector starts running, it makes the assumption that all objects in the heap are garbage. In other words, it assumes that none of the application's roots refer to any objects in the heap. Now, the garbage collector starts walking the roots and building a graph of all objects reachable from the roots. For example, the garbage collector may locate a global variable that points to an object in the heap.

The collector continues to walk through all reachable objects recursively.

Once this part of the graph is complete, the garbage collector checks the next root and walks the objects again. As the garbage collector walks from object to object, if it attempts to add an object to the graph that it previously added, then the garbage collector can stop walking down that path. This serves two purposes. First, it helps performance significantly since it doesn't walk through a set of objects more than once. Second, it prevents infinite loops should you have any circular linked lists of objects.
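The walk described above can be sketched on a toy object graph. This is only an illustration under my own assumptions, not how any real JVM implements marking: Node is a hypothetical stand-in for a heap object, and the roots are passed in as a plain list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MarkPhaseSketch {

    // Hypothetical stand-in for a heap object: a name plus outgoing references.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Walks the graph from every root and returns the set of reachable nodes.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();       // Node keeps identity equals/hashCode
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (!marked.add(current)) {
                continue;                         // already in the graph: stop walking this path
            }
            pending.addAll(current.refs);         // follow outgoing references later
        }
        return marked;
    }

    // Builds root -> a -> b -> a (a cycle) plus an unreachable pair c -> d,
    // then counts how many nodes the mark phase reaches.
    static int demoReachableCount() {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        root.refs.add(a);
        a.refs.add(b);
        b.refs.add(a);                            // circular reference
        c.refs.add(d);                            // nothing points at c, so c and d are garbage
        return mark(Collections.singletonList(root)).size();
    }

    public static void main(String[] args) {
        System.out.println("reachable objects: " + demoReachableCount()); // prints 3
    }
}
```

The check on marked.add is what gives both benefits mentioned above: no node is processed twice, and the a/b cycle cannot cause an endless loop.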

Once all the roots have been checked, the garbage collector's graph contains the set of all objects that are somehow reachable from the application's roots; any objects that are not in the graph are not accessible by the application, and are therefore considered garbage. The garbage collector now walks through the heap linearly, looking for contiguous blocks of garbage objects (now considered free space). The garbage collector then shifts the non-garbage objects down in memory (with a plain block copy, much like the memcpy function familiar from C), removing all of the gaps in the heap. Since moving an object changes its address, the collector must also update every root and reference that pointed to a moved object.
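The sweep and compact steps can be sketched on a toy heap as well. I am assuming a heap modelled as an array of slots, one object label per slot, with a boolean array carrying the result of the mark phase; both are simplifications made up for illustration. A real collector deals with objects of different sizes and also fixes up references to the moved objects, which this sketch leaves out.

```java
import java.util.Arrays;

final class SweepCompactSketch {

    // heap[i] holds one object label; live[i] says whether slot i was marked reachable.
    // Slides live objects toward index 0, clears the remaining slots, and
    // returns the index of the first free slot.
    static int sweepAndCompact(String[] heap, boolean[] live) {
        int next = 0;                              // where the next live object will land
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                heap[next] = heap[i];              // shift the survivor down over the gaps
                next++;
            }
        }
        Arrays.fill(heap, next, heap.length, null); // everything past 'next' is free space now
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] live = {true, false, true, false, true};
        int firstFree = sweepAndCompact(heap, live);
        // prints: [A, C, E, null, null], first free slot: 3
        System.out.println(Arrays.toString(heap) + ", first free slot: " + firstFree);
    }
}
```

After compaction all free space sits in one block at the end, so the next allocation only has to bump the first-free index.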


There are many other garbage collection algorithms too.

But what we need to know about them is:
1) We never know when they get executed.
The collector typically runs when memory gets low. We cannot force it to start. We can ask the JVM to run it, but we cannot demand it: when we request a GC, the JVM may or may not perform one.

To ask the JVM, use


System.gc(); // static method on System

or

Runtime rt = Runtime.getRuntime(); // Runtime has no public constructor
rt.gc();
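A short sketch of making the request and looking at heap usage around it. The Runtime instance comes from Runtime.getRuntime(); the helper name usedMemory is my own. The numbers printed depend on the JVM and the moment, so no particular output should be expected.

```java
final class GcRequestDemo {

    // Bytes of heap currently in use, as reported by the Runtime object.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedMemory();
        System.gc();                               // a request only; the JVM decides what to do
        long after = usedMemory();
        System.out.println("used before: " + before + " bytes, after: " + after + " bytes");

        // Every call hands back the same Runtime object for the running application.
        System.out.println("same Runtime instance: "
                + (Runtime.getRuntime() == Runtime.getRuntime()));
    }
}
```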

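2) We can make an object eligible for garbage collection by setting the reference to it to null.

{
// null can only be assigned to object references, not to primitives such as int
StringBuilder sb = new StringBuilder("5");
System.out.println(sb);
sb = null; // no reference is left, so the object is eligible for GC
}
Once an object has done its job, we can set the reference to null to make the object eligible for garbage collection, provided no other reference still points to it.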
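To see this in action, here is a hedged sketch that drops the only strong reference to an object and then asks for a collection. It uses a WeakReference purely to observe the object, and the class and method names are hypothetical. Whether the object is actually reclaimed after the request is up to the JVM, so the second value may come out either way.

```java
import java.lang.ref.WeakReference;

final class NullReferenceDemo {

    // Returns {aliveBefore, collectedAfter}. Only aliveBefore is guaranteed to be true;
    // collectedAfter depends on whether the JVM honoured the GC request.
    static boolean[] dropLastReference() {
        byte[] data = new byte[1024 * 1024];
        WeakReference<byte[]> tracker = new WeakReference<>(data);

        // 'data' is still in use here, so the object must still be alive.
        boolean aliveBefore = tracker.get() != null && data.length > 0;

        data = null;                               // the last strong reference is gone
        System.gc();                               // request, not a command

        boolean collectedAfter = tracker.get() == null;
        return new boolean[] {aliveBefore, collectedAfter};
    }

    public static void main(String[] args) {
        boolean[] result = dropLastReference();
        System.out.println("alive before nulling: " + result[0]);
        System.out.println("collected after request: " + result[1]);
    }
}
```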

3) It doesn't guarantee enough space for your application.
When you run out of memory, you can't rely on the GC to free enough memory to keep your application running. It removes only what it considers garbage, and even after that you might still run out of memory.
