Posts Tagged ‘jvm’

Garbage collection of the permanent generation (permgen)

November 1st, 2009

There is often confusion about what the permanent generation contains and, given its name, whether it can be garbage collected. To cut a long story short, the permanent generation can be garbage collected, and it is where reflective class metadata and string constants are stored.

This blog entry takes a practical approach to answer the following questions:

  • Can the permanent generation be garbage collected?
  • Does the value of -Xmx (the maximum heap size) include the permanent generation?

Can the permanent generation be garbage collected?

Yes, it can. A common misconception is that string constants cannot be garbage collected. However, the only requirement that the JVM specification stipulates is that identity comparison works for constant string values. There is no requirement that the constant be the same object throughout the lifetime of the JVM. Hence, if there are no references left to a string constant, there is no reason why it cannot be garbage collected.

To prove this, take the following code:

 
public class PermGenDemo {
  public static void main(String[] args) {
    int i = 0;
    while (true) {
      ("string-" + ++i).intern();
    }
  }
}

If the above code is run using the following command:

 
$ java -verbose:gc -XX:PermSize=8m -XX:MaxPermSize=64m PermGenDemo

The following output will be seen (note output cut for brevity):

 
[Full GC [PSYoungGen: 32K->0K(28928K)] . . . [PSPermGen: 65535K->6461K(65536K)]]
[Full GC [PSYoungGen: 32K->0K(31168K)] . . . [PSPermGen: 65535K->6461K(65536K)]]

The above shows that the permanent generation is collected (its occupancy drops from 65535K to 6461K) and that, when it requires collecting, it triggers a full collection.
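The identity guarantee for string constants mentioned earlier can also be checked directly. The following is a minimal sketch (the class and method names are illustrative and not part of the original demo): a string built at runtime is a distinct object from the literal, while its interned form is the very same object as the literal.

```java
public class InternIdentityDemo {

  // Builds "permgen" at runtime so that the result is a fresh object,
  // not the constant the compiler placed in the string pool.
  static String buildAtRuntime() {
    return new StringBuilder("perm").append("gen").toString();
  }

  // A freshly built string is equal to, but not the same object as, the literal.
  static boolean distinctBeforeIntern() {
    String built = buildAtRuntime();
    return built.equals("permgen") && built != "permgen";
  }

  // intern() returns the canonical instance, which is the literal itself.
  static boolean identicalAfterIntern() {
    return buildAtRuntime().intern() == "permgen";
  }

  public static void main(String[] args) {
    System.out.println("distinct before intern: " + distinctBeforeIntern());
    System.out.println("identical after intern: " + identicalAfterIntern());
  }
}
```

Nothing here pins the canonical instance for the lifetime of the JVM; identity only has to hold while some reference to the constant still exists.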

Does the value of -Xmx (the maximum heap size) include the permanent generation?

No, it does not. The size of the permanent generation is controlled by the following JVM options:

  • -XX:MaxPermSize controls the maximum permanent generation size.
  • -XX:PermSize controls the initial permanent generation size.

The permanent generation is like any other generation: it can grow up to a maximum, can in principle shrink, and can be garbage collected.

To demonstrate that the permanent generation is not included in the maximum heap setting, pmap can be used; the last line of its output gives the total mapped size of the process (e.g. pmap <pid> | tail -1).

The following table illustrates the output of pmap using different JVM heap and permanent generation settings:

JVM options                                                      pmap output
java -Xms256m -Xmx256m -XX:PermSize=64m -XX:MaxPermSize=64m      512204K
java -Xms512m -Xmx512m -XX:PermSize=64m -XX:MaxPermSize=64m      760556K
java -Xms512m -Xmx512m -XX:PermSize=128m -XX:MaxPermSize=128m    825188K

In the above, -Xms is set equal to -Xmx and -XX:PermSize equal to -XX:MaxPermSize deliberately, to stop the heap and permanent generation resizing and to ensure constant results. The first value of 512204K gives us a base value which takes into account any overhead introduced by native libraries (including the JVM itself) being mapped into memory. The second value of 760556K is an increase of 248352K, close to the 256m (262144K) by which the heap was increased. The third value of 825188K is a further increase of 64632K, close to the 64m (65536K) by which the permanent generation was increased.
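The arithmetic on the three pmap totals can be worked through as follows. This is only a sketch over the figures in the table; the class and method names are illustrative.

```java
public class PmapDelta {

  // Converts a size given in megabytes (as in -Xmx256m) to kilobytes.
  static long megabytesToKilobytes(long megabytes) {
    return megabytes * 1024;
  }

  // Growth in total mapped memory between two pmap readings, in kilobytes.
  static long growth(long beforeKb, long afterKb) {
    return afterKb - beforeKb;
  }

  public static void main(String[] args) {
    long base = 512204;           // -Xmx256m, -XX:MaxPermSize=64m
    long biggerHeap = 760556;     // -Xmx512m, -XX:MaxPermSize=64m
    long biggerPermGen = 825188;  // -Xmx512m, -XX:MaxPermSize=128m

    System.out.println("heap +256m: mapped memory grew by "
        + growth(base, biggerHeap) + "K, requested "
        + megabytesToKilobytes(256) + "K");
    System.out.println("permgen +64m: mapped memory grew by "
        + growth(biggerHeap, biggerPermGen) + "K, requested "
        + megabytesToKilobytes(64) + "K");
  }
}
```

The observed growth is 248352K for the heap change and 64632K for the permanent generation change, each within a few percent of the requested 262144K and 65536K.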

The above results show that the permanent generation is sized separately from the heap: increasing -XX:MaxPermSize adds to the process footprint on top of -Xmx.
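The same separation can be observed from inside a running JVM through the standard java.lang.management API, which reports each memory pool as either heap or non-heap. The sketch below is an illustration rather than part of the original experiment; the class name is made up, and the pool names printed depend on the JVM and collector in use (on JVMs that no longer have a permanent generation, a differently named non-heap pool appears in its place).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class MemoryPoolReport {

  // Formats one pool as "name [type] used=...K max=...K".
  static String describe(MemoryPoolMXBean pool) {
    MemoryUsage usage = pool.getUsage();
    if (usage == null) {
      return pool.getName() + " [" + pool.getType() + "] (no longer valid)";
    }
    // getMax() returns -1 when the pool has no defined maximum.
    String max = usage.getMax() < 0 ? "undefined" : (usage.getMax() / 1024) + "K";
    return pool.getName() + " [" + pool.getType() + "] used="
        + (usage.getUsed() / 1024) + "K max=" + max;
  }

  // Counts the pools of the given type (HEAP or NON_HEAP).
  static int countPools(MemoryType type) {
    int count = 0;
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if (pool.getType() == type) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      System.out.println(describe(pool));
    }
  }
}
```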

Conclusion

  • The permanent generation can be garbage collected like other generations.
  • The permanent generation is memory allocated in addition to the Java heap.
  • -XX:MaxPermSize controls the maximum memory used by the permanent generation.
  • -XX:PermSize controls the initial memory used by the permanent generation.


The subtleties of overriding package private methods

May 3rd, 2009

The Java language provides four levels of access control for methods: public, private, protected and package private (the default when no modifier is given).

In general we assume the following rules govern these access modifiers:

  • A public method can be accessed by any other method.
  • A private method can only be accessed by the class that it is declared in.
  • A protected method can be accessed by any class in the same package or any subclass.
  • A package private method can be accessed by any class in the same package.

The above would be accepted as suitable answers at most interviews. However, there are some subtleties in the behaviour of package private that cannot be explained using this simple definition alone.

Take the following piece of code that represents a Square:

 
package uk.co.cooljeff.access;
public class Square {
  private float length;
  public Square(float length) {
    this.length = length;
  }
  public float calculateArea() {
    return length * length;
  }
  String getColour() {
    return "Red";
  }
}

Take the following piece of code that represents a CustomSquare:

 
package uk.co.cooljeff.access;
import uk.co.cooljeff.access.Square;
public class CustomSquare extends Square {
  public CustomSquare() {
    super(5);
  }
  @Override
  String getColour() {
    return "Blue";
  }
}

From the simple definition of package private above, we would expect the following piece of code to print out Blue as the colour of CustomSquare:

 
package uk.co.cooljeff.access;
public class Printer {
  public void print(Square square) {
    System.out.println("Square of type " + square.getClass()
                                   + " has colour " + square.getColour());
  }
  public static void main(String[] args) {
    Printer printer = new Printer();
    printer.print(new CustomSquare());
  }
}

For the majority of people who run the above code, Blue is indeed printed. However, for one of my colleagues at work who tried to do something similar, Red was always observed.

To try to explain why Red is sometimes seen, we need to understand what the JVM is trying to do. If you disassemble the Printer class (for example with javap -c) you will see that the invokevirtual instruction is used to invoke the Square.getColour() method:

 
public void print(uk.co.cooljeff.access.Square);
  Code:
   25:	invokevirtual	#59; //Method uk/co/cooljeff/access/Square.getColour:()Ljava/lang/String;
   37:	return

For virtual methods, there is a lookup algorithm which is used to locate the exact method to invoke. Specifically, the JVM Specification gives the following for the invokevirtual instruction (paraphrasing):

Let C be the class of the target of the method invocation. The actual method to be invoked is selected by the following lookup procedure:

  • If C contains a declaration for an instance method M with the same name and descriptor as the resolved method, and the resolved method is accessible from C, then M is the method to be invoked, and the lookup procedure terminates.

Hence, using this definition, I would expect that CustomSquare.getColour() would be considered, since it has the same name and descriptor as Square.getColour(). The question then becomes whether this method is accessible from Printer. To answer this we need to look at the Access Control (5.4.4) definition in the JVM Specification, which states (paraphrasing):

A field or method R is accessible to a class or interface D if and only if any of the following conditions are true:

  • R is package private and is declared by a class in the same runtime package as D.

There is a subtlety here in that the phrase runtime package is used. At runtime a class is uniquely defined by its fully qualified name and its ClassLoader. The runtime package takes into account not only the compile time package but also the ClassLoader that loaded the class.

Given that the ClassLoader could have something to do with why Red is sometimes observed, let's redefine our Printer to load the CustomSquare in a different ClassLoader:

 
package uk.co.cooljeff.access;
import java.net.URL;
import java.net.URLClassLoader;
public class Printer {
  public void print(Square square) {
    System.out.println("Square of type " + square.getClass() + " has colour " + square.getColour());
  }
  public static void main(String[] args) throws Exception {
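    // URL of a jar or classes directory containing CustomSquare. CustomSquare must
    // not also be on the application classpath, otherwise parent delegation means
    // the application ClassLoader loads it and Blue is printed.
    String customSquareURL = System.getProperty("customsquare.classpath");
    URL[] urls = new URL[] { new URL(customSquareURL) };
    URLClassLoader loader = new URLClassLoader(urls);
    Class<?> clazz = loader.loadClass("uk.co.cooljeff.access.CustomSquare");
    Printer printer = new Printer();
    printer.print((Square) clazz.getDeclaredConstructor().newInstance());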
  }
}

When CustomSquare is loaded by a different ClassLoader, the output is Red, which is what my colleague observed.

At this stage, although the behaviour is reproducible, it cannot be explained using the 2nd Edition of the JVM Specification alone. The specification clearly states, in the definition of method resolution, that an IllegalAccessError should be thrown if the method is not accessible, and the definition of accessibility clearly states that CustomSquare.getColour() is not accessible. So why do we not get an IllegalAccessError in either Sun’s HotSpot JVM or IBM’s JVM?

The behaviour made sense to me because it provides a way to prevent hacks such as the one my colleague was attempting. However, I could not prove it using the JVM Specification. It was time to call on the experts, so I started a mail thread titled "invokevirtual on package private override in different classloader" on the hotspot-dev mailing list.

As it turns out, there was an amendment made to the 2nd Edition of the JVM Specification relating to this behaviour which revises the definition of invokevirtual. Specifically, it replaces the accessibility constraint with an override constraint. Hence, to explain the behaviour we don’t look at whether CustomSquare.getColour() is accessible; we need to determine whether it overrides Square.getColour().

The answer to the override question is no: a package private method can only be overridden by a method declared in the same runtime package, and here the two classes are defined by different ClassLoaders. This explains why the behaviour is observed. However, as pointed out on the HotSpot mail thread, I was not happy that the amendment to the JVM Specification referred to a language definition (namely override). At the moment, in order to explain the behaviour, we need to use the revised definition of override in the Java Language Specification. This goes against the fundamental principle that the Java Virtual Machine Specification is decoupled from the Java Language Specification. This was acknowledged on the mail thread: Alex Buckley replied informing me that the 3rd edition of the JVM Specification is independent of the Language Specification. Specifically, the 3rd edition (not publicly available yet) drops the chapter on Java Programming Language Concepts entirely and gains a JVM specific definition of overrides.
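The whole effect can be reproduced in a single source file, without a separate jar. The following is a sketch under some assumptions that differ from the listings above: Square and CustomSquare are nested classes in the unnamed package, and IsolatingLoader is a hypothetical helper that defines CustomSquare itself from the class file visible to its parent, while delegating every other class (including Square) to the parent. It assumes the compiled class files can be read as resources from the parent ClassLoader.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class PackagePrivateOverrideDemo {

  public static class Square {
    public Square() {}
    String getColour() { return "Red"; }  // package private
  }

  public static class CustomSquare extends Square {
    public CustomSquare() {}
    @Override
    String getColour() { return "Blue"; } // overrides only within the same runtime package
  }

  // Defines the one isolated class itself; everything else goes to the parent.
  static class IsolatingLoader extends ClassLoader {
    private final String isolated;

    IsolatingLoader(ClassLoader parent, String isolated) {
      super(parent);
      this.isolated = isolated;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
      if (!name.equals(isolated)) {
        return super.loadClass(name, resolve);
      }
      synchronized (getClassLoadingLock(name)) {
        Class<?> c = findLoadedClass(name);
        if (c == null) {
          byte[] bytes = readClassBytes(name);
          c = defineClass(name, bytes, 0, bytes.length);
        }
        if (resolve) {
          resolveClass(c);
        }
        return c;
      }
    }

    // Reads the class file bytes through the parent loader's resources.
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
      String resource = name.replace('.', '/') + ".class";
      try (InputStream in = getParent().getResourceAsStream(resource)) {
        if (in == null) {
          throw new ClassNotFoundException(name);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
          out.write(buffer, 0, n);
        }
        return out.toByteArray();
      } catch (IOException e) {
        throw new ClassNotFoundException(name, e);
      }
    }
  }

  // The call site is compiled as invokevirtual Square.getColour().
  static String colourSeenThroughSquare(Square square) {
    return square.getColour();
  }

  static String colourWithSameLoader() {
    return colourSeenThroughSquare(new CustomSquare());
  }

  static String colourWithIsolatedLoader() {
    try {
      String name = CustomSquare.class.getName();
      ClassLoader parent = PackagePrivateOverrideDemo.class.getClassLoader();
      Class<?> clazz = new IsolatingLoader(parent, name).loadClass(name);
      Square square = (Square) clazz.getDeclaredConstructor().newInstance();
      return colourSeenThroughSquare(square);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("same ClassLoader:      " + colourWithSameLoader());
    System.out.println("different ClassLoader: " + colourWithIsolatedLoader());
  }
}
```

With the same ClassLoader the call resolves to CustomSquare.getColour(); with the isolating loader the two classes are in different runtime packages, so CustomSquare.getColour() is not an override and Square.getColour() is selected.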

So I’m happy now, I can explain the behaviour :)

I’d like to thank the following people who helped me understand the behaviour discussed in this blog entry: Karen Kinnear (Sun Microsystems), David Holmes (Sun Microsystems) and Alex Buckley (Sun Microsystems).


Detecting all running JVMs

April 5th, 2009

Several well-known JVM tools appear to be able to magically detect the Java processes running on a system. Good examples of this are jps and VisualVM. As of Java 6 you can even find out for yourself which Java processes that you have access to are running on the local machine, using the Dynamic Attach API. This API can be found in tools.jar (JAVA_HOME/lib/tools.jar).

The following code illustrates how the Dynamic Attach API can be used to list all of the local virtual machine identifiers (which for the HotSpot VM map onto the JVM PID):

 
package uk.co.cooljeff.dynamicattach;
import java.util.List;
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
public class JVMFinder {
  public static void main(String[] args) {
    List<VirtualMachineDescriptor> vmDescriptors = VirtualMachine.list();
    for (VirtualMachineDescriptor vmDescriptor : vmDescriptors) {
      System.out.println("Name: " + vmDescriptor.displayName()
                                                    + " PID: " + vmDescriptor.id());
    }
  }
}

The above gives the following output on my machine (cut out a few for brevity):

 
Name: uk.co.cooljeff.dynamicattach.JVMFinder PID: 21116

I’m about to spoil the fun now for those of you who were hoping to see some wacky OS dependent mechanism or hidden RMI registry for finding this information. The solution is quite simple: each JVM dumps a file containing performance data to a standard location, and the tools scan that location to work out which JVMs are running on a host.

Specifically, the performance data is for jvmstat. However, it is this file that forms the basis of the Dynamic Attach implementation that HotSpot uses.

The following piece of code replicates what the LocalVmManager (part of jvmstat) does to locate running Java processes:

 
package uk.co.cooljeff.dynamicattach;
import java.io.File;
import java.io.FilenameFilter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import sun.jvmstat.perfdata.monitor.protocol.local.PerfDataFile;
public class JVMDataLocation {
  public static void main(String[] args) throws Exception {
    final Pattern filePattern = Pattern.compile(PerfDataFile.userDirNamePattern);
    final Matcher fileMatcher = filePattern.matcher("");
    FilenameFilter fileFilter = new FilenameFilter() {
      public boolean accept(File dir, String name) {
        fileMatcher.reset(name);
        return fileMatcher.matches();
      }
    };
    File[] files = new File(PerfDataFile.getTempDirectory()).listFiles(fileFilter);
    for (File file : files) {
      System.out.println("PerfDataDir: " + file);
    }
  }
}

This gives the following on my machine:

 
PerfDataDir: /tmp/hsperfdata_root
PerfDataDir: /tmp/hsperfdata_tomcat55
PerfDataDir: /tmp/hsperfdata_jeffsinc

If you take a look at one of these directories you will find one file per Java process run by that user. The name of the file is the local virtual machine identifier (lvmid), which for HotSpot corresponds to the process id. If you run strings on the file you will see all of the properties relating to that JVM instance.

It is as simple as that: on Linux the JVM simply looks for all files that match /tmp/hsperfdata_*/*.
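The same scan can be sketched using only java.io, without the internal sun.jvmstat classes. This is a simplified illustration, not what jvmstat itself does: the class name, the hsperfdata_ prefix check and the digits-only name check are assumptions based on the directory listing above, and the temporary directory is taken from the java.io.tmpdir system property.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerfDataScan {

  // Returns the names of all entries matching <tmpDir>/hsperfdata_*/<digits>.
  static List<String> findLvmids(File tmpDir) {
    List<String> lvmids = new ArrayList<String>();
    File[] userDirs = tmpDir.listFiles();
    if (userDirs == null) {
      return lvmids; // tmpDir does not exist or is not readable
    }
    Arrays.sort(userDirs);
    for (File userDir : userDirs) {
      if (!userDir.isDirectory() || !userDir.getName().startsWith("hsperfdata_")) {
        continue;
      }
      String[] names = userDir.list();
      if (names == null) {
        continue; // another user's directory that we cannot read
      }
      Arrays.sort(names);
      for (String name : names) {
        if (name.matches("[0-9]+")) {
          lvmids.add(name);
        }
      }
    }
    return lvmids;
  }

  public static void main(String[] args) {
    File tmpDir = new File(System.getProperty("java.io.tmpdir"));
    for (String lvmid : findLvmids(tmpDir)) {
      System.out.println("lvmid: " + lvmid);
    }
  }
}
```

Directories belonging to other users are skipped when they cannot be read, which matches the earlier point that you only see the Java processes you have access to.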
