Java
Java is a general-purpose, high-level, object-oriented programming language released in 1995 by Sun Microsystems (acquired by Oracle in 2010). One of the principal design goals for Java was a high degree of portability across platforms, summarized by the slogan "write once, run anywhere". This is achieved by compiling Java source code to 'byte code', which then runs inside a Java virtual machine (JVM) that provides a uniform environment across numerous architectures and platforms. This portability has made Java a popular choice in some environments, and it is also widely used as a language for teaching programming. While performance was not one of the original design goals for Java, there are ways to help Java code run quickly, and the language is used in some scientific domains such as the life sciences, for example in the Broad Institute's GATK. This page is not designed to teach the Java programming language but merely to provide some tips and hints for the use of Java in a high-performance computing environment such as Compute Canada.
Compute Canada's systems have several different Java virtual machines installed which are made available to users via the module command like other software packages. You should normally only have one Java module loaded at a time. The principal commands associated with such Java modules are java to launch the Java virtual machine and javac to call the Java compiler for converting a Java source file into byte code.
Parallelism in Java
Threading
Java includes built-in support for threading, obviating the need for separate interfaces and libraries like OpenMP, pthreads and Boost threads used in other languages. The principal Java object for handling concurrency is the Thread class, which a programmer can use either by passing an object that implements the Runnable interface to the standard Thread class or by subclassing the Thread class. As an example of this second approach, consider the following toy program:
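public class HelloWorld extends Thread {
    // The body of run() is what executes in the new thread.
    @Override
    public void run() {
        System.out.println("Hello World!");
    }

    public static void main(String[] args) {
        // start() creates the new thread, which then calls run().
        (new HelloWorld()).start();
    }
}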
This second approach is generally the simplest to use but suffers from the drawback that Java does not permit multiple inheritance, so the class which implements multithreading wouldn't be able to subclass any other, potentially more useful, class.
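For comparison, the first approach (passing a Runnable to a Thread) might look like the minimal sketch below. The class names BaseTask and HelloRunnable and the helper runInNewThread are hypothetical, chosen only for illustration. The point is that the class remains free to extend another class while still being runnable in its own thread.

```java
// Hypothetical parent class, standing in for some "more useful" class to inherit from.
class BaseTask {
    String describe() {
        return "Hello World!";
    }
}

// Implements Runnable instead of extending Thread, so it can still extend BaseTask.
class HelloRunnable extends BaseTask implements Runnable {
    // Set by run() in the worker thread; read by the caller only after join().
    private String message;

    @Override
    public void run() {
        message = describe();
        System.out.println(message);
    }

    // Hypothetical helper: runs one task in a new thread and waits for it to finish.
    static String runInNewThread() {
        HelloRunnable task = new HelloRunnable();
        Thread thread = new Thread(task);
        thread.start();
        try {
            thread.join(); // after join() returns, the worker's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return task.message;
    }

    public static void main(String[] args) {
        runInNewThread();
    }
}
```

Because Runnable is an interface rather than a class, this pattern sidesteps the single-inheritance restriction at the cost of one extra object (the Thread wrapping the task).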
MPI and Java
One common method for using MPI-style parallelism in a Java program is the MPJ Express library.
Pitfalls
Memory Issues
Java uses an automatic system called garbage collection to identify objects which are no longer reachable from the running program and to reclaim the memory associated with them. This does not, however, stop many Java programs from requiring significant amounts of memory to run correctly. When a Java virtual machine is launched using the java command, it is given a default maximum heap size, which may be inadequate. To correct this problem, you can tell the Java virtual machine the maximum amount of heap memory to use with the command line option -Xmx, for instance
[name@server ~]$ java -Xmx4096m -jar file.jar
tells the Java virtual machine that it can use up to 4096 MB of memory.
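To check which limit a running virtual machine actually has, one option is to query the standard Runtime class from inside the program. The sketch below uses a hypothetical class name, HeapReport; the value reported for the maximum heap may be slightly below the requested figure, depending on the JVM and garbage collector in use.

```java
// Hypothetical utility class that prints the heap limits seen by the running JVM.
class HeapReport {
    private static final long MIB = 1024L * 1024L;

    // Maximum heap the JVM will attempt to use, in MiB (governed by -Xmx when given).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Maximum heap (MiB):        " + maxHeapMiB());
        System.out.println("Currently reserved (MiB):  " + rt.totalMemory() / MIB);
        System.out.println("Free within reserved (MiB): " + rt.freeMemory() / MIB);
    }
}
```

Running it once without options and once as java -Xmx4096m HeapReport shows the effect of the option on the reported maximum.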
The volatile Keyword
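This keyword has a sense very different from that used in C/C++. In Java, volatile applied to a variable ensures that its value is always read from and written to main memory rather than a thread-local cached copy, so a write made by one thread is visible to subsequent reads in every other thread. It does not make compound operations such as incrementing a counter atomic; those still require synchronization or the classes in java.util.concurrent.atomic.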
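A typical use of volatile is a flag that one thread sets to ask another to stop. The following is a minimal sketch with hypothetical names (StopFlag, running, stopsWhenAsked); without volatile on the flag, the worker thread would not be guaranteed to ever observe the change.

```java
// Hypothetical example: a worker loop controlled by a volatile stop flag.
class StopFlag implements Runnable {
    // volatile guarantees that the write in stop() becomes visible to the loop in run().
    private volatile boolean running = true;

    // Only modified by the worker thread, so it needs no volatile of its own.
    private long iterations = 0;

    @Override
    public void run() {
        while (running) {
            iterations++;
        }
    }

    // Called from another thread to request termination.
    void stop() {
        running = false;
    }

    // Starts a worker, asks it to stop, and reports whether it ended within 5 seconds.
    static boolean stopsWhenAsked() {
        StopFlag task = new StopFlag();
        Thread worker = new Thread(task);
        worker.start();
        task.stop();
        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker stopped: " + stopsWhenAsked());
    }
}
```

Note that iterations++ is safe here only because a single thread updates it; a counter shared between several writers would need an AtomicLong or a synchronized block, since volatile alone does not make the increment atomic.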
Further Reading
Scott Oaks and Henry Wong, Java Threads: Understanding and Mastering Concurrent Programming (3rd edition) (O'Reilly, 2004)