
JavaParty Features

Transparent remote objects

JavaParty allows easy porting of multi-threaded Java programs to distributed environments such as clusters. Regular Java already supports parallel applications with threads and synchronization mechanisms. While multi-threaded Java programs are limited to a single address space, JavaParty extends the capabilities of Java to distributed computing environments.

The usual way of porting a parallel application to a distributed environment is to use a communication library. Java's Remote Method Invocation (RMI) makes it unnecessary to implement communication protocols, but it still increases program complexity. The reasons are the limited capabilities of RMI and the additional functionality that must be implemented for creating and accessing remote objects.

The JavaParty approach is different. JavaParty classes can be declared as remote. While regular Java classes are limited to one Java virtual machine, remote classes and their instances are visible and accessible anywhere in the distributed JavaParty environment. As far as remote classes are concerned, the JavaParty environment can be viewed as a Java virtual machine that is distributed over several computers.

Creating and accessing instances of remote classes is syntactically indistinguishable from creating and accessing instances of regular Java classes.

Object creation

Instances are created anywhere in the environment with the new operator and are immediately accessible from the entire JavaParty environment, without having to be explicitly exported or bound to a name in a registry as in RMI.
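For contrast, the following is a minimal sketch of the explicit export and binding steps that plain RMI requires and that JavaParty, as described above, makes unnecessary. It is standard Java, not JavaParty code. The names RmiBoilerplateDemo, Counter, CounterImpl and the binding name "counter" are made up for illustration, and pinning the host name to the loopback address is an assumption made only so that the demo runs inside a single JVM.

```java
import java.rmi.AlreadyBoundException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

final class RmiBoilerplateDemo {

    /** Remote interface: in RMI every remote method must declare RemoteException. */
    public interface Counter extends Remote {
        int increment() throws RemoteException;
    }

    /** Hypothetical implementation; it is not remotely reachable until it is exported. */
    static final class CounterImpl implements Counter {
        private int value;

        @Override
        public synchronized int increment() {
            return ++value;
        }
    }

    /** Runs the export, bind, lookup and call cycle and returns the second counter value. */
    static int run() {
        // Assumption: keep the demo on the loopback interface.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        try {
            CounterImpl impl = new CounterImpl();
            // Step 1: explicit export (port 0 selects an anonymous port).
            Counter stub = (Counter) UnicastRemoteObject.exportObject(impl, 0);
            // Step 2: explicit registry, here created in-process on an anonymous port.
            Registry registry = LocateRegistry.createRegistry(0);
            try {
                // Step 3: explicit binding of the stub to a name.
                registry.bind("counter", stub);
                // Step 4: clients look the object up by name and call through the stub.
                Counter client = (Counter) registry.lookup("counter");
                client.increment();
                return client.increment();
            } finally {
                // Unexport so that RMI threads do not keep the JVM alive.
                UnicastRemoteObject.unexportObject(impl, true);
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } catch (RemoteException | AlreadyBoundException | NotBoundException e) {
            throw new IllegalStateException("RMI demo failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second increment returned " + run());
    }
}
```

In a JavaParty program none of the four numbered steps would appear in application code; according to the description above, creating the instance with new is sufficient.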

Instance access

Methods and fields of remote objects can be accessed in the same way as those of local Java objects. No additional exceptions need to be handled, because JavaParty is targeted towards cluster computing, where network outages and failures of single computing nodes are rare. In the event of failure, either the system software of the cluster is expected to recover by resetting to the last checkpoint, or the complete application fails and terminates.

Class access

Just like regular Java classes, remote classes have a runtime representation in the distributed environment and are accessible from the application with the same constructs as in standard Java. Each class is loaded exactly once in the distributed environment. The state of a class consists of all its static variables. These static variables and all static methods of the class can be accessed as in regular Java.

Object mobility

Location transparency ends where performance considerations come into play. Even in state-of-the-art distributed systems (KaRMI, uka.transport, ParaStation), the latency of accessing a remote object is orders of magnitude higher than that of accessing a local object. A parallel program executed in a distributed JavaParty environment can therefore only utilize the full power of the parallel machine if communication is minimized. Object migration is one way of adapting the distribution layout to the changing locality requirements of the application. Manual object placement is also possible. For details see the JavaParty syntax section later on.
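A back-of-the-envelope calculation illustrates why minimizing communication matters. The sketch below is plain Java and uses no JavaParty API. The two latency constants are assumptions picked only to show the order-of-magnitude effect, not measurements, and modelling a migration as a single remote round trip is a deliberate simplification.

```java
final class RemoteAccessCost {

    // Hypothetical figures chosen for illustration only, not measurements.
    static final long LOCAL_CALL_NANOS = 10;
    static final long REMOTE_CALL_NANOS = 50_000;

    /** Total time if every access is a separate remote call. */
    static long fineGrained(long accesses) {
        return accesses * REMOTE_CALL_NANOS;
    }

    /**
     * Total time if the object is first co-located with its user
     * (modelled as one remote round trip) and then accessed locally.
     */
    static long coLocated(long accesses) {
        return REMOTE_CALL_NANOS + accesses * LOCAL_CALL_NANOS;
    }

    public static void main(String[] args) {
        long accesses = 1_000_000;
        System.out.printf("fine-grained: %d ms, co-located: %d ms%n",
                fineGrained(accesses) / 1_000_000,
                coLocated(accesses) / 1_000_000);
    }
}
```

Under these assumed figures, one million fine-grained remote accesses cost about 50 seconds, while the same accesses after co-location cost about 10 milliseconds.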

Unless remote objects are declared to be resident, they can migrate from one node to another within the distributed JavaParty environment. This is an important capability even though remote objects are syntactically indistinguishable from regular Java objects and their location within the distributed environment does not influence the semantics of a JavaParty program.

Replicated objects

Besides optimizing parallel distributed read access, collective replicated objects are a new object-oriented way of expressing data-parallel operations in the bulk-synchronous model. Data-parallel algorithms are widespread in high-performance computing applications, because they can reach a high degree of parallelism and are able to process large amounts of data. Collective replication is a new form of object replication that allows a seamless integration of control and data parallelism in an object-oriented language. This is an important contribution, since control parallelism was believed to match object orientation, while data parallelism was restricted to array structures in procedural languages with explicit message passing. Collective replication integrates data parallelism into an object-oriented language without conflicting with inheritance, modularization or encapsulation.

All language extensions are automatically transformed back to pure Java. This ensures full portability of the generated code. In the extended language, all the concepts that made Java popular, such as simple object orientation, garbage collection, and built-in parallelism and coordination, remain usable in the distributed environment. This enables easy programming of cluster computers without giving up the comfort of an object-oriented language.

A prototype shows that the extensions make porting a parallel application to a distributed cluster environment particularly easy. If there is a data-parallel decomposition of the problem, collective replication even eases the distributed parallelization of sequential programs. With collective replication, the parallelization does not require modifications to the algorithm itself. Guided by an annotation, the data structures are automatically transformed to provide operations for reestablishing consistency after a data-parallel modification. There is no need for any additional coding.

The transformation is based on a library for extended remote method invocation for cluster computers, which provides communication primitives for collective replication and transparent remote access. Both enable the efficient execution of computing-intensive applications in clusters.
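The idea of a data-parallel modification followed by a step that reestablishes consistency can be sketched in plain Java within a single JVM. This is only a simulation under stated assumptions: threads stand in for cluster nodes, each thread owns one replica of an array, and the consistency step is written by hand, whereas JavaParty is described above as generating it automatically. The class and method names (ReplicationSketch, superstep) are hypothetical and are not part of any JavaParty API.

```java
import java.util.Arrays;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

final class ReplicationSketch {

    /** Runs one bulk-synchronous superstep and returns all replicas. */
    static double[][] superstep(double[] initial, int workers) {
        int n = initial.length;
        double[][] replicas = new double[workers][];
        for (int w = 0; w < workers; w++) {
            replicas[w] = initial.clone(); // every worker starts with a full copy
        }
        CyclicBarrier barrier = new CyclicBarrier(workers);
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int me = w;
            threads[w] = new Thread(() -> {
                try {
                    // Compute phase: modify only the partition this worker owns.
                    int lo = me * n / workers;
                    int hi = (me + 1) * n / workers;
                    for (int i = lo; i < hi; i++) {
                        replicas[me][i] *= 2.0;
                    }
                    barrier.await(); // end of the data-parallel modification

                    // Consistency phase: pull each foreign partition from its owner.
                    for (int owner = 0; owner < workers; owner++) {
                        if (owner == me) {
                            continue;
                        }
                        int from = owner * n / workers;
                        int to = (owner + 1) * n / workers;
                        System.arraycopy(replicas[owner], from, replicas[me], from, to - from);
                    }
                    barrier.await(); // all replicas are consistent again
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[w].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return replicas;
    }

    public static void main(String[] args) {
        double[][] replicas = superstep(new double[] {1, 2, 3, 4, 5}, 2);
        System.out.println(Arrays.toString(replicas[0]));
        System.out.println(Arrays.toString(replicas[1]));
    }
}
```

The update loop itself is the same as in a sequential program apart from its index range, which mirrors the claim that the algorithm does not need to be modified. Only the partitioning and the consistency phase are added around it.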

Last modified on Oct 27, 2009 6:25:49 PM