Patty architecture.

Architecture and packages
What the problems are that Patty attempts to solve
How the problems are solved
How Patty works

Architecture and packages

The Patty project uses the JVMTI interface from Sun in JDK1.5.0 onwards.

On Windows, the library is built using VC++ 6.0; on other platforms it is built using gcc and a simple Makefile. Eclipse is used to compile the Java classes.

A web GUI running on Tomcat 5.5 collects and analyzes the results. Users can invoke commands on the agent library through links on the page.

What the problems are that Patty attempts to solve

Patty aims to solve the problem that running a profiler inside the same JVM as the profiled application has an enormous impact on that application's performance.

The perceived problems where CPU cycles are lost (and therefore not spent on bytecode execution) are:

The large amount of timing data that has to be stored in the JVM. None of JVMTI, JVMPI, or JVMDI can efficiently select which methods to time; the profiler has to look this up in memory, in a global array. This happens for every method that is executed in the JVM, so time is lost both in context switching from Java to native code and in the native lookup of the needed timing data in a map.
Timing in JVMDI/JVMTI is either on or off. It cannot be used to hook into certain classes only; it always applies to all objects active in the JVM.
To protect the shared resources (the timing data lookup map), the profiler needs to use monitors whenever performance information is written. This very likely serializes the execution of different threads completely.
Code coverage carries the same kind of performance penalty, since the hook that receives single-step events only receives a method ID and a location identifier.

These problems affect the performance of an application during a profiling run. In a J2EE container, for example, a transaction timeout may occur when profiling is turned on but not when the application runs normally. The application may also be subject to race conditions that occur in a normal run but not in a profiling run, or vice versa.

How the problems are solved

Patty simply sends the profiling events over a network link to another computer. Since this other computer does not run the Java application, it has all its resources available to look up the data structures in memory where the events are collected and aggregated as they happen. Doing this in Java also has the benefit that collections are easier to program and that synchronization happens on the individual object instance rather than on the collection as a whole (which is how it is normally programmed in native code).
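To illustrate the point about per-object synchronization, here is a minimal sketch of collector-side aggregation. It is not Patty's actual collector code; the class names (TimingAggregator, MethodStats) and the event shape (a method name plus elapsed nanoseconds) are assumptions made for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical collector-side aggregation; names are illustrative, not Patty's. */
final class TimingAggregator {

    /** Running totals for one method; an update locks only this instance. */
    static final class MethodStats {
        private long calls;
        private long totalNanos;

        synchronized void add(long nanos) {
            calls++;
            totalNanos += nanos;
        }

        synchronized long calls() { return calls; }

        synchronized long totalNanos() { return totalNanos; }
    }

    private final ConcurrentMap<String, MethodStats> stats = new ConcurrentHashMap<>();

    /** Records one timing event (assumed shape: method name and elapsed time). */
    void record(String method, long nanos) {
        stats.computeIfAbsent(method, key -> new MethodStats()).add(nanos);
    }

    /** Returns the totals for a method, or null if no event was seen for it. */
    MethodStats statsFor(String method) {
        return stats.get(method);
    }

    public static void main(String[] args) {
        TimingAggregator aggregator = new TimingAggregator();
        aggregator.record("com.example.Foo.bar", 1_200);
        aggregator.record("com.example.Foo.bar", 800);
        MethodStats s = aggregator.statsFor("com.example.Foo.bar");
        System.out.println(s.calls() + " calls, " + s.totalNanos() + " ns");
    }
}
```

Two threads that record events for different methods only contend briefly inside the map; the totals themselves are guarded per method, not by one lock around the whole collection.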

This means that almost no information is stored in the agent library itself. It does need some memory per loaded class in order to run certain profiling functions, but this is much less than other profilers need.

The large number of methods that are normally analyzed (as methods/classes cannot be selected in JVMTI code) is reduced by class instrumentation. The bytecode instrumentation is done in native code (not BCEL or other libraries). That code was created by Kelly O'Hair (thanks Kelly!) and is included as "demo" code in JDK 1.5.0. The code was added to Patty unchanged (yes, I'm mad! :).

The bytecode instrumentation inserts instructions at every method entry/return of the class that is instrumented. The inserted code calls the native "Patty.method_entry/exit" with its identifier. The native code times the execution using an accurate native timer and immediately sends the execution data line to the data collector.
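As a rough picture of what an instrumented method behaves like, the sketch below writes the inserted calls out as Java source. The real entry points are native and the real instrumentation works on bytecode; PattyStub, the int identifier, the "id elapsed" line format, and the in-memory list that stands in for the network link are all assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Shows the effect of entry/exit instrumentation using a pure-Java stand-in. */
final class InstrumentedExample {

    /** Hypothetical stand-in for the native Patty.method_entry/exit calls. */
    static final class PattyStub {
        // Stand-in for the network link: lines that would go to the collector.
        static final List<String> sent =
                Collections.synchronizedList(new ArrayList<>());

        // Per-thread stack of start times so nested calls pair up correctly.
        private static final ThreadLocal<Deque<Long>> starts =
                ThreadLocal.withInitial(ArrayDeque::new);

        static void method_entry(int methodId) {
            starts.get().push(System.nanoTime());
        }

        static void method_exit(int methodId) {
            long elapsed = System.nanoTime() - starts.get().pop();
            sent.add(methodId + " " + elapsed); // assumed line format
        }
    }

    static final int ADD_ID = 1; // hypothetical method identifier

    // Original body was simply: return a + b;
    static int add(int a, int b) {
        PattyStub.method_entry(ADD_ID);   // inserted at method entry
        int result = a + b;
        PattyStub.method_exit(ADD_ID);    // inserted before the return
        return result;
    }

    public static void main(String[] args) {
        add(2, 3);
        System.out.println("lines sent: " + PattyStub.sent);
    }
}
```

Only classes that were instrumented pay for these calls, which is how instrumentation narrows the set of timed methods.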

How Patty works

The JVMTI interface of Java allows a native library to register itself in the VM and to interface with the VM to request event notifications. Callback functions declared in the agent library are used to receive these events.

Patty gets the configuration line (startup command) from the VM and configures itself according to the items specified, after which the VM continues to run. A host must be specified for Patty to connect to; this is the "drill" website. When a thread starts up, it opens its own connection to the same address, so that every thread has a separate socket connection with "drill". This is done to reduce thread contention and raw monitor protection blocks in the native library, so that the Java code executes faster and does not get serialized anywhere.
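The one-connection-per-thread idea can be sketched in Java as follows. Patty does this in its native library, so this is only an analogy: PerThreadChannel and its connector parameter are hypothetical, and the demo uses in-memory streams where a real setup would open a socket to the collector.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch: every thread lazily opens and keeps its own channel. */
final class PerThreadChannel {
    private final AtomicInteger opened = new AtomicInteger();
    private final ThreadLocal<OutputStream> channel;

    /** The connector is an assumption; it would normally open a socket stream. */
    PerThreadChannel(Supplier<OutputStream> connector) {
        this.channel = ThreadLocal.withInitial(() -> {
            opened.incrementAndGet();
            return connector.get();
        });
    }

    /** Writes one line on the calling thread's own channel; no shared lock. */
    void send(String line) {
        try {
            OutputStream out = channel.get();
            out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int openedCount() {
        return opened.get();
    }

    public static void main(String[] args) throws InterruptedException {
        PerThreadChannel link = new PerThreadChannel(ByteArrayOutputStream::new);
        Thread[] workers = new Thread[3];
        for (int i = 0; i < workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(() -> link.send("event from thread " + id));
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println("channels opened: " + link.openedCount());
    }
}
```

Because no two threads ever write to the same channel, sending an event needs no monitor around a shared socket.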

Patty also uses the controlport to open a port on its own machine. This port configuration is forwarded to "drill", and the drill web application connects to that port. "Drill" only connects to one Patty configuration at any one time: when Patty starts up somewhere and connects to the web app, "drill" cleans up existing data and connects back to Patty. It is not necessary to restart the web app for each run.

When classes are loaded, Patty also dumps each of them as a regular file to the "patty_uninstrumented" subdirectory of the current working directory, which is created if it does not exist. That directory is used later to deinstrument and instrument classes.

Patty writes messages and errors to a log file, "patty_agent.log", in the current working directory.
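The dump step might look roughly like this in Java. Patty performs it in native code, and the file layout below (one .class file per class, with package directories under "patty_uninstrumented") is an assumption; ClassDump and its parameters are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of saving original class bytes before instrumentation. */
final class ClassDump {

    /**
     * Writes classBytes under workingDir/patty_uninstrumented, creating
     * directories as needed. internalName uses slashes, e.g. "com/example/Foo".
     */
    static Path dump(Path workingDir, String internalName, byte[] classBytes)
            throws IOException {
        Path dir = workingDir.resolve("patty_uninstrumented");
        Path target = dir.resolve(internalName + ".class"); // assumed layout
        Files.createDirectories(target.getParent());
        return Files.write(target, classBytes);
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for the current working directory.
        Path workingDir = Files.createTempDirectory("patty-demo");
        byte[] fakeClass = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        Path written = dump(workingDir, "com/example/Foo", fakeClass);
        System.out.println("wrote " + written);
    }
}
```

Keeping the untouched bytes on disk is what makes it possible to restore a class to its original form later.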

Heap analysis is done through certain JVMTI iteration functions. The size reported for each object is an estimate only and may not be very accurate.