Jason Fager 8/5/10 Jason Fager 8/5/10

Introducing jdiscript

Charles Nutter's recent post on browsing memory with JRuby and the Java Debugger Interface reminded me of my own little project for the JDI, jdiscript ("helping you write elegant scripts for a more civilized debugger"... I'll work on that). I've spent some spare time over the last week or so dusting it off to get it from the simple handler class I've used for one-off tasks in the past to something more generally useful and enjoyable. It still has a ways to go, but I thought I would go ahead and share the basic features and show a quick script that hints at how powerful JDI scripting can be.

What you get from the JDI is a set of tools for working with VMs. A JDI program can start a new VM, attach to a running VM, be attached to by a VM, or be spawned from a VM when an exception occurs. Once you have a hook on one or more running VMs, you can then begin inspecting and controlling them. Inspection occurs via Mirrors representing the remote VM's live objects, classes, threads, and the VM itself, while control can be exerted by manipulating values via a Mirror or by requesting notification of particular events and handling them how you'd like.

The JDI is very powerful, but it can be a bit clunky for writing short, quick scripts, which is where jdiscript comes in. Launching or attaching to a VM is mostly boilerplate, so jdiscript provides utility classes that take care of that for you. While the JDI allows you to easily request event notifications, the process of matching those notifications to code that will do something useful in response is left to the programmer, so jdiscript provides an event loop and a standard way to dispatch events to handlers. There are also some nice minor features in jdiscript, like redefining JDI's EventRequest classes to be chainable. Most of this functionality is exposed through the top-level JDIScript class, which allows you to cut out a bit of code to access common operations. And of course, all of this is intended for use from a higher-level JVM language such as Groovy or JRuby.

So let's get started with an example. One of the more kickass events that we can track with the JDI is the MonitorContendedEnterEvent, which fires whenever a thread tries to enter a monitor that's already held by another thread. Uncontended locks in Java are basically free these days, but contention can still be a performance killer in multithreaded apps, so it's nice to be able to find these spots without a lot of headache so we can eliminate them as far as possible.

Here's a groovy script that prints out a stack trace for any contending threads, and prints another notice when the thread ends up entering the monitor:

package org.jdiscript
import org.jdiscript.handlers.
import org.jdiscript.util.VMSocketAttacher
import com.sun.jdi.
VirtualMachine vm = new VMSocketAttacher(12345).attach()
JDIScript j = new JDIScript(vm)
j.monitorContendedEnterRequest({
    long timestamp = System.currentTimeMillis()
    println "${timestamp}: Contended enter for ${it.monitor()} by ${it.thread()}"
    it.thread().frames().each { println "   " + it }
} as OnMonitorContendedEnter).enable()
j.monitorContendedEnteredRequest({
    long timestamp = System.currentTimeMillis()
    println "${timestamp}: Contended entered for ${it.monitor()} by ${it.thread()}"
} as OnMonitorContendedEntered).enable()
j.run({ println "Got StartEvent" } as OnVMStart)
println "Shutting down"

Let's fire this up against a Cassandra instance to see it in action. I picked Cassandra because it's a production-quality high-concurrency project that's still dirt easy to get up and running from source, and it already includes a python script for stress testing (you can grab a copy of the 0.6.4 release, the version used for this post, to try it for yourself).

To start, copy bin/cassandra to bin/cassandra-dbg and add '-agentlib:jdwp=transport=dt_socket,server=y,address=12345,suspend=y' to the exec command that starts the server, so that the JVM will wait for the debugging script to attach before it starts. Once it's up and waiting, fire off the groovy script. You should see the start event print out, followed by a bunch of stack traces. Most of these are just ReferenceQueue activity, and any others we don't really care about right now (it's just startup, so as long as there's no deadlocking, we're probably fine with a bit more contention).

Once Cassandra is up and stable, fire up contrib/py_stress/stress.py (you'll need to have thrift installed and the Cassandra python bindings generated). Here's what comes up for my run:

1281044297880: Contended enter for instance of sun.security.provider.Sun(id=1783) by instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   java.security.Provider:663 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   sun.security.jca.ProviderList:314 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   sun.security.jca.GetInstance:140 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   java.security.Security:659 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   java.security.MessageDigest:129 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.utils.FBUtilities:239 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.utils.FBUtilities:229 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.dht.RandomPartitioner:118 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.dht.RandomPartitioner:44 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.db.Memtable:124 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.db.Memtable:116 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.db.ColumnFamilyStore:434 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.db.Table:407 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.db.RowMutation:200 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.service.StorageProxy$2:138 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   org.apache.cassandra.utils.WrappedRunnable:30 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   java.util.concurrent.ThreadPoolExecutor$Worker:886 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   java.util.concurrent.ThreadPoolExecutor$Worker:908 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
   java.lang.Thread:637 in thread instance of java.lang.Thread(name='ROW-MUTATION-STAGE:11', id=1784)
(Subsequent identical traces elided)
1281044297896: Contended enter for instance of sun.security.provider.Sun(id=1783) by instance of java.lang.Thread(name='ROW-MUTATION-STAGE:13', id=1785)
1281044297903: Contended enter for instance of sun.security.provider.Sun(id=1783) by instance of java.lang.Thread(name='ROW-MUTATION-STAGE:12', id=1786)
1281044297912: Contended enter for instance of sun.security.provider.Sun(id=1783) by instance of java.lang.Thread(name='ROW-MUTATION-STAGE:15', id=1787)
1281044297927: Contended enter for instance of sun.security.provider.Sun(id=1783) by instance of java.lang.Thread(name='ROW-MUTATION-STAGE:16', id=1788)
1281044297935: Contended enter for instance of sun.security.provider.Sun(id=1783) by instance of java.lang.Thread(name='ROW-MUTATION-STAGE:17', id=1789)
1281044297940: Contended enter for instance of sun.security.provider.Sun(id=1783) by instance of java.lang.Thread(name='ROW-MUTATION-STAGE:14', id=1790)
1281044297945: Contended enter for instance of sun.security.provider.Sun(id=1783) by instance of java.lang.Thread(name='ROW-MUTATION-STAGE:18', id=1791)

Pretty quickly, 8 threads back up waiting on a synchronized method deep down in the creation of a new MessageDigest which will be used to get an md5 hash of the row key. This is pretty easy to "fix" (though honestly it probably never caused anyone a problem): just preallocate some md5 MessageDigest instances and stuff them in threadlocals.

So there we go. A few lines of code, and we found an avoidable contention point without opening a profiler or even knowing anything ahead of time about the underlying codebase. Awesome.

I'll post more on troubleshooting with JDI in the future, including more examples. In the meantime, I'd love to get some feedback on jdiscript; feel free to email me (jfager at gmail), create an issue on github, or join jdiscript's new group. I've also thrown up a wiki page for collecting jdiscripts, so if you put together something useful, please share it.

Jason Fager 7/26/10 Jason Fager 7/26/10

Lexi-Sortable Number Strings

When working with key-value databases, it's common practice to structure key strings to take advantage of database-specific features. With Cassandra, for instance, you can perform range queries over lexicographically sorted row keys. As an example of how this feature might be used, imagine an index consisting of rowkeys representing attributes like "population" or "region" that map to cities:

     
  "population/0389625": "Tulsa"       
  "population/0815358": "San Francisco"       
  "population/2851268": "Chicago"     
  "population/3831868": "Los Angeles"     
  "population/8391881": "New York"     
  "region/central": "Chicago", "Tulsa"     
  "region/east": "New York"     
  "region/west": "Los Angeles", "San Francisco"

To find cities with at least 1 million people, we would issue a range query from "population/1000000" to "population/9999999". We had to pad out our numbers with leading zeroes to get a fixed-width, but that's easy enough, and this approach works fine for positive integers. But how do we deal with negative integers, or floating point numbers? We'll focus on Java longs and doubles, but the code should be trivially modifiable to work with ints and floats, and the ideas should translate easily to most languages. First up, negative numbers. We can't simply concatenate a negative number to a string that's supposed to sort lexicographically, because larger negative numbers sort as smaller strings:

    "-4"     "-7"     "04"     "07"

In Java, longs are specified as signed two's complement 64 bit integers, meaning that if we treat a long's underlying bits as a string, we would solve this problem at the cost of storing negatives after positives (using 4 bits for demo purposes):

    0100 (4)     0111 (7)     1001 (-7)     1100 (-4)

But this is easy to fix: just flip the sign bit:

    0001 (-7)     0100 (-4)     1100 (4)     1111 (7)

When we use this method in production, we'll want to do two more things: represent the number in hex to avoid ridiculously long keys, and prepend a type token to remind us that a given hex string is a long. Here's the code:
What about floating point? The most common representation of floating point values, and the one that you're almost certainly using, is standardized by IEEE 754. Conveniently enough, IEEE 754 is designed such that:

if two floating-point numbers in the same format are ordered (say x < y), then they are ordered the same way when their bits are reinterpreted as Sign-Magnitude integers.

Said another way, just like with longs, if we just treat the underlying bits of IEEE 754 floating point numbers as strings, we'll get proper sorting with only minor complications. What are the new minor complications? A sign-magnitude integer uses its highest bit to signal signedness, and the remainder of its bits to signal the magnitude of the number:

     0100 (4)     0111 (7)     1100 (-4)     1111 (-7)

We see that positive numbers sort correctly, but negative numbers have some problems - they sort high-to-low, and they sort after positive numbers. To solve the latter problem, we'll again flip the sign bit for all numbers:

    0100 (-4)     0111 (-7)     1100 (4)     1111 (7)

And to make negative numbers sort low-to-high, we'll flip all their other bits, as well:

    0000 (-7)     0011 (-4)     1100 (4)     1111 (7)

Here's the code, building on the utility methods defined earlier:
You can also check out the full code snippet, which includes tests. Now that we can store floating point attributes, our city index can look like:

     
  "area/dC06759999999999A": "Tulsa"     
  "area/dC06CFD70A3D70A3D": "San Francisco"
  "area/dC06D400000000000": "Chicago"     
  "area/dC07D4E6666666666": "New York"     
  "area/dC07F24CCCCCCCCCD": "Los Angeles" 
  "population/l800000000005F1F9": "Tulsa"
  "population/l80000000000C70FE": "San Francisco"       
  "population/l80000000002B81C4": "Chicago"     
  "population/l80000000003A783C": "Los Angeles"     
  "population/l8000000000800CC9": "New York"     
  "region/central": "Chicago", "Tulsa"     
  "region/east": "New York"     
  "region/west": "Los Angeles", "San Francisco"

There are a couple of areas left for improvement. First, hex was chosen because it's easy to work with, but it produces strings that are longer than necessary. If the encoding were changed to Base 64, only 10 characters per key would be needed, and if the keys in the system we're using can be given as raw byte arrays, we can get away with 8 bytes + n bits (where n varies based on how many types you want to handle). Second, this doesn't even pretend to work with arbitrary-precision integers or decimals. After typing this post up, I found an implementation of these ideas that does extend to BigDecimal values, available as part of Lily CMS, as described by Bruno Dumon. You can also take a look at Lucene's NumericUtils class, though you should take care to note that the long returned from the doubleToSortableLong method is not suitable for direct translation to a lexicographically sortable string, as it does not alter the sign bit.