Thursday, October 19, 2006

The JRuby Tutorial #4: Writing Java extensions for JRuby

There are many reasons to write a Java extension for JRuby. Maybe your favorite Ruby library hasn't been ported to JRuby yet, or you want to interface directly with some Java code without going through JRuby's Java interface. Maybe you need the speed of doing calculations in Java, or you just want to add missing functionality. Whatever the reason, writing extensions for JRuby can be tricky if you don't know how the internals of JRuby work. The purpose of this tutorial is to show how to build a simple extension that exercises many parts of the Ruby language, and how to implement it in Java.

The example will be a module called Sequence with one class inside it called Sequence. Whenever I create something as a Java extension, I usually write functional Ruby code for doing it first, to get the structure of the code straight in my head. So, without further ado, here is the Sequence module:
module Sequence
  def self.fibonacci(to=20)
    Sequence.new(1,1,1..to)
  end

  def self.lucas(to=20)
    Sequence.new(1,3,1..to)
  end

  class Sequence
    include Enumerable
    attr_reader :n1,:n2,:range
    def initialize(n1,n2,range)
      @n1, @n2, @range = n1,n2,range
      regenerate
    end
    %w(n1 n2 range).each do |n|
      define_method(n) do |v|
        send("#{n}=",v)
        regenerate
      end
    end
    def regenerate
      @value = []
      v1, v2 = @n1, @n2
      @value << v1 if @range === 1
      @value << v2 if @range === 2
      3.upto(@range.last) do |i|
        v1, v2 = v2, v1+v2
        @value << v2 if @range === i
      end
      nil
    end
    def [](ix)
      @range = ix..(@range.last) if ix < @range.first
      @range = (@range.first)..(ix+1) if ix > @range.last
      regenerate
      @value[ix-@range.first]
    end
    def each(&b)
      @value.each(&b)
    end
    def to_a
      @value
    end
    def to_s
      @value.to_s
    end
    def inspect
      "#<Sequence::Sequence n1=#@n1 n2=#@n2 range=#@range value=#{@value.inspect}>"
    end
  end
end
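
To make the recurrence in regenerate easier to follow, here is a minimal standalone sketch of just that loop. The helper name generate_values and the constants are made up for illustration and are not part of the extension; the seeds 1,1 and 1,3 are the ones used by fibonacci and lucas above.

```ruby
# Standalone sketch of the loop inside Sequence#regenerate.
# generate_values is a hypothetical helper, not part of the extension.
def generate_values(n1, n2, range)
  values = []
  v1, v2 = n1, n2
  values << v1 if range === 1        # term 1 is the first seed
  values << v2 if range === 2        # term 2 is the second seed
  3.upto(range.last) do |i|
    v1, v2 = v2, v1 + v2             # slide the two-term window forward
    values << v2 if range === i      # keep only terms inside the range
  end
  values
end

FIB_10  = generate_values(1, 1, 1..10)   # seeds used by Sequence.fibonacci
LUCAS_5 = generate_values(1, 3, 1..5)    # seeds used by Sequence.lucas
FIB_3_6 = generate_values(1, 1, 3..6)    # a range that skips the seeds
```

Note that every term up to range.last is computed even when the range starts later; the range only decides which terms are stored.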

Interfacing with the JRuby runtime

There are a few different ways to write extensions for JRuby. The difference isn't big from a functional viewpoint, but there is a definite gap in usability. I call the two major ways to implement an extension the MetaClass way and the MRI way. The MetaClass way subclasses RubyClass, the Java class that represents a Ruby class, and implements some meta information methods and classes. The MRI way, in contrast, just creates the Ruby class in code and adds methods to it in some static initializer. This tutorial will use the MRI way for two reasons: first, it's easier and doesn't require as many files and classes, and second, when porting MRI C extensions, the MetaClass way doesn't map very well to how MRI does things.

Project setup

To make building the extension as simple as possible, it helps to follow a few conventions. First of all, I'm going to call the extension "fib". I want my potential users to be able to require 'fib' and get all the good Sequence functionality. To achieve this there are two things to keep in mind. First, the jar file should be called fib.jar and be put somewhere in JRuby's load path. Second, there should be a class called FibService that implements the BasicLibraryService interface. For our purposes, FibService.java will contain all functionality, but in a realistic situation it makes sense to extract the functionality and let the library loader just set up the environment. The skeleton for my FibService.java will look like this:
import java.io.IOException;

import org.jruby.IRuby;

import org.jruby.runtime.load.BasicLibraryService;

public class FibService implements BasicLibraryService {
    public boolean basicLoad(IRuby runtime) throws IOException {
        return true;
    }
}

At this point the only imports needed are for IRuby, which is the main interface for the JRuby runtime, and the BasicLibraryService which provides the basicLoad method. The return value specifies if the service was loaded correctly or not.

Basic structure

I will start by adding the basic structure for our code; the Sequence module and class:
import java.io.IOException;

import org.jruby.IRuby;
import org.jruby.RubyClass;
import org.jruby.RubyModule;

import org.jruby.runtime.builtin.IRubyObject;

import org.jruby.runtime.load.BasicLibraryService;

public class FibService implements BasicLibraryService {
    public boolean basicLoad(IRuby runtime) throws IOException {
        RubyModule mSequence = runtime.defineModule("Sequence");
        RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
        cSequence.includeModule(runtime.getModule("Enumerable"));
        cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
                                                runtime.newSymbol("n2"),
                                                runtime.newSymbol("range")});
        return true;
    }
}

This code establishes the Sequence module at the top level and then defines the Sequence class inside this module. We need to specify a superclass for it, and this is what the runtime.getObject() call is about. Basically it's a shortcut for writing runtime.getClass("Object"). After we have defined the class, we make it include Enumerable and then create attribute readers for the three instance variables. Despite the name, newSymbol doesn't necessarily create a new symbol; it returns an existing one if there is one.
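
For comparison, the same setup steps can be written imperatively in plain Ruby. This is only a rough sketch of the correspondence, not what JRuby does internally; SketchSequence is a made-up name chosen so that it doesn't clash with the real Sequence module.

```ruby
# Pure-Ruby sketch of the setup steps performed in basicLoad.
# SketchSequence is a hypothetical name used only for this illustration.
m_sequence = Module.new
Object.const_set(:SketchSequence, m_sequence)    # roughly runtime.defineModule

c_sequence = Class.new(Object)                   # Object as explicit superclass
m_sequence.const_set(:Sequence, c_sequence)      # roughly defineClassUnder

c_sequence.send(:include, Enumerable)            # roughly includeModule
c_sequence.send(:attr_reader, :n1, :n2, :range)  # roughly attr_reader(...)

# Symbols are interned: asking for the same name twice gives the same object,
# which is the behaviour described for newSymbol.
SAME_SYMBOL = "n1".to_sym.equal?(:n1)
```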

Singleton methods

We're going to create the singleton factory methods before actually creating the implementation for the class. The new class looks like this:
import java.io.IOException;

import org.jruby.IRuby;
import org.jruby.RubyClass;
import org.jruby.RubyFixnum;
import org.jruby.RubyModule;
import org.jruby.RubyNumeric;

import org.jruby.runtime.CallbackFactory;
import org.jruby.runtime.builtin.IRubyObject;
import org.jruby.runtime.load.BasicLibraryService;

public class FibService implements BasicLibraryService {
    public boolean basicLoad(IRuby runtime) throws IOException {
        RubyModule mSequence = runtime.defineModule("Sequence");
        RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
        cSequence.includeModule(runtime.getModule("Enumerable"));
        cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
                                                runtime.newSymbol("n2"),
                                                runtime.newSymbol("range")});

        CallbackFactory fibService_cb = runtime.callbackFactory(FibService.class);
        mSequence.defineSingletonMethod("fibonacci",fibService_cb.getOptSingletonMethod("fibonacci"));
        mSequence.defineSingletonMethod("lucas",fibService_cb.getOptSingletonMethod("lucas"));

        return true;
    }

    private static IRubyObject seq(int a1, int a2, RubyModule module, IRubyObject[] args) {
        IRuby runtime = module.getRuntime();
        int to = 20;
        if(module.checkArgumentCount(args,0,1) == 1) {
            to = RubyNumeric.fix2int(args[0]);
        }
        IRubyObject[] seqArgs = new IRubyObject[3];
        seqArgs[0] = runtime.newFixnum(a1);
        seqArgs[1] = runtime.newFixnum(a2);
        seqArgs[2] = runtime.getClass("Range").callMethod("new",
            new IRubyObject[]{RubyFixnum.one(runtime),runtime.newFixnum(to)});
        return module.getClass("Sequence").callMethod("new",seqArgs);
    }

    public static IRubyObject fibonacci(IRubyObject recv, IRubyObject[] args) {
        return seq(1,1,(RubyModule)recv,args);
    }

    public static IRubyObject lucas(IRubyObject recv, IRubyObject[] args) {
        return seq(1,3,(RubyModule)recv,args);
    }
}

This code contains a number of new things. First of all, our singleton methods need implementations. Since we don't need any data associated with these methods, static Java methods suffice. A CallbackFactory is used to get a reflection handle to the methods. I call getOptSingletonMethod on the CallbackFactory because the single parameter to each of the two methods is optional, so the callback factory will look for a static method with the signature IRubyObject name(IRubyObject, IRubyObject[]). We'll see later how we can specify explicit types for method arguments. The recv argument is specific to static methods. Usually when working with Ruby instances from Java code, you have a handle to the runtime implicitly through self, but this isn't possible for static methods. The recv parameter is the instance of RubyModule/RubyClass that the method is called on. In our case this is a handy way of getting hold of the Sequence module.
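
In Ruby terms, recv is simply what self would be inside the singleton method. The following is a small sketch of that idea, assuming a made-up module RecvSketch with a stripped-down Sequence class; it mirrors how the Java seq helper looks up the nested class through the module it was called on and builds the Range with an explicit call to new.

```ruby
# RecvSketch and its stripped-down Sequence are hypothetical stand-ins.
module RecvSketch
  class Sequence
    attr_reader :n1, :n2, :range

    def initialize(n1, n2, range)
      @n1, @n2, @range = n1, n2, range
    end
  end

  # Inside a singleton method, self plays the role of recv: it is the module
  # the method was called on, so the nested class can be found through it.
  def self.fibonacci(*args)
    to = args.empty? ? 20 : args[0]                   # optional argument
    const_get(:Sequence).new(1, 1, Range.new(1, to))  # like getClass + callMethod("new")
  end
end
```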

All IRubyObjects have checkArgumentCount, which is a simple utility method for methods with optional arguments. Basically, it takes the argument array plus the minimum and maximum argument count, and throws a Ruby exception if the count is outside those bounds. It also returns the actual argument count (which is the same as args.length right now). Note, if porting C Ruby code, that the two numeric parameters to checkArgumentCount are NOT the same as the format string of rb_scan_args, where for example "12" means one required and two optional parameters. The equivalent with checkArgumentCount would be checkArgumentCount(args,1,3).
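
The difference between the two conventions is easy to get wrong, so here is a small Ruby sketch of it. Both helpers are hypothetical stand-ins written for this illustration, not JRuby's or MRI's code, and the spec conversion only handles the plain two-digit form such as "12".

```ruby
# Hypothetical Ruby stand-in for the behaviour described for
# checkArgumentCount; this is not JRuby's implementation.
def check_argument_count(args, min, max)
  unless args.length.between?(min, max)
    raise ArgumentError, "wrong number of arguments (#{args.length} for #{min})"
  end
  args.length
end

# Convert an rb_scan_args-style spec such as "12" (1 required, 2 optional)
# into the [min, max] pair used with checkArgumentCount.
def scan_args_to_min_max(spec)
  required = spec[0].to_i
  optional = spec[1].to_i            # a missing second digit counts as 0
  [required, required + optional]
end
```

With these definitions, scan_args_to_min_max("12") gives the pair 1 and 3, matching the checkArgumentCount(args,1,3) call mentioned above.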

RubyNumeric has a few utility methods, of which fix2int is one of the more useful. It allows us to translate a Ruby integer into the corresponding Java type.

The most common types have shortcut creation methods in IRuby, and newFixnum is one of these. To create a new Range we have to get a reference to the class and call new on it, though.

The Sequence class

Here comes the meat of it all. This is the final version of the Java source:
import java.io.IOException;

import java.util.ArrayList;
import java.util.List;
import java.util.Iterator;

import org.jruby.IRuby;
import org.jruby.RubyArray;
import org.jruby.RubyClass;
import org.jruby.RubyFixnum;
import org.jruby.RubyModule;
import org.jruby.RubyNumeric;
import org.jruby.RubyObject;
import org.jruby.RubyRange;

import org.jruby.runtime.CallbackFactory;
import org.jruby.runtime.builtin.IRubyObject;
import org.jruby.runtime.load.BasicLibraryService;

public class FibService implements BasicLibraryService {
    public boolean basicLoad(IRuby runtime) throws IOException {
        RubyModule mSequence = runtime.defineModule("Sequence");
        RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
        cSequence.includeModule(runtime.getModule("Enumerable"));
        cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
                                                runtime.newSymbol("n2"),
                                                runtime.newSymbol("range")});

        CallbackFactory fibService_cb = runtime.callbackFactory(FibService.class);
        mSequence.defineSingletonMethod("fibonacci",fibService_cb.getOptSingletonMethod("fibonacci"));
        mSequence.defineSingletonMethod("lucas",fibService_cb.getOptSingletonMethod("lucas"));

        CallbackFactory seq_cb = runtime.callbackFactory(Sequence.class);
        cSequence.defineSingletonMethod("new",seq_cb.getOptSingletonMethod("newInstance"));
        cSequence.defineMethod("initialize",seq_cb.getMethod("initialize",RubyFixnum.class,RubyFixnum.class,RubyRange.class));
        cSequence.defineMethod("n1=",seq_cb.getMethod("set_n1",RubyFixnum.class));
        cSequence.defineMethod("n2=",seq_cb.getMethod("set_n2",RubyFixnum.class));
        cSequence.defineMethod("range=",seq_cb.getMethod("set_range",RubyRange.class));
        cSequence.defineMethod("[]",seq_cb.getMethod("arr_ix",RubyFixnum.class));
        cSequence.defineMethod("each",seq_cb.getMethod("each"));
        cSequence.defineMethod("to_a",seq_cb.getMethod("to_a"));
        cSequence.defineMethod("to_s",seq_cb.getMethod("to_s"));
        cSequence.defineMethod("inspect",seq_cb.getMethod("inspect"));

        return true;
    }

    private static IRubyObject seq(int a1, int a2, RubyModule module, IRubyObject[] args) {
        IRuby runtime = module.getRuntime();
        int to = 20;
        if(module.checkArgumentCount(args,0,1) == 1) {
            to = RubyNumeric.fix2int(args[0]);
        }
        IRubyObject[] seqArgs = new IRubyObject[3];
        seqArgs[0] = runtime.newFixnum(a1);
        seqArgs[1] = runtime.newFixnum(a2);
        seqArgs[2] = runtime.getClass("Range").callMethod("new",
            new IRubyObject[]{RubyFixnum.one(runtime),runtime.newFixnum(to)});
        return module.getClass("Sequence").callMethod("new",seqArgs);
    }

    public static IRubyObject fibonacci(IRubyObject recv, IRubyObject[] args) {
        return seq(1,1,(RubyModule)recv,args);
    }

    public static IRubyObject lucas(IRubyObject recv, IRubyObject[] args) {
        return seq(1,3,(RubyModule)recv,args);
    }

    public static class Sequence extends RubyObject {
        public static IRubyObject newInstance(IRubyObject recv, IRubyObject[] args) {
            Sequence result = new Sequence(recv.getRuntime(), (RubyClass)recv);
            result.callInit(args);
            return result;
        }

        public Sequence(IRuby runtime, RubyClass type) {
            super(runtime,type);
        }

        public IRubyObject initialize(RubyFixnum n1, RubyFixnum n2, RubyRange range) {
            setInstanceVariable("@n1",n1);
            setInstanceVariable("@n2",n2);
            setInstanceVariable("@range",range);
            regenerate();
            return this;
        }

        public IRubyObject set_n1(RubyFixnum n1) {
            setInstanceVariable("@n1",n1);
            regenerate();
            return n1;
        }

        public IRubyObject set_n2(RubyFixnum n2) {
            setInstanceVariable("@n2",n2);
            regenerate();
            return n2;
        }

        public IRubyObject set_range(RubyRange range) {
            setInstanceVariable("@range",range);
            regenerate();
            return range;
        }

        private void regenerate() {
            List v = new ArrayList();
            int v1 = RubyNumeric.fix2int(getInstanceVariable("@n1"));
            int v2 = RubyNumeric.fix2int(getInstanceVariable("@n2"));
            IRubyObject r = getInstanceVariable("@range");
            if(r.callMethod("===",getRuntime().newFixnum(1)).isTrue()) {
                v.add(getRuntime().newFixnum(v1));
            }
            if(r.callMethod("===",getRuntime().newFixnum(2)).isTrue()) {
                v.add(getRuntime().newFixnum(v2));
            }
            int l = RubyNumeric.fix2int(r.callMethod("last"));
            for(int i=3;i<=l;i++) {
                int tmp = v1;
                v1 = v2;
                v2 = tmp + v1;
                if(r.callMethod("===",getRuntime().newFixnum(i)).isTrue()) {
                    v.add(getRuntime().newFixnum(v2));
                }
            }
            setInstanceVariable("@value",getRuntime().newArray(v));
        }

        public IRubyObject arr_ix(RubyFixnum ix) {
            int index = RubyNumeric.fix2int(ix);
            if(index < RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("first"))) {
                setInstanceVariable("@range",getRuntime().getClass("Range").callMethod("new",
                    new IRubyObject[]{ix,getInstanceVariable("@range").callMethod("last")}));
            }
            if(index > RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("last"))) {
                setInstanceVariable("@range",getRuntime().getClass("Range").callMethod("new",
                    new IRubyObject[]{getInstanceVariable("@range").callMethod("first"), getRuntime().newFixnum(index+1)}));
            }
            regenerate();
            return getInstanceVariable("@value").callMethod("[]",
                getRuntime().newFixnum(index -
                    RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("first"))));
        }

        public IRubyObject each() {
            Iterator iter = ((RubyArray)getInstanceVariable("@value")).getList().iterator();
            while(iter.hasNext()) {
                getRuntime().getCurrentContext().yield((IRubyObject)iter.next());
            }
            return getRuntime().getNil();
        }

        public IRubyObject to_a() {
            return getInstanceVariable("@value");
        }

        public IRubyObject to_s() {
            return getInstanceVariable("@value").callMethod("to_s");
        }

        public IRubyObject inspect() {
            StringBuffer sb = new StringBuffer("#<Sequence::Sequence n1=");
            sb.append(getInstanceVariable("@n1").toString());
            sb.append(" n2=");
            sb.append(getInstanceVariable("@n2").toString());
            sb.append(" range=");
            sb.append(getInstanceVariable("@range").toString());
            sb.append(" value=");
            sb.append(getInstanceVariable("@value").callMethod("inspect").toString());
            sb.append(">");
            return getRuntime().newString(sb.toString());
        }
    }
}

Compiling this and placing it in fib.jar on your load path will allow JRuby to use the code as if it was Ruby. Try it out.

Now, let's take the code in pieces. First of all, the initialization code defines the methods available and gives them a reflected implementation through CallbackFactory. We create a static inner class to hold the actual implementation of the class. This isn't strictly necessary in this case, since we haven't associated any external state with the object, but it makes for cleaner separation and code that is easier to understand. Note that we need to have our own implementation of new. This is one of the drawbacks of the MRI technique. When using MetaClasses you can define an allocateObject method that automatically gets used by the runtime. Most of CallbackFactory's different getMethod variants are used here. This shows how to declare a fixed number of arguments with specific classes.
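
What a hand-written new has to do can be sketched in plain Ruby: allocate a blank instance, run initialize on it, and return it. This is roughly the shape of newInstance above, with callInit corresponding to the initialize call. The Point class is a made-up example, and overriding new like this is only for illustration; ordinary Ruby classes get it for free.

```ruby
# Point is a hypothetical example class used only to show the steps.
class Point
  attr_reader :x, :y

  # Hand-written equivalent of what Class#new normally does:
  # allocate a blank instance, run initialize on it, return it.
  def self.new(*args)
    instance = allocate
    instance.send(:initialize, *args)   # initialize is private, hence send
    instance
  end

  def initialize(x, y)
    @x, @y = x, y
  end
end
```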

The initialize method just sets the instance variables and then calls the method regenerate. Note that this isn't a Ruby method anymore. I didn't feel it was necessary to expose it, and using Java call semantics makes this slightly more efficient. Apart from that, there is nothing really strange in this code. I use the fact that you can create a new Ruby array from a list to make the regeneration of @value easier. But in most cases this is a straightforward translation from Ruby to JRuby code. The only point where something strange is happening is in fact in the each method. Handling blocks with JRuby in Java isn't always practical, so I tend to find it easier to refactor the Ruby code into something that calls yield explicitly, by itself.
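
The refactoring of each mentioned above can be sketched on the Ruby side as follows. YieldingList is a made-up class for illustration; the point is that an each that yields element by element has the same shape as a Java loop calling yield, while Enumerable still provides map, select and friends on top of it.

```ruby
# YieldingList is a hypothetical class used only to contrast the two styles.
class YieldingList
  include Enumerable

  def initialize(values)
    @values = values
  end

  # The Ruby prototype forwards the block:
  #   def each(&b); @values.each(&b); end
  # The version below yields each element explicitly instead, which has the
  # same shape as the loop in the Java each method.
  def each
    i = 0
    while i < @values.length
      yield @values[i]
      i += 1
    end
    nil
  end
end
```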

Conclusion

Implementing a Java extension for JRuby can be tricky, but the hard part is mostly knowing which services are available where. With the JRuby source code available, it's easy to get a peek into the internals and find out more about those things that are problematic. Taking a look at how the core classes are implemented often gives some hints on how to continue, too. For example, RubyZlib, RubyYAML, RubyOpenSSL, RubyStringIO and RubyEnumerable are all mostly written in this style, and there are various examples of the different styles available.

If you need the speed or if it's more practical to implement the functionality in Java, I would say that writing an extension is fairly easy once you get started. The important thing to remember is to be sure what the interface should be, and implement everything else outside of JRuby, demarcating the interface from the implementation.

4 comments:

Ryan said...

Good stuff, Ola. Now you've got me wondering if I can benefit from moving some of my modules into Java.

myk said...

I was experimenting with the latest source of JRuby (subversion version 2362) from the trunk and Java Mustang. I am trying to write some example code in which I want to invoke the Java SystemTray class from the java.awt package in Ruby. The SystemTray class is a new addition in Java Mustang.

So I do the following:
-----------------------------
require 'java'

module Awt
  include_package 'java.awt'
end

class SwingDock
  def initialize
    Awt::SystemTray systemTray = Awt::SystemTray.getSystemTray()
  end
end
sd = SwingDock.new
-----------------------------

$jruby swingdock1.rb
swingdock1.rb:9:in `method_missing': undefined method `SystemTray' for
Awt:Module (NoMethodError)
from swingdock1.rb:9:in `initialize'
from swingdock1.rb:12:in `new'
from swingdock1.rb:12

I have managed to get access to other awt classes just fine. Do I have to do something special to access the SystemTray class in the java.awt package?

Googilator said...

Found the problem out:

Awt::SystemTray systemTray = Awt::SystemTray.getSystemTray()

should have been

systemTray = Awt::SystemTray.getSystemTray()

:) sorry to have created unnecessary noise here.

luniki said...

Great tutorial! How about one using the MetaClasses approach?

I noticed some differences between the Ruby and Java versions and attach a diff for it.

@@ -15,9 +15,10 @@
       regenerate
     end
     %w(n1 n2 range).each do |n|
-      define_method(n) do |v|
-        send("#{n}=",v)
+      define_method("#{n}=") do |v|
+        instance_variable_set("@#{n}", v)
         regenerate
+        v
       end
     end