RichyHBM

Software engineer with a focus on game development and scalable backend development

Calling Native Code From JVM vs Go

I have been building a couple of prototypes recently and came across the need to use native code in one of them, specifically on the backend. After debating which tech I should try for this, I decided to run a test and see whether the performance differences could sway my vote between my usual go-to languages, Scala and GoLang.

To begin with, I need some native code to call. It doesn’t need to do anything complex, but it does need to cover the basics of what I am going to be doing from the caller side: calling a function with parameters and returning a value.

I decided to go as bare bones as possible, creating four simple C functions for addition, subtraction, multiplication and division. Note that min here is short for minus, not minimum.

float sum(float a, float b)
{
    return a + b;
}

float min(float a, float b)
{
    return a - b;
}

float mult(float a, float b)
{
    return a * b;
}

float div(float a, float b)
{
    return a / b;
}

This is compiled to a shared library, to be loaded at runtime, using GCC.

gcc -c -Wall -Werror -fpic cexample.c
gcc -shared -o libcexample.so cexample.o

The tests themselves simply consist of calling these functions a number of times from a few different technologies, timing the calls and taking the average.

To give the JVM-based tests a chance to warm up, I am going to make a number of calls to the native code before the timed tests. This simulates having already served a number of requests before the requests we are timing. To keep things consistent, I am going to do the same in the Go implementation.
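The warm-up-then-time pattern described above can be sketched in isolation. This is only a minimal sketch: timeCalls and addStandIn are hypothetical names that do not appear in the test code, and a plain Go function stands in for the native call so the example runs without the shared library.

```go
package main

import (
	"fmt"
	"time"
)

// addStandIn is a hypothetical stand-in for a native call such as C.sum.
func addStandIn(a, b float32) float32 {
	return a + b
}

// timeCalls runs fn warmup times untimed, then iters times under the timer.
// It returns the total elapsed time and the average per iteration.
func timeCalls(warmup, iters int, fn func()) (total, average time.Duration) {
	for i := 0; i < warmup; i++ {
		fn()
	}

	start := time.Now()
	for i := 0; i < iters; i++ {
		fn()
	}
	total = time.Since(start)

	return total, total / time.Duration(iters)
}

func main() {
	var sink float32 // keeps the result in use so the call is not discarded
	total, average := timeCalls(1000, 100000, func() {
		sink = addStandIn(3.0, 4.0)
	})
	fmt.Println("total:", total, "average:", average, "last result:", sink)
}
```

The average here uses integer division on the duration, just as the tests do, so anything below a nanosecond is truncated.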

GoLang Implementation

The Go implementation makes use of the built-in cgo feature to call native libraries. Because Go itself compiles to native code, the library is linked much as it would be in C or C++. However, that also means Go has to know beforehand about the functions you are going to call, which you declare in a special comment placed directly above the import "C" line.

package main

/*
#cgo LDFLAGS: -L${SRCDIR}/ -lcexample

float sum(float, float);
float min(float, float);
float mult(float, float);
float div(float, float);
*/
import "C"
import "fmt"
import "time"
import "os"

func main() {
  var a C.float = 3.0
  var b C.float = 4.0
  iters := [7]int{1, 10, 100, 1000, 10000, 100000, 1000000}

  //Warm up, this shouldn't matter for Go but want to keep consistent with all implementations
  for i := 0; i < 1000; i++ {
    C.sum(a, b)
    C.min(a, b)
    C.mult(a, b)
    C.div(a, b)
  }

  for _,iter := range iters {
    start := time.Now().UnixNano()

    for i := 0; i < iter; i++ {
      var c C.float = C.sum(a, b)
      var d C.float = C.min(c, 2.0)
      var e C.float = C.mult(d, 2.0)
      _ = C.div(e, 4.0)
    }

    end := time.Now().UnixNano()

    totalTime := end - start
    average := totalTime / int64(iter)

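    // Append the row, then close the file straight away: a defer here would
    // not run until main returns, leaving one open handle per loop pass.
    f, err := os.OpenFile("../results.csv", os.O_APPEND|os.O_WRONLY, 0600)
    if err != nil {
      panic(err)
    }
    fmt.Fprintf(f, "GO, %d, %d, %d\n", iter, totalTime, average)
    f.Close()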
  }
}

cgo works by exposing the functions declared in that comment through a pseudo-package named C. This package also provides its own set of basic types, such as C.float, to be used when calling native code that takes parameters or returns a value.

I won’t run through all of the code, as it should be pretty self-explanatory. It simply makes the native calls a number of times and times them. It repeats this for several different iteration counts to get a good range from which to take an average.

The one bit of code that I do want to mention is the following comment. It is a #cgo directive that passes flags to the linker at build time, telling it where to find the library and which library to link against.

#cgo LDFLAGS: -L${SRCDIR}/ -lcexample

Scala JNI/JNA Implementation

Java Native Interface, JNI for short, is the JVM’s built-in mechanism for interfacing with native code, allowing you to call native code or be called from it. JNA is a framework built on top of JNI to reduce the amount of boilerplate and setup code required.

With JNA you create an interface, or in Scala’s case a trait, that extends JNA’s Library interface and declares all the functions your native code provides. You then ask JNA to create an implementation of your interface backed by your native library.

import com.sun.jna._
import java.io._

trait LibCexample extends Library {
  def sum(a:Float, b:Float):Float
  def min(a:Float, b:Float):Float
  def mult(a:Float, b:Float):Float
  def div(a:Float, b:Float):Float
}

object Main {
  def main(args: Array[String]): Unit = {

    System.setProperty("jna.library.path", new java.io.File(".").getCanonicalPath);
    val LibCexample = Native.loadLibrary("cexample", classOf[LibCexample]).asInstanceOf[LibCexample]

    val a:Float = 3.0f
    val b:Float = 4.0f

    //Warm up by making calls pre-timer
    for(i <- 1 to 1000) {
      LibCexample.sum(a, b)
      LibCexample.min(a, b)
      LibCexample.mult(a, b)
      LibCexample.div(a, b)
    }

    val iterations = List(1, 10, 100, 1000, 10000, 100000, 1000000)

    for (iter <- iterations) {
      val t0 = System.nanoTime()

      for(i <- 1 to iter) {
        val c = LibCexample.sum(a, b)
        val d = LibCexample.min(c, 2.0f)
        val e = LibCexample.mult(d, 2.0f)
        val f = LibCexample.div(e, 4.0f)
      }

      val t1 = System.nanoTime()
      val totalTime = t1 - t0
      val avgTime = totalTime / iter

      val write = new PrintWriter(new FileOutputStream(new File("../results.csv"),true))
      write.write(s"JNA, $iter, $totalTime, $avgTime"+ scala.util.Properties.lineSeparator)
      write.close()
    }
  }
}

Unlike Go, which resolves the library when the program is linked, JNA loads the library at runtime, so you need to tell it where to find it. This can be done in a number of ways; I have opted to set the jna.library.path property, which tells JNA where to search for libraries.

System.setProperty("jna.library.path", new java.io.File(".").getCanonicalPath);

As with Go, I am running the test a number of times to get a good average. The code also calls the native functions a number of times beforehand to allow the JVM to warm up, just in case there are any setup costs within the JVM.

Scala BridJ Implementation

BridJ is an alternative Java library for interacting with C/C++ code. It doesn’t require you to write any JNI glue code yourself and focuses on ease of use and performance; however, it isn’t as mature or stable as JNI.

First you need to create a class containing the methods available in your native code. This class needs a specific set of annotations, and all of its native calls must be declared as native static.

package main;
import org.bridj.BridJ;
import org.bridj.CRuntime;
import org.bridj.Pointer;
import org.bridj.ann.Library;
import org.bridj.ann.Ptr;
import org.bridj.ann.Runtime;

@Library("cexample")
@Runtime(CRuntime.class)
public class LibCexample {
    static {
        BridJ.register();
    }

    public native static float sum(float a, float b);
    public native static float min(float a, float b);
    public native static float mult(float a, float b);
    public native static float div(float a, float b);
}

This class is what gives access to the native code, allowing its native static methods to interface to the C/C++ code. The Scala code can then interface with this class in order to perform the operations.

import main.LibCexample
import java.io._

object Main {
  def main(args: Array[String]): Unit = {
    val a:Float = 3.0f
    val b:Float = 4.0f

    //Warm up by making calls pre-timer
    for(i <- 1 to 1000) {
      LibCexample.sum(a, b)
      LibCexample.min(a, b)
      LibCexample.mult(a, b)
      LibCexample.div(a, b)
    }

    val iterations = List(1, 10, 100, 1000, 10000, 100000, 1000000)

    for (iter <- iterations) {
      val t0 = System.nanoTime()

      for(i <- 1 to iter) {
        val c = LibCexample.sum(a, b)
        val d = LibCexample.min(c, 2.0f)
        val e = LibCexample.mult(d, 2.0f)
        val f = LibCexample.div(e, 4.0f)
      }

      val t1 = System.nanoTime()
      val totalTime = t1 - t0
      val avgTime = totalTime / iter

      val write = new PrintWriter(new FileOutputStream(new File("../results.csv"),true))
      write.write(s"BRIDJ, $iter, $totalTime, $avgTime" + scala.util.Properties.lineSeparator)
      write.close()
    }
  }
}

As with the previous tests, I am running the calls a number of times to get the average, and as with the previous JVM-based code, I am also warming up the JVM by making a number of calls before timing them.

Results

The results below show the time taken, in nanoseconds, to run the given number of iterations. Each test was run 30 times and the figures are the averages across those runs. The average columns are per iteration, and each iteration makes four native calls.

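| Iterations | Go total (ns) | Go avg (ns) | JNA total (ns) | JNA avg (ns) | BridJ total (ns) | BridJ avg (ns) |
|------------|---------------|-------------|----------------|--------------|------------------|----------------|
| 1          | 867           | 867         | 358425         | 358425       | 284028           | 284028         |
| 10         | 7820          | 781         | 108069         | 10806        | 23810            | 2380           |
| 100        | 79030         | 789         | 763540         | 7634         | 59183            | 591            |
| 1000       | 797636        | 797         | 11786822       | 11786        | 383280           | 382            |
| 10000      | 7342885       | 733         | 75253867       | 7524         | 1814188          | 180            |
| 100000     | 75006528      | 749         | 497305685      | 4972         | 13345597         | 133            |
| 1000000    | 775811573     | 775         | 4737624876     | 4737         | 87071036         | 86             |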
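Averaging across runs can be sketched as follows. Each timed run appends rows of the form NAME, iterations, total, average to results.csv, so the sketch groups rows by implementation and iteration count and takes the mean of the totals. The names averageRuns and key, and the sample rows, are illustrative assumptions; this is not the script that produced the figures above.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// key identifies one group of rows: an implementation at one iteration count.
type key struct {
	impl  string
	iters int64
}

// averageRuns parses rows of the form "NAME, iterations, total, average"
// and returns the mean total time for each (implementation, iterations) pair.
func averageRuns(csv string) (map[key]float64, error) {
	sums := map[key]float64{}
	counts := map[key]int{}

	scanner := bufio.NewScanner(strings.NewReader(csv))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue // skip blank lines
		}
		fields := strings.Split(line, ",")
		if len(fields) != 4 {
			return nil, fmt.Errorf("unexpected row: %q", line)
		}
		iters, err := strconv.ParseInt(strings.TrimSpace(fields[1]), 10, 64)
		if err != nil {
			return nil, err
		}
		total, err := strconv.ParseInt(strings.TrimSpace(fields[2]), 10, 64)
		if err != nil {
			return nil, err
		}
		k := key{impl: strings.TrimSpace(fields[0]), iters: iters}
		sums[k] += float64(total)
		counts[k]++
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}

	means := make(map[key]float64, len(sums))
	for k, sum := range sums {
		means[k] = sum / float64(counts[k])
	}
	return means, nil
}

func main() {
	// Made-up sample rows from two runs, for illustration only.
	sample := "GO, 10, 7800, 780\nGO, 10, 7840, 784\nJNA, 10, 108000, 10800\n"
	means, err := averageRuns(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(means[key{"GO", 10}], means[key{"JNA", 10}])
}
```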

For the most part the results are as I expected. Go gets compiled to native code and linked, so it seems normal that the average time per set of calls stays roughly constant regardless of the number of calls made.

Oddly enough, both JVM-based methods have a high one-time cost. I don’t really know what causes it, as the methods are called beforehand to try to negate any setup costs, but the high cost is consistent across all test runs.

Also, interestingly enough, the average time per set of calls in the JVM versions drops as more calls are made. I don’t quite understand what is going on here, but my best guess is that the JVM performs some form of caching when making native calls.

Finally, the surprise here is the performance of BridJ. Whilst it performs poorly for a small number of calls, in a high-frequency call environment it is much faster. Ultimately I would recommend running a similar set of tests yourself if you are going to be doing something similar. I am still not convinced that the JVM isn’t doing something to skew the results in the high call count scenario, but it is good to see that Go behaves as expected and stays consistent.
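To put rough numbers on that comparison: each timed iteration makes four native calls, so the cost of a single call is the per-iteration average divided by four. The sketch below works this out from the totals in the 1,000,000-iteration row of the results; the helper name perCallNanos is an illustrative assumption.

```go
package main

import "fmt"

// callsPerIteration is the number of native calls in each timed loop body
// (sum, min, mult and div).
const callsPerIteration = 4

// perCallNanos converts a total time in nanoseconds for a given number of
// iterations into the mean cost of a single native call.
func perCallNanos(totalNanos, iterations int64) float64 {
	return float64(totalNanos) / float64(iterations*callsPerIteration)
}

func main() {
	const iterations = 1000000

	// Totals (ns) from the 1,000,000-iteration row of the results.
	goCall := perCallNanos(775811573, iterations)
	jnaCall := perCallNanos(4737624876, iterations)
	bridjCall := perCallNanos(87071036, iterations)

	fmt.Printf("Go:    %.1f ns per call\n", goCall)
	fmt.Printf("JNA:   %.1f ns per call (%.1fx slower than Go)\n", jnaCall, jnaCall/goCall)
	fmt.Printf("BridJ: %.1f ns per call (%.1fx faster than Go)\n", bridjCall, goCall/bridjCall)
}
```

On those figures Go works out at roughly 194 ns per call, JNA at roughly 1,184 ns and BridJ at roughly 22 ns, so at this call volume JNA is about six times slower than Go and BridJ about nine times faster.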
