Virtual Method Tables

May 08, 2011

One of the ingredients of the Concepts of Programming Languages course I teach at TU Delft, is an introduction to the C language. In two lectures I went through the language, emphasizing functions, structs, pointers, and memory management based on Kernighan & Ritchie and a nice piece by Nick Parlante on Pointers and Memory. The students were supposed to be able to understand basics of expressions and control-flow from exposure to Java and Scala.

For the lab assignment, William Cook had given me the idea of expressing dynamic dispatch in a procedural language as a method (no pun intended) to understand both the OO mechanism and appreciate the difference (and hard work) of memory management in C. The idea seemed simple enough, but while understanding the idea conceptually, I had never actually done the exercise myself. Searching the web did not result in a whole lot of information, but for this example. In case anyone needs inspiration for their course, here is the text of the assignment I gave the students in which I explain the code in the example and provide a slightly larger Java program for them to translate. (Corrections of my poor understanding of dynamic dispatch or C are welcome as well.)

(Thanks to pandoc for helping convert LaTeX to Markdown.)

From C to Java

C is a procedural language. C programs consists, roughly, of data structure declarations and functions working on those data structures. C is not an object-oriented language. This means that functions are not associated with objects, as methods are in Java.

In this assignment we manually translate a small Java class hiearchy to Java. This will teach us something about Java and it will teach us many aspects of C.

Virtual Methods in Java

The program below defines a small Java class hierarchy with two classes Base and Child with the latter a subclass of the former. Class Base defines a method print, which is overridden by class Child.

Class BaseTest creates a Base and a Child object, but both are assigned to a variable of type Base. When calling the method print the program decides at run-time which implementation to run based on the run-time type of the object.

class Base {
  Integer x;
  public Base(Integer v) {
    x = v;
  }
  public void print() {
    System.out.println("Base: " + x);
  }
}
class Child extends Base {
  Integer y;
  public Child(Integer v1, Integer v2) {
    super(v1);
    y = v2;
  }
  public void print() {
    System.out.println("Child: (" + x + "," + y + ")");
  }
}
class BaseTest {
  public static void main(String[] args) {
    Base base1 = new Base(45);
    Base base2 = new Child(567, 245);
    base1.print();
    base2.print();
  }
} 

Translating Virtual Methods to C

The C program below contains a translation of the Java program. It uses a so called v-table or virtual method table to support run-time look up of a function depending on its ‘type’. Lets examine the code.

Representing Objects

In C we declare data structures as structs:

  struct Base {
    int x;
  };

This defines a new record type with field x. Using a typedef we use the struct to declare the Base type:

  typedef struct Base {
    int x;
  } Base;

A variable of type Base will thus contain a record with one field. The type Base* is a pointer to a Base record.

Translating Methods

The C language does not support the associations of methods with structs. Rather, functions are defined at top-level. In order to translate methods that belong to particular type, we adopt a naming convention, using the name of the class as prefix of the function implementing a method:

  void Base_print(Base* obj) {
    printf("Base: (%d)\n", obj->x);
  }

Further, note that methods are applied to an object. C functions only have regular parameters. Thus, the Base_print function has one argument, which corresponds to the object it is applied to.

Associating Methods with Objects

In order to label a particular function as the implementation of a method for a particular object we use a virtual method table or vtable. This is enabled by the fact that C allows to store function pointers. For example, given the declaration of function Base_print above, &Base_print is a pointer to that function. A vtable is simply an array of function pointers. The slots in the array correspond to the methods of of a class. Since methods are fixed per class, we do not have to maintain an array of functions for each object. Rather, it suffices to declare one vtable array per class. Thus, the following global variable Base_Vtable stores the vtable for the Base class:

  void (*Base_Vtable[])() = { &Base_print };

The expression { &Base_print } constructs an array literal (of length 1).

In order to associate the vtable with an object, we extend the struct with a vtable field containing a pointer to an array of functions:

  typedef struct Base {
    void* (*vtable[])();
    int x;
  } Base;

Constructing Objects

C does not provide built-in support for constructing objects. Thus, we have to define functions that correspond to the constructors of a class. These functions have to do explicit memory allocation to create a chunk of memory reserved for an object. We use the malloc function to allocate a chunk of memory the size of the Base struct. As a convention we create a constructor newC for a class C. Thus, for the Base class we define:

  Base* newBase(int v) {
    Base* obj = (Base*)malloc(sizeof(Base));
    obj->vtable = Base_Vtable;
    obj->x = v;
    return obj;
  }

The vtable for the Base class is associated with the object by assigning it to the vtable field.

Inheritance

For the Child subclass of the Base class we go through the same steps as we did for the Base class, that is

  1. define a typedef struct Child to represent objects of type Child

  2. include a vtable field as first field to associate methods with the objects of the type

  3. implement the method(s) of the class as top-level functions with Child_ as prefix for the function name

  4. declare a global Child_Vtable variable to contain an array of function pointers with a pointer for each method

  5. define a newChild function representing the constructor for the class

Note that the x field is inlined in the Child struct. Since C does not support inheritance, we have to copy the fields of the superclass. (We could try to deal with that with macros, but we will consider C without macros in this course.)

Dynamic Dispatch

Finally, we arrive at the magic moment. For each unique method in the class hierarchy we define one general function that we can call for all objects in the hierarchy. The function uses the vtable field of the object to dispatch to the concrete method implementation:

  void print(Base* obj) {
    obj->vtable[0](obj);
  }

The expression obj->vtable[0] is an access of the 0th position of the vtable field of the object obj. The result is a function pointer, which is applied to obj. The main function puts this to the test:

  int main() {
    Base* base = newBase(45);
    Base* child = (Base*)newChild(567, 245);
    print(base);
    print(child);
  }

The function creates a base and a child object, of type Base and Child, respectively. However, both variables are declared as type Base*. Then the same function print is applied to both objects. Since base is constructed with newBase, its vtable is Base_Vtable and print dispatches to Base_print. The object child is constructed with newChild, thus its vtable is Child_Vtable and print dispatches to Child_print.

In general, the function called is determined at runt-time based on the concrete instance.

To abstract from the exact position in the vtable we can use an enumeration with symbolic names for the functions:

  enum { Call_print };
  void print(Base* obj) {
    obj->vtable[Call_print](obj);
  }

References

The C Code

#include <stdio.h>
#include <stdlib.h>

/* class Base */

typedef struct Base {
  void (**vtable)();
  int x;
} Base;

void Base_print(Base* obj) {
  printf("Base: (%d)\n", obj->x);
}

void (*Base_Vtable[])() = { &Base_print };

Base* newBase(int v) {
  Base* obj = (Base*)malloc(sizeof(Base));
  obj->vtable = Base_Vtable;
  obj->x = v;
  return obj;
}

enum { Call_print };

void print(Base* obj) {
  obj->vtable[Call_print](obj);
}

/* class Child */

typedef struct Child {
  void (**vtable)();
  int x; // inherited from Base
  int y;
} Child;

void Child_print(Child const* obj) {
  printf("Child: (%d,%d)\n", obj->x, obj->y);
}

void (*Child_Vtable[])() = { &Child_print };

Child* newChild(int v1, int v2) {
  Child* obj = (Child*)malloc(sizeof(Child));
  obj->vtable = Child_Vtable;
  obj->x = v1;
  obj->y = v2;
  return obj;
}

/* test it */

int main() {
  Base* base = newBase(45);
  Base* child = (Base*)newChild(567, 245);
  print(base);
  print(child);
}

Assignment

Translate the following Java program to a C program using the method above.

abstract class Expr {
  abstract public String format();
  public Integer precedence() { return 0; }
  public Boolean higherPrecedence(Expr that) { 
    if(precedence() != 0 && that.precedence() != 0) { 
      return this.precedence() > that.precedence();
    }
    return false;
  }
}
class Var extends Expr {
  public String name;
  public Var(String name) { this.name = name; }
  public String format() { return name; }
}
class Number extends Expr {
  public Integer value;
  public Number(Integer value) { this.value = value; }
  public String format() { return value.toString(); }
}
abstract class BinOp extends Expr {
  Expr left;
  Expr right;
  public BinOp(Expr l, Expr r) { 
    left = l;
    right = r;
  }
  abstract public String operator();
  public String format() {
    String l = left.format();
    String r = right.format();
    if(higherPrecedence(left) ) { l = "(" + l + ")"; } 
    if(higherPrecedence(right)) { r = "(" + r + ")"; } 
    return l + " " + operator() + " " + r;
  }
}
class Plus extends BinOp {
  public String operator() { return "+"; }
  public Integer precedence() { return 10; }
  public Plus(Expr l, Expr r){ super(l, r); }
}
class Mul extends BinOp {
  public String operator() { return "*"; }
  public Integer precedence() { return 20; }
  public Mul(Expr l, Expr r) { super(l, r); }
}
class ExprTest {
  public static void main(String[] args) {
    Expr test = new Mul(new Number(42), new Plus(new Number(3), new Var("x")));
    System.out.println(test.format());
  }
}