One of the ingredients of the Concepts of Programming Languages course I teach at TU Delft, is an introduction to the C language. In two lectures I went through the language, emphasizing functions, structs, pointers, and memory management based on Kernighan & Ritchie and a nice piece by Nick Parlante on Pointers and Memory. The students were supposed to be able to understand basics of expressions and control-flow from exposure to Java and Scala.
For the lab assignment, William Cook had given me the idea of expressing dynamic dispatch in a procedural language as a method (no pun intended) to understand both the OO mechanism and appreciate the difference (and hard work) of memory management in C. The idea seemed simple enough, but while understanding the idea conceptually, I had never actually done the exercise myself. Searching the web did not result in a whole lot of information, but for this example. In case anyone needs inspiration for their course, here is the text of the assignment I gave the students in which I explain the code in the example and provide a slightly larger Java program for them to translate. (Corrections of my poor understanding of dynamic dispatch or C are welcome as well.)
(Thanks to pandoc for helping convert LaTeX to Markdown.)
C is a procedural language. C programs consists, roughly, of data structure declarations and functions working on those data structures. C is not an object-oriented language. This means that functions are not associated with objects, as methods are in Java.
In this assignment we manually translate a small Java class hiearchy to Java. This will teach us something about Java and it will teach us many aspects of C.
The program below defines a small Java class hierarchy with two classes
Base and Child with the latter a subclass of the former. Class
Base defines a method print, which is overridden by class Child.
Class BaseTest creates a Base and a Child object, but both are
assigned to a variable of type Base. When calling the method print
the program decides at run-time which implementation to run based on
the run-time type of the object.
class Base {
Integer x;
public Base(Integer v) {
x = v;
}
public void print() {
System.out.println("Base: " + x);
}
}
class Child extends Base {
Integer y;
public Child(Integer v1, Integer v2) {
super(v1);
y = v2;
}
public void print() {
System.out.println("Child: (" + x + "," + y + ")");
}
}
class BaseTest {
public static void main(String[] args) {
Base base1 = new Base(45);
Base base2 = new Child(567, 245);
base1.print();
base2.print();
}
}
The C program below contains a translation of the Java program. It uses a so called v-table or virtual method table to support run-time look up of a function depending on its ‘type’. Lets examine the code.
In C we declare data structures as structs:
struct Base {
int x;
};
This defines a new record type with field x. Using a typedef we
use the struct to declare the Base type:
typedef struct Base {
int x;
} Base;
A variable of type Base will thus contain a record with one field. The
type Base* is a pointer to a Base record.
The C language does not support the associations of methods with structs. Rather, functions are defined at top-level. In order to translate methods that belong to particular type, we adopt a naming convention, using the name of the class as prefix of the function implementing a method:
void Base_print(Base* obj) {
printf("Base: (%d)\n", obj->x);
}
Further, note that methods are applied to an object. C functions only
have regular parameters. Thus, the Base_print function has one
argument, which corresponds to the object it is applied to.
In order to label a particular function as the implementation of a
method for a particular object we use a virtual method table or
vtable. This is enabled by the fact that C allows to store
function pointers. For example, given the declaration of function
Base_print above, &Base_print is a pointer to that
function. A vtable is simply an array of function pointers. The slots
in the array correspond to the methods of of a class. Since methods are
fixed per class, we do not have to maintain an array of functions
for each object. Rather, it suffices to declare one vtable array
per class. Thus, the following global variable Base_Vtable
stores the vtable for the Base class:
void (*Base_Vtable[])() = { &Base_print };
The expression { &Base_print } constructs an array literal (of length 1).
In order to associate the vtable with an object, we extend the struct with a vtable field containing a pointer to an array of functions:
typedef struct Base {
void* (*vtable[])();
int x;
} Base;
C does not provide built-in support for constructing objects. Thus, we
have to define functions that correspond to the constructors of a class.
These functions have to do explicit memory allocation to create a
chunk of memory reserved for an object. We use the malloc function to
allocate a chunk of memory the size of the Base struct. As a
convention we create a constructor newC for a class C. Thus, for the
Base class we define:
Base* newBase(int v) {
Base* obj = (Base*)malloc(sizeof(Base));
obj->vtable = Base_Vtable;
obj->x = v;
return obj;
}
The vtable for the Base class is associated with the object by
assigning it to the vtable field.
For the Child subclass of the Base class we go through the same
steps as we did for the Base class, that is
define a typedef struct Child to represent objects of type Child
include a vtable field as first field to associate methods with
the objects of the type
implement the method(s) of the class as top-level functions with
Child_ as prefix for the function name
declare a global Child_Vtable variable to contain an array
of function pointers with a pointer for each method
define a newChild function representing the constructor for the
class
Note that the x field is inlined in the Child struct. Since C does
not support inheritance, we have to copy the fields of the superclass.
(We could try to deal with that with macros, but we will consider C
without macros in this course.)
Finally, we arrive at the magic moment. For each unique method in the
class hierarchy we define one general function that we can call for all
objects in the hierarchy. The function uses the vtable field of the
object to dispatch to the concrete method implementation:
void print(Base* obj) {
obj->vtable[0](obj);
}
The expression obj->vtable[0] is an access of the 0th position of the
vtable field of the object obj. The result is a function pointer,
which is applied to obj. The main function
puts this to the test:
int main() {
Base* base = newBase(45);
Base* child = (Base*)newChild(567, 245);
print(base);
print(child);
}
The function creates a base and a child object, of type Base and
Child, respectively. However, both variables are declared as type
Base*. Then the same function print is applied to both objects.
Since base is constructed with newBase, its vtable is
Base_Vtable and print dispatches to Base_print.
The object child is constructed with newChild, thus its vtable is
Child_Vtable and print dispatches to Child_print.
In general, the function called is determined at runt-time based on the concrete instance.
To abstract from the exact position in the vtable we can use an enumeration with symbolic names for the functions:
enum { Call_print };
void print(Base* obj) {
obj->vtable[Call_print](obj);
}
Virtual method table in C: http://www.daniweb.com/software-development/c/threads/228277
Dynamic dispatch: http://en.wikipedia.org/wiki/Dynamic_dispatch
#include <stdio.h>
#include <stdlib.h>
/* class Base */
typedef struct Base {
void (**vtable)();
int x;
} Base;
void Base_print(Base* obj) {
printf("Base: (%d)\n", obj->x);
}
void (*Base_Vtable[])() = { &Base_print };
Base* newBase(int v) {
Base* obj = (Base*)malloc(sizeof(Base));
obj->vtable = Base_Vtable;
obj->x = v;
return obj;
}
enum { Call_print };
void print(Base* obj) {
obj->vtable[Call_print](obj);
}
/* class Child */
typedef struct Child {
void (**vtable)();
int x; // inherited from Base
int y;
} Child;
void Child_print(Child const* obj) {
printf("Child: (%d,%d)\n", obj->x, obj->y);
}
void (*Child_Vtable[])() = { &Child_print };
Child* newChild(int v1, int v2) {
Child* obj = (Child*)malloc(sizeof(Child));
obj->vtable = Child_Vtable;
obj->x = v1;
obj->y = v2;
return obj;
}
/* test it */
int main() {
Base* base = newBase(45);
Base* child = (Base*)newChild(567, 245);
print(base);
print(child);
}
Translate the following Java program to a C program using the method above.
abstract class Expr {
abstract public String format();
public Integer precedence() { return 0; }
public Boolean higherPrecedence(Expr that) {
if(precedence() != 0 && that.precedence() != 0) {
return this.precedence() > that.precedence();
}
return false;
}
}
class Var extends Expr {
public String name;
public Var(String name) { this.name = name; }
public String format() { return name; }
}
class Number extends Expr {
public Integer value;
public Number(Integer value) { this.value = value; }
public String format() { return value.toString(); }
}
abstract class BinOp extends Expr {
Expr left;
Expr right;
public BinOp(Expr l, Expr r) {
left = l;
right = r;
}
abstract public String operator();
public String format() {
String l = left.format();
String r = right.format();
if(higherPrecedence(left) ) { l = "(" + l + ")"; }
if(higherPrecedence(right)) { r = "(" + r + ")"; }
return l + " " + operator() + " " + r;
}
}
class Plus extends BinOp {
public String operator() { return "+"; }
public Integer precedence() { return 10; }
public Plus(Expr l, Expr r){ super(l, r); }
}
class Mul extends BinOp {
public String operator() { return "*"; }
public Integer precedence() { return 20; }
public Mul(Expr l, Expr r) { super(l, r); }
}
class ExprTest {
public static void main(String[] args) {
Expr test = new Mul(new Number(42), new Plus(new Number(3), new Var("x")));
System.out.println(test.format());
}
}