There are many aspects of building general-purpose interfaces between Squeak and external facilities that are outside of the scope of a basic 9-page introduction, and are left out here.
Comments are invited; please send them to stp@create.ucsb.edu.
Please note that if lots of us start writing random and not-really-well-motivated primitives we won't be able to share any code at all any more. The namespace of primitives is limited; there is no formal mechanism for managing that space with multiple primitive-writers; and merging two virtual machines with different primitive extensions can be a *real* pain. Do not do this lightly.
For the purposes of this example, I'll take a method from the Siren/Squeak MIDI I/O interface. This is the input primitive that reads a MIDI data packet from the OS-level driver. The details are moot for this presentation.
I have a class MIDIPacket that has inst. vars. as shown in the following definition.
Object subclass: #MIDIPacket instanceVariableNames: 'length time flags data ' ...
The first three inst. vars. are integers, the last is a ByteArray (which is pre-allocated to 3 bytes--the max size of normal MIDI messages [system exclusive packets are handled specially]).
The primitive will live in class PrimMIDIPort and will take a MIDIPacket and pass it down to the VM, which will fill it in with data read from the MIDI driver. The primitive returns the number of bytes read (the length inst. var. of the packet). Since the primitive does not use the state of its receiver, it could be put in just about any class. The argument is the important component.
So, the primitive method will look like,
PrimMIDIPort >> primReadPacket: packet data: data
I pass the packet object and the data byte array separately for simplicity of the C code and for flexibility (in case I decide to split them into two Smalltalk objects in the future). It's also easier to decompose an object in Smalltalk than it is in C.
int sqReadMIDIPacket(int MIDIpacket, int dataBuffer);
    // Read input into a MIDI packet. (prim. 614)
    // 'MIDIpacket' is interpreted as (MIDIPacket *) and is
    // written into with the time-stamp, flags, and length.
    // 'dataBuffer' is interpreted as (unsigned char *) and
    // gets the MIDI message data.
    // Answer the number of data bytes in the packet.
Note that all arguments are passed as ints; you can cast them into 'whatever' at will in the C code (see below). (This is yet another reason to write really good comments in your interfaces.)
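As a minimal sketch of this "everything is an int" convention, the fragment below passes a buffer address through an integer-typed parameter and casts it back inside the function. The names fillBuffer and demoFill are hypothetical and are not part of the VM. The sketch uses intptr_t rather than plain int only so that it also runs on 64-bit machines; the VM described here assumes that an int can hold a pointer.

```c
#include <stdint.h>

/* Hypothetical stand-in for a primitive: 'buffer' arrives as an
 * integer and is reinterpreted as (unsigned char *). */
int fillBuffer(intptr_t buffer, int count)
{
    unsigned char *bytes = (unsigned char *)buffer;
    int i;

    for (i = 0; i < count; i++)
        bytes[i] = (unsigned char)(0x90 + i);
    return count;               /* answer the number of bytes written */
}

/* Hypothetical caller: hides the pointer in an integer, much as the
 * glue code does when it hands an object pointer to C. */
int demoFill(unsigned char *storage, int count)
{
    return fillBuffer((intptr_t)storage, count);
}
```

Nothing in the signature says what the integer really is, which is why the comments on the prototype matter so much.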
Most of my primitives return integers (negative values for common error conditions, I apologize, I just programmed in C for too long in my youth) and fail only in extreme situations. This is a personal preference--I tend to pass the error return values up to higher levels of code to handle. Other designers might always want to have a failure handler right in the method that called the prim--see the discussion below. It would be easier if Squeak had well-integrated exception handling and raised a system exception on primitive failure so that the calling method could decide whether to use the primitive method's failure code or not.
get: packet
    "Read the data from the receiver into the argument (a MIDIPacket)."

    | len |
    "reads packet header and data, answers amt. of data read"
    len := self primReadPacket: packet data: packet data.
    len >= 0
        ifFalse: [...What to do on bad return value rather than failure...].
    ^len
In Siren, this is called by a read loop that's triggered by a semaphore coming up from the VM, but that's outside of the scope here.
The actual primitive methods generally have names that start with "prim" as shown above.
primReadPacket: packet data: data
    "Read a packet from the MIDI driver."
    "Write data into the arguments; answer the number of bytes read."

    <primitive: 614>
    self error: 'MIDI read failed.'

The "<primitive: XXX>" construct is a primitive call--it's Smalltalk's way of "trapping" into the VM. The body of the method is the primitive. The primitive number (614) is an index into the table of all primitives that's in the Interpreter class.
If the primitive returns successfully, the statements that follow the primitive call will not be executed. On the other hand, if the primitive fails, the Smalltalk code that follows the primitive call will be executed. This is quite handy for cases where you want to try a Smalltalk implementation (e.g., a good number of primitives fail if the arguments are not of the default types), or re-try the primitive with different arguments (e.g., coerce one of the arguments and re-send the method).
The return value from the primitive (actually, the thing left on the top of the stack by the glue code--see below) will be the return value of this method.
There is a great deal of flexibility here, and interested readers are encouraged to read and analyze more of the Interpreter primitive methods (for example the sound I/O or network interface methods). Remember that these are all translated to C, so they cannot use all the language features of Smalltalk. (I'd give you the details of the Smalltalk-to-C translator if I understood them.)
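The example that follows demonstrates the basic flow of the several stages in a typical glue method:
(1) get the arguments from the stack;
(2) check the argument types;
(3) call the C function;
(4) pop the arguments and the receiver off of the stack; and
(5) push the return value.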
I have annotated the method below with these stages (in parentheses). Also note that I generally include both the Smalltalk method header and C function prototype as comments in this method; this makes debugging it much easier. In the Interpreter (and/or DynamicInterpreter) class, we have to write,
primitiveReadMIDIPacket
        "Read a message (a MIDIPacket) from the MIDI interface."
        "ST: PrimMIDIPort primReadPacket: packet data: data"
        "C: int sqReadMIDIPacket (int packet, int data);"

        | packet data answer |
        "Get the arguments"
(1)     data := self stackValue: 0.
(1)     packet := self stackValue: 1.
        "Make sure that 'data' is byte-like"
(2)     self success: (self isBytes: data).
        "Call the primitive"
        successFlag
(3)         ifTrue: [answer := self cCode: 'sqReadMIDIPacket (packet, data + 4)'].
        "Pop the args and rcvr object"
        successFlag
(4)         ifTrue: [self pop: 3.
                "Answer the number of data bytes read"
(5)             self push: (self integerObjectOf: answer)]
For (1), note that the arguments are pushed onto the stack in reverse order (so the last argument is stack(0), the next-to-last is stack(1), etc.). There are methods (in ObjectMemory, the superclass of Interpreter) that allow you to get integers and other kinds of things from the stack with automatic conversion. (Look at the other primitive methods in class Interpreter for lots of examples.) Since both of the arguments here are pointers, I use stackValue:.
Step (2) is a simple example of type-checking on primitive arguments. The success: message sets the primitive success/fail flag based on whether the second argument is a ByteArray. The method/function success: is used in the Smalltalk glue code and in C primitive implementations to signal primitive success or failure; to fail, set success to false, as in the test in step 2.
Step (3) uses the message "cCode: aString"; its argument is a string of C code (here, a call to our function), and it is here that we actually call our C-language primitive. Note that I must use the actual variable names packet and data in the string. The "data + 4" is there because the argument is a ByteArray but the C code casts it as (unsigned char *); 4 is the size of the object header in bytes, so I skip it to pass the base address of the byte array's actual (char *) data. This is a hard-coded special value that assumes the object header is 32 bits.
In step (4), we pop the two arguments *and* the receiver object (a PrimMIDIPort instance) off of the stack if the primitive succeeded.
Step (5) pushes the C function's return value onto the stack as an integer. There are other coercion functions in ObjectMemory; you can see them used in other primitive methods in class Interpreter.
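The five stages can be mimicked in plain C with a toy stack, roughly the shape the glue method takes once it has been translated. This is only a sketch under stated assumptions. The types and the names prefixed with "toy" are hypothetical, the "byte-like" test is faked with a marker byte in the header, and the real interpreter's stack and object formats are more involved.

```c
#include <stdint.h>

#define HEADER_SIZE 4                    /* assumed 32-bit object header, as in the text */

typedef intptr_t oop;

/* Toy byte object: a 4-byte header followed by the bytes themselves. */
typedef struct {
    unsigned char header[HEADER_SIZE];   /* header[0] == 1 marks "byte-like" in this toy */
    unsigned char bytes[3];
} ToyByteArray;

static oop toyStack[16];
static int toySP = -1;                   /* index of the top of the stack */
static int successFlag = 1;

static void push(oop value)        { toyStack[++toySP] = value; }
static void pop(int n)             { toySP -= n; }
static oop  stackValue(int offset) { return toyStack[toySP - offset]; }
static void success(int ok)        { successFlag = successFlag && ok; }
static int  isBytes(oop obj)       { return ((unsigned char *)obj)[0] == 1; }
static oop  integerObjectOf(int v) { return (oop)v * 2 + 1; }  /* tagged integer */

/* Hypothetical stand-in for the C primitive: writes a 3-byte message. */
static int toyReadPacket(oop packet, unsigned char *data)
{
    (void)packet;                        /* the toy ignores the packet object */
    data[0] = 0x90; data[1] = 60; data[2] = 100;
    return 3;
}

/* The five stages, in the order the text describes them. */
void toyPrimitiveReadPacket(void)
{
    oop data, packet;
    int answer = 0;

    successFlag = 1;
    data   = stackValue(0);              /* (1) last argument */
    packet = stackValue(1);              /* (1) next-to-last argument */
    success(isBytes(data));              /* (2) type check */
    if (successFlag)                     /* (3) call C, skipping the header */
        answer = toyReadPacket(packet, (unsigned char *)data + HEADER_SIZE);
    if (successFlag) {
        pop(3);                          /* (4) two arguments plus the receiver */
        push(integerObjectOf(answer));   /* (5) push the tagged result */
    }
}

/* Hypothetical driver: answers the tagged result, or 0 if the primitive
 * failed (the stack is then left alone for the Smalltalk fallback code). */
oop toyRun(ToyByteArray *array)
{
    toySP = -1;
    push((oop)0x1235);                   /* receiver (contents irrelevant here) */
    push((oop)0x1237);                   /* packet argument (unused by the toy) */
    push((oop)array);                    /* data argument */
    toyPrimitiveReadPacket();
    return successFlag ? stackValue(0) : 0;
}
```

Note that on failure nothing is popped, so the receiver and arguments are still in place when the Smalltalk code after the primitive call runs.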
I have not discussed data sharing between glue code and primitives, but there are some nifty and flexible facilities for it. (You can actually declare a temporary variable or argument to the glue method with the exact format it will have in C.) Look at John Maloney's sound primitives, or browse senders of var:declareC: as used in Interpreter >> primitiveSoundGetRecordingSampleRate (or pay me a really fat consulting fee to tell you about it :-) .
Because the VM is single-threaded, the garbage collector will not normally run while your glue code (and the primitive it calls) is active, so the objects you pass to C are safe for the duration of the primitive. (The exception is glue code that itself allocates new objects, which can trigger a collection.) If you want to pass an object pointer down to C code and have it held onto across primitive calls, you have to register it with the garbage collector as special so it will not be moved. (I've generated gigabytes of core dumps over the years with a whole array of VMs by forgetting this.) Look at the senders of SystemDictionary >> registerExternalObject: for places that do this.
The glue code method is translated to C when you generate a new interp.c file (see below), so it is important to remember that you can't just send arbitrary Smalltalk messages from here. Look at the other primitive glue code methods in Interpreter (or DynamicInterpreter) for more examples.
... (614 primitiveReadMIDIPacket) ...
The init. method is called automagically when you regenerate the interpreter, so you don't have to do that now.
Although there is no formal method for registering primitive numbers, Ward Cunningham's Wiki server does have a page for "voluntary" reservations (see http://c2.com:8080/PrimitiveNumberRegistry). I strongly recommend that you coordinate with other developers by looking here and telling the world what numbers you're using.
Interpreter translate: 'interp.c' doInlining: true.

This'll take a while, and will create a file named "interp.c" in the same directory as the VI. If you haven't already done so, you also need to write out all the other VM sources by executing,
InterpreterSupportCode writeMacSourceFiles

or whatever is appropriate on your platform.
If you're new to VM-generation, you should definitely make sure you can re-create the default Squeak VM for your platform before you try adding new primitives. Test the development project or makefile and platform-specific source files first, then add your new primitive code.
/***************************************************************
 * sqReadMIDIPacket -- Sent in response to the input semaphore
 * This is to up-load MIDI messages into the arguments (a MIDIPacket
 * and its data ByteArray -- both passed as ints and cast here).
 * ST: PrimMIDIPort primReadPacket: packet data: data
 */

int sqReadMIDIPacket(int ipacket, int idata)
{
        // The ipacket object is defined as:
        //   Object subclass: #MIDIPacket
        //     instanceVariableNames: 'length time flags data '
        // idata is a byte array (+4 to skip the object header)
    sqMIDIEvent *outPtr;
    unsigned char *cdata;
    int len, i;
    unsigned char *pdata = (unsigned char *)idata;

    success(true);              // Set the success flag to true.

    if (itemsInInQ == 0)        // Return immediately if there is no input.
        return (0);
    if (sqCallback == 0)        // Answer an error code if input is off.
//      success(false);         // Set the success flag to false to fail.
        return (-1);
                                // Get a pointer to the MIDI event structure.
    outPtr = &sqMIDIInQ[itemsInInQ2++];
                                // Print a message for debugging--yes,
                                // you can use printf() in the VM!
    if (debug_mode)
        printf("%x %x %x\n",
            outPtr->data[0], outPtr->data[1], outPtr->data[2]);
    len = outPtr->len;          // Copy the response fields.
                                // Copy the driver data into the packet.
                                // (Inst vars are 1-based.)
    instVarAtPutInt(ipacket, 1, len);       // Copy length, time, flags.
    instVarAtPutInt(ipacket, 2, (int)(outPtr->timeStamp));
    instVarAtPutInt(ipacket, 3, (int)(outPtr->flags));
    cdata = &(outPtr->data[0]); // Copy MIDI message bytes into the packet.
    for (i = 0; i < len; i++)
        *pdata++ = *cdata++;
    return (len);               // Answer len.
}                               // End of fcn
Most of this should be pretty obvious to the seasoned C programmer. The cast of the idata argument from int to (unsigned char *) will work because it's actually a ByteArray (+ 4) in Smalltalk. The instVarAtPutInt() macro is defined as,
#define longAtput(i, val) (*((int *) (i)) = val)
#define instVarAtPutInt(obj, slot, val) \
    longAtput(((char *)obj + (slot << 2)), (((int)val << 1) | 1))

This is nasty, but allows you to stuff 31-bit integers into SmallInteger instance variables with abandon. If you look into interp.c, there are more useful macros for primitive writers that would help you if you need to write floats, etc.
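The arithmetic behind the macro can be sketched on its own, assuming 32-bit words: a SmallInteger is the value shifted left one bit with the low bit set as a tag, and slot n of an object sits n words past the start of its header. The function names below are hypothetical helpers for illustration and do not appear in interp.c.

```c
#include <stdint.h>

/* Encode a C integer as a tagged SmallInteger. For values in the 31-bit
 * range this equals (value << 1) | 1; it is written arithmetically so
 * that it stays well defined for negative values. */
int32_t toSmallInteger(int32_t value)
{
    return value * 2 + 1;
}

/* Decode a tagged SmallInteger back into a C integer. */
int32_t fromSmallInteger(int32_t tagged)
{
    return (tagged - 1) / 2;
}

/* Only 31 bits survive the tagging: -2^30 .. 2^30 - 1. */
int fitsInSmallInteger(int32_t value)
{
    return value >= -1073741824 && value <= 1073741823;
}

/* Toy version of instVarAtPutInt(): 'object' points at the header word,
 * so 1-based slot n lives at object[n], which is the same address as
 * (char *)object + (n << 2). */
void toyInstVarAtPutInt(int32_t *object, int slot, int32_t value)
{
    object[slot] = toSmallInteger(value);
}
```

A value outside the 31-bit range would silently lose its top bit in the macro above, so a range check like fitsInSmallInteger() is worth doing before storing anything that could be large, such as a time-stamp.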
It's outside of the scope of this introduction to go into the details of object unpacking in C, but given the above macros, Dan Ingalls's notes on the Squeak object format (from their OOPSLA paper), and a good debugger, you can pretty much do anything. As I stated above, however, it's my opinion that it's easier to unpack objects in Smalltalk, so I have lots of primitives that take several arguments that are the components of one top-level Squeak object.
The last line of the function returns an integer to the glue code, which pushes it onto the stack explicitly after popping the arguments and receiver object.
Note that I also use printf() for debugging. On a Mac, printf() from the VM pops up an output window for the messages. I use the following macros for debugging primitives,
Boolean debug_mode = true;  // Enable/Disable printfs (see macros below)

// Debugging macros
#define dprint1(str)          if(debug_mode) printf(str)
#define dprint2(str, val)     if(debug_mode) printf(str, val)
#define dprint3(str, v1, v2)  if(debug_mode) printf(str, v1, v2)
etc...

(The same could be done with #ifdef, of course.)
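One way the #ifdef variant might look is sketched below. The DEBUG_PRIMS symbol and the traceLength() helper are assumptions made for this example. The macros here expand to expressions rather than if-statements, so a disabled build compiles the printf() calls away entirely.

```c
#include <stdio.h>

#define DEBUG_PRIMS             /* remove this line to compile the output away */

#ifdef DEBUG_PRIMS
#define dprint1(str)          printf(str)
#define dprint2(str, val)     printf(str, val)
#define dprint3(str, v1, v2)  printf(str, v1, v2)
#else
#define dprint1(str)          (0)
#define dprint2(str, val)     (0)
#define dprint3(str, v1, v2)  (0)
#endif

/* Hypothetical helper: traces a packet length and answers the number of
 * characters printed (0 when DEBUG_PRIMS is not defined). */
int traceLength(int len)
{
    return dprint2("len = %d\n", len);
}
```

The run-time flag has the advantage that it can be flipped from a debugger without rebuilding the VM; the compile-time version costs nothing in a production build.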
/* MIDI Prims */            // Added by STP
#include "OMS.h"            // OMS definitions and structs
#include <MIDI.h>           // Apple MIDI Libraries
#include "sqMIDI.h"         // Squeak MIDI Structs and Prims

and in my package's header file--sqMIDI.h--I have,
int sqReadMIDIPacket(int MIDIpacket, int dataBuffer);

Note that I have to include another header file for the OMS libraries, and to include the Apple MIDI library. This would not be necessary for a simpler primitive that had less "baggage" of its own.
There's another whole note yet to be written about debugging primitives, but on most platforms you can simply use the debugger to put breakpoints in the C primitive methods and single-step through them (Smalltalk will be frozen all the while, of course).
There is really no net (in terms of memory protection or "safe" primitives) here; it's quite easy to corrupt Smalltalk's heap or other memory with C, and to end up with a system that crashes unpredictably some time after you call your primitive. Be really careful about memory and stack management. Also remember the note above (in all-bold) that objects can be moved by the garbage collector between primitive calls, so if you ever pass a pointer to the VM to hold onto, you have to register it in Squeak as being external.
You can also trigger Smalltalk semaphores from C primitives; see John Maloney's SoundPlayer class or Siren's PrimMIDIPort for examples. This is by far the best way to implement "call-backs" from C to Smalltalk--have the Smalltalk application class pass down a semaphore to the VM and then start a loop process that waits for the semaphore and handles it asynchronously. (If you're really clever, you can even create events and post them in Squeak's event input queue.)
For more examples: See the socket primitives for a simple interface to an external API (that passes structures around and coerces between Smalltalk objects and C structs); see the sound player primitives for examples of asynchronous I/O; see the AbstractSound classes for examples of automatically generated primitives.
Many thanks to Reinier van Loon (R.L.J.M.W.van.Loon@inter.nl.net) for the initial HTML translation of this text.
Comments are invited.
Stephen Travis Pope