To run the simulator, you first need to install the PVM software. Refer to the PVM documentation to install this package properly. The simulator works with internet sockets only, so you need to define NOUNIXDOM in the PVM configuration file before you compile PVM. SBL checks this condition and, if it is not satisfied, prints error messages indicating exactly what should be done.
Each SBL process is represented by one PVM task. Standard output and standard error of processes started with SBL_Spawn go to a temporary file (in the standard configuration it is /tmp/pvml.userid on each node).
The simulator consists of the following source files:
The following restrictions are introduced by this simulator:
The CLIENTSERVER compilation variable is defined in the SBL makefile (in the DEFINES macro). When this variable is defined, SBL handles cases in which processes die unexpectedly. In such cases, PVM delivers notifications to the surviving processes, and SBL is able to clean up its state. With CLIENTSERVER undefined, the SBL library is oblivious to the deaths of other processes.
The correct semantics of SBL is obtained with CLIENTSERVER defined, at the cost of a few spurious error messages printed by PVM daemons. If you run only multicomputer programs, where processes do not die unexpectedly, you can undefine CLIENTSERVER in SBL makefile and avoid these messages. For more discussion see below.
The SBLMALLOC compilation variable is defined in the SBL makefile. When this variable is defined, the SBL-supplied malloc is used instead of the standard malloc. You should not undefine this variable, because PVM may call malloc from inside functions called by the SBLSIG handler.
Next, you can start your own program in a standard way (for example from shell prompt).
If the nodes in your PVM machine use the same file system (NFS mounted), you can keep one copy of an executable and start it with an absolute path (this path must identify the executable on all nodes). Alternatively, you can place executables in the default PVM directory for binaries (in the standard configuration it is ~/pvm3/$(PVM_ARCH)/bin).
Each process can produce its own log. To do so, SBL should be compiled with DEBUG defined and LOGDIR set to the path where log files should go. By default this is /u/$(LOGNAME)/tmp. Note that the same directory is used by each process, so it should be accessible on all nodes running a given SBL program (for example, by using NFS).
The log is written to a file by calling SBL_PrintLog(). This call produces the event log for the calling process only, so to produce all log files you need to call it inside each process. The event log of a process pid executing on a node with hostname Hname goes to the text file $(LOGDIR)/
The log file is a text file, with each line representing one event. Each event has a name (a text string) followed by event-specific text. Events can be logically grouped into receive events, send events, and miscellaneous events (spawn, task exit, export and user-defined events). Both send and receive events print, after the event name, the id of the sender, followed by the id of the receiver associated with this particular event, followed by the sender's sequence number. This number is incremented each time a process sends a message (system or user-level). The triple (sender-id, receiver-id, seqno) uniquely identifies a message across all log files of a given SBL program.
Note that for a receive event, the receiver is always a process to which this log belongs. For send events, owner of this log is the sender process.
How an SBL process is identified inside log files depends on how many nodes are used by the PVM machine on which a given SBL program executes. If only one node is used, an id is just a UNIX process id; if more than one node is used, an id consists of a UNIX process id followed by a colon followed by a hostname.
User-defined events can be added to a log by calling SBL_AddLogEvent(char *evdesc). This adds a string evdesc to this process log as a description of a user event.
The following events are defined:
See the twonodes sample program for an illustration of how to use logging and how to add user-defined events.
Two processes are started: master and slave. Both are started from the same executable (twonodes.c). The master is the main process. It takes no arguments. The slave is started automatically by the master with one argument, the string "slave" (note that you should not start the slave directly).
Master process first uses SBL_Hosts to get machine configuration, and prints it on the standard output. Next, slave is started. Master process exports one receive buffer (rbuf) of size 1 SHRIMP word and with buffer id rbufId set to 2. After this export, master spins waiting for a write coming from the slave.
Slave imports receive buffer created by the master and writes to it once. Note that slave uses SBL_Parent to get node and process id of its master, which are needed for import call.
At the end, the master spins waiting for a flag to become 1. This flag is set when the notification arrives and the message handler associated with the exported receive buffer is called. Please note that the message itself does not set the flag (unlike in the original twonodes), as the data of the message is one integer equal to zero.
The first process, the master, is started with one argument: the total number of processes to start. This number should be greater than 1.
Slaves and the master execute from the same file. To distinguish between the master and the slaves, the first argument to a slave is the string "slave", and the second argument is the number of processes. You should not attempt to start a slave process directly.
The master starts nprocs - 1 slaves and initializes them by sending one initialization message to each slave. This message goes to the initialization receive buffer (InitRbuf) on each slave (note that the master has to import the initialization buffers of all slaves). The initialization message contains the node and process ids of all processes started by ring. Additionally, this message contains process arguments specific to a given slave (the order number of this slave on the ring).
To build a ring, each process exports its element of the ring (receive buffer RingElem) and imports a proxy for the ring element of the next process. Finally, each slave process receives an integer, increments it and forwards it along the ring. The master process prints the incoming token on standard output (the value of the token should be equal to nprocs).
The use of signals is caused by a semantic gap between the programming model supported by PVM and message passing in SHRIMP. Namely, in the PVM model there is an explicit receive operation, whereas in the SHRIMP model messages are received by hardware and no software intervention is necessary. Since the signal handler is provided by SBL, the use of signals is transparent to user programs, with the following exceptions: the SBLSIG signal should not be used by user programs, signals should be blocked with care, and PVM calls should not be used directly. The last restriction is necessary because PVM routines are not reentrant, so if a signal handler is invoked inside one of them, unpredictable things may happen.
To avoid such problems in SBL, the SBLSIG signal is blocked each time an SBL function is called. After the SBL call returns, the signal is unblocked. Obviously, care must be taken to avoid deadlock with this scheme, as blocking signals turns off blind (i.e. signal-generated) message receiving. If an SBL function must block or spin waiting, SBLSIG is unblocked to enable receiving.
Non-reentrant functions from other libraries (such as stdio) cannot be called from within the signal handler. For this reason, SBL supplies its own malloc-related functions, which replace the system-specific malloc. This malloc is SBLSIG signal-safe and can be used from inside the signal handler.