Real parallel programming always has been based on specific types of application programming interfaces (APIs). These APIs have been designed around experience and need, integrating techniques of concurrent programming (Mutual Exclusion, Monitors, Threads) and interprocessor communication (PVM, RPC, NX, MPI).
Because these APIs are not models of computation, it is difficult to use them to study or to predict the behavior of parallel algorithms running on parallel machines.Recent efforts such as LogP model can be used to study the behavior of message passing programs quite realistically, but they are not designed for programming parallel algorithms or studying theoretical properties of actual parallel machines.
A model of parallel computing analogous to the von Neumann model of sequential computing could wed theoretical analysis to real parallel programming. The BSP model was introduced by Les Valiant as just such a bridging model, linking architecture and software, theory and practice. BSP offers both a a common abstraction toward which computer architects and compiler writers can design, and a concise model of parallel program execution enabling accurate performance prediction for proactive application design.
A bulk synchronous parallel computer is a set of processing nodes, each performing processing and/or memory functions,a router that delivers messages point-to-point between processing nodes, andan efficient means of synchronizing all or a subset of the processing nodes.
A bulk synchronous parallel computation is divided into supersteps, delineated by synchronization operations. During a superstep, a processing node may make memory references, perform I/O, compute, and send messages. Messages sent during a superstep are not guaranteed to arrive until the following superstep.
Why use VMMC?
BSP semantics permit two modes of messaging, which we call deferred and immediate. In the deferred mode, data values are changed during synchronization, while inimmediate mode, data values are changed when a send operation actually occurs. Clearly, VMMC provides an efficient platform on which to implement immediatemode, since "remote write" is the basic mechanism that VMMC provides. Care must be taken to insure that sending and receive processes are in the samesuperstep. The tradeoff is between this cost and the cost of copying imposed by the deferred model. Research also shows that VMMC supports deferred mode efficiently. Synchronization can be effected both by traditional exchange of special messages and by counting the number of incoming messages.Surprisingly, receivers almost always "know" the number of incoming messages during a given superstep. In those instances where they don't, senders can inform receivers explicitely. VMMC is an excellent platform to test these different mode and models.