Next: References
Previous: History
- OMIS is no longer oriented towards PVM. Instead it tries to
cover all common message passing programming models without
preferring one or the other. In the future it will also be extended to
work with shared memory systems.
- Several collaborations have been started since OMIS 1.0 was
released. We had intensive discussions with other researchers who
would like to employ an OMIS compliant monitoring system for their
own tool environments.
- Design and implementation of an OMIS compliant monitoring system
(OCM) were started in 1996. The text gives some details.
- The system model was clarified with respect to target
architecture and target platforms. Execution objects and node
objects are defined more clearly (see Chapter 4).
- The technical terms synchronous and asynchronous service have
been replaced (see Chapter 5).
- The technical terms information service, manipulation service,
event service, conditional and unconditional request have been
introduced (see Chapter 5).
- The list of objects was enhanced by threads (see
Chapter 5).
- Examples in Chapter 5 have been rewritten
according to the new syntax.
- Some technical terms have been renamed to make their intended
meaning clearer and to avoid confusion (see Glossary).
- An extensive specification of the semantics of service requests
and replies has been added to the document (see
Chapter 7). This also includes the specification of
error handling.
- The procedural interface has been extended and revised. It is
more flexible and should be rather complete now (see
Section 7.1).
- Syntax and semantics of service requests have been modified in
order to simplify the usage of OMIS and to make it more general (see
Sections 7.2 to 7.2.4):
- The OMIS 2.0 interface offers location transparency, i.e.
services no longer have to be prefixed with a list of node
numbers. Instead, services now get as a parameter a list of
objects they shall work on. The nodes a service will be executed on
are uniquely defined by this object list.
- Objects like nodes, processes etc. are no longer addressed by
some concrete ID (e.g. a PVM task ID), but by an abstract token
generated by the monitoring system. In this way, objects can be
identified in both a globally unique and platform independent way.
- OMIS 2.0 now defines a hierarchy of application objects that
allows a conversion between the different object types, e.g.
expanding a node token into a list of tokens for all processes on
that node (see Section 7.2.2).
- Tokens are now also used to identify monitor objects,
especially conditional service requests. Thus, services no longer
have to be preceded with a request ID. Instead, a request token is
returned upon successful definition of a conditional request.
Requests containing actions modifying their own request (e.g.
deleting it in order to implement a temporary breakpoint) are
still possible by using a special event context parameter (see
Section 7.2.4). Dependency cycles between conditional
requests may also be broken via user defined events.
- To support tools extracting large amounts of data from an
application, a new data type "binary string" has been added to
avoid the overhead of converting to/from an ASCII representation.
- The specification no longer requires an atomic multicast
protocol to be used for the distribution of requests touching
multiple nodes. Instead, less restrictive requirements on the
ordering of service execution have been specified (see
Section 7.2.3). To enforce exclusive execution of a
request, action lists can now be locked.
- The specification of ordering constraints on the individual
actions in an action list has been revised. There are no longer
parallel action lists, in which even actions on the same node could be
executed in parallel or in any order different from the specified
one. Instead, execution of services on the same node is now
guaranteed to be sequential; parallelism can occur between nodes.
The separator ';' can be used anywhere within an action list to
denote a barrier-like synchronization (see
Section 7.2.3). The separator ',' has been removed.
- Event context parameters are now referenced by name rather
than by number to increase ease of use.
- The colon separating the event definition from the action list
in a request is now required even for unconditional requests. This
simplifies parsing considerably.
- The format of service replies has been changed completely.
Instead of a single linear string that has to be parsed by the tool,
a more structured representation is used now. A reply consists of a
sequence of sub-replies, one for each service contained in the
request. Each of these sub-replies is again a sequence of results,
one for each object the service operated on. Identical results may
be merged into a single entry (see Section 7.3).
- The way the monitoring system connects to an application
program has been changed in order to remove the previous dependency
on the notion of a virtual machine (as in PVM). In OMIS 2.0, a tool
must explicitly attach to all the nodes and processes it wants to be
monitored. This scheme also implies that, if multiple tools connect
to the same monitoring system, each of these tools has its own
specific view of the observed system. In the course of this change,
we also had to change the handling of creation and deletion of
processes and nodes (see Sections 8.1.1,
8.1.2, and 8.1.3).
- There are a large number of changes in the description of the
specific monitoring services:
- The naming of services is more systematic now. A prefix
indicates the type of object the service is working on. In addition,
names are chosen in a way that indicates the kind of service
(i.e. information service, manipulation service, event service).
- The set of process services of OMIS 1.0 has been carefully
analyzed and split into separate services for processes and
threads. Thus, OMIS 2.0 can also be used for multithreaded
systems.
- Some parameter lists have been extended (e.g. block length
and stride for memory accesses, length of stack backtrace, etc.).
- Specifications are now more accurate and complete. This mostly
concerns the services node_get_info,
proc_get_info, and thread_get_info.
- Finally, the set of services has been split into a basic set
that is independent of the parallel programming library used (see
Chapter 8), and a PVM extension providing
services to support PVM (see Chapter 9).
- The sections on implementation concepts have been removed from
this document. When our implementation of the OMIS compliant
monitoring system is finished, a design document will be prepared as a
separate report.
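
The structured reply format mentioned in the list above (a reply is a sequence of sub-replies, one per service; each sub-reply is a sequence of results, one per object; identical results may be merged) can be illustrated with a small sketch. This is only a model of the idea, not the OMIS data layout: the class names, the token strings such as `t_1`, the result values, and the `merge_identical` helper are all assumptions made for illustration, and the actual representation is defined in Section 7.3.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Result:
    """One entry of a sub-reply (hypothetical model)."""
    tokens: List[str]  # object tokens this entry applies to (made-up format)
    value: str         # result payload (made-up values)


@dataclass
class SubReply:
    """Results of one service contained in the request (hypothetical model)."""
    service: str
    results: List[Result] = field(default_factory=list)


def merge_identical(results: List[Result]) -> List[Result]:
    """Merge entries with identical values into a single entry.

    The order of first appearance of each value is preserved.
    """
    merged = {}
    for r in results:
        merged.setdefault(r.value, []).extend(r.tokens)
    return [Result(tokens, value) for value, tokens in merged.items()]


# A reply is modelled as a list of sub-replies, one per service in the request.
reply = [
    SubReply("thread_get_info", merge_identical([
        Result(["t_1"], "running"),
        Result(["t_2"], "stopped"),
        Result(["t_3"], "running"),
    ])),
]

for sub in reply:
    for res in sub.results:
        print(sub.service, res.tokens, res.value)
```

In this sketch the two threads reporting the same value share one entry, so a tool iterates over structured entries instead of parsing a single linear string.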
First version of the OMIS document.
Thomas Ludwig
9/11/1997