This page is currently incomplete. The completed guide will be made available in a future release and on the PTP wiki page.



The PTP Configurable Resource Manager

Introduction

The JAXB Resource Manager plug-ins, part of the Parallel Tools Platform (PTP), allow you to launch and monitor applications on local or remote resources using resource managers which are configured from an XML file via JAXB (javax.xml.bind) technology.

There are two main motivations for providing this class of resource managers:

  1. To allow for maximum configurability. Often job schedulers (PBS, LSF, LoadLeveler, etc.) or interactive runtime systems (OpenMPI, PE, SLURM, etc.) are set up by system administrators in special or non-standard ways which make it difficult to use a generic tool. The configuration file allows a user or community of users to fit the resource manager to a class of systems or even to a single host.
  2. To avoid special coding in most cases. Because the resource manager and its UI presentation are built from an XML configuration, users should be able to accommodate new systems, at least on the client end, without writing and loading additional Eclipse plug-ins. (The only qualification here is that the monitoring component must also support that type of scheduler or runtime; see the following paragraphs.)

An additional consideration in designing a generically configurable resource manager was to partition the client functionality so as to eliminate the need for special server-side proxies and to scale more successfully in the updating of job and resource information.

To this end, JAXB resource managers now consist of two components: a "control", which governs the configuration, launch and cancellation of individual jobs entirely from the client end, and a "monitor", which displays job status as well as global information about the HPC resource. In most cases, the monitor will be a pre-built type provided by the PTP distribution, implemented using LLview. Since LLview already supports a good number of the standard scheduler types, adding a new resource manager type will normally entail only the specific configuration of its control part. We plan to make available both user-initiated and system-wide deployment of the necessary LLview parts (mostly Perl scripts) for monitoring the target resource. See the User pages for more information.

The following is a guide to creating or modifying a resource manager configuration. Those interested only in using the JAXB resource managers already provided with the PTP distribution should consult the User pages under the relevant scheduler (currently only the PBS resource managers are JAXB-configurable).


Configuring the Resource Manager

The JAXB Resource Manager is model-driven; this means that its functioning and appearance are determined by a set of definitions, usually provided via an XML file. What follows is a detailed explanation of the schema governing the resource manager XML configuration.

The Resource Manager Data Type

ResourceManagerData

The top level of the definition tree consists of three elements: site-, control- and monitor-data. In addition, a resource manager should be given a name which sufficiently distinguishes it from others of a similar type; e.g., pbs-torque-v_2.3.7_abe is specific to an installation on the host abe, ll-v_4.0 suits all installations of LoadLeveler version 4, etc.

The site-data element provides an optional place to set default remote site information. The connection strings are URIs which are specific to the PTP RemoteServices definitions. The scheme for these URIs will usually name the specific remote service (e.g., rse: or remotetools:; local is simply file:). The host name and port given here will appear as defaults in the resource manager selection wizard when you create a new connection.
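The way a connection string breaks down into service, host and port can be illustrated with a short Python sketch. This is only an illustration using the standard library; the host abe.example.org, the port 22 and the scheme://host:port layout are assumptions made for the example, not values or a format confirmed by the PTP RemoteServices definitions.

```python
from urllib.parse import urlsplit


def describe_connection(uri):
    """Split a connection URI into the parts mentioned in the text."""
    parts = urlsplit(uri)
    return {
        "service": parts.scheme,  # e.g. "rse", "remotetools", or "file" for local
        "host": parts.hostname,   # None when the URI has no host part
        "port": parts.port,       # None when no port is given
    }


# Hypothetical remote connection string (host and port are made up).
remote = describe_connection("rse://abe.example.org:22")

# A local connection carries only the scheme.
local = describe_connection("file:")
```

In this sketch the scheme identifies the remote service, while the host and port correspond to the values that would be offered as defaults in the selection wizard.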

The principal section of the schema is devoted to defining the resource manager's control part. The top-level control elements include properties and attributes, files to be staged, job script to be generated (if any), commands specific to the resource manager, and the layout of the Launch Tab.

The resource manager implementation constructs a variable map from the defined properties and attributes; all properties and attributes defined in the configuration are mapped. This map serves as the resource manager "environment". Its variables are dereferenced in the configuration file via ${ptp_rm:name#fieldName}, e.g., ${ptp_rm:queues#value} (see further below on the fields for properties and attributes). The following hard-coded properties are also added at runtime:

control.user.name
control.address
control.working.dir
executablePath
progArgs
directory
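
To make the dereferencing syntax concrete, the following Python sketch mimics how a ${ptp_rm:name#fieldName} reference could be resolved against a variable map. It is not the PTP implementation: the resolve function, the dictionary layout and the sample values are hypothetical, and the sketch assumes every reference names both a variable and a field.

```python
import re

# Matches ${ptp_rm:name#fieldName}; both parts are assumed to be present.
REFERENCE = re.compile(r"\$\{ptp_rm:([^#}]+)#([^}]+)\}")


def resolve(text, environment):
    """Replace each reference with the named field of the named variable."""
    def lookup(match):
        name, field = match.group(1), match.group(2)
        # An unknown name or field raises KeyError in this sketch.
        return str(environment[name][field])
    return REFERENCE.sub(lookup, text)


# Hypothetical environment: one configured attribute and one runtime property.
environment = {
    "queues": {"value": "debug"},
    "control.user.name": {"value": "alice"},
}

command = resolve(
    "qsub -q ${ptp_rm:queues#value} -u ${ptp_rm:control.user.name#value}",
    environment,
)
```

Here each variable is modeled as a small dictionary of fields, so that queues#value selects the value field of the queues entry. How the real resource manager treats unresolved references is not covered by this sketch.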

Commands are externally executed calls to a local or remote OS, depending on the connection defined for the resource manager. The start-up- and shut-down-commands are arbitrary commands to be run (in order) at startup or exit.

The submit commands are those used to launch jobs. Currently a configuration may have either a batch or an interactive mode, but not both. It may therefore define at most two submission modes, run and debug, for the given type. In the future we may allow all four to coexist in a single configuration.

get-job-status is a user-initiated (on-demand) request to refresh the status information for a submission. Normal (polled) updates are the responsibility of the monitor. The command nevertheless needs to be implemented in most cases, as it will be called internally just after submission.

The remaining commands are operations which can be applied to running jobs. With the exception of terminate-job, these have to do with schedulers (batch systems) and do not apply to resource managers which connect to interactive runtime systems such as OpenMPI or PE.

Note: if the submission type is interactive, the terminate-job command usually does not need to be implemented, as process termination will be handled internally. However, in some cases (such as PBS -I) which require the interactive job to run as a pseudo-terminal, this command may be needed to force termination externally.

The Control Data Type

ControlData

The majority of the XML definition is given over to the set-up of the Resource Manager control. One can think of this section as having four subdivisions:

  1. Configuration Variable Definitions (the "Environment")
  2. Files and Scripts
  3. External Commands and their Stream Parsers
  4. UI Configuration (Launch Tab)

We will look at these each in turn.

Property and Attribute Types

A property is any variable necessary for the functioning of the resource manager. Properties are often (but not necessarily) hidden from the user. The value of a property can be any primitive type, or a list or map of strings. If stdout and stderr from a scheduled job are to be delivered to the client, the properties stdout_remote_path and stderr_remote_path should be included in the resource manager property set.

NOTE: the untyped "value" element on properties is for internal use only; to give a predefined (primitive) value, use the "default" element along with the "type" attribute.
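
The relationship between the "default" element and the "type" attribute can be sketched as follows. This Python example only illustrates the idea of a typed default: the property element layout, the name attribute, the sample property ppn and the type names in CONVERTERS are all assumptions made for the example and should be checked against the actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical converters for a few primitive type names.
CONVERTERS = {
    "integer": int,
    "boolean": lambda s: s.strip().lower() == "true",
    "double": float,
    "string": str,
}


def default_value(property_xml):
    """Return (name, typed default) for one property definition."""
    element = ET.fromstring(property_xml)
    default = element.find("default")
    if default is None or default.text is None:
        return element.get("name"), None
    # Fall back to plain strings when no type attribute is given.
    convert = CONVERTERS[element.get("type", "string")]
    return element.get("name"), convert(default.text)


# Hypothetical property with a predefined integer value.
name, value = default_value(
    '<property name="ppn" type="integer"><default>8</default></property>'
)
```

The point of the sketch is that the predefined value is written as text in "default" and interpreted according to "type", rather than being placed in the untyped "value" element.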

XML Resource Manager Schema

