Mapping

One of the essential steps during embedded software development for multi-core platforms is the mapping of software elements to hardware elements, e.g. Tasks to Schedulers or Labels to Memories. This is usually a non-trivial task, as the number of possible combinations grows exponentially with the number of software elements and becomes unmanageable once either the software or the hardware is very complex. The purpose of APP4MC's OpenMapping Plugin is to determine such a mapping and store it in a Mapping Model, which contains the allocations of elements of the Software Model to elements of the Hardware Model.

Concept

The conceptual implementation of the OpenMapping Plugin is shown in the following Figure.

As shown at the top of this figure, the plugin requires several models to operate. The models for Software, Hardware and Constraints are mandatory, while the Property Constraints Model is optional.

Using the OpenMapping Plugin, the user is able to choose between different mapping strategies. Currently these strategies are split into two categories: Heuristic methods and Integer Linear Programming (ILP) based methods. Unlike ILP based methods, Heuristic methods, such as the Heuristic Data Flow Graph (DFG) load balancing, will immediately create a mapping.

ILP based methods, on the other hand, first need to generate an ILP model of the mapping problem according to the selected mapping strategy, e.g. ILP based load balancing or Energy aware mapping. Once the ILP model has been created, it is solved by one of the mathematical solvers. Currently, the OpenMapping Plugin uses the open source project Oj!Algo (see footnote 1) for this purpose. Furthermore, the user can activate an optional MPS generator, which generates an MPS file containing the ILP problem. This file may be used to solve the ILP problem with external (e.g. commercial) solvers, which tend to be more efficient at solving larger models than open source Java implementations.

Once a mapping has been determined, it is displayed within the Eclipse console and the following output models are generated:

Implementation

The following subsections give a short introduction to the different algorithm implementations of the OpenMapping Plugin. Section Task generation describes the task generation method, which is used to convert process prototypes into tasks. It is meant to be used by mapping algorithms which do not feature task generation themselves. Sections Mapping Strategy 1: Heuristic DFG load balancing and Mapping Strategy 2: ILP based load balancing describe a heuristic and a mathematical load balancing approach for mapping tasks to cores. Finally, a more complex method for energy efficient task mapping with its own task creation algorithm is outlined in section Mapping Strategy 3: Minimizing Energy Consumption.

Task generation

The task generation method in the OpenMapping Plugin is a pragmatic way to create tasks for mapping algorithms which require Tasks, i.e. which are not designed to agglomerate Runnables into Tasks on their own. This step takes the ProcessPrototypes generated by the partitioning plugin (see Chapter Partitioning) and transforms them into Tasks. Furthermore, it creates the Stimuli Model, which contains the activation elements for the Tasks, i.e. PeriodicStimulus. An overview of the transformed elements and their sources as well as destinations is shown in Table 1.

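Source Model | Source Element                              | Target Model | Target Element
-------------|---------------------------------------------|--------------|--------------------------------
SW           | ProcessPrototype                            | SW           | Task
SW           | PeriodicActivation                          | Stimuli      | PeriodicStimulus
SW           | TaskRunnableCall (within ProcessPrototype)  | SW           | TaskRunnableCall (within Task)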

Table 1
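The transformation in Table 1 can be pictured with a small sketch. It is not the plugin's implementation: the classes below are simplified, hypothetical stand-ins for the AMALTHEA model elements, and the naming scheme for the generated elements as well as the choice of one PeriodicStimulus per ProcessPrototype are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ProcessPrototype:          # simplified stand-in for the SW model element
    name: str
    period_ms: float             # taken from its PeriodicActivation
    runnable_calls: List[str] = field(default_factory=list)


@dataclass
class PeriodicStimulus:          # simplified stand-in for the Stimuli model element
    name: str
    period_ms: float


@dataclass
class Task:                      # simplified stand-in for the SW model element
    name: str
    stimulus: PeriodicStimulus
    runnable_calls: List[str]


def create_tasks(prototypes: List[ProcessPrototype]) -> Tuple[List[Task], List[PeriodicStimulus]]:
    """Turn every ProcessPrototype into a Task plus its activating stimulus."""
    tasks, stimuli = [], []
    for proto in prototypes:
        # PeriodicActivation (SW) -> PeriodicStimulus (Stimuli)
        stimulus = PeriodicStimulus("Stimulus_" + proto.name, proto.period_ms)
        # TaskRunnableCalls are copied in their original order
        task = Task("Task_" + proto.name, stimulus, list(proto.runnable_calls))
        stimuli.append(stimulus)
        tasks.append(task)
    return tasks, stimuli
```

Calling create_tasks([ProcessPrototype("PP0", 10.0, ["R1", "R2"])]) yields one Task that keeps the runnable call order and references a stimulus with a period of 10 ms.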

Mapping Strategy 1: Heuristic DFG load balancing

The Heuristic Data Flow Graph (DFG) load balancing algorithm aims at achieving an equal utilization of a hardware platform's cores for DFG based software models.

The first step in this algorithm is to determine the most complex Task (usually representing the critical path) and allocate it to the best fitting core of the hardware platform. Then, for each remaining Task, the runtime is estimated on every Core within the System, and the Task is allocated to the Core which yields the smallest increase of the longest overall runtime across all cores.
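The allocation rule described above can be sketched as a greedy loop. This is an illustration only, not the plugin's code: the inputs (instruction counts per task, instructions per second per core) are hypothetical, data flow dependencies are ignored, and both the descending processing order and the tie-breaker are assumptions.

```python
def heuristic_load_balance(task_instructions, core_ips):
    """Greedy sketch: map each task to the core that least increases the makespan.

    task_instructions: dict task name -> instruction count (hypothetical input)
    core_ips:          dict core name -> instructions per second (hypothetical input)
    """
    load = {core: 0.0 for core in core_ips}   # accumulated runtime per core
    mapping = {}

    # Assumption: tasks are visited from most to least complex.
    ordered = sorted(task_instructions, key=task_instructions.get, reverse=True)
    for task in ordered:
        def cost(core):
            new_load = load[core] + task_instructions[task] / core_ips[core]
            others = max((t for c, t in load.items() if c != core), default=0.0)
            # Primary key: longest runtime over all cores after the allocation.
            # Secondary key (assumed tie-breaker): resulting load of the chosen core.
            return (max(new_load, others), new_load)

        best = min(core_ips, key=cost)
        load[best] += task_instructions[task] / core_ips[best]
        mapping[task] = best
    return mapping, load
```

Because every task is placed exactly once and each placement only inspects the current core loads, the effort grows with the number of tasks times the number of cores, which is consistent with the low runtime of a heuristic of this kind.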

One of the major benefits of this algorithm is its very low runtime. The information which is processed by this mapping strategy and, as such, has to be present in the input models, is shown in Table 2.

Mapping Strategy 2: ILP based load balancing

This section describes a comparatively simple ILP based strategy for allocating tasks to processors while minimizing the total execution time. This method supports multiple processors with the same processing speed (i.e. homogeneous processors) and it does not consider any dependencies between the tasks (e.g. waiting for the results of the predecessor).

Load balancing within this method is achieved by minimizing the highest execution duration $C_{\max}$ of all $m$ processing units with $n$ tasks. The variable $x_{ij}$ is set to 1 if a task $j$ is allocated to processor $i$ and 0 otherwise. The model guarantees that each task is allocated to exactly one processor and limits the type of the variables $x_{ij}$ to boolean values. The duration (execution time) of a task $j$ is specified by $p_j$.
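Written out, a model with these properties would take the following form. This is the textbook makespan minimization formulation reconstructed from the description above, not a copy of the formulas used by the plugin; the index ranges are assumptions.

```latex
\begin{align}
\min \quad & C_{\max} \\
\text{s.t.} \quad & \sum_{i=1}^{m} x_{ij} = 1 && \forall j \in \{1, \dots, n\} \\
& \sum_{j=1}^{n} p_j \, x_{ij} \le C_{\max} && \forall i \in \{1, \dots, m\} \\
& x_{ij} \in \{0, 1\} && \forall i, j
\end{align}
```

The first constraint allocates every task to exactly one processor, the second bounds the accumulated duration on each processor by $C_{\max}$, and the last restricts $x_{ij}$ to boolean values.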

One of the downsides of this algorithm is caused by the variable $p_j$, which forces an equal processing duration of a task $j$ on all cores. It is however possible to expand the method to support heterogeneous processors (in this case: processors with different processing speeds) with a minor modification: replacing $p_j$ with $p_{ij}$, i.e. a separate processing duration of task $j$ for every core $i$, will solve this problem.
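For very small instances, the effect of using $p_{ij}$ can be checked without an ILP solver by enumerating all allocations. The sketch below is a plain exhaustive search, not the ILP approach of the plugin; the function name and the matrix layout (p[i][j] is the duration of task j on core i) are assumptions for illustration.

```python
from itertools import product


def min_makespan(p):
    """Return (C_max, assignment) for the duration matrix p[i][j].

    assignment[j] is the index of the core that task j is allocated to.
    """
    m, n = len(p), len(p[0])
    best = None
    for assignment in product(range(m), repeat=n):   # m**n candidates
        loads = [0.0] * m
        for task, core in enumerate(assignment):
            loads[core] += p[core][task]
        candidate = (max(loads), assignment)          # objective: highest core load
        if best is None or candidate < best:
            best = candidate
    return best
```

With p = [[2, 3, 4], [4, 6, 2]], the search places the first two tasks on core 0 and the third on core 1, since core 1 is slower for the former and faster for the latter. The m**n candidates also show why larger models need a proper solver.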

The minimal amount of information which is required to execute this algorithm is outlined in Table 2.

Source Model | Element                      | Description
-------------|------------------------------|------------
HW           | Core                         | A Core represents the target of an allocation. One OS Model with a Scheduler for each Core will be generated.
HW           | CoreType, Prescaler, Quartz  | A core's Prescaler, the referenced Quartz and the CoreType's attribute CyclesPerTick are used to determine the number of processed Instructions per second.
SW           | Task                         | Tasks will be allocated to a Core (via the core's Scheduler).
SW           | Runnable                     | Runnables are derived from a task's TaskRunnableCalls; their attribute Instructions is used during the load calculation for each Core.
Stimuli      | Stimulus (PeriodicStimulus)  | The PeriodicStimulus is used to specify a task's activation rate, i.e. the period between its calls.

Table 2
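How the elements of Table 2 could feed into a load calculation is sketched below. The exact formula used by the plugin is not given here, so this is an assumption: the load of a core is taken as the sum, over its tasks, of the instructions of all Runnables divided by the core's instructions per second and by the task's period. The instructions per second value is passed in directly rather than derived from Prescaler, Quartz and CyclesPerTick.

```python
def core_utilization(tasks, instructions_per_second):
    """Estimate the utilization of one core (1.0 means fully loaded).

    tasks: list of (runnable_instruction_counts, period_in_seconds) tuples,
           one tuple per task allocated to the core (hypothetical input format)
    instructions_per_second: processing speed of the core (assumed to be known)
    """
    return sum(
        sum(instructions) / instructions_per_second / period
        for instructions, period in tasks
    )
```

For example, a task with Runnables of 1000 and 3000 Instructions activated every 10 ms and a task with one Runnable of 2000 Instructions activated every 20 ms load a core that processes one million Instructions per second to about 50 percent.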

Mapping Strategy 3: Minimizing Energy Consumption

This mapping algorithm is based on the work "Task Scheduling and Voltage Selection for Energy Minimization" by Zhang et al., which presents a framework that aims at minimizing the energy consumption of variable voltage processors executing real-time dependent tasks. This method is implemented as a two phase approach which integrates task scheduling (ordering and assignment) and voltage selection.

In the first phase, opportunities for energy minimization are revealed by ordering real-time dependent tasks and assigning them to processors on the respective target platform.

Once the scheduling is created, there will be time frames between the end of one task and the start of another during which the processor is not being utilized (so called slacks). These slacks are the prerequisites for the second phase, which performs the voltage selection. This phase aims at determining the optimal processor voltage for each task execution without violating the constraints, thereby minimizing the total energy consumption of the system. In order to determine these voltages, the task scheduling is transformed into a directed acyclic graph (DAG) that is used to model the selection problem as an integer programming (IP) problem. Once the model has been set up, it is optimized by a mathematical solver.
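The idea behind the voltage selection phase can be illustrated with a strongly simplified sketch. It is not the IP model of Zhang et al. and not the plugin's implementation: it assumes a single core with a fixed task order, a common deadline, an energy that is proportional to the number of cycles times the squared voltage, and an execution time that scales with a per-level speed factor. It enumerates all choices instead of using a solver, and all names are hypothetical.

```python
from itertools import product


def select_voltages(cycles, levels, base_frequency, deadline):
    """Pick one voltage level per task so that the energy is minimal.

    cycles:         execution cycles of each task, in schedule order
    levels:         list of (voltage, scale) pairs; scale is the relative speed
    base_frequency: cycles per second at scale 1.0
    deadline:       latest allowed finish time of the last task in seconds
    Returns (energy, chosen level index per task) or None if infeasible.
    """
    best = None
    for choice in product(range(len(levels)), repeat=len(cycles)):
        time = sum(c / (base_frequency * levels[k][1]) for c, k in zip(cycles, choice))
        if time > deadline:
            continue                      # slack is not sufficient for this choice
        energy = sum(c * levels[k][0] ** 2 for c, k in zip(cycles, choice))
        if best is None or energy < best[0]:
            best = (energy, choice)
    return best
```

The more slack the schedule leaves before the deadline, the more tasks can run at a lower voltage level, which is why the slacks are the prerequisite of this phase.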

This algorithm has been implemented with two constraints:

Table 3 lists the minimal amount of information which has to be present in the input models in order for this mapping strategy to work, as well as the special annotations which are added to the mapping model.

Source Model | Element                                            | Description
-------------|----------------------------------------------------|------------
HW           | Core                                               | A Core represents the target of an allocation. One OS Model with a Scheduler for each Core will be generated.
HW           | CoreType, Prescaler, Quartz                        | A core's Prescaler, the referenced Quartz and the CoreType's attribute CyclesPerTick are used to determine the number of processed Instructions per second. The CustomProperty entries (DoubleValue) contained in the CoreType whose labels start with EnEf-Volt_{SomeID} and EnEf-Scale_{SomeID} are used to specify the voltage levels, i.e. the performance of a core at a specific voltage.
SW           | Runnable                                           | Runnables will be distributed to the cores (via the core's Scheduler); their attribute Instructions is used during the load calculation for each Core.
SW           | Activation (PeriodicActivation)                    | The PeriodicActivation specifies the recurrence of the Runnable. The lowest recurrence is used to specify the overall deadline of all Runnables, i.e. the maximum amount of time for the sum of all Runnable executions.
Constraints  | RunnableEntityGroup, RunnableSequencingConstraint  | Are used to determine the execution order of the Runnables as well as their interdependencies.
Mapping      | RunnableAllocation, CustomProperty, LongValue      | Specifies the selected voltage level and the number of ExecutionCycles at this voltage level.

Table 3
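The EnEf-Volt_{SomeID} and EnEf-Scale_{SomeID} labels from Table 3 suggest that each voltage level is described by two CustomProperty entries sharing the same ID. A minimal sketch of how such pairs could be collected is shown below. It works on a plain dictionary of label to value rather than on the AMALTHEA model API, and the pairing by identical ID suffix is an assumption.

```python
VOLT_PREFIX = "EnEf-Volt_"
SCALE_PREFIX = "EnEf-Scale_"


def parse_voltage_levels(custom_properties):
    """Collect voltage levels from a CoreType's custom properties.

    custom_properties: dict label -> numeric value (hypothetical representation)
    Returns a dict level ID -> (voltage, scale); incomplete pairs are skipped.
    """
    volts, scales = {}, {}
    for label, value in custom_properties.items():
        if label.startswith(VOLT_PREFIX):
            volts[label[len(VOLT_PREFIX):]] = float(value)
        elif label.startswith(SCALE_PREFIX):
            scales[label[len(SCALE_PREFIX):]] = float(value)
    # Only IDs that define both a voltage and a scale form a usable level.
    return {level: (volts[level], scales[level]) for level in volts if level in scales}
```

A CoreType carrying EnEf-Volt_Low = 0.9 and EnEf-Scale_Low = 0.5 would thus provide a level "Low" at 0.9 V with half of the nominal performance (the values are made up for this example).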

Utilization of the OpenMapping Plugin

This section provides information on the utilization of the OpenMapping Plugin, i.e. its configuration (section Configuration and Preferences) and how to generate mappings (section Generating a mapping).

Configuration and Preferences

The OpenMapping Plugin is configured through its preferences page. The page is integrated into APP4MC and can be accessed through the menu bar under ‘Window’ -> ‘Preferences’ -> ‘AMALTHEA Mapping’. The configurable fields, their types and their descriptions are listed below.

Enabling verbose logging

Checking the box 'Enable verbose logging to console' will enable verbose logging to stdout. This may help to identify problems if the mapping plugin should fail to generate a mapping.

Specifying the output location

The radio buttons under 'Select output location' allow customizing the directory in which newly generated files will be placed.

Hint: Using this option will NOT update the Project Explorer's folder list once the mapping is finished. Avoid using this option in combination with a target location within the Eclipse workspace.

Selecting a mapping algorithm

The radio buttons within 'Select mapping algorithm' determine the mapping strategy which is applied during the mapping process. Currently, there are three valid options:

Configuring Mathematical Solver

Hint: The settings described in this section only affect ILP based algorithms!

The section Solver Settings allows configuring the solver which is used to solve the ILP problems, specifying the minimal accuracy of the found solution and activating the MPS file output of the (ready to solve) ILP problem.

Setting this value to 0.0 will order the solver to continue until either the final solution reaches the same value as the LP relaxation or another limit (below) has been reached, while 1.0 will consider the first feasible solution to be optimal.
Valid values: 0.0 – 1.0
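One plausible reading of this setting is a tolerance on the relative gap between the best solution found so far and the value of the LP relaxation. The sketch below illustrates that reading for a minimization problem with a positive objective value. It is an assumption about the semantics, not the solver's actual acceptance rule, and the function and parameter names are hypothetical.

```python
def accept_solution(incumbent, lp_bound, tolerance):
    """Decide whether a feasible solution is good enough to stop the search.

    incumbent: objective value of the best feasible solution found (> 0)
    lp_bound:  objective value of the LP relaxation (0 <= lp_bound <= incumbent)
    tolerance: configured value between 0.0 and 1.0
    """
    if not 0.0 <= tolerance <= 1.0:
        raise ValueError("tolerance must be within 0.0 and 1.0")
    gap = (incumbent - lp_bound) / incumbent   # relative distance to the bound
    return gap <= tolerance
```

Under this reading, a tolerance of 0.0 only accepts a solution whose value equals the LP relaxation, whereas 1.0 accepts any feasible solution because the gap can never exceed 1.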

Furthermore, it is possible to specify the maximum number of iterations or the maximum time spent on finding an optimal solution.

Setting one of these values to zero will pass the value of INT_MAX to the solver, technically removing the respective constraint.
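The handling of the value zero can be pictured as follows. The helper name is hypothetical and the concrete value of INT_MAX is an assumption (the largest signed 32 bit integer).

```python
INT_MAX = 2**31 - 1   # assumption: largest signed 32 bit integer


def effective_limit(configured):
    """Translate a configured iteration or time limit into the value passed on.

    A configured value of zero stands for "no limit" and is replaced by INT_MAX.
    """
    if configured < 0:
        raise ValueError("limits must not be negative")
    return INT_MAX if configured == 0 else configured
```

A configured limit of 500 is therefore passed on unchanged, while 0 results in a limit so large that it is not expected to be reached in practice.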

Generating a mapping

Depending on the selected mapping strategy, it may be required to create tasks before running the mapping algorithm. The method 'Create Tasks', which is accessible through the context menu of the AMALTHEA Software Model file (right click on *.amxmi and *.amxmi-sw files), is capable of transforming partitioned software models into software models with tasks.

The mapping can be performed once input models with the required amount of information are present. Opening the context menu again (right click on *.amxmi and *.amxmi-sw files) and selecting 'Perform Mapping' will open the ‘Perform Mapping GUI’.

The fields within the GUI are described below.

1 Oj!Algorithms, licensed under the MIT license, see: http://ojalgo.org