LTTng Kernel Analysis

Historically, LTTng was developed to trace the Linux kernel and, over time, a number of kernel-oriented analysis views were developed and organized in a perspective.

This section presents a description of the OS Tracing Overview perspective and the LTTng Kernel perspective.

OS Tracing Overview Perspective

The OS Tracing Overview perspective groups the following views:

The perspective can be opened from the Eclipse Open Perspective dialog ( Window > Open Perspective... > Other).

This perspective is intended to be used to locate performance issues by observing resource usage.

The perspective can show times resource usage is anomalous. This can help locating the causes of system slowdowns in throughput or response time.

An example can be program that is doing a lot of processing then slows down due to a database access. The user will see a dip in CPU usage and maybe a slight rise in I/O access. The user should consider both spike and slums to be an indication of an area worth investigating.

Once a performance issue has been localized, it can be further investigated with the #LTTng kernel Perspective.

LTTng Kernel Perspective

The LTTng Kernel perspective is built upon the Tracing Perspective, re-organizes them slightly and adds the following views:

The perspective can be opened from the Eclipse Open Perspective dialog ( Window > Open Perspective... > Other).

Control Flow View

The Control Flow view is a LTTng-specific view that shows per-process events graphically. The Linux Kernel Analysis is executed the first time a LTTng Kernel is opened. After opening the trace, the element Control Flow is added under the Linux Kernel Analysis tree element in the Project Explorer. To open the view, double-click the Control Flow tree element.

Alternatively, select Control Flow under LTTng within the Show View window ( Window -> Show View -> Other...):

You should get something like this:

The view is divided into the following important sections: process tree and information, control flow and the toolbar. The time axis is aligned with other views that support automatic time axis alignment (see Automatic Time Axis Alignment).

The following sections provide detailed information for each part of the Control Flow View.

Process tree and information

Processes are organized as a tree within this view. This way, child and parent processes are easy to identify.

The layout is based on the states computed from the trace events.

A given process may be shown at different places within the tree since the nodes are unique (TID, birth time) couples. This means that if process B of parent A dies, you'll still see it in the tree. If process A forks process B again, it will be shown as a different node since it won't have the same birth time (and probably not the same TID). This has the advantage that the tree, once loaded, never changes: horizontal scrolling within the control flow remains possible.

The TID column shows the process node's thread ID and the PTID column shows its parent thread ID (nothing is shown if the process has no parent).

It is possible to sort the columns of the tree by clicking on the column header. Subsequent clicking will change the sort order. The hierarchy, i.e. the parent-child relationship is kept. When opening a trace for the first time, the processes are sorted by birth time. The sort order and column will be preserved when switching between open traces. Note that when opening an experiment the processes will be sorted within each trace.

Control flow

This part of the Control Flow View is probably the most interesting one. Using the mouse, you can navigate through the trace (go left, right) and zoom on a specific region to inspect its details.

The colored bars you see represent states for the associated process node. When a process state changes in time, so does the color. For state SYSCALL the name of the system call is displayed in the state bar. States colors legend is available through a toolbar button:

This dark yellow is what you'll see most of the time since scheduling puts processes on hold while others run.

The vertical blue line with T1 above it is the current selection indicator. When a time range is selected, the region between the begin and end time of the selection will be shaded and two lines with T1 and T2 above will be displayed. The time stamps corresponding to T1, T2 and their delta are shown in the status line when the mouse is hovering over the control flow.

Arrows can be displayed that follow the execution of each CPU across processes. The arrows indicate when the scheduler switches from one process to another for a given CPU. The CPU being followed is indicated on the state tooltip. When the scheduler switches to and from the idle process, the arrow skips to the next process which executes on the CPU after the idle process. Note that an appropriate zoom level is required for all arrows to be displayed.

The display of arrows is optional and can be toggled using the Hide Arrows toolbar button. It is also possible to follow a CPU's execution across state changes and the scheduler's process switching using the Follow CPU Forward/Backward toolbar buttons.

Using the mouse

The following mouse actions are available:

When the current time indicator is changed (when clicking in the states flow), all the other views are synchronized. For example, the Events Editor will show the event matching the current time indicator. The reverse behaviour is also implemented: selecting an event within the Events View will update the Control Flow View current time indicator.

Using the keyboard

The following keyboard shortcuts are available:

When the mouse cursor is over entries (left pane):

Please note that the behavior of some shortcuts can slightly differ based on the operating system.

When the selection indicators are changed, all the other views are synchronized. For example, the Events Editor will show the event matching the current time indicator. The reverse behaviour is also implemented: selecting an event within the Events View will update the Control Flow View current time indicator.

Incomplete regions

You'll notice small dots over the colored bars at some places:

Those dots mean the underlying region is incomplete: there's not enough pixels to view all the events. In other words, you have to zoom in.

When zooming in, small dots start to disappear:

When no dots are left, you are viewing all the events and states within that region.

Zoom region

To zoom in on a specific region, right-click and drag in order to draw a time range:

The states flow horizontal space will only show the selected region.

Tooltips

Hover the cursor over a colored bar and a tooltip will pop up:

The tooltip indicates:

Toolbar

The Control Flow View toolbar, located at the top right of the view, has shortcut buttons to perform common actions:

Align Views Disable and enable the automatic time axis alignment of time-based views. Disabling the alignment in this view will disable this feature across all the views because it's a workspace preference.
Show View Filter Opens the process filter dialog. Filter settings will be preserved when switching between open traces.
Show Legend Displays the states legend.
Optimize "Optimize" the row order of the control flow view. This groups the threads by minimizing the distance in the graph of transitions by CPU for a given time range. The scheduler will often keep tightly coupled threads on the same CPU to preserve data locality. An interesting side effect of this is that threads communicating together in that time range will be moved closer together when pressing the button. The button moves the rows only when pressed. When the time range is moved, the optimization action can be re-applied.
Reset the Time Scale to Default Resets the zoom window to the full range.
Select Previous State Change Selects the previous state for the selected process. Pressing the Shift key at the same time will update the selection end time of the current selection range.
Select Next State Change Selects the next state for the selected process. Pressing the Shift key at the same time will update the selection end time of the current selection range.
Add Bookmark... Adds a bookmark at the current selection range. A bookmark is a user-defined interval marker. The Add Bookmark dialog is opened where the user can enter a description and choose the highlighting color and alpha (transparency) value. This button is replaced by the Remove Bookmark button if the current selection range corresponds to an existing bookmark. The bookmarks can also be managed in the Bookmark View.
Remove Bookmark Removes the bookmark at the current selection range. This button replaces the Add Bookmark when the current selection range corresponds to an existing bookmark.
Previous Marker Selects the previous active marker. Pressing the Shift key at the same time will update the selection end time of the current selection range.
Next Marker Selects the next active marker. Pressing the Shift key at the same time will update the selection end time of the current selection range. Clicking the button drop-down arrow will open a menu where marker categories can be made active or inactive for navigation. To toggle the display of a marker category, use the View Menu instead.
Select Previous Process Selects the previous process.
Select Next Process Selects the next process.
Zoom In Zooms in on the selection by 50%.
Zoom Out Zooms out on the selection by 50%.
Hide Arrows Toggles the display of arrows on or off.
Follow CPU Backward Selects the previous state following CPU execution across processes. Pressing the Shift key at the same time will update the selection end time of the current selection range.
Follow CPU Forward Selects the next state following CPU execution across processes. Pressing the Shift key at the same time will update the selection end time of the current selection range.
Go to previous event of the selected thread Move to the closest previous event belonging to the selected thread. This action looks through all trace events, unlike the Select Previous State Change action which only stops at state changes.
Go to next event of the selected thread Move to the closest following event belonging to the selected thread. This action looks through all trace events, unlike the Select Next State Change action which only stops at state changes.

View Menu

Show Markers A marker highlights a time interval. A marker can be used for instance to indicate a time range where lost events occurred or to bookmark an interesting interval for future reference. Selecting a category name will toggle the visibility of markers of that category.
Marker Set The user can select from one of the configured market sets, or choose None to use no marker set. The setting applies to all views that support marker sets. The marker set configuration can be edited by selecting the Edit... menu item (see Marker Set Configuration XML Format). After saving the changes in the opened editor, the marker set should then be re-selected to update the view.
Thread Presentation Select the threads layout. Two layouts are available. Flat layout lists the threads in a flat list per trace. Hierarchical layout shows the threads in a parent-child tree per trace.

Marker Axis

The marker axis is visible only when at least one marker category with markers for the current trace is shown.

The marker axis displays one row per marker category. Each marker's time range and/or label (if applicable) are drawn on the marker axis.

Clicking on any marker's time range or label will set the current time selection to the marker's time or time range.

Clicking on the "X" icon to the left of the marker category name will hide this marker category from the time graph. It can be shown again using the corresponding Show Markers menu item in the view menu.

The marker axis can be collapsed and expanded by clicking on the icon at the top left of the marker axis. The marker axis can be completely removed by hiding all available marker categories.

Resources View

This view is specific to LTTng kernel traces. The Linux Kernel Analysis is executed the first time a LTTng Kernel is opened. After opening the trace, the element Resources is added under the Linux Kernel Analysis tree element of the Project Explorer. To open the view, double-click the Resources tree element.

Alternatively, go in Window -> Show View -> Other... and select LTTng/Resources in the list.

Example of resources view with all trace points and syscalls enabled

This view shows the state of system resources i.e. if changes occurred during the trace either on CPUs, IRQs or soft IRQs, it will appear in this view. The left side of the view present a list of resources that are affected by at least one event of the trace. The right side illustrate the state in which each resource is at some point in time. For state USERMODE it also prints the process name in the state bar. For state SYSCALL the name of the system call is displayed in the state region.

When an IRQ is handled by a CPU, its states are shown under the corresponding CPU entry. Similarly, the CPU that was handling an IRQ is shown under the handled IRQ. Therefore, the trace can be visualized from a CPU point of view or from an IRQ point of view.

Just like other views, according to which trace points and system calls are activated, the content of this view may change from one trace to another.

The time axis is aligned with other views that support automatic time axis alignment (see Automatic Time Axis Alignment).

Each state are represented by one color so it is faster to say what is happening.

Color for each state

To go through the state of a resource, you first have to select the resource and the timestamp that interest you. For the latter, you can pick some time before the interesting part of the trace.

Shows the state of an IRQ

Then, by selecting Next Event, it will show the next state transition and the event that occurred at this time.

Shows the next state of the IRQ

This view is also synchronized with the others : Histogram View, Events Editor, Control Flow View, etc.

Follow CPU

It is possible to follow a CPU by right-clicking on its entry in the view, then selecting Follow CPU X where X is the number of the CPU. Following a CPU will filter the CPU Usage View to display only usage for the selected CPU. To unfollow a CPU, one needs to right-click on any CPU entry and select Unfollow CPU.

See Control Flow View's Using the mouse , Using the keyboard and Zoom region .

Incomplete regions

See Control Flow View's Incomplete regions .

Toolbar

The Resources View toolbar, located at the top right of the view, has shortcut buttons to perform common actions:

Align Views Disable and enable the automatic time axis alignment of time-based views. Disabling the alignment in this view will disable this feature across all the views because it's a workspace preference.
Show View Filter Opens the resources filter dialog. Filter settings will be preserved when switching between open traces.
Show Legend Displays the states legend.
Reset the Time Scale to Default Resets the zoom window to the full range.
Select Previous State Change Selects the previous state for the selected resource. Pressing the Shift key at the same time will update the selection end time of the current selection range.
Select Next State Change Selects the next state for the selected resource. Pressing the Shift key at the same time will update the selection end time of the current selection range.
Add Bookmark... Adds a bookmark at the current selection range. A bookmark is a user-defined interval marker. The Add Bookmark dialog is opened where the user can enter a description and choose the highlighting color and alpha (transparency) value. This button is replaced by the Remove Bookmark button if the current selection range corresponds to an existing bookmark. The bookmarks can also be managed in the Bookmark View.
Remove Bookmark Removes the bookmark at the current selection range. This button replaces the Add Bookmark when the current selection range corresponds to an existing bookmark.
Previous Marker Selects the previous active marker. Pressing the Shift key at the same time will update the selection end time of the current selection range.
Next Marker Selects the next active marker. Pressing the Shift key at the same time will update the selection end time of the current selection range. Clicking the button drop-down arrow will open a menu where marker categories can be made active or inactive for navigation.
Select Previous Resource Selects the previous resource
Select Next Resource Selects the next resource
Zoom In Zooms in on the selection by 50%.
Zoom Out Zooms out on the selection by 50%.

View Menu

Show Markers A marker highlights a time interval. A marker can be used for instance to indicate a time range where lost events occurred or to bookmark an interesting interval for future reference. Selecting a category name will toggle the visibility of markers of that category. Marker Set The user can select from one of the configured market sets, or choose None to use no marker set. The setting applies to all views that support marker sets. The marker set configuration can be edited by selecting the Edit... menu item (see Marker Set Configuration XML Format). After saving the changes in the opened editor, the marker set should then be re-selected to update the view.

Marker Axis

See Control Flow View's Marker Axis .

LTTng CPU Usage View

The CPU Usage analysis and view is specific to LTTng Kernel traces. The CPU usage is derived from a kernel trace as long as the sched_switch event was enabled during the collection of the trace. This analysis is executed the first time that the CPU Usage view is opened after opening the trace. To open the view, double-click on the CPU Usage tree element under the Linux Kernel Analysis tree element of the Project Explorer.

Now, the CPU Usage view will show:

The view is divided into the following important sections: Process Information and the CPU Usage Chart. The time axis is aligned with other views that support automatic time axis alignment (see Automatic Time Axis Alignment).

Process Information

The Process Information is displayed on the left side of the view and shows all threads that were executing on all available CPUs in the current time range. For each process, it shows in different columns the thread ID (TID), process name (Process), the average (%) execution time and the actual execution time (Time) during the current time range. It shows all threads that were executing on the CPUs in the current time range.

CPU Usage Chart

The CPU Usage Chart on the right side of the view, plots the total time spent on all CPUs of all processes and the time of the selected process.

Using the mouse

The CPU Usage chart is usable with the mouse. The following actions are set:

Tooltips

Hover the cursor over a line of the chart and a tooltip will pop up with the following information:

Toolbar

The CPU Usage View toolbar, located at the top right of the view, has shortcut buttons to perform common actions:

Align Views Disable and enable the automatic time axis alignment of time-based views. Disabling the alignment in the this view will disable this feature across all the views because it's a workspace preference

CPU Filtering

Follow a CPU will filter the CPU Usage View and will display only usage for the followed CPU.

Kernel Memory Usage View

The Kernel Memory Usage and view is specific to kernel traces. To open the view, double-click on the Kernel Memory Usage Analysis tree element under the Kernel tree element of the Project Explorer.

Now, the Kernel memory usage view will show:

Where:

The view is divided into the following important sections: Process Information and the Relative Kernel Memory Usage. The time axis is aligned with other views that support automatic time axis alignment (see Automatic Time Axis Alignment).

Process Information

The Process Information is displayed on the left side of the view and shows all threads that were executing on all available CPUs in the current time range. For each process, it shows in different columns the thread ID (TID) and the process name (Process).

Relative Kernel Memory Chart

The Relative Kernel Memory Chart on the right side of the view, plots the relative amount of memory that was allocated and deallocated during that period of time.

Using the mouse

The Relative Kernel Memory chart is usable with the mouse. The following actions are set:

Tooltips

Hover the cursor over a line of the chart and a tooltip will pop up with the following information:

Process Wait Analysis

TraceCompass can recover wait causes of local and distributed processes using operating system events. The analysis highlights the tasks and devices causing wait. Wait cause recovery is recursive, comprise all tasks running on the system and works across computers using packet trace synchronization.

The analysis details are available in the paper Wait analysis of distributed systems using kernel tracing.

Prerequisites

The analysis requires a Linux kernel trace. Additional instrumentation may be required for specific kernel version and for distributed tracing. This instrumentation is available in LTTng modules addons on GitHub.

The required events are:

To analyze a distributed program, all computers involved in the processing must be traced simultaneously. The LTTng Tracer Control of TraceCompass can trace a remote computer, but controlling simultaneous tracing is not supported at the moment, meaning that all sessions must be started separately and interactively. TraceCompass will support this feature in the future. For now, it is suggested to use lttng-cluster command line tool to control simultaneous tracing sessions on multiple computers. This tool is based on Fabric and uses SSH to start the tracing sessions, execute a workload, stop the sessions and gather traces on the local computer. For more information, refer to the lttng-cluster documentation.

We use the Django trace as an example to demonstrate the wait analysis. Django is a popular Web framework. The application is the Django Poll app tutorial. The traces were recorded on three computers, namely the client (implemented with Python Mechanize), the Web server (Apache with WSGI) and the database server (PostgreSQL). The client simulates a vote in the poll.

Running the analysis

To open all three traces simultaneously, we first create an experiment containing these traces and then synchronize the traces, such that they have a common time base. Then, the analysis is done by selecting a task in the Control Flow View. The result is displayed in the Critical Flow View, which works like the Control Flow View. The steps to load the Django example follows.

  1. Download and extract the Django trace archive.
  2. In TraceCompass, open the LTTng Kernel Perspective.
  3. Create a new tracing project. Select File -> New -> Tracing -> Tracing Project, choose a name and click Finish.
  4. Under the created tracing project, right-click on Traces and select Import.... In the import dialog, select the root directory containing the extracted trace by clicking on Browse. Three traces should be listed. Select the traces and click Finish. After the import is completed, the traces should be listed below Traces.
  5. Right-click on Experiments, select New... and enter a name for the experiment, such as django.
  6. Right-click on the django experiment and click on Select Traces.... In the dialog, check the three traces django-client, django-httpd and django-db. These traces will appear below the experiment. If the experiment is opened at this stage, the traces are not synchronized and there will be a large time gap between events from different traces.
  7. To synchronize the traces, right-click on the django experiment and select Synchronize Traces. In the Select reference trace dialog, select any available trace and click Finish. Once the synchronization is completed, a new entry with an underline suffix will appear for each modified trace. The created trace entries have a function which is applied to the timestamps of events in order to shift the time according to the reference trace. The Project Explorer after the import is shown below.
  8. Open the experiment django. The Control Flow and the Resources views should display the three traces simultaneously.
  9. In the main menu, select Window -> Show View -> Other... and under LTTng select Critical Flow View. The view is empty for the moment.
  10. In the Critical Flow View, right-click on the Process entry to analyze and select Follow, as shown in the figure below.
    The analysis will execute and the result will appear in the Critical Flow View. For the Django example, use the View Filters to search for the python process with TID 2327. When zooming on the execution, the view displays the work done by the Web server and the database to process the request of the python client. Vertical arrows represent synchronization and communication between processes. The legend displays the colors associated with the processes states.

Input/Output Analysis

TraceCompass can analyse disk input/output through the read/write system calls to get the read/write per processes, but also with the disk request events, to get the actual reads and writes to disk.

Get the trace

The following tracepoints should be enabled to get the disk read/write data. Also, enabling syscalls will allow to match the reads and writes per processes.

   # sudo lttng list -k
   Kernel events:
   -------------
     ...
     block_rq_complete (loglevel: TRACE_EMERG (0)) (type: tracepoint)
     block_rq_insert (loglevel: TRACE_EMERG (0)) (type: tracepoint)
     block_rq_issue (loglevel: TRACE_EMERG (0)) (type: tracepoint) # on the guest
     block_bio_frontmerge (loglevel: TRACE_EMERG (0)) (type: tracepoint) # on the guest
     ...

For full disk request tracking, some extra tracepoints are necessary. They are not required for the I/O analysis, but make the analysis more complete. Here is the procedure to get those tracepoints that are not yet part of the mainline kernel.

   # git clone https://github.com/giraldeau/lttng-modules.git
   # cd lttng-modules

Checkout the addons branch, compile and install lttng-modules as per the lttng-modules documentation.

   # git checkout addons
   # make
   # sudo make modules_install
   # sudo depmod -a

The lttng addons modules must be inserted manually for the extra tracepoints to be available:

   # sudo modprobe lttng-addons
   # sudo modprobe lttng-elv

And enable the following tracepoint

   addons_elv_merge_requests

Input/Output Views

The following views are available for input/output analyses:

Disk I/O Activity View

A time aligned XY chart of the read and write speed for the different disks on the system. This view is useful to see where there was more activity on the disks and whether it was mostly reads or writes.

 Disk I/O Activity Example

System Call Latency Analysis

The System Call Latency Analysis measures the system call latency between system call entry and exit per type of system call. The durations are visualized using the Latency views. For more information about the Latency views see chapter Latency Analyses.

Futex Contention Latency Analysis

The Futex Contention Latency Analysis measures the futexes contention latency between futex entry and exit event for a thread. The durations are visualized using the Latency views. For more information about the Latency views see chapter Latency Analyses.

The following views are also available for the Futex Contention Latency Analysis:

Uaddr vs Waiter

A timegraph view of the waiters by futex uaddr. This view is useful to see which threads are waiting on a specific futex and understand blocked threads.

 uaddr vs waiter example

Scenarios

A timegraph view of the futex wait/lock and wake/unlock scenarios (from futex entry to futex exit). This view is useful to suss up the general level of contention in a given trace. It highlights futex lifespans.

 scenario example

Latency analysis for IRQ handlers

The Latency analysis for IRQ handlers measures the latency between the IRQ handlers entry and exit events. The durations are visualized using the Latency views. For more information about the Latency views see chapter Latency Analyses.

LTTng Kernel Events Editor

The LTTng Kernel Events editor is the plain TMF Events Editor, except that it provides its own specialized viewer to replace the standard one. In short, it has exactly the same behaviour but the layout is slightly different: