Changes

Page slugs have been fixed up from Wordpress to Wiki specs -- a few issues may remain though.
Philippe Gerum · 0ef1a71a
--- a/Finding_Spurious_Relaxes.asciidoc
+++ b/Finding_Spurious_Relaxes.asciidoc
+Finding spurious relaxes
+========================
+:author:	Philippe Gerum
+:email:	 	rpm@xenomai.org
+:categories:	Application
+:tags:		troubleshooting
+
+Sources of spurious relaxes
+---------------------------
+
+In a dual kernel configuration, an application does not want to switch
+to the regular Linux execution mode unexpectedly, aka _secondary
+mode_. When short and bounded response time is a strong requirement,
+the application wants to run in _primary mode_ exclusively, because
+only this mode is deemed to deliver real-time capabilities.
+
+There are several reasons for an application to be moved out of
+real-time mode automatically by the Xenomai co-kernel. This happens
+when the application is:
+
+- invoking a regular Linux syscall triggering a processor fault
+- receiving a Linux-originated signal (i.e. *not* a Xenomai-originated
+- one), possibly sent by the Xenomai kernel upon detecting a
+- consistency issue with the application
+
+At any rate, we want to know *where* in the application code the
+signal was received, which will be a strong hint to find out *why* we
+switched away from real-time mode.
+
+[TIP]
+The reason Xenomai has to move the application out of real-time mode
+upon such events stems from the way a dual kernel system works.  As
+the Cobalt kernel may preempt the regular Linux kernel at any time in
+the course of its execution for dealing with a high priority event,
+including when the latter assumes to be traversing some atomic
+section, special care has to be taken for re-entering a valid Linux
+context before a regular kernel service can be delivered.
+
+Manually debugging spurious relaxes
+-----------------------------------
+
+Xenomai provides a mechanism for sending a misbehaving application a
+regular Linux signal immediately when it switches to _secondary mode_
+unexpectedly. When run over GDB, the backtrace of the current thread
+receiving the signal can be examined. This feature is available with
+both Xenomai 2 and Xenomai 3.
+
+Each time a spurious migration to _secondary mode_ happens, the thread
+is sent a SIGDEBUG signal, along with additional information in the
+siginfo_t+ data. First of all, the application should trap such
+signal to a handler, like this:
+
+-------------------------------------------------------------------------------
+#include <signal.h>
+
+    struct sigaction sa;
+    sigemptyset(&sa.sa_mask);
+    sa.sa_sigaction = sigdebug_handler;
+    sa.sa_flags = SA_SIGINFO;
+    sigaction(SIGDEBUG, &sa, NULL);
+-------------------------------------------------------------------------------
+
+The +sigdebug_handler()+ routine would have to retrieve and decode the
+current backtrace, writing out the cause of the SIGDEBUG notification
+and hopefully a meaningful call stack:
+
+-------------------------------------------------------------------------------
+#include <sys/types.h>
+#include <pthread.h>
+#include <signal.h>
+#include <unistd.h>
+#include <execinfo.h>
+
+static const char *sigdebug_reasons[] = {
+	[SIGDEBUG_UNDEFINED] = "latency: received SIGXCPU for unknown reason",
+	[SIGDEBUG_MIGRATE_SIGNAL] = "received signal",
+	[SIGDEBUG_MIGRATE_SYSCALL] = "invoked syscall",
+	[SIGDEBUG_MIGRATE_FAULT] = "triggered fault",
+	[SIGDEBUG_MIGRATE_PRIOINV] = "affected by priority inversion",
+	[SIGDEBUG_NOMLOCK] = "Xenomai: process memory not locked "
+	"(missing mlockall?)",
+	[SIGDEBUG_WATCHDOG] = "Xenomai: watchdog triggered "
+	"(period too short?)",
+};
+
+void sigdebug_handler(int sig, siginfo_t *si, void *context)
+{
+	const char fmt[] = "Mode switch (reason: %s), aborting. Backtrace:\n";
+	unsigned int reason = sigdebug_reason(si);
+	static char buffer[256];
+	static void *bt[200];
+	unsigned int n;
+
+	if (reason > SIGDEBUG_WATCHDOG)
+		reason = SIGDEBUG_UNDEFINED;
+
+	switch(reason) {
+	case SIGDEBUG_UNDEFINED:
+	case SIGDEBUG_NOMLOCK:
+	case SIGDEBUG_WATCHDOG:
+		/* These errors are lethal, something went really wrong. */
+		n = snprintf(buffer, sizeof(buffer),
+			     "%s\n", sigdebug_reasons[reason]);
+		write(STDERR_FILENO, buffer, n);
+		exit(EXIT_FAILURE);
+	}
+
+	/* Retrieve the current backtrace, and decode it to stdout. */
+	n = snprintf(buffer, sizeof(buffer), fmt, sigdebug_reason[reason]);
+	n = write(STDERR_FILENO, buffer, n);
+	n = backtrace(bt, sizeof(bt)/sizeof(bt[0]));
+	backtrace_symbols_fd(bt, n, STDERR_FILENO);
+
+	signal(sig, SIG_DFL);
+	kill(getpid(), sig);
+}
+-------------------------------------------------------------------------------
+
+Then, you need to enable the _warn_upon_switch_ capability on a
+per-thread basis. For instance, a POSIX-based application would run
+this code from the thread to monitor for spurious relaxes:
+
+-------------------------------------------------------------------------------
+    /* Ask Xenomai to warn us upon switches to secondary mode. */
+    pthread_set_mode_np(0, PTHREAD_WARNSW);
+-------------------------------------------------------------------------------
+
+Alternatively, an application based on the _native_ API would run this
+code instead:
+
+-------------------------------------------------------------------------------
+    /* Ask Xenomai to warn us upon switches to secondary mode. */
+    rt_task_set_mode(0, T_WARNSW, NULL);
+-------------------------------------------------------------------------------
+
+Debugging spurious relaxes with Xenomai 3 and *slackspot*
+----------------------------------------------------------
+
+Xenomai 3 introduces a new utility called *slackspot*, for pinpointing
+code locations which caused transitions to _secondary mode_.
+
+*slackspot* combines dedicated kernel support exporting trace
+information about spurious relaxes through the
+/proc/xenomai/debug/relax+ virtual file, and a userland utility
+called +sbin/slackspot+, which parses the virtual file output to
+display the program backtrace, so that we can locate the offending
+code easily.
+
+The motivation to centralize tracing information about relaxes
+directly into kernel space is fourfold:
+
+- it allows to gather all the trace data into a single location and
+keep it safe there, with no external log file involved.
+
+- enabling the tracing does not impose any requirement on the
+application (aside of being compiled with debug symbols for best
+interpreting that information). We only need a kernel config switch
+for this (i.e. CONFIG_XENO_OPT_DEBUG_TRACE_RELAX).
+
+- the data is collected and can be made available exactly the same
+way regardless of the application emitting the relax requests, or
+whether it is still alive when the trace data are displayed.
+
+- the kernel is able to provide accurate and detailed trace
+information, such as the relative offset of instructions causing relax
+requests within dynamic shared objects, without having to guess it
+roughly from /proc/pid/maps, or relying on ldd's --function-relocs
+feature, which both require to run on the target system to get the
+needed information. Instead, we allow a build host to use a
+cross-compilation toolchain later to extract the source location, from
+the raw data the kernel has provided on the target system.
+
+A typical debug session with *slackspot*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Here is a typical debugging session, involving a target board running
+the Cobalt-based application, and a host system, providing the
+cross-compilation toolchain used to build that application. We use the
+netcat utility to pull the trace data over the wire from the target,
+and process it locally on our debug host:
+
+---------------------------------------------------------------------------
+target> netcat -l -p <port> -c < /proc/xenomai/debug/relax
+
+host> nc <target-ip> <port> | CROSS_COMPILE=ppc_6xx- ./slackspot --path=/opt/rootfs/MPC5200/lib:$HOME/frags/relax --filter thread=Task*
+
+Thread[828] "Task 2" started by /relax:
+   #0  0xfff00000 ???
+   #1  0x000001bb ???
+   #2  0x00064393 _IO_file_doallocate() in ??:?
+   #3  0x00073d6f _IO_doallocbuf() in ??:?
+   #4  0x00072d87 _IO_file_overflow() in ??:?
+   #5  0x00075f83 __overflow() in ??:?
+   #6  0x0006997b putchar() in ??:?
+   #7  0x100017db task2_func() in /home/rpm/frags/relax/relax.c:23
+   #8  0x000078d7 task_entry() in /home/rpm/git/xenomai-forge/lib/alchemy/task.c:235
+   #9  0x00005a6b start_thread() in pthread_create.c:?
+   #10 0x000d389f __clone() in ??:?
+
+Thread[828] "Task 2" started by /relax (4 times):
+   #0  0x000c443f write() in ??:?
+   #1  0x00072553 _IO_file_write() in ??:?
+   #2  0x000721cf _IO_file_seek() in ??:?
+   #3  0x000724c7 _IO_do_write() in ??:?
+   #4  0x00072c2f _IO_file_sync() in ??:?
+   #5  0x00064a4f _IO_fflush() in ??:?
+   #6  0x100017eb task2_func() in /home/rpm/frags/relax/relax.c:24
+   #7  0x000078d7 task_entry() in /home/rpm/git/xenomai-forge/lib/alchemy/task.c:235
+   #8  0x00005a6b start_thread() in pthread_create.c:?
+   #9  0x000d389f __clone() in ??:?
+---------------------------------------------------------------------------
+
+The *slackspot* output here shows that something wrong is happening at
+lines 23 and 24 from the example file, as the application calls
+buffered standard I/O routines from _primary mode_, which in turn
+invoke regular kernel services, causing the thread to relax and leave
+the real-time mode.