1. 26 Aug, 2009 1 commit
  2. 11 Aug, 2009 1 commit
    • Jason Baron's avatar
      tracing: Add DECLARE_TRACE_WITH_CALLBACK() macro · 63fbdab3
      Jason Baron authored
      
      
      Introduce a new 'DECLARE_TRACE_WITH_CALLBACK()' macro, so that
      tracepoints can associate an external register/unregister function.
      
      This prepares for the syscalls tracer conversion to trace events. We
      will need to perform arch level operations once a syscall event is
      turned on/off, such as TIF flags setting, hence the need of such
      specific callbacks.
      Signed-off-by: default avatarJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      63fbdab3
  3. 16 Jun, 2009 1 commit
    • GeunSik Lim's avatar
      debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem. · 156f5a78
      GeunSik Lim authored
      Many developers use "/debug/" or "/debugfs/" or "/sys/kernel/debug/"
      directory name to mount debugfs filesystem for ftrace according to
      ./Documentation/tracers/ftrace.txt file.
      
      And, three directory names(ex:/debug/, /debugfs/, /sys/kernel/debug/) is
      existed in kernel source like ftrace, DRM, Wireless, Documentation,
      Network[sky2]files to mount debugfs filesystem.
      
      debugfs means debug filesystem for debugging easy to use by greg kroah
      hartman. "/sys/kernel/debug/" name is suitable as directory name
      of debugfs filesystem.
      - debugfs related reference: http://lwn.net/Articles/334546/
      
      
      
      Fix inconsistency of directory name to mount debugfs filesystem.
      
      * From Steven Rostedt
        - find_debugfs() and tracing_files() in this patch.
      Signed-off-by: default avatarGeunSik Lim <geunsik.lim@samsung.com>
      Acked-by     : Inaky Perez-Gonzalez <inaky@linux.intel.com>
      Reviewed-by  : Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by  : James Smart <james.smart@emulex.com>
      CC: Jiri Kosina <trivial@kernel.org>
      CC: David Airlie <airlied@linux.ie>
      CC: Peter Osterlund <petero2@telia.com>
      CC: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      CC: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      CC: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@suse.de>
      156f5a78
  4. 24 Apr, 2009 1 commit
  5. 14 Apr, 2009 1 commit
    • Steven Rostedt's avatar
      tracing: consolidate trace and trace_event headers · ea20d929
      Steven Rostedt authored
      
      
      Impact: clean up
      
      Neil Horman (et. al.) criticized the way the trace events were broken up
      into two files. The reason for that was that ftrace needed to separate out
      the declarations from where the #include <linux/tracepoint.h> was used.
      It then dawned on me that the tracepoint.h header only needs to define the
      TRACE_EVENT macro if it is not already defined.
      
      The solution is simply to test if TRACE_EVENT is defined, and if it is not
      then the linux/tracepoint.h header can define it. This change consolidates
      all the <traces>.h and <traces>_event_types.h into the <traces>.h file.
      Reported-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarTheodore Tso <tytso@mit.edu>
      Reported-by: default avatarJiaying Zhang <jiayingz@google.com>
      Cc: Zhaolei <zhaolei@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ea20d929
  6. 10 Mar, 2009 6 commits
    • Steven Rostedt's avatar
      tracing: remove funky whitespace in the trace code · ef18012b
      Steven Rostedt authored
      
      
      Impact: clean up
      
      There existed a lot of <space><tab>'s in the tracing code. This
      patch removes them.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      ef18012b
    • Steven Rostedt's avatar
      tracing: document TRACE_EVENT macro in tracepoint.h · 823f9124
      Steven Rostedt authored
      
      
      Impact: clean up / comments
      
      Kosaki Motohiro asked about an explanation to the TRACE_EVENT macro.
      Ingo Molnar replied with a nice description.
      
      This patch takes the description that Ingo wrote (with some slight
      modifications) and adds it to the tracepoint.h file.
      Reported-by: default avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      823f9124
    • Steven Rostedt's avatar
      tracing: flip the TP_printk and TP_fast_assign in the TRACE_EVENT macro · 30a8fecc
      Steven Rostedt authored
      
      
      Impact: clean up
      
      In trying to stay consistant with the C style format in the TRACE_EVENT
      macro, it makes more sense to do the printk after the assigning of
      the variables.
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      30a8fecc
    • Steven Rostedt's avatar
      tracing: remove obsolete TRACE_EVENT_FORMAT macro · 157587d7
      Steven Rostedt authored
      
      
      Impact: clean up
      
      The TRACE_EVENT_FORMAT macro is no longer used by trace points
      and only the DECLARE_TRACE, TRACE_FORMAT or TRACE_EVENT macros should
      be used by them. Although the TRACE_EVENT_FORMAT macro is still used
      by the internal tracing utility, it should not be used in core
      kernel code.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      157587d7
    • Steven Rostedt's avatar
      tracing: new format for specialized trace points · da4d0302
      Steven Rostedt authored
      
      
      Impact: clean up and enhancement
      
      The TRACE_EVENT_FORMAT macro looks quite ugly and is limited in its
      ability to save data as well as to print the record out. Working with
      Ingo Molnar, we came up with a new format that is much more pleasing to
      the eye of C developers. This new macro is more C style than the old
      macro, and is more obvious to what it does.
      
      Here's the example. The only updated macro in this patch is the
      sched_switch trace point.
      
      The old method looked like this:
      
       TRACE_EVENT_FORMAT(sched_switch,
              TP_PROTO(struct rq *rq, struct task_struct *prev,
                      struct task_struct *next),
              TP_ARGS(rq, prev, next),
              TP_FMT("task %s:%d ==> %s:%d",
                    prev->comm, prev->pid, next->comm, next->pid),
              TRACE_STRUCT(
                      TRACE_FIELD(pid_t, prev_pid, prev->pid)
                      TRACE_FIELD(int, prev_prio, prev->prio)
                      TRACE_FIELD_SPECIAL(char next_comm[TASK_COMM_LEN],
                                          next_comm,
                                          TP_CMD(memcpy(TRACE_ENTRY->next_comm,
                                                       next->comm,
                                                       TASK_COMM_LEN)))
                      TRACE_FIELD(pid_t, next_pid, next->pid)
                      TRACE_FIELD(int, next_prio, next->prio)
              ),
              TP_RAW_FMT("prev %d:%d ==> next %s:%d:%d")
              );
      
      The above method is hard to read and requires two format fields.
      
      The new method:
      
       /*
        * Tracepoint for task switches, performed by the scheduler:
        *
        * (NOTE: the 'rq' argument is not used by generic trace events,
        *        but used by the latency tracer plugin. )
        */
       TRACE_EVENT(sched_switch,
      
      	TP_PROTO(struct rq *rq, struct task_struct *prev,
      		 struct task_struct *next),
      
      	TP_ARGS(rq, prev, next),
      
      	TP_STRUCT__entry(
      		__array(	char,	prev_comm,	TASK_COMM_LEN	)
      		__field(	pid_t,	prev_pid			)
      		__field(	int,	prev_prio			)
      		__array(	char,	next_comm,	TASK_COMM_LEN	)
      		__field(	pid_t,	next_pid			)
      		__field(	int,	next_prio			)
      	),
      
      	TP_printk("task %s:%d [%d] ==> %s:%d [%d]",
      		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
      		__entry->next_comm, __entry->next_pid, __entry->next_prio),
      
      	TP_fast_assign(
      		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
      		__entry->prev_pid	= prev->pid;
      		__entry->prev_prio	= prev->prio;
      		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
      		__entry->next_pid	= next->pid;
      		__entry->next_prio	= next->prio;
      	)
       );
      
      This macro is called TRACE_EVENT, it is broken up into 5 parts:
      
       TP_PROTO:        the proto type of the trace point
       TP_ARGS:         the arguments of the trace point
       TP_STRUCT_entry: the structure layout of the entry in the ring buffer
       TP_printk:       the printk format
       TP_fast_assign:  the method used to write the entry into the ring buffer
      
      The structure is the definition of how the event will be saved in the
      ring buffer. The printk is used by the internal tracing in case of
      an oops, and the kernel needs to print out the format of the record
      to the console. This the TP_printk gives a means to show the records
      in a human readable format. It is also used to print out the data
      from the trace file.
      
      The TP_fast_assign is executed directly. It is basically like a C function,
      where the __entry is the handle to the record.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      da4d0302
    • Steven Rostedt's avatar
      tracing: replace TP<var> with TP_<var> · 2939b046
      Steven Rostedt authored
      
      
      Impact: clean up
      
      The macros TPPROTO, TPARGS, TPFMT, TPRAWFMT, and TPCMD all look a bit
      ugly. This patch adds an underscore to their names.
      Signed-off-by: default avatarSteven Rostedt <srostedt@redhat.com>
      2939b046
  7. 28 Feb, 2009 1 commit
  8. 26 Feb, 2009 2 commits
  9. 25 Feb, 2009 1 commit
  10. 16 Nov, 2008 4 commits
  11. 03 Nov, 2008 1 commit
  12. 14 Oct, 2008 3 commits
    • Mathieu Desnoyers's avatar
      tracepoints: synchronize unregister static inline · 231375cc
      Mathieu Desnoyers authored
      
      
      Turn tracepoint synchronize unregister into a static inline. There is no
      reason to keep it as a macro over a static inline.
      Signed-off-by: default avatarMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      231375cc
    • Mathieu Desnoyers's avatar
      tracepoints: tracepoint_synchronize_unregister() · f2461fc8
      Mathieu Desnoyers authored
      
      
      Create tracepoint_synchronize_unregister() which must be called before the end
      of exit() to make sure every probe callers have exited the non preemptible
      section and thus are not executing the probe code anymore.
      Signed-off-by: default avatarMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f2461fc8
    • Mathieu Desnoyers's avatar
      tracing: Kernel Tracepoints · 97e1c18e
      Mathieu Desnoyers authored
      Implementation of kernel tracepoints. Inspired from the Linux Kernel
      Markers. Allows complete typing verification by declaring both tracing
      statement inline functions and probe registration/unregistration static
      inline functions within the same macro "DEFINE_TRACE". No format string
      is required. See the tracepoint Documentation and Samples patches for
      usage examples.
      
      Taken from the documentation patch :
      
      "A tracepoint placed in code provides a hook to call a function (probe)
      that you can provide at runtime. A tracepoint can be "on" (a probe is
      connected to it) or "off" (no probe is attached). When a tracepoint is
      "off" it has no effect, except for adding a tiny time penalty (checking
      a condition for a branch) and space penalty (adding a few bytes for the
      function call at the end of the instrumented function and adds a data
      structure in a separate section).  When a tracepoint is "on", the
      function you provide is called each time the tracepoint is executed, in
      the execution context of the caller. When the function provided ends its
      execution, it returns to the caller (continuing from the tracepoint
      site).
      
      You can put tracepoints at important locations in the code. They are
      lightweight hooks that can pass an arbitrary number of parameters, which
      prototypes are described in a tracepoint declaration placed in a header
      file."
      
      Addition and removal of tracepoints is synchronized by RCU using the
      scheduler (and preempt_disable) as guarantees to find a quiescent state
      (this is really RCU "classic"). The update side uses rcu_barrier_sched()
      with call_rcu_sched() and the read/execute side uses
      "preempt_disable()/preempt_enable()".
      
      We make sure the previous array containing probes, which has been
      scheduled for deletion by the rcu callback, is indeed freed before we
      proceed to the next update. It therefore limits the rate of modification
      of a single tracepoint to one update per RCU period. The objective here
      is to permit fast batch add/removal of probes on _different_
      tracepoints.
      
      Changelog :
      - Use #name ":" #proto as string to identify the tracepoint in the
        tracepoint table. This will make sure not type mismatch happens due to
        connexion of a probe with the wrong type to a tracepoint declared with
        the same name in a different header.
      - Add tracepoint_entry_free_old.
      - Change __TO_TRACE to get rid of the 'i' iterator.
      
      Masami Hiramatsu <mhiramat@redhat.com> :
      Tested on x86-64.
      
      Performance impact of a tracepoint : same as markers, except that it
      adds about 70 bytes of instructions in an unlikely branch of each
      instrumented function (the for loop, the stack setup and the function
      call). It currently adds a memory read, a test and a conditional branch
      at the instrumentation site (in the hot path). Immediate values will
      eventually change this into a load immediate, test and branch, which
      removes the memory read which will make the i-cache impact smaller
      (changing the memory read for a load immediate removes 3-4 bytes per
      site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it
      also saves the d-cache hit).
      
      About the performance impact of tracepoints (which is comparable to
      markers), even without immediate values optimizations, tests done by
      Hideo Aoki on ia64 show no regression. His test case was using hackbench
      on a kernel where scheduler instrumentation (about 5 events in code
      scheduler code) was added.
      
      Quoting Hideo Aoki about Markers :
      
      I evaluated overhead of kernel marker using linux-2.6-sched-fixes git
      tree, which includes several markers for LTTng, using an ia64 server.
      
      While the immediate trace mark feature isn't implemented on ia64, there
      is no major performance regression. So, I think that we don't have any
      issues to propose merging marker point patches into Linus's tree from
      the viewpoint of performance impact.
      
      I prepared two kernels to evaluate. The first one was compiled without
      CONFIG_MARKERS. The second one was enabled CONFIG_MARKERS.
      
      I downloaded the original hackbench from the following URL:
      http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c
      
      
      
      I ran hackbench 5 times in each condition and calculated the average and
      difference between the kernels.
      
          The parameter of hackbench: every 50 from 50 to 800
          The number of CPUs of the server: 2, 4, and 8
      
      Below is the results. As you can see, major performance regression
      wasn't found in any case. Even if number of processes increases,
      differences between marker-enabled kernel and marker- disabled kernel
      doesn't increase. Moreover, if number of CPUs increases, the differences
      doesn't increase either.
      
      Curiously, marker-enabled kernel is better than marker-disabled kernel
      in more than half cases, although I guess it comes from the difference
      of memory access pattern.
      
      * 2 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      4.811   |       4.872  |  +0.061  |  +1.27  |
            100 |      9.854   |      10.309  |  +0.454  |  +4.61  |
            150 |     15.602   |      15.040  |  -0.562  |  -3.6   |
            200 |     20.489   |      20.380  |  -0.109  |  -0.53  |
            250 |     25.798   |      25.652  |  -0.146  |  -0.56  |
            300 |     31.260   |      30.797  |  -0.463  |  -1.48  |
            350 |     36.121   |      35.770  |  -0.351  |  -0.97  |
            400 |     42.288   |      42.102  |  -0.186  |  -0.44  |
            450 |     47.778   |      47.253  |  -0.526  |  -1.1   |
            500 |     51.953   |      52.278  |  +0.325  |  +0.63  |
            550 |     58.401   |      57.700  |  -0.701  |  -1.2   |
            600 |     63.334   |      63.222  |  -0.112  |  -0.18  |
            650 |     68.816   |      68.511  |  -0.306  |  -0.44  |
            700 |     74.667   |      74.088  |  -0.579  |  -0.78  |
            750 |     78.612   |      79.582  |  +0.970  |  +1.23  |
            800 |     85.431   |      85.263  |  -0.168  |  -0.2   |
      --------------------------------------------------------------
      
      * 4 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      2.586   |       2.584  |  -0.003  |  -0.1   |
            100 |      5.254   |       5.283  |  +0.030  |  +0.56  |
            150 |      8.012   |       8.074  |  +0.061  |  +0.76  |
            200 |     11.172   |      11.000  |  -0.172  |  -1.54  |
            250 |     13.917   |      14.036  |  +0.119  |  +0.86  |
            300 |     16.905   |      16.543  |  -0.362  |  -2.14  |
            350 |     19.901   |      20.036  |  +0.135  |  +0.68  |
            400 |     22.908   |      23.094  |  +0.186  |  +0.81  |
            450 |     26.273   |      26.101  |  -0.172  |  -0.66  |
            500 |     29.554   |      29.092  |  -0.461  |  -1.56  |
            550 |     32.377   |      32.274  |  -0.103  |  -0.32  |
            600 |     35.855   |      35.322  |  -0.533  |  -1.49  |
            650 |     39.192   |      38.388  |  -0.804  |  -2.05  |
            700 |     41.744   |      41.719  |  -0.025  |  -0.06  |
            750 |     45.016   |      44.496  |  -0.520  |  -1.16  |
            800 |     48.212   |      47.603  |  -0.609  |  -1.26  |
      --------------------------------------------------------------
      
      * 8 CPUs
      
      Number of | without      | with         | diff     | diff    |
      processes | Marker [Sec] | Marker [Sec] |   [Sec]  |   [%]   |
      --------------------------------------------------------------
             50 |      2.094   |       2.072  |  -0.022  |  -1.07  |
            100 |      4.162   |       4.273  |  +0.111  |  +2.66  |
            150 |      6.485   |       6.540  |  +0.055  |  +0.84  |
            200 |      8.556   |       8.478  |  -0.078  |  -0.91  |
            250 |     10.458   |      10.258  |  -0.200  |  -1.91  |
            300 |     12.425   |      12.750  |  +0.325  |  +2.62  |
            350 |     14.807   |      14.839  |  +0.032  |  +0.22  |
            400 |     16.801   |      16.959  |  +0.158  |  +0.94  |
            450 |     19.478   |      19.009  |  -0.470  |  -2.41  |
            500 |     21.296   |      21.504  |  +0.208  |  +0.98  |
            550 |     23.842   |      23.979  |  +0.137  |  +0.57  |
            600 |     26.309   |      26.111  |  -0.198  |  -0.75  |
            650 |     28.705   |      28.446  |  -0.259  |  -0.9   |
            700 |     31.233   |      31.394  |  +0.161  |  +0.52  |
            750 |     34.064   |      33.720  |  -0.344  |  -1.01  |
            800 |     36.320   |      36.114  |  -0.206  |  -0.57  |
      --------------------------------------------------------------
      Signed-off-by: default avatarMathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Acked-by: default avatar'Peter Zijlstra' <peterz@infradead.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      97e1c18e