1. 28 Dec, 2009 1 commit
    • Li Zefan's avatar
      perf events: Remove CONFIG_EVENT_PROFILE · 07b139c8
      Li Zefan authored
      
      
      Quoted from Ingo:
      
      | This reminds me - i think we should eliminate CONFIG_EVENT_PROFILE -
      | it's an unnecessary Kconfig complication. If both PERF_EVENTS and
      | EVENT_TRACING is enabled we should expose generic tracepoints.
      |
      | Nor is it limited to event 'profiling', so it has become a misnomer as
      | well.
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <4B2F1557.2050705@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      07b139c8
  2. 13 Dec, 2009 5 commits
  3. 26 Nov, 2009 1 commit
    • Ingo Molnar's avatar
      events: Rename TRACE_EVENT_TEMPLATE() to DECLARE_EVENT_CLASS() · 091ad365
      Ingo Molnar authored
      
      
      It is not quite obvious at first sight what TRACE_EVENT_TEMPLATE
      does: does it define an event as well beyond defining a template?
      
      To clarify this, rename it to DECLARE_EVENT_CLASS, which follows
      the various 'DECLARE_*()' idioms we already have in the kernel:
      
        DECLARE_EVENT_CLASS(class)
      
          DEFINE_EVENT(class, event1)
          DEFINE_EVENT(class, event2)
          DEFINE_EVENT(class, event3)
      
      To complete this logic we should also rename TRACE_EVENT() to:
      
        DEFINE_SINGLE_EVENT(single_event)
      
      ... but in a more quiet moment of the kernel cycle.
      
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4B0E286A.2000405@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      091ad365
  4. 24 Nov, 2009 2 commits
    • Steven Rostedt's avatar
      tracing: Create new DEFINE_EVENT_PRINT · e5bc9721
      Steven Rostedt authored
      
      
      After creating the TRACE_EVENT_TEMPLATE I started to look at other
      trace points to see what duplication was made. I noticed that there
      are several trace points where they are almost identical except for
      the name and the output format. Since TRACE_EVENT_TEMPLATE was successful
      in bringing down the size of trace events, I added a DEFINE_EVENT_PRINT.
      
      DEFINE_EVENT_PRINT is used just like DEFINE_EVENT is. That is, the
      DEFINE_EVENT_PRINT also uses a TRACE_EVENT_TEMPLATE, but it allows the
      developer to overwrite the print format. If there are two or more
      TRACE_EVENTS that are identical except for the name and print, then
      they can be converted to use a TRACE_EVENT_TEMPLATE. Since the
      TRACE_EVENT_TEMPLATE already does the print output, the first trace event
      would have its print format held in the TRACE_EVENT_TEMPLATE and
      be defined with a DEFINE_EVENT. The rest will use the DEFINE_EVENT_PRINT
      and override the print format.
      
      Converting the sched trace points to both DEFINE_EVENT and
      DEFINE_EVENT_PRINT. Five were converted to DEFINE_EVENT and two were
      converted to DEFINE_EVENT_PRINT.
      
      I was able to get the following:
      
      $ size kernel/sched.o-*
         text	   data	    bss	    dec	    hex	filename
        79299	   6776	   2520	  88595	  15a13	kernel/sched.o-notrace
       101941	  11896	   2584	 116421	  1c6c5	kernel/sched.o-templ
       104779	  11896	   2584	 119259	  1d1db	kernel/sched.o-trace
      
      sched.o-notrace is the scheduler compiled with no trace points.
      sched.o-templ is with the use of DEFINE_EVENT and DEFINE_EVENT_PRINT
      sched.o-trace is the current trace events.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      e5bc9721
    • Steven Rostedt's avatar
      tracing: Create new TRACE_EVENT_TEMPLATE · ff038f5c
      Steven Rostedt authored
      
      
      There are some places in the kernel that define several tracepoints and
      they are all identical besides the name. The code to enable, disable and
      record is created for every trace point even if most of the code is
      identical.
      
      This patch adds TRACE_EVENT_TEMPLATE that lets the developer create
      a template TRACE_EVENT and create trace points with DEFINE_EVENT, which
      is based off of a given template. Each trace point used by this
      will share most of the code, and bring down the size of the kernel
      when there are several duplicate events.
      
      Usage is:
      
      TRACE_EVENT_TEMPLATE(name, proto, args, tstruct, assign, print);
      
      Which would be the same as defining a normal TRACE_EVENT.
      
      To create the trace events that the trace points will use:
      
      DEFINE_EVENT(template, name, proto, args) is done. The template
      is the name of the TRACE_EVENT_TEMPLATE to use. The name is the
      name of the trace point. The parameters proto and args must be the same
      as the proto and args of the template. If they are not the same,
      then a compile error will result. I tried hard removing this duplication
      but the C preprocessor is not powerful enough (or my CPP magic
      experience points is not at a high enough level) to not need them.
      
      A lot of trace events are coming in with new XFS development. Most of
      the trace points are identical except for the name. The following shows
      the advantage of having TRACE_EVENT_TEMPLATE:
      
      $ size fs/xfs/xfs.o.*
          text          data     bss     dec     hex filename
        452114          2788    3520  458422   6feb6 fs/xfs/xfs.o.old
        638482         38116    3744  680342   a6196 fs/xfs/xfs.o.template
        996954         38116    4480 1039550   fdcbe fs/xfs/xfs.o.trace
      
      xfs.o.old is without any tracepoints.
      xfs.o.template uses the new TRACE_EVENT_TEMPLATE.
      xfs.o.trace uses the current TRACE_EVENT macros.
      Requested-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ff038f5c
  5. 23 Nov, 2009 1 commit
  6. 22 Nov, 2009 1 commit
    • Frederic Weisbecker's avatar
      tracing: Use the perf recursion protection from trace event · ce71b9df
      Frederic Weisbecker authored
      
      
      When we commit a trace to perf, we first check if we are
      recursing in the same buffer so that we don't mess-up the buffer
      with a recursing trace. But later on, we do the same check from
      perf to avoid commit recursion. The recursion check is desired
      early before we touch the buffer but we want to do this check
      only once.
      
      Then export the recursion protection from perf and use it from
      the trace events before submitting a trace.
      
      v2: Put appropriate Reported-by tag
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <1258864015-10579-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ce71b9df
  7. 14 Nov, 2009 1 commit
    • Johannes Berg's avatar
      tracing: Fix event format export · 811cb50b
      Johannes Berg authored
      
      
      For some reason the export of the event print format to userspace
      uses '#fmt' which breaks if the format string is anything but a plain
      string, for example if it is built with macros then the macro names
      are exported instead of their contents.
      
      Use
      	"\"%s\"", fmt
      instead of
      	"%s", #fmt
      to export the string and not the way it is built.
      
      For example, in net/mac80211/driver-trace.h for the trace event drv_start
      there is:
      
              TP_printk(
                      LOCAL_PR_FMT, LOCAL_PR_ARG
              )
      
      Which use to produce:
      
       print fmt: LOCAL_PR_FMT, REC->wiphy_name
      
      Now produces:
      
       print fmt: "%s", REC->wiphy_name
      Signed-off-by: default avatarJohannes Berg <johannes@sipsolutions.net>
      LKML-Reference: <20091113224009.GB23942@elte.hu>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      811cb50b
  8. 08 Nov, 2009 1 commit
    • Frederic Weisbecker's avatar
      tracing, perf_events: Protect the buffer from recursion in perf · 444a2a3b
      Frederic Weisbecker authored
      
      
      While tracing using events with perf, if one enables the
      lockdep:lock_acquire event, it will infect every other perf
      trace events.
      
      Basically, you can enable whatever set of trace events through
      perf but if this event is part of the set, the only result we
      can get is a long list of lock_acquire events of rcu read lock,
      and only that.
      
      This is because of a recursion inside perf.
      
      1) When a trace event is triggered, it will fill a per cpu
         buffer and submit it to perf.
      
      2) Perf will commit this event but will also protect some data
         using rcu_read_lock
      
      3) A recursion appears: rcu_read_lock triggers a lock_acquire
         event that will fill the per cpu event and then submit the
         buffer to perf.
      
      4) Perf detects a recursion and ignores it
      
      5) Perf continues its work on the previous event, but its buffer
         has been overwritten by the lock_acquire event, it has then
         been turned into a lock_acquire event of rcu read lock
      
      Such scenario also happens with lock_release with
      rcu_read_unlock().
      
      We could turn the rcu_read_lock() into __rcu_read_lock() to drop
      the lock debugging from perf fast path, but that would make us
      lose the rcu debugging and that doesn't prevent from other
      possible kind of recursion from perf in the future.
      
      This patch adds a recursion protection based on a counter on the
      perf trace per cpu buffers to solve the problem.
      
      -v2: Fixed lost whitespace, added reviewed-by tag
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <1257477185-7838-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      444a2a3b
  9. 06 Oct, 2009 1 commit
  10. 21 Sep, 2009 1 commit
    • Ingo Molnar's avatar
      perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar authored
      
      
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has grown out its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring, analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in one, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: default avatarStephane Eranian <eranian@google.com>
      Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarPaul Mackerras <paulus@samba.org>
      Reviewed-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      cdd6c482
  11. 18 Sep, 2009 2 commits
    • Frederic Weisbecker's avatar
      tracing: Allocate the ftrace event profile buffer dynamically · 20ab4425
      Frederic Weisbecker authored
      
      
      Currently the trace event profile buffer is allocated in the stack. But
      this may be too much for the stack, as the events can have large
      statically defined field size and can also grow with dynamic arrays.
      
      Allocate two per cpu buffer for all profiled events. The first cpu
      buffer is used to host every non-nmi context traces. It is protected
      by disabling the interrupts while writing and committing the trace.
      
      The second buffer is reserved for nmi. So that there is no race between
      them and the first buffer.
      
      The whole write/commit section is rcu protected because we release
      these buffers while deactivating the last profiling trace event.
      
      v2: Move the buffers from trace_event to be global, as pointed by
          Steven Rostedt.
      
      v3: Fix the syscall events to handle the profiling buffer races
          by disabling interrupts, now that the buffers are globals.
      Suggested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      20ab4425
    • Frederic Weisbecker's avatar
      tracing: Factorize the events profile accounting · e5e25cf4
      Frederic Weisbecker authored
      
      
      Factorize the events enabling accounting in a common tracing core
      helper. This reduces the size of the profile_enable() and
      profile_disable() callbacks for each trace events.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      e5e25cf4
  12. 14 Sep, 2009 2 commits
  13. 04 Sep, 2009 1 commit
    • Steven Rostedt's avatar
      tracing: pass around ring buffer instead of tracer · e77405ad
      Steven Rostedt authored
      
      
      The latency tracers (irqsoff and wakeup) can swap trace buffers
      on the fly. If an event is happening and has reserved data on one of
      the buffers, and the latency tracer swaps the global buffer with the
      max buffer, the result is that the event may commit the data to the
      wrong buffer.
      
      This patch changes the API to the trace recording to be recieve the
      buffer that was used to reserve a commit. Then this buffer can be passed
      in to the commit.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      e77405ad
  14. 31 Aug, 2009 1 commit
    • Li Zefan's avatar
      tracing/filters: Defer pred allocation · 8e254c1d
      Li Zefan authored
      
      
      init_preds() allocates about 5392 bytes of memory (on x86_32) for
      a TRACE_EVENT. With my config, at system boot total memory occupied
      is:
      
      	5392 * (642 + 15) == 3459KB
      
      642 == cat available_events | wc -l
      15 == number of dirs in events/ftrace
      
      That's quite a lot, so we'd better defer memory allocation util
      it's needed, that's when filter is used.
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      LKML-Reference: <4A9B8EA5.6020700@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8e254c1d
  15. 28 Aug, 2009 1 commit
    • Frederic Weisbecker's avatar
      tracing: Fix double CPP substitution in TRACE_EVENT_FN · 0dd7b747
      Frederic Weisbecker authored
      
      
      TRACE_EVENT_FN relays on TRACE_EVENT by reprocessing its parameters
      into the ftrace events CPP macro. This leads to a double substitution
      in some cases.
      
      For example, a bad consequence is a format always prefixed by
      "%s, %s\n" for every TRACE_EVENT_FN based events.
      
      Eg:
      	cat /debug/tracing/events/syscalls/sys_enter/format
      	[...]
      	print fmt: "%s, %s\n", "\"NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)\"",\
      	"REC->id, REC->args[0], REC->args[1], REC->args[2], REC->args[3],\
      	REC->args[4], REC->args[5]"
      
      This creates a failure in post-processing tools such as perf trace or
      trace-cmd.
      
      Then drop this double substitution and replace it by a new __cpparg()
      macro that relays CPP arguments containing commas.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Josh Stone <jistone@redhat.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      LKML-Reference: <1251413406-6704-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      0dd7b747
  16. 26 Aug, 2009 2 commits
    • Masami Hiramatsu's avatar
      tracing: Ftrace dynamic ftrace_event_call support · bd1a5c84
      Masami Hiramatsu authored
      Add dynamic ftrace_event_call support to ftrace. Trace engines can add
      new ftrace_event_call to ftrace on the fly. Each operator function of
      the call takes an ftrace_event_call data structure as an argument,
      because these functions may be shared among several ftrace_event_calls.
      
      Changes from v13:
       - Define remove_subsystem_dir() always (revirt a2ca5e03
      
      ), because
         trace_remove_event_call() uses it.
       - Modify syscall tracer because of ftrace_event_call change.
      
      [fweisbec@gmail.com: Fixed conflict against latest tracing/core]
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@redhat.com>
      Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Frank Ch. Eigler <fche@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Jim Keniston <jkenisto@us.ibm.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      LKML-Reference: <20090813203453.31965.71901.stgit@localhost.localdomain>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      bd1a5c84
    • Li Zefan's avatar
      tracing/filters: Add __field_ext() to TRACE_EVENT · 43b51ead
      Li Zefan authored
      
      
      Add __field_ext(), so a field can be assigned to a specific
      filter_type, which matches a corresponding filter function.
      
      For example, a later patch will allow this:
      	__field_ext(const char *, str, FILTER_PTR_STR);
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9272.60507095
      
      @cn.fujitsu.com>
      
      [
        Fixed a -1 to FILTER_OTHER
        Forward ported to latest kernel.
      ]
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      43b51ead
  17. 25 Aug, 2009 1 commit
    • Josh Stone's avatar
      tracing: Move tracepoint callbacks from declaration to definition · 97419875
      Josh Stone authored
      
      
      It's not strictly correct for the tracepoint reg/unreg callbacks to
      occur when a client is hooking up, because the actual tracepoint may not
      be present yet.  This happens to be fine for syscall, since that's in
      the core kernel, but it would cause problems for tracepoints defined in
      a module that hasn't been loaded yet.  It also means the reg/unreg has
      to be EXPORTed for any modules to use the tracepoint (as in SystemTap).
      
      This patch removes DECLARE_TRACE_WITH_CALLBACK, and instead introduces
      DEFINE_TRACE_FN which stores the callbacks in struct tracepoint.  The
      callbacks are used now when the active state of the tracepoint changes
      in set_tracepoint & disable_tracepoint.
      
      This also introduces TRACE_EVENT_FN, so ftrace events can also provide
      registration callbacks if needed.
      Signed-off-by: default avatarJosh Stone <jistone@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      LKML-Reference: <1251150194-1713-4-git-send-email-jistone@redhat.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      97419875
  18. 19 Aug, 2009 2 commits
  19. 11 Aug, 2009 2 commits
    • Frederic Weisbecker's avatar
      tracing: Add ftrace event call parameter to its field descriptor handler · e8f9f4d7
      Frederic Weisbecker authored
      
      
      Add the struct ftrace_event_call as a parameter of its show_format()
      callback. This way we can use it from the syscall trace events to
      retrieve the syscall name from the ftrace event call parameter and
      describe its fields using the syscalls metadata.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      e8f9f4d7
    • Jason Baron's avatar
      tracing: Add ftrace_event_call void * 'data' field · 69fd4f0e
      Jason Baron authored
      
      
      add an optional void * pointer to 'ftrace_event_call' that is
      passed in for regfunc and unregfunc.
      
      This prepares for syscall tracepoints creation by passing the name of
      the syscall we want to trace and then retrieve its number through our
      arch syscall table.
      Signed-off-by: default avatarJason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      69fd4f0e
  20. 10 Aug, 2009 3 commits
    • Frederic Weisbecker's avatar
      perf_counter: Zero dead bytes from ftrace raw samples size alignment · 1853db0e
      Frederic Weisbecker authored
      
      
      After aligning the ftrace raw samples, there are dead bytes storing
      random data from the stack. We don't want to leak these to userspace,
      then zero these out.
      
      Before:
      
      	0x2de88 [0x50]: event: 9
      	.
      	. ... raw event: size 80 bytes
      	.  0000:  09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff  ......P........
      	.  0010:  68 01 00 00 68 01 00 00 2c 00 00 00 00 00 00 00  h...h...,......
      	.  0020:  2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00  ,...+...h...h..
      	.  0030:  6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00  kondemand/0....
      	.  0040:  68 01 00 00 40 7f 46 81 ff ff ff ff 00 10 1b 7f  h...@.F........
                                                            ^  ^  ^  ^
                                                               Leak
      
      After:
      
      	0x2d318 [0x50]: event: 9
      	.
      	. ... raw event: size 80 bytes
      	.  0000:  09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff  ......P........
      	.  0010:  68 01 00 00 68 01 00 00 68 14 00 00 00 00 00 00  h...h...h......
      	.  0020:  2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00  ,...+...h...h..
      	.  0030:  6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00  kondemand/0....
      	.  0040:  68 01 00 00 a0 80 46 81 ff ff ff ff 00 00 00 00  h.....F........
                                                            ^  ^  ^  ^
      							 Fixed
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249915116-5210-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      1853db0e
    • Frederic Weisbecker's avatar
      perf_counter: Subtract the buffer size field from the event record size · 304703ab
      Frederic Weisbecker authored
      
      
      We compute the perf raw sample size by aligning the raw ftrace
      event size plus the buffer size field itself. We do that
      instead of aligning only the perf raw sample size, so that we
      might economize some in some cases.
      
      But this buffer size field is not stored in the perf raw
      sample, we must then substract its size from the buffer once we
      computed the alignment unless we may get a useless u32 field in
      the buffer.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090810141129.GA5124@nowhere>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      304703ab
    • Peter Zijlstra's avatar
      perf_counter: Correct PERF_SAMPLE_RAW output · a044560c
      Peter Zijlstra authored
      
      
      PERF_SAMPLE_* output switches should unconditionally output the
      correct format, as they are the only way to unambiguously parse
      the PERF_EVENT_SAMPLE data.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249896447.17467.74.camel@twins>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a044560c
  21. 09 Aug, 2009 2 commits
    • Frederic Weisbecker's avatar
      perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker authored
      
      
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback brings
         some costs however if someone wants no such sampling to
         occur, and needs to be fixed in the future. For that we need
         to have an instant access to the perf counter attribute.
         This is a matter of a flag to add in the struct ftrace_event.
      
       - Take care of the events recursivity! Don't ever try to record
         a lock event for example, it seems some locking is used in
         the profiling fast path and lead to a tracing recursivity.
         That will be fixed using raw spinlock or recursivity
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f413cdb8
    • Peter Zijlstra's avatar
      perf_counter, ftrace: Fix perf_counter integration · 3a659305
      Peter Zijlstra authored
      
      
      Adds possible second part to the assign argument of TP_EVENT().
      
        TP_perf_assign(
      	__perf_count(foo);
      	__perf_addr(bar);
        )
      
      Which, when specified make the swcounter increment with @foo instead
      of the usual 1, and report @bar for PERF_SAMPLE_ADDR (data address
      associated with the event) when this triggers a counter overflow.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3a659305
  22. 20 Jul, 2009 2 commits
  23. 10 Jun, 2009 1 commit
    • Steven Rostedt's avatar
      tracing: do not translate event helper macros in print format · 6ff9a64d
      Steven Rostedt authored
      
      
      By moving the macro that creates the print format code above the
      defining of the event macro helpers (__get_str, __print_symbolic,
      and __get_dynamic_array), we get a little cleaner print format.
      
      Instead of:
      
        (char *)((void *)REC + REC->__data_loc_name)
      
      we get:
      
         __get_str(name)
      
      Instead of:
      
         ({ static const struct trace_print_flags symbols[] = { { HI_SOFTIRQ, "HI" }, {
      
      we get:
      
         __print_symbolic(REC->vec, { HI_SOFTIRQ, "HI" }, {
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      6ff9a64d
  24. 03 Jun, 2009 1 commit
    • Steven Whitehouse's avatar
      tracing: fix multiple use of __print_flags and __print_symbolic · 56d8bd3f
      Steven Whitehouse authored
      
      
      Here is an updated patch to include the extra call to
      trace_seq_init() as requested. This is vs. the latest
      -tip tree and fixes the use of multiple __print_flags
      and __print_symbolic in a single tracer. Also tested
      to ensure its working now:
      
      mount.gfs2-2534  [000]   235.850587: gfs2_glock_queue: 8.7 glock 1:2 dequeue PR
      mount.gfs2-2534  [000]   235.850591: gfs2_demote_rq: 8.7 glock 1:0 demote EX to NL flags:DI
      mount.gfs2-2534  [000]   235.850591: gfs2_glock_queue: 8.7 glock 1:0 dequeue EX
      glock_workqueue-2529  [000]   235.850666: gfs2_glock_state_change: 8.7 glock 1:0 state EX => NL tgt:NL dmt:NL flags:lDpI
      glock_workqueue-2529  [000]   235.850672: gfs2_glock_put: 8.7 glock 1:0 state NL => IV flags:I
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      LKML-Reference: <1244037123.29604.603.camel@localhost.localdomain>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      56d8bd3f
  25. 02 Jun, 2009 2 commits
    • Li Zefan's avatar
      tracing/events: introduce __dynamic_array() · 7fcb7c47
      Li Zefan authored
      
      
      __string() is limited:
      
        - it's a char array, but we may want to define array with other types
        - a source string should be available, but we may just know the string size
      
      We introduce __dynamic_array() to break those limitations, and __string()
      becomes a wrapper of it. As a side effect, now __get_str() can be used
      in TP_fast_assign but not only TP_print.
      
      Take XFS for example, we have the string length in the dirent, but the
      string itself is not NULL-terminated, so __dynamic_array() can be used:
      
      TRACE_EVENT(xfs_dir2,
      	TP_PROTO(struct xfs_da_args *args),
      	TP_ARGS(args),
      
      	TP_STRUCT__entry(
      		__field(int, namelen)
      		__dynamic_array(char, name, args->namelen + 1)
      		...
      	),
      
      	TP_fast_assign(
      		char *name = __get_str(name);
      
      		if (args->namelen)
      			memcpy(name, args->name, args->namelen);
      		name[args->namelen] = '\0';
      
      		__entry->namelen = args->namelen;
      	),
      
      	TP_printk("name %.*s namelen %d",
      		  __entry->namelen ? __get_str(name) : NULL
      		  __entry->namelen)
      );
      
      [ Impact: allow defining dynamic size arrays ]
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2384D2.3080403@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      7fcb7c47
    • Li Zefan's avatar
      tracing/events: put TP_fast_assign into braces · a9c1c3ab
      Li Zefan authored
      
      
      Currently TP_fast_assign has a limitation that we can't define local
      variables in it.
      
      Here's one use case when we introduce __dynamic_array():
      
      TP_fast_assign(
      	type *p = __get_dynamic_array(item);
      
      	foo(p);
      	bar(p);
      ),
      
      [ Impact: allow defining local variables in TP_fast_assign ]
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A2384B1.90100@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a9c1c3ab