1. 22 Dec, 2013 1 commit
    • tracing: Add and use generic set_trigger_filter() implementation · bac5fb97
      Tom Zanussi authored
      Add a generic event_command.set_trigger_filter() op implementation and
      have the current set of trigger commands use it - this essentially
      gives them all support for filters.
      Syntactically, filters are supported by adding 'if <filter>' just
      after the command, in which case only events matching the filter will
      invoke the trigger.  For example, to add a filter to an
      enable/disable_event command:
          echo 'enable_event:system:event if common_pid == 999' > \
      The above command will only enable the system:event event if the
      common_pid field in the othersys:otherevent event is 999.
      As another example, to add a filter to a stacktrace command:
          echo 'stacktrace if common_pid == 999' > \
      The above command will only trigger a stacktrace if the common_pid
      field in the event is 999.
      The filter syntax is the same as that described in the 'Event
      filtering' section of Documentation/trace/events.txt.
      Because triggers can now use filters, the trigger-invoking logic needs
      to be moved in those cases - e.g. for ftrace_raw_event_calls, if a
      trigger has a filter associated with it, the trigger invocation now
      needs to happen after the { assign; } part of the call, in order for
      the trigger condition to be tested.
      There's still a SOFT_DISABLED-only check at the top of e.g. the
      ftrace_raw_events function, so when an event is soft disabled but not
      because of the presence of a trigger, the original SOFT_DISABLED
      behavior remains unchanged.
      There's also a bit of trickiness in that some triggers need to avoid
      being invoked while an event is currently in the process of being
      logged, since the trigger may itself log data into the trace buffer.
      Thus we make sure the current event is committed before invoking those
      triggers.  To do that, we split the trigger invocation in two - the
      first part (event_triggers_call()) checks the filter using the current
      trace record; if a command has the post_trigger flag set, it sets a
      bit for itself in the return value, otherwise it directly invokes the
      trigger.  Once all commands have been either invoked or set their
      return flag, event_triggers_call() returns.  The current record is
      then either committed or discarded; if any commands have deferred
      their triggers, those commands are finally invoked following the close
      of the current event by event_triggers_post_call().
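      The two-phase invocation described above can be sketched in plain
      userspace C. This is only an illustration under stated assumptions:
      struct record, struct trigger, triggers_call() and
      triggers_post_call() are hypothetical stand-ins, not the kernel's
      actual definitions of event_triggers_call() and
      event_triggers_post_call().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct record {
    int common_pid;
};

struct trigger {
    unsigned int type_bit;                     /* unique bit per command */
    bool post_trigger;                         /* run only after the record is closed */
    bool (*filter)(const struct record *rec);  /* NULL means "always match" */
    void (*func)(struct trigger *t);           /* the trigger action */
    int hits;                                  /* demo counter bumped by count_hit() */
};

/* Phase 1: test each filter against the still-open record, then either
 * run the trigger immediately or mark it as deferred in the return value. */
unsigned int triggers_call(struct trigger *trigs, size_t n,
                           const struct record *rec)
{
    unsigned int deferred = 0;

    for (size_t i = 0; i < n; i++) {
        struct trigger *t = &trigs[i];

        if (t->filter && !t->filter(rec))
            continue;               /* filter did not match */
        if (t->post_trigger)
            deferred |= t->type_bit;
        else
            t->func(t);
    }
    return deferred;
}

/* Phase 2: called once the record has been committed or discarded. */
void triggers_post_call(struct trigger *trigs, size_t n,
                        unsigned int deferred)
{
    for (size_t i = 0; i < n; i++)
        if (trigs[i].type_bit & deferred)
            trigs[i].func(&trigs[i]);
}

/* Example filter and action used to exercise the sketch. */
bool pid_is_999(const struct record *rec)
{
    return rec->common_pid == 999;
}

void count_hit(struct trigger *t)
{
    t->hits++;
}
```

      A caller would invoke triggers_call() while the record is open,
      commit or discard the record, and then pass the returned mask to
      triggers_post_call().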
      To simplify the above and make it more efficient, the TRIGGER_COND bit
      is introduced, which is set only if a soft-disabled trigger needs to
      use the log record for filter testing or needs to wait until the
      current log record is closed.
      The syscall event invocation code is also changed in analogous ways.
      Because event triggers need to be able to create and free filters,
      this also adds a couple external wrappers for the existing
      create_filter and free_filter functions, which are too generic to be
      made extern functions themselves.
      Link: http://lkml.kernel.org/r/7164930759d8719ef460357f143d995406e4eead.1382622043.git.tom.zanussi@linux.intel.com
      Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 20 Dec, 2013 1 commit
    • tracing: Add basic event trigger framework · 85f2b082
      Tom Zanussi authored
      Add a 'trigger' file for each trace event, enabling 'trace event
      triggers' to be set for trace events.
      'trace event triggers' are patterned after the existing 'ftrace
      function triggers' implementation except that triggers are written to
      per-event 'trigger' files instead of to a single file such as the
      'set_ftrace_filter' used for ftrace function triggers.
      The implementation is meant to be entirely separate from ftrace
      function triggers, in order to keep the respective implementations
      relatively simple and to allow them to diverge.
      The event trigger functionality is built on top of SOFT_DISABLE
      functionality.  It adds a TRIGGER_MODE bit to the ftrace_event_file
      flags which is checked when any trace event fires.  Triggers set for a
      particular event need to be checked regardless of whether that event
      is actually enabled or not - getting an event to fire even if it's not
      enabled is what's already implemented by SOFT_DISABLE mode, so trigger
      mode directly reuses that.  Event triggers essentially inherit the soft
      disable logic in __ftrace_event_enable_disable() while adding a bit of
      logic and trigger reference counting via tm_ref on top of that in a
      new trace_event_trigger_enable_disable() function.  Because the base
      __ftrace_event_enable_disable() code now needs to be invoked from
      outside trace_events.c, a wrapper is also added for those usages.
      The triggers for an event are actually invoked via a new function,
      event_triggers_call(), and code is also added to invoke them for
      ftrace_raw_event calls as well as syscall events.
      The main part of the patch creates a new trace_events_trigger.c file
      to contain the trace event triggers implementation.
      The standard open, read, and release file operations are implemented
      The open() implementation sets up for the various open modes of the
      'trigger' file.  It creates and attaches the trigger iterator and sets
      up the command parser.  If opened for reading set up the trigger
      The read() implementation parses the event trigger written to the
      'trigger' file, looks up the trigger command, and passes it along to
      that event_command's func() implementation for command-specific
      The release() implementation does whatever cleanup is needed to
      release the 'trigger' file, like releasing the parser and trigger
      iterator, etc.
      A couple of functions for event command registration and
      unregistration are added, along with a list to add them to and a mutex
      to protect them, as well as an (initially empty) registration function
      to add the set of commands that will be added by future commits, and
      a call to it from the trace event initialization code.
      Also added are a couple of trigger-specific data structures needed for
      these implementations such as a trigger iterator and a struct for
      trigger-specific data.
      A couple of structs consisting mostly of functions meant to be implemented
      in command-specific ways, event_command and event_trigger_ops, are
      used by the generic event trigger command implementations.  They're
      being put into trace.h alongside the other trace_event data structures
      and functions, in the expectation that they'll be needed in several
      trace_event-related files such as trace_events_trigger.c and
      The event_command.func() function is meant to be called by the trigger
      parsing code in order to add a trigger instance to the corresponding
      event.  It essentially coordinates adding a live trigger instance to
      the event, and arming the triggering event.
      Every event_command func() implementation essentially does the
      same thing for any command:
         - choose ops - use the value of param to choose either a number or
           count version of event_trigger_ops specific to the command
         - do the register or unregister of those ops
         - associate a filter, if specified, with the triggering event
      The reg() and unreg() ops allow command-specific implementations for
      event_trigger_op registration and unregistration, and the
      get_trigger_ops() op allows command-specific event_trigger_ops
      selection to be parameterized.  When a trigger instance is added, the
      reg() op essentially adds that trigger to the triggering event and
      arms it, while unreg() does the opposite.  The set_filter() function
      is used to associate a filter with the trigger - if the command
      doesn't specify a set_filter() implementation, the command will ignore
      Each command has an associated trigger_type, which serves double duty,
      both as a unique identifier for the command as well as a value that
      can be used for setting a trigger mode bit during trigger invocation.
      The signature of func() adds a pointer to the event_command struct,
      used to invoke those functions, along with a command_data param that
      can be passed to the reg/unreg functions.  This allows func()
      implementations to use command-specific blobs and supports code
      The event_trigger_ops.func() command corresponds to the trigger 'probe'
      function that gets called when the triggering event is actually
      invoked.  The other functions are used to list the trigger when
      needed, along with a couple mundane book-keeping functions.
      This also moves event_file_data() into trace.h so it can be used
      outside of trace_events.c.
      Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com
      Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
      Idea-by: Steve Rostedt <rostedt@goodmis.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  3. 26 Nov, 2013 1 commit
    • tracing: Allow events to have NULL strings · 4e58e547
      Steven Rostedt (Red Hat) authored
      If a TRACE_EVENT() uses __assign_str() or __get_str() on a NULL pointer
      then the following oops will happen:
      BUG: unable to handle kernel NULL pointer dereference at   (null)
      IP: [<c127a17b>] strlen+0x10/0x1a
      *pde = 00000000
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc1-test+ #2
      Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
      task: f5cde9f0 ti: f5e5e000 task.ti: f5e5e000
      EIP: 0060:[<c127a17b>] EFLAGS: 00210046 CPU: 1
      EIP is at strlen+0x10/0x1a
      EAX: 00000000 EBX: c2472da8 ECX: ffffffff EDX: c2472da8
      ESI: c1c5e5fc EDI: 00000000 EBP: f5e5fe84 ESP: f5e5fe80
       DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      CR0: 8005003b CR2: 00000000 CR3: 01f32000 CR4: 000007d0
       f5f18b90 f5e5feb8 c10687a8 0759004f 00000005 00000005 00000005 00200046
       00000002 00000000 c1082a93 f56c7e28 c2472da8 c1082a93 f5e5fee4 c106bc61
       00000000 c1082a93 00000000 00000000 00000001 00200046 00200082 00000000
      Call Trace:
       [<c10687a8>] ftrace_raw_event_lock+0x39/0xc0
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c106bc61>] lock_release+0x57/0x1a5
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c10824dd>] read_seqcount_begin.constprop.7+0x4d/0x75
       [<c1082a93>] ? ktime_get+0x29/0x69
       [<c1082a93>] ktime_get+0x29/0x69
       [<c108a46a>] __tick_nohz_idle_enter+0x1e/0x426
       [<c10690e8>] ? lock_release_holdtime.part.19+0x48/0x4d
       [<c10bc184>] ? time_hardirqs_off+0xe/0x28
       [<c1068c82>] ? trace_hardirqs_off_caller+0x3f/0xaf
       [<c108a8cb>] tick_nohz_idle_enter+0x59/0x62
       [<c1079242>] cpu_startup_entry+0x64/0x192
       [<c102299c>] start_secondary+0x277/0x27c
      Code: 90 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 66 66 66 66 90 83 c9 ff 89 c7 31 c0 <f2> ae f7 d1 8d 41 ff 5f 5d c3 55 89 e5 57 66 66 66 66 90 31 ff
      EIP: [<c127a17b>] strlen+0x10/0x1a SS:ESP 0068:f5e5fe80
      CR2: 0000000000000000
      ---[ end trace 01bc47bf519ec1b2 ]---
      New tracepoints have been added that have allowed for NULL pointers
      being assigned to strings. To fix this, change the TRACE_EVENT() code
      to check for NULL and if it is, it will assign "(null)" to it instead
      (similar to what glibc printf does).
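      The NULL substitution can be sketched in userspace C as follows.
      This is a minimal illustration, not the TRACE_EVENT() macro code
      itself: SAFE_STR(), event_str_len() and event_assign_str() are
      hypothetical names chosen for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: substitute "(null)" for a NULL source string.
 * Note that the argument is evaluated twice. */
#define SAFE_STR(src) ((src) ? (const char *)(src) : "(null)")

/* Space needed to store the string in the record, including the NUL.
 * Calling strlen() directly on a NULL pointer is what caused the oops. */
size_t event_str_len(const char *src)
{
    return strlen(SAFE_STR(src)) + 1;
}

/* Copy the (possibly substituted) string into the record buffer. */
void event_assign_str(char *dst, size_t dst_size, const char *src)
{
    snprintf(dst, dst_size, "%s", SAFE_STR(src));
}
```

      With this, event_assign_str(buf, sizeof(buf), NULL) leaves "(null)"
      in buf instead of dereferencing the NULL pointer.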
      Reported-by: Shuah Khan <shuah.kh@samsung.com>
      Reported-by: Jovi Zhangwei <jovi.zhangwei@gmail.com>
      Link: http://lkml.kernel.org/r/CAGdX0WFeEuy+DtpsJzyzn0343qEEjLX97+o1VREFkUEhndC+5Q@mail.gmail.com
      Link: http://lkml.kernel.org/r/528D6972.9010702@samsung.com
      Fixes: 9cbf1176 ("tracing/events: provide string with undefined size support")
      Cc: stable@vger.kernel.org # 2.6.31+
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  4. 19 Nov, 2013 1 commit
  5. 05 Nov, 2013 1 commit
    • tracing: Update event filters for multibuffer · f306cc82
      Tom Zanussi authored
      The trace event filters are still tied to event calls rather than
      event files, which means you don't get what you'd expect when using
      filters in the multibuffer case:
        # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        bytes_alloc > 8192
        # mkdir /sys/kernel/debug/tracing/instances/test1
        # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
        # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        bytes_alloc > 2048
        # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
        bytes_alloc > 2048
      Setting the filter in tracing/instances/test1/events shouldn't affect
      the same event in tracing/events as it does above.
        # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        bytes_alloc > 8192
        # mkdir /sys/kernel/debug/tracing/instances/test1
        # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
        # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        bytes_alloc > 8192
        # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
        bytes_alloc > 2048
      We'd like to just move the filter directly from ftrace_event_call to
      ftrace_event_file, but there are a couple cases that don't yet have
      multibuffer support and therefore have to continue using the current
      event_call-based filters.  For those cases, a new USE_CALL_FILTER bit
      is added to the event_call flags, whose main purpose is to keep the
      old behavior for those cases until they can be updated with
      multibuffer support; at that point, the USE_CALL_FILTER flag (and the
      new associated call_filter_check_discard() function) can go away.
      The multibuffer support also made filter_current_check_discard()
      redundant, so this change removes that function as well and replaces
      it with filter_check_discard() (or call_filter_check_discard() as
      Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.com
      Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  6. 14 Aug, 2013 3 commits
  7. 19 Jul, 2013 1 commit
  8. 21 Jun, 2013 1 commit
    • tracing: Add DEFINE_EVENT_FN() macro · f5abaa1b
      Steven Rostedt authored
      Each TRACE_EVENT() adds several helper functions. If two or more trace events
      share the same structure and print format, they can also share most of these
      helper functions and save a lot of space from duplicate code. This is why the
      DECLARE_EVENT_CLASS() and DEFINE_EVENT() were created.
      Some events require a trigger to be called at registering and unregistering of
      the event and to do so they use TRACE_EVENT_FN().
      If multiple events require a trigger, they currently have no choice but to use
      TRACE_EVENT_FN() as there's no DEFINE_EVENT_FN() available. This unfortunately
      causes a lot of wasted duplicate code to be created.
      By adding a DEFINE_EVENT_FN(), these events can still use a
      DECLARE_EVENT_CLASS() and then define their own triggers.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/51C3236C.8030508@hds.com
      Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
  9. 13 Apr, 2013 1 commit
  10. 15 Mar, 2013 6 commits
    • tracing: Add a way to soft disable trace events · 417944c4
      Steven Rostedt (Red Hat) authored
      In order to let triggers enable or disable events, we need a 'soft'
      method for doing so. For example, if a function probe is added that
      lets a user enable or disable events when a function is called, that
      change must be done without taking locks or a mutex, and definitely
      it can't sleep. But the full enabling of a tracepoint is expensive.
      By adding a 'SOFT_DISABLE' flag, and converting the flags to be updated
      without the protection of a mutex (using set/clear_bit()), this soft
      disable flag can be used to allow critical sections to enable or disable
      events from being traced (after the event has been placed into "SOFT_MODE").
      Some caveats though: The comm recorder (to map pids with a comm) can not
      be soft disabled (yet). If you disable an event with a "soft"
      disable and wait a while before reading the trace, the comm cache may be
      replaced and you'll get a bunch of <...> for comms in the trace.
      Reading the "enable" file for an event that is disabled will now give
      you "0*" where the '*' denotes that the tracepoint is still active but
      the event itself is "disabled".
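      The lock-free flag handling can be sketched with C11 atomics. This
      is an illustration only: the flag names, struct event_file and the
      helper functions are hypothetical, and atomic_fetch_or() and
      atomic_fetch_and() stand in for the kernel's set_bit() and
      clear_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits, modeled on the description above. */
enum {
    EVENT_ENABLED       = 1u << 0,  /* tracepoint is registered and firing */
    EVENT_SOFT_DISABLED = 1u << 1,  /* fires, but records nothing */
};

struct event_file {
    atomic_uint flags;
};

/* Both helpers are single atomic read-modify-write operations, so they
 * need no mutex and cannot sleep. */
void event_soft_disable(struct event_file *f)
{
    atomic_fetch_or(&f->flags, EVENT_SOFT_DISABLED);
}

void event_soft_enable(struct event_file *f)
{
    atomic_fetch_and(&f->flags, ~(unsigned int)EVENT_SOFT_DISABLED);
}

/* Checked on every event hit: record only if enabled and not soft disabled. */
bool event_should_record(struct event_file *f)
{
    unsigned int flags = atomic_load(&f->flags);

    return (flags & EVENT_ENABLED) && !(flags & EVENT_SOFT_DISABLED);
}

/* What a read of the "enable" file would report in this sketch. */
const char *event_enable_string(struct event_file *f)
{
    unsigned int flags = atomic_load(&f->flags);

    if (!(flags & EVENT_ENABLED))
        return "0";
    return (flags & EVENT_SOFT_DISABLED) ? "0*" : "1";
}
```

      The expensive tracepoint registration happens once, when the event
      enters soft mode; after that, toggling the soft-disable bit is the
      only work a trigger has to do.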
      [ fixed _BIT used in & operation : thanks to Dan Carpenter and smatch ]
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix some section mismatch warnings · 523c8113
      Li Zefan authored
      As we've added __init annotation to field-defining functions, we should
      add __refdata annotation to event_call variables, which reference those
      Link: http://lkml.kernel.org/r/51343C1F.2050502@huawei.com
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Annotate event field-defining functions with __init · 7e4f44b1
      Li Zefan authored
      Those functions are called either during kernel boot or module init.
      Before:
      $ dmesg | grep 'Freeing unused kernel memory'
      Freeing unused kernel memory: 1208k freed
      Freeing unused kernel memory: 1360k freed
      Freeing unused kernel memory: 1960k freed
      After:
      $ dmesg | grep 'Freeing unused kernel memory'
      Freeing unused kernel memory: 1236k freed
      Freeing unused kernel memory: 1388k freed
      Freeing unused kernel memory: 1960k freed
      Link: http://lkml.kernel.org/r/5125877D.5000201@huawei.com
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Add a helper function for event print functions · f71130de
      Li Zefan authored
      Move duplicate code in event print functions to a helper function.
      This shrinks the size of the kernel by ~13K.
         text    data     bss     dec     hex filename
      6596137 1743966 10138672        18478775        119f6b7 vmlinux.o.old
      6583002 1743849 10138672        18465523        119c2f3 vmlinux.o.new
      Link: http://lkml.kernel.org/r/51258746.2060304@huawei.com
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Pass the ftrace_file to the buffer lock reserve code · ccb469a1
      Steven Rostedt authored
      Pass the struct ftrace_event_file *ftrace_file to the
      trace_event_buffer_lock_reserve() (new function that replaces the
      The ftrace_file holds a pointer to the trace_array that is in use.
      In the case of multiple buffers with different trace_arrays, this
      allows different events to be recorded into different buffers.
      Also fixed some of the stale comments in include/trace/ftrace.h
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Separate out trace events from global variables · ae63b31e
      Steven Rostedt authored
      The trace events for ftrace are all defined via global variables.
      The arrays of events and event systems are linked to a global list.
      This prevents multiple users of the event system (what to enable and
      what not to).
      Adding descriptors to represent the event/file relation, as well
      as to which trace_array descriptor they are associated with, allows
      for more than one set of events to be defined. Once the trace events
      files have a link between the trace event and the trace_array they
      are associated with, we can create multiple trace_arrays that can
      record separate events in separate buffers.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  11. 13 Nov, 2012 1 commit
  12. 02 Nov, 2012 1 commit
    • tracing: Use irq_work for wake ups and remove *_nowake_*() functions · 0d5c6e1c
      Steven Rostedt authored
      Have the ring buffer commit function use the irq_work infrastructure to
      wake up any waiters waiting on the ring buffer for new data. The irq_work
      was created for such a purpose, where doing the actual wake up at the
      time of adding data is too dangerous, as an event or function trace may
      be in the midst of the work queue locks and cause deadlocks. The irq_work
      will either delay the action to the next timer interrupt, or trigger an IPI
      to itself forcing an interrupt to do the work (in a safe location).
      With irq_work, all ring buffer commits can safely do wakeups, removing
      the need for the ring buffer commit "nowake" variants, which were used
      by events and function tracing. All commits can now safely use the
      normal commit, and the "nowake" variants can be removed.
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  13. 31 Jul, 2012 1 commit
  14. 28 Jun, 2012 1 commit
    • tracing/kvm: Use __print_hex() for kvm_emulate_insn tracepoint · b102f1d0
      Namhyung Kim authored
      The kvm_emulate_insn tracepoint used __print_insn()
      for printing its instructions. However it makes the
      format of the event hard to parse as it reveals TP
      Fortunately, the kernel provides __print_hex() for almost the
      same purpose, so we can use it instead of open coding
      it. User space can be changed to parse it later.
      That means raw kernel tracing will not be affected
      by this change:
       # cd /sys/kernel/debug/tracing/
       # cat events/kvm/kvm_emulate_insn/format
       name: kvm_emulate_insn
       ID: 29
       print fmt: "%x:%llx:%s (%s)%s", REC->csbase, REC->rip, __print_hex(REC->insn, REC->len), \
       __print_symbolic(REC->flags, { 0, "real" }, { (1 << 0) | (1 << 1), "vm16" }, \
       { (1 << 0), "prot16" }, { (1 << 0) | (1 << 2), "prot32" }, { (1 << 0) | (1 << 3), "prot64" }), \
       REC->failed ? " failed" : ""
       # echo 1 > events/kvm/kvm_emulate_insn/enable
       # cat trace
       # tracer: nop
       # entries-in-buffer/entries-written: 2183/2183   #P:12
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
               qemu-kvm-1782  [002] ...1   140.931636: kvm_emulate_insn: 0:c102fa25:89 10 (prot32)
               qemu-kvm-1781  [004] ...1   140.931637: kvm_emulate_insn: 0:c102fa25:89 10 (prot32)
      Link: http://lkml.kernel.org/n/tip-wfw6y3b9ugtey8snaow9nmg5@git.kernel.org
      Link: http://lkml.kernel.org/r/1340757701-10711-2-git-send-email-namhyung@kernel.org
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Namhyung Kim <namhyung.kim@lge.com>
      Cc: kvm@vger.kernel.org
      Acked-by: Avi Kivity <avi@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  15. 04 Oct, 2011 1 commit
  16. 26 May, 2011 1 commit
  17. 03 Feb, 2011 1 commit
    • tracing: Replace trace_event struct array with pointer array · e4a9ea5e
      Steven Rostedt authored
      Currently the trace_event structures are placed in the _ftrace_events
      section, and at link time, the linker makes one large array of all
      the trace_event structures. On boot up, this array is read (much like
      the initcall sections) and the events are processed.
      The problem is that there is no guarantee that gcc will place complex
      structures nicely together in an array format. Two structures in the
      same file may be placed awkwardly, because gcc has no clue that they
      are supposed to be in an array.
      A hack was used previously to force the alignment to 4, to pack the
      structures together. But this caused alignment issues with other
      architectures (sparc).
      Instead of packing the structures into an array, the structures' addresses
      are now put into the _ftrace_event section. As pointers are always the
      natural alignment, gcc should always pack them tightly together
      (otherwise initcall, extable, etc would also fail).
      By having the pointers to the structures in the section, we can still
      iterate the trace_events without causing unnecessary alignment problems
      with other architectures, or depending on the current behaviour of
      gcc that will likely change in the future just to tick us kernel developers
      off a little more.
      The _ftrace_event section is also moved into the .init.data section
      as it is now only needed at boot up.
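      The pointer-in-a-section technique can be tried out in userspace.
      The sketch below assumes GCC or Clang on an ELF target with GNU ld
      or lld, where the linker provides __start_<section> and
      __stop_<section> symbols for sections whose names are valid C
      identifiers. struct my_event, DEFINE_MY_EVENT() and the my_events
      section are hypothetical names, not the kernel's.

```c
#include <stddef.h>

/* Hypothetical event descriptor; it can be placed anywhere in memory. */
struct my_event {
    const char *name;
    int id;
};

/* Define an event and drop only a POINTER to it into the "my_events"
 * section. Pointers have natural alignment, so the linker packs them
 * into a gap-free array regardless of how the structs are laid out. */
#define DEFINE_MY_EVENT(ev, ev_id)                                  \
    static struct my_event ev = { #ev, ev_id };                     \
    static struct my_event *const ptr_##ev                          \
        __attribute__((used, section("my_events"))) = &ev

DEFINE_MY_EVENT(sched_demo, 1);
DEFINE_MY_EVENT(irq_demo, 2);

/* Section bounds provided by the linker (assumption: ELF + GNU ld/lld). */
extern struct my_event *const __start_my_events[];
extern struct my_event *const __stop_my_events[];

/* Number of pointers the linker collected into the section. */
size_t count_events(void)
{
    return (size_t)(__stop_my_events - __start_my_events);
}

/* Walk the pointer array, much like boot code walking the section. */
int sum_event_ids(void)
{
    int sum = 0;

    for (struct my_event *const *p = __start_my_events;
         p < __stop_my_events; p++)
        sum += (*p)->id;
    return sum;
}
```

      Iterating over pointers rather than over the structures themselves
      means the loop never depends on how the compiler aligned or padded
      each struct my_event.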
      Suggested-by: David Miller <davem@davemloft.net>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  18. 19 Nov, 2010 1 commit
    • tracing/events: Show real number in array fields · 04295780
      Steven Rostedt authored
      Currently we have in something like the sched_switch event:
        field:char prev_comm[TASK_COMM_LEN];	offset:12;	size:16;	signed:1;
      When a userspace tool such as perf tries to parse this, the
      TASK_COMM_LEN is meaningless. This is done because the TRACE_EVENT() macro
      simply uses a #len to show the string of the length. When the length is
      an enum, we get a string that means nothing for tools.
      By adding a static buffer and a mutex to protect it, we can store the
      string into that buffer with snprintf and show the actual number.
      Now we get:
        field:char prev_comm[16];       offset:12;      size:16;        signed:1;
      Something much more useful.
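      The difference between stringifying the length and printing its
      value can be reproduced in a few lines of C. This is a sketch with
      hypothetical names (FIELD_DESC_STRINGIFIED() and
      field_desc_numeric()); it writes into a caller-supplied buffer
      rather than the mutex-protected static buffer mentioned above.

```c
#include <stdio.h>

/* An enum constant, as in the kernel: the preprocessor cannot expand it. */
enum { TASK_COMM_LEN = 16 };

/* Old approach: '#len' stringifies whatever token was passed in, so an
 * enum name ends up verbatim in the field description. */
#define FIELD_DESC_STRINGIFIED(type, item, len) \
    #type " " #item "[" #len "]"

/* New approach: format the evaluated length at run time with snprintf(). */
int field_desc_numeric(char *buf, size_t size, const char *type,
                       const char *item, int len)
{
    return snprintf(buf, size, "%s %s[%d]", type, item, len);
}
```

      FIELD_DESC_STRINGIFIED(char, prev_comm, TASK_COMM_LEN) yields
      "char prev_comm[TASK_COMM_LEN]", while field_desc_numeric() called
      with TASK_COMM_LEN produces "char prev_comm[16]".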
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  19. 18 Nov, 2010 2 commits
    • tracing: Allow syscall trace events for non privileged users · 53cf810b
      Frederic Weisbecker authored
      As for the raw syscalls events, individual syscall events won't
      leak system wide information on task bound tracing. Allow non
      privileged users to use them in such workflow.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
    • tracing: New macro to set up initial event flags value · 1ed0c597
      Frederic Weisbecker authored
      This introduces the new TRACE_EVENT_FLAGS() macro in order
      to set up initial event flags value.
      This macro must simply follow the definition of a trace event
      and take the event name and the flag value as parameters:
      TRACE_EVENT(my_event, .....
      TRACE_EVENT_FLAGS(my_event, 1)
      This will set up 1 as the initial my_event->flags value.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
  20. 01 Aug, 2010 1 commit
  21. 21 Jul, 2010 1 commit
    • tracing: Reduce latency and remove percpu trace_seq · bc289ae9
      Lai Jiangshan authored
      __print_flags() and __print_symbolic() use percpu trace_seq:
      1) Its memory is allocated at compile time, so it wastes memory if we don't use tracing.
      2) It is percpu data, so it wastes even more memory on multi-CPU systems.
      3) It disables preemption when it executes its core routine
         "trace_seq_printf(s, "%s: ", #call);" and introduces latency.
      So we move this trace_seq to struct trace_iterator.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      LKML-Reference: <4C078350.7090106@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  22. 29 Jun, 2010 1 commit
  23. 08 Jun, 2010 1 commit
    • perf: Drop the skip argument from perf_arch_fetch_regs_caller · b0f82b81
      Frederic Weisbecker authored
      Drop this argument now that we always want to rewind only to the
      state of the first caller.
      It means frame pointers are not necessary anymore to reliably get
      the source of an event. But this also means we need this helper
      to be a macro now, as an inline function is not an option since
      we need to know when to provide a default implementation.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
  24. 31 May, 2010 1 commit
  25. 25 May, 2010 1 commit
    • tracing: Add __used annotation to event variable · 49c17746
      Steven Rostedt authored
      The TRACE_EVENT() macros automate creation of trace events. To automate
      initialization, the set up variables are loaded in a special section
      that is read on boot up. GCC is not aware that these static variables
      are used and will complain about them if we do not inform GCC that
      they are indeed used.
      One of the declarations of the event element was missing a __used
      annotation. This patch adds it.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
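
The mechanism this commit relies on can be sketched in user space. The example below is a minimal illustration, assuming GCC or Clang on an ELF target linked with GNU ld; every `demo_` name and the `demo_events` section are hypothetical, not the kernel's own definitions. For simplicity the sketch stores pointers to the events in the section so that the entries are packed without padding. Nothing refers to those pointers by name, which is why the `used` attribute (spelled `__used` in the kernel) is needed to keep the compiler from warning about them or discarding them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a trace event record (not the kernel's type). */
struct demo_event {
	const char *name;
	int id;
};

/*
 * DEMO_EVENT() defines an event and a pointer to it that lives in the
 * "demo_events" section. No C code names the pointer, so the `used`
 * attribute tells the compiler to keep it and not complain about it.
 */
#define DEMO_EVENT(ident, num)                                           \
	static struct demo_event demo_event_##ident = { #ident, (num) };  \
	static struct demo_event *demo_ptr_##ident                        \
	__attribute__((used, section("demo_events"))) = &demo_event_##ident

DEMO_EVENT(sched_switch, 1);
DEMO_EVENT(irq_entry, 2);

/* GNU ld defines these bounds for sections whose names are C identifiers. */
extern struct demo_event *__start_demo_events[];
extern struct demo_event *__stop_demo_events[];

/* Walk the section, as start-up code would, and count the entries. */
size_t demo_count_events(void)
{
	struct demo_event **p;
	size_t n = 0;

	for (p = __start_demo_events; p < __stop_demo_events; p++) {
		assert(*p != NULL);
		n++;
	}
	return n;
}

/* Sum the ids of all registered events to show the data is reachable. */
int demo_sum_ids(void)
{
	struct demo_event **p;
	int sum = 0;

	for (p = __start_demo_events; p < __stop_demo_events; p++)
		sum += (*p)->id;
	return sum;
}
```

Calling `demo_count_events()` from a program's `main()` walks every event registered with `DEMO_EVENT()` in that program, without any central list of them in the C source.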
  26. 21 May, 2010 2 commits
  27. 18 May, 2010 1 commit
  28. 14 May, 2010 4 commits
    • Steven Rostedt's avatar
      tracing: Remove duplicate id information in event structure · 32c0edae
      Steven Rostedt authored
      Now that the trace_event structure is embedded in the ftrace_event_call
      structure, there is no need for the ftrace_event_call id field.
      The id field is the same as the trace_event type field.
      Removing the id and re-arranging the structure brings down the tracepoint
      footprint by another 5K.
         text	   data	    bss	    dec	    hex	filename
      4913961	1088356	 861512	6863829	 68bbd5	vmlinux.orig
      4895024	1023812	 861512	6780348	 6775bc	vmlinux.print
      4894944	1018052	 861512	6774508	 675eec	vmlinux.id
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • Steven Rostedt's avatar
      tracing: Move print functions into event class · 80decc70
      Steven Rostedt authored
      Currently, every event has its own trace_event structure. This is
      fine since the structure is needed anyway. But the print function
      structure (trace_event_functions) is now separate. Since the output
      of the trace event is done by the class (with the exception of events
      defined by DEFINE_EVENT_PRINT), it makes sense to have the class
      define the print functions that all events in the class can use.
      This matters most for the syscall events, since all syscall events
      use the same class. The savings here are another 30K.
         text	   data	    bss	    dec	    hex	filename
      4913961	1088356	 861512	6863829	 68bbd5	vmlinux.orig
      4900382	1048964	 861512	6810858	 67ecea	vmlinux.init
      4900446	1049028	 861512	6810986	 67ed6a	vmlinux.preprint
      4895024	1023812	 861512	6780348	 6775bc	vmlinux.print
      To accomplish this, and to let the class know what event is being
      printed, the event structure is embedded in the ftrace_event_call
      structure. This should not be an issue since the event structure
      was created for each event anyway.
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
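
Embedding the event structure is what lets a class-wide print function find out which event it was handed. The sketch below illustrates the idea with a `container_of`-style macro; all `sketch_` names, fields, and the output format are assumptions made for illustration, not the kernel's actual structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Recover the enclosing structure from a pointer to one of its members. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_trace_event;

/* Print callbacks, defined once per class and shared by its events. */
struct sketch_event_functions {
	int (*trace)(struct sketch_trace_event *event, char *buf, size_t len);
};

struct sketch_trace_event {
	int type;
	struct sketch_event_functions *funcs;
};

struct sketch_event_class {
	const char *system;
};

struct sketch_event_call {
	const char *name;
	struct sketch_event_class *cls;
	struct sketch_trace_event event;	/* embedded, not pointed to */
};

/* One print function for the whole class; it looks up the call itself. */
int sketch_class_print(struct sketch_trace_event *event, char *buf, size_t len)
{
	struct sketch_event_call *call =
		sketch_container_of(event, struct sketch_event_call, event);

	return snprintf(buf, len, "%s:%s", call->cls->system, call->name);
}

struct sketch_event_functions sketch_syscall_funcs = {
	.trace = sketch_class_print,
};

struct sketch_event_class sketch_syscall_class = { .system = "syscalls" };

/* Two events in the same class share the same function table. */
struct sketch_event_call sketch_open_call = {
	.name = "sys_enter_open",
	.cls = &sketch_syscall_class,
	.event = { .type = 1, .funcs = &sketch_syscall_funcs },
};

struct sketch_event_call sketch_read_call = {
	.name = "sys_enter_read",
	.cls = &sketch_syscall_class,
	.event = { .type = 2, .funcs = &sketch_syscall_funcs },
};

/* Output path: only the embedded event is passed to the callback. */
int sketch_print(struct sketch_event_call *call, char *buf, size_t len)
{
	assert(call->event.funcs && call->event.funcs->trace);
	return call->event.funcs->trace(&call->event, buf, len);
}
```

Because the callback receives only the embedded `event`, no extra back pointer from the event to its call is needed; the offset arithmetic recovers it.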
    • Steven Rostedt's avatar
      tracing: Allow events to share their print functions · a9a57763
      Steven Rostedt authored
      Multiple events may use the same method to print their data.
      Instead of having all events have a pointer to their print functions,
      the trace_event structure now points to a trace_event_functions structure
      that will hold the way to print out the event.
      The event itself is now passed to the print function to let the print
      function know what kind of event it should print.
      This opens the door to consolidating the way several events print
      their output.
         text	   data	    bss	    dec	    hex	filename
      4913961	1088356	 861512	6863829	 68bbd5	vmlinux.orig
      4900382	1048964	 861512	6810858	 67ecea	vmlinux.init
      4900446	1049028	 861512	6810986	 67ed6a	vmlinux.preprint
      This change slightly increases the size but is needed for the next change.
      v3: Fix the branch tracer events to handle this change.
      v2: Fix the new function graph tracer event calls to handle this change.
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
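
A rough sketch of the indirection described in this commit: several events point at one shared function table, and the event is passed to the callback so that it knows what it is printing. The `toy_` names and the output string are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct toy_trace_event;

/* A table of print callbacks that any number of events may share. */
struct toy_event_functions {
	int (*trace)(const struct toy_trace_event *event, char *buf, size_t len);
};

struct toy_trace_event {
	int type;
	const struct toy_event_functions *funcs;
};

/* The callback receives the event, so one function can serve many events. */
int toy_print_common(const struct toy_trace_event *event, char *buf, size_t len)
{
	return snprintf(buf, len, "event type %d", event->type);
}

const struct toy_event_functions toy_common_funcs = {
	.trace = toy_print_common,
};

/* Both events reuse the same table instead of carrying their own pointers. */
struct toy_trace_event toy_event_a = { .type = 1, .funcs = &toy_common_funcs };
struct toy_trace_event toy_event_b = { .type = 2, .funcs = &toy_common_funcs };

/* Dispatch through the shared table. */
int toy_output(const struct toy_trace_event *event, char *buf, size_t len)
{
	assert(event && event->funcs && event->funcs->trace);
	return event->funcs->trace(event, buf, len);
}
```

The extra pointer hop costs a little on its own, which is consistent with the commit's note that the size grows slightly before the next change can reclaim it.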
    • Steven Rostedt's avatar
      tracing: Move raw_init from events to class · 0405ab80
      Steven Rostedt authored
      The raw_init function pointer in the event is used to initialize
      various kinds of events. The type of initialization needed is usually
      determined by the kind of event it is.
      Two events with the same class will always have the same initialization
      function, so it makes sense to move this to the class structure.
      Perhaps even making a special system structure would work since
      the initialization is the same for all events within a system.
      But since there's no system structure (yet), this will just move it
      to the class.
         text	   data	    bss	    dec	    hex	filename
      4913961	1088356	 861512	6863829	 68bbd5	vmlinux.orig
      4900375	1053380	 861512	6815267	 67fe23	vmlinux.fields
      4900382	1048964	 861512	6810858	 67ecea	vmlinux.init
      The text grew very slightly, but this is a constant growth caused by
      changing the C files that call the init code.
      The bigger savings are in the data, and they increase as more events
      share a class.
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
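
The data savings come from dropping one function pointer per event. In the table above, the data column shrinks by 1053380 - 1048964 = 4416 bytes between vmlinux.fields and vmlinux.init. The sketch below shows the shape of the change with hypothetical `mini_` structures; the field lists are assumptions for illustration and are much smaller than the kernel's real structures.

```c
#include <assert.h>
#include <stddef.h>

struct mini_call;

/* After the move: the class owns the initialization callback. */
struct mini_class {
	const char *system;
	int (*raw_init)(struct mini_call *call);
};

struct mini_call {
	const char *name;
	struct mini_class *cls;
	int initialized;
};

/* Layout before the move, kept here only for the size comparison. */
struct mini_call_before {
	const char *name;
	struct mini_class *cls;
	int (*raw_init)(struct mini_call_before *call);
	int initialized;
};

/* A single init routine serves every event of the class. */
int mini_common_init(struct mini_call *call)
{
	call->initialized = 1;
	return 0;
}

struct mini_class mini_sched_class = { "sched", mini_common_init };
struct mini_call mini_switch = { "sched_switch", &mini_sched_class, 0 };
struct mini_call mini_wakeup = { "sched_wakeup", &mini_sched_class, 0 };

/* Initialize an event through its class; a missing callback is a no-op. */
int mini_init_event(struct mini_call *call)
{
	assert(call->cls != NULL);
	if (!call->cls->raw_init)
		return 0;
	return call->cls->raw_init(call);
}

/* Bytes saved across nr_events events by dropping the per-event pointer. */
size_t mini_bytes_saved(size_t nr_events)
{
	return nr_events *
	       (sizeof(struct mini_call_before) - sizeof(struct mini_call));
}
```

The saving scales with the number of events, while the callback pointer itself is stored only once per class.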