1. 21 Dec, 2011 6 commits
    • Steven Rostedt's avatar
      ftrace: Decouple hash items from showing filtered functions · 69a3083c
      Steven Rostedt authored
      
      
      The set_ftrace_filter shows "hashed" functions, which are functions
      that are added with operations to them (like traceon and traceoff).
      
      As other subsystems may be able to show what functions they are
      using for function tracing, the hash items should no longer
      be shown just because the FILTER flag is set. As they have nothing
      to do with other subsystems filters.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      69a3083c
    • Steven Rostedt's avatar
      ftrace: Allow other users of function tracing to use the output listing · fc13cb0c
      Steven Rostedt authored
      
      
      The function tracer is set up to allow any other subsystem (like perf)
      to use it. Ftrace already has a way to list what functions are enabled
      by the global_ops. It would be very helpful to let other users of
      the function tracer to be able to use the same code.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      fc13cb0c
    • Steven Rostedt's avatar
      ftrace: Replace record newlist with record page list · 85ae32ae
      Steven Rostedt authored
      
      
      As new functions come in to be initalized from mcount to nop,
      they are done by groups of pages. Whether it is the core kernel
      or a module. There's no need to keep track of these on a per record
      basis.
      
      At startup, and as any module is loaded, the functions to be
      traced are stored in a group of pages and added to the function
      list at the end. We just need to keep a pointer to the first
      page of the list that was added, and use that to know where to
      start on the list for initializing functions.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      85ae32ae
    • Steven Rostedt's avatar
      ftrace: Remove usage of "freed" records · 32082309
      Steven Rostedt authored
      
      
      Records that are added to the function trace table are
      permanently there, except for modules. By separating out the
      modules to their own pages that can be freed in one shot
      we can remove the "freed" flag and simplify some of the record
      management.
      
      Another benefit of doing this is that we can also move the
      records around; sort them.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      32082309
    • Steven Rostedt's avatar
      ftrace: Allow archs to modify code without stop machine · c88fd863
      Steven Rostedt authored
      
      
      The stop machine method to modify all functions in the kernel
      (some 20,000 of them) is the safest way to do so across all archs.
      But some archs may not need this big hammer approach to modify code
      on SMP machines, and can simply just update the code it needs.
      
      Adding a weak function arch_ftrace_update_code() that now does the
      stop machine, will also let any arch override this method.
      
      If the arch needs to check the system and then decide if it can
      avoid stop machine, it can still call ftrace_run_stop_machine() to
      use the old method.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      c88fd863
    • Steven Rostedt's avatar
      ftrace: Do not function trace inlined functions · 45959ee7
      Steven Rostedt authored
      
      
      When gcc inlines a function, it does not mark it with the mcount
      prologue, which in turn means that inlined functions are not traced
      by the function tracer. But if CONFIG_OPTIMIZE_INLINING is set, then
      gcc is allowed not to inline a function that is marked inline.
      
      Depending on the options and the compiler, a function may or may
      not be traced by the function tracer, depending on whether gcc
      decides to inline a function or not. This has caused several
      problems in the pass becaues gcc is not always consistent with
      what it decides to inline between different gcc versions.
      
      Some places should not be traced (like paravirt native_* functions)
      and these are mostly marked as inline. When gcc decides not to
      inline the function, and if that function should not be traced, then
      the ftrace function tracer will suddenly break when it use to work
      fine. This becomes even harder to debug when different versions of
      gcc will not inline that function, making the same kernel and config
      work for some gcc versions and not work for others.
      
      By making all functions marked inline to not be traced will remove
      the ambiguity that gcc adds when it comes to tracing functions marked
      inline. All gcc versions will be consistent with what functions are
      traced and having volatile working code will be removed.
      
      Note, only the inline macro when CONFIG_OPTIMIZE_INLINING is set needs
      to have notrace added, as the attribute __always_inline will force
      the function to be inlined and then not traced.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      45959ee7
  2. 06 Dec, 2011 4 commits
  3. 05 Dec, 2011 2 commits
    • Li Zefan's avatar
      tracing: Restore system filter behavior · 27b14b56
      Li Zefan authored
      Though not all events have field 'prev_pid', it was allowed to do this:
      
        # echo 'prev_pid == 100' > events/sched/filter
      
      but commit 75b8e982 (tracing/filter: Swap
      entire filter of events) broke it without any reason.
      
      Link: http://lkml.kernel.org/r/4EAF46CF.8040408@cn.fujitsu.com
      
      Signed-off-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      27b14b56
    • Peter Zijlstra's avatar
      perf: Fix loss of notification with multi-event · 10c6db11
      Peter Zijlstra authored
      
      
      When you do:
              $ perf record -e cycles,cycles,cycles noploop 10
      
      You expect about 10,000 samples for each event, i.e., 10s at
      1000samples/sec. However, this is not what's happening. You
      get much fewer samples, maybe 3700 samples/event:
      
      $ perf report -D | tail -15
      Aggregated stats:
                 TOTAL events:      10998
                  MMAP events:         66
                  COMM events:          2
                SAMPLE events:      10930
      cycles stats:
                 TOTAL events:       3644
                SAMPLE events:       3644
      cycles stats:
                 TOTAL events:       3642
                SAMPLE events:       3642
      cycles stats:
                 TOTAL events:       3644
                SAMPLE events:       3644
      
      On a Intel Nehalem or even AMD64, there are 4 counters capable
      of measuring cycles, so there is plenty of space to measure those
      events without multiplexing (even with the NMI watchdog active).
      And even with multiplexing, we'd expect roughly the same number
      of samples per event.
      
      The root of the problem was that when the event that caused the buffer
      to become full was not the first event passed on the cmdline, the user
      notification would get lost. The notification was sent to the file
      descriptor of the overflowed event but the perf tool was not polling
      on it.  The perf tool aggregates all samples into a single buffer,
      i.e., the buffer of the first event. Consequently, it assumes
      notifications for any event will come via that descriptor.
      
      The seemingly straight forward solution of moving the waitq into the
      ringbuffer object doesn't work because of life-time issues. One could
      perf_event_set_output() on a fd that you're also blocking on and cause
      the old rb object to be freed while its waitq would still be
      referenced by the blocked thread -> FAIL.
      
      Therefore link all events to the ringbuffer and broadcast the wakeup
      from the ringbuffer object to all possible events that could be waited
      upon. This is rather ugly, and we're open to better solutions but it
      works for now.
      Reported-by: default avatarStephane Eranian <eranian@google.com>
      Finished-by: default avatarStephane Eranian <eranian@google.com>
      Reviewed-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20111126014731.GA7030@quad
      
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      10c6db11
  4. 11 Nov, 2011 6 commits
  5. 09 Nov, 2011 1 commit
  6. 08 Nov, 2011 1 commit
  7. 07 Nov, 2011 7 commits
  8. 06 Nov, 2011 1 commit
  9. 04 Nov, 2011 12 commits