1. 23 Jan, 2013 10 commits
    • ring-buffer: User context bit recursion checking · 567cd4da
      Steven Rostedt authored
      
      
      Using context bit recursion checking, we can help increase the
      performance of the ring buffer.
      
      Before this patch:
      
       # echo function > /debug/tracing/current_tracer
       # for i in `seq 10`; do ./hackbench 50; done
      Time: 10.285
      Time: 10.407
      Time: 10.243
      Time: 10.372
      Time: 10.380
      Time: 10.198
      Time: 10.272
      Time: 10.354
      Time: 10.248
      Time: 10.253
      
      (average: 10.3012)
      
      Now we have:
      
       # echo function > /debug/tracing/current_tracer
       # for i in `seq 10`; do ./hackbench 50; done
      Time: 9.712
      Time: 9.824
      Time: 9.861
      Time: 9.827
      Time: 9.962
      Time: 9.905
      Time: 9.886
      Time: 10.088
      Time: 9.861
      Time: 9.834
      
      (average: 9.876)
      
       a 4% savings!
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
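      The averages and the quoted saving can be re-derived from the ten runs listed in the message. The following is a minimal sketch in C; the helper names `mean` and `percent_saved` are illustrative and not part of the patch. With the figures above it works out to roughly 4.1%.

```c
#include <assert.h>
#include <stddef.h>

/* hackbench times quoted in the commit message (seconds) */
static const double before_runs[10] = {
	10.285, 10.407, 10.243, 10.372, 10.380,
	10.198, 10.272, 10.354, 10.248, 10.253,
};
static const double after_runs[10] = {
	9.712, 9.824, 9.861, 9.827, 9.962,
	9.905, 9.886, 10.088, 9.861, 9.834,
};

/* Arithmetic mean of n timing samples. */
static double mean(const double *t, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += t[i];
	return sum / (double)n;
}

/* Relative saving, in percent, going from `before` to `after`. */
static double percent_saved(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```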
    • ftrace: Use only the preempt version of function tracing · 897f68a4
      Steven Rostedt authored
      
      
      The function tracer had two different versions of function tracing.
      
      The disabling of irqs version and the preempt disable version.
      
      As function tracing is very intrusive and can cause nasty recursion
      issues, it has its own recursion protection. But the old method to
      do this was a flat layer. If it detected that a recursion was happening
      then it would just return without recording.
      
      This made the preempt version (much faster than the irq disabling one)
      not very useful, because if an interrupt were to occur after the
      recursion flag was set, the interrupt would not be traced at all,
      because every function that was traced would think it recursed on
      itself (due to the context it preempted setting the recursive flag).
      
      Now that we have a recursion flag for every context level, we
      no longer need to worry about that. We can disable preemption,
      set the current context recursion check bit, and go on. If an
      interrupt were to come along, it would check its own context bit
      and happily continue to trace.
      
      As the preempt version is faster than the irq disable version,
      there's no more reason to keep the irq disable version around.
      And the irq disable version still had an issue with missing
      out on tracing NMI code.
      
      Remove the irq disable function tracer version and have the
      preempt disable version be the default (and only version).
      
      Before this patch, we got the following from running:
      
       # echo function > /debug/tracing/current_tracer
       # for i in `seq 10`; do ./hackbench 50; done
      Time: 12.028
      Time: 11.945
      Time: 11.925
      Time: 11.964
      Time: 12.002
      Time: 11.910
      Time: 11.944
      Time: 11.929
      Time: 11.941
      Time: 11.924
      
      (average: 11.9512)
      
      Now we have:
      
       # echo function > /debug/tracing/current_tracer
       # for i in `seq 10`; do ./hackbench 50; done
      Time: 10.285
      Time: 10.407
      Time: 10.243
      Time: 10.372
      Time: 10.380
      Time: 10.198
      Time: 10.272
      Time: 10.354
      Time: 10.248
      Time: 10.253
      
      (average: 10.3012)
      
       a 13.8% savings!
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Avoid unnecessary multiple recursion checks · edc15caf
      Steven Rostedt authored
      
      
      When function tracing occurs, the following steps are made:
        If arch does not support a ftrace feature:
         call internal function (uses INTERNAL bits) which calls...
        If callback is registered to the "global" list, the list
         function is called and recursion checks the GLOBAL bits.
         then this function calls...
        The function callback, which can use the FTRACE bits to
         check for recursion.
      
      Now if the arch does not support a feature, and it calls
      the global list function which calls the ftrace callback
      all three of these steps will do a recursion protection.
      There's no reason to do one if the previous caller already
      did. The recursion that we are protecting against will
      go through the same steps again.
      
      To prevent the multiple recursion checks, if a recursion
      bit is set that is higher than the MAX bit of the current
      check, then we know that the check was made by the previous
      caller, and we can skip the current check.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
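      The skip rule described in this message can be sketched in user-space C. This is an illustration under assumptions, not the kernel code: it keeps one bit per check level (FTRACE lowest, INTERNAL highest) and ignores the per-context split; the names `REC_*`, `test_and_set_recursion` and `clear_recursion` are hypothetical.

```c
#include <assert.h>

/* Hypothetical layout: one bit per check level, lowest level first.
 * Bit 0 is left unused so a return value of 0 can mean "skipped". */
enum {
	REC_FTRACE_BIT = 1,
	REC_GLOBAL_BIT,
	REC_INTERNAL_BIT,
};

static unsigned int recursion;	/* stand-in for a per-task recursion word */

/*
 * Returns the bit that was set, 0 if an outer caller already did the
 * check (nothing was set), or -1 if recursion was detected.
 */
static int test_and_set_recursion(int bit)
{
	assert(bit >= REC_FTRACE_BIT && bit <= REC_INTERNAL_BIT);

	/* A bit above ours is set: the previous caller already checked. */
	if (recursion >> (bit + 1))
		return 0;

	/* Our own bit is already set: we recursed. */
	if (recursion & (1u << bit))
		return -1;

	recursion |= 1u << bit;
	return bit;
}

/* Undo a successful test_and_set_recursion(); 0 and -1 are no-ops. */
static void clear_recursion(int bit)
{
	if (bit > 0)
		recursion &= ~(1u << bit);
}
```

      In this sketch only the outermost level pays for the check: once the INTERNAL bit is set, the GLOBAL and FTRACE levels return 0 immediately, while a genuine re-entry at the INTERNAL level is still caught.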
    • tracing: Make the trace recursion bits into enums · e46cbf75
      Steven Rostedt authored
      
      
      Convert the bits into enums which makes the code a little easier
      to maintain.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Add context level recursion bit checking · c29f122c
      Steven Rostedt authored
      
      
      Currently for recursion checking in the function tracer, ftrace
      tests a task_struct bit to determine if the function tracer had
      recursed or not. If it has, then it will return without going
      further.
      
      But this leads to races. If an interrupt came in after the bit
      was set, the functions being traced would see that bit set and
      think that the function tracer recursed on itself, and would return.
      
      Instead add a bit for each context (normal, softirq, irq and nmi).
      
      A check of which context the task is in is made before testing the
      associated bit. Now if an interrupt preempts the function tracer
      after the previous context has been set, the interrupt functions
      can still be traced.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
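      The per-context bit idea can be illustrated with a small user-space sketch. It is not the kernel implementation: the context is passed in explicitly here, whereas the kernel determines it from the current execution state, and the names `ctx_enter`, `ctx_exit` and `ctx_recursion` are hypothetical.

```c
#include <assert.h>

/* The four contexts named in the commit message. */
enum ctx { CTX_NORMAL, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI, CTX_MAX };

static unsigned int ctx_recursion;	/* stand-in for a per-task recursion word */

/*
 * Returns 1 if tracing may proceed, or 0 if this same context is
 * already inside the tracer (true recursion).
 */
static int ctx_enter(enum ctx c)
{
	assert((unsigned int)c < CTX_MAX);

	if (ctx_recursion & (1u << c))
		return 0;

	ctx_recursion |= 1u << c;
	return 1;
}

/* Clear the bit for context c when the tracer returns. */
static void ctx_exit(enum ctx c)
{
	assert((unsigned int)c < CTX_MAX);
	ctx_recursion &= ~(1u << c);
}
```

      With a single shared flag, an interrupt arriving after the normal context set it would be mistaken for recursion. With one bit per context, the interrupt tests only its own bit and can still be traced.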
    • ftrace: Optimize the function tracer list loop · 0a016409
      Steven Rostedt authored
      
      
      There are lots of places that perform:
      
             op = rcu_dereference_raw(ftrace_control_list);
             while (op != &ftrace_list_end) {
      
      Add a helper macro to do this, and also optimize for a single
      entity. That is, gcc will optimize a loop for either no iterations
      or more than one iteration. But usually only a single callback
      is registered to the function tracer, thus the optimized case
      should be a single pass. To do this we now do:
      
      	op = rcu_dereference_raw(list);
      	do {
      		[...]
      	} while (likely(op = rcu_dereference_raw((op)->next)) &&
      	       unlikely((op) != &ftrace_list_end));
      
      An op is always registered (ftrace_list_end when no callbacks are
      registered), thus when a single callback is registered, the linked
      list looks like:
      
       top => callback => ftrace_list_end => NULL.
      
      The likely(op = op->next) still must be performed due to the race
      of removing the callback, where the first op assignment could
      equal ftrace_list_end. In that case, the op->next would be NULL.
      But this is unlikely (only happens in a race condition when
      removing the callback).
      
      But it is very likely that the next op would be ftrace_list_end,
      unless more than one callback has been registered. This tells
      gcc what the most common case is and gives the fast path the
      fewest branches.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
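      The loop shape described above can be tried outside the kernel. The sketch below is a simplified illustration: it omits the RCU dereference, defines `likely`/`unlikely` with the GCC/Clang `__builtin_expect` builtin, and uses hypothetical names (`struct ops`, `list_end`, `call_all`, `bump`) in place of the ftrace ones.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hints (GCC/Clang builtin). */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

struct ops {
	void (*func)(int *counter);
	struct ops *next;
};

/* Does nothing; the sentinel's callback. */
static void stub(int *counter) { (void)counter; }

/* Sample callback used for demonstration: counts its invocations. */
static void bump(int *counter) { (*counter)++; }

/* Sentinel that terminates every list; its next pointer is NULL. */
static struct ops list_end = { .func = stub, .next = NULL };

/* The body runs at least once; the exit test is shaped for one entry. */
#define do_for_each_op(op, list)	\
	op = (list);			\
	do

#define while_for_each_op(op)					\
	while (likely(((op) = (op)->next) != NULL) &&		\
	       unlikely((op) != &list_end))

/* Call every registered callback on the list starting at `list`. */
static void call_all(struct ops *list, int *counter)
{
	struct ops *op;

	assert(list != NULL);
	do_for_each_op(op, list) {
		op->func(counter);
	} while_for_each_op(op);
}
```

      With one callback registered, the body runs once, the next pointer is non-NULL (the sentinel), and the second test fails as hinted, so the loop exits after a single pass. If the list head is the sentinel itself, only the stub runs and the NULL next pointer ends the loop.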
    • ftrace: Fix function tracing recursion self test · 9640388b
      Steven Rostedt authored
      
      
      The function tracing recursion self test should not crash
      the machine if the recursion test fails. If it detects that
      the function tracing is recursing when it should not be, then
      bail, don't go into an infinite recursive loop.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ftrace: Fix global function tracers that are not recursion safe · 63503794
      Steven Rostedt authored
      
      
      If one of the function tracers set by the global ops is not recursion
      safe, it can still be called directly without the added recursion
      protection supplied by the ftrace infrastructure.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix selftest function recursion accounting · 05cbbf64
      Steven Rostedt authored
      
      
      The test that checks function recursion does things differently
      if the arch does not support all ftrace features. But that really
      doesn't make a difference with how the test runs, and either way
      the count variable should be 2 at the end.
      
      Currently the test wrongly fails for archs that don't support all
      the ftrace features.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Fix race with max_tr and changing tracers · 34600f0e
      Steven Rostedt authored
      
      
      There's a race condition between the setting of a new tracer and
      the update of the max trace buffers (the swap). When a new tracer
      is added, it sets current_trace to nop_trace before disabling
      the old tracer. At this moment, if the old tracer uses update_max_tr(),
      the update may trigger the warning against !current_trace->use_max_tr,
      as nop_trace doesn't have that set.
      
      As update_max_tr() requires that interrupts be disabled, we can
      add a check to see if current_trace == nop_trace and bail if it
      does. Then when disabling the current_trace, set it to nop_trace
      and run synchronize_sched(). This will make sure all calls to
      update_max_tr() have completed (it was called with interrupts disabled).
      
      As a clean up, this commit also removes shrinking and recreating
      the max_tr buffer if the old and new tracers both have use_max_tr set.
      The old way used to always shrink the buffer, and then expand it
      for the next tracer. This is a waste of time.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 22 Jan, 2013 2 commits
    • tracing: Remove trace.h header from trace_clock.c · 0a71e4c6
      Steven Rostedt authored
      
      
      As trace_clock is used by other things besides tracing, and it
      does not require anything from trace.h, it is best not to include
      the header file in trace_clock.c.
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing: Remove the extra 4 bytes of padding in events · b000c806
      Steven Rostedt authored
      Due to a userspace issue with PowerTop v2beta, which hardcoded
      the offset of event fields that it was using, it broke when
      we removed the Big Kernel Lock counter from the event header.
      
       (commit e6e1e259 "tracing: Remove lock_depth from event entry")
      
      Because this broke userspace, it was determined that we must
      keep those 4 bytes around.
      
       (commit a3a4a5ac "Regression: partial revert "tracing: Remove lock_depth from event entry"")
      
      This unfortunately wastes space in the ring buffer. 4 bytes per
      event, where a lot of events are just 24 bytes. That's 16% of the
      buffer wasted. A million events will add 4 megs of white space
      into the buffer.
      
      It was later noticed that PowerTop v2beta could not work on systems
      where the kernel was 64 bit but the userspace was 32 bits.
      The reason was that the offsets are different between the
      two and the hard coded offset of one would not work with the other.
      
      With PowerTop v2 final, it implemented the same interface that both
      perf and trace-cmd use. That is, it reads the format file of
      the event to find the offsets of the fields it needs. This fixes
      the problem with running powertop on a 32 bit userspace running
      on a 64 bit kernel. It also no longer requires the 4 byte padding.
      
      As PowerTop v2 has been out for a while, and is included in all
      major distributions, it is time that we can safely remove the
      4 bytes of padding. Users of PowerTop v2beta should upgrade to
      PowerTop v2 final.
      
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
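      Reading field offsets from an event's format file, rather than hardcoding them, could look roughly like the sketch below. It assumes format lines of the form "field:TYPE NAME; offset:N; size:M; signed:S;"; the struct `field_info`, the function `parse_field_line` and the sample field name used in the usage note are hypothetical and are not taken from PowerTop, perf or trace-cmd.

```c
#include <assert.h>
#include <stdio.h>

struct field_info {
	char name[64];	/* type and name, e.g. "int example_value" */
	int offset;	/* byte offset of the field within the event */
	int size;	/* size of the field in bytes */
};

/*
 * Parse one "field:...; offset:N; size:M;" line from a format file.
 * Returns 0 on success, -1 if the line is not a field description.
 */
static int parse_field_line(const char *line, struct field_info *f)
{
	assert(line != NULL && f != NULL);

	if (sscanf(line, " field:%63[^;]; offset:%d; size:%d;",
		   f->name, &f->offset, &f->size) != 3)
		return -1;

	return 0;
}
```

      For an illustrative line such as "field:int example_value; offset:8; size:4; signed:1;", the parser fills in offset 8 and size 4. A tool that looks fields up this way keeps working when the header layout changes, which is why the 4 bytes of padding are no longer needed.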
  3. 21 Jan, 2013 13 commits
  4. 18 Jan, 2013 1 commit
  5. 17 Jan, 2013 2 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 72ffaa48
      Linus Torvalds authored
      Pull more s390 patches from Martin Schwidefsky:
       "A couple of bug fixes: one of the transparent huge page primitives is
        broken, the sched_clock function overflows after 417 days, the XFS
        module has grown too large for -fpic and the new pci code has broken
        normal channel subsystem notifications."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/chsc: fix SEI usage
        s390/time: fix sched_clock() overflow
        s390: use -fPIC for module compile
        s390/mm: fix pmd_pfn() for thp
    • Merge tag 'for-linus-v3.8-rc4' of git://oss.sgi.com/xfs/xfs · dfdebc24
      Linus Torvalds authored
      Pull xfs bugfixes from Ben Myers:
      
       - fix(es) for compound buffers
      
       - fix for dquot soft timer asserts due to overflow of d_blk_softlimit
      
       - fix for regression in dir v2 code introduced in commit 20f7e9f3
         ("xfs: factor dir2 block read operations")
      
      * tag 'for-linus-v3.8-rc4' of git://oss.sgi.com/xfs/xfs:
        xfs: recalculate leaf entry pointer after compacting a dir2 block
        xfs: remove int casts from debug dquot soft limit timer asserts
        xfs: fix the multi-segment log buffer format
        xfs: fix segment in xfs_buf_item_format_segment
        xfs: rename bli_format to avoid confusion with bli_formats
        xfs: use b_maps[] for discontiguous buffers
  6. 16 Jan, 2013 12 commits