1. 21 Dec, 2011 12 commits
    • Steven Rostedt's avatar
      ftrace: Allow access to the boot time function enabling · 2a85a37f
      Steven Rostedt authored
      
      
      Change set_ftrace_early_filter() to ftrace_set_early_filter()
      and make it a global function. This will allow other subsystems
      in the kernel to be able to enable function tracing at start
      up and reuse the ftrace function parsing code.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      2a85a37f
    • Steven Rostedt's avatar
      ftrace: Decouple hash items from showing filtered functions · 69a3083c
      Steven Rostedt authored
      
      
      The set_ftrace_filter shows "hashed" functions, which are functions
      that are added with operations to them (like traceon and traceoff).
      
      As other subsystems may be able to show what functions they are
      using for function tracing, the hash items should no longer
      be shown just because the FILTER flag is set. As they have nothing
      to do with other subsystems filters.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      69a3083c
    • Steven Rostedt's avatar
      ftrace: Allow other users of function tracing to use the output listing · fc13cb0c
      Steven Rostedt authored
      
      
      The function tracer is set up to allow any other subsystem (like perf)
      to use it. Ftrace already has a way to list what functions are enabled
      by the global_ops. It would be very helpful to let other users of
      the function tracer to be able to use the same code.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      fc13cb0c
    • Steven Rostedt's avatar
      ftrace: Create ftrace_hash_empty() helper routine · 06a51d93
      Steven Rostedt authored
      
      
      There are two types of hashes in the ftrace_ops; one type
      is the filter_hash and the other is the notrace_hash. Either
      one may be null, meaning it has no elements. But when elements
      are added, the hash is allocated.
      
      Throughout the code, a check needs to be made to see if a hash
      exists or the hash has elements, but the check if the hash exists
      is usually missing causing the possible "NULL pointer dereference bug".
      
      Add a helper routine called "ftrace_hash_empty()" that returns
      true if the hash doesn't exist or its count is zero. As they mean
      the same thing.
      Last-bug-reported-by: default avatarJiri Olsa <jolsa@redhat.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      06a51d93
    • Steven Rostedt's avatar
      ftrace: Fix ftrace hash record update with notrace · c842e975
      Steven Rostedt authored
      
      
      When disabling the "notrace" records, that means we want to trace them.
      If the notrace_hash is zero, it means that we want to trace all
      records. But to disable a zero notrace_hash means nothing.
      
      The check for the notrace_hash count was incorrect with:
      
      	if (hash && !hash->count)
      		return
      
      With the correct comment above it that states that we do nothing
      if the notrace_hash has zero count. But !hash also means that
      the notrace hash has zero count. I think this was done to
      protect against dereferencing NULL. But if !hash is true, then
      we go through the following loop without doing a single thing.
      
      Fix it to:
      
      	if (!hash || !hash->count)
      		return;
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      c842e975
    • Steven Rostedt's avatar
      ftrace: Use bsearch to find record ip · 5855fead
      Steven Rostedt authored
      
      
      Now that each set of pages in the function list are sorted by
      ip, we can use bsearch to find a record within each set of pages.
      This speeds up the ftrace_location() function by magnitudes.
      
      For archs (like x86) that need to add a breakpoint at every function
      that will be converted from a nop to a callback and vice versa,
      the breakpoint callback needs to know if the breakpoint was for
      ftrace or not. It requires finding the breakpoint ip within the
      records. Doing a linear search is extremely inefficient. It is
      a must to be able to do a fast binary search to find these locations.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      5855fead
    • Steven Rostedt's avatar
      ftrace: Sort the mcount records on each page · 68950619
      Steven Rostedt authored
      
      
      Sort records by ip locations of the ftrace mcount calls on each of the
      set of pages in the function list. This helps in localizing cache
      usuage when updating the function locations, as well as gives us
      the ability to quickly find an ip location in the list.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      68950619
    • Steven Rostedt's avatar
      ftrace: Replace record newlist with record page list · 85ae32ae
      Steven Rostedt authored
      
      
      As new functions come in to be initalized from mcount to nop,
      they are done by groups of pages. Whether it is the core kernel
      or a module. There's no need to keep track of these on a per record
      basis.
      
      At startup, and as any module is loaded, the functions to be
      traced are stored in a group of pages and added to the function
      list at the end. We just need to keep a pointer to the first
      page of the list that was added, and use that to know where to
      start on the list for initializing functions.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      85ae32ae
    • Steven Rostedt's avatar
      ftrace: Allocate the mcount record pages as groups · a7900875
      Steven Rostedt authored
      
      
      Allocate the mcount record pages as a group of pages as big
      as can be allocated and waste no more than a single page.
      
      Grouping the mcount pages as much as possible helps with cache
      locality, as we do not need to redirect with descriptors as we
      cross from page to page. It also allows us to do more with the
      records later on (sort them with bigger benefits).
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a7900875
    • Steven Rostedt's avatar
      ftrace: Remove usage of "freed" records · 32082309
      Steven Rostedt authored
      
      
      Records that are added to the function trace table are
      permanently there, except for modules. By separating out the
      modules to their own pages that can be freed in one shot
      we can remove the "freed" flag and simplify some of the record
      management.
      
      Another benefit of doing this is that we can also move the
      records around; sort them.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      32082309
    • Steven Rostedt's avatar
      ftrace: Allow archs to modify code without stop machine · c88fd863
      Steven Rostedt authored
      
      
      The stop machine method to modify all functions in the kernel
      (some 20,000 of them) is the safest way to do so across all archs.
      But some archs may not need this big hammer approach to modify code
      on SMP machines, and can simply just update the code it needs.
      
      Adding a weak function arch_ftrace_update_code() that now does the
      stop machine, will also let any arch override this method.
      
      If the arch needs to check the system and then decide if it can
      avoid stop machine, it can still call ftrace_run_stop_machine() to
      use the old method.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      c88fd863
    • Jiri Olsa's avatar
      ftrace: Fix unregister ftrace_ops accounting · 30fb6aa7
      Jiri Olsa authored
      Multiple users of the function tracer can register their functions
      with the ftrace_ops structure. The accounting within ftrace will
      update the counter on each function record that is being traced.
      When the ftrace_ops filtering adds or removes functions, the
      function records will be updated accordingly if the ftrace_ops is
      still registered.
      
      When a ftrace_ops is removed, the counter of the function records,
      that the ftrace_ops traces, are decremented. When they reach zero
      the functions that they represent are modified to stop calling the
      mcount code.
      
      When changes are made, the code is updated via stop_machine() with
      a command passed to the function to tell it what to do. There is an
      ENABLE and DISABLE command that tells the called function to enable
      or disable the functions. But the ENABLE is really a misnomer as it
      should just update the records, as records that have been enabled
      and now have a count of zero should be disabled.
      
      The DISABLE command is used to disable all functions regardless of
      their counter values. This is the big off switch and is not the
      complement of the ENABLE command.
      
      To make matters worse, when a ftrace_ops is unregistered and there
      is another ftrace_ops registered, neither the DISABLE nor the
      ENABLE command are set when calling into the stop_machine() function
      and the records will not be updated to match their counter. A command
      is passed to that function that will update the mcount code to call
      the registered callback directly if it is the only one left. This
      means that the ftrace_ops that is still registered will have its callback
      called by all functions that have been set for it as well as the ftrace_ops
      that was just unregistered.
      
      Here's a way to trigger this bug. Compile the kernel with
      CONFIG_FUNCTION_PROFILER set and with CONFIG_FUNCTION_GRAPH not set:
      
       CONFIG_FUNCTION_PROFILER=y
       # CONFIG_FUNCTION_GRAPH is not set
      
      This will force the function profiler to use the function tracer instead
      of the function graph tracer.
      
        # cd /sys/kernel/debug/tracing
        # echo schedule > set_ftrace_filter
        # echo function > current_tracer
        # cat set_ftrace_filter
       schedule
        # cat trace
       # tracer: nop
       #
       # entries-in-buffer/entries-written: 692/68108025   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
            kworker/0:2-909   [000] ....   531.235574: schedule <-worker_thread
                 <idle>-0     [001] .N..   531.235575: schedule <-cpu_idle
            kworker/0:2-909   [000] ....   531.235597: schedule <-worker_thread
                   sshd-2563  [001] ....   531.235647: schedule <-schedule_hrtimeout_range_clock
      
        # echo 1 > function_profile_enabled
        # echo 0 > function_porfile_enabled
        # cat set_ftrace_filter
       schedule
        # cat trace
       # tracer: function
       #
       # entries-in-buffer/entries-written: 159701/118821262   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
                 <idle>-0     [002] ...1   604.870655: local_touch_nmi <-cpu_idle
                 <idle>-0     [002] d..1   604.870655: enter_idle <-cpu_idle
                 <idle>-0     [002] d..1   604.870656: atomic_notifier_call_chain <-enter_idle
                 <idle>-0     [002] d..1   604.870656: __atomic_notifier_call_chain <-atomic_notifier_call_chain
      
      The same problem could have happened with the trace_probe_ops,
      but they are modified with the set_frace_filter file which does the
      update at closure of the file.
      
      The simple solution is to change ENABLE to UPDATE and call it every
      time an ftrace_ops is unregistered.
      
      Link: http://lkml.kernel.org/r/1323105776-26961-3-git-send-email-jolsa@redhat.com
      
      
      
      Cc: stable@vger.kernel.org # 3.0+
      Signed-off-by: default avatarJiri Olsa <jolsa@redhat.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      30fb6aa7
  2. 05 Dec, 2011 2 commits
    • Steven Rostedt's avatar
      ftrace: Fix hash record accounting bug · ddf6e0e5
      Steven Rostedt authored
      If the set_ftrace_filter is cleared by writing just whitespace to
      it, then the filter hash refcounts will be decremented but not
      updated. This causes two bugs:
      
      1) No functions will be enabled for tracing when they all should be
      
      2) If the users clears the set_ftrace_filter twice, it will crash ftrace:
      
      ------------[ cut here ]------------
      WARNING: at /home/rostedt/work/git/linux-trace.git/kernel/trace/ftrace.c:1384 __ftrace_hash_rec_update.part.27+0x157/0x1a7()
      Modules linked in:
      Pid: 2330, comm: bash Not tainted 3.1.0-test+ #32
      Call Trace:
       [<ffffffff81051828>] warn_slowpath_common+0x83/0x9b
       [<ffffffff8105185a>] warn_slowpath_null+0x1a/0x1c
       [<ffffffff810ba362>] __ftrace_hash_rec_update.part.27+0x157/0x1a7
       [<ffffffff810ba6e8>] ? ftrace_regex_release+0xa7/0x10f
       [<ffffffff8111bdfe>] ? kfree+0xe5/0x115
       [<ffffffff810ba51e>] ftrace_hash_move+0x2e/0x151
       [<ffffffff810ba6fb>] ftrace_regex_release+0xba/0x10f
       [<ffffffff8112e49a>] fput+0xfd/0x1c2
       [<ffffffff8112b54c>] filp_close+0x6d/0x78
       [<ffffffff8113a92d>] sys_dup3+0x197/0x1c1
       [<ffffffff8113a9a6>] sys_dup2+0x4f/0x54
       [<ffffffff8150cac2>] system_call_fastpath+0x16/0x1b
      ---[ end trace 77a3a7ee73794a02 ]---
      
      Link: http://lkml.kernel.org/r/20111101141420.GA4918@debian
      
      Reported-by: default avatarRabin Vincent <rabin@rab.in>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ddf6e0e5
    • Steven Rostedt's avatar
      ftrace: Remove force undef config value left for testing · c7c6ec8b
      Steven Rostedt authored
      A forced undef of a config value was used for testing and was
      accidently left in during the final commit. This causes x86 to
      run slower than needed while running function tracing as well
      as causes the function graph selftest to fail when DYNMAIC_FTRACE
      is not set. This is because the code in MCOUNT expects the ftrace
      code to be processed with the config value set that happened to
      be forced not set.
      
      The forced config option was left in by:
          commit 6331c28c
          ftrace: Fix dynamic selftest failure on some archs
      
      Link: http://lkml.kernel.org/r/20111102150255.GA6973@debian
      
      
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarRabin Vincent <rabin@rab.in>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      c7c6ec8b
  3. 07 Nov, 2011 2 commits
    • Steven Rostedt's avatar
      ftrace: Fix hash record accounting bug · d4d34b98
      Steven Rostedt authored
      If the set_ftrace_filter is cleared by writing just whitespace to
      it, then the filter hash refcounts will be decremented but not
      updated. This causes two bugs:
      
      1) No functions will be enabled for tracing when they all should be
      
      2) If the users clears the set_ftrace_filter twice, it will crash ftrace:
      
      ------------[ cut here ]------------
      WARNING: at /home/rostedt/work/git/linux-trace.git/kernel/trace/ftrace.c:1384 __ftrace_hash_rec_update.part.27+0x157/0x1a7()
      Modules linked in:
      Pid: 2330, comm: bash Not tainted 3.1.0-test+ #32
      Call Trace:
       [<ffffffff81051828>] warn_slowpath_common+0x83/0x9b
       [<ffffffff8105185a>] warn_slowpath_null+0x1a/0x1c
       [<ffffffff810ba362>] __ftrace_hash_rec_update.part.27+0x157/0x1a7
       [<ffffffff810ba6e8>] ? ftrace_regex_release+0xa7/0x10f
       [<ffffffff8111bdfe>] ? kfree+0xe5/0x115
       [<ffffffff810ba51e>] ftrace_hash_move+0x2e/0x151
       [<ffffffff810ba6fb>] ftrace_regex_release+0xba/0x10f
       [<ffffffff8112e49a>] fput+0xfd/0x1c2
       [<ffffffff8112b54c>] filp_close+0x6d/0x78
       [<ffffffff8113a92d>] sys_dup3+0x197/0x1c1
       [<ffffffff8113a9a6>] sys_dup2+0x4f/0x54
       [<ffffffff8150cac2>] system_call_fastpath+0x16/0x1b
      ---[ end trace 77a3a7ee73794a02 ]---
      
      Link: http://lkml.kernel.org/r/20111101141420.GA4918@debian
      
      Reported-by: default avatarRabin Vincent <rabin@rab.in>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      d4d34b98
    • Steven Rostedt's avatar
      ftrace: Remove force undef config value left for testing · 8ee3c92b
      Steven Rostedt authored
      A forced undef of a config value was used for testing and was
      accidently left in during the final commit. This causes x86 to
      run slower than needed while running function tracing as well
      as causes the function graph selftest to fail when DYNMAIC_FTRACE
      is not set. This is because the code in MCOUNT expects the ftrace
      code to be processed with the config value set that happened to
      be forced not set.
      
      The forced config option was left in by:
          commit 6331c28c
          ftrace: Fix dynamic selftest failure on some archs
      
      Link: http://lkml.kernel.org/r/20111102150255.GA6973@debian
      
      
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarRabin Vincent <rabin@rab.in>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      8ee3c92b
  4. 31 Oct, 2011 1 commit
  5. 11 Oct, 2011 1 commit
    • Steven Rostedt's avatar
      tracing: Warn on output if the function tracer was found corrupted · e0a413f6
      Steven Rostedt authored
      
      
      As the function tracer is very intrusive, lots of self checks are
      performed on the tracer and if something is found to be strange
      it will shut itself down keeping it from corrupting the rest of the
      kernel. This shutdown may still allow functions to be traced, as the
      tracing only stops new modifications from happening. Trying to stop
      the function tracer itself can cause more harm as it requires code
      modification.
      
      Although a WARN_ON() is executed, a user may not notice it. To help
      the user see that something isn't right with the tracing of the system
      a big warning is added to the output of the tracer that lets the user
      know that their data may be incomplete.
      Reported-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      e0a413f6
  6. 15 Jul, 2011 1 commit
    • Steven Rostedt's avatar
      ftrace: Fix regression where ftrace breaks when modules are loaded · f7bc8b61
      Steven Rostedt authored
      
      
      Enabling function tracer to trace all functions, then load a module and
      then disable function tracing will cause ftrace to fail.
      
      This can also happen by enabling function tracing on the command line:
      
        ftrace=function
      
      and during boot up, modules are loaded, then you disable function tracing
      with 'echo nop > current_tracer' you will trigger a bug in ftrace that
      will shut itself down.
      
      The reason is, the new ftrace code keeps ref counts of all ftrace_ops that
      are registered for tracing. When one or more ftrace_ops are registered,
      all the records that represent the functions that the ftrace_ops will
      trace have a ref count incremented. If this ref count is not zero,
      when the code modification runs, that function will be enabled for tracing.
      If the ref count is zero, that function will be disabled from tracing.
      
      To make sure the accounting was working, FTRACE_WARN_ON()s were added
      to updating of the ref counts.
      
      If the ref count hits its max (> 2^30 ftrace_ops added), or if
      the ref count goes below zero, a FTRACE_WARN_ON() is triggered which
      disables all modification of code.
      
      Since it is common for ftrace_ops to trace all functions in the kernel,
      instead of creating > 20,000 hash items for the ftrace_ops, the hash
      count is just set to zero, and it represents that the ftrace_ops is
      to trace all functions. This is where the issues arrise.
      
      If you enable function tracing to trace all functions, and then add
      a module, the modules function records do not get the ref count updated.
      When the function tracer is disabled, all function records ref counts
      are subtracted. Since the modules never had their ref counts incremented,
      they go below zero and the FTRACE_WARN_ON() is triggered.
      
      The solution to this is rather simple. When modules are loaded, and
      their functions are added to the the ftrace pool, look to see if any
      ftrace_ops are registered that trace all functions. And for those,
      update the ref count for the module function records.
      Reported-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f7bc8b61
  7. 14 Jul, 2011 3 commits
    • Steven Rostedt's avatar
      ftrace: Fix dynamic selftest failure on some archs · 6331c28c
      Steven Rostedt authored
      
      
      Archs that do not implement CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST, will
      fail the dynamic ftrace selftest.
      
      The function tracer has a quick 'off' variable that will prevent
      the call back functions from being called. This variable is called
      function_trace_stop. In x86, this is implemented directly in the mcount
      assembly, but for other archs, an intermediate function is used called
      ftrace_test_stop_func().
      
      In dynamic ftrace, the function pointer variable ftrace_trace_function is
      used to update the caller code in the mcount caller. But for archs that
      do not have CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST set, it only calls
      ftrace_test_stop_func() instead, which in turn calls __ftrace_trace_function.
      
      When more than one ftrace_ops is registered, the function it calls is
      ftrace_ops_list_func(), which will iterate over all registered ftrace_ops
      and call the callbacks that have their hash matching.
      
      The issue happens when two ftrace_ops are registered for different functions
      and one is then unregistered. The __ftrace_trace_function is then pointed
      to the remaining ftrace_ops callback function directly. This mean it will
      be called for all functions that were registered to trace by both ftrace_ops
      that were registered.
      
      This is not an issue for archs with CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST,
      because the update of ftrace_trace_function doesn't happen until after all
      functions have been updated, and then the mcount caller is updated. But
      for those archs that do use the ftrace_test_stop_func(), the update is
      immediate.
      
      The dynamic selftest fails because it hits this situation, and the
      ftrace_ops that it registers fails to only trace what it was suppose to
      and instead traces all other functions.
      
      The solution is to delay the setting of __ftrace_trace_function until
      after all the functions have been updated according to the registered
      ftrace_ops. Also, function_trace_stop is set during the update to prevent
      function tracing from calling code that is caused by the function tracer
      itself.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      6331c28c
    • Steven Rostedt's avatar
      ftrace: Update filter when tracing enabled in set_ftrace_filter() · 072126f4
      Steven Rostedt authored
      
      
      Currently, if set_ftrace_filter() is called when the ftrace_ops is
      active, the function filters will not be updated. They will only be updated
      when tracing is disabled and re-enabled.
      
      Update the functions immediately during set_ftrace_filter().
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      072126f4
    • Steven Rostedt's avatar
      ftrace: Balance records when updating the hash · 41fb61c2
      Steven Rostedt authored
      
      
      Whenever the hash of the ftrace_ops is updated, the record counts
      must be balance. This requires disabling the records that are set
      in the original hash, and then enabling the records that are set
      in the updated hash.
      
      Moving the update into ftrace_hash_move() removes the bug where the
      hash was updated but the records were not, which results in ftrace
      triggering a warning and disabling itself because the ftrace_ops filter
      is updated while the ftrace_ops was registered, and then the failure
      happens when the ftrace_ops is unregistered.
      
      The current code will not trigger this bug, but new code will.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      41fb61c2
  8. 08 Jul, 2011 1 commit
    • Steven Rostedt's avatar
      ftrace: Do not disable interrupts for modules in mcount update · 4376cac6
      Steven Rostedt authored
      
      
      When I mounted an NFS directory, it caused several modules to be loaded. At the
      time I was running the preemptirqsoff tracer, and it showed the following
      output:
      
      # tracer: preemptirqsoff
      #
      # preemptirqsoff latency trace v1.1.5 on 2.6.33.9-rt30-mrg-test
      # --------------------------------------------------------------------
      # latency: 1177 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
      #    -----------------
      #    | task: modprobe-19370 (uid:0 nice:0 policy:0 rt_prio:0)
      #    -----------------
      #  => started at: ftrace_module_notify
      #  => ended at:   ftrace_module_notify
      #
      #
      #                  _------=> CPU#
      #                 / _-----=> irqs-off
      #                | / _----=> need-resched
      #                || / _---=> hardirq/softirq
      #                ||| / _--=> preempt-depth
      #                |||| /_--=> lock-depth
      #                |||||/     delay
      #  cmd     pid   |||||| time  |   caller
      #     \   /      ||||||   \   |   /
      modprobe-19370   3d....    0us!: ftrace_process_locs <-ftrace_module_notify
      modprobe-19370   3d.... 1176us : ftrace_process_locs <-ftrace_module_notify
      modprobe-19370   3d.... 1178us : trace_hardirqs_on <-ftrace_module_notify
      modprobe-19370   3d.... 1178us : <stack trace>
       => ftrace_process_locs
       => ftrace_module_notify
       => notifier_call_chain
       => __blocking_notifier_call_chain
       => blocking_notifier_call_chain
       => sys_init_module
       => system_call_fastpath
      
      That's over 1ms that interrupts are disabled on a Real-Time kernel!
      
      Looking at the cause (being the ftrace author helped), I found that the
      interrupts are disabled before the code modification of mcounts into nops. The
      interrupts only need to be disabled on start up around this code, not when
      modules are being loaded.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      4376cac6
  9. 07 Jul, 2011 1 commit
    • Steven Rostedt's avatar
      ftrace: Fix regression of :mod:module function enabling · 43dd61c9
      Steven Rostedt authored
      
      
      The new code that allows different utilities to pick and choose
      what functions they trace broke the :mod: hook that allows users
      to trace only functions of a particular module.
      
      The reason is that the :mod: hook bypasses the hash that is setup
      to allow individual users to trace their own functions and uses
      the global hash directly. But if the global hash has not been
      set up, it will cause a bug:
      
      echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter
      
      produces:
      
       [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
       [drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip
       BUG: unable to handle kernel paging request at ffffffff8160ec90
       IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
       PGD 1a05067 PUD 1a09063 PMD 80000000016001e1
       Oops: 0003 [#1] SMP Jul  7 04:02:28 phyllis kernel: [55303.858604] CPU 1
       Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt
      
       Pid: 10344, comm: bash Tainted: G        WC  3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ
       RIP: 0010:[<ffffffff810d9136>]  [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
       RSP: 0018:ffff88003a96bda8  EFLAGS: 00010246
       RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940
       RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000
       R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78
       R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000
       FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470)
       Stack:
        0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5
        ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80
        ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00
       Call Trace:
        [<ffffffff810d92f5>] match_records+0x155/0x1b0
        [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100
        [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210
        [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20
        [<ffffffff81166e48>] vfs_write+0xc8/0x190
        [<ffffffff81167001>] sys_write+0x51/0x90
        [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b
       Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2
       RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
        RSP <ffff88003a96bda8>
       CR2: ffffffff8160ec90
       ---[ end trace a5d031828efdd88e ]---
      Reported-by: default avatarBrian Marete <marete@toshnix.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      43dd61c9
  10. 15 Jun, 2011 2 commits
  11. 07 Jun, 2011 2 commits
  12. 26 May, 2011 3 commits
    • Steven Rostedt's avatar
      ftrace: Add internal recursive checks · b1cff0ad
      Steven Rostedt authored
      Witold reported a reboot caused by the selftests of the dynamic function
      tracer. He sent me a config and I used ktest to do a config_bisect on it
      (as my config did not cause the crash). It pointed out that the problem
      config was CONFIG_PROVE_RCU.
      
      What happened was that if multiple callbacks are attached to the
      function tracer, we iterate a list of callbacks. Because the list is
      managed by synchronize_sched() and preempt_disable, the access to the
      pointers uses rcu_dereference_raw().
      
      When PROVE_RCU is enabled, the rcu_dereference_raw() calls some
      debugging functions, which happen to be traced. The tracing of the debug
      function would then call rcu_dereference_raw() which would then call the
      debug function and then... well you get the idea.
      
      I first wrote two different patches to solve this bug.
      
      1) add a __rcu_dereference_raw() that would not do any checks.
      2) add notrace to the offending debug functions.
      
      Both of these patches worked.
      
      Talking with Paul McKenney on IRC, he suggested to add recursion
      detection instead. This seemed to be a better solution, so I decided to
      implement it. As the task_struct already has a trace_recursion to detect
      recursion in the ring buffer, and that has a very small number it
      allows, I decided to use that same variable to add flags that can detect
      the recursion inside the infrastructure of the function tracer.
      
      I plan to change it so that the task struct bit can be checked in
      mcount, but as that requires changes to all archs, I will hold that off
      to the next merge window.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/1306348063.1465.116.camel@gandalf.stny.rr.com
      
      Reported-by: default avatarWitold Baryluk <baryluk@smp.if.uj.edu.pl>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      b1cff0ad
    • Steven Rostedt's avatar
      ftrace: Set ops->flag to enabled even on static function tracing · 3b6cfdb1
      Steven Rostedt authored
      
      
      When dynamic ftrace is not configured, the ops->flags still needs
      to have its FTRACE_OPS_FL_ENABLED bit set in ftrace_startup().
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      3b6cfdb1
    • Steven Rostedt's avatar
      ftrace: Have ftrace_startup() return failure code · a1cd6173
      Steven Rostedt authored
      
      
      The register_ftrace_function() returns an error code on failure
      except if the call to ftrace_startup() fails. Add a error return to
      ftrace_startup() if it fails to start, allowing register_ftrace_funtion()
      to return a proper error value.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      a1cd6173
  13. 18 May, 2011 9 commits
    • Steven Rostedt's avatar
      ftrace: Modify ftrace_set_filter/notrace to take ops · 936e074b
      Steven Rostedt authored
      
      
      Since users of the function tracer can now pick and choose which
      functions they want to trace agnostically from other users of the
      function tracer, we need to pass the ops struct to the ftrace_set_filter()
      functions.
      
      The functions ftrace_set_global_filter() and ftrace_set_global_notrace()
      is added to keep the old filter functions which are used to modify
      the generic function tracers.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      936e074b
    • Steven Rostedt's avatar
      ftrace: Allow dynamically allocated function tracers · cdbe61bf
      Steven Rostedt authored
      
      
      Now that functions may be selected individually, it only makes sense
      that we should allow dynamically allocated trace structures to
      be traced. This will allow perf to allocate a ftrace_ops structure
      at runtime and use it to pick and choose which functions that
      structure will trace.
      
      Note, a dynamically allocated ftrace_ops will always be called
      indirectly instead of being called directly from the mcount in
      entry.S. This is because there's no safe way to prevent mcount
      from being preempted before calling the function, unless we
      modify every entry.S to do so (not likely). Thus, dynamically allocated
      functions will now be called by the ftrace_ops_list_func() that
      loops through the ops that are allocated if there are more than
      one op allocated at a time. This loop is protected with a
      preempt_disable.
      
      To determine if an ftrace_ops structure is allocated or not, a new
      util function was added to the kernel/extable.c called
      core_kernel_data(), which returns 1 if the address is between
      _sdata and _edata.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      cdbe61bf
    • Steven Rostedt's avatar
      ftrace: Implement separate user function filtering · b848914c
      Steven Rostedt authored
      
      
      ftrace_ops that are registered to trace functions can now be
      agnostic to each other in respect to what functions they trace.
      Each ops has their own hash of the functions they want to trace
      and a hash to what they do not want to trace. A empty hash for
      the functions they want to trace denotes all functions should
      be traced that are not in the notrace hash.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      b848914c
    • Steven Rostedt's avatar
      ftrace: Free hash with call_rcu_sched() · 07fd5515
      Steven Rostedt authored
      
      
      When a hash is modified and might be in use, we need to perform
      a schedule RCU operation on it, as the hashes will soon be used
      directly in the function tracer callback.
      
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      07fd5515
    • Steven Rostedt's avatar
      ftrace: Have global_ops store the functions that are to be traced · 2b499381
      Steven Rostedt authored
      
      
      This is a step towards each ops structure defining its own set
      of functions to trace. As the current code with pid's and such
      are specific to the global_ops, it is restructured to be used
      with the global ops.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      2b499381
    • Steven Rostedt's avatar
      ftrace: Add ops parameter to ftrace_startup/shutdown functions · bd69c30b
      Steven Rostedt authored
      
      
      In order to allow different ops to enable different functions,
      the ftrace_startup() and ftrace_shutdown() functions need the
      ops parameter passed to them.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      bd69c30b
    • Steven Rostedt's avatar
      ftrace: Add enabled_functions file · 647bcd03
      Steven Rostedt authored
      
      
      Add the enabled_functions file that is used to show all the
      functions that have been enabled for tracing as well as their
      ref counts. This helps seeing if any function has been registered
      and what functions are being traced.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      647bcd03
    • Steven Rostedt's avatar
      ftrace: Use counters to enable functions to trace · ed926f9b
      Steven Rostedt authored
      
      
      Every function has its own record that stores the instruction
      pointer and flags for the function to be traced. There are only
      two flags: enabled and free. The enabled flag states that tracing
      for the function has been enabled (actively traced), and the free
      flag states that the record no longer points to a function and can
      be used by new functions (loaded modules).
      
      These flags are now moved to the MSB of the flags (actually just
      the top 32bits). The rest of the bits (30 bits) are now used as
      a ref counter. Everytime a tracer register functions to trace,
      those functions will have its counter incremented.
      
      When tracing is enabled, to determine if a function should be traced,
      the counter is examined, and if it is non-zero it is set to trace.
      
      When a ftrace_ops is registered to trace functions, its hashes
      are examined. If the ftrace_ops filter_hash count is zero, then
      all functions are set to be traced, otherwise only the functions
      in the hash are to be traced. The exception to this is if a function
      is also in the ftrace_ops notrace_hash. Then that function's counter
      is not incremented for this ftrace_ops.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      ed926f9b
    • Steven Rostedt's avatar
      ftrace: Separate hash allocation and assignment · 33dc9b12
      Steven Rostedt authored
      
      
      When filtering, allocate a hash to insert the function records.
      After the filtering is complete, assign it to the ftrace_ops structure.
      
      This allows the ftrace_ops structure to have a much smaller array of
      hash buckets instead of wasting a lot of memory.
      
      A read only empty_hash is created to be the minimum size that any ftrace_ops
      can point to.
      
      When a new hash is created, it has the following steps:
      
      o Allocate a default hash.
      o Walk the function records assigning the filtered records to the hash
      o Allocate a new hash with the appropriate size buckets
      o Move the entries from the default hash to the new hash.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      33dc9b12