1. 08 Dec, 2008 1 commit
  2. 05 Dec, 2008 4 commits
  3. 04 Dec, 2008 8 commits
    • tracing/function-graph-tracer: handle ftrace_printk entries · 1fd8f2a3
      Frederic Weisbecker authored
      
      
      Handle the TRACE_PRINT entries from the function graph tracer
      and output them as a C comment just below the function that called
      ftrace_printk, as if it were a comment inside that function.
      
      Example with an ftrace_printk inside might_sleep() function:
      
      void __might_sleep(char *file, int line)
      {
      	static unsigned long prev_jiffy;	/* ratelimiting */
      
      	ftrace_printk("Hi I'm a comment in might_sleep() :-)");
      
      A chunk of a resulting trace:
      
       0)               |        _reiserfs_free_block() {
       0)               |          reiserfs_read_bitmap_block() {
       0)               |            __bread() {
       0)               |              __getblk() {
       0)               |                __find_get_block() {
       0)   0.698 us    |                  mark_page_accessed();
       0)   2.267 us    |                }
       0)               |                __might_sleep() {
       0)               |                  /* Hi I'm a comment in might_sleep() :-) */
       0)   1.321 us    |                }
       0)   5.872 us    |              }
       0)   7.313 us    |            }
       0)   8.718 us    |          }
      
      And this patch brings two minor fixes:
      
      - The newline after a switch-out task has disappeared
      - The "|" sign just before the cpu number on task-switch has been deleted.
      
       0)   0.616 us    |                pick_next_task_rt();
       0)   1.457 us    |                _spin_trylock();
       0)   0.653 us    |                _spin_unlock();
       0)   0.728 us    |                _spin_trylock();
       0)   0.631 us    |                _spin_unlock();
       0)   0.729 us    |                native_load_sp0();
       0)   0.593 us    |                native_load_tls();
       ------------------------------------------
       0)    cat-2834    =>   migrati-3
       ------------------------------------------
      
       0)               |    finish_task_switch() {
       0)   0.841 us    |      _spin_unlock_irq();
       0)   0.616 us    |      post_schedule_rt();
       0)   3.882 us    |    }
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: avoid duplicated function when writing set_graph_function · faec2ec5
      Liming Wang authored
      
      
      Impact: fix a bug in function filter setting
      
      When writing a function to set_graph_function, we should check whether it
      already exists in set_graph_function to avoid adding a duplicate.
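
A minimal user-space sketch of such a duplicate check, not the kernel's implementation. The names `graph_filter`, `graph_filter_add` and `GRAPH_FILTER_MAX` are hypothetical; the 32-entry limit is the one stated in the commit that introduced set_graph_function.

```c
#include <assert.h>
#include <stddef.h>

#define GRAPH_FILTER_MAX 32     /* hypothetical name; 32 is the documented limit */

struct graph_filter {
	unsigned long funcs[GRAPH_FILTER_MAX];  /* addresses of filtered functions */
	size_t count;
};

/*
 * Add addr to the filter.
 * Returns 1 if added, 0 if it was already present, -1 if the list is full.
 */
int graph_filter_add(struct graph_filter *f, unsigned long addr)
{
	size_t i;

	assert(f != NULL);

	for (i = 0; i < f->count; i++)
		if (f->funcs[i] == addr)
			return 0;       /* duplicate: leave the list unchanged */

	if (f->count >= GRAPH_FILTER_MAX)
		return -1;

	f->funcs[f->count++] = addr;
	return 1;
}
```

Writing the same function twice then leaves a single entry instead of consuming two of the 32 slots.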
      Signed-off-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing: fix typo and missing inline function · 6b253930
      Ingo Molnar authored
      
      
      Impact: fix build bugs
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: add ability to only trace swapper tasks · e32d8956
      Steven Rostedt authored
      
      
      Impact: new feature
      
      This patch lets the swapper tasks of all CPUs be filtered by the
      set_ftrace_pid file.

      If '0' is echoed into this file, then all the idle tasks (aka swapper)
      are flagged to be traced.  This affects all CPU idle tasks.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: use struct pid · 978f3a45
      Steven Rostedt authored
      
      
      Impact: clean up, extend PID filtering to PID namespaces
      
      Eric Biederman suggested using the struct pid for filtering on
      pids in the kernel. This patch is based off of a demonstration
      of an implementation that Eric sent me in an email.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: trace single pid for function graph tracer · 804a6851
      Steven Rostedt authored
      
      
      Impact: New feature
      
      This patch makes the changes to set_ftrace_pid apply to the function
      graph tracer.
      
        # echo $$ > /debugfs/tracing/set_ftrace_pid
        # echo function_graph > /debugfs/tracing/current_tracer
      
      Will cause only the current task to be traced. Note, the trace flags are
      also inherited by child processes, so the children of the shell
      will also be traced.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: use task struct trace flag to filter on pid · 0ef8cde5
      Steven Rostedt authored
      
      
      Impact: clean up
      
      Use the new task struct trace flags to determine if a process should be
      traced or not.
      
      Note: this moves the searching of the pid to the slow path of setting
      the pid field. This needs to be converted to the pid name space.
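
A rough sketch of the slow-path/fast-path split described above, using a simplified stand-in for the task struct. `struct task`, `trace_flags`, `TRACE_FL_TRACED`, `set_traced_pid` and `task_is_traced` are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_FL_TRACED 0x1UL      /* hypothetical flag bit */

struct task {                      /* stand-in for the kernel's task struct */
	int pid;
	unsigned long trace_flags; /* per-task trace flags */
};

/* Slow path: runs when the pid filter is written. Walks every task once. */
void set_traced_pid(struct task *tasks, size_t n, int pid)
{
	size_t i;

	assert(tasks != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (tasks[i].pid == pid)
			tasks[i].trace_flags |= TRACE_FL_TRACED;
		else
			tasks[i].trace_flags &= ~TRACE_FL_TRACED;
	}
}

/* Fast path: runs on every traced function call. A single flag test. */
int task_is_traced(const struct task *t)
{
	return (t->trace_flags & TRACE_FL_TRACED) != 0;
}
```

The pid search happens only when the filter changes, so the per-call cost is reduced to one bit test.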
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: graph of a single function · ea4e2bc4
      Steven Rostedt authored
      
      
      This patch adds the file:
      
         /debugfs/tracing/set_graph_function
      
      which can be used along with the function graph tracer.
      
      When this file is empty, the function graph tracer will act as
      usual. When the file has a function in it, the function graph
      tracer will only trace that function.
      
      For example:
      
       # echo blk_unplug > /debugfs/tracing/set_graph_function
       # cat /debugfs/tracing/trace
       [...]
       ------------------------------------------
       | 2)  make-19003  =>  kjournald-2219
       ------------------------------------------
      
       2)               |  blk_unplug() {
       2)               |    dm_unplug_all() {
       2)               |      dm_get_table() {
       2)      1.381 us |        _read_lock();
       2)      0.911 us |        dm_table_get();
       2)      1. 76 us |        _read_unlock();
       2) +   12.912 us |      }
       2)               |      dm_table_unplug_all() {
       2)               |        blk_unplug() {
       2)      0.778 us |          generic_unplug_device();
       2)      2.409 us |        }
       2)      5.992 us |      }
       2)      0.813 us |      dm_table_put();
       2) +   29. 90 us |    }
       2) +   34.532 us |  }
      
      You can add up to 32 functions into this file. Currently we limit it
      to 32, but this may change with later improvements.
      
      To add another function, use the append '>>':
      
        # echo sys_read >> /debugfs/tracing/set_graph_function
        # cat /debugfs/tracing/set_graph_function
        blk_unplug
        sys_read
      
      Using the '>' will clear out the function and write anew:
      
        # echo sys_write > /debugfs/tracing/set_graph_function
        # cat /debugfs/tracing/set_graph_function
        sys_write
      
      Note, if you have function graph running while doing this, the small
      time between clearing it and updating it will cause the graph to
      record all functions. This should not be an issue because after
      it sets the filter, only those functions will be recorded from then on.
      If you need to only record a particular function then set this
      file first before starting the function graph tracer. In the future
      this side effect may be corrected.
      
      The set_graph_function file is similar to the set_ftrace_filter but
      it does not take wild cards nor does it allow for more than one
      function to be set with a single write. There is no technical reason why
      this is the case, I just do not have the time yet to implement that.
      
      Note, dynamic ftrace must be enabled for this to appear because it
      uses the dynamic ftrace records to match the name to the mcount
      call sites.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 03 Dec, 2008 11 commits
    • ftrace: fix race in function graph during fork · e8e1abe9
      Steven Rostedt authored
      
      
      Impact: graph tracer race/crash fix
      
      There is a nasty race in startup of a new process running the
      function graph tracer. In fork.c:
      
      	total_forks++;
      	spin_unlock(&current->sighand->siglock);
      	write_unlock_irq(&tasklist_lock);
      	ftrace_graph_init_task(p);
      	proc_fork_connector(p);
      	cgroup_post_fork(p);
      	return p;
      
      The new task is free to run as soon as the tasklist_lock is released.
      This is before the ftrace_graph_init_task. If the task does run
      it will be using the same ret_stack and curr_ret_stack as the parent.
      This will cause crashes that are difficult to debug.
      
      This patch moves the ftrace_graph_init_task to just after the alloc_pid
      code. This fixes the above race.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • trace: fix output of stack trace · 0a37119d
      Steven Rostedt authored
      
      
      Impact: fix to output of stack trace
      
      If a function is not found in the stack of the stack tracer, the
      number printed is quite strange. This fixes the algorithm to handle
      missing functions better.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/function-graph-tracer: enabled by default · 764f3b95
      Ingo Molnar authored
      
      
      CONFIG_FUNCTION_GRAPH_TRACER already depends on FUNCTION_TRACER
      (which makes it non-default), so making it default-n is pointless.
      
      So enable it by default - it's a nice extension of the function tracer.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/function-graph-tracer: improve duration output · 166d3c79
      Frederic Weisbecker authored
      
      
      Impact: better trace output of duration for long calls
      
      The old duration output could not exceed 9999.999 us to fit the column,
      and the nanosecs were always 3 digits. As Ingo suggested, it's better
      to show the whole elapsed microseconds and reduce the nanosecs precision
      if needed to fit the maximum of 7 digits. If the usecs need more digits
      than that, the case should be rare and important enough to break the
      column alignment a bit to show it.

      So, depending on the duration value, we now have these patterns:
      
          u.nnn us
         uu.nnn us
        uuu.nnn us
       uuuu.nnn us
       uuuuu.nn us
       uuuuuu.n us
       uuuuuuuu..... us
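
The patterns above can be sketched in user-space C as follows. This is an illustration under assumptions, not the tracer's code: `format_duration` is a hypothetical name, extra nanosecond digits are truncated rather than rounded, and column padding is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a duration in nanoseconds as "<usecs>.<fraction> us", printing at
 * most 7 digits in total. Extra fraction digits are truncated, not rounded.
 */
int format_duration(unsigned long long ns, char *out, size_t outlen)
{
	char usecs[24];                 /* enough for any 64-bit value */
	char frac[16];                  /* roomy buffer for the 3 nanosec digits */
	size_t ulen, keep;

	assert(out != NULL && outlen > 0);

	snprintf(usecs, sizeof(usecs), "%llu", ns / 1000);
	ulen = strlen(usecs);

	if (ulen >= 7)                  /* no room left for a fraction */
		return snprintf(out, outlen, "%s us", usecs);

	snprintf(frac, sizeof(frac), "%03u", (unsigned int)(ns % 1000));
	keep = 7 - ulen;                /* digits still available */
	if (keep > 3)
		keep = 3;
	frac[keep] = '\0';              /* drop the least significant digits */

	return snprintf(out, outlen, "%s.%s us", usecs, frac);
}
```

For instance, 1321 ns comes out as "1.321 us", while 501972605 ns comes out as "501972.6 us".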
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/function-graph-tracer: display unified style cmdline and pid · 11e84acc
      Frederic Weisbecker authored
      
      
      Impact: extend function-graph output: let one know which thread called a function
      
      This patch implements a helper function to print the cmdline/pid pair.
      Its output is provided during task switching and on each row if the new
      "funcgraph-proc" default-off option is set through the trace_options file.

      The output is center aligned and never exceeds 14 characters. The cmdline
      is truncated to 7 characters.
      But note that if the pid exceeds 6 characters, the column will overflow (but
      the situation is abnormal).
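
A small sketch of the centring rule, assuming a 14-character column and a 7-character cmdline cut as described. `format_proc`, `PROCINFO_WIDTH` and `CMDLINE_MAX` are hypothetical names, and placing the odd leftover space on the right is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PROCINFO_WIDTH 14   /* column width from the commit message */
#define CMDLINE_MAX     7   /* cmdline is cut to this many characters */

/* Write "cmdline-pid", centred in a PROCINFO_WIDTH column, into out. */
void format_proc(const char *cmdline, int pid, char *out, size_t outlen)
{
	char cell[32];
	size_t len, spaces = 0, left;

	assert(cmdline != NULL && out != NULL);

	snprintf(cell, sizeof(cell), "%.*s-%d", CMDLINE_MAX, cmdline, pid);
	len = strlen(cell);
	if (len < PROCINFO_WIDTH)
		spaces = PROCINFO_WIDTH - len;
	left = spaces / 2;          /* any odd space goes to the right */

	snprintf(out, outlen, "%*s%s%*s", (int)left, "", cell,
		 (int)(spaces - left), "");
}
```

With these assumptions, "migration/0" with pid 3 is rendered as "migrati-3" padded to 14 columns. A pid longer than 6 digits makes the cell wider than the column, which is the overflow case noted above.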
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: function graph return for function entry · e49dc19c
      Steven Rostedt authored
      
      
      Impact: feature, let entry function decide to trace or not
      
      This patch lets the graph tracer entry function decide if the tracing
      should be done at the end as well. This requires every function graph
      entry function to return 1 if the return should be traced, or 0 if it should
      not be traced.
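
To illustrate the contract, here is a simplified user-space sketch in which the return is only hooked when the entry handler returns nonzero. All names (`graph_entry_fn`, `ret_stack`, `graph_enter`, `trace_even_only`, `RET_STACK_DEPTH`) are hypothetical and do not reflect the kernel's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

#define RET_STACK_DEPTH 50      /* hypothetical depth for this sketch */

typedef int (*graph_entry_fn)(unsigned long func);   /* 1 = trace, 0 = skip */

struct ret_stack {
	unsigned long funcs[RET_STACK_DEPTH];
	int depth;
};

/*
 * Called on function entry. The return is only hooked (pushed on the
 * return stack) when the entry handler asks for it.
 * Returns 1 if the return will be traced, 0 otherwise.
 */
int graph_enter(struct ret_stack *rs, graph_entry_fn entry, unsigned long func)
{
	assert(rs != NULL && entry != NULL);

	if (rs->depth >= RET_STACK_DEPTH)
		return 0;               /* no room: do not trace */
	if (!entry(func))
		return 0;               /* handler declined: skip the return too */

	rs->funcs[rs->depth++] = func;
	return 1;
}

/* Example handler: trace only even "addresses". */
int trace_even_only(unsigned long func)
{
	return (func % 2) == 0;
}
```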
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: change "page" variable names to "bpage" · 044fa782
      Steven Rostedt authored
      
      
      Impact: clean up
      
      Andrew Morton pointed out that, by kernel convention, a variable
      named "page" should be of type struct page. The ring buffer uses
      a variable named "page" for a pointer to something else.
      
      This patch converts those to be called "bpage" (as in "buffer page").
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: add ftrace_graph_stop() · 14a866c5
      Steven Rostedt authored
      
      
      Impact: new ftrace_graph_stop function
      
      While developing more features of function graph, I hit a bug that
      caused the WARN_ON to trigger in the prepare_ftrace_return function.
      Well, it was hard for me to find out what was happening because the
      warning would not print; it would just cause a hard lockup or reboot.
      The reason is that it is not safe to call printk from this function.
      
      Looking further, I also found that it calls unregister_ftrace_graph,
      which grabs a mutex and calls kstop machine. This would definitely
      lock the box up if it were to trigger.
      
      This patch adds a fast and safe ftrace_graph_stop() which will
      stop the function tracer. Then it is safe to call the WARN_ON.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: read page interface · 8789a9e7
      Steven Rostedt authored
      
      
      Impact: new API to ring buffer
      
      This patch adds a new interface into the ring buffer that allows a
      page to be read from the ring buffer on a given CPU. For every page
      read, one must also be given to allow for a "swap" of the pages.
      
       rpage = ring_buffer_alloc_read_page(buffer);
       if (!rpage)
      	goto err;
       ret = ring_buffer_read_page(buffer, &rpage, cpu, full);
       if (!ret)
      	goto empty;
       process_page(rpage);
       ring_buffer_free_read_page(rpage);
      
      The caller of these functions must handle any waits that are
      needed to wait for new data. The ring_buffer_read_page will simply
      return 0 if there is no data, or if "full" is set and the writer
      is still on the current page.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ring-buffer: move some metadata into buffer page · abc9b56d
      Steven Rostedt authored
      
      
      Impact: get ready for splice changes
      
      This patch moves the commit and timestamp into the beginning of each
      data page of the buffer. This change will allow the page to be moved
      to another location (disk, network, etc) and still have information
      in the page to be able to read it.
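
The idea can be pictured with a self-describing page layout like the one below. This is a sketch under assumptions: the struct and field names are hypothetical, a 4096-byte page is assumed, and plain 64-bit integers stand in for whatever types the kernel uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096         /* assumed page size for this sketch */

/* Header stored at the start of every data page. Field names are hypothetical. */
struct data_page_hdr {
	uint64_t time_stamp;    /* time stamp of the page */
	uint64_t commit;        /* number of payload bytes committed so far */
};

struct data_page {
	struct data_page_hdr hdr;
	unsigned char data[PAGE_BYTES - sizeof(struct data_page_hdr)];
};

/* Return how many payload bytes a reader may consume from a page copy. */
size_t data_page_readable(const struct data_page *page)
{
	assert(page != NULL);

	if (page->hdr.commit > sizeof(page->data))
		return 0;       /* corrupt header: report nothing readable */
	return (size_t)page->hdr.commit;
}
```

Because the header travels with the payload, a copy of the page written to disk or sent over the network can still be decoded without any external bookkeeping.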
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • ftrace: replace raw_local_irq_save with local_irq_save · a5e25883
      Steven Rostedt authored
      
      
      Impact: fix for lockdep and ftrace
      
      The raw_local_irq_save/restore confuses lockdep. This patch
      converts them to the local_irq_save/restore variants.
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 02 Dec, 2008 4 commits
    • tracing/function-graph-tracer: support for x86-64 · 48d68b20
      Frederic Weisbecker authored
      
      
      Impact: extend and enable the function graph tracer to 64-bit x86
      
      This patch implements the support for function graph tracer under x86-64.
      Both static and dynamic tracing are supported.
      
      This causes some small CPP conditional asm in arch/x86/kernel/ftrace.c. I
      wanted to use probe_kernel_read/write to make the return address
      saving/patching code more generic, but it causes tracing recursion.

      It would perhaps be useful to implement a notrace version of these
      functions for ports to other archs.

      Note that arch/x86/process_64.c is not traced, as on x86-32. I first
      thought __switch_to() was responsible for crashes during tracing because I
      believed the current task was changed inside, but that's actually not the
      case (actually yes, but not the "current" pointer).

      So I will have to investigate to find the functions that cause harm here, to
      enable tracing of the other functions inside (but there is no issue at
      this time, while process_64.c stays out of -pg flags).

      A small possible race condition is fixed inside this patch too. When the
      tracer allocates a return stack dynamically, the current depth is not
      initialized before but after. An interrupt could occur at this time and,
      after seeing that the return stack is allocated, the tracer could try to
      trace it with a random uninitialized depth. This is a precaution, even
      though I haven't had problems with it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tim Bird <tim.bird@am.sony.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • function trace: fix a bug of single thread function trace · 66eafebc
      Liming Wang authored
      
      
      Impact: fix "no output from tracer" bug caused by ftrace_update_pid_func()
      
      When disabling single thread function trace using
      "echo -1 > set_ftrace_pid", the normal function trace
      has to be restored to the original function; otherwise the normal
      function trace will not work well.

      Without this commit, something like the following happens:
      
      	$ ps |grep 850
      	  850 root      2556 S    -/bin/sh
      	$ echo 850 > /debug/tracing/set_ftrace_pid
      	$ echo function > /debug/tracing/current_tracer
      	$ echo 1 > /debug/tracing/tracing_enabled
      	$ sleep 1
      	$ echo 0 > /debug/tracing/tracing_enabled
      	$ cat /debug/tracing/trace_pipe |wc -l
      	59704
      	$ echo -1 > /debug/tracing/set_ftrace_pid
      	$ echo 1 > /debug/tracing/tracing_enabled
      	$ sleep 1
      	$ echo 0 > /debug/tracing/tracing_enabled
      	$ more /debug/tracing/trace_pipe
      		<====== nothing output now!
      			it should output trace record.
      Signed-off-by: Liming Wang <liming.wang@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • taint: add missing comment · a8005992
      Arjan van de Ven authored
      
      
      The description for 'D' was missing in the comment...  (causing me a
      minute of WTF followed by looking at more of the code)
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • epoll: introduce resource usage limits · 7ef9964e
      Davide Libenzi authored
      
      
      It has been thought that the per-user file descriptors limit would also
      limit the resources that a normal user can request via the epoll
      interface.  Vegard Nossum reported a very simple program (a modified
      version attached) that lets a normal user request a pretty large
      amount of kernel memory, well within its maximum number of fds.  To
      solve this problem, default limits are now imposed, and /proc based
      configuration has been introduced.  A new directory has been created,
      named /proc/sys/fs/epoll/, and inside it there are two configuration
      points:
      
        max_user_instances = Maximum number of devices - per user
      
        max_user_watches   = Maximum number of "watched" fds - per user
      
      The current default for "max_user_watches" limits the memory used by epoll
      to store "watches" to 1/32 of the amount of low RAM.  As an example, a
      256MB 32-bit machine will have "max_user_watches" set to roughly 90000.
      That should be enough to not break existing heavy epoll users.  The
      default value for "max_user_instances" is set to 128, that should be
      enough too.
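
The arithmetic behind the "roughly 90000" figure can be sketched as below. The per-watch cost is an assumption: about 93 bytes is simply what the numbers in this message imply (256MB / 32 / 90000), not a value taken from the kernel source, and `estimate_max_user_watches` is a hypothetical helper.

```c
#include <assert.h>

/*
 * Estimate the default max_user_watches: 1/32 of low RAM divided by the
 * memory cost of one watch. cost_per_watch is an assumed input here.
 */
unsigned long estimate_max_user_watches(unsigned long low_ram_bytes,
					unsigned long cost_per_watch)
{
	assert(cost_per_watch > 0);

	return (low_ram_bytes / 32) / cost_per_watch;
}
```

With 256MB of low RAM, 1/32 is 8MB; at an assumed 93 bytes per watch that allows about 90200 watches, in line with the figure quoted above.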
      
      This also changes the userspace, because a new error code can now come out
      from EPOLL_CTL_ADD (-ENOSPC).  The EMFILE from epoll_create() was already
      listed, so that should be ok.
      
      [akpm@linux-foundation.org: use get_current_user()]
      Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: <stable@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Reported-by: Vegard Nossum <vegardno@ifi.uio.no>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  6. 30 Nov, 2008 2 commits
  7. 29 Nov, 2008 3 commits
    • sched: prevent divide by zero error in cpu_avg_load_per_task, update · af6d596f
      Ingo Molnar authored
      Regarding the bug addressed in:

        4cd42620: sched: prevent divide by zero error in cpu_avg_load_per_task
      
      Linus points out that the fix is not complete:
      
      > There's nothing that keeps gcc from deciding not to reload
      > rq->nr_running.
      >
      > Of course, in _practice_, I don't think gcc ever will (if it decides
      > that it will spill, gcc is likely going to decide that it will
      > literally spill the local variable to the stack rather than decide to
      > reload off the pointer), but it's a valid compiler optimization, and
      > it even has a name (rematerialization).
      >
      > So I suspect that your patch does fix the bug, but it still leaves the
      > fairly unlikely _potential_ for it to re-appear at some point.
      >
      > We have ACCESS_ONCE() as a macro to guarantee that the compiler
      > doesn't rematerialize a pointer access. That also would clarify
      > the fact that we access something unsafe outside a lock.
      
      So make sure our nr_running value is immutable and cannot change
      after we check it for nonzero.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • sched, cpusets: fix warning in kernel/cpuset.c · 1583715d
      Ingo Molnar authored
      
      
      this warning:
      
        kernel/cpuset.c: In function ‘generate_sched_domains’:
        kernel/cpuset.c:588: warning: ‘ndoms’ may be used uninitialized in this function
      
      triggers because GCC does not recognize that ndoms stays uninitialized
      only if doms is NULL - but that flow is covered at the end of
      generate_sched_domains().
      
      Help out GCC by initializing this variable to 0. (that's prudent anyway)
      
      Also, this function needs a splitup and code flow simplification:
      with 160 lines length it's clearly too long.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/branch-tracer: include missing irqflags.h · 65c6dc6a
      Frederic Weisbecker authored
      
      
      Impact: fix build error on branch tracer
      
      This should fix a build error reported on alpha in linux-next:
      
       CC      kernel/trace/trace_branch.o
        kernel/trace/trace_branch.c: In function 'probe_likely_condition':
        kernel/trace/trace_branch.c:44: error: implicit declaration of function 'raw_local_irq_save'
        kernel/trace/trace_branch.c:76: error: implicit declaration of function 'raw_local_irq_restore'
      
      Unfortunately, I can't test it since I don't have any Alpha build environment.
      Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  8. 28 Nov, 2008 4 commits
    • ftrace: improve seq_operation of ftrace · 50cdaf08
      Liming Wang authored
      
      
      Impact: make ftrace position computing more sane
      
      First remove the useless ->pos field. Then we needn't check seq_printf
      in .show, like other places.
      Signed-off-by: Liming Wang <liming.wang@windriver.com>
      Reviewed-by: Bruce Ashfield <bruce.ashfield@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing, alpha: fix build: add missing #ifdef CONFIG_STACKTRACE · c7425acb
      Török Edwin authored
      
      
      There are architectures that still have no stacktrace support.
      Signed-off-by: Török Edwin <edwintorok@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/function-graph-tracer: more output tweaks · d51090b3
      Ingo Molnar authored
      
      
      Impact: prettify the output some more
      
      Before:
      
      0)           |     sys_read() {
      0)      0.796 us |   fget_light();
      0)           |       vfs_read() {
      0)           |         rw_verify_area() {
      0)           |           security_file_permission() {
      ------------8<---------- thread sshd-1755 ------------8<----------
      
      After:
      
       0)               |  sys_read() {
       0)      0.796 us |    fget_light();
       0)               |    vfs_read() {
       0)               |      rw_verify_area() {
       0)               |        security_file_permission() {
       ------------------------------------------
       | 1)  migration/0--1  =>  sshd-1755
       ------------------------------------------
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • tracing/function-graph-tracer: adjustments of the trace informations · 1a056155
      Frederic Weisbecker authored
      
      
      Impact: increase the visual qualities of the call-graph-tracer output
      
      This patch applies various trace output formatting changes:
      
       - CPU is now a decimal number, followed by a parenthesis.

       - Overhead is now in the second column (gives good visibility)

       - Cost is now in the third column, can't exceed 9999.99 us. It is
         followed by a virtual line based on a "|" character.

       - Function calls are now the last column on the right. This way, there
         is no dynamic column (whose flow is harder to follow) to their right.
      
       - CPU and Overhead have their own option flag. They are default-on but you
         can disable them easily:
      
            echo nofuncgraph-cpu > trace_options
            echo nofuncgraph-overhead > trace_options
      
      TODO:
      
      _ Refactoring of the thread switch output.
      _ Give a default-off option to output the thread and its pid on each row.
      _ Provide headers
      _ ....
      
      Here is an example of the new trace style:
      
      0)           |             mutex_unlock() {
      0)      0.639 us |           __mutex_unlock_slowpath();
      0)      1.607 us |         }
      0)           |             remove_wait_queue() {
      0)      0.616 us |           _spin_lock_irqsave();
      0)      0.616 us |           _spin_unlock_irqrestore();
      0)      2.779 us |         }
      0)      0.495 us |         n_tty_set_room();
      0) ! 9999.999 us |       }
      0)           |           tty_ldisc_deref() {
      0)      0.615 us |         _spin_lock_irqsave();
      0)      0.616 us |         _spin_unlock_irqrestore();
      0)      2.793 us |       }
      0)           |           current_fs_time() {
      0)      0.488 us |         current_kernel_time();
      0)      0.495 us |         timespec_trunc();
      0)      2.486 us |       }
      0) ! 9999.999 us |     }
      0) ! 9999.999 us |   }
      0) ! 9999.999 us | }
      0)           |     sys_read() {
      0)      0.796 us |   fget_light();
      0)           |       vfs_read() {
      0)           |         rw_verify_area() {
      0)           |           security_file_permission() {
      0)      0.488 us |         cap_file_permission();
      0)      1.720 us |       }
      0)      3.  4 us |     }
      0)           |         tty_read() {
      0)      0.488 us |       tty_paranoia_check();
      0)           |           tty_ldisc_ref_wait() {
      0)           |             tty_ldisc_try() {
      0)      0.615 us |           _spin_lock_irqsave();
      0)      0.615 us |           _spin_unlock_irqrestore();
      0)      5.436 us |         }
      0)      6.427 us |       }
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 27 Nov, 2008 3 commits
    • tracing/function-graph-tracer: enhancements for the trace output · 83a8df61
      Frederic Weisbecker authored
      
      
      Impact: enhance the output of the graph-tracer
      
      This patch applies some ideas of Ingo Molnar and Steven Rostedt.
      
      * Output leaf functions in one line with parenthesis, semicolon and duration
        output.
      
      * Add a second column (after cpu) for an overhead sign.
        if duration > 100 us, "!"
        if duration > 10 us, "+"
        else " "
      
      * Print output in us with remaining nanosec: u.n
      
      * Print duration on the right end, following the indentation of the functions.
        Use also visual clues: "-" on entry call (no duration to output) and "+" on
        return (duration output).
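
The overhead sign rule from the list above can be written as a small helper. This is a sketch with a hypothetical function name; it assumes the duration is available in nanoseconds and that the thresholds are strict, as the ">" in the list suggests.

```c
#include <assert.h>

/* Return the overhead marker for a duration given in nanoseconds. */
char overhead_sign(unsigned long long duration_ns)
{
	if (duration_ns > 100000ULL)    /* more than 100 us */
		return '!';
	if (duration_ns > 10000ULL)     /* more than 10 us */
		return '+';
	return ' ';
}
```

This matches the sample output of this commit, where a 14.655 us call is flagged with "+" and a 501972.605 us call with "!".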
      
      The name of the tracer has been fixed as well: function-branch becomes
      function_branch.
      
      Here is an example of the new output:
      
      CPU[000]           dequeue_entity() {                    -
      CPU[000]             update_curr() {                    -
      CPU[000]               update_min_vruntime();                    + 0.512 us
      CPU[000]             }                                + 1.504 us
      CPU[000]             clear_buddies();                    + 0.481 us
      CPU[000]             update_min_vruntime();                    + 0.504 us
      CPU[000]           }                                + 4.557 us
      CPU[000]           hrtick_update() {                    -
      CPU[000]             hrtick_start_fair();                    + 0.489 us
      CPU[000]           }                                + 1.443 us
      CPU[000] +       }                                + 14.655 us
      CPU[000] +     }                                + 15.678 us
      CPU[000] +   }                                + 16.686 us
      CPU[000]     msecs_to_jiffies();                    + 0.481 us
      CPU[000]     put_prev_task_fair();                    + 0.504 us
      CPU[000]     pick_next_task_fair();                    + 0.482 us
      CPU[000]     pick_next_task_rt();                    + 0.504 us
      CPU[000]     pick_next_task_fair();                    + 0.481 us
      CPU[000]     pick_next_task_idle();                    + 0.489 us
      CPU[000]     _spin_trylock();                    + 0.655 us
      CPU[000]     _spin_unlock();                    + 0.609 us
      
      CPU[000]  ------------8<---------- thread bash-2794 ------------8<----------
      
      CPU[000]               finish_task_switch() {                    -
      CPU[000]                 _spin_unlock_irq();                    + 0.722 us
      CPU[000]               }                                + 2.369 us
      CPU[000] !           }                                + 501972.605 us
      CPU[000] !         }                                + 501973.763 us
      CPU[000]           copy_from_read_buf() {                    -
      CPU[000]             _spin_lock_irqsave();                    + 0.670 us
      CPU[000]             _spin_unlock_irqrestore();                    + 0.699 us
      CPU[000]             copy_to_user() {                    -
      CPU[000]               might_fault() {                    -
      CPU[000]                 __might_sleep();                    + 0.503 us
      CPU[000]               }                                + 1.632 us
      CPU[000]               __copy_to_user_ll();                    + 0.542 us
      CPU[000]             }                                + 3.858 us
      CPU[000]             tty_audit_add_data() {                    -
      CPU[000]               _spin_lock_irq();                    + 0.609 us
      CPU[000]               _spin_unlock_irq();                    + 0.624 us
      CPU[000]             }                                + 3.196 us
      CPU[000]             _spin_lock_irqsave();                    + 0.624 us
      CPU[000]             _spin_unlock_irqrestore();                    + 0.625 us
      CPU[000] +         }                                + 13.611 us
      CPU[000]           copy_from_read_buf() {                    -
      CPU[000]             _spin_lock_irqsave();                    + 0.624 us
      CPU[000]             _spin_unlock_irqrestore();                    + 0.616 us
      CPU[000]           }                                + 2.820 us
      CPU[000]
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      83a8df61
    • Steven Rostedt's avatar
      sched: prevent divide by zero error in cpu_avg_load_per_task · 4cd42620
      Steven Rostedt authored
      
      
      Impact: fix divide by zero crash in scheduler rebalance irq
      
      While testing the branch profiler, I hit this crash:
      
      divide error: 0000 [#1] PREEMPT SMP
      [...]
      RIP: 0010:[<ffffffff8024a008>]  [<ffffffff8024a008>] cpu_avg_load_per_task+0x50/0x7f
      [...]
      Call Trace:
       <IRQ> <0> [<ffffffff8024fd43>] find_busiest_group+0x3e5/0xcaa
       [<ffffffff8025da75>] rebalance_domains+0x2da/0xa21
       [<ffffffff80478769>] ? find_next_bit+0x1b2/0x1e6
       [<ffffffff8025e2ce>] run_rebalance_domains+0x112/0x19f
       [<ffffffff8026d7c2>] __do_softirq+0xa8/0x232
       [<ffffffff8020ea7c>] call_softirq+0x1c/0x3e
       [<ffffffff8021047a>] do_softirq+0x94/0x1cd
       [<ffffffff8026d5eb>] irq_exit+0x6b/0x10e
       [<ffffffff8022e6ec>] smp_apic_timer_interrupt+0xd3/0xff
       [<ffffffff8020e4b3>] apic_timer_interrupt+0x13/0x20
      
      The code for cpu_avg_load_per_task has:
      
      	if (rq->nr_running)
      		rq->avg_load_per_task = rq->load.weight / rq->nr_running;
      
      The runqueue lock is not held here, and there is nothing that prevents
      the rq->nr_running from going to zero after it passes the if condition.
      
      The branch profiler simply made the race window bigger.
      
      This patch saves off the rq->nr_running to a local variable and uses that
      for both the condition and the division.
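
The fix described above can be illustrated with a small userspace sketch. It is not the kernel source: `struct rq_sketch` and its fields are hypothetical stand-ins for the runqueue, and the `volatile` cast is an assumption used here to model a single read of the shared counter.

```c
/* Hypothetical stand-in for the runqueue fields involved in the race. */
struct rq_sketch {
	unsigned long nr_running;		/* may be changed by another CPU */
	unsigned long load_weight;		/* stands in for rq->load.weight */
	unsigned long avg_load_per_task;
};

static unsigned long avg_load_per_task(struct rq_sketch *rq)
{
	/*
	 * Read the shared counter exactly once into a local. The test and
	 * the division below then see the same value, so the divisor can
	 * never be zero even if rq->nr_running drops to zero concurrently.
	 */
	unsigned long nr_running = *(volatile unsigned long *)&rq->nr_running;

	if (nr_running)
		rq->avg_load_per_task = rq->load_weight / nr_running;

	return rq->avg_load_per_task;
}
```

When the snapshot is zero, the sketch leaves the previously stored average untouched, mirroring the original `if` without an `else` branch.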
      Signed-off-by: Steven Rostedt <srostedt@redhat.com>
      Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      4cd42620
    • Lai Jiangshan's avatar
      ftrace: prevent recursion · 4f5a7f40
      Lai Jiangshan authored
      
      
      Impact: prevent unnecessary stack recursion
      
      If the resched flag was set before we entered, then don't reschedule.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      4f5a7f40