• Frederic Weisbecker's avatar
    tracing/function-graph-tracer: various fixes and features · 9005f3eb
    Frederic Weisbecker authored
    
    
    This patch brings various bugfixes:
    
    - Drop the first irrelevant task switch on the very beginning of a trace.
    
    - Drop the OVERHEAD word from the headers, the DURATION word is sufficient
      and will not overlap other columns.
    
    - Make the headers fit well their respective columns whatever the
      selected options.
    
    Ie, default options:
    
     # tracer: function_graph
     #
     # CPU  DURATION                  FUNCTION CALLS
     # |     |   |                     |   |   |   |
    
      1)   0.646 us    |                    }
      1)               |                    mem_cgroup_del_lru_list() {
      1)   0.624 us    |                      lookup_page_cgroup();
      1)   1.970 us    |                    }
    
     echo funcgraph-proc > trace_options
    
     # tracer: function_graph
     #
     # CPU  TASK/PID        DURATION                  FUNCTION CALLS
     # |    |    |           |   |                     |   |   |   |
    
      0)   bash-2937    |   0.895 us    |                }
      0)   bash-2937    |   0.888 us    |                __rcu_read_unlock();
      0)   bash-2937    |   0.864 us    |                conv_uni_to_pc();
      0)   bash-2937    |   1.015 us    |                __rcu_read_lock();
    
     echo nofuncgraph-cpu > trace_options
     echo nofuncgraph-proc > trace_options
    
     # tracer: function_graph
     #
     #   DURATION                  FUNCTION CALLS
     #    |   |                     |   |   |   |
    
       3.752 us    |                  native_pud_val();
       0.616 us    |                  native_pud_val();
       0.624 us    |                  native_pmd_val();
    
    About features, one can now disable the duration (this will hide the
    overhead too for convenient reasons and because on  doesn't need
    overhead if it hasn't the duration):
    
     echo nofuncgraph-duration > trace_options
    
     # tracer: function_graph
     #
     #                FUNCTION CALLS
     #                |   |   |   |
    
               cap_vm_enough_memory() {
                 __vm_enough_memory() {
                   vm_acct_memory();
                 }
               }
             }
    
    And at last, an option to print the absolute time:
    
     //Restart from default options
     echo funcgraph-abstime > trace_options
    
     # tracer: function_graph
     #
     #      TIME       CPU  DURATION                  FUNCTION CALLS
     #       |         |     |   |                     |   |   |   |
    
       261.339774 |   1) + 42.823 us   |    }
       261.339775 |   1)   1.045 us    |    _spin_lock_irq();
       261.339777 |   1)   0.940 us    |    _spin_lock_irqsave();
       261.339778 |   1)   0.752 us    |    _spin_unlock_irqrestore();
       261.339780 |   1)   0.857 us    |    _spin_unlock_irq();
       261.339782 |   1)               |    flush_to_ldisc() {
       261.339783 |   1)               |      tty_ldisc_ref() {
       261.339783 |   1)               |        tty_ldisc_try() {
       261.339784 |   1)   1.075 us    |          _spin_lock_irqsave();
       261.339786 |   1)   0.842 us    |          _spin_unlock_irqrestore();
       261.339788 |   1)   4.211 us    |        }
       261.339788 |   1)   5.662 us    |      }
    
    The format is seconds.usecs.
    
    I guess no one needs the nanosec precision here, the main goal is to have
    an overview about the general timings of events, and to see the place when
    the trace switches from one cpu to another.
    
    ie:
    
       274.874760 |   1)   0.676 us    |      _spin_unlock();
       274.874762 |   1)   0.609 us    |      native_load_sp0();
       274.874763 |   1)   0.602 us    |      native_load_tls();
       274.878739 |   0)   0.722 us    |                  }
       274.878740 |   0)   0.714 us    |                  native_pmd_val();
       274.878741 |   0)   0.730 us    |                  native_pmd_val();
    
    Here there is a 4000 usecs difference when we switch the cpu.
    
    Changes in V2:
    
    - Completely fix the first pointless task switch.
    Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
    Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
    9005f3eb