1. 26 Aug, 2009 4 commits
    • tracing/events: fix the include file dependencies · 5ac35daa
      Xiao Guangrong authored
      
      
      TRACE_EVENT depends on include/linux/tracepoint.h being included
      first and include/trace/ftrace.h later. If ftrace.h is included too
      early, a build error occurs.

      For example, if trace_a.h and trace_b.h both define TRACE_EVENTs and
      a .c file includes them like this:

      #define CREATE_TRACE_POINTS
      #include <trace/events/trace_a.h>
      #include <trace/events/trace_b.h>

      then the build fails, because TRACE_EVENT has already been redefined
      by the first header when the second one is processed.
      
      Reported-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      LKML-Reference: <4A937F5E.3020802@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      5ac35daa
    • tracing/filters: Support filtering for char * strings · 87a342f5
      Li Zefan authored
      
      
      Usually, char * entries are dangerous in traces because the string
      can be freed while a pointer to it is still waiting to be read from
      the ring buffer.

      But sometimes we can assume it's safe, as in the case of read-only data
      (e.g. __file__ or __line__, used in the bkl trace event). If this
      read-only data is in a module and so is the call to the trace event,
      then it's safe, because the ring buffer will be flushed once the module
      gets unloaded.
      
      To allow char * to be treated as a string:
      
      	TRACE_EVENT(...,
      
      		TP_STRUCT__entry(
      			__field_ext(const char *, name, FILTER_PTR_STRING)
      			...
      		)
      
      		...
      	);
      
      The filtering will not dereference "char *" unless the developer
      explicitly sets FILTER_PTR_STRING in __field_ext.
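
      The opt-in rule above can be sketched in user space as follows. This is
      a minimal illustration, not the kernel's filter code: struct field_desc
      and match_ptr_string() are hypothetical names, and only the two filter
      type names come from the commit text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter types, modeled on the names used in the commit text. */
enum filter_type {
	FILTER_OTHER = 0,
	FILTER_PTR_STRING,
};

/* Hypothetical description of one event field. */
struct field_desc {
	enum filter_type type;
};

/*
 * Compare a "char *" field against a pattern. The pointer is only
 * dereferenced when the field was explicitly marked FILTER_PTR_STRING;
 * otherwise the predicate refuses to match and never touches the memory.
 */
static bool match_ptr_string(const struct field_desc *f,
			     const char *value, const char *pattern)
{
	if (f->type != FILTER_PTR_STRING)
		return false;
	if (value == NULL)
		return false;
	return strcmp(value, pattern) == 0;
}
```

      A field left at FILTER_OTHER never matches here, which mirrors the
      "do not dereference unless asked" default described above.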
      
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9287.90205@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      87a342f5
    • tracing/filters: Add __field_ext() to TRACE_EVENT · 43b51ead
      Li Zefan authored
      
      
      Add __field_ext(), so a field can be assigned to a specific
      filter_type, which matches a corresponding filter function.
      
      For example, a later patch will allow this:
      	__field_ext(const char *, str, FILTER_PTR_STRING);
      
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A7B9272.60507095@cn.fujitsu.com>
      
      [
        Fixed a -1 to FILTER_OTHER
        Forward ported to latest kernel.
      ]
      
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      43b51ead
    • tracing/sched: show CPU task wakes up on in trace event · f0693c8b
      Steven Rostedt authored
      
      
      While debugging the scheduler push / pull algorithm, I found
      it very annoying that the sched wake up events did not show
      the CPU that the task was waking on. In order to analyze the
      scheduler, I needed that information.
      
      This patch adds recording of the CPU that a task is waking up
      on.
      
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      f0693c8b
  2. 19 Aug, 2009 4 commits
    • tracing/syscalls: Add filtering support · 540b7b8d
      Li Zefan authored
      
      
      Add filtering support for syscall events:
      
       # echo 'mode == 0666' > events/syscalls/sys_enter_open/filter
       # echo 'ret == 0' > events/syscalls/sys_exit_open/filter
       # echo 1 > events/syscalls/sys_enter_open/enable
       # echo 1 > events/syscalls/sys_exit_open/enable
       # cat trace
       ...
         modprobe-3084 [001] 117.463140: sys_open(filename: 917d3e8, flags: 0, mode: 1b6)
         modprobe-3084 [001] 117.463176: sys_open -> 0x0
             less-3086 [001] 117.510455: sys_open(filename: 9c6bdb8, flags: 8000, mode: 1b6)
         sendmail-2574 [001] 122.145840: sys_open(filename: b807a365, flags: 0, mode: 1b6)
       ...
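
      Note that the filter is written in octal (0666) while the trace prints
      the mode in hex (1b6); both are the same value. A small sketch, with
      format_mode_hex_demo() as a hypothetical helper, makes the
      correspondence explicit:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a mode the way the sample trace prints it: lowercase hex,
 * no "0x" prefix. Returns the number of characters written.
 */
static int format_mode_hex_demo(unsigned int mode, char *buf, size_t len)
{
	return snprintf(buf, len, "%x", mode);
}
```

      Calling it with 0666 fills the buffer with "1b6", matching the
      "mode: 1b6" entries in the output above.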
      
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAFCB.1040006@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      540b7b8d
    • tracing/events: Add trace_define_common_fields() · e647d6b3
      Li Zefan authored
      
      
      Extract duplicate code. Also prepare for the later patch.
      
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAFB8.1010304@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      e647d6b3
    • tracing/events: Add ftrace_event_call param to define_fields() · 14be96c9
      Li Zefan authored
      
      
      This parameter is needed by syscall events to add a define_fields()
      handler.
      
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAF90.6060801@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      14be96c9
    • tracing/syscalls: Add fields format for exit events · 10a5b66f
      Li Zefan authored
      
      
      Add "format" file for syscall exit events:
      
       # cat events/syscalls/sys_exit_open/format
       name: sys_exit_open
       ID: 344
       format:
               field:unsigned short common_type;       offset:0;       size:2;
               field:unsigned char common_flags;       offset:2;       size:1;
               field:unsigned char common_preempt_count;       offset:3;       size:1;
               field:int common_pid;   offset:4;       size:4;
               field:int common_tgid;  offset:8;       size:4;
      
               field:int nr;   offset:12;      size:4;
               field:unsigned long ret;        offset:16;      size:4;
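
      The offsets and sizes listed above follow from ordinary struct layout.
      As a rough sketch, the record can be mirrored in user space with
      fixed-width types and inspected with offsetof(). The struct and helper
      names are hypothetical, and "unsigned long ret" is modeled as a 32-bit
      value because the sample reports size 4.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space mirror of the record described by the format file. */
struct sys_exit_record_demo {
	uint16_t common_type;
	uint8_t  common_flags;
	uint8_t  common_preempt_count;
	int32_t  common_pid;
	int32_t  common_tgid;

	int32_t  nr;
	uint32_t ret;	/* "unsigned long" with size 4 in the sample */
};

/* Print one field line in the style of the "format" file. */
static int format_field_demo(char *buf, size_t len, const char *decl,
			     size_t offset, size_t size)
{
	return snprintf(buf, len, "field:%s;\toffset:%zu;\tsize:%zu;",
			decl, offset, size);
}
```

      Passing offsetof(struct sys_exit_record_demo, nr) and sizeof(int32_t)
      for the "int nr" field reproduces the offset:12 / size:4 pair shown in
      the listing.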
      
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <4A8BAF61.3060307@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      10a5b66f
  3. 17 Aug, 2009 1 commit
    • tracing/events: Add module tracepoints · 7ead8b83
      Li Zefan authored
      
      
      Add trace points to trace module_load, module_free, module_get,
      module_put and module_request, and use trace_event facility to
      get the trace output.
      
      Here's the sample output:
      
           TASK-PID    CPU#    TIMESTAMP  FUNCTION
              | |       |          |         |
          <...>-42    [000]     1.758380: module_request: fb0 wait=1 call_site=fb_open
          ...
          <...>-60    [000]     3.269403: module_load: scsi_wait_scan
          <...>-60    [000]     3.269432: module_put: scsi_wait_scan call_site=sys_init_module refcnt=0
          <...>-61    [001]     3.273168: module_free: scsi_wait_scan
          ...
          <...>-1021  [000]    13.836081: module_load: sunrpc
          <...>-1021  [000]    13.840589: module_put: sunrpc call_site=sys_init_module refcnt=-1
          <...>-1027  [000]    13.848098: module_get: sunrpc call_site=try_module_get refcnt=0
          <...>-1027  [000]    13.848308: module_get: sunrpc call_site=get_filesystem refcnt=1
          <...>-1027  [000]    13.848692: module_put: sunrpc call_site=put_filesystem refcnt=0
          ...
       modprobe-2587  [001]  1088.437213: module_load: trace_events_sample F
       modprobe-2587  [001]  1088.437786: module_put: trace_events_sample call_site=sys_init_module refcnt=0
      
      Note:
      
      - the taints flag can be 'F', 'C' and/or 'P' if mod->taints != 0

      - the module refcnt is per-CPU, so the value seen on a
        specific CPU can be negative
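
      The second note can be illustrated with a small user-space sketch: when
      a reference is taken on one CPU and dropped on another, one per-CPU slot
      goes negative while the sum stays correct. All names below are
      hypothetical and the two-slot array only stands in for real per-CPU
      data.

```c
#define NR_CPUS_DEMO 2

/* Hypothetical per-CPU reference counter: one slot per CPU. */
struct percpu_ref_demo {
	int count[NR_CPUS_DEMO];
};

/* Take a reference on the given CPU's slot. */
static void ref_get_demo(struct percpu_ref_demo *r, int cpu)
{
	r->count[cpu]++;
}

/* Drop a reference on the given CPU's slot. */
static void ref_put_demo(struct percpu_ref_demo *r, int cpu)
{
	r->count[cpu]--;
}

/* The meaningful value is the sum over all CPUs. */
static int ref_total_demo(const struct percpu_ref_demo *r)
{
	int sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
		sum += r->count[cpu];
	return sum;
}
```

      A get on CPU 0 followed by a put on CPU 1 leaves the slots at 1 and -1
      with a total of 0, which is why a single trace line can show refcnt=-1.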
      
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      Acked-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      LKML-Reference: <4A891B3C.5030608@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7ead8b83
  4. 11 Aug, 2009 9 commits
    • tracing: Add fields format definition for syscall events · dc4ddb4c
      Frederic Weisbecker authored
      
      
      Define the format of the syscall trace fields to parse the binary
      values from a raw trace using the syscall events "format" file.
      
      This is defined dynamically using the syscalls metadata.
      It prepares the export of syscall event raw records to perf
      counters.
      
      Example:
      
      $ cat /debug/tracing/events/syscalls/sys_enter_sched_getparam/format
      name: sys_enter_sched_getparam
      ID: 39
      format:
      	field:unsigned short common_type;	offset:0;	size:2;
      	field:unsigned char common_flags;	offset:2;	size:1;
      	field:unsigned char common_preempt_count;	offset:3;	size:1;
      	field:int common_pid;	offset:4;	size:4;
      	field:int common_tgid;	offset:8;	size:4;
      
      	field:pid_t pid;	offset:12;	size:8;
      	field:struct sched_param * param;	offset:20;	size:8;
      
      print fmt: "pid: 0x%08lx, param: 0x%08lx", ((unsigned long)(REC->pid)), ((unsigned long)(REC->param))
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      dc4ddb4c
    • tracing: Add ftrace event call parameter to its field descriptor handler · e8f9f4d7
      Frederic Weisbecker authored
      
      
      Add the struct ftrace_event_call as a parameter of its show_format()
      callback. This way we can use it from the syscall trace events to
      retrieve the syscall name from the ftrace event call parameter and
      describe its fields using the syscalls metadata.
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      e8f9f4d7
    • tracing: Add perf counter support for syscalls tracing · f4b5ffcc
      Jason Baron authored
      
      
      The perf counter support is automated for usual trace events. But we
      have to define specific callbacks to handle syscall trace events.
      
      Make 'perf stat -e syscalls:sys_enter_blah' work with syscall style
      tracepoints.
      
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      f4b5ffcc
    • tracing: Add individual syscalls tracepoint id support · 64c12e04
      Jason Baron authored
      
      
      The current state of syscall tracepoints generates only one event id
      for all syscall events.
      
      This patch associates an id with each syscall trace event, so that we
      can identify each syscall trace event using the 'perf' tool.
      
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      64c12e04
    • tracing: Add trace events for each syscall entry/exit · fb34a08c
      Jason Baron authored
      
      
      Layer Frederic's syscall tracer on tracepoints. We create trace events
      via hooking into the SYSCALL_DEFINE macros. This allows us to
      individually toggle syscall entry and exit points on/off.
      
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      fb34a08c
    • tracing: Add ftrace_event_call void * 'data' field · 69fd4f0e
      Jason Baron authored
      
      
      Add an optional void * pointer to 'ftrace_event_call' that is
      passed in for regfunc and unregfunc.

      This prepares for syscall tracepoint creation by passing the name of
      the syscall we want to trace and then retrieving its number through our
      arch syscall table.
      
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      69fd4f0e
    • tracing: Add syscall tracepoints · a871bd33
      Jason Baron authored
      
      
      Add two tracepoints in the syscall exit and entry paths, conditioned on
      TIF_SYSCALL_FTRACE. This supports the syscall trace event code.
      
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      a871bd33
    • tracing: Add DECLARE_TRACE_WITH_CALLBACK() macro · 63fbdab3
      Jason Baron authored
      
      
      Introduce a new 'DECLARE_TRACE_WITH_CALLBACK()' macro, so that
      tracepoints can associate an external register/unregister function.
      
      This prepares for the syscalls tracer conversion to trace events. We
      will need to perform arch level operations once a syscall event is
      turned on/off, such as TIF flags setting, hence the need of such
      specific callbacks.
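
      The idea of attaching register/unregister hooks to a tracepoint can be
      sketched in plain C. This is only an illustration of the pattern, not
      the macro's expansion: struct tracepoint_demo, the enable/disable
      helpers and the flag standing in for a TIF bit are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical tracepoint with optional register/unregister hooks. */
struct tracepoint_demo {
	int users;
	void (*regfunc)(void);
	void (*unregfunc)(void);
};

/* Stand-in for an arch-level flag such as a TIF bit. */
static int syscall_trace_flag_demo;

static void syscall_regfunc_demo(void)
{
	syscall_trace_flag_demo = 1;
}

static void syscall_unregfunc_demo(void)
{
	syscall_trace_flag_demo = 0;
}

/* The first user turns the arch-level hook on. */
static void tracepoint_enable_demo(struct tracepoint_demo *tp)
{
	if (tp->users++ == 0 && tp->regfunc != NULL)
		tp->regfunc();
}

/* The last user turns it off again. */
static void tracepoint_disable_demo(struct tracepoint_demo *tp)
{
	if (tp->users > 0 && --tp->users == 0 && tp->unregfunc != NULL)
		tp->unregfunc();
}
```

      Counting users keeps the flag set while at least one consumer is
      active; whether the real implementation counts users this way is an
      assumption of this sketch.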
      
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      63fbdab3
    • tracing: Call arch_init_ftrace_syscalls at boot · 066e0378
      Jason Baron authored
      
      
      Call arch_init_ftrace_syscalls at boot, so we can determine early the
      set of syscalls for the syscall trace events.
      
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Jiaying Zhang <jiayingz@google.com>
      Cc: Martin Bligh <mbligh@google.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      066e0378
  5. 10 Aug, 2009 3 commits
    • perf_counter: Zero dead bytes from ftrace raw samples size alignment · 1853db0e
      Frederic Weisbecker authored
      
      
      After aligning the ftrace raw samples, there are dead bytes storing
      random data from the stack. We don't want to leak these to userspace,
      so zero them out.
      
      Before:
      
      	0x2de88 [0x50]: event: 9
      	.
      	. ... raw event: size 80 bytes
      	.  0000:  09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff  ......P........
      	.  0010:  68 01 00 00 68 01 00 00 2c 00 00 00 00 00 00 00  h...h...,......
      	.  0020:  2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00  ,...+...h...h..
      	.  0030:  6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00  kondemand/0....
      	.  0040:  68 01 00 00 40 7f 46 81 ff ff ff ff 00 10 1b 7f  h...@.F........
                                                            ^  ^  ^  ^
                                                               Leak
      
      After:
      
      	0x2d318 [0x50]: event: 9
      	.
      	. ... raw event: size 80 bytes
      	.  0000:  09 00 00 00 01 00 50 00 d0 c7 00 81 ff ff ff ff  ......P........
      	.  0010:  68 01 00 00 68 01 00 00 68 14 00 00 00 00 00 00  h...h...h......
      	.  0020:  2c 00 00 00 2b 00 01 02 68 01 00 00 68 01 00 00  ,...+...h...h..
      	.  0030:  6b 6f 6e 64 65 6d 61 6e 64 2f 30 00 00 00 00 00  kondemand/0....
      	.  0040:  68 01 00 00 a0 80 46 81 ff ff ff ff 00 00 00 00  h.....F........
                                                            ^  ^  ^  ^
      							 Fixed
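
      The fix amounts to clearing the padding that alignment adds after the
      record. A minimal user-space sketch follows, assuming 8-byte alignment;
      copy_padded_demo() and ALIGN_UP_DEMO() are hypothetical names, not the
      kernel's helpers.

```c
#include <stddef.h>
#include <string.h>

/* Round x up to the next multiple of a. */
#define ALIGN_UP_DEMO(x, a) (((x) + (a) - 1) / (a) * (a))

/*
 * Copy a record into buf, rounding its size up to 8 bytes and zeroing
 * the padding so that stale bytes never reach the reader.
 * Returns the padded size, or 0 if the buffer is too small.
 */
static size_t copy_padded_demo(unsigned char *buf, size_t buf_len,
			       const void *rec, size_t rec_len)
{
	size_t padded = ALIGN_UP_DEMO(rec_len, (size_t)8);

	if (padded > buf_len)
		return 0;
	memcpy(buf, rec, rec_len);
	memset(buf + rec_len, 0, padded - rec_len);
	return padded;
}
```

      With a 5-byte record the function returns 8 and bytes 5..7 of the
      buffer are zero, regardless of what the buffer held before.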
      
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <1249915116-5210-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      1853db0e
    • perf_counter: Subtract the buffer size field from the event record size · 304703ab
      Frederic Weisbecker authored
      
      
      We compute the perf raw sample size by aligning the raw ftrace
      event size plus the buffer size field itself. We do that
      instead of aligning only the perf raw sample size, so that we
      may save some space in some cases.

      But this buffer size field is not stored in the perf raw
      sample, so we must subtract its size from the buffer once we
      have computed the alignment; otherwise we may get a useless u32
      field in the buffer.
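
      The arithmetic described here can be written out as a short sketch. It
      assumes the size field is a u32 (as the text says) and that alignment
      is to 8 bytes (an assumption of this example); raw_sample_size_demo()
      is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Size of the raw data area: align (entry + size field) to 8 bytes,
 * then subtract the size field again because it is not part of the
 * raw data itself.
 */
static size_t raw_sample_size_demo(size_t entry_size)
{
	size_t with_field = entry_size + sizeof(uint32_t);
	size_t aligned = (with_field + sizeof(uint64_t) - 1) &
			 ~(sizeof(uint64_t) - 1);

	return aligned - sizeof(uint32_t);
}
```

      For a 36-byte entry this gives 36 (36 + 4 = 40 is already aligned),
      whereas aligning the entry alone would have produced 40; for a 45-byte
      entry it gives 52.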
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <20090810141129.GA5124@nowhere>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      304703ab
    • perf_counter: Correct PERF_SAMPLE_RAW output · a044560c
      Peter Zijlstra authored
      
      
      PERF_SAMPLE_* output switches should unconditionally output the
      correct format, as they are the only way to unambiguously parse
      the PERF_EVENT_SAMPLE data.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249896447.17467.74.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a044560c
  6. 09 Aug, 2009 3 commits
    • perf_counter: Fix tracepoint sampling to be part of generic sampling · 3a43ce68
      Frederic Weisbecker authored
      
      
      Based on Peter's comments, make tracepoint sampling generic
      just like all the other sampling bits are. This is a rename
      with no code changes:
      
      - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW
      - struct perf_tracepoint_record to perf_raw_record
      
      We want the system that transports tracepoint raw sample
      events into the perf ring buffer to be generalized and
      usable by any type of counter.
      
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a43ce68
    • perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker authored
      
      
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
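
      The translation above can be reproduced by decoding the first twelve
      bytes at offset 0020 of the dump (2b 00 01 02 0a 00 00 00 0a 00 00 00).
      The sketch below assumes a little-endian record, as the dump suggests;
      struct trace_entry_demo and the helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical user-space mirror of the common header shown above. */
struct trace_entry_demo {
	uint16_t type;
	uint8_t  flags;
	uint8_t  preempt_count;
	int32_t  pid;
	int32_t  tgid;
};

/* Read a little-endian 32-bit value byte by byte (host-endian independent). */
static uint32_t read_le32_demo(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 12-byte common header that starts the raw record. */
static struct trace_entry_demo decode_trace_entry_demo(const uint8_t *raw)
{
	struct trace_entry_demo e;

	e.type = (uint16_t)(raw[0] | (raw[1] << 8));
	e.flags = raw[2];
	e.preempt_count = raw[3];
	e.pid = (int32_t)read_le32_demo(raw + 4);
	e.tgid = (int32_t)read_le32_demo(raw + 8);
	return e;
}
```

      Fed with the bytes from the dump, this yields type 43, flags 1,
      preempt_count 2, pid 10 and tgid 10, in line with the values listed
      in the translation.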
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback has a cost
         even if nobody wants such sampling to occur; this needs to be
         fixed in the future. For that we need instant access to the
         perf counter attribute, which is a matter of adding a flag to
         struct ftrace_event.

       - Take care of event recursion! Don't ever try to record a lock
         event, for example: some locking seems to be used in the
         profiling fast path, which leads to tracing recursion. That
         will be fixed using a raw spinlock or recursion protection.
      
       - [...]
      
       - Profit! :-)
      
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f413cdb8
    • perf_counter, ftrace: Fix perf_counter integration · 3a659305
      Peter Zijlstra authored
      
      
      Adds a possible second part to the assign argument of TP_EVENT().
      
        TP_perf_assign(
      	__perf_count(foo);
      	__perf_addr(bar);
        )
      
      Which, when specified, makes the swcounter increment by @foo instead
      of the usual 1, and reports @bar for PERF_SAMPLE_ADDR (data address
      associated with the event) when this triggers a counter overflow.
      
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      3a659305
  7. 07 Aug, 2009 4 commits
    • bzip2/lzma/gzip: fix comments describing decompressor API · daeb6b6f
      Phillip Lougher authored
      
      
      Fix and improve comments in decompress/generic.h that describe the
      decompressor API.  Also remove an unused definition, and rename INBUF_LEN
      in lib/decompress_inflate.c to conform to bzip2/lzma naming.
      
      Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      daeb6b6f
    • mm: make set_mempolicy(MPOL_INTERLEAV) N_HIGH_MEMORY aware · 4bfc4495
      KAMEZAWA Hiroyuki authored


      At first, init_task's mems_allowed is initialized like this:
       init_task->mems_allowed == node_state[N_POSSIBLE]

      And cpuset's top_cpuset mask is initialized like this:
       top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY]

      Before 2.6.29:
      policy's mems_allowed is initialized like this:

        1. update tasks->mems_allowed by its cpuset->mems_allowed.
        2. policy->mems_allowed = nodes_and(tasks->mems_allowed, user's mask)

      The task's mems_allowed is updated in reference to top_cpuset's one,
      and cpuset's mems_allowed is always aware of N_HIGH_MEMORY.
      
      In 2.6.30: After commit 58568d2a ("cpuset,mm: update tasks'
      mems_allowed in time"), policy's mems_allowed is initialized like this:

        1. policy->mems_allowed = nodes_and(task->mems_allowed, user's mask)

      Here, if the task is in top_cpuset, task->mems_allowed is not updated
      from init's one.  Assume the user executes a command such as
      # numactl --interleave=all ...

        policy->mems_allowed = nodes_and(N_POSSIBLE, ALL_SET_MASK)

      Then policy's mems_allowed can include a possible node which has no
      pgdat.

      MPOL's INTERLEAVE just scans the nodemask of task->mems_allowed and
      accesses this directly:

        NODE_DATA(nid)->zonelist even if NODE_DATA(nid)==NULL
      
      So what we need is to make policy->mems_allowed aware of
      N_HIGH_MEMORY.  This patch does that.  But to do so, an extra nodemask
      would be on the stack.  Because I know cpumask has a new interface,
      CPUMASK_ALLOC(), I added the equivalent for nodemask.

      This patch is based on the old behavior.  But I feel this fix itself is
      just a Band-Aid.  To do a fundamental fix, we have to take care of
      memory hotplug, and that takes time.  (task->mems_allowed should be
      N_HIGH_MEMORY, I think.)

      mpol_set_nodemask() should be aware of N_HIGH_MEMORY, and policy's
      nodemask should include only online nodes.

      In the old behavior, this was guaranteed by frequent references to
      cpuset's code.  Now most of them are removed and mempolicy has to check
      it by itself.

      To do the check, a few nodemask_t will be used for calculating the
      nodemask.  But a nodemask_t can be big and it's not good to allocate
      them on the stack.

      Now cpumask_t has CPUMASK_ALLOC/FREE, an easy way to get a scratch
      area.  NODEMASK_ALLOC/FREE should be there too.
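
      The difference between the 2.6.30 computation and the intended one can
      be sketched with plain bitmasks. This is an illustration only: the
      typedef and function names are hypothetical, and a 64-bit integer
      stands in for nodemask_t.

```c
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: bit n set means node n is in the mask. */
typedef uint64_t nodemask_demo_t;

/* 2.6.30 behavior as described: only the task mask and the user mask are ANDed. */
static nodemask_demo_t policy_nodes_old_demo(nodemask_demo_t task_allowed,
					     nodemask_demo_t user_mask)
{
	return task_allowed & user_mask;
}

/* Intended behavior: additionally restrict to nodes that have memory. */
static nodemask_demo_t policy_nodes_fixed_demo(nodemask_demo_t task_allowed,
					       nodemask_demo_t user_mask,
					       nodemask_demo_t has_memory)
{
	return task_allowed & user_mask & has_memory;
}
```

      With four possible nodes (0xF), memory on nodes 0 and 1 only (0x3) and
      a user mask of all ones, the old computation keeps 0xF, including
      nodes without a pgdat, while the fixed one keeps 0x3.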
      
      [akpm@linux-foundation.org: cleanups & tweaks]
      Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Miao Xie <miaox@cn.fujitsu.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Paul Menage <menage@google.com>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4bfc4495
    • vfs: add __destroy_inode · 2e00c97e
      Christoph Hellwig authored
      
      
      When we want to tear down an inode that lost the add-to-cache race
      in XFS, we must not call into ->destroy_inode because that would delete
      the inode that won the race from the inode cache radix tree.
      
      This patch provides the __destroy_inode helper needed to fix this;
      the actual fix will be in the next patch.  As XFS was the only reason
      destroy_inode was exported, we shift the export to the new __destroy_inode.
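
The split between generic teardown and the filesystem callback might look roughly like the sketch below. All names (toy_inode, toy_ops, toy_generic_teardown, toy_destroy, toy_fs_destroy) are hypothetical stand-ins, and the callback here only counts calls; this is not the actual VFS code.

```c
#include <stdlib.h>

struct toy_inode;

struct toy_ops {
    void (*destroy_inode)(struct toy_inode *inode);
};

struct toy_inode {
    const struct toy_ops *ops;
    void *security;   /* generic per-inode state */
    int torn_down;
};

int toy_fs_destroy_calls;

/* Stand-in filesystem callback; a real one would also drop the inode
 * from the filesystem's cache and free it. */
void toy_fs_destroy(struct toy_inode *inode)
{
    (void)inode;
    toy_fs_destroy_calls++;
}

/* Generic teardown only: never calls into the filesystem. */
void toy_generic_teardown(struct toy_inode *inode)
{
    free(inode->security);
    inode->security = NULL;
    inode->torn_down = 1;
}

/* Full destruction: generic teardown, then the filesystem callback
 * (or a plain free when the filesystem provides none). */
void toy_destroy(struct toy_inode *inode)
{
    toy_generic_teardown(inode);
    if (inode->ops && inode->ops->destroy_inode)
        inode->ops->destroy_inode(inode);
    else
        free(inode);
}
```

A caller that lost the race would use only the generic helper, so the filesystem callback never touches the cache entry that belongs to the winner.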
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
      2e00c97e
    • Christoph Hellwig's avatar
      vfs: fix inode_init_always calling convention · 54e34621
      Christoph Hellwig authored
      
      
      Currently inode_init_always calls into ->destroy_inode if the additional
      initialization fails.  That's not only counter-intuitive because
      inode_init_always did not allocate the inode structure, but in the case
      of XFS it's actively harmful, as ->destroy_inode might delete the inode
      from a radix tree it has never been added to.  This in turn might end up
      deleting the inode for the same inum that has been instantiated by
      another process and cause lots of subtle problems.
      
      Also in the case of re-initializing a reclaimable inode in XFS it would
      free an inode we still want to keep alive.
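
The calling convention argued for here, where the init function reports failure and leaves cleanup to whoever allocated the inode, can be sketched as below. The names demo_inode, demo_inode_init and demo_alloc_inode are hypothetical, and the simulate_failure parameter exists only so the failure path can be exercised; this is not the actual kernel change.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_inode {
    void *security;
};

/* Returns 0 on success or -ENOMEM on failure.  It never frees the
 * inode, because it did not allocate it. */
int demo_inode_init(struct demo_inode *inode, int simulate_failure)
{
    inode->security = simulate_failure ? NULL : malloc(16);
    if (!inode->security)
        return -ENOMEM;
    return 0;
}

/* The allocator owns the failure path and frees what it allocated. */
struct demo_inode *demo_alloc_inode(int simulate_failure)
{
    struct demo_inode *inode = malloc(sizeof(*inode));

    if (!inode)
        return NULL;
    if (demo_inode_init(inode, simulate_failure)) {
        free(inode);
        return NULL;
    }
    return inode;
}
```

A caller that re-initializes an existing inode can check the return value and keep the inode alive, instead of having it destroyed behind its back.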
      
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
      54e34621
  8. 06 Aug, 2009 2 commits
  9. 05 Aug, 2009 2 commits
  10. 04 Aug, 2009 2 commits
  11. 03 Aug, 2009 4 commits
    • Luis R. Rodriguez's avatar
      cfg80211: fix regression on beacon world roaming feature · 37184244
      Luis R. Rodriguez authored
      A regression was added through patch a4ed90d6:
      
      "cfg80211: respect API on orig_flags on channel for beacon hint"
      
      We did indeed respect _orig flags but the intention was not clearly
      stated in the commit log. That patch fixed firmware issues picked
      up by iwlwifi when we lift passive scan or beaconing restrictions
      on channels its EEPROM has been configured to always enable.
      
      By doing so though we also disallowed beacon hints on devices
      registering their wiphy with custom world regulatory domains
      enabled, this happens to be currently ath5k, ath9k and ar9170.
      The passive scan and beacon restrictions on those devices would
      never be lifted even if we did find a beacon and the hardware did
      support such enhancements when world roaming.
      
      Since Johannes indicates iwlwifi firmware cannot be changed to
      allow beacon hinting we set up a flag now to specifically allow
      drivers to disable beacon hints for devices which cannot use them.
      
      We set the flag on iwlwifi to disable beacon hints and leave beacon
      hints enabled by default for all other drivers. It should be noted that
      beacon hints lift passive scan flags and beacon restrictions when we
      receive a beacon from an AP on any 5 GHz non-DFS channel, and on
      channels 12-14 of the 2.4 GHz band. We don't bother with channels 1-11
      as those channels are allowed worldwide.
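
The rule described above can be written down as a small predicate. This is a minimal sketch with hypothetical types and names (toy_channel, toy_beacon_hint_applies, toy_handle_beacon); the per-driver opt-out is modelled as a plain boolean argument, and none of this is the actual cfg80211 code.

```c
#include <stdbool.h>

enum toy_band { TOY_BAND_2GHZ, TOY_BAND_5GHZ };

struct toy_channel {
    enum toy_band band;
    int number;          /* channel number within the band */
    bool dfs;            /* radar detection required */
    bool passive_scan;   /* restriction: passive scan only */
    bool no_beaconing;   /* restriction: AP/IBSS/Mesh not allowed */
};

/* True if a beacon seen on this channel may lift its restrictions. */
bool toy_beacon_hint_applies(const struct toy_channel *ch,
                             bool disable_beacon_hints)
{
    if (disable_beacon_hints)
        return false;               /* the driver opted out */
    if (ch->band == TOY_BAND_5GHZ)
        return !ch->dfs;            /* any non-DFS 5 GHz channel */
    /* 2.4 GHz: only channels 12-14; 1-11 are allowed everywhere. */
    return ch->number >= 12 && ch->number <= 14;
}

/* Lifts both restrictions when the hint applies. */
void toy_handle_beacon(struct toy_channel *ch, bool disable_beacon_hints)
{
    if (!toy_beacon_hint_applies(ch, disable_beacon_hints))
        return;
    ch->passive_scan = false;
    ch->no_beaconing = false;
}
```

With the opt-out set, as described for iwlwifi, a received beacon leaves the channel flags untouched.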
      
      This should fix world roaming for ath5k, ath9k and ar9170, thereby
      improving scan time when we receive the first beacon from any AP,
      and also enabling beaconing operation (AP/IBSS/Mesh) on cards which
      would otherwise not be allowed to do so. Drivers not using custom
      regulatory stuff (wiphy_apply_custom_regulatory()) were not affected
      by this as the orig_flags for the channels would have been cleared
      upon wiphy registration.
      
      I tested this with a world roaming ath5k card.
      
      Cc: Jouni Malinen <jouni.malinen@atheros.com>
      Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
      Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      37184244
    • Dave Young's avatar
      bluetooth: rfcomm_init bug fix · af0d3b10
      Dave Young authored
      The rfcomm tty may be used before rfcomm_tty_driver is initialized.
      The problem is that the socket layer is now initialized before the tty
      layer, so if a userspace program does a socket call in that window, an
      oops will happen.
      
      Reported in:
      http://marc.info/?l=linux-bluetooth&m=124404919324542&w=2
      
      This makes 3 changes:
      1. Remove the #ifdef in rfcomm/core.c and make the tty functions blank
      stubs in rfcomm.h when rfcomm tty is not selected.
      
      2. Tune the rfcomm_init error path to ensure the tty driver is
      initialized before any rfcomm socket usage.
      
      3. Remove __exit from rfcomm_cleanup_sockets, because the above change
      needs to call it from an __init function.
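
The ordering and error-path idea behind these changes can be sketched as follows: bring up the tty side first, then the sockets, and unwind in reverse order on failure. Every name here (toy_tty_init, toy_sock_init, toy_rfcomm_init and the *_fails parameters) is hypothetical and only illustrates the pattern; it is not the actual rfcomm_init().

```c
int toy_tty_ready;
int toy_sock_ready;

int toy_tty_init(int fail)
{
    if (fail)
        return -1;
    toy_tty_ready = 1;
    return 0;
}

void toy_tty_cleanup(void)
{
    toy_tty_ready = 0;
}

int toy_sock_init(int fail)
{
    if (fail)
        return -1;
    toy_sock_ready = 1;
    return 0;
}

/* The tty comes up first, so no socket user can ever observe an
 * uninitialized tty driver.  Failures unwind in reverse order. */
int toy_rfcomm_init(int tty_fails, int sock_fails)
{
    int err;

    err = toy_tty_init(tty_fails);
    if (err)
        return err;

    err = toy_sock_init(sock_fails);
    if (err)
        goto cleanup_tty;

    return 0;

cleanup_tty:
    toy_tty_cleanup();
    return err;
}
```

Because the cleanup helper is reachable from the init function's error path, it cannot live in a section that is discarded for built-in code, which is the reason given above for dropping __exit.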
      
      Reported-by: Oliver Hartkopp <oliver@hartkopp.net>
      Tested-by: Oliver Hartkopp <oliver@hartkopp.net>
      Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      af0d3b10
    • Saeed Bishara's avatar
      mtd: fix the conversion from dev to mtd_info · 6afc4fdb
      Saeed Bishara authored
      
      
      The patch fixes a bug in converting dev to mtd_info by using the
      drvdata of the dev. The previous code used
      container_of(dev, struct mtd_info, dev), which does not work for the
      mtdXro devices, as they are created without being contained inside an
      mtd_info structure.
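
The difference between the two lookups can be illustrated with a small userspace sketch. The types toy_device and toy_mtd_info, the toy_container_of macro and toy_dev_to_mtd() are hypothetical stand-ins for the kernel structures, and the sketch assumes drvdata was set when each device was created.

```c
#include <stddef.h>

/* Pointer arithmetic from an embedded member back to its container.
 * Only valid when ptr really points inside an object of that type. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_device {
    void *drvdata;   /* back-pointer set at device creation */
};

struct toy_mtd_info {
    const char *name;
    struct toy_device dev;   /* embedded device, like mtdX */
};

/* Works for embedded and standalone devices alike, because it follows
 * the stored back-pointer instead of assuming a memory layout. */
struct toy_mtd_info *toy_dev_to_mtd(struct toy_device *dev)
{
    return dev->drvdata;
}
```

For a standalone device, such as the read-only companion described above, the container_of arithmetic would produce a pointer into unrelated memory, while the drvdata lookup still returns the right structure.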
      
      Signed-off-by: Saeed Bishara <saeed@marvell.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      6afc4fdb
    • Nicolas Pitre's avatar
      mtd: let include/linux/mtd/partitions.h stand on its own · 7699ad35
      Nicolas Pitre authored
      
      
      When declaring static MTD partitions in board specific code, including
      only <linux/mtd/partitions.h> should suffice without gcc nagging us
      with:
      
      In file included from arch/arm/mach-kirkwood/sheevaplug-setup.c:14:
      include/linux/mtd/partitions.h:50: warning: 'struct mtd_info' declared inside parameter list
      include/linux/mtd/partitions.h:50: warning: its scope is only this definition or declaration, which is probably not what you want
      include/linux/mtd/partitions.h:51: warning: 'struct mtd_info' declared inside parameter list
      include/linux/mtd/partitions.h:61: warning: 'struct mtd_info' declared inside parameter list
      include/linux/mtd/partitions.h:67: warning: 'struct mtd_info' declared inside parameter list
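
The commit message does not say how the header was changed. A common remedy for this class of warning is a forward declaration ahead of the prototypes, sketched below under that assumption. toy_partition and toy_count_fitting() are hypothetical names, and the struct mtd_info defined here is a one-field stand-in for the real one.

```c
/* --- What a self-contained header needs --- */

/* Forward declaration: enough for prototypes that only pass pointers,
 * and it gives the struct file scope instead of prototype scope. */
struct mtd_info;

struct toy_partition {
    const char *name;
    unsigned long size;
};

int toy_count_fitting(const struct mtd_info *master,
                      const struct toy_partition *parts, int nr);

/* --- Elsewhere: the full definition and the implementation --- */

struct mtd_info {
    unsigned long size;
};

/* Counts how many leading partitions fit into the master device. */
int toy_count_fitting(const struct mtd_info *master,
                      const struct toy_partition *parts, int nr)
{
    unsigned long used = 0;
    int fit = 0;
    int i;

    for (i = 0; i < nr; i++) {
        if (parts[i].size > master->size - used)
            break;
        used += parts[i].size;
        fit++;
    }
    return fit;
}
```

Without the forward declaration, the first mention of the struct would be inside a parameter list, which is exactly what the warnings above complain about.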
      
      Signed-off-by: Nicolas Pitre <nico@marvell.com>
      Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      7699ad35
  12. 02 Aug, 2009 1 commit
    • Peter Zijlstra's avatar
      perf_counter: Full task tracing · 9f498cc5
      Peter Zijlstra authored
      
      
      In order to be able to distinguish between no samples due to
      inactivity and no samples due to the task having ended, Arjan asked
      for PERF_EVENT_EXIT events. This is useful to the boot delay
      instrumentation (bootchart) app.
      
      This patch changes the PERF_EVENT_FORK to be emitted on every
      clone, and adds PERF_EVENT_EXIT to be emitted on task exit,
      after the task's counters have been closed.
      
      This task tracing is controlled through: attr.comm || attr.mmap
      and through the new attr.task field.
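
The control condition, and the way a consumer could use the paired events, can be sketched as below. The struct toy_attr, the event enum and both functions are hypothetical illustrations; they are not the perf_counter ABI.

```c
/* Hypothetical subset of the counter attributes named above. */
struct toy_attr {
    unsigned int comm : 1;
    unsigned int mmap : 1;
    unsigned int task : 1;
};

enum toy_event { TOY_EVENT_FORK, TOY_EVENT_EXIT };

/* Task tracing is on if any of the three bits is set. */
int toy_wants_task_events(const struct toy_attr *attr)
{
    return attr->comm || attr->mmap || attr->task;
}

/* A consumer can keep a count of live tasks, so that "no samples"
 * can be told apart from "the task is gone". */
void toy_consume(int *live_tasks, enum toy_event ev)
{
    if (ev == TOY_EVENT_FORK)
        (*live_tasks)++;
    else if (ev == TOY_EVENT_EXIT && *live_tasks > 0)
        (*live_tasks)--;
}
```

A tool such as the boot delay instrumentation mentioned above could treat a task with a zero live count as ended rather than idle.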
      
      Suggested-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anton Blanchard <anton@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      [ cleaned up perf_counter.h a bit ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9f498cc5
  13. 01 Aug, 2009 1 commit