1. 09 Aug, 2009 5 commits
    • perf_counter: Fix tracepoint sampling to be part of generic sampling · 3a43ce68
      Frederic Weisbecker authored
      
      
      Based on Peter's comments, make tracepoint sampling generic
      just like all the other sampling bits are. This is a rename
      with no code changes:
      
      - PERF_SAMPLE_TP_RECORD to PERF_SAMPLE_RAW
      - struct perf_tracepoint_record to perf_raw_record
      
      We want the mechanism that transports raw tracepoint sample
      events into the perf ring buffer to be generalized and
      usable by any type of counter.
      
      Reported-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-4-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Work around gcc warning by initializing tracepoint record unconditionally · 10b8e306
      Frederic Weisbecker authored
      
      
      Although the tracepoint record is always present when the
      PERF_SAMPLE_TP_RECORD flag is set, gcc raises a warning,
      thinking it might not be initialized:
      
        kernel/perf_counter.c: In function ‘perf_counter_output’:
        kernel/perf_counter.c:2650: warning: ‘tp’ may be used uninitialized in this function
      
      So initialize it to NULL and always check that it is not NULL
      before dereferencing it.
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1249698400-5441-2-git-send-email-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
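
The pattern described in 10b8e306 (initialize the pointer unconditionally, then guard every dereference) can be sketched in user-space C roughly as follows. This is an illustration only, not the kernel code: `struct tp_record`, `SAMPLE_TP_RECORD`, `HEADER_SIZE` and `sample_size()` are hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the PERF_SAMPLE_TP_RECORD flag. */
#define SAMPLE_TP_RECORD (1u << 0)

/* Hypothetical fixed part of a sample, in bytes. */
#define HEADER_SIZE 16u

/* Hypothetical stand-in for the tracepoint record. */
struct tp_record {
    uint32_t size;      /* payload size in bytes */
    const void *data;   /* payload */
};

/* Compute how many bytes a sample would occupy. */
size_t sample_size(unsigned int sample_type, const struct tp_record *data)
{
    /* Unconditional initialization: the compiler can now see that
     * 'tp' always has a defined value, whatever the flags are. */
    const struct tp_record *tp = NULL;
    size_t size = HEADER_SIZE;

    if (sample_type & SAMPLE_TP_RECORD)
        tp = data;      /* may still be NULL */

    /* Every later use is guarded by a NULL check. */
    if (tp)
        size += sizeof(tp->size) + tp->size;

    return size;
}
```

With the flag clear, or with a NULL record, the function simply returns the header size; the record contributes only when both the flag and the pointer are present.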
    • perf_counter: Fix software counters for fast moving event sources · 7b4b6658
      Peter Zijlstra authored
      
      
      Reimplement the software counters to deal with fast moving
      event sources (such as tracepoints). This means being able
      to generate multiple overflows from a single 'event' as well
      as support throttling.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
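
As a rough illustration of what 7b4b6658 describes (one event producing several overflows, plus throttling), here is a minimal user-space sketch. It is not the kernel implementation: the structure, the field names and the "cap per event" throttling rule are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical software counter state. */
struct sw_counter {
    uint64_t period;                 /* events per sample */
    uint64_t pending;                /* events counted towards the next overflow */
    uint64_t count;                  /* total events seen */
    uint64_t samples;                /* overflows actually emitted */
    uint64_t throttled;              /* overflows suppressed by the cap */
    uint64_t max_samples_per_event;  /* assumed throttle: cap per call */
};

/* Account 'nr' events at once and return how many overflows were emitted. */
uint64_t sw_counter_add(struct sw_counter *c, uint64_t nr)
{
    uint64_t overflows, emitted;

    c->count += nr;
    if (c->period == 0)
        return 0;                    /* counting only, no sampling */

    /* A single large increment may cross the period several times. */
    c->pending += nr;
    overflows = c->pending / c->period;
    c->pending %= c->period;

    /* Throttle: never emit more than the cap for one event. */
    emitted = overflows;
    if (emitted > c->max_samples_per_event) {
        c->throttled += emitted - c->max_samples_per_event;
        emitted = c->max_samples_per_event;
    }
    c->samples += emitted;
    return emitted;
}
```

For example, with a period of 10, adding 25 events yields two overflows and leaves 5 events pending for the next one.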
    • perf_counter: Fix/complete ftrace event records sampling · f413cdb8
      Frederic Weisbecker authored
      
      
      This patch implements the kernel side support for ftrace event
      record sampling.
      
      A new counter sampling attribute is added:
      
         PERF_SAMPLE_TP_RECORD
      
      which requests ftrace events record sampling. In this case
      if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
      fires, we emit the tracepoint binary record to the
      perfcounter event buffer, as a sample.
      
      Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
      record:
      
       perf record -f -F 1 -a -e workqueue:workqueue_execution
       perf report -D
      
       0x21e18 [0x48]: event: 9
       .
       . ... raw event: size 72 bytes
       .  0000:  09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff  ......H........
       .  0010:  0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00  ........!......
       .  0020:  2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e  +...........eve
       .  0030:  74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00  ts/1...........
       .  0040:  e0 b1 31 81 ff ff ff ff                          .......
      .
      0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
      
      The raw ftrace binary record starts at offset 0020.
      
      Translation:
      
       struct trace_entry {
      	type		= 0x2b = 43;
      	flags		= 1;
      	preempt_count	= 2;
      	pid		= 0xa = 10;
      	tgid		= 0xa = 10;
       }
      
       thread_comm = "events/1"
       thread_pid  = 0xa = 10;
       func	    = 0xffffffff8131b1e0 = flush_to_ldisc()
      
      What will come next?
      
       - Userspace support ('perf trace'), 'flight data recorder' mode
         for perf trace, etc.
      
       - The unconditional copy from the profiling callback has a
         cost even when no such sampling is wanted, and this needs
         to be fixed in the future. For that we need instant access
         to the perf counter attribute. This is a matter of adding
         a flag to struct ftrace_event.
      
       - Take care of event recursion! Don't ever try to record
         a lock event, for example: it seems some locking is used in
         the profiling fast path, which leads to tracing recursion.
         That will be fixed using raw spinlocks or recursion
         protection.
      
       - [...]
      
       - Profit! :-)
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Tom Zanussi <tzanussi@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
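
The "Translation" in f413cdb8 can be reproduced by decoding the bytes at offsets 0x20 to 0x47 of the hex dump by hand. The sketch below does that in user-space C. The field layout (a 12-byte trace_entry, a 16-byte thread_comm, a 4-byte thread_pid and an 8-byte func, all little-endian) is inferred from the dump and the translation in the message, not taken from kernel headers; `struct decoded_record`, `read_le()` and `decode_record()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded view of the record; layout inferred from the dump. */
struct decoded_record {
    uint16_t type;
    uint8_t  flags;
    uint8_t  preempt_count;
    int32_t  pid;
    int32_t  tgid;
    char     thread_comm[16];
    int32_t  thread_pid;
    uint64_t func;
};

/* Bytes 0x20..0x47 of the raw event shown in the commit message. */
static const uint8_t raw_record[40] = {
    0x2b, 0x00, 0x01, 0x02, 0x0a, 0x00, 0x00, 0x00,
    0x0a, 0x00, 0x00, 0x00, 0x65, 0x76, 0x65, 0x6e,
    0x74, 0x73, 0x2f, 0x31, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00,
    0xe0, 0xb1, 0x31, 0x81, 0xff, 0xff, 0xff, 0xff,
};

/* Read an n-byte little-endian unsigned integer (n <= 8). */
static uint64_t read_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;

    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Decode one 40-byte record using the inferred offsets. */
struct decoded_record decode_record(const uint8_t *p)
{
    struct decoded_record r;

    r.type          = (uint16_t)read_le(p + 0, 2);
    r.flags         = p[2];
    r.preempt_count = p[3];
    r.pid           = (int32_t)read_le(p + 4, 4);
    r.tgid          = (int32_t)read_le(p + 8, 4);
    memcpy(r.thread_comm, p + 12, sizeof(r.thread_comm));
    r.thread_comm[sizeof(r.thread_comm) - 1] = '\0';
    r.thread_pid    = (int32_t)read_le(p + 28, 4);
    r.func          = read_le(p + 32, 8);
    return r;
}
```

Calling `decode_record(raw_record)` gives the values listed in the translation (type 43, flags 1, preempt_count 2, pid and tgid 10, thread_comm "events/1"). Resolving `func` to a symbol name needs a symbol table and is not attempted here.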
    • perf_counter, ftrace: Fix perf_counter integration · 3a659305
      Peter Zijlstra authored
      
      
      Add an optional second part to the assign argument of TP_EVENT().
      
        TP_perf_assign(
      	__perf_count(foo);
      	__perf_addr(bar);
        )
      
      When specified, this makes the swcounter increment by @foo instead
      of the usual 1, and reports @bar for PERF_SAMPLE_ADDR (data address
      associated with the event) when this triggers a counter overflow.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
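
The defaulting behaviour described in 3a659305 (count 1 and no address unless the assign block says otherwise) can be mimicked with a small macro sketch. Everything here is hypothetical: `PERF_COUNT()` and `PERF_ADDR()` stand in for `__perf_count()` and `__perf_addr()`, and `DEFINE_HOOK()` and `struct sw_counter` exist only for this example. Unlike the real code, the sketch records the address on every event rather than only on overflow.

```c
#include <stdint.h>

/* Hypothetical software counter. */
struct sw_counter {
    uint64_t count;      /* accumulated event count */
    uint64_t last_addr;  /* address supplied with the latest event */
};

void sw_counter_event(struct sw_counter *ctr, uint64_t addr, uint64_t count)
{
    ctr->count += count;
    ctr->last_addr = addr;
}

/* Stand-ins for __perf_count() / __perf_addr(): they only override
 * the local defaults declared inside the generated hook. */
#define PERF_COUNT(c) count_ = (c)
#define PERF_ADDR(a)  addr_ = (a)

/* Generate a hook whose optional 'assign' block may override the
 * defaults (count 1, address 0) before the counter is updated. */
#define DEFINE_HOOK(name, assign)                                     \
    void name(struct sw_counter *ctr, uint64_t foo, uint64_t bar)     \
    {                                                                 \
        uint64_t addr_ = 0, count_ = 1;                               \
        (void)foo;                                                    \
        (void)bar;                                                    \
        { assign }                                                    \
        sw_counter_event(ctr, addr_, count_);                         \
    }

/* No assign block: the counter moves by the usual 1. */
DEFINE_HOOK(hook_default, )

/* With an assign block: increment by foo and report bar. */
DEFINE_HOOK(hook_assigned, PERF_COUNT(foo); PERF_ADDR(bar);)
```

Calling `hook_default(&c, 5, 0x1000)` adds 1 to the counter, while `hook_assigned(&c, 5, 0x1000)` adds 5 and records 0x1000 as the address.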
  2. 06 Aug, 2009 1 commit
    • perf_counter: Fix double list iteration in per task precise stats · 1054598c
      Peter Zijlstra authored
      
      
      Brice Goglin reported this crash with per task precise stats:
      
      > I finally managed to test the threaded perfcounter statistics (thanks a
      > lot for implementing it). I am running 2.6.31-rc5 (with the AMD
      > magny-cours patches but I don't think they matter here). I am trying to
      > measure local/remote memory accesses per thread during the well-known
      > stream benchmark. It's compiled with OpenMP using 16 threads on a
      > quad-socket quad-core barcelona machine.
      >
      > Command line is:
      >  /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf record -f -s
      > -e r1000001e0 -e r1000002e0 -e r1000004e0 -e r1000008e0 ./stream
      >
      > It seems to work fine with a single -e <counter> on the command line
      > while it crashes when there are at least 2 of them.
      > It seems to work fine without -s as well.
      
      A silly copy-paste resulted in a messed up iteration which would
      cause the OOPS.
      Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
      LKML-Reference: <1249574786.32113.550.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 02 Aug, 2009 2 commits
  4. 22 Jul, 2009 5 commits
  5. 18 Jul, 2009 1 commit
    • perf_counter: Make sure we dont leak kernel memory to userspace · 413ee3b4
      Anton Blanchard authored
      
      
      There are a few places we are leaking tiny amounts of kernel
      memory to userspace. This happens when writing out strings
      because we always align the end to 64 bits.
      
      To avoid this we should always use an appropriately sized
      temporary buffer and ensure it is zeroed.
      
      Since d_path assembles the string from the end of the buffer
      backwards, we need to add 64 bits after the buffer to allow for
      alignment.
      
      We also need to copy arch_vma_name to the temporary buffer,
      because if we use it directly we may end up copying to
      userspace a number of bytes after the end of the string
      constant.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090716104817.273972048@samba.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
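
The idea in 413ee3b4 (copy the string into a zeroed temporary buffer whose size is already rounded up to 64 bits, so the alignment padding never exposes stale memory) can be sketched in user-space C as below. `pad_string_for_output()` and `ALIGN_UP()` are hypothetical helpers for illustration; the kernel code differs in detail, for instance in how it handles d_path().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Round x up to a multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/*
 * Return a freshly allocated, fully zeroed buffer holding 'name', its
 * terminating NUL and zero padding up to a 64-bit boundary. The padded
 * length is stored in *out_len. The caller frees the buffer.
 */
char *pad_string_for_output(const char *name, size_t *out_len)
{
    size_t len = ALIGN_UP(strlen(name) + 1, sizeof(uint64_t));
    char *buf = calloc(1, len);      /* zeroed: padding bytes are all 0 */

    if (!buf)
        return NULL;

    /* Copy into the temporary buffer instead of emitting 'name'
     * directly, so bytes past its end are never read or written out. */
    memcpy(buf, name, strlen(name));
    *out_len = len;
    return buf;
}
```

Writing out `*out_len` bytes from the returned buffer then emits only the string and zero bytes, never whatever happened to follow the original string in memory.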
  6. 13 Jul, 2009 1 commit
    • perf_counter: Fix the tracepoint channel to perfcounters · d4d7d0b9
      Chris Wilson authored
      
      
      Fix a missed rename in EVENT_PROFILE support so that it gets
      built and allows tracepoint tracing from the 'perf' tool.
      
      Also fix a typo in the (never before built & enabled) portion
      of perf_counter.c, and update that code to the attr.config
      changes.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Ben Gamari <bgamari.foss@gmail.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <1246869094-21237-1-git-send-email-chris@chris-wilson.co.uk>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  7. 10 Jul, 2009 1 commit
  8. 06 Jul, 2009 1 commit
    • Fix virt_to_phys() warnings · 5bfd7560
      Kevin Cernekee authored
      
      
      These warnings were observed on MIPS32 using 2.6.31-rc1 and gcc-4.2.0:
      
      mm/page_alloc.c: In function 'alloc_pages_exact':
      mm/page_alloc.c:1986: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast
      
      drivers/usb/mon/mon_bin.c: In function 'mon_alloc_buff':
      drivers/usb/mon/mon_bin.c:1264: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast
      
      [akpm@linux-foundation.org: fix kernel/perf_counter.c too]
      Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. 30 Jun, 2009 1 commit
  10. 26 Jun, 2009 1 commit
  11. 25 Jun, 2009 5 commits
  12. 23 Jun, 2009 3 commits
  13. 20 Jun, 2009 1 commit
  14. 19 Jun, 2009 2 commits
    • perf_counter: Close race in perf_lock_task_context() · b49a9e7e
      Peter Zijlstra authored
      
      
      perf_lock_task_context() is buggy because it can return a dead
      context.
      
      The RCU read lock in perf_lock_task_context() only guarantees
      the memory won't get freed, it doesn't guarantee the object is
      valid (in our case refcount > 0).
      
      Therefore we can return a locked object that can get freed the
      moment we release the rcu read lock.
      
      perf_pin_task_context() then increases the refcount and does an
      unlock on freed memory.
      
      That increased refcount will cause a double free, in case it
      started out with 0.
      
      Amend this by including the get_ctx() functionality in
      perf_lock_task_context() (all users already did this later
      anyway), and return a NULL context when the found one is
      already dead.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
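
The rule stated in b49a9e7e (memory that is still allocated is not necessarily a live object, so a reference may only be taken while the refcount is above zero) is commonly expressed as an "increment unless zero" operation. Below is a minimal user-space sketch with C11 atomics. `struct ctx`, `ctx_get_unless_dead()` and `lock_ctx()` are hypothetical names, and the RCU and spinlock parts of the real function are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context object with a reference count. */
struct ctx {
    atomic_int refcount;
};

/* Take a reference only if the object is still alive (refcount > 0). */
bool ctx_get_unless_dead(struct ctx *c)
{
    int old = atomic_load(&c->refcount);

    while (old > 0) {
        /* On failure the CAS reloads 'old', so the loop re-checks
         * whether the object died in the meantime. */
        if (atomic_compare_exchange_weak(&c->refcount, &old, old + 1))
            return true;
    }
    return false;   /* already dead: never resurrect it */
}

/* Return the context with a reference held, or NULL if it is dead. */
struct ctx *lock_ctx(struct ctx *found)
{
    if (!found || !ctx_get_unless_dead(found))
        return NULL;
    return found;
}
```

A context whose count has already dropped to 0 is reported as NULL and its count is left untouched, which avoids the later double free described in the message.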
    • perf_counter: Simplify and fix task migration counting · e5289d4a
      Peter Zijlstra authored
      The task migrations counter was causing rare and hard-to-decipher
      memory corruptions under load. After a day of debugging and bisection
      we found that the problem was introduced with:
      
        3f731ca6: perf_counter: Fix cpu migration counter
      
      Turning them off fixes the crashes. Incidentally, the whole
      perf_counter_task_migration() logic can be done simpler as well,
      by injecting a proper sw-counter event.
      
      This cleanup also fixed the crashes. The precise failure mode is
      not completely clear yet, but we are clearly not unhappy about
      having a fix ;-)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  15. 18 Jun, 2009 1 commit
    • perf_counter: Add event overlow handling · 43a21ea8
      Peter Zijlstra authored
      
      
      Alternative method of mmap() data output handling that provides
      better overflow management and a more reliable data stream.
      
      Unlike the previous method, which had no user->kernel
      feedback and relied on userspace keeping up, this method relies on
      userspace writing its last read position into the control page.
      
      This ensures new output doesn't overwrite not-yet-read events;
      new events for which there is no space left are lost and the
      overflow counter is incremented, providing exact event loss
      numbers.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
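
The scheme in 43a21ea8 (the reader publishes its last read position, and the writer drops and counts any event that would overwrite unread data) can be sketched as a tiny single-threaded ring buffer. This is a simplified illustration: `struct ring`, `BUF_SIZE`, `ring_write()` and `ring_consume()` are hypothetical, and the memory barriers a real producer/consumer pair needs are omitted.

```c
#include <stdbool.h>
#include <stdint.h>

#define BUF_SIZE 8u   /* data area size in bytes; must be a power of two */

/* Hypothetical control fields plus data area. */
struct ring {
    uint64_t head;    /* advanced by the writer (kernel side) */
    uint64_t tail;    /* last read position, advanced by the reader */
    uint64_t lost;    /* events dropped for lack of space */
    uint8_t  data[BUF_SIZE];
};

/* Try to append one event; drop it and count the loss if it does not fit. */
bool ring_write(struct ring *r, const uint8_t *ev, uint64_t len)
{
    uint64_t used = r->head - r->tail;

    if (len > BUF_SIZE - used) {
        r->lost++;    /* unread data is never overwritten */
        return false;
    }
    for (uint64_t i = 0; i < len; i++)
        r->data[(r->head + i) & (BUF_SIZE - 1)] = ev[i];
    r->head += len;
    return true;
}

/* Reader side: publish that 'len' more bytes have been consumed. */
void ring_consume(struct ring *r, uint64_t len)
{
    if (len <= r->head - r->tail)
        r->tail += len;
}
```

Once the reader advances `tail`, the space becomes available again, and `lost` tells it exactly how many events were dropped in between.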
  16. 15 Jun, 2009 1 commit
  17. 13 Jun, 2009 2 commits
  18. 12 Jun, 2009 2 commits
    • perf_counter: Add forward/backward attribute ABI compatibility · 974802ea
      Peter Zijlstra authored
      
      
      Provide a means of extending the perf_counter_attr in a 'natural' way.
      
      We allow growing the structure by appending fields at the end by specifying
      the full structure size inside it.
      
      When a new kernel sees a smaller (old) structure, it will 0 pad the tail.
      When an old kernel sees a larger (new) structure, it will verify the tail
      consists of 0s, otherwise fail.
      
      If we fail due to a size mismatch, we return -E2BIG and write the kernel's
      native attribute size back into the provided structure.
      
      Furthermore, add some attribute verification, so that we'll fail counter
      creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in
      the __reserved fields).
      
      (This ABI detail is introduced while keeping the existing syscall ABI.)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
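
The size negotiation described in 974802ea can be sketched as follows. This is an illustrative user-space model, not the kernel's copy routine: `struct attr_v2`, `ATTR_SIZE_V1`, `ATTR_SIZE_MAX` and `copy_attr()` are hypothetical, and the caller is assumed to pass a buffer that really is as large as its size field claims.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "kernel native" layout; V1 ended after 'config'. */
struct attr_v2 {
    uint32_t size;       /* size of the structure as the caller knows it */
    uint32_t type;
    uint64_t config;
    uint64_t new_field;  /* appended in V2 */
};

#define ATTR_SIZE_V1  16u    /* smallest layout ever published */
#define ATTR_SIZE_MAX 4096u  /* assumed sanity limit */

/* Copy a caller-supplied attribute of any known or future size. */
int copy_attr(unsigned char *uattr, struct attr_v2 *out)
{
    const uint32_t ksize = sizeof(*out);
    uint32_t usize, copy;

    memcpy(&usize, uattr, sizeof(usize));
    if (usize < ATTR_SIZE_V1 || usize > ATTR_SIZE_MAX)
        goto err_size;

    copy = usize;
    if (usize > ksize) {
        /* Newer caller: the tail we do not understand must be zero. */
        for (uint32_t i = ksize; i < usize; i++)
            if (uattr[i] != 0)
                goto err_size;
        copy = ksize;
    }

    /* Older caller: fields it does not know about stay zero. */
    memset(out, 0, ksize);
    memcpy(out, uattr, copy);
    return 0;

err_size:
    /* Tell the caller which size this side natively supports. */
    memcpy(uattr, &ksize, sizeof(ksize));
    return -E2BIG;
}
```

An old 16-byte structure is accepted with `new_field` left at zero, a larger structure is accepted only if its extra tail is all zeros, and any mismatch returns -E2BIG with the native size written back into the first field.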
    • perf_counter: Remove PERF_TYPE_RAW special casing · 081fad86
      Peter Zijlstra authored
      
      
      The PERF_TYPE_RAW special case seems superfluous these days. Remove
      it and add it to the switch() stmt like the others.
      
      [ Impact: cleanup ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  19. 11 Jun, 2009 4 commits