1. 25 May, 2010 1 commit
    • Peter Zijlstra's avatar
      perf, trace: Fix !x86 build bug · 87f44bbc
      Peter Zijlstra authored
      Patch b7e2ecef
      
       (perf, trace: Optimize tracepoints by removing
      IRQ-disable from perf/tracepoint interaction) made the
      unfortunate mistake of assuming the world is x86 only, correct
      this.
      
      The problem was that perf_fetch_caller_regs() did
      local_save_flags() into regs->flags, and I re-used that to
      remove another local_save_flags(), forgetting !x86 doesn't have
      regs->flags.
      
      Do the reverse, remove the local_save_flags() from
      perf_fetch_caller_regs() and let the ftrace site do the
      local_save_flags() instead.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: default avatarPaul Mackerras <paulus@samba.org>
      Cc: acme@redhat.com
      Cc: efault@gmx.de
      Cc: fweisbec@gmail.com
      Cc: rostedt@goodmis.org
      LKML-Reference: <1274778175.5882.623.camel@twins>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      87f44bbc
  2. 07 May, 2010 7 commits
  3. 20 Apr, 2010 1 commit
    • Zhang, Yanmin's avatar
      perf & kvm: Clean up some of the guest profiling callback API details · dcf46b94
      Zhang, Yanmin authored
      
      
      Fix some build bug and programming style issues:
      
       - use valid C
       - fix up various style details
      Signed-off-by: default avatarZhang Yanmin <yanmin_zhang@linux.intel.com>
      Cc: Avi Kivity <avi@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Sheng Yang <sheng@linux.intel.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: oerg Roedel <joro@8bytes.org>
      Cc: Jes Sorensen <Jes.Sorensen@redhat.com>
      Cc: Gleb Natapov <gleb@redhat.com>
      Cc: Zachary Amsden <zamsden@redhat.com>
      Cc: zhiteng.huang@intel.com
      Cc: tim.c.chen@intel.com
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      LKML-Reference: <1271729638.2078.624.camel@ymzhang.sh.intel.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      dcf46b94
  4. 19 Apr, 2010 1 commit
  5. 04 Apr, 2010 1 commit
  6. 02 Apr, 2010 5 commits
  7. 01 Apr, 2010 1 commit
    • Frederic Weisbecker's avatar
      perf: Use hot regs with software sched switch/migrate events · e49a5bd3
      Frederic Weisbecker authored
      
      
      Scheduler's task migration events don't work because they always
      pass NULL regs perf_sw_event(). The event hence gets filtered
      in perf_swevent_add().
      
      Scheduler's context switches events use task_pt_regs() to get
      the context when the event occured which is a wrong thing to
      do as this won't give us the place in the kernel where we went
      to sleep but the place where we left userspace. The result is
      even more wrong if we switch from a kernel thread.
      
      Use the hot regs snapshot for both events as they belong to the
      non-interrupt/exception based events family. Unlike page faults
      or so that provide the regs matching the exact origin of the event,
      we need to save the current context.
      
      This makes the task migration event working and fix the context
      switch callchains and origin ip.
      
      Example: perf record -a -e cs
      
      Before:
      
          10.91%      ksoftirqd/0                  0  [k] 0000000000000000
                      |
                      --- (nil)
                          perf_callchain
                          perf_prepare_sample
                          __perf_event_overflow
                          perf_swevent_overflow
                          perf_swevent_add
                          perf_swevent_ctx_event
                          do_perf_sw_event
                          __perf_sw_event
                          perf_event_task_sched_out
                          schedule
                          run_ksoftirqd
                          kthread
                          kernel_thread_helper
      
      After:
      
          23.77%  hald-addon-stor  [kernel.kallsyms]  [k] schedule
                  |
                  --- schedule
                     |
                     |--60.00%-- schedule_timeout
                     |          wait_for_common
                     |          wait_for_completion
                     |          blk_execute_rq
                     |          scsi_execute
                     |          scsi_execute_req
                     |          sr_test_unit_ready
                     |          |
                     |          |--66.67%-- sr_media_change
                     |          |          media_changed
                     |          |          cdrom_media_changed
                     |          |          sr_block_media_changed
                     |          |          check_disk_change
                     |          |          cdrom_open
      
      v2: Always build perf_arch_fetch_caller_regs() now that software
      events need that too. They don't need it from modules, unlike trace
      events, so we keep the EXPORT_SYMBOL in trace_event_perf.c
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      e49a5bd3
  8. 30 Mar, 2010 1 commit
    • Tejun Heo's avatar
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo authored
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Guess-its-ok-by: default avatarChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  9. 26 Mar, 2010 1 commit
  10. 18 Mar, 2010 2 commits
    • Stephane Eranian's avatar
      perf_events: Fix resource leak in x86 __hw_perf_event_init() · 4b24a88b
      Stephane Eranian authored
      
      
      If reserve_pmc_hardware() succeeds but reserve_ds_buffers()
      fails, then we need to release_pmc_hardware. It won't be done
      by the destroy() callback because we return before setting it
      in case of error.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Cc: <stable@kernel.org>
      Cc: peterz@infradead.org
      Cc: paulus@samba.org
      Cc: davem@davemloft.net
      Cc: fweisbec@gmail.com
      Cc: robert.richter@amd.com
      Cc: perfmon2-devel@lists.sf.net
      LKML-Reference: <4ba1568b.15185e0a.182a.7802@mx.google.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      --
       arch/x86/kernel/cpu/perf_event.c |    5 ++++-
       1 file changed, 4 insertions(+), 1 deletion(-)
      4b24a88b
    • Cyrill Gorcunov's avatar
      x86, perf: Use apic_write unconditionally · 7335f75e
      Cyrill Gorcunov authored
      
      
      Since apic_write() maps to a plain noop in the !CONFIG_X86_LOCAL_APIC
      case we're safe to remove this conditional compilation and clean up
      the code a bit.
      Signed-off-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      Cc: fweisbec@gmail.com
      Cc: acme@redhat.com
      Cc: eranian@google.com
      Cc: peterz@infradead.org
      LKML-Reference: <20100317104356.232371479@openvz.org>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      7335f75e
  11. 17 Mar, 2010 6 commits
  12. 16 Mar, 2010 1 commit
    • Frederic Weisbecker's avatar
      perf: Fix unexported generic perf_arch_fetch_caller_regs · 1d199b1a
      Frederic Weisbecker authored
      
      
      perf_arch_fetch_caller_regs() is exported for the overriden x86
      version, but not for the generic weak version.
      
      As a general rule, weak functions should not have their symbol
      exported in the same file they are defined.
      
      So let's export it on trace_event_perf.c as it is used by trace
      events only.
      
      This fixes:
      
      	ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined!
      	ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!
      
      -v2: And also only build it if trace events are enabled.
      -v3: Fix changelog mistake
      Reported-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268697902-9518-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      1d199b1a
  13. 12 Mar, 2010 1 commit
    • Cyrill Gorcunov's avatar
      x86, perf: Fix NULL deref on not assigned x86_pmu · 0b861225
      Cyrill Gorcunov authored
      
      
      In case of not assigned x86_pmu and software events NULL dereference may
      being hit via x86_pmu::schedule_events method.
      
      Fix it by checking if x86_pmu is initialized at all.
      Signed-off-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      LKML-Reference: <20100311215016.GG25162@lenovo>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      0b861225
  14. 11 Mar, 2010 3 commits
    • Cyrill Gorcunov's avatar
      perf, x86: Implement initial P4 PMU driver · a072738e
      Cyrill Gorcunov authored
      
      
      The netburst PMU is way different from the "architectural
      perfomance monitoring" specification that current CPUs use.
      P4 uses a tuple of ESCR+CCCR+COUNTER MSR registers to handle
      perfomance monitoring events.
      
      A few implementational details:
      
      1) We need a separate x86_pmu::hw_config helper in struct
         x86_pmu since register bit-fields are quite different from P6,
         Core and later cpu series.
      
      2) For the same reason is a x86_pmu::schedule_events helper
         introduced.
      
      3) hw_perf_event::config consists of packed ESCR+CCCR values.
         It's allowed since in reality both registers only use a half
         of their size. Of course before making a real write into a
         particular MSR we need to unpack the value and extend it to
         a proper size.
      
      4) The tuple of packed ESCR+CCCR in hw_perf_event::config
         doesn't describe the memory address of ESCR MSR register
         so that we need to keep a mapping between these tuples
         used and available ESCR (various P4 events may use same
         ESCRs but not simultaneously), for this sake every active
         event has a per-cpu map of hw_perf_event::idx <--> ESCR
         addresses.
      
      5) Since hw_perf_event::idx is an offset to counter/control register
         we need to lift X86_PMC_MAX_GENERIC up, otherwise kernel
         strips it down to 8 registers and event armed may never be turned
         off (ie the bit in active_mask is set but the loop never reaches
         this index to check), thanks to Peter Zijlstra
      
      Restrictions:
      
       - No cascaded counters support (do we ever need them?)
       - No dependent events support (so PERF_COUNT_HW_INSTRUCTIONS
         doesn't work for now)
       - There are events with same counters which can't work simultaneously
         (need to use intersected ones due to broken counter 1)
       - No PERF_COUNT_HW_CACHE_ events yet
      
      Todo:
      
       - Implement dependent events
       - Need proper hashing for event opcodes (no linear search, good for
         debugging stage but not in real loads)
       - Some events counted during a clock cycle -- need to set threshold
         for them and count every clock cycle just to get summary statistics
         (ie to behave the same way as other PMUs do)
       - Need to swicth to use event_constraints
       - To support RAW events we need to encode a global list of P4 events
         into p4_templates
       - Cache events need to be added
      
      Event support status matrix:
      
       Event			status
       -----------------------------
       cycles			works
       cache-references	works
       cache-misses		works
       branch-misses		works
       bus-cycles		partially (does not work on 64bit cpu with HT enabled)
       instruction		doesnt work (needs dependent event [mop tagging])
       branches		doesnt work
      Signed-off-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      Signed-off-by: default avatarLin Ming <ming.m.lin@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <20100311165439.GB5129@lenovo>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a072738e
    • Xiao Guangrong's avatar
      perf: export perf_trace_regs and perf_arch_fetch_caller_regs · 639fe4b1
      Xiao Guangrong authored
      
      
      Export perf_trace_regs and perf_arch_fetch_caller_regs since module will
      use these.
      Signed-off-by: default avatarXiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      [ use EXPORT_PER_CPU_SYMBOL_GPL() ]
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <4B989C1B.2090407@cn.fujitsu.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      639fe4b1
    • Peter Zijlstra's avatar
      perf, x86: Fix hw_perf_enable() event assignment · 45e16a68
      Peter Zijlstra authored
      
      
      What happens is that we schedule badly like:
      
      <...>-1987  [019]   280.252808: x86_pmu_start: event-46/1300c0: idx: 0
      <...>-1987  [019]   280.252811: x86_pmu_start: event-47/1300c0: idx: 1
      <...>-1987  [019]   280.252812: x86_pmu_start: event-48/1300c0: idx: 2
      <...>-1987  [019]   280.252813: x86_pmu_start: event-49/1300c0: idx: 3
      <...>-1987  [019]   280.252814: x86_pmu_start: event-50/1300c0: idx: 32
      <...>-1987  [019]   280.252825: x86_pmu_stop: event-46/1300c0: idx: 0
      <...>-1987  [019]   280.252826: x86_pmu_stop: event-47/1300c0: idx: 1
      <...>-1987  [019]   280.252827: x86_pmu_stop: event-48/1300c0: idx: 2
      <...>-1987  [019]   280.252828: x86_pmu_stop: event-49/1300c0: idx: 3
      <...>-1987  [019]   280.252829: x86_pmu_stop: event-50/1300c0: idx: 32
      <...>-1987  [019]   280.252834: x86_pmu_start: event-47/1300c0: idx: 1
      <...>-1987  [019]   280.252834: x86_pmu_start: event-48/1300c0: idx: 2
      <...>-1987  [019]   280.252835: x86_pmu_start: event-49/1300c0: idx: 3
      <...>-1987  [019]   280.252836: x86_pmu_start: event-50/1300c0: idx: 32
      <...>-1987  [019]   280.252837: x86_pmu_start: event-51/1300c0: idx: 32 *FAIL*
      
      This happens because we only iterate the n_running events in the first
      pass, and reset their index to -1 if they don't match to force a
      re-assignment.
      
      Now, in our RR example, n_running == 0 because we fully unscheduled, so
      event-50 will retain its idx==32, even though in scheduling it will have
      gotten idx=0, and we don't trigger the re-assign path.
      
      The easiest way to fix this is the below patch, which simply validates
      the full assignment in the second pass.
      Reported-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1268311069.5037.31.camel@laptop>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      45e16a68
  15. 10 Mar, 2010 8 commits
    • Frederic Weisbecker's avatar
      perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot · 5331d7b8
      Frederic Weisbecker authored
      
      
      Events that trigger overflows by interrupting a context can
      use get_irq_regs() or task_pt_regs() to retrieve the state
      when the event triggered. But this is not the case for some
      other class of events like trace events as tracepoints are
      executed in the same context than the code that triggered
      the event.
      
      It means we need a different api to capture the regs there,
      namely we need a hot snapshot to get the most important
      informations for perf: the instruction pointer to get the
      event origin, the frame pointer for the callchain, the code
      segment for user_mode() tests (we always use __KERNEL_CS as
      trace events always occur from the kernel) and the eflags
      for further purposes.
      
      v2: rename perf_save_regs to perf_fetch_caller_regs as per
      Masami's suggestion.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: Jason Baron <jbaron@redhat.com>
      Cc: Archs <linux-arch@vger.kernel.org>
      5331d7b8
    • Peter Zijlstra's avatar
      perf, x86: Remove checking_{wr,rd}msr() usage · 7645a24c
      Peter Zijlstra authored
      
      
      We don't need checking_{wr,rd}msr() calls, since we should know what cpu
      we're running on and not use blindly poke at msrs.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      7645a24c
    • Peter Zijlstra's avatar
      perf, x86: Disable PEBS on clovertown chips · 3c44780b
      Peter Zijlstra authored
      
      
      This CPU has just too many handycaps to be really useful.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100305154128.890278662@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      3c44780b
    • Peter Zijlstra's avatar
      perf, x86: Clean up IA32_PERF_CAPABILITIES usage · 8db909a7
      Peter Zijlstra authored
      
      
      Saner PERF_CAPABILITIES support, which also exposes pebs_trap. Use that
      latter to make PEBS's use of LBR conditional since a fault-like pebs
      should already report the correct IP.
      
      ( As of this writing there is no known hardware that implements
        !pebs_trap )
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100304140100.770650663@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8db909a7
    • Peter Zijlstra's avatar
      perf, x86: use LBR for PEBS IP+1 fixup · ef21f683
      Peter Zijlstra authored
      
      
      Use the LBR to fix up the PEBS IP+1 issue.
      
      As said, PEBS reports the next instruction, here we use the LBR to find
      the last branch and from that construct the actual IP. If the IP matches
      the LBR-TO, we use LBR-FROM, otherwise we use the LBR-TO address as the
      beginning of the last basic block and decode forward.
      
      Once we find a match to the current IP, we use the previous location.
      
      This patch introduces a new ABI element: PERF_RECORD_MISC_EXACT, which
      conveys that the reported IP (PERF_SAMPLE_IP) is the exact instruction
      that caused the event (barring CPU errata).
      
      The fixup can fail due to various reasons:
      
       1) LBR contains invalid data (quite possible)
       2) part of the basic block got paged out
       3) the reported IP isn't part of the basic block (see 1)
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: Masami Hiramatsu <mhiramat@redhat.com>
      Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100304140100.619375431@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ef21f683
    • Peter Zijlstra's avatar
      perf, x86: Implement simple LBR support · caff2bef
      Peter Zijlstra authored
      
      
      Implement simple suport Intel Last-Branch-Record, it supports all
      hardware that implements FREEZE_LBRS_ON_PMI, but does not (yet) implement
      the LBR config register.
      
      The Intel LBR is a FIFO of From,To addresses describing the last few
      branches the hardware took.
      
      This patch does not add perf interface to the LBR, but merely provides an
      interface for internal use.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100304140100.544191154@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      caff2bef
    • Peter Zijlstra's avatar
      perf, x86: Add PEBS infrastructure · ca037701
      Peter Zijlstra authored
      
      
      This patch implements support for Intel Precise Event Based Sampling,
      which is an alternative counter mode in which the counter triggers a
      hardware assist to collect information on events. The hardware assist
      takes a trap like snapshot of a subset of the machine registers.
      
      This data is written to the Intel Debug-Store, which can be programmed
      with a data threshold at which to raise a PMI.
      
      With the PEBS hardware assist being trap like, the reported IP is always
      one instruction after the actual instruction that triggered the event.
      
      This implements a simple PEBS model that always takes a single PEBS event
      at a time. This is done so that the interaction with the rest of the
      system is as expected (freq adjust, period randomization, lbr,
      callchains, etc.).
      
      It adds an ABI element: perf_event_attr::precise, which indicates that we
      wish to use this (constrained, but precise) mode.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <20100304140100.392111285@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      ca037701
    • Peter Zijlstra's avatar
      perf, x86: Fix double enable calls · f3d46b2e
      Peter Zijlstra authored
      
      
      hw_perf_enable() would enable already enabled events.
      
      This causes problems with code that assumes that ->enable/->disable calls
      are balanced (like the LBR code does).
      
      What happens is that events that were already running and left in place
      would get enabled again.
      
      Avoid this by only enabling new events that match their previous
      assignment.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
      Cc: paulus@samba.org
      Cc: eranian@google.com
      Cc: robert.richter@amd.com
      Cc: fweisbec@gmail.com
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      f3d46b2e