1. 18 Jul, 2017 1 commit
    • Kan Liang's avatar
      perf/x86/intel: Add Goldmont Plus CPU PMU support · dd0b06b5
      Kan Liang authored
      Add perf core PMU support for Intel Goldmont Plus CPU cores:
      
       - The init code is based on Goldmont.
       - There is a new cache event list, based on the Goldmont cache event
         list.
       - All four general-purpose performance counters support PEBS.
       - The first general-purpose performance counter is for reduced skid
         PEBS mechanism. Using :ppp to indicate the event which want to do
         reduced skid PEBS.
       - Goldmont Plus has 4-wide pipeline for Topdown
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/20170712134423.17766-1-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      dd0b06b5
  2. 22 Jun, 2017 1 commit
  3. 26 May, 2017 1 commit
  4. 23 May, 2017 1 commit
    • Kan Liang's avatar
      perf/x86: Add sysfs entry to freeze counters on SMI · 6089327f
      Kan Liang authored
      Currently, the SMIs are visible to all performance counters, because
      many users want to measure everything including SMIs. But in some
      cases, the SMI cycles should not be counted - for example, to calculate
      the cost of an SMI itself. So a knob is needed.
      
      When setting FREEZE_WHILE_SMM bit in IA32_DEBUGCTL, all performance
      counters will be effected. There is no way to do per-counter freeze
      on SMI. So it should not use the per-event interface (e.g. ioctl or
      event attribute) to set FREEZE_WHILE_SMM bit.
      
      Adds sysfs entry /sys/device/cpu/freeze_on_smi to set FREEZE_WHILE_SMM
      bit in IA32_DEBUGCTL. When set, freezes perfmon and trace messages
      while in SMM.
      
      Value has to be 0 or 1. It will be applied to all processors.
      
      Also serialize the entire setting so we don't get multiple concurrent
      threads trying to update to different values.
      Signed-off-by: default avatarKan Liang <Kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: bp@alien8.de
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1494600673-244667-1-git-send-email-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6089327f
  5. 14 Apr, 2017 1 commit
    • Kan Liang's avatar
      perf/x86: Fix spurious NMI with PEBS Load Latency event · fd583ad1
      Kan Liang authored
      Spurious NMIs will be observed with the following command:
      
        while :; do
          perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp"
                        -e "cpu/umask=0x03,event=0x0/"
                        -e "cpu/umask=0x02,event=0x0/"
                        -e cycles,branches,cache-misses
                        -e cache-references -- sleep 10
        done
      
      The bug was introduced by commit:
      
        8077eca0 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
      
      That commit clears the status bits for the counters used for PEBS
      events, by masking the whole 64 bits pebs_enabled. However, only the
      low 32 bits of both status and pebs_enabled are reserved for PEBS-able
      counters.
      
      For status bits 32-34 are fixed counter overflow bits. For
      pebs_enabled bits 32-34 are for PEBS Load Latency.
      
      In the test case, the PEBS Load Latency event and fixed counter event
      could overflow at the same time. The fixed counter overflow bit will
      be cleared by mistake. Once it is cleared, the fixed counter overflow
      never be processed, which finally trigger spurious NMI.
      
      Correct the PEBS enabled mask by ignoring the non-PEBS bits.
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 8077eca0 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
      Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      fd583ad1
  6. 16 Mar, 2017 1 commit
  7. 17 Jan, 2017 1 commit
    • Zhou Chengming's avatar
      perf/x86/intel: Handle exclusive threadid correctly on CPU hotplug · 4e71de79
      Zhou Chengming authored
      The CPU hotplug function intel_pmu_cpu_starting() sets
      cpu_hw_events.excl_thread_id unconditionally to 1 when the shared exclusive
      counters data structure is already availabe for the sibling thread.
      
      This works during the boot process because the first sibling gets threadid
      0 assigned and the second sibling which shares the data structure gets 1.
      
      But when the first thread of the core is offlined and onlined again it
      shares the data structure with the second thread and gets exclusive thread
      id 1 assigned as well.
      
      Prevent this by checking the threadid of the already online thread.
      
      [ tglx: Rewrote changelog ]
      Signed-off-by: default avatarZhou Chengming <zhouchengming1@huawei.com>
      Cc: NuoHan Qiao <qiaonuohan@huawei.com>
      Cc: ak@linux.intel.com
      Cc: peterz@infradead.org
      Cc: kan.liang@intel.com
      Cc: dave.hansen@linux.intel.com
      Cc: eranian@google.com
      Cc: qiaonuohan@huawei.com
      Cc: davidcc@google.com
      Cc: guohanjun@huawei.com
      Link: http://lkml.kernel.org/r/1484536871-3131-1-git-send-email-zhouchengming1@huawei.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      ---					---
       arch/x86/events/intel/core.c |    7 +++++--
       1 file changed, 5 insertions(+), 2 deletions(-)
      4e71de79
  8. 11 Jan, 2017 1 commit
  9. 22 Dec, 2016 1 commit
    • Stephane Eranian's avatar
      perf/x86/pebs: Fix handling of PEBS buffer overflows · daa864b8
      Stephane Eranian authored
      This patch solves a race condition between PEBS and the PMU handler.
      
      In case multiple PEBS events are sampled at the same time,
      it is possible to have GLOBAL_STATUS bit 62 set indicating
      PEBS buffer overflow and also seeing at most 3 PEBS counters
      having their bits set in the status register. This is a sign
      that there was at least one PEBS record pending at the time
      of the PMU interrupt. PEBS counters must only be processed
      via the drain_pebs() calls, and not via the regular sample
      processing loop coming after that the function, otherwise
      phony regular samples may be generated in the sampling buffer
      not marked with the EXACT tag.
      
      Another possibility is to have one PEBS event and at least
      one non-PEBS event whic hoverflows while PEBS has armed. In this
      case, bit 62 of GLOBAL_STATUS will not be set, yet the overflow
      status bit for the PEBS counter will be on Skylake.
      
      To avoid this problem, we systematically ignore the PEBS-enabled
      counters from the GLOBAL_STATUS mask and we always process PEBS
      events via drain_pebs().
      
      The problem manifested itself by having non-exact samples when
      sampling only PEBS events, i.e., the PERF_SAMPLE_RECORD would
      not have the EXACT flag set.
      
      Note that this problem is only present on Skylake processor.
      This fix is harmless on older processors.
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1482395366-8992-1-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      daa864b8
  10. 06 Dec, 2016 1 commit
    • Peter Zijlstra (Intel)'s avatar
      perf/x86: Fix full width counter, counter overflow · 7f612a7f
      Peter Zijlstra (Intel) authored
      Lukasz reported that perf stat counters overflow handling is broken on KNL/SLM.
      
      Both these parts have full_width_write set, and that does indeed have
      a problem. In order to deal with counter wrap, we must sample the
      counter at at least half the counter period (see also the sampling
      theorem) such that we can unambiguously reconstruct the count.
      
      However commit:
      
        069e0c3c ("perf/x86/intel: Support full width counting")
      
      sets the sampling interval to the full period, not half.
      
      Fixing that exposes another issue, in that we must not sign extend the
      delta value when we shift it right; the counter cannot have
      decremented after all.
      
      With both these issues fixed, counter overflow functions correctly
      again.
      Reported-by: default avatarLukasz Odzioba <lukasz.odzioba@intel.com>
      Tested-by: default avatarLiang, Kan <kan.liang@intel.com>
      Tested-by: default avatarOdzioba, Lukasz <lukasz.odzioba@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@vger.kernel.org
      Fixes: 069e0c3c ("perf/x86/intel: Support full width counting")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7f612a7f
  11. 28 Oct, 2016 1 commit
    • Imre Palik's avatar
      perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors · f92b7604
      Imre Palik authored
      perf doesn't seem to honour the number of fixed counters specified by CPUID
      leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters.
      
      So if some of the fixed counters are masked out by the hypervisor, it still
      tries to check/set them.
      
      This patch makes perf behave nicer when the kernel is running under a
      hypervisor that doesn't expose all the counters.
      
      This patch contains some ideas from Matt Wilson.
      Signed-off-by: default avatarImre Palik <imrep@amazon.de>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Kozyrev <alexander.kozyrev@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Artyom Kuanbekov <artyom.kuanbekov@intel.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Wilson <msw@amazon.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1477037939-15605-1-git-send-email-imrep.amz@gmail.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f92b7604
  12. 17 Oct, 2016 1 commit
  13. 15 Sep, 2016 1 commit
  14. 10 Aug, 2016 1 commit
    • Peter Zijlstra's avatar
      perf/x86: Ensure perf_sched_cb_{inc,dec}() is only called from pmu::{add,del}() · 68f7082f
      Peter Zijlstra authored
      Currently perf_sched_cb_{inc,dec}() are called from
      pmu::{start,stop}(), which has the problem that this can happen from
      NMI context, this is making it hard to optimize perf_pmu_sched_task().
      
      Furthermore, we really only need this accounting on pmu::{add,del}(),
      so doing it from pmu::{start,stop}() is doing more work than we really
      need.
      
      Introduce x86_pmu::{add,del}() and wire up the LBR and PEBS.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      68f7082f
  15. 14 Jul, 2016 1 commit
  16. 03 Jul, 2016 1 commit
    • Stephane Eranian's avatar
      perf/x86/intel: Update event constraints when HT is off · 9010ae4a
      Stephane Eranian authored
      This patch updates the event constraints for non-PEBS mode for
      Intel Broadwell and Skylake processors. When HT is off, each
      CPU gets 8 generic counters. However, not all events can be
      programmed on any of the 8 counters.  This patch adds the
      constraints for the MEM_* events which can only be measured on the
      bottom 4 counters. The constraints are also valid when HT is off
      because, then, there are only 4 generic counters and they are the
      bottom counters.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: kan.liang@intel.com
      Link: http://lkml.kernel.org/r/1467411742-13245-1-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9010ae4a
  17. 27 Jun, 2016 2 commits
    • David Carrillo-Cisneros's avatar
      perf/x86/intel: Fix MSR_LAST_BRANCH_FROM_x bug when no TSX · 19fc9ddd
      David Carrillo-Cisneros authored
      Intel's SDM states that bits 61:62 in MSR_LAST_BRANCH_FROM_x are the
      TSX flags for formats with LBR_TSX flags (i.e. LBR_FORMAT_EIP_EFLAGS2).
      
      However, when the CPU has TSX support deactivated, bits 61:62 actually
      behave as follows:
      
        - For wrmsr(), bits 61:62 are considered part of the sign extension.
        - When capturing branches, the LBR hw will always clear bits 61:62.
          regardless of the sign extension.
      
      Therefore, if:
      
        1) LBR has TSX format.
        2) CPU has no TSX support enabled.
      
      ... then any value passed to wrmsr() must be sign extended to 63 bits
      and any value from rdmsr() must be converted to have a sign extension
      of 61 bits, ignoring the values at TSX flags.
      
      This bug was masked by the work-around to the Intel's CPU bug:
      BJ94. "LBR May Contain Incorrect Information When Using FREEZE_LBRS_ON_PMI"
      in Document Number: 324643-037US.
      
      The aforementioned work-around uses hw flags to filter out all kernel
      branches, limiting LBR callstack to user level execution only.
      
      Since user addresses are not sign extended, they do not trigger the wrmsr()
      bug in MSR_LAST_BRANCH_FROM_x when saved/restored at context switch.
      
      To verify the hw bug:
      
        $ perf record -b -e cycles sleep 1
        $ rdmsr -p 0 0x680
        0x1fffffffb0b9b0cc
        $ wrmsr -p 0 0x680 0x1fffffffb0b9b0cc
        write(): Input/output error
      
      The quirk for LBR_FROM_ MSRs is required before calls to wrmsrl() and
      after rdmsrl().
      
      This patch introduces it for wrmsrl()'s done for testing LBR support.
      
      Future patch in series adds the quirk for context switch, that would
      be required if LBR callstack is to be enabled for ring 0.
      Signed-off-by: default avatarDavid Carrillo-Cisneros <davidcc@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarStephane Eranian <eranian@google.com>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1466533874-52003-3-git-send-email-davidcc@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      19fc9ddd
    • David Carrillo-Cisneros's avatar
      perf/x86/intel: Print LBR support statement after validation · f09509b9
      David Carrillo-Cisneros authored
      The following commit:
      
        338b522c ("perf/x86/intel: Protect LBR and extra_regs against KVM lying")
      
      added an additional test to LBR support detection that is performed after
      printing the LBR support statement to dmesg.
      
      Move the LBR support output after the very last test, to make sure we
      print the true status of LBR support.
      Signed-off-by: default avatarDavid Carrillo-Cisneros <davidcc@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarStephane Eranian <eranian@google.com>
      Reviewed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1466533874-52003-2-git-send-email-davidcc@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f09509b9
  18. 08 Jun, 2016 1 commit
    • Dave Hansen's avatar
      perf/x86/intel: Use Intel family macros for core perf events · ef5f9f47
      Dave Hansen authored
      Use the new model number macros instead of spelling things out
      in the comments.
      
      Note that this is missing a Nehalem model that is mentioned in
      intel_idle which is fixed up in a later patch.
      
      The resulting binary (arch/x86/events/intel/core.o) is exactly
      the same with and without this patch modulo some harmless changes
      to restoring %esi in the return path of functions, even those
      untouched by this patch.
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: jacob.jun.pan@intel.com
      Link: http://lkml.kernel.org/r/20160603001929.C5F1C079@viggo.jf.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ef5f9f47
  19. 03 Jun, 2016 5 commits
    • Andi Kleen's avatar
      perf/x86/intel: Use new topology_max_smt_threads() in HT leak workaround · 030ba6cd
      Andi Kleen authored
      Now that we have topology_max_smt_threads() use it
      to detect the HT workarounds for older CPUs.
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1463703002-19686-6-git-send-email-andi@firstfloor.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      030ba6cd
    • Andi Kleen's avatar
      perf/x86/intel: Add topdown events to Intel Atom · eb12b8ec
      Andi Kleen authored
      Add topdown event declarations to Silvermont / Airmont.
      These cores do not support the full Top Down metrics, but an useful
      subset (FrontendBound, Retiring, Backend Bound/Bad Speculation).
      
      The perf stat tool automatically handles the missing events
      and combines the available metrics.
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1463703002-19686-5-git-send-email-andi@firstfloor.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      eb12b8ec
    • Andi Kleen's avatar
      perf/x86/intel: Add topdown events to Intel Core · a39fcae7
      Andi Kleen authored
      Add declarations for the events needed for topdown to the
      Intel big core CPUs starting with Sandy Bridge. We need
      to report different values if HyperThreading is on or off.
      
      The only thing this patch does is to export some events
      in sysfs.
      
      topdown level 1 uses a set of abstracted metrics which
      are generic to out of order CPU cores (although some
      CPUs may not implement all of them):
      
      topdown-total-slots	  Available slots in the pipeline
      topdown-slots-issued	  Slots issued into the pipeline
      topdown-slots-retired	  Slots successfully retired
      topdown-fetch-bubbles	  Pipeline gaps in the frontend
      topdown-recovery-bubbles  Pipeline gaps during recovery
      			  from misspeculation
      
      A slot is a single operation in the CPU pipe line.
      
      These metrics then allow to compute four useful metrics:
      FrontendBound, BackendBound, Retiring, BadSpeculation.
      
      The formulas to compute the metrics are generic, they
      only change based on the availability on the abstracted
      input values.
      
      The kernel declares the events supported by the current
      CPU and their scaling factors (such as the pipeline width)
      and perf stat then computes the formulas based on the
      available metrics.  This is similar how existing
      perf metrics, such as TSC metrics or IPC, are implemented.
      
      This abstracts all CPU pipe line specific knowledge in the
      kernel driver, but still avoids the need for larger scale perf
      interface changes.
      
      For HyperThreading the any bit is needed to get accurate
      values when both threads are executing. This implies that
      the events can only be collected as root or with
      perf_event_paranoid=-1 for now.
      
      The basic scheme is based on the following paper:
        Yasin, A Top Down Method for Performance analysis and Counter architecture ISPASS14 (pdf available via google)
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1463703002-19686-4-git-send-email-andi@firstfloor.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      a39fcae7
    • Lukasz Odzioba's avatar
      perf/x86/intel: Change offcore response masks for Knights Landing · 9c489fce
      Lukasz Odzioba authored
      Due to change in register definition we need to update OCR mask:
      
      MSR_OFFCORE_RESP0 reserved bits: 3,4,18,29,30,33,34, 8,11,14
      MSR_OFFCORE_RESP1 reserved bits: 3,4,18,29,30,33,34, 38
      Reported-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarLukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: akpm@linux-foundation.org
      Cc: hpa@zytor.com
      Cc: kan.liang@intel.com
      Cc: lukasz.anaczkowski@intel.com
      Cc: zheng.z.yan@intel.com
      Link: http://lkml.kernel.org/r/1463433419-16893-1-git-send-email-lukasz.odzioba@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9c489fce
    • Lukasz Odzioba's avatar
      perf/x86/intel: Add 'static' keyword to locally used arrays · 20f36278
      Lukasz Odzioba authored
      Add the 'static' keyword to intel_bdw_event_constraints[], snb_events_attrs[],
      nhm_events_attrs[] and intel_skl_event_constraints arrays[], because
      they are only used locally.
      Signed-off-by: default avatarLukasz Odzioba <lukasz.odzioba@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: akpm@linux-foundation.org
      Cc: hpa@zytor.com
      Cc: kan.liang@intel.com
      Cc: lukasz.anaczkowski@intel.com
      Cc: zheng.z.yan@intel.com
      Link: http://lkml.kernel.org/r/1463433378-16816-1-git-send-email-lukasz.odzioba@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      20f36278
  20. 12 May, 2016 1 commit
  21. 05 May, 2016 2 commits
  22. 23 Apr, 2016 3 commits
  23. 08 Mar, 2016 4 commits
    • Andi Kleen's avatar
      perf/x86/intel: Fix PEBS data source interpretation on Nehalem/Westmere · e17dc653
      Andi Kleen authored
      Jiri reported some time ago that some entries in the PEBS data source table
      in perf do not agree with the SDM. We investigated and the bits
      changed for Sandy Bridge, but the SDM was not updated.
      
      perf already implements the bits correctly for Sandy Bridge
      and later. This patch patches it up for Nehalem and Westmere.
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1456871124-15985-1-git-send-email-andi@firstfloor.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e17dc653
    • Stephane Eranian's avatar
      perf/x86/pebs: Add proper PEBS constraints for Broadwell · b3e62463
      Stephane Eranian authored
      This patch adds a Broadwell specific PEBS event constraint table.
      
      Broadwell has a fix for the HT corruption bug erratum HSD29 on
      Haswell. Therefore, there is no need to mark events 0xd0, 0xd1, 0xd2,
      0xd3 has requiring the exclusive mode across both sibling HT threads.
      This holds true for regular counting and sampling (see core.c) and
      PEBS (ds.c) which we fix in this patch.
      
      In doing so, we relax evnt scheduling for these events, they can now
      be programmed on any 4 counters without impacting what is measured on
      the sibling thread.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@redhat.com
      Cc: adrian.hunter@intel.com
      Cc: jolsa@redhat.com
      Cc: kan.liang@intel.com
      Cc: namhyung@kernel.org
      Link: http://lkml.kernel.org/r/1457034642-21837-4-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b3e62463
    • Stephane Eranian's avatar
      perf/x86/pebs: Add workaround for broken OVFL status on HSW+ · 8077eca0
      Stephane Eranian authored
      This patch fixes an issue with the GLOBAL_OVERFLOW_STATUS bits on
      Haswell, Broadwell and Skylake processors when using PEBS.
      
      The SDM stipulates that when the PEBS iterrupt threshold is crossed,
      an interrupt is posted and the kernel is interrupted. The kernel will
      find GLOBAL_OVF_SATUS bit 62 set indicating there are PEBS records to
      drain. But the bits corresponding to the actual counters should NOT be
      set. The kernel follows the SDM and assumes that all PEBS events are
      processed in the drain_pebs() callback. The kernel then checks for
      remaining overflows on any other (non-PEBS) events and processes these
      in the for_each_bit_set(&status) loop.
      
      As it turns out, under certain conditions on HSW and later processors,
      on PEBS buffer interrupt, bit 62 is set but the counter bits may be
      set as well. In that case, the kernel drains PEBS and generates
      SAMPLES with the EXACT tag, then it processes the counter bits, and
      generates normal (non-EXACT) SAMPLES.
      
      I ran into this problem by trying to understand why on HSW sampling on
      a PEBS event was sometimes returning SAMPLES without the EXACT tag.
      This should not happen on user level code because HSW has the
      eventing_ip which always point to the instruction that caused the
      event.
      
      The workaround in this patch simply ensures that the bits for the
      counters used for PEBS events are cleared after the PEBS buffer has
      been drained. With this fix 100% of the PEBS samples on my user code
      report the EXACT tag.
      
      Before:
        $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase
        $ perf report -D | fgrep SAMPLES
        PERF_RECORD_SAMPLE(IP, 0x2): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y
                                 \--- EXACT tag is missing
      
      After:
        $ perf record -e cpu/event=0xd0,umask=0x81/upp ./multichase
        $ perf report -D | fgrep SAMPLES
        PERF_RECORD_SAMPLE(IP, 0x4002): 11775/11775: 0x406de5 period: 73469 addr: 0 exact=Y
                                 \--- EXACT tag is set
      
      The problem tends to appear more often when multiple PEBS events are used.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: adrian.hunter@intel.com
      Cc: kan.liang@intel.com
      Cc: namhyung@kernel.org
      Link: http://lkml.kernel.org/r/1457034642-21837-3-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      8077eca0
    • Kan Liang's avatar
      perf/x86/intel: Fix PEBS warning by only restoring active PMU in pmi · c3d266c8
      Kan Liang authored
      This patch tries to fix a PEBS warning found in my stress test. The
      following perf command can easily trigger the pebs warning or spurious
      NMI error on Skylake/Broadwell/Haswell platforms:
      
        sudo perf record -e 'cpu/umask=0x04,event=0xc4/pp,cycles,branches,ref-cycles,cache-misses,cache-references' --call-graph fp -b -c1000 -a
      
      Also the NMI watchdog must be enabled.
      
      For this case, the events number is larger than counter number. So
      perf has to do multiplexing.
      
      In perf_mux_hrtimer_handler, it does perf_pmu_disable(), schedule out
      old events, rotate_ctx, schedule in new events and finally
      perf_pmu_enable().
      
      If the old events include precise event, the MSR_IA32_PEBS_ENABLE
      should be cleared when perf_pmu_disable().  The MSR_IA32_PEBS_ENABLE
      should keep 0 until the perf_pmu_enable() is called and the new event is
      precise event.
      
      However, there is a corner case which could restore PEBS_ENABLE to
      stale value during the above period. In perf_pmu_disable(), GLOBAL_CTRL
      will be set to 0 to stop overflow and followed PMI. But there may be
      pending PMI from an earlier overflow, which cannot be stopped. So even
      GLOBAL_CTRL is cleared, the kernel still be possible to get PMI. At
      the end of the PMI handler, __intel_pmu_enable_all() will be called,
      which will restore the stale values if old events haven't scheduled
      out.
      
      Once the stale pebs value is set, it's impossible to be corrected if
      the new events are non-precise. Because the pebs_enabled will be set
      to 0. x86_pmu.enable_all() will ignore the MSR_IA32_PEBS_ENABLE
      setting. As a result, the following NMI with stale PEBS_ENABLE
      trigger pebs warning.
      
      The pending PMI after enabled=0 will become harmless if the NMI handler
      does not change the state. This patch checks cpuc->enabled in pmi and
      only restore the state when PMU is active.
      
      Here is the dump:
      
        Call Trace:
         <NMI>  [<ffffffff813c3a2e>] dump_stack+0x63/0x85
         [<ffffffff810a46f2>] warn_slowpath_common+0x82/0xc0
         [<ffffffff810a483a>] warn_slowpath_null+0x1a/0x20
         [<ffffffff8100fe2e>] intel_pmu_drain_pebs_nhm+0x2be/0x320
         [<ffffffff8100caa9>] intel_pmu_handle_irq+0x279/0x460
         [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
         [<ffffffff811f290d>] ? vunmap_page_range+0x20d/0x330
         [<ffffffff811f2f11>] ?  unmap_kernel_range_noflush+0x11/0x20
         [<ffffffff8148379f>] ? ghes_copy_tofrom_phys+0x10f/0x2a0
         [<ffffffff814839c8>] ? ghes_read_estatus+0x98/0x170
         [<ffffffff81005a7d>] perf_event_nmi_handler+0x2d/0x50
         [<ffffffff810310b9>] nmi_handle+0x69/0x120
         [<ffffffff810316f6>] default_do_nmi+0xe6/0x100
         [<ffffffff810317f2>] do_nmi+0xe2/0x130
         [<ffffffff817aea71>] end_repeat_nmi+0x1a/0x1e
         [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
         [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
         [<ffffffff810639b6>] ? native_write_msr_safe+0x6/0x40
         <<EOE>>  <IRQ>  [<ffffffff81006df8>] ?  x86_perf_event_set_period+0xd8/0x180
         [<ffffffff81006eec>] x86_pmu_start+0x4c/0x100
         [<ffffffff8100722d>] x86_pmu_enable+0x28d/0x300
         [<ffffffff811994d7>] perf_pmu_enable.part.81+0x7/0x10
         [<ffffffff8119cb70>] perf_mux_hrtimer_handler+0x200/0x280
         [<ffffffff8119c970>] ?  __perf_install_in_context+0xc0/0xc0
         [<ffffffff8110f92d>] __hrtimer_run_queues+0xfd/0x280
         [<ffffffff811100d8>] hrtimer_interrupt+0xa8/0x190
         [<ffffffff81199080>] ?  __perf_read_group_add.part.61+0x1a0/0x1a0
         [<ffffffff81051bd8>] local_apic_timer_interrupt+0x38/0x60
         [<ffffffff817af01d>] smp_apic_timer_interrupt+0x3d/0x50
         [<ffffffff817ad15c>] apic_timer_interrupt+0x8c/0xa0
         <EOI>  [<ffffffff81199080>] ?  __perf_read_group_add.part.61+0x1a0/0x1a0
         [<ffffffff81123de5>] ?  smp_call_function_single+0xd5/0x130
         [<ffffffff81123ddb>] ?  smp_call_function_single+0xcb/0x130
         [<ffffffff81199080>] ?  __perf_read_group_add.part.61+0x1a0/0x1a0
         [<ffffffff8119765a>] event_function_call+0x10a/0x120
         [<ffffffff8119c660>] ? ctx_resched+0x90/0x90
         [<ffffffff811971e0>] ? cpu_clock_event_read+0x30/0x30
         [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60
         [<ffffffff8119772b>] _perf_event_enable+0x5b/0x70
         [<ffffffff81197388>] perf_event_for_each_child+0x38/0xa0
         [<ffffffff811976d0>] ? _perf_event_disable+0x60/0x60
         [<ffffffff811a0ffd>] perf_ioctl+0x12d/0x3c0
         [<ffffffff8134d855>] ? selinux_file_ioctl+0x95/0x1e0
         [<ffffffff8124a3a1>] do_vfs_ioctl+0xa1/0x5a0
         [<ffffffff81036d29>] ? sched_clock+0x9/0x10
         [<ffffffff8124a919>] SyS_ioctl+0x79/0x90
         [<ffffffff817ac4b2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
        ---[ end trace aef202839fe9a71d ]---
        Uhhuh. NMI received for unknown reason 2d on CPU 2.
        Do you have a strange power saving mode enabled?
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1457046448-6184-1-git-send-email-kan.liang@intel.com
      [ Fixed various typos and other small details. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      c3d266c8
  24. 17 Feb, 2016 2 commits
  25. 29 Jan, 2016 2 commits
    • Peter Zijlstra's avatar
      perf/x86: De-obfuscate code · 8f04b853
      Peter Zijlstra authored
      Get rid of the 'onln' obfuscation.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      8f04b853
    • Peter Zijlstra's avatar
      perf/x86: Fix uninitialized value usage · e01d8718
      Peter Zijlstra authored
      When calling intel_alt_er() with .idx != EXTRA_REG_RSP_* we will not
      initialize alt_idx and then use this uninitialized value to index an
      array.
      
      When that is not fatal, it can result in an infinite loop in its
      caller __intel_shared_reg_get_constraints(), with IRQs disabled.
      
      Alternative error modes are random memory corruption due to the
      cpuc->shared_regs->regs[] array overrun, which manifest in either
      get_constraints or put_constraints doing weird stuff.
      
      Only took 6 hours of painful debugging to find this. Neither GCC nor
      Smatch warnings flagged this bug.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: ae3f011f ("perf/x86/intel: Fix SLM MSR_OFFCORE_RSP1 valid_mask")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e01d8718
  26. 06 Jan, 2016 2 commits