1. 22 Nov, 2018 1 commit
  2. 20 Nov, 2018 1 commit
  3. 29 Oct, 2018 1 commit
  4. 16 Oct, 2018 1 commit
    • Jiri Olsa's avatar
      perf/x86/intel: Export mem events only if there's PEBS support · d4ae5529
      Jiri Olsa authored
      Memory events depends on PEBS support and access to LDLAT MSR, but we
      display them in /sys/devices/cpu/events even if the CPU does not
      provide those, like for KVM guests.
      
      That brings the false assumption that those events should be
      available, while they fail event to open.
      
      Separating the mem-* events attributes and merging them with
      cpu_events only if there's PEBS support detected.
      
      We could also check if LDLAT MSR is available, but the PEBS check
      seems to cover the need now.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/20180906135748.GC9577@kravaSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d4ae5529
  5. 02 Oct, 2018 4 commits
    • Kan Liang's avatar
      perf/x86/intel: Add quirk for Goldmont Plus · 7c5314b8
      Kan Liang authored
      A ucode patch is needed for Goldmont Plus while counter freezing feature
      is enabled. Otherwise, there will be some issues, e.g. PMI flood with
      some events.
      
      Add a quirk to check microcode version. If the system starts with the
      wrong ucode, leave the counter-freezing feature permanently disabled.
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/1533712328-2834-3-git-send-email-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7c5314b8
    • Peter Zijlstra's avatar
      x86/cpu: Sanitize FAM6_ATOM naming · f2c4db1b
      Peter Zijlstra authored
      Going primarily by:
      
        https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors
      
      with additional information gleaned from other related pages; notably:
      
       - Bonnell shrink was called Saltwell
       - Moorefield is the Merriefield refresh which makes it Airmont
      
      The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE
      
        for i in `git grep -l FAM6_ATOM` ; do
      	sed -i  -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g'		\
      		-e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/'		\
      		-e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g'		\
      		-e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g'	\
      		-e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g'		\
      		-e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g'	\
      		-e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g'	\
      		-e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g'	\
      		-e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g'	\
      		-e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g'		\
      		-e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i}
        done
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: dave.hansen@linux.intel.com
      Cc: len.brown@intel.com
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f2c4db1b
    • Andi Kleen's avatar
      perf/x86/intel: Add a separate Arch Perfmon v4 PMI handler · af3bdb99
      Andi Kleen authored
      Implements counter freezing for Arch Perfmon v4 (Skylake and
      newer). This allows to speed up the PMI handler by avoiding
      unnecessary MSR writes and make it more accurate.
      
      The Arch Perfmon v4 PMI handler is substantially different than
      the older PMI handler.
      
      Differences to the old handler:
      
      - It relies on counter freezing, which eliminates several MSR
        writes from the PMI handler and lowers the overhead significantly.
      
        It makes the PMI handler more accurate, as all counters get
        frozen atomically as soon as any counter overflows. So there is
        much less counting of the PMI handler itself.
      
        With the freezing we don't need to disable or enable counters or
        PEBS. Only BTS which does not support auto-freezing still needs to
        be explicitly managed.
      
      - The PMU acking is done at the end, not the beginning.
        This makes it possible to avoid manual enabling/disabling
        of the PMU, instead we just rely on the freezing/acking.
      
      - The APIC is acked before reenabling the PMU, which avoids
        problems with LBRs occasionally not getting unfreezed on Skylake.
      
      - Looping is only needed to workaround a corner case which several PMIs
        are very close to each other. For common cases, the counters are freezed
        during PMI handler. It doesn't need to do re-check.
      
      This patch:
      
      - Adds code to enable v4 counter freezing
      - Fork <=v3 and >=v4 PMI handlers into separate functions.
      - Add kernel parameter to disable counter freezing. It took some time to
        debug counter freezing, so in case there are new problems we added an
        option to turn it off. Would not expect this to be used until there
        are new bugs.
      - Only for big core. The patch for small core will be posted later
        separately.
      
      Performance:
      
      When profiling a kernel build on Kabylake with different perf options,
      measuring the length of all NMI handlers using the nmi handler
      trace point:
      
      V3 is without counter freezing.
      V4 is with counter freezing.
      The value is the average cost of the PMI handler.
      (lower is better)
      
      perf options    `           V3(ns) V4(ns)  delta
      -c 100000                   1088   894     -18%
      -g -c 100000                1862   1646    -12%
      --call-graph lbr -c 100000  3649   3367    -8%
      --c.g. dwarf -c 100000      2248   1982    -12%
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/1533712328-2834-2-git-send-email-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      af3bdb99
    • Kan Liang's avatar
      perf/x86/intel: Factor out common code of PMI handler · ba12d20e
      Kan Liang authored
      The Arch Perfmon v4 PMI handler is substantially different than
      the older PMI handler. Instead of adding more and more ifs cleanly
      fork the new handler into a new function, with the main common
      code factored out into a common function.
      
      Fix complaint from checkpatch.pl by removing "false" from "static bool
      warned".
      
      No functional change.
      
      Based-on-code-from: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/1533712328-2834-1-git-send-email-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ba12d20e
  6. 25 Jul, 2018 4 commits
    • Kan Liang's avatar
      perf/x86/intel: Support Extended PEBS for Goldmont Plus · a38b0ba1
      Kan Liang authored
      Enable the extended PEBS for Goldmont Plus.
      
      There is no specific PEBS constrains for Goldmont Plus. Removing the
      pebs_constraints for Goldmont Plus.
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/20180309021542.11374-4-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      a38b0ba1
    • Kan Liang's avatar
      perf/x86/intel/ds: Handle PEBS overflow for fixed counters · ec71a398
      Kan Liang authored
      The pebs_drain() need to support fixed counters. The DS Save Area now
      include "counter reset value" fields for each fixed counters.
      
      Extend the related variables (e.g. mask, counters, error) to support
      fixed counters. There is no extended PEBS in PEBS v2 and earlier PEBS
      format. Only need to change the code for PEBS v3 and later PEBS format.
      
      Extend the pebs_event_reset[] logic to support new "counter reset value" fields.
      
      Increase the reserve space for fixed counters.
      
      Based-on-code-from: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/20180309021542.11374-3-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ec71a398
    • Kan Liang's avatar
      perf/x86/intel: Support PEBS on fixed counters · 4f08b625
      Kan Liang authored
      The Extended PEBS feature supports PEBS on fixed-function performance
      counters as well as all four general purpose counters.
      
      It has to change the order of PEBS and fixed counter enabling to make
      sure PEBS is enabled for the fixed counters.
      
      The change of the order doesn't impact the behavior of current code on
      other platforms which don't support extended PEBS.
      Because there is no dependency among those enable/disable functions.
      
      Don't enable IRQ generation (0x8) for MSR_ARCH_PERFMON_FIXED_CTR_CTRL.
      The PEBS ucode will handle the interrupt generation.
      
      Based-on-code-from: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/20180309021542.11374-2-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4f08b625
    • Peter Zijlstra's avatar
      perf/x86/intel: Fix unwind errors from PEBS entries (mk-II) · 6cbc304f
      Peter Zijlstra authored
      Vince reported the perf_fuzzer giving various unwinder warnings and
      Josh reported:
      
      > Deja vu.  Most of these are related to perf PEBS, similar to the
      > following issue:
      >
      >   b8000586 ("perf/x86/intel: Cure bogus unwind from PEBS entries")
      >
      > This is basically the ORC version of that.  setup_pebs_sample_data() is
      > assembling a franken-pt_regs which ORC isn't happy about.  RIP is
      > inconsistent with some of the other registers (like RSP and RBP).
      
      And where the previous unwinder only needed BP,SP ORC also requires
      IP. But we cannot spoof IP because then the sample will get displaced,
      entirely negating the point of PEBS.
      
      So cure the whole thing differently by doing the unwind early; this
      does however require a means to communicate we did the unwind early.
      We (ab)use an unused sample_type bit for this, which we set on events
      that fill out the data->callchain before the normal
      perf_prepare_sample().
      Debugged-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Reported-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Tested-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Tested-by: default avatarPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6cbc304f
  7. 25 Apr, 2018 1 commit
  8. 20 Mar, 2018 2 commits
  9. 09 Mar, 2018 3 commits
  10. 15 Feb, 2018 1 commit
  11. 27 Dec, 2017 1 commit
  12. 17 Dec, 2017 1 commit
  13. 13 Nov, 2017 1 commit
  14. 29 Sep, 2017 1 commit
  15. 14 Sep, 2017 1 commit
  16. 29 Aug, 2017 1 commit
  17. 25 Aug, 2017 4 commits
  18. 21 Jul, 2017 1 commit
    • Jiri Olsa's avatar
      perf/x86/intel: Add proper condition to run sched_task callbacks · df6c3db8
      Jiri Olsa authored
      We have 2 functions using the same sched_task callback:
      
        - PEBS drain for free running counters
        - LBR save/store
      
      Both of them are called from intel_pmu_sched_task() and
      either of them can be unwillingly triggered when the
      other one is configured to run.
      
      Let's say there's PEBS drain configured in sched_task
      callback for the event, but in the callback itself
      (intel_pmu_sched_task()) we will also run the code for
      LBR save/restore, which we did not ask for, but the
      code in intel_pmu_sched_task() does not check for that.
      
      This can lead to extra cycles in some perf monitoring,
      like when we monitor PEBS event without LBR data.
      
        # perf record --no-timestamp -c 10000 -e cycles:p ./perf bench sched pipe -l 1000000
      
        (We need PEBS, non freq/non timestamp event to enable
         the sched_task callback)
      
      The perf stat of cycles and msr:write_msr for above
      command before the change:
        ...
        Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \
                                       ./perf bench sched pipe -l 1000000' (5 runs):
      
          18,519,557,441      cycles:k
              91,195,527      msr:write_msr
      
            29.334476406 seconds time elapsed
      
      And after the change:
        ...
        Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \
                                       ./perf bench sched pipe -l 1000000' (5 runs):
      
          18,704,973,540      cycles:k
              27,184,720      msr:write_msr
      
            16.977875900 seconds time elapsed
      
      There's no affect on cycles:k because the sched_task happens
      with events switched off, however the msr:write_msr tracepoint
      counter together with almost 50% of time speedup show the
      improvement.
      
      Monitoring LBR event and having extra PEBS drain processing
      in sched_task callback showed just a little speedup, because
      the drain function does not do much extra work in case there
      is no PEBS data.
      
      Adding conditions to recognize the configured work that needs
      to be done in the x86_pmu's sched_task callback.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170719075247.GA27506@kravaSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      df6c3db8
  19. 18 Jul, 2017 1 commit
    • Kan Liang's avatar
      perf/x86/intel: Add Goldmont Plus CPU PMU support · dd0b06b5
      Kan Liang authored
      Add perf core PMU support for Intel Goldmont Plus CPU cores:
      
       - The init code is based on Goldmont.
       - There is a new cache event list, based on the Goldmont cache event
         list.
       - All four general-purpose performance counters support PEBS.
       - The first general-purpose performance counter is for reduced skid
         PEBS mechanism. Using :ppp to indicate the event which want to do
         reduced skid PEBS.
       - Goldmont Plus has 4-wide pipeline for Topdown
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Link: http://lkml.kernel.org/r/20170712134423.17766-1-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      dd0b06b5
  20. 22 Jun, 2017 1 commit
  21. 26 May, 2017 1 commit
  22. 23 May, 2017 1 commit
    • Kan Liang's avatar
      perf/x86: Add sysfs entry to freeze counters on SMI · 6089327f
      Kan Liang authored
      Currently, the SMIs are visible to all performance counters, because
      many users want to measure everything including SMIs. But in some
      cases, the SMI cycles should not be counted - for example, to calculate
      the cost of an SMI itself. So a knob is needed.
      
      When setting FREEZE_WHILE_SMM bit in IA32_DEBUGCTL, all performance
      counters will be effected. There is no way to do per-counter freeze
      on SMI. So it should not use the per-event interface (e.g. ioctl or
      event attribute) to set FREEZE_WHILE_SMM bit.
      
      Adds sysfs entry /sys/device/cpu/freeze_on_smi to set FREEZE_WHILE_SMM
      bit in IA32_DEBUGCTL. When set, freezes perfmon and trace messages
      while in SMM.
      
      Value has to be 0 or 1. It will be applied to all processors.
      
      Also serialize the entire setting so we don't get multiple concurrent
      threads trying to update to different values.
      Signed-off-by: default avatarKan Liang <Kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: acme@kernel.org
      Cc: bp@alien8.de
      Cc: jolsa@kernel.org
      Link: http://lkml.kernel.org/r/1494600673-244667-1-git-send-email-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6089327f
  23. 14 Apr, 2017 1 commit
    • Kan Liang's avatar
      perf/x86: Fix spurious NMI with PEBS Load Latency event · fd583ad1
      Kan Liang authored
      Spurious NMIs will be observed with the following command:
      
        while :; do
          perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp"
                        -e "cpu/umask=0x03,event=0x0/"
                        -e "cpu/umask=0x02,event=0x0/"
                        -e cycles,branches,cache-misses
                        -e cache-references -- sleep 10
        done
      
      The bug was introduced by commit:
      
        8077eca0 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
      
      That commit clears the status bits for the counters used for PEBS
      events, by masking the whole 64 bits pebs_enabled. However, only the
      low 32 bits of both status and pebs_enabled are reserved for PEBS-able
      counters.
      
      For status bits 32-34 are fixed counter overflow bits. For
      pebs_enabled bits 32-34 are for PEBS Load Latency.
      
      In the test case, the PEBS Load Latency event and fixed counter event
      could overflow at the same time. The fixed counter overflow bit will
      be cleared by mistake. Once it is cleared, the fixed counter overflow
      never be processed, which finally trigger spurious NMI.
      
      Correct the PEBS enabled mask by ignoring the non-PEBS bits.
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 8077eca0 ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+")
      Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      fd583ad1
  24. 16 Mar, 2017 1 commit
  25. 17 Jan, 2017 1 commit
    • Zhou Chengming's avatar
      perf/x86/intel: Handle exclusive threadid correctly on CPU hotplug · 4e71de79
      Zhou Chengming authored
      The CPU hotplug function intel_pmu_cpu_starting() sets
      cpu_hw_events.excl_thread_id unconditionally to 1 when the shared exclusive
      counters data structure is already availabe for the sibling thread.
      
      This works during the boot process because the first sibling gets threadid
      0 assigned and the second sibling which shares the data structure gets 1.
      
      But when the first thread of the core is offlined and onlined again it
      shares the data structure with the second thread and gets exclusive thread
      id 1 assigned as well.
      
      Prevent this by checking the threadid of the already online thread.
      
      [ tglx: Rewrote changelog ]
      Signed-off-by: default avatarZhou Chengming <zhouchengming1@huawei.com>
      Cc: NuoHan Qiao <qiaonuohan@huawei.com>
      Cc: ak@linux.intel.com
      Cc: peterz@infradead.org
      Cc: kan.liang@intel.com
      Cc: dave.hansen@linux.intel.com
      Cc: eranian@google.com
      Cc: qiaonuohan@huawei.com
      Cc: davidcc@google.com
      Cc: guohanjun@huawei.com
      Link: http://lkml.kernel.org/r/1484536871-3131-1-git-send-email-zhouchengming1@huawei.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      ---					---
       arch/x86/events/intel/core.c |    7 +++++--
       1 file changed, 5 insertions(+), 2 deletions(-)
      4e71de79
  26. 11 Jan, 2017 1 commit
  27. 22 Dec, 2016 1 commit
    • Stephane Eranian's avatar
      perf/x86/pebs: Fix handling of PEBS buffer overflows · daa864b8
      Stephane Eranian authored
      This patch solves a race condition between PEBS and the PMU handler.
      
      In case multiple PEBS events are sampled at the same time,
      it is possible to have GLOBAL_STATUS bit 62 set indicating
      PEBS buffer overflow and also seeing at most 3 PEBS counters
      having their bits set in the status register. This is a sign
      that there was at least one PEBS record pending at the time
      of the PMU interrupt. PEBS counters must only be processed
      via the drain_pebs() calls, and not via the regular sample
      processing loop coming after that the function, otherwise
      phony regular samples may be generated in the sampling buffer
      not marked with the EXACT tag.
      
      Another possibility is to have one PEBS event and at least
      one non-PEBS event whic hoverflows while PEBS has armed. In this
      case, bit 62 of GLOBAL_STATUS will not be set, yet the overflow
      status bit for the PEBS counter will be on Skylake.
      
      To avoid this problem, we systematically ignore the PEBS-enabled
      counters from the GLOBAL_STATUS mask and we always process PEBS
      events via drain_pebs().
      
      The problem manifested itself by having non-exact samples when
      sampling only PEBS events, i.e., the PERF_SAMPLE_RECORD would
      not have the EXACT flag set.
      
      Note that this problem is only present on Skylake processor.
      This fix is harmless on older processors.
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: http://lkml.kernel.org/r/1482395366-8992-1-git-send-email-eranian@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      daa864b8
  28. 06 Dec, 2016 1 commit
    • Peter Zijlstra (Intel)'s avatar
      perf/x86: Fix full width counter, counter overflow · 7f612a7f
      Peter Zijlstra (Intel) authored
      Lukasz reported that perf stat counters overflow handling is broken on KNL/SLM.
      
      Both these parts have full_width_write set, and that does indeed have
      a problem. In order to deal with counter wrap, we must sample the
      counter at at least half the counter period (see also the sampling
      theorem) such that we can unambiguously reconstruct the count.
      
      However commit:
      
        069e0c3c ("perf/x86/intel: Support full width counting")
      
      sets the sampling interval to the full period, not half.
      
      Fixing that exposes another issue, in that we must not sign extend the
      delta value when we shift it right; the counter cannot have
      decremented after all.
      
      With both these issues fixed, counter overflow functions correctly
      again.
      Reported-by: default avatarLukasz Odzioba <lukasz.odzioba@intel.com>
      Tested-by: default avatarLiang, Kan <kan.liang@intel.com>
      Tested-by: default avatarOdzioba, Lukasz <lukasz.odzioba@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@vger.kernel.org
      Fixes: 069e0c3c ("perf/x86/intel: Support full width counting")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7f612a7f