- 05 Dec, 2018 1 commit
-
-
Jiri Olsa authored
commit ed6101bb upstream. Moving branch tracing setup to Intel core object into separate intel_pmu_bts_config function, because it's Intel specific. Suggested-by:
Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Acked-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20181121101612.16272-1-jolsa@kernel.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 01 Dec, 2018 1 commit
-
-
Kan Liang authored
[ Upstream commit c10a8de0 ] KabyLake and CoffeeLake CPUs have the same client uncore events as SkyLake. Add the PCI IDs for the KabyLake Y, U, S processor lines and CoffeeLake U, H, S processor lines. Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/20181019170419.378-1-kan.liang@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
- 04 Nov, 2018 1 commit
-
-
Kan Liang authored
[ Upstream commit 9d92cfea ] The counters on M3UPI Link 0 and Link 3 don't count properly, and writing 0 to these counters may causes system crash on some machines. The PCI BDF addresses of the M3UPI in the current code are incorrect. The correct addresses should be: D18:F1 0x204D D18:F2 0x204E D18:F5 0x204D Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: cd34cd97 ("perf/x86/intel/uncore: Add Skylake server uncore support") Link: http://lkml.kernel.org/r/1537538826-55489-1-git-send-email-kan.liang@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org>
-
- 10 Oct, 2018 1 commit
-
-
Jacek Tomaka authored
[ Upstream commit 16160c19 ] Problem: perf did not show branch predicted/mispredicted bit in brstack. Output of perf -F brstack for profile collected Before: 0x4fdbcd/0x4fdc03/-/-/-/0 0x45f4c1/0x4fdba0/-/-/-/0 0x45f544/0x45f4bb/-/-/-/0 0x45f555/0x45f53c/-/-/-/0 0x7f66901cc24b/0x45f555/-/-/-/0 0x7f66901cc22e/0x7f66901cc23d/-/-/-/0 0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0 0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0 After: 0x4fdbcd/0x4fdc03/P/-/-/0 0x45f4c1/0x4fdba0/P/-/-/0 0x45f544/0x45f4bb/P/-/-/0 0x45f555/0x45f53c/P/-/-/0 0x7f66901cc24b/0x45f555/P/-/-/0 0x7f66901cc22e/0x7f66901cc23d/P/-/-/0 0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0 0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0 Cause: As mentioned in Software Development Manual vol 3, 17.4.8.1, IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as its format. Despite that, registers containing FROM address of the branch, do have MISPREDICT bit but because of the format indicated in IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit. Solution: Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit. Signed-off-by:
Jacek Tomaka <jacek.tomaka@poczta.fm> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180802013830.10600-1-jacekt@dugeo.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 04 Oct, 2018 1 commit
-
-
Kan Liang authored
[ Upstream commit 0592e57b ] LBR has a limited stack size. If a task has a deeper call stack than LBR's stack size, only the overflowed part is reported. A complete call stack may not be reconstructed by perf tool. Current code doesn't access all LBR registers. It only read the ones below the TOS. The LBR registers above the TOS will be discarded unconditionally. When a CALL is captured, the TOS is incremented by 1 , modulo max LBR stack size. The LBR HW only records the call stack information to the register which the TOS points to. It will not touch other LBR registers. So the registers above the TOS probably still store the valid call stack information for an overflowed call stack, which need to be reported. To retrieve complete call stack information, we need to start from TOS, read all LBR registers until an invalid entry is detected. 0s can be used to detect the invalid entry, because: - When a RET is captured, the HW zeros the LBR register which TOS points to, then decreases the TOS. - The LBR registers are reset to 0 when adding a new LBR event or scheduling an existing LBR event. - A taken branch at IP 0 is not expected The context switch code is also modified to save/restore all valid LBR registers. Furthermore, the LBR registers, which don't have valid call stack information, need to be reset in restore, because they may be polluted while swapped out. Here is a small test program, tchain_deep. Its call stack is deeper than 32. noinline void f33(void) { int i; for (i = 0; i < 10000000;) { if (i%2) i++; else i++; } } noinline void f32(void) { f33(); } noinline void f31(void) { f32(); } ... ... noinline void f1(void) { f2(); } int main() { f1(); } Here is the test result on SKX. The max stack size of SKX is 32. Without the patch: $ perf record -e cycles --call-graph lbr -- ./tchain_deep $ perf report --stdio # # Children Self Command Shared Object Symbol # ........ ........ ........... ................ ................. # 100.00% 99.99% tchain_deep tchain_deep [.] f33 | --99.99%--f30 f31 f32 f33 With the patch: $ perf record -e cycles --call-graph lbr -- ./tchain_deep $ perf report --stdio # Children Self Command Shared Object Symbol # ........ ........ ........... ................ .................. # 99.99% 0.00% tchain_deep tchain_deep [.] f1 | ---f1 f2 f3 f4 f5 f6 f7 f8 f9 f10 f11 f12 f13 f14 f15 f16 f17 f18 f19 f20 f21 f22 f23 f24 f25 f26 f27 f28 f29 f30 f31 f32 f33 Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: eranian@google.com Link: https://lore.kernel.org/lkml/1528213126-4312-1-git-send-email-kan.liang@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 03 Aug, 2018 2 commits
-
-
Kan Liang authored
[ Upstream commit d71f11c0 ] For Nehalem and Westmere, there is only one fixed counter for W-Box. There is no index which is bigger than UNCORE_PMC_IDX_FIXED. It is not correct to use >= to check fixed counter. The code quality issue will bring problem when new counter index is introduced. Signed-off-by:
Kan Liang <kan.liang@intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: acme@kernel.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1525371913-10597-2-git-send-email-kan.liang@intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kan Liang authored
[ Upstream commit 4749f819 ] There is no index which is bigger than UNCORE_PMC_IDX_FIXED. The only exception is client IMC uncore, which has been specially handled. For generic code, it is not correct to use >= to check fixed counter. The code quality issue will bring problem when a new counter index is introduced. Signed-off-by:
Kan Liang <kan.liang@intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: acme@kernel.org Cc: eranian@google.com Link: http://lkml.kernel.org/r/1525371913-10597-3-git-send-email-kan.liang@intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 Jul, 2018 1 commit
-
-
Hugh Dickins authored
commit 2c991e40 upstream. Markus reported that BTS is sporadically missing the tail of the trace in the perf_event data buffer: [decode error (1): instruction overflow] shown in GDB; and bisected it to the conversion of debug_store to PTI. A little "optimization" crept into alloc_bts_buffer(), which mistakenly placed bts_interrupt_threshold away from the 24-byte record boundary. Intel SDM Vol 3B 17.4.9 says "This address must point to an offset from the BTS buffer base that is a multiple of the BTS record size." Revert "max" from a byte count to a record count, to calculate the bts_interrupt_threshold correctly: which turns out to fix problem seen. Fixes: c1961a46 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area") Reported-and-tested-by:
Markus T Metzger <markus.t.metzger@intel.com> Signed-off-by:
Hugh Dickins <hughd@google.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <andi.kleen@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: stable@vger.kernel.org # v4.14+ Link: https://lkml.kernel.org/r/alpine.LSU.2.11.1807141248290.1614@eggly.anvilsSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 03 Jul, 2018 1 commit
-
-
Kan Liang authored
commit bb9fbe1b upstream. Event select bit 7 'Use Occupancy' in PCU Box is not available for counter 0 on BDX Add a constraint to fix it. Reported-by:
Stephane Eranian <eranian@google.com> Signed-off-by:
Kan Liang <kan.liang@intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
Stephane Eranian <eranian@google.com> Cc: peterz@infradead.org Cc: ak@linux.intel.com Link: https://lkml.kernel.org/r/1510668400-301000-1-git-send-email-kan.liang@intel.com Cc: "Jin, Yao" <yao.jin@linux.intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 20 Jun, 2018 1 commit
-
-
Kan Liang authored
[ Upstream commit 4e949e9b ] The SMM freeze feature was introduced since PerfMon V2. But the current code unconditionally enables the feature for all platforms. It can generate #GP exception, if the related FREEZE_WHILE_SMM bit is set for the machine with PerfMon V1. To disable the feature for PerfMon V1, perf needs to - Remove the freeze_on_smi sysfs entry by moving intel_pmu_attrs to intel_pmu, which is only applied to PerfMon V2 and later. - Check the PerfMon version before flipping the SMM bit when starting CPU Fixes: 6089327f ("perf/x86: Add sysfs entry to freeze counters on SMI") Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: ak@linux.intel.com Cc: eranian@google.com Cc: acme@redhat.com Link: https://lkml.kernel.org/r/1524682637-63219-1-git-send-email-kan.liang@linux.intel.comSigned-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 30 May, 2018 4 commits
-
-
Kan Liang authored
[ Upstream commit d31fc13f ] There is a bug when reading event->count with large PEBS enabled. Here is an example: # ./read_count 0x71f0 0x122c0 0x1000000001c54 0x100000001257d 0x200000000bdc5 In fixed period mode, the auto-reload mechanism could be enabled for PEBS events, but the calculation of event->count does not take the auto-reload values into account. Anyone who reads event->count will get the wrong result, e.g x86_pmu_read(). This bug was introduced with the auto-reload mechanism enabled since commit: 851559e3 ("perf/x86/intel: Use the PEBS auto reload mechanism when possible") Introduce intel_pmu_save_and_restart_reload() to calculate the event->count only for auto-reload. Since the counter increments a negative counter value and overflows on the sign switch, giving the interval: [-period, 0] the difference between two consequtive reads is: A) value2 - value1; when no overflows have happened in between, B) (0 - value1) + (value2 - (-period)); when one overflow happened in between, C) (0 - value1) + (n - 1) * (period) + (value2 - (-period)); when @n overflows happened in between. Here A) is the obvious difference, B) is the extension to the discrete interval, where the first term is to the top of the interval and the second term is from the bottom of the next interval and C) the extension to multiple intervals, where the middle term is the whole intervals covered. The equation for all cases is: value2 - value1 + n * period Previously the event->count is updated right before the sample output. But for case A, there is no PEBS record ready. It needs to be specially handled. Remove the auto-reload code from x86_perf_event_set_period() since we'll not longer call that function in this case. Based-on-code-from: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Fixes: 851559e3 ("perf/x86/intel: Use the PEBS auto reload mechanism when possible") Link: http://lkml.kernel.org/r/1518474035-21006-2-git-send-email-kan.liang@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kan Liang authored
[ Upstream commit f605cfca ] Large fixed period values could be truncated on Broadwell, for example: perf record -e cycles -c 10000000000 Here the fixed period is 0x2540BE400, but the period which finally applied is 0x540BE400 - which is wrong. The reason is that x86_pmu::limit_period() uses an u32 parameter, so the high 32 bits of 'period' get truncated. This bug was introduced in: commit 294fe0f5 ("perf/x86/intel: Add INST_RETIRED.ALL workarounds") It's safe to use u64 instead of u32: - Although the 'left' is s64, the value of 'left' must be positive when calling limit_period(). - bdw_limit_period() only modifies the lowest 6 bits, it doesn't touch the higher 32 bits. Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 294fe0f5 ("perf/x86/intel: Add INST_RETIRED.ALL workarounds") Link: http://lkml.kernel.org/r/1519926894-3520-1-git-send-email-kan.liang@linux.intel.com [ Rewrote unacceptably bad changelog. ] Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kan Liang authored
[ Upstream commit 82d71ed0 ] The PMU is disabled in intel_pmu_handle_irq(), but cpuc->enabled is not updated accordingly. This is fine in current usage because no-one checks it - but fix it for future code: for example, the drain_pebs() will be modified to fix an auto-reload bug. Properly save/restore the old PMU state. Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: kernel test robot <fengguang.wu@intel.com> Link: http://lkml.kernel.org/r/6f44ee84-56f8-79f1-559b-08e371eaeb78@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stephane Eranian authored
[ Upstream commit 71eb9ee9 ] this patch fix a bug in how the pebs->real_ip is handled in the PEBS handler. real_ip only exists in Haswell and later processor. It is actually the eventing IP, i.e., where the event occurred. As opposed to the pebs->ip which is the PEBS interrupt IP which is always off by one. The problem is that the real_ip just like the IP needs to be fixed up because PEBS does not record all the machine state registers, and in particular the code segement (cs). This is why we have the set_linear_ip() function. The problem was that set_linear_ip() was only used on the pebs->ip and not the pebs->real_ip. We have profiles which ran into invalid callstacks because of this. Here is an example: ..... 0: ffffffffffffff80 recent entry, marker kernel v ..... 1: 000000000040044d <= user address in kernel space! ..... 2: fffffffffffffe00 marker enter user v ..... 3: 000000000040044d ..... 4: 00000000004004b6 oldest entry Debugging output in get_perf_callchain(): [ 857.769909] CALLCHAIN: CPU8 ip=40044d regs->cs=10 user_mode(regs)=0 The problem is that the kernel entry in 1: points to a user level address. How can that be? The reason is that with PEBS sampling the instruction that caused the event to occur and the instruction where the CPU was when the interrupt was posted may be far apart. And sometime during that time window, the privilege level may change. This happens, for instance, when the PEBS sample is taken close to a kernel entry point. Here PEBS, eventing IP (real_ip) captured a user level instruction. But by the time the PMU interrupt fired, the processor had already entered kernel space. This is why the debug output shows a user address with user_mode() false. The problem comes from PEBS not recording the code segment (cs) register. The register is used in x86_64 to determine if executing in kernel vs user space. This is okay because the kernel has a software workaround called set_linear_ip(). But the issue in setup_pebs_sample_data() is that set_linear_ip() is never called on the real_ip value when it is available (Haswell and later) and precise_ip > 1. This patch fixes this problem and eliminates the callchain discrepancy. The patch restructures the code around set_linear_ip() to minimize the number of times the IP has to be set. Signed-off-by:
Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: kan.liang@intel.com Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 16 May, 2018 1 commit
-
-
Peter Zijlstra authored
commit a5f81290 upstream. > arch/x86/events/intel/cstate.c:307 cstate_pmu_event_init() warn: potential spectre issue 'pkg_msr' (local cap) Userspace controls @attr, sanitize cfg (attr->config) before using it to index an array. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 28 Mar, 2018 3 commits
-
-
Kan Liang authored
commit 320b0651 upstream. The number of CHAs is miscalculated on multi-domain PCI Skylake server systems, resulting in an uncore driver initialization error. Gary Kroening explains: "For systems with a single PCI segment, it is sufficient to look for the bus number to change in order to determine that all of the CHa's have been counted for a single socket. However, for multi PCI segment systems, each socket is given a new segment and the bus number does NOT change. So looking only for the bus number to change ends up counting all of the CHa's on all sockets in the system. This leads to writing CPU MSRs beyond a valid range and causes an error in ivbep_uncore_msr_init_box()." To fix this bug, query the number of CHAs from the CAPID6 register: it should read bits 27:0 in the CAPID6 register located at Device 30, Function 3, Offset 0x9C. These 28 bits form a bit vector of available LLC slices and the CHAs that manage those slices. Reported-by:
Kroening, Gary <gary.kroening@hpe.com> Tested-by:
Kroening, Gary <gary.kroening@hpe.com> Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: abanman@hpe.com Cc: dimitri.sivanich@hpe.com Cc: hpa@zytor.com Cc: mike.travis@hpe.com Cc: russ.anderson@hpe.com Fixes: cd34cd97 ("perf/x86/intel/uncore: Add Skylake server uncore support") Link: http://lkml.kernel.org/r/1520967094-13219-1-git-send-email-kan.liang@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit e5ea9b54 upstream. We intended to clear the lowest 6 bits but because of a type bug we clear the high 32 bits as well. Andi says that periods are rarely more than U32_MAX so this bug probably doesn't have a huge runtime impact. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 294fe0f5 ("perf/x86/intel: Add INST_RETIRED.ALL workarounds") Link: http://lkml.kernel.org/r/20180317115216.GB4035@mwandaSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kan Liang authored
commit 31766094 upstream. There is no event extension (bit 21) for SKX UPI, so use 'event' instead of 'event_ext'. Reported-by:
Stephane Eranian <eranian@google.com> Signed-off-by:
Kan Liang <kan.liang@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: cd34cd97 ("perf/x86/intel/uncore: Add Skylake server uncore support") Link: http://lkml.kernel.org/r/1520004150-4855-1-git-send-email-kan.liang@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 03 Mar, 2018 1 commit
-
-
Thomas Gleixner authored
[ Upstream commit 7ad1437d ] A recent commit introduced an extra merge_attr() call in the skylake branch, which causes a memory leak. Store the pointer to the extra allocated memory and free it at the end of the function. Fixes: a5df70c3 ("perf/x86: Only show format attributes when supported") Reported-by:
Tommi Rantala <tommi.t.rantala@nokia.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 22 Feb, 2018 1 commit
-
-
Jia Zhang authored
commit b399151c upstream. x86_mask is a confusing name which is hard to associate with the processor's stepping. Additionally, correct an indent issue in lib/cpu.c. Signed-off-by:
Jia Zhang <qianyue.zj@alibaba-inc.com> [ Updated it to more recent kernels. ] Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bp@alien8.de Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 17 Jan, 2018 1 commit
-
-
Peter Zijlstra authored
commit 99a9dc98 upstream. The intel_bts driver does not use the 'normal' BTS buffer which is exposed through the cpu_entry_area but instead uses the memory allocated for the perf AUX buffer. This obviously comes apart when using PTI because then the kernel mapping; which includes that AUX buffer memory; disappears. Fixing this requires to expose a mapping which is visible in all context and that's not trivial. As a quick fix disable this driver when PTI is enabled to prevent malfunction. Fixes: 385ce0ea ("x86/mm/pti: Add Kconfig") Reported-by:
Vince Weaver <vincent.weaver@maine.edu> Reported-by:
Robert Święcki <robert@swiecki.net> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: greg@kroah.com Cc: hughd@google.com Cc: luto@amacapital.net Cc: Vince Weaver <vince@deater.net> Cc: torvalds@linux-foundation.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180114102713.GB6166@worktop.programming.kicks-ass.netSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 Jan, 2018 1 commit
-
-
Peter Zijlstra authored
commit 42f3bdc5 upstream. Thomas reported the following warning: BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498 caller is native_flush_tlb_single+0x57/0xc0 native_flush_tlb_single+0x57/0xc0 __set_pte_vaddr+0x2d/0x40 set_pte_vaddr+0x2f/0x40 cea_set_pte+0x30/0x40 ds_update_cea.constprop.4+0x4d/0x70 reserve_ds_buffers+0x159/0x410 x86_reserve_hardware+0x150/0x160 x86_pmu_event_init+0x3e/0x1f0 perf_try_init_event+0x69/0x80 perf_event_alloc+0x652/0x740 SyS_perf_event_open+0x3f6/0xd60 do_syscall_64+0x5c/0x190 set_pte_vaddr is used to map the ds buffers into the cpu entry area, but there are two problems with that: 1) The resulting flush is not supposed to be called in preemptible context 2) The cpu entry area is supposed to be per CPU, but the debug store buffers are mapped for all CPUs so these mappings need to be flushed globally. Add the necessary preemption protection across the mapping code and flush TLBs globally. Fixes: c1961a46 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area") Reported-by:
Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at> Signed-off-by:
Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.netSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 02 Jan, 2018 2 commits
-
-
Hugh Dickins authored
commit c1961a46 upstream. The BTS and PEBS buffers both have their virtual addresses programmed into the hardware. This means that any access to them is performed via the page tables. The times that the hardware accesses these are entirely dependent on how the performance monitoring hardware events are set up. In other words, there is no way for the kernel to tell when the hardware might access these buffers. To avoid perf crashes, place 'debug_store' allocate pages and map them into the cpu_entry_area. The PEBS fixup buffer does not need this treatment. [ tglx: Got rid of the kaiser_add_mapping() complication ] Signed-off-by:
Hugh Dickins <hughd@google.com> Signed-off-by:
Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: keescook@google.com Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit 10043e02 upstream. The Intel PEBS/BTS debug store is a design trainwreck as it expects virtual addresses which must be visible in any execution context. So it is required to make these mappings visible to user space when kernel page table isolation is active. Provide enough room for the buffer mappings in the cpu_entry_area so the buffers are available in the user space visible page tables. At the point where the kernel side entry area is populated there is no buffer available yet, but the kernel PMD must be populated. To achieve this set the entries for these buffers to non present. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 Dec, 2017 1 commit
-
-
Andi Kleen authored
commit 2fe1bc1f upstream. [ Note, this is a Git cherry-pick of the following commit: a47ba4d7 ("perf/x86: Enable free running PEBS for REGS_USER/INTR") ... for easier x86 PTI code testing and back-porting. ] Currently free running PEBS is disabled when user or interrupt registers are requested. Most of the registers are actually available in the PEBS record and can be supported. So we just need to check for the supported registers and then allow it: it is all except for the segment register. For user registers this only works when the counter is limited to ring 3 only, so this also needs to be checked. Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170831214630.21892-1-andi@firstfloor.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 30 Nov, 2017 1 commit
-
-
Andi Kleen authored
commit 58ba4d5a upstream. 0day testing reported a perf test regression on Haswell systems without RTM. Commit a5df70c3 hides the in_tx/in_tx_cp attributes when RTM is not available, but the TSX events are still available in sysfs. Due to the missing attributes the event parser fails on those files. Don't show the TSX events in sysfs when RTM is not available on Haswell/Broadwell/Skylake. Fixes: a5df70c3 (perf/x86: Only show format attributes when supported) Reported-by:
kernel test robot <xiaolong.ye@intel.com> Tested-by:
Jin Yao <yao.jin@linux.intel.com> Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20171109000718.14137-1-andi@firstfloor.orgSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 02 Nov, 2017 1 commit
-
-
Greg Kroah-Hartman authored
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by:
Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by:
Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 Oct, 2017 1 commit
-
-
Alexander Shishkin authored
Commit: d2878d64 ("perf/x86/intel/bts: Disallow use by unprivileged users on paranoid systems") ... adds a privilege check in the exactly wrong place in the event init path: after the 'LBR exclusive' reference has been taken, and doesn't release it in the case of insufficient privileges. After this, nobody in the system gets to use PT or LBR afterwards. This patch moves the privilege check to where it should have been in the first place. Signed-off-by:
Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: d2878d64 ("perf/x86/intel/bts: Disallow use by unprivileged users on paranoid systems") Link: http://lkml.kernel.org/r/20171023123533.16973-1-alexander.shishkin@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 10 Oct, 2017 1 commit
-
-
Colin Ian King authored
Currently if an allocation fails then the error return paths don't free up any currently allocated pmus[].boxes and pmus causing a memory leak. Add an error clean up exit path that frees these objects. Detected by CoverityScan, CID#711632 ("Resource Leak") Signed-off-by:
Colin Ian King <colin.king@canonical.com> Acked-by:
Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Fixes: 087bfbb0 ("perf/x86: Add generic Intel uncore PMU support") Link: http://lkml.kernel.org/r/20171009172655.6132-1-colin.king@canonical.comSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 25 Sep, 2017 3 commits
-
-
Kan Liang authored
There are 6 IIO/IRP boxes for CBDMA, PCIe0-2, MCP 0 and MCP 1 separately. Correct the num_boxes. Signed-off-by:
Kan Liang <Kan.liang@intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: acme@kernel.org Link: http://lkml.kernel.org/r/1505149816-12580-1-git-send-email-kan.liang@intel.com
-
Kan Liang authored
DENVERTON and GEMINI_LAKE support same RAPL counters as Apollo Lake. Signed-off-by:
Kan Liang <Kan.liang@intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: piotr.luc@intel.com Cc: harry.pan@intel.com Cc: srinivas.pandruvada@linux.intel.com Link: http://lkml.kernel.org/r/20170908213449.6224-3-kan.liang@intel.com
-
Kan Liang authored
Skylake server uses the same C-state residency events as Sandy Bridge. Denverton and Gemini lake use the same C-state residency events as Apollo Lake. Signed-off-by:
Kan Liang <Kan.liang@intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: peterz@infradead.org Cc: piotr.luc@intel.com Cc: harry.pan@intel.com Cc: srinivas.pandruvada@linux.intel.com Link: http://lkml.kernel.org/r/20170908213449.6224-1-kan.liang@intel.com
-
- 14 Sep, 2017 1 commit
-
-
Peter Zijlstra authored
The lockup_detector_suspend/resume() interface is broken in several ways especially as it results in recursive locking of the CPU hotplug lock. Use the new stop/restart interface in the perf NMI watchdog to temporarily disable and reenable the already active watchdog events. That's enough to handle it. Signed-off-by:
Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Don Zickus <dzickus@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: Ulrich Obergfell <uobergfe@redhat.com> Link: http://lkml.kernel.org/r/20170912194146.247141871@linutronix.deSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 29 Aug, 2017 3 commits
-
-
Peter Zijlstra authored
Move the 'max_precise' capability into generic x86 code where it belongs. This fixes a sysfs splat on !Intel systems where we fail to set x86_pmu_caps_group.atts. Reported-and-tested-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Andi Kleen <ak@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Fixes: 22688d1c20f5 ("x86/perf: Export some PMU attributes in caps/ directory") Link: http://lkml.kernel.org/r/20170828104650.2u3rsim4jafyjzv2@hirez.programming.kicks-ass.netSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
Kan Liang authored
For understanding how the workload maps to memory channels and hardware behavior, it's very important to collect address maps with physical addresses. For example, 3D XPoint access can only be found by filtering the physical address. Add a new sample type for physical address. perf already has a facility to collect data virtual address. This patch introduces a function to convert the virtual address to physical address. The function is quite generic and can be extended to any architecture as long as a virtual address is provided. - For kernel direct mapping addresses, virt_to_phys is used to convert the virtual addresses to physical address. - For user virtual addresses, __get_user_pages_fast is used to walk the pages tables for user physical address. - This does not work for vmalloc addresses right now. These are not resolved, but code to do that could be added. The new sample type requires collecting the virtual address. The virtual address will not be output unless SAMPLE_ADDR is applied. For security, the physical address can only be exposed to root or privileged user. Tested-by:
Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by:
Kan Liang <kan.liang@intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: mpe@ellerman.id.au Link: http://lkml.kernel.org/r/1503967969-48278-1-git-send-email-kan.liang@intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
Alexander Shishkin authored
I just noticed that hw.itrace_started and hw.config are aliased to the same location. Now, the PT driver happens to use both, which works out fine by sheer luck: - STORE(hw.itrace_start) is ordered before STORE(hw.config), in the program order, although there are no compiler barriers to ensure that, - to the perf_log_itrace_start() hw.itrace_start looks set at the same time as when it is intended to be set because both stores happen in the same path, - hw.config is never reset to zero in the PT driver. Now, the use of hw.config by the PT driver makes more sense (it being a HW PMU) than messing around with itrace_started, which is an awkward API to begin with. This patch replaces hw.itrace_started with an attach_state bit and an API call for the PMU drivers to use to communicate the condition. Signed-off-by:
Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/20170330153956.25994-1-alexander.shishkin@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
- 25 Aug, 2017 4 commits
-
-
Andi Kleen authored
It can be difficult to figure out for user programs what features the x86 CPU PMU driver actually supports. Currently it requires grepping in dmesg, but dmesg is not always available. This adds a caps directory to /sys/bus/event_source/devices/cpu/, similar to the caps already used on intel_pt, which can be used to discover the available capabilities cleanly. Three capabilities are defined: - pmu_name: Underlying CPU name known to the driver - max_precise: Max precise level supported - branches: Known depth of LBR. Example: % grep . /sys/bus/event_source/devices/cpu/caps/* /sys/bus/event_source/devices/cpu/caps/branches:32 /sys/bus/event_source/devices/cpu/caps/max_precise:3 /sys/bus/event_source/devices/cpu/caps/pmu_name:skylake Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170822185201.9261-3-andi@firstfloor.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
Andi Kleen authored
Only show the Intel format attributes in sysfs when the feature is actually supported with the current model numbers. This allows programs to probe what format attributes are available, and give a sensible error message to users if they are not. This handles near all cases for intel attributes since Nehalem, except the (obscure) case when the model number if known, but PEBS is disabled in PERF_CAPABILITIES. Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170822185201.9261-2-andi@firstfloor.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
Andi Kleen authored
Skylake changed the encoding of the PEBS data source field. Some combinations are not available anymore, but some new cases e.g. for L4 cache hit are added. Fix up the conversion table for Skylake, similar as had been done for Nehalem. On Skylake server the encoding for L4 actually means persistent memory. Handle this case too. To properly describe it in the abstracted perf format I had to add some new fields. Since a hit can have only one level add a new field that is an enumeration, not a bit field to describe the level. It can describe any level. Some numbers are also used to describe PMEM and LFB. Also add a new generic remote flag that can be combined with the generic level to signify a remote cache. And there is an extension field for the snoop indication to handle the Forward state. I didn't add a generic flag for hops because it's not needed for Skylake. I changed the existing encodings for older CPUs to also fill in the new level and remote fields. Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/20170816222156.19953-3-andi@firstfloor.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-
Andi Kleen authored
Minor cleanup: use an explicit x86_pmu flag to handle the missing Lock / TLB information on Nehalem, instead of always checking the model number for each PEBS sample. Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: jolsa@kernel.org Link: http://lkml.kernel.org/r/20170816222156.19953-2-andi@firstfloor.orgSigned-off-by:
Ingo Molnar <mingo@kernel.org>
-