1. 11 Jan, 2017 1 commit
    • Prarit Bhargava's avatar
      perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code · 6d6daa20
      Prarit Bhargava authored
      hswep_uncore_cpu_init() uses a hardcoded physical package id 0 for the boot
      cpu. This works as long as the boot CPU is actually on the physical package
      0, which is normaly the case after power on / reboot.
      
      But it fails with a NULL pointer dereference when a kdump kernel is started
      on a secondary socket which has a different physical package id because the
      locigal package translation for physical package 0 does not exist.
      
      Use the logical package id of the boot cpu instead of hard coded 0.
      
      [ tglx: Rewrote changelog once more ]
      
      Fixes: cf6d445f
      
       ("perf/x86/uncore: Track packages, not per CPU data")
      Signed-off-by: default avatarPrarit Bhargava <prarit@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Harish Chegondi <harish.chegondi@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1483628965-2890-1-git-send-email-prarit@redhat.com
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      6d6daa20
  2. 05 Jan, 2017 2 commits
    • David Carrillo-Cisneros's avatar
      perf/x86: Set pmu->module in Intel PMU modules · 74545f63
      David Carrillo-Cisneros authored
      
      
      The conversion of Intel PMU drivers into modules did not include reference
      counting. The machine will crash when attempting to  access deleted code
      if an event from a module PMU is started and the module removed before the
      event is destroyed.
      
      i.e. this crashes the machine:
      
      	$ insmod intel-rapl-perf.ko
      	$ perf stat -e power/energy-cores/ -C 0 &
      	$ rmmod intel-rapl-perf.ko
      
      Set THIS_MODULE to pmu->module in Intel module PMUs so that generic code
      can handle reference counting and deny rmmod while an event still exists.
      Signed-off-by: default avatarDavid Carrillo-Cisneros <davidcc@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1482455860-116269-1-git-send-email-davidcc@google.com
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      74545f63
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo-4.10-20170104' of... · 4e06d4f0
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo-4.10-20170104' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
       into perf/urgent
      
      Pull perf/urgent fixes and one improvement from Arnaldo Carvalho de Melo:
      
      Fixes:
      
        - Fix prev/next_prio formatting for deadline tasks in libtraceevent (Daniel Bristot de Oliveira)
      
        - Robustify reading of build-ids from /sys/kernel/note (Arnaldo Carvalho de Melo)
      
        - Fix building some sample/bpf in Alpine Linux 3.4 (Arnaldo Carvalho de Melo)
      
        - Fix 'make install-bin' to install libtraceevent plugins (Arnaldo Carvalho de Melo)
      
        - Fix 'perf record --switch-output' documentation and comment (Jiri Olsa)
      
        - Fix 'perf probe' for cross arch probing (Masami Hiramatsu)
      
      Improvement:
      
        - Show total scheduling time in 'perf sched timehist' (Namhyumg Kim)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4e06d4f0
  3. 04 Jan, 2017 2 commits
  4. 03 Jan, 2017 6 commits
  5. 02 Jan, 2017 1 commit
    • Masami Hiramatsu's avatar
      perf probe: Fix to get correct modname from elf header · 1f2ed153
      Masami Hiramatsu authored
      
      
      Since 'perf probe' supports cross-arch probes, it is possible to analyze
      different arch kernel image which has different bits-per-long.
      
      In that case, it fails to get the module name because it uses the
      MOD_NAME_OFFSET macro based on the host machine bits-per-long, instead
      of the target arch bits-per-long.
      
      This fixes above issue by changing modname-offset based on the target
      archs bit width. This is ok because linux kernel uses LP64 model on
      64bit arch.
      
      E.g. without this (on x86_64, and target module is arm32):
      
        $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
        p:probe/configfs_lookup :configfs_lookup+0
                                ^-Here is an empty module name.
      
      With this fix, you can see correct module name:
      
        $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
        p:probe/configfs_lookup configfs:configfs_lookup+0
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devbox
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      1f2ed153
  6. 01 Jan, 2017 2 commits
    • Linus Torvalds's avatar
      Linux 4.10-rc2 · 0c744ea4
      Linus Torvalds authored
      0c744ea4
    • Linus Torvalds's avatar
      Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 4759d386
      Linus Torvalds authored
      Pull DAX updates from Dan Williams:
       "The completion of Jan's DAX work for 4.10.
      
        As I mentioned in the libnvdimm-for-4.10 pull request, these are some
        final fixes for the DAX dirty-cacheline-tracking invalidation work
        that was merged through the -mm, ext4, and xfs trees in -rc1. These
        patches were prepared prior to the merge window, but we waited for
        4.10-rc1 to have a stable merge base after all the prerequisites were
        merged.
      
        Quoting Jan on the overall changes in these patches:
      
           "So I'd like all these 6 patches to go for rc2. The first three
            patches fix invalidation of exceptional DAX entries (a bug which
            is there for a long time) - without these patches data loss can
            occur on power failure even though user called fsync(2). The other
            three patches change locking of DAX faults so that ->iomap_begin()
            is called in a more relaxed locking context and we are safe to
            start a transaction there for ext4"
      
        These have received a build success notification from the kbuild
        robot, and pass the latest libnvdimm unit tests. There have not been
        any -next releases since -rc1, so they have not appeared there"
      
      * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        ext4: Simplify DAX fault path
        dax: Call ->iomap_begin without entry lock during dax fault
        dax: Finish fault completely when loading holes
        dax: Avoid page invalidation races and unnecessary radix tree traversals
        mm: Invalidate DAX radix tree entries only if appropriate
        ext2: Return BH_New buffers for zeroed blocks
      4759d386
  7. 30 Dec, 2016 2 commits
  8. 29 Dec, 2016 2 commits
    • Olof Johansson's avatar
      mm/filemap: fix parameters to test_bit() · 98473f9f
      Olof Johansson authored
       mm/filemap.c: In function 'clear_bit_unlock_is_negative_byte':
        mm/filemap.c:933:9: error: too few arguments to function 'test_bit'
          return test_bit(PG_waiters);
               ^~~~~~~~
      
      Fixes: b91e1302
      
       ('mm: optimize PageWaiters bit use for unlock_page()')
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      Brown-paper-bag-by: default avatarLinus Torvalds <dummy@duh.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      98473f9f
    • Linus Torvalds's avatar
      mm: optimize PageWaiters bit use for unlock_page() · b91e1302
      Linus Torvalds authored
      In commit 62906027
      
       ("mm: add PageWaiters indicating tasks are
      waiting for a page bit") Nick Piggin made our page locking no longer
      unconditionally touch the hashed page waitqueue, which not only helps
      performance in general, but is particularly helpful on NUMA machines
      where the hashed wait queues can bounce around a lot.
      
      However, the "clear lock bit atomically and then test the waiters bit"
      sequence turns out to be much more expensive than it needs to be,
      because you get a nasty stall when trying to access the same word that
      just got updated atomically.
      
      On architectures where locking is done with LL/SC, this would be trivial
      to fix with a new primitive that clears one bit and tests another
      atomically, but that ends up not working on x86, where the only atomic
      operations that return the result end up being cmpxchg and xadd.  The
      atomic bit operations return the old value of the same bit we changed,
      not the value of an unrelated bit.
      
      On x86, we could put the lock bit in the high bit of the byte, and use
      "xadd" with that bit (where the overflow ends up not touching other
      bits), and look at the other bits of the result.  However, an even
      simpler model is to just use a regular atomic "and" to clear the lock
      bit, and then the sign bit in eflags will indicate the resulting state
      of the unrelated bit #7.
      
      So by moving the PageWaiters bit up to bit #7, we can atomically clear
      the lock bit and test the waiters bit on x86 too.  And architectures
      with LL/SC (which is all the usual RISC suspects), the particular bit
      doesn't matter, so they are fine with this approach too.
      
      This avoids the extra access to the same atomic word, and thus avoids
      the costly stall at page unlock time.
      
      The only downside is that the interface ends up being a bit odd and
      specialized: clear a bit in a byte, and test the sign bit.  Nick doesn't
      love the resulting name of the new primitive, but I'd rather make the
      name be descriptive and very clear about the limitation imposed by
      trying to work across all relevant architectures than make it be some
      generic thing that doesn't make the odd semantics explicit.
      
      So this introduces the new architecture primitive
      
          clear_bit_unlock_is_negative_byte();
      
      and adds the trivial implementation for x86.  We have a generic
      non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)"
      combination) which can be overridden by any architecture that can do
      better.  According to Nick, Power has the same hickup x86 has, for
      example, but some other architectures may not even care.
      
      All these optimizations mean that my page locking stress-test (which is
      just executing a lot of small short-lived shell scripts: "make test" in
      the git source tree) no longer makes our page locking look horribly bad.
      Before all these optimizations, just the unlock_page() costs were just
      over 3% of all CPU overhead on "make test".  After this, it's down to
      0.66%, so just a quarter of the cost it used to be.
      
      (The difference on NUMA is bigger, but there this micro-optimization is
      likely less noticeable, since the big issue on NUMA was not the accesses
      to 'struct page', but the waitqueue accesses that were already removed
      by Nick's earlier commit).
      Acked-by: default avatarNick Piggin <npiggin@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b91e1302
  9. 28 Dec, 2016 5 commits
    • Arnaldo Carvalho de Melo's avatar
      samples/bpf trace_output_user: Remove duplicate sys/ioctl.h include · b6f4c667
      Arnaldo Carvalho de Melo authored
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-3awp0nv8tpnblatojmwjww7z@git.kernel.org
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b6f4c667
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 2d706e79
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "This fixes a hash corruption bug in the marvell driver"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: marvell - Copy IVDIG before launching partial DMA ahash requests
      2d706e79
    • Arnaldo Carvalho de Melo's avatar
      samples/bpf sock_example: Avoid getting ethhdr from two includes · ee12996c
      Arnaldo Carvalho de Melo authored
      To avoid the following build failure on Alpine Linux 3.4, that has
      clang-3.8 with the bpf target:
      
          HOSTCC  samples/bpf/sock_example.o
        In file included from /usr/include/net/ethernet.h:10:0,
                         from /git/linux/samples/bpf/sock_example.h:7,
                         from /git/linux/samples/bpf/sock_example.c:30:
        /usr/include/netinet/if_ether.h:96:8: error: redefinition of 'struct
        ethhdr'
         struct ethhdr {
                ^
        In file included from /git/linux/samples/bpf/sock_example.c:26:0:
        ./usr/include/linux/if_ether.h:144:8: note: originally defined here
         struct ethhdr {
                ^
        scripts/Makefile.host:124: recipe for target
        'samples/bpf/sock_example.o' failed
        make[2]: *** [samples/bpf/sock_example.o] Error 1
        /git/linux/Makefile:1658: recipe for target 'samples/bpf/' failed
      
      So include net/if_ether.h for the needs of sock_example.h, using the
      same include that sock_example.c uses.
      
      Cc: Alexei Starovoitov <ast@fb.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Joe Stringer <joe@ovn.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-m9avekl1b651qe1r1zd5tzz9@git.kernel.org
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ee12996c
    • Namhyung Kim's avatar
      perf sched timehist: Show total scheduling time · 9396c9cb
      Namhyung Kim authored
      
      
      Show length of analyzed sample time and rate of idle task running.
      This also takes care of time range given by --time option.
      
        $ perf sched timehist -sI | tail
        Samples do not have callchains.
        Idle stats:
            CPU  0 idle for    930.316  msec  ( 92.93%)
            CPU  1 idle for    963.614  msec  ( 96.25%)
            CPU  2 idle for    885.482  msec  ( 88.45%)
            CPU  3 idle for    938.635  msec  ( 93.76%)
      
            Total number of unique tasks: 118
        Total number of context switches: 2337
                   Total run time (msec): 3718.048
            Total scheduling time (msec): 1001.131  (x 4)
      Suggested-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20161222060350.17655-3-namhyung@kernel.org
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9396c9cb
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 8f18e4d0
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Various ipvlan fixes from Eric Dumazet and Mahesh Bandewar.
      
          The most important is to not assume the packet is RX just because
          the destination address matches that of the device. Such an
          assumption causes problems when an interface is put into loopback
          mode.
      
       2) If we retry when creating a new tc entry (because we dropped the
          RTNL mutex in order to load a module, for example) we end up with
          -EAGAIN and then loop trying to replay the request. But we didn't
          reset some state when looping back to the top like this, and if
          another thread meanwhile inserted the same tc entry we were trying
          to, we re-link it creating an enless loop in the tc chain. Fix from
          Daniel Borkmann.
      
       3) There are two different WRITE bits in the MDIO address register for
          the stmmac chip, depending upon the chip variant. Due to a bug we
          could set them both, fix from Hock Leong Kweh.
      
       4) Fix mlx4 bug in XDP_TX handling, from Tariq Toukan.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: stmmac: fix incorrect bit set in gmac4 mdio addr register
        r8169: add support for RTL8168 series add-on card.
        net: xdp: remove unused bfp_warn_invalid_xdp_buffer()
        openvswitch: upcall: Fix vlan handling.
        ipv4: Namespaceify tcp_tw_reuse knob
        net: korina: Fix NAPI versus resources freeing
        net, sched: fix soft lockup in tc_classify
        net/mlx4_en: Fix user prio field in XDP forward
        tipc: don't send FIN message from connectionless socket
        ipvlan: fix multicast processing
        ipvlan: fix various issues in ipvlan_process_multicast()
      8f18e4d0
  10. 27 Dec, 2016 17 commits