1. 14 Mar, 2016 3 commits
    • Seth Forshee's avatar
      fuse: Add reference counting for fuse_io_priv · 744742d6
      Seth Forshee authored
      The 'reqs' member of fuse_io_priv serves two purposes. First is to track
      the number of oustanding async requests to the server and to signal that
      the io request is completed. The second is to be a reference count on the
      structure to know when it can be freed.
      For sync io requests these purposes can be at odds.  fuse_direct_IO() wants
      to block until the request is done, and since the signal is sent when
      'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to
      use the object after the userspace server has completed processing
      requests. This leads to some handshaking and special casing that it
      needlessly complicated and responsible for at least one race condition.
      It's much cleaner and safer to maintain a separate reference count for the
      object lifecycle and to let 'reqs' just be a count of outstanding requests
      to the userspace server. Then we can know for sure when it is safe to free
      the object without any handshaking or special cases.
      The catch here is that most of the time these objects are stack allocated
      and should not be freed. Initializing these objects with a single reference
      that is never released prevents accidental attempts to free the objects.
      Fixes: 9d5722b7
       ("fuse: handle synchronous iocbs internally")
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: default avatarSeth Forshee <seth.forshee@canonical.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
    • Robert Doebbelin's avatar
      fuse: do not use iocb after it may have been freed · 7cabc61e
      Robert Doebbelin authored
      There's a race in fuse_direct_IO(), whereby is_sync_kiocb() is called on an
      iocb that could have been freed if async io has already completed.  The fix
      in this case is simple and obvious: cache the result before starting io.
      It was discovered by KASan:
      kernel: ==================================================================
      kernel: BUG: KASan: use after free in fuse_direct_IO+0xb1a/0xcc0 at addr ffff88036c414390
      Signed-off-by: default avatarRobert Doebbelin <robert@quobyte.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: bcba24cc ("fuse: enable asynchronous processing direct IO")
      Cc: <stable@vger.kernel.org> # 3.10+
    • Linus Torvalds's avatar
      Linux 4.5 · b562e44f
      Linus Torvalds authored
  2. 13 Mar, 2016 8 commits
  3. 12 Mar, 2016 6 commits
    • Ming Lei's avatar
      block: don't optimize for non-cloned bio in bio_get_last_bvec() · 90d0f0f1
      Ming Lei authored
      For !BIO_CLONED bio, we can use .bi_vcnt safely, but it
      doesn't mean we can just simply return .bi_io_vec[.bi_vcnt - 1]
      because the start postion may have been moved in the middle of
      the bvec, such as splitting in the middle of bvec.
      Fixes: 7bcd79ac
      (block: bio: introduce helpers to get the 1st and last bvec)
      Cc: stable@vger.kernel.org
      Reported-by: default avatarKent Overstreet <kent.overstreet@gmail.com>
      Signed-off-by: default avatarMing Lei <ming.lei@canonical.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
    • Matt Fleming's avatar
      x86/efi: Fix boot crash by always mapping boot service regions into new EFI page tables · 452308de
      Matt Fleming authored
      Some machines have EFI regions in page zero (physical address
      0x00000000) and historically that region has been added to the e820
      map via trim_bios_range(), and ultimately mapped into the kernel page
      tables. It was not mapped via efi_map_regions() as one would expect.
      Alexis reports that with the new separate EFI page tables some boot
      services regions, such as page zero, are not mapped. This triggers an
      oops during the SetVirtualAddressMap() runtime call.
      For the EFI boot services quirk on x86 we need to memblock_reserve()
      boot services regions until after SetVirtualAddressMap(). Doing that
      while respecting the ownership of regions that may have already been
      reserved by the kernel was the motivation behind this commit:
       ("x86, efi: Do not reserve boot services regions within reserved areas")
      That patch was merged at a time when the EFI runtime virtual mappings
      were inserted into the kernel page tables as described above, and the
      trick of setting ->numpages (and hence the region size) to zero to
      track regions that should not be freed in efi_free_boot_services()
      meant that we never mapped those regions in efi_map_regions(). Instead
      we were relying solely on the existing kernel mappings.
      Now that we have separate page tables we need to make sure the EFI
      boot services regions are mapped correctly, even if someone else has
      already called memblock_reserve(). Instead of stashing a tag in
      ->numpages, set the EFI_MEMORY_RUNTIME bit of ->attribute. Since it
      generally makes no sense to mark a boot services region as required at
      runtime, it's pretty much guaranteed the firmware will not have
      already set this bit.
      For the record, the specific circumstances under which Alexis
      triggered this bug was that an EFI runtime driver on his machine was
      responding to the EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE event during
      The event handler for this driver looks like this,
        sub rsp,0x28
        lea rdx,[rip+0x2445] # 0xaa948720
        mov ecx,0x4
        call func_aa9447c0  ; call to ConvertPointer(4, & 0xaa948720)
        mov r11,QWORD PTR [rip+0x2434] # 0xaa948720
        xor eax,eax
        mov BYTE PTR [r11+0x1],0x1
        add rsp,0x28
      Which is pretty typical code for an EVT_SIGNAL_VIRTUAL_ADDRESS_CHANGE
      handler. The "mov r11, QWORD PTR [rip+0x2424]" was the faulting
      instruction because ConvertPointer() was being called to convert the
      address 0x0000000000000000, which when converted is left unchanged and
      remains 0x0000000000000000.
      The output of the oops trace gave the impression of a standard NULL
      pointer dereference bug, but because we're accessing physical
      addresses during ConvertPointer(), it wasn't. EFI boot services code
      is stored at that address on Alexis' machine.
      Reported-by: default avatarAlexis Murzeau <amurzeau@gmail.com>
      Signed-off-by: default avatarMatt Fleming <matt@codeblueprint.co.uk>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Raphael Hertzog <hertzog@debian.org>
      Cc: Roger Shimizu <rogershimizu@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/1457695163-29632-2-git-send-email-matt@codeblueprint.co.uk
      Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815125
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    • Borislav Petkov's avatar
      x86/fpu: Fix eager-FPU handling on legacy FPU machines · 6e686709
      Borislav Petkov authored
      i486 derived cores like Intel Quark support only the very old,
      legacy x87 FPU (FSAVE/FRSTOR, CPUID bit FXSR is not set), and
      our FPU code wasn't handling the saving and restoring there
      properly in the 'eagerfpu' case.
      So after we made eagerfpu the default for all CPU types:
       x86/fpu: Default eagerfpu=on on all CPUs
      these old FPU designs broke. First, Andy Shevchenko reported a splat:
        WARNING: CPU: 0 PID: 823 at arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160
      which was us trying to execute FXRSTOR on those machines even though
      they don't support it.
      After taking care of that, Bryan O'Donoghue reported that a simple FPU
      test still failed because we weren't initializing the FPU state properly
      on those machines.
      Take care of all that.
      Reported-and-tested-by: default avatarBryan O'Donoghue <pure.logic@nexus-software.ie>
      Reported-by: default avatarAndy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yu-cheng <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/20160311113206.GD4312@pd.tnic
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd · 03c668a9
      Linus Torvalds authored
      Pull MTD fixes from Brian Norris:
       "Late MTD fix for v4.5:
         - A simple error code handling fix for the NAND ECC test; this was a
           regression in v4.5-rc1
         - A MAINTAINERS update, which might as well go in ASAP"
      * tag 'for-linus-20160311' of git://git.infradead.org/linux-mtd:
        MAINTAINERS: add a maintainer for the NAND subsystem
        mtd: nand: tests: fix regression introduced in mtd_nandectest
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 3ab0a0f9
      Linus Torvalds authored
      Pull drm/i915 fixes from Dave Airlie:
       "Just two i915 regression fixes, that should be it from me"
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: Actually retry with bit-banging after GMBUS timeout
        drm/i915: Fix bogus dig_port_map[] assignment for pre-HSW
    • Matthew Dawson's avatar
      mm/mempool: avoid KASAN marking mempool poison checks as use-after-free · 76401310
      Matthew Dawson authored
      When removing an element from the mempool, mark it as unpoisoned in KASAN
      before verifying its contents for SLUB/SLAB debugging.  Otherwise KASAN
      will flag the reads checking the element use-after-free writes as
      use-after-free reads.
      Signed-off-by: default avatarMatthew Dawson <matthew@mjdsystems.ca>
      Acked-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  4. 11 Mar, 2016 9 commits
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 2a4fb270
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "Two more fixes for 4.5:
         - One is a fix for OMAP that is urgently needed to avoid DRA7xx chips
           from premature aging, by always keeping the Ethernet clock enabled.
         - The other solves a I/O memory layout issue on Armada, where SROM
           and PCI memory windows were conflicting in some configurations"
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window
        ARM: dts: dra7: do not gate cpsw clock due to errata i877
        ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property
    • Linus Torvalds's avatar
      Merge tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 95f41fb2
      Linus Torvalds authored
      Pull media fix from Mauro Carvalho Chehab:
       "One last time fix: It adds a code that prevents some media tools like
        media-ctl to hide some entities that have their IDs out of the range
        expected by those apps"
      * tag 'media/v4.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] media-device: map new functions into old types for legacy API
    • Thomas Petazzoni's avatar
      ARM: mvebu: fix overlap of Crypto SRAM with PCIe memory window · d7d5a43c
      Thomas Petazzoni authored
      When the Crypto SRAM mappings were added to the Device Tree files
      describing the Armada XP boards in commit c466d997 ("ARM: mvebu:
      define crypto SRAM ranges for all armada-xp boards"), the fact that
      those mappings were overlaping with the PCIe memory aperture was
      overlooked. Due to this, we currently have for all Armada XP platforms
      a situation that looks like this:
      Memory mapping on Armada XP boards with internal registers at
       - 0x00000000 -> 0xf0000000	3.75G 	RAM
       - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
       - 0xf1000000 -> 0xf1100000	1M	internal registers
       - 0xf8000000 -> 0xffe0000	126M	PCIe memory aperture
       - 0xf8100000 -> 0xf8110000	64KB	Crypto SRAM #0	=> OVERLAPS WITH PCIE !
       - 0xf8110000 -> 0xf8120000	64KB	Crypto SRAM #1	=> OVERLAPS WITH PCIE !
       - 0xffe00000 -> 0xfff00000	1M	PCIe I/O aperture
       - 0xfff0000  -> 0xffffffff	1M	BootROM
      The overlap means that when PCIe devices are added, depending on their
      memory window needs, they might or might not be mapped into the
      physical address space. Indeed, they will not be mapped if the area
      allocated in the PCIe memory aperture by the PCI core overlaps with
      one of the Crypto SRAM. Typically, a Intel IGB PCIe NIC that needs 8MB
      of PCIe memory will see its PCIe memory window allocated from
      0xf80000000 for 8MB, which overlaps with the Crypto SRAM windows. Due
      to this, the PCIe window is not created, and any attempt to access the
      PCIe window makes the kernel explode:
      [    3.302213] igb: Copyright (c) 2007-2014 Intel Corporation.
      [    3.307841] pci 0000:00:09.0: enabling device (0140 -> 0143)
      [    3.313539] mvebu_mbus: cannot add window '4:f8', conflicts with another window
      [    3.320870] mvebu-pcie soc:pcie-controller: Could not create MBus window at [mem 0xf8000000-0xf87fffff]: -22
      [    3.330811] Unhandled fault: external abort on non-linefetch (0x1008) at 0xf08c0018
      This problem does not occur on Armada 370 boards, because we use the
      following memory mapping (for boards that have internal registers at
       - 0x00000000 -> 0xf0000000	3.75G 	RAM
       - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
       - 0xf1000000 -> 0xf1100000	1M	internal registers
       - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0 => OK !
       - 0xf8000000 -> 0xffe0000	126M	PCIe memory
       - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
       - 0xfff0000  -> 0xffffffff	1M	BootROM
      Obviously, the solution is to align the location of the Crypto SRAM
      mappings of Armada XP to be similar with the ones on Armada 370, i.e
      have them between the "internal registers" area and the beginning of
      the PCIe aperture.
      However, we have a special case with the OpenBlocks AX3-4 platform,
      which has a 128 MB NOR flash. Currently, this NOR flash is mapped from
      0xf0000000 to 0xf8000000. This is possible because on OpenBlocks
      AX3-4, the internal registers are not at 0xf1000000. And this explains
      why the Crypto SRAM mappings were not configured at the same place on
      Armada XP.
      Hence, the solution is two-fold:
       (1) Move the NOR flash mapping on Armada XP OpenBlocks AX3-4 from
           0xe8000000 to 0xf0000000. This frees the 0xf0000000 ->
           0xf80000000 space.
       (2) Move the Crypto SRAM mappings on Armada XP to be similar to
           Armada 370 (except of course that Armada XP has two Crypto SRAM
           and not one).
      After this patch, the memory mapping on Armada XP boards with
      registers at 0xf1 is:
       - 0x00000000 -> 0xf0000000	3.75G 	RAM
       - 0xf0000000 -> 0xf1000000	16M	NOR flashes (AXP GP / AXP DB)
       - 0xf1000000 -> 0xf1100000	1M	internal registers
       - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
       - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
       - 0xf8000000 -> 0xffe0000	126M	PCIe memory
       - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
       - 0xfff0000  -> 0xffffffff	1M	BootROM
      And the memory mapping for the special case of the OpenBlocks AX3-4
      (internal registers at 0xd0000000, NOR of 128 MB):
       - 0x00000000 -> 0xc0000000	3G 	RAM
       - 0xd0000000 -> 0xd1000000	1M	internal registers
       - 0xe800000  -> 0xf0000000	128M	NOR flash
       - 0xf1100000 -> 0xf1110000	64KB	Crypto SRAM #0
       - 0xf1110000 -> 0xf1120000	64KB	Crypto SRAM #1
       - 0xf8000000 -> 0xffe0000	126M	PCIe memory
       - 0xffe00000 -> 0xfff00000	1M	PCIe I/O
       - 0xfff0000  -> 0xffffffff	1M	BootROM
      Fixes: c466d997
       ("ARM: mvebu: define crypto SRAM ranges for all armada-xp boards")
      Reported-by: default avatarPhil Sutter <phil@nwl.cc>
      Cc: Phil Sutter <phil@nwl.cc>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Acked-by: default avatarGregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-fix-4.5' of git://git.infradead.org/users/vkoul/slave-dma · 20698c92
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "Two fixes showed up in last few days, and they should be included in
        4.5.  Summary:
        Two more late fixes to drivers, nothing major here:
         - A memory leak fix in fsdma unmap the dma descriptors on freeup
         - A fix in xdmac driver for residue calculation of dma descriptor"
      * tag 'dmaengine-fix-4.5' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: at_xdmac: fix residue computation
        dmaengine: fsldma: fix memory leak
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-4.5-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7ae9c768
      Linus Torvalds authored
      Pull power management and ACPI fixes from Rafael Wysocki:
       "Two more fixes for issues introduced recently, one in the generic
        device properties framework and one in ACPICA.
         - Revert a recent ACPICA commit that has been reverted upstream,
           because it caused problems to happen on user systems and the
           problem it attempted to address will not be relevant any more after
           upcoming ACPI specification changes (Bob Moore).
         - Fix crash in the generic device properties framework introduced by
           a recent change that forgot to check pointers against error values
           in addition to checking them against NULL (Heikki Krogerus)"
      * tag 'pm+acpi-4.5-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        device property: fwnode->secondary may contain ERR_PTR(-ENODEV)
        ACPICA: Revert "Parser: Fix for SuperName method invocation"
    • Linus Torvalds's avatar
      Merge tag 'xfs-for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs · 2a62ec0a
      Linus Torvalds authored
      Pull xfs fixes from Dave Chinner:
       "This is a fix for a regression introduced in 4.5-rc1 by the new torn
        log write detection code.  The regression only affects people moving a
        clean filesystem between machines/kernels of different architecture
        (such as changing between 32 bit and 64 bit kernels), but this is the
        recommended (and only!) safe way to migrate a filesystem between
        architectures so we really need to ensure it works.
        The changes are larger than I'd prefer right at the end of the release
        cycle, but the majority of the change is just factoring code to enable
        the detection of a clean log at the correct time to avoid this issue.
         - Only perform torn log write detection on dirty logs.  This prevents
           failures being detected due to a clean filesystem being moved
           between machines or kernels of different architectures (e.g.  32 ->
           64 bit, BE -> LE, etc).  This fixes a regression introduced by the
           torn log write detection in 4.5-rc1"
      * tag 'xfs-for-linus-4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
        xfs: only run torn log write detection on dirty logs
        xfs: refactor in-core log state update to helper
        xfs: refactor unmount record detection into helper
        xfs: separate log head record discovery from verification
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 63cf207e
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "A couple of fixes: Fix for my dumb braino in ncpfs and a long-standing
        breakage on recovery from failed rename() in jffs2"
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        jffs2: reduce the breakage on recovery from halfway failed rename()
        ncpfs: fix a braino in OOM handling in ncp_fill_cache()
    • Rafael J. Wysocki's avatar
      Merge branches 'device-properties-fixes' and 'acpica-fixes' · 5b3e7e05
      Rafael J. Wysocki authored
      * device-properties-fixes:
        device property: fwnode->secondary may contain ERR_PTR(-ENODEV)
      * acpica-fixes:
        ACPICA: Revert "Parser: Fix for SuperName method invocation"
    • Ville Syrjälä's avatar
      drm/i915: Actually retry with bit-banging after GMBUS timeout · 0bbca274
      Ville Syrjälä authored
      After the GMBUS transfer times out, we set force_bit=1 and
      return -EAGAIN expecting the i2c core to call the .master_xfer
      hook again so that we will retry the same transfer via bit-banging.
      This is in case the gmbus hardware is somehow faulty.
      Unfortunately we left adapter->retries to 0, meaning the i2c core
      didn't actually do the retry. Let's tell the core we want one retry
      when we return -EAGAIN.
      Note that i2c-algo-bit also uses this retry count for some internal
      retries, so we'll end up increasing those a bit as well.
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: drm-intel-fixes@lists.freedesktop.org
      Fixes: bffce907
       ("drm/i915: abstract i2c bit banging fallback in gmbus xfer")
      Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-2-git-send-email-ville.syrjala@linux.intel.com
      Reviewed-by: default avatarJani Nikula <jani.nikula@intel.com>
      (cherry picked from commit 8b1f165a
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
  5. 10 Mar, 2016 14 commits
    • Boris BREZILLON's avatar
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · f2c12421
      Linus Torvalds authored
      Pull KVM fixes from Paolo Bonzini:
       "A few simple fixes for ARM, x86, PPC and generic code.
        The x86 MMU fix is a bit larger because the surrounding code needed a
        cleanup, but nothing worrisome"
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0
        KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
        kvm: cap halt polling at exactly halt_poll_ns
        KVM: s390: correct fprs on SIGP (STOP AND) STORE STATUS
        KVM: VMX: disable PEBS before a guest entry
        KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · c32c2cb2
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "I thought we were done for 4.5, but then the 64k-page chaps came
        crawling out of the woodwork.  *sigh*
        The vmemmap fix I sent for -rc7 caused a regression with 64k pages and
        sparsemem and at some point during the release cycle the new hugetlb
        code using contiguous ptes started failing the libhugetlbfs tests with
        64k pages enabled.
        So here are a couple of patches that fix the vmemmap alignment and
        disable the new hugetlb page sizes whilst a proper fix is being
         - Temporarily disable huge pages built using contiguous ptes
         - Ensure vmemmap region is sufficiently aligned for sparsemem
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: hugetlb: partial revert of 66b3923a
        arm64: account for sparsemem section alignment when choosing vmemmap offset
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 2da33f9f
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
       "Three bug fixes:
         - The fix for the page table corruption (CVE-2016-2143)
         - The diagnose statistics introduced a regression for the dasd diag
         - Boot crash on systems without the set-program-parameters facility"
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/mm: four page table levels vs. fork
        s390/cpumf: Fix lpp detection
        s390/dasd: fix diag 0x250 inline assembly
    • Mauro Carvalho Chehab's avatar
      [media] media-device: map new functions into old types for legacy API · b2cd2744
      Mauro Carvalho Chehab authored
      The legacy media controller userspace API exposes entity types that
      carry both type and function information. The new API replaces the type
      with a function. It preserves backward compatibility by defining legacy
      functions for the existing types and using them in drivers.
      This works fine, as long as newer entity functions won't be added.
      Unfortunately, some tools, like media-ctl with --print-dot argument
      rely on the now legacy MEDIA_ENT_T_V4L2_SUBDEV and MEDIA_ENT_T_DEVNODE
      numeric ranges to identify what entities will be shown.
      Also, if the entity doesn't match those ranges, it will ignore the
      major/minor information on devnodes, and won't be getting the devnode
      name via udev or sysfs.
      As we're now adding devices outside the old range, the legacy ioctl
      needs to map the new entity functions into a type at the old range,
      or otherwise we'll have a regression.
      Detected on all released media-ctl versions (e. g. versions <= 1.10).
      Fix this by deriving the type from the function to emulate the legacy
      API if the function isn't in the legacy functions range.
      Reported-by: default avatarLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
    • Ludovic Desroches's avatar
      dmaengine: at_xdmac: fix residue computation · 25c5e962
      Ludovic Desroches authored
      When computing the residue we need two pieces of information: the current
      descriptor and the remaining data of the current descriptor. To get
      that information, we need to read consecutively two registers but we
      can't do it in an atomic way. For that reason, we have to check manually
      that current descriptor has not changed.
      Signed-off-by: default avatarLudovic Desroches <ludovic.desroches@atmel.com>
      Suggested-by: default avatarCyrille Pitchen <cyrille.pitchen@atmel.com>
      Reported-by: default avatarDavid Engraf <david.engraf@sysgo.com>
      Tested-by: default avatarDavid Engraf <david.engraf@sysgo.com>
      Fixes: e1f7c9ee
       ("dmaengine: at_xdmac: creation of the atmel
      eXtended DMA Controller driver")
      Cc: stable@vger.kernel.org #4.1 and later
      Signed-off-by: default avatarVinod Koul <vinod.koul@intel.com>
    • Borislav Petkov's avatar
      x86/delay: Avoid preemptible context checks in delay_mwaitx() · 84477336
      Borislav Petkov authored
      We do use this_cpu_ptr(&cpu_tss) as a cacheline-aligned, seldomly
      accessed per-cpu var as the MONITORX target in delay_mwaitx(). However,
      when called in preemptible context, this_cpu_ptr -> smp_processor_id() ->
      debug_smp_processor_id() fires:
        BUG: using smp_processor_id() in preemptible [00000000] code: udevd/312
        caller is delay_mwaitx+0x40/0xa0
      But we don't care about that check - we only need cpu_tss as a MONITORX
      target and it doesn't really matter which CPU's var we're touching as
      we're going idle anyway. Fix that.
      Suggested-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Huang Rui <ray.huang@amd.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: spg_linux_kernel@amd.com
      Link: http://lkml.kernel.org/r/20160309205622.GG6564@pd.tnic
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    • Paolo Bonzini's avatar
      KVM: MMU: fix reserved bit check for ept=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 · 5f0b8199
      Paolo Bonzini authored
      KVM has special logic to handle pages with pte.u=1 and pte.w=0 when
      CR0.WP=1.  These pages' SPTEs flip continuously between two states:
      U=1/W=0 (user and supervisor reads allowed, supervisor writes not allowed)
      and U=0/W=1 (supervisor reads and writes allowed, user writes not allowed).
      When SMEP is in effect, however, U=0 will enable kernel execution of
      this page.  To avoid this, KVM also sets NX=1 in the shadow PTE together
      with U=0, making the two states U=1/W=0/NX=gpte.NX and U=0/W=1/NX=1.
      When guest EFER has the NX bit cleared, the reserved bit check thinks
      that the latter state is invalid; teach it that the smep_andnot_wp case
      will also use the NX bit of SPTEs.
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarXiao Guangrong <guangrong.xiao@linux.inel.com>
      Fixes: c258b62b
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
    • Paolo Bonzini's avatar
      KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo · 844a5fe2
      Paolo Bonzini authored
      Yes, all of these are needed. :) This is admittedly a bit odd, but
      kvm-unit-tests access.flat tests this if you run it with "-cpu host"
      and of course ept=0.
      KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
      specially when pte.u=1/pte.w=0/CR0.WP=0.  Such writes cause a fault
      when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
      When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
      restarts execution.  This will still cause a user write to fault, while
      supervisor writes will succeed.  User reads will fault spuriously now,
      and KVM will then flip U and W again in the SPTE (U=1, W=0).  User reads
      will be enabled and supervisor writes disabled, going back to the
      originary situation where supervisor writes fault spuriously.
      When SMEP is in effect, however, U=0 will enable kernel execution of
      this page.  To avoid this, KVM also sets NX=1 in the shadow PTE together
      with U=0.  If the guest has not enabled NX, the result is a continuous
      stream of page faults due to the NX bit being reserved.
      The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
      switch.  (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
      control, so they do not use user-return notifiers for EFER---if they did,
      EFER.NX would be forced to the same value as the host).
      There is another bug in the reserved bit check, which I've split to a
      separate patch for easier application to stable kernels.
      Cc: stable@vger.kernel.org
      Cc: Andy Lutomirski <luto@amacapital.net>
      Reviewed-by: default avatarXiao Guangrong <guangrong.xiao@linux.intel.com>
      Fixes: f6577a5f
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
    • Yu-cheng Yu's avatar
      x86/fpu: Revert ("x86/fpu: Disable AVX when eagerfpu is off") · a65050c6
      Yu-cheng Yu authored
      Leonid Shatz noticed that the SDM interpretation of the following
      recent commit:
       ("x86/fpu: Disable AVX when eagerfpu is off")
      ... is incorrect and that the original behavior of the FPU code was correct.
      Because AVX is not stated in CR0 TS bit description, it was mistakenly
      believed to be not supported for lazy context switch. This turns out
      to be false:
        Intel Software Developer's Manual Vol. 3A, Sec. 2.5 Control Registers:
         'TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
          MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
          an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
          by the new task.'
        Intel Software Developer's Manual Vol. 2A, Sec. 2.4 Instruction Exception
         'AVX instructions refer to exceptions by classes that include #NM
          "Device Not Available" exception for lazy context switch.'
      So revert the commit.
      Reported-by: default avatarLeonid Shatz <leonid.shatz@ravellosystems.com>
      Signed-off-by: default avatarYu-cheng Yu <yu-cheng.yu@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
      Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1457569734-3785-1-git-send-email-yu-cheng.yu@intel.com
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    • Martin Schwidefsky's avatar
      s390/mm: four page table levels vs. fork · 3446c13b
      Martin Schwidefsky authored
      The fork of a process with four page table levels is broken since
      git commit 6252d702
       "[S390] dynamic page tables."
      All new mm contexts are created with three page table levels and
      an asce limit of 4TB. If the parent has four levels dup_mmap will
      add vmas to the new context which are outside of the asce limit.
      The subsequent call to copy_page_range will walk the three level
      page table structure of the new process with non-zero pgd and pud
      indexes. This leads to memory clobbers as the pgd_index *and* the
      pud_index is added to the mm->pgd pointer without a pgd_deref
      in between.
      The init_new_context() function is selecting the number of page
      table levels for a new context. The function is used by mm_init()
      which in turn is called by dup_mm() and mm_alloc(). These two are
      used by fork() and exec(). The init_new_context() function can
      distinguish the two cases by looking at mm->context.asce_limit,
      for fork() the mm struct has been copied and the number of page
      table levels may not change. For exec() the mm_alloc() function
      set the new mm structure to zero, in this case a three-level page
      table is created as the temporary stack space is located at
      STACK_TOP_MAX = 4TB.
      This fixes CVE-2016-2143.
      Reported-by: default avatarMarcin Kościelnicki <koriakin@0x04.net>
      Reviewed-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
    • Linus Torvalds's avatar
      Merge tag 'spi-fix-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 8e0f93cd
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A few driver specific fixes for the Rockchip and i.MX SPI controllers,
        especially for the i.MX they're annoying bugs if you run into them"
      * tag 'spi-fix-v4.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: imx: fix spi resource leak with dma transfer
        spi: imx: allow only WML aligned transfers to use DMA
        spi: rockchip: add missing spi_master_put
        spi: rockchip: disable runtime pm when in err case
    • Mark Brown's avatar
    • Mark Brown's avatar