1. 22 Jul, 2015 4 commits
  2. 20 Jul, 2015 1 commit
  3. 17 Jul, 2015 1 commit
  4. 13 Jul, 2015 2 commits
    • Heiko Carstens's avatar
      s390/nmi: fix vector register corruption · cad49cfc
      Heiko Carstens authored
      If a machine check happens, the machine has the vector facility installed
      and the extended save area exists, the cpu will save vector register
      contents into the extended save area. This is regardless of control
      register 0 contents, which enables and disables the vector facility during
      runtime.
      
      On each machine check we should validate the vector registers. The current
      code however tries to validate the registers only if the running task is
      using vector registers in user space.
      
      However even the current code is broken and causes vector register
      corruption on machine checks, if user space uses them:
      the prefix area contains a pointer (absolute address) to the machine check
      extended save area. In order to save some space the save area was put into
      an unused area of the second prefix page.
      When validating vector register contents the code uses the absolute address
      of the extended save area, which is wrong. Due to prefixing the vector
      instructions will then access contents using absolute addresses instead
      of real addresses, where the machine stored the contents.
      
      If the above would work there is still the problem that register validition
      would only happen if user space uses vector registers. If kernel space uses
      them also, this may also lead to vector register content corruption:
      if the kernel makes use of vector instructions, but the current running
      user space context does not, the machine check handler will validate
      floating point registers instead of vector registers.
      Given the fact that writing to a floating point register may change the
      upper halve of the corresponding vector register, we also experience vector
      register corruption in this case.
      
      Fix all of these issues, and always validate vector registers on each
      machine check, if the machine has the vector facility installed and the
      extended save area is defined.
      
      Cc: <stable@vger.kernel.org> # 4.1+
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      cad49cfc
    • Heiko Carstens's avatar
      s390/process: fix sfpc inline assembly · e47994dd
      Heiko Carstens authored
      The sfpc inline assembly within execve_tail() may incorrectly set bits
      28-31 of the sfpc instruction to a value which is not zero.
      These bits however are currently unused and therefore should be zero
      so we won't get surprised if these bits will be used in the future.
      
      Therefore remove the second operand from the inline assembly.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      e47994dd
  5. 07 Jul, 2015 1 commit
    • Martin Schwidefsky's avatar
      s390/sclp: clear upper register halves in _sclp_print_early · f9c87a6f
      Martin Schwidefsky authored
      If the kernel is compiled with gcc 5.1 and the XZ compression option
      the decompress_kernel function calls _sclp_print_early in 64-bit mode
      while the content of the upper register half of %r6 is non-zero.
      This causes a specification exception on the servc instruction in
      _sclp_servc.
      
      The _sclp_print_early function saves and restores the upper registers
      halves but it fails to clear them for the 31-bit code of the mini sclp
      driver.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      f9c87a6f
  6. 29 Jun, 2015 1 commit
  7. 26 Jun, 2015 2 commits
    • Josh Triplett's avatar
      clone: support passing tls argument via C rather than pt_regs magic · 3033f14a
      Josh Triplett authored
      clone has some of the quirkiest syscall handling in the kernel, with a
      pile of special cases, historical curiosities, and architecture-specific
      calling conventions.  In particular, clone with CLONE_SETTLS accepts a
      parameter "tls" that the C entry point completely ignores and some
      assembly entry points overwrite; instead, the low-level arch-specific
      code pulls the tls parameter out of the arch-specific register captured
      as part of pt_regs on entry to the kernel.  That's a massive hack, and
      it makes the arch-specific code only work when called via the specific
      existing syscall entry points; because of this hack, any new clone-like
      system call would have to accept an identical tls argument in exactly
      the same arch-specific position, rather than providing a unified system
      call entry point across architectures.
      
      The first patch allows architectures to handle the tls argument via
      normal C parameter passing, if they opt in by selecting
      HAVE_COPY_THREAD_TLS.  The second patch makes 32-bit and 64-bit x86 opt
      into this.
      
      These two patches came out of the clone4 series, which isn't ready for
      this merge window, but these first two cleanup patches were entirely
      uncontroversial and have acks.  I'd like to go ahead and submit these
      two so that other architectures can begin building on top of this and
      opting into HAVE_COPY_THREAD_TLS.  However, I'm also happy to wait and
      send these through the next merge window (along with v3 of clone4) if
      anyone would prefer that.
      
      This patch (of 2):
      
      clone with CLONE_SETTLS accepts an argument to set the thread-local
      storage area for the new thread.  sys_clone declares an int argument
      tls_val in the appropriate point in the argument list (based on the
      various CLONE_BACKWARDS variants), but doesn't actually use or pass along
      that argument.  Instead, sys_clone calls do_fork, which calls
      copy_process, which calls the arch-specific copy_thread, and copy_thread
      pulls the corresponding syscall argument out of the pt_regs captured at
      kernel entry (knowing what argument of clone that architecture passes tls
      in).
      
      Apart from being awful and inscrutable, that also only works because only
      one code path into copy_thread can pass the CLONE_SETTLS flag, and that
      code path comes from sys_clone with its architecture-specific
      argument-passing order.  This prevents introducing a new version of the
      clone system call without propagating the same architecture-specific
      position of the tls argument.
      
      However, there's no reason to pull the argument out of pt_regs when
      sys_clone could just pass it down via C function call arguments.
      
      Introduce a new CONFIG_HAVE_COPY_THREAD_TLS for architectures to opt into,
      and a new copy_thread_tls that accepts the tls parameter as an additional
      unsigned long (syscall-argument-sized) argument.  Change sys_clone's tls
      argument to an unsigned long (which does not change the ABI), and pass
      that down to copy_thread_tls.
      
      Architectures that don't opt into copy_thread_tls will continue to ignore
      the C argument to sys_clone in favor of the pt_regs captured at kernel
      entry, and thus will be unable to introduce new versions of the clone
      syscall.
      
      Patch co-authored by Josh Triplett and Thiago Macieira.
      Signed-off-by: default avatarJosh Triplett <josh@joshtriplett.org>
      Acked-by: default avatarAndy Lutomirski <luto@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thiago Macieira <thiago.macieira@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3033f14a
    • Dominik Dingel's avatar
      s390/mm: make hugepages_supported a boot time decision · bea41197
      Dominik Dingel authored
      There is a potential bug with KVM and hugetlbfs if the hardware does not
      support hugepages (EDAT1).  We fix this by making EDAT1 a hard requirement
      for hugepages and therefore removing and simplifying code.
      
      As s390, with the sw-emulated hugepages, was the only user of
      arch_prepare/release_hugepage I also removed theses calls from common and
      other architecture code.
      
      This patch (of 5):
      
      By dropping support for hugepages on machines which do not have the
      hardware feature EDAT1, we fix a potential s390 KVM bug.
      
      The bug would happen if a guest is backed by hugetlbfs (not supported
      currently), but does not get pagetables with PGSTE.  This would lead to
      random memory overwrites.
      Signed-off-by: default avatarDominik Dingel <dingel@linux.vnet.ibm.com>
      Acked-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bea41197
  8. 25 Jun, 2015 5 commits
    • Michael Holzheu's avatar
      s390/kdump: fix nosmt kernel parameter · 1592a8e4
      Michael Holzheu authored
      It turned out that SIGP set-multi-threading can only be done once.
      Therefore switching to a different MT level after switching to
      sclp.mtid_prev in the dump case fails.
      
      As a symptom specifying the "nosmt" parameter currently fails for
      the kdump kernel and the kernel starts with multi-threading enabled.
      
      So fix this and issue diag 308 subcode 1 call after collecting the
      CPU states for the dump. Also enhance the diag308_reset() function to
      be usable also with enabled lowcore protection and prefix register != 0.
      After the reset it is possible to switch the MT level again. We have
      to do the reset very early in order not to kill the already initialized
      console. Therefore instead of kmalloc() the corresponding memblock
      functions have to be used. To avoid copying the sclp cpu code into
      sclp_early, we now use the simple sigp loop method for CPU detection.
      Signed-off-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      1592a8e4
    • Martin Schwidefsky's avatar
      s390/smp: cleanup core vs. cpu in the SCLP interface · d08d9430
      Martin Schwidefsky authored
      The SCLP interface to query, configure and deconfigure CPUs actually
      operates on cores. For a machine without the multi-threading faciltiy
      a CPU and a core are equivalent but starting with System z13 a core
      can have multiple hardware threads, also referred to as logical CPUs.
      
      To avoid confusion replace the word 'cpu' with 'core' in the SCLP
      interface. Also replace MAX_CPU_ADDRESS with SCLP_MAX_CORES.
      The core-id is an 8-bit field, the maximum thread id is in the range
      0-31. The theoretical limit for the CPU address is therefore 8191.
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      d08d9430
    • Martin Schwidefsky's avatar
      s390/smp: fix sigp cpu detection loop · e7086eb1
      Martin Schwidefsky authored
      On a (theoretical) system where the read-cpu-info SCLP command does
      not work but SMT is enabled, the sigp detection loop may not find
      all configured cores. The maximum CPU address needs to be shifted
      with smp_cpu_mt_shift.
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      e7086eb1
    • Michael Holzheu's avatar
      s390/kdump: fix REGSET_VX_LOW vector register ELF notes · 3c8e5105
      Michael Holzheu authored
      The REGSET_VX_LOW ELF notes should contain the lower 64 bit halfes of the
      first sixteen 128 bit vector registers. Unfortunately currently we copy
      the upper halfes.
      
      Fix this and correctly copy the lower halfes.
      
      Fixes: a62bc073 ("s390/kdump: add support for vector extension")
      Cc: stable@vger.kernel.org # 3.18+
      Signed-off-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      3c8e5105
    • Tony Luck's avatar
      mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute · fc6daaf9
      Tony Luck authored
      Some high end Intel Xeon systems report uncorrectable memory errors as a
      recoverable machine check.  Linux has included code for some time to
      process these and just signal the affected processes (or even recover
      completely if the error was in a read only page that can be replaced by
      reading from disk).
      
      But we have no recovery path for errors encountered during kernel code
      execution.  Except for some very specific cases were are unlikely to ever
      be able to recover.
      
      Enter memory mirroring. Actually 3rd generation of memory mirroing.
      
      Gen1: All memory is mirrored
      	Pro: No s/w enabling - h/w just gets good data from other side of the
      	     mirror
      	Con: Halves effective memory capacity available to OS/applications
      
      Gen2: Partial memory mirror - just mirror memory begind some memory controllers
      	Pro: Keep more of the capacity
      	Con: Nightmare to enable. Have to choose between allocating from
      	     mirrored memory for safety vs. NUMA local memory for performance
      
      Gen3: Address range partial memory mirror - some mirror on each memory
            controller
      	Pro: Can tune the amount of mirror and keep NUMA performance
      	Con: I have to write memory management code to implement
      
      The current plan is just to use mirrored memory for kernel allocations.
      This has been broken into two phases:
      
      1) This patch series - find the mirrored memory, use it for boot time
         allocations
      
      2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
         unused mirrored memory from mm/memblock.c and only give it out to
         select kernel allocations (this is still being scoped because
         page_alloc.c is scary).
      
      This patch (of 3):
      
      Add extra "flags" to memblock to allow selection of memory based on
      attribute.  No functional changes
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Xiexiuqi <xiexiuqi@huawei.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fc6daaf9
  9. 18 Jun, 2015 1 commit
  10. 08 Jun, 2015 1 commit
  11. 28 May, 2015 1 commit
    • Luis R. Rodriguez's avatar
      kernel/params: constify struct kernel_param_ops uses · 9c27847d
      Luis R. Rodriguez authored
      Most code already uses consts for the struct kernel_param_ops,
      sweep the kernel for the last offending stragglers. Other than
      include/linux/moduleparam.h and kernel/params.c all other changes
      were generated with the following Coccinelle SmPL patch. Merge
      conflicts between trees can be handled with Coccinelle.
      
      In the future git could get Coccinelle merge support to deal with
      patch --> fail --> grammar --> Coccinelle --> new patch conflicts
      automatically for us on patches where the grammar is available and
      the patch is of high confidence. Consider this a feature request.
      
      Test compiled on x86_64 against:
      
      	* allnoconfig
      	* allmodconfig
      	* allyesconfig
      
      @ const_found @
      identifier ops;
      @@
      
      const struct kernel_param_ops ops = {
      };
      
      @ const_not_found depends on !const_found @
      identifier ops;
      @@
      
      -struct kernel_param_ops ops = {
      +const struct kernel_param_ops ops = {
      };
      
      Generated-by: Coccinelle SmPL
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Junio C Hamano <gitster@pobox.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: cocci@systeme.lip6.fr
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarLuis R. Rodriguez <mcgrof@suse.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      9c27847d
  12. 22 May, 2015 1 commit
    • Xunlei Pang's avatar
      s390: time: Provide read_boot_clock64() and read_persistent_clock64() · 689911c7
      Xunlei Pang authored
      As part of addressing the "y2038 problem" for in-kernel uses,
      this patch converts read_boot_clock() to read_boot_clock64()
      and read_persistent_clock() to read_persistent_clock64() using
      timespec64.
      
      Rename some instances of 'timespec' to 'timespec64' in time.c and
      related references
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux390@de.ibm.com
      Signed-off-by: default avatarXunlei Pang <pang.xunlei@linaro.org>
      [jstultz: Fixed minor style and grammer tweaks
       pointed out by Ingo]
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      689911c7
  13. 13 May, 2015 3 commits
  14. 08 May, 2015 1 commit
  15. 13 Apr, 2015 3 commits
    • Heiko Carstens's avatar
      s390/smp: wait until secondaries are active & online · a1307bba
      Heiko Carstens authored
      This is the s390 version of 875ebe94 ("powerpc/smp: Wait until secondaries
      are active & online").
      The race described in length within the commit message is also possible on s390
      and every other architecture. So fix this race on s390 as well.
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      a1307bba
    • Heiko Carstens's avatar
      s390/hibernate: fix save and restore of kernel text section · d7441949
      Heiko Carstens authored
      Sebastian reported a crash caused by a jump label mismatch after resume.
      This happens because we do not save the kernel text section during suspend
      and therefore also do not restore it during resume, but use the kernel image
      that restores the old system.
      
      This means that after a suspend/resume cycle we lost all modifications done
      to the kernel text section.
      The reason for this is the pfn_is_nosave() function, which incorrectly
      returns that read-only pages don't need to be saved. This is incorrect since
      we mark the kernel text section read-only.
      We still need to make sure to not save and restore pages contained within
      NSS and DCSS segment.
      To fix this add an extra case for the kernel text section and only save
      those pages if they are not contained within an NSS segment.
      
      Fixes the following crash (and the above bugs as well):
      
      Jump label code mismatch at netif_receive_skb_internal+0x28/0xd0
      Found:    c0 04 00 00 00 00
      Expected: c0 f4 00 00 00 11
      New:      c0 04 00 00 00 00
      Kernel panic - not syncing: Corrupted kernel text
      CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.19.0-01975-gb1b096e70f23 #4
      Call Trace:
        [<0000000000113972>] show_stack+0x72/0xf0
        [<000000000081f15e>] dump_stack+0x6e/0x90
        [<000000000081c4e8>] panic+0x108/0x2b0
        [<000000000081be64>] jump_label_bug.isra.2+0x104/0x108
        [<0000000000112176>] __jump_label_transform+0x9e/0xd0
        [<00000000001121e6>] __sm_arch_jump_label_transform+0x3e/0x50
        [<00000000001d1136>] multi_cpu_stop+0x12e/0x170
        [<00000000001d1472>] cpu_stopper_thread+0xb2/0x168
        [<000000000015d2ac>] smpboot_thread_fn+0x134/0x1b0
        [<0000000000158baa>] kthread+0x10a/0x110
        [<0000000000824a86>] kernel_thread_starter+0x6/0xc
      Reported-and-tested-by: default avatarSebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      d7441949
    • Heiko Carstens's avatar
      s390/cacheinfo: add missing facility check · 77bb36e5
      Heiko Carstens authored
      Git commit d97d929f ("s390: move cacheinfo sysfs to generic cacheinfo
      infrastructure") removed the general-instructions-extension availability
      check before the ecag instruction is executed.
      Without this check this may lead to crashes on machines without this facility.
      Therefore add the check again where needed.
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      77bb36e5
  16. 12 Apr, 2015 1 commit
  17. 27 Mar, 2015 1 commit
  18. 25 Mar, 2015 10 commits