1. 13 Feb, 2012 1 commit
  2. 21 Dec, 2011 2 commits
    • Steven Rostedt's avatar
      x86: Add counter when debug stack is used with interrupts enabled · 42181186
      Steven Rostedt authored
      Mathieu Desnoyers pointed out a case that can cause issues with
      NMIs running on the debug stack:
        int3 -> interrupt -> NMI -> int3
      Because the interrupt changes the stack, the NMI will not see that
      it preempted the debug stack. Looking deeper at this case,
      interrupts only happen when the int3 is from userspace or in
      an a location in the exception table (fixup).
        userspace -> int3 -> interurpt -> NMI -> int3
      All other int3s that happen in the kernel should be processed
      without ever enabling interrupts, as the do_trap() call will
      panic the kernel if it is called to process any other location
      within the kernel.
      Adding a counter around the sections that enable interrupts while
      using the debug stack allows the NMI to also check that case.
      If the NMI sees that it either interrupted a task using the debug
      stack or the debug counter is non-zero, then it will have to
      change the IDT table to make the int3 not change stacks (which will
      corrupt the stack if it does).
      Note, I had to move the debug_usage functions out of processor.h
      and into debugreg.h because of the static inlined functions to
      inc and dec the debug_usage counter. __get_cpu_var() requires
      smp.h which includes processor.h, and would fail to build.
      Link: http://lkml.kernel.org/r/1323976535.23971.112.camel@gandalf.stny.rr.com
      Reported-by: default avatarMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Paul Turner <pjt@google.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
    • Steven Rostedt's avatar
      x86: Keep current stack in NMI breakpoints · 228bdaa9
      Steven Rostedt authored
      We want to allow NMI handlers to have breakpoints to be able to
      remove stop_machine from ftrace, kprobes and jump_labels. But if
      an NMI interrupts a current breakpoint, and then it triggers a
      breakpoint itself, it will switch to the breakpoint stack and
      corrupt the data on it for the breakpoint processing that it
      Instead, have the NMI check if it interrupted breakpoint processing
      by checking if the stack that is currently used is a breakpoint
      stack. If it is, then load a special IDT that changes the IST
      for the debug exception to keep the same stack in kernel context.
      When the NMI is done, it puts it back.
      This way, if the NMI does trigger a breakpoint, it will keep
      using the same stack and not stomp on the breakpoint data for
      the breakpoint it interrupted.
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
  3. 06 Dec, 2011 1 commit
  4. 10 Oct, 2011 1 commit
  5. 11 Aug, 2011 1 commit
  6. 26 Jul, 2011 1 commit
  7. 07 Jun, 2011 1 commit
    • Andy Lutomirski's avatar
      x86-64: Emulate legacy vsyscalls · 5cec93c2
      Andy Lutomirski authored
      There's a fair amount of code in the vsyscall page.  It contains
      a syscall instruction (in the gettimeofday fallback) and who
      knows what will happen if an exploit jumps into the middle of
      some other code.
      Reduce the risk by replacing the vsyscalls with short magic
      incantations that cause the kernel to emulate the real
      vsyscalls. These incantations are useless if entered in the
      This causes vsyscalls to be a little more expensive than real
      syscalls.  Fortunately sensible programs don't use them.
      The only exception is time() which is still called by glibc
      through the vsyscall - but calling time() millions of times
      per second is not sensible. glibc has this fixed in the
      development tree.
      This patch is not perfect: the vread_tsc and vread_hpet
      functions are still at a fixed address.  Fixing that might
      involve making alternative patching work in the vDSO.
      Signed-off-by: default avatarAndy Lutomirski <luto@mit.edu>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Jesper Juhl <jj@chaosbits.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Arjan van de Ven <arjan@infradead.org>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: richard -rw- weinberger <richard.weinberger@gmail.com>
      Cc: Mikael Pettersson <mikpe@it.uu.se>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Louis Rilling <Louis.Rilling@kerlabs.com>
      Cc: Valdis.Kletnieks@vt.edu
      Cc: pageexec@freemail.hu
      Link: http://lkml.kernel.org/r/e64e1b3c64858820d12c48fa739efbd1485e79d5.1307292171.git.luto@mit.edu
      [ Removed the CONFIG option - it's simpler to just do it unconditionally. Tidied up the code as well. ]
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
  8. 07 Jan, 2011 4 commits
  9. 05 Jan, 2011 1 commit
  10. 09 Dec, 2010 1 commit
  11. 18 Nov, 2010 2 commits
    • Don Zickus's avatar
      x86, nmi_watchdog: Remove all stub function calls from old nmi_watchdog · 072b198a
      Don Zickus authored
      Now that the bulk of the old nmi_watchdog is gone, remove all
      the stub variables and hooks associated with it.
      This touches lots of files mainly because of how the io_apic
      nmi_watchdog was implemented.  Now that the io_apic nmi_watchdog
      is forever gone, remove all its fingers.
      Most of this code was not being exercised by virtue of
      nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything to
      risky here.
      Signed-off-by: default avatarDon Zickus <dzickus@redhat.com>
      Cc: fweisbec@gmail.com
      Cc: gorcunov@openvz.org
      LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
    • Don Zickus's avatar
      x86, nmi_watchdog: Remove the old nmi_watchdog · 5f2b0ba4
      Don Zickus authored
      Now that we have a new nmi_watchdog that is more generic and
      sits on top of the perf subsystem, we really do not need the old
      nmi_watchdog any more.
      In addition, the old nmi_watchdog doesn't really work if you are
      using the default clocksource, hpet.  The old nmi_watchdog code
      relied on local apic interrupts to determine if the cpu is still
      alive.  With hpet as the clocksource, these interrupts don't
      increment any more and the old nmi_watchdog triggers false
      This piece removes the old nmi_watchdog code and stubs out any
      variables and functions calls.  The stubs are the same ones used
      by the new nmi_watchdog code, so it should be well tested.
      Signed-off-by: default avatarDon Zickus <dzickus@redhat.com>
      Cc: fweisbec@gmail.com
      Cc: gorcunov@openvz.org
      LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
  12. 23 Sep, 2010 1 commit
    • Bart Oldeman's avatar
      x86, vm86: Fix preemption bug for int1 debug and int3 breakpoint handlers. · 6554287b
      Bart Oldeman authored
      Impact: fix kernel bug such as:
      BUG: scheduling while atomic: dosemu.bin/19680/0x00000004
      See also Ubuntu bug 455067 at
      Commits 4915a35e
      ("Use preempt_conditional_sti/cli in do_int3, like on x86_64.")
      and 3d2a71a5
      ("x86, traps: converge do_debug handlers")
      started disabling preemption in int1 and int3 handlers on i386.
      The problem with vm86 is that the call to handle_vm86_trap() may jump
      straight to entry_32.S and never returns so preempt is never enabled
      again, and there is an imbalance in the preempt count.
      Commit be716615 ("x86, vm86:
      fix preemption bug"), which was later (accidentally?) reverted by commit
       ("hw-breakpoints: modifying
      generic debug exception to use thread-specific debug registers")
      fixed the problem for debug exceptions but not for breakpoints.
      There are three solutions to this problem.
      1. Reenable preemption before calling handle_vm86_trap(). This
      was the approach that was later reverted.
      2. Do not disable preemption for i386 in breakpoint and debug handlers.
      This was the situation before October 2008. As far as I understand
      preemption only needs to be disabled on x86_64 because a seperate stack is
      used, but it's nice to have things work the same way on
      i386 and x86_64.
      3. Let handle_vm86_trap() return instead of jumping to assembly code.
      By setting a flag in _TIF_WORK_MASK, either TIF_IRET or TIF_NOTIFY_RESUME,
      the code in entry_32.S is instructed to return to 32 bit mode from
      V86 mode. The logic in entry_32.S was already present to handle signals.
      (I chose TIF_IRET because it's slightly more efficient in
      do_notify_resume() in signal.c, but in fact TIF_IRET can probably be
      replaced by TIF_NOTIFY_RESUME everywhere.)
      I'm submitting approach 3, because I believe it is the most elegant
      and prevents future confusion. Still, an obvious
      preempt_conditional_cli(regs); is necessary in traps.c to correct the
      [ hpa: This is technically a regression, but because:
        1. the regression is so old,
        2. the patch seems relatively high risk, justifying more testing, and
        3. we're late in the 2.6.36-rc cycle,
        I'm queuing it up for the 2.6.37 merge window.  It might, however,
        justify as a -stable backport at a latter time, hence Cc: stable. ]
      Signed-off-by: default avatarBart Oldeman <bartoldeman@users.sourceforge.net>
      LKML-Reference: <alpine.DEB.2.00.1009231312330.4732@localhost.localdomain>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: K.Prasad <prasad@linux.vnet.ibm.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
  13. 09 Sep, 2010 2 commits
  14. 30 Jun, 2010 1 commit
    • Frederic Weisbecker's avatar
      x86: Send a SIGTRAP for user icebp traps · a1e80faf
      Frederic Weisbecker authored
      Before we had a generic breakpoint layer, x86 used to send a
      sigtrap for any debug event that happened in userspace,
      except if it was caused by lazy dr7 switches.
      Currently we only send such signal for single step or breakpoint
      However, there are three other kind of debug exceptions:
      - debug register access detected: trigger an exception if the
        next instruction touches the debug registers. We don't use
      - task switch, but we don't use tss.
      - icebp/int01 trap. This instruction (0xf1) is undocumented and
        generates an int 1 exception. Unlike single step through TF
        flag, it doesn't set the single step origin of the exception
        in dr6.
      icebp then used to be reported in userspace using trap signals
      but this have been incidentally broken with the new breakpoint
      code. Reenable this. Since this is the only debug event that
      doesn't set anything in dr6, this is all we have to check.
      This fixes a regression in Wine where World Of Warcraft got broken
      as it uses this for software protection checks purposes. And
      probably other apps do.
      Reported-and-tested-by: default avatarAlexandre Julliard <julliard@winehq.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Prasad <prasad@linux.vnet.ibm.com>
      Cc: 2.6.33.x 2.6.34.x <stable@kernel.org>
  15. 21 May, 2010 2 commits
  16. 12 May, 2010 1 commit
    • Don Zickus's avatar
      lockup_detector: Combine nmi_watchdog and softlockup detector · 58687acb
      Don Zickus authored
      The new nmi_watchdog (which uses the perf event subsystem) is very
      similar in structure to the softlockup detector.  Using Ingo's
      suggestion, I combined the two functionalities into one file:
      Now both the nmi_watchdog (or hardlockup detector) and softlockup
      detector sit on top of the perf event subsystem, which is run every
      60 seconds or so to see if there are any lockups.
      To detect hardlockups, cpus not responding to interrupts, I
      implemented an hrtimer that runs 5 times for every perf event
      overflow event.  If that stops counting on a cpu, then the cpu is
      most likely in trouble.
      To detect softlockups, tasks not yielding to the scheduler, I used the
      previous kthread idea that now gets kicked every time the hrtimer fires.
      If the kthread isn't being scheduled neither is anyone else and the
      warning is printed to the console.
      I tested this on x86_64 and both the softlockup and hardlockup paths
      - cleaned up the Kconfig and softlockup combination
      - surrounded hardlockup cases with #ifdef CONFIG_PERF_EVENTS_NMI
      - seperated out the softlockup case from perf event subsystem
      - re-arranged the enabling/disabling nmi watchdog from proc space
      - added cpumasks for hardlockup failure cases
      - removed fallback to soft events if no PMU exists for hard events
      - comment cleanups
      - drop support for older softlockup code
      - per_cpu cleanups
      - completely remove software clock base hardlockup detector
      - use per_cpu masking on hard/soft lockup detection
      - #ifdef cleanups
      - rename config option NMI_WATCHDOG to LOCKUP_DETECTOR
      - documentation additions
      - documentation fixes
      - convert per_cpu to __get_cpu_var
      - powerpc compile fixes
      - split apart warn flags for hard and soft lockups
      - figure out how to make an arch-agnostic clock2cycles call
        (if possible) to feed into perf events as a sample period
      [fweisbec: merged conflict patch]
      Signed-off-by: default avatarDon Zickus <dzickus@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Eric Paris <eparis@redhat.com>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      LKML-Reference: <1273266711-18706-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
  17. 03 May, 2010 4 commits
  18. 26 Mar, 2010 2 commits
    • Peter Zijlstra's avatar
      x86, ptrace: Fix block-step · ea8e61b7
      Peter Zijlstra authored
      Implement ptrace-block-step using TIF_BLOCKSTEP which will set
      DEBUGCTLMSR_BTF when set for a task while preserving any other
      DEBUGCTLMSR bits.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20100325135414.017536066@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
    • Peter Zijlstra's avatar
      x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e
      Peter Zijlstra authored
      Support for the PMU's BTS features has been upstreamed in
      v2.6.32, but we still have the old and disabled ptrace-BTS,
      as Linus noticed it not so long ago.
      It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
      regard for other uses (perf) and doesn't provide the flexibility
      needed for perf either.
      Its users are ptrace-block-step and ptrace-bts, since ptrace-bts
      was never used and ptrace-block-step can be implemented using a
      much simpler approach.
      So axe all 3000 lines of it. That includes the *locked_memory*()
      APIs in mm/mlock.c as well.
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <20100325135413.938004390@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
  19. 25 Feb, 2010 1 commit
    • Don Zickus's avatar
      nmi_watchdog: Clean up various small details · 47195d57
      Don Zickus authored
      Mostly copy/paste whitespace damage with a couple of nitpicks by
      the checkpatch script. Fix the struct definition as requested by Ingo too.
      Signed-off-by: default avatarDon Zickus <dzickus@redhat.com>
      Cc: peterz@infradead.org
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      LKML-Reference: <1266880143-24943-1-git-send-email-dzickus@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
       arch/x86/kernel/apic/hw_nmi.c |   14 +++++------
       arch/x86/kernel/traps.c       |    6 ++--
       include/linux/nmi.h           |    2 -
       kernel/nmi_watchdog.c         |   51 ++++++++++++++++++++----------------------
       4 files changed, 36 insertions(+), 37 deletions(-)
  20. 08 Feb, 2010 2 commits
    • Don Zickus's avatar
      nmi_watchdog: Config option to enable new nmi_watchdog · 84e478c6
      Don Zickus authored
      These are the bits that enable the new nmi_watchdog and safely
      isolate the old nmi_watchdog.  Only one or the other can run,
      not both at the same time.
      Signed-off-by: default avatarDon Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: peterz@infradead.org
      LKML-Reference: <1265424425-31562-4-git-send-email-dzickus@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
    • Don Zickus's avatar
      x86: Move notify_die from nmi.c to traps.c · e40b1720
      Don Zickus authored
      In order to handle a new nmi_watchdog approach, I need to move
      the notify_die() routine out of nmi_watchdog_tick() and into
      default_do_nmi(). This lets me easily swap out the old
      nmi_watchdog with the new one with just a config change.
      The change probably makes sense from a high level perspective
      because the nmi_watchdog shouldn't be handling notify_die
      routines anyway.  However, this move does change the semantics a
      little bit.  Instead of checking on every nmi interrupt if the
      cpus are stuck, only check them on the nmi_watchdog interrupts.
       v2: Move notify_die call into #idef block
      Signed-off-by: default avatarDon Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: gorcunov@gmail.com
      Cc: aris@redhat.com
      Cc: peterz@infradead.org
      LKML-Reference: <1265424425-31562-2-git-send-email-dzickus@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
  21. 29 Jan, 2010 1 commit
  22. 24 Sep, 2009 1 commit
  23. 20 Sep, 2009 1 commit
  24. 18 Sep, 2009 1 commit
  25. 31 Aug, 2009 1 commit
  26. 19 Jul, 2009 1 commit
  27. 10 Jul, 2009 1 commit
  28. 25 Jun, 2009 1 commit
    • Kurt Garloff's avatar
      x86: Add sysctl to allow panic on IOCK NMI error · 5211a242
      Kurt Garloff authored
      This patch introduces a new sysctl:
      which defaults to 0 (off).
      When enabled, the kernel panics when the kernel receives an NMI
      caused by an IO error.
      The IO error triggered NMI indicates a serious system
      condition, which could result in IO data corruption. Rather
      than contiuing, panicing and dumping might be a better choice,
      so one can figure out what's causing the IO error.
      This could be especially important to companies running IO
      intensive applications where corruption must be avoided, e.g. a
      bank's databases.
      [ SuSE has been shipping it for a while, it was done at the
        request of a large database vendor, for their users. ]
      Signed-off-by: default avatarKurt Garloff <garloff@suse.de>
      Signed-off-by: default avatarRoberto Angelino <robertangelino@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@suse.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      LKML-Reference: <20090624213211.GA11291@kroah.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>