1. 05 Nov, 2013 6 commits
    • Tom Zanussi's avatar
      tracing: Add support for SOFT_DISABLE to syscall events · d562aff9
      Tom Zanussi authored
      The original SOFT_DISABLE patches didn't add support for soft disable
      of syscall events; this adds it.
      
      Add an array of ftrace_event_file pointers indexed by syscall number
      to the trace array and remove the existing enabled bitmaps, which as a
      result are now redundant.  The ftrace_event_file structs in turn
      contain the soft disable flags we need for per-syscall soft disable
      accounting.
      
      Adding ftrace_event_files also means we can remove the USE_CALL_FILTER
      bit, thus enabling multibuffer filter support for syscall events.
      
      Link: http://lkml.kernel.org/r/6e72b566e85d8df8042f133efbc6c30e21fb017e.1382620672.git.tom.zanussi@linux.intel.com
      
      Signed-off-by: default avatarTom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      d562aff9
    • Tom Zanussi's avatar
      tracing: Make register/unregister_ftrace_command __init · 38de93ab
      Tom Zanussi authored
      register/unregister_ftrace_command() are only ever called from __init
      functions, so can themselves be made __init.
      
      Also make register_snapshot_cmd() __init for the same reason.
      
      Link: http://lkml.kernel.org/r/d4042c8cadb7ae6f843ac9a89a24e1c6a3099727.1382620672.git.tom.zanussi@linux.intel.com
      
      Signed-off-by: default avatarTom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      38de93ab
    • Tom Zanussi's avatar
      tracing: Update event filters for multibuffer · f306cc82
      Tom Zanussi authored
      The trace event filters are still tied to event calls rather than
      event files, which means you don't get what you'd expect when using
      filters in the multibuffer case:
      
      Before:
      
        # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        bytes_alloc > 8192
        # mkdir /sys/kernel/debug/tracing/instances/test1
        # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
        # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        bytes_alloc > 2048
        # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
        bytes_alloc > 2048
      
      Setting the filter in tracing/instances/test1/events shouldn't affect
      the same event in tracing/events as it does above.
      
      After:
      
        # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        bytes_alloc > 8192
        # mkdir /sys/kernel/debug/tracing/instances/test1
        # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
        # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter
        bytes_alloc > 8192
        # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter
        bytes_alloc > 2048
      
      We'd like to just move the filter directly from ftrace_event_call to
      ftrace_event_file, but there are a couple cases that don't yet have
      multibuffer support and therefore have to continue using the current
      event_call-based filters.  For those cases, a new USE_CALL_FILTER bit
      is added to the event_call flags, whose main purpose is to keep the
      old behavior for those cases until they can be updated with
      multibuffer support; at that point, the USE_CALL_FILTER flag (and the
      new associated call_filter_check_discard() function) can go away.
      
      The multibuffer support also made filter_current_check_discard()
      redundant, so this change removes that function as well and replaces
      it with filter_check_discard() (or call_filter_check_discard() as
      appropriate).
      
      Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.com
      
      Signed-off-by: default avatarTom Zanussi <tom.zanussi@linux.intel.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      f306cc82
    • Steven Rostedt (Red Hat)'s avatar
      ftrace: Have control op function callback only trace when RCU is watching · b5aa3a47
      Steven Rostedt (Red Hat) authored
      
      
      Dave Jones reported that trinity would be able to trigger the following
      back trace:
      
       ===============================
       [ INFO: suspicious RCU usage. ]
       3.10.0-rc2+ #38 Not tainted
       -------------------------------
       include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle!
       other info that might help us debug this:
      
       RCU used illegally from idle CPU!  rcu_scheduler_active = 1, debug_locks = 0
       RCU used illegally from extended quiescent state!
       1 lock held by trinity-child1/18786:
        #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8113dd48>] __perf_event_overflow+0x108/0x310
       stack backtrace:
       CPU: 3 PID: 18786 Comm: trinity-child1 Not tainted 3.10.0-rc2+ #38
        0000000000000000 ffff88020767bac8 ffffffff816e2f6b ffff88020767baf8
        ffffffff810b5897 ffff88021de92520 0000000000000000 ffff88020767bbf8
        0000000000000000 ffff88020767bb78 ffffffff8113ded4 ffffffff8113dd48
       Call Trace:
        [<ffffffff816e2f6b>] dump_stack+0x19/0x1b
        [<ffffffff810b5897>] lockdep_rcu_suspicious+0xe7/0x120
        [<ffffffff8113ded4>] __perf_event_overflow+0x294/0x310
        [<ffffffff8113dd48>] ? __perf_event_overflow+0x108/0x310
        [<ffffffff81309289>] ? __const_udelay+0x29/0x30
        [<ffffffff81076054>] ? __rcu_read_unlock+0x54/0xa0
        [<ffffffff816f4000>] ? ftrace_call+0x5/0x2f
        [<ffffffff8113dfa1>] perf_swevent_overflow+0x51/0xe0
        [<ffffffff8113e08f>] perf_swevent_event+0x5f/0x90
        [<ffffffff8113e1c9>] perf_tp_event+0x109/0x4f0
        [<ffffffff8113e36f>] ? perf_tp_event+0x2af/0x4f0
        [<ffffffff81074630>] ? __rcu_read_lock+0x20/0x20
        [<ffffffff8112d79f>] perf_ftrace_function_call+0xbf/0xd0
        [<ffffffff8110e1e1>] ? ftrace_ops_control_func+0x181/0x210
        [<ffffffff81074630>] ? __rcu_read_lock+0x20/0x20
        [<ffffffff81100cae>] ? rcu_eqs_enter_common+0x5e/0x470
        [<ffffffff8110e1e1>] ftrace_ops_control_func+0x181/0x210
        [<ffffffff816f4000>] ftrace_call+0x5/0x2f
        [<ffffffff8110e229>] ? ftrace_ops_control_func+0x1c9/0x210
        [<ffffffff816f4000>] ? ftrace_call+0x5/0x2f
        [<ffffffff81074635>] ? debug_lockdep_rcu_enabled+0x5/0x40
        [<ffffffff81074635>] ? debug_lockdep_rcu_enabled+0x5/0x40
        [<ffffffff81100cae>] ? rcu_eqs_enter_common+0x5e/0x470
        [<ffffffff8110112a>] rcu_eqs_enter+0x6a/0xb0
        [<ffffffff81103673>] rcu_user_enter+0x13/0x20
        [<ffffffff8114541a>] user_enter+0x6a/0xd0
        [<ffffffff8100f6d8>] syscall_trace_leave+0x78/0x140
        [<ffffffff816f46af>] int_check_syscall_exit_work+0x34/0x3d
       ------------[ cut here ]------------
      
      Perf uses rcu_read_lock() but as the function tracer can trace functions
      even when RCU is not currently active, this makes the rcu_read_lock()
      used by perf ineffective.
      
      As perf is currently the only user of the ftrace_ops_control_func() and
      perf is also the only function callback that actively uses rcu_read_lock(),
      the quick fix is to prevent the ftrace_ops_control_func() from calling
      its callbacks if RCU is not active.
      
      With Paul's new "rcu_is_watching()" we can tell if RCU is active or not.
      Reported-by: default avatarDave Jones <davej@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      b5aa3a47
    • Steven Rostedt's avatar
      rcu: Do not trace rcu_is_watching() functions · 9418fb20
      Steven Rostedt authored
      As perf uses the rcu_read_lock() primitives for recording into its
      ring buffer, perf tracing can not be called when RCU in inactive.
      With the perf function tracing, there are functions that can be
      traced when RCU is not active, and perf must not have its function
      callback called when this is the case.
      
      Luckily, Paul McKenney has created a way to detect when RCU is
      active or not with the rcu_is_watching() function. Unfortunately,
      this function can also be traced, and if that happens it can cause
      a bit of overhead for the perf function calls that do the check.
      Recursion protection prevents anything bad from happening, but
      there is a bit of added overhead for every function being traced that
      must detect that the rcu_is_watching() is also being traced.
      
      As rcu_is_watching() is a helper routine and not part of the
      critical logic in RCU, it does not need to be traced in order to
      debug RCU itself. Add the "notrace" annotation to all the rcu_is_watching()
      calls such that we never trace it.
      
      Link: http://lkml.kernel.org/r/20131104202736.72dd8e45@gandalf.local.home
      
      Acked-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      9418fb20
    • Cody P Schafer's avatar
      trace/trace_stat: use rbtree postorder iteration helper instead of opencoding · 9cd804ac
      Cody P Schafer authored
      Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead
      of opencoding an alternate postorder iteration that modifies the tree
      
      Link: http://lkml.kernel.org/r/1383345566-25087-2-git-send-email-cody@linux.vnet.ibm.com
      
      Signed-off-by: default avatarCody P Schafer <cody@linux.vnet.ibm.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      9cd804ac
  2. 19 Oct, 2013 5 commits
    • Namhyung Kim's avatar
      ftrace: Add set_graph_notrace filter · 29ad23b0
      Namhyung Kim authored
      The set_graph_notrace filter is analogous to set_ftrace_notrace and
      can be used for eliminating uninteresting part of function graph trace
      output.  It also works with set_graph_function nicely.
      
        # cd /sys/kernel/debug/tracing/
        # echo do_page_fault > set_graph_function
        # perf ftrace live true
         2)               |  do_page_fault() {
         2)               |    __do_page_fault() {
         2)   0.381 us    |      down_read_trylock();
         2)   0.055 us    |      __might_sleep();
         2)   0.696 us    |      find_vma();
         2)               |      handle_mm_fault() {
         2)               |        handle_pte_fault() {
         2)               |          __do_fault() {
         2)               |            filemap_fault() {
         2)               |              find_get_page() {
         2)   0.033 us    |                __rcu_read_lock();
         2)   0.035 us    |                __rcu_read_unlock();
         2)   1.696 us    |              }
         2)   0.031 us    |              __might_sleep();
         2)   2.831 us    |            }
         2)               |            _raw_spin_lock() {
         2)   0.046 us    |              add_preempt_count();
         2)   0.841 us    |            }
         2)   0.033 us    |            page_add_file_rmap();
         2)               |            _raw_spin_unlock() {
         2)   0.057 us    |              sub_preempt_count();
         2)   0.568 us    |            }
         2)               |            unlock_page() {
         2)   0.084 us    |              page_waitqueue();
         2)   0.126 us    |              __wake_up_bit();
         2)   1.117 us    |            }
         2)   7.729 us    |          }
         2)   8.397 us    |        }
         2)   8.956 us    |      }
         2)   0.085 us    |      up_read();
         2) + 12.745 us   |    }
         2) + 13.401 us   |  }
        ...
      
        # echo handle_mm_fault > set_graph_notrace
        # perf ftrace live true
         1)               |  do_page_fault() {
         1)               |    __do_page_fault() {
         1)   0.205 us    |      down_read_trylock();
         1)   0.041 us    |      __might_sleep();
         1)   0.344 us    |      find_vma();
         1)   0.069 us    |      up_read();
         1)   4.692 us    |    }
         1)   5.311 us    |  }
        ...
      
      Link: http://lkml.kernel.org/r/1381739066-7531-5-git-send-email-namhyung@kernel.org
      
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      29ad23b0
    • Namhyung Kim's avatar
      ftrace: Narrow down the protected area of graph_lock · 6a10108b
      Namhyung Kim authored
      The parser set up is just a generic utility that uses local variables
      allocated by the function. There's no need to hold the graph_lock for
      this set up.
      
      This also makes the code simpler.
      
      Link: http://lkml.kernel.org/r/1381739066-7531-4-git-send-email-namhyung@kernel.org
      
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      6a10108b
    • Namhyung Kim's avatar
      ftrace: Introduce struct ftrace_graph_data · faf982a6
      Namhyung Kim authored
      The struct ftrace_graph_data is for generalizing the access to
      set_graph_function file.  This is a preparation for adding support to
      set_graph_notrace.
      
      Link: http://lkml.kernel.org/r/1381739066-7531-3-git-send-email-namhyung@kernel.org
      
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      faf982a6
    • Namhyung Kim's avatar
      ftrace: Get rid of ftrace_graph_filter_enabled · 9aa72b4b
      Namhyung Kim authored
      The ftrace_graph_filter_enabled means that user sets function filter
      and it always has same meaning of ftrace_graph_count > 0.
      
      Link: http://lkml.kernel.org/r/1381739066-7531-2-git-send-email-namhyung@kernel.org
      
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      9aa72b4b
    • Steven Rostedt's avatar
      tracing: Fix potential out-of-bounds in trace_get_user() · 057db848
      Steven Rostedt authored
      Andrey reported the following report:
      
      ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3
      ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3)
      Accessed by thread T13003:
        #0 ffffffff810dd2da (asan_report_error+0x32a/0x440)
        #1 ffffffff810dc6b0 (asan_check_region+0x30/0x40)
        #2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20)
        #3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260)
        #4 ffffffff812a1065 (__fput+0x155/0x360)
        #5 ffffffff812a12de (____fput+0x1e/0x30)
        #6 ffffffff8111708d (task_work_run+0x10d/0x140)
        #7 ffffffff810ea043 (do_exit+0x433/0x11f0)
        #8 ffffffff810eaee4 (do_group_exit+0x84/0x130)
        #9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30)
        #10 ffffffff81928782 (system_call_fastpath+0x16/0x1b)
      
      Allocated by thread T5167:
        #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0)
        #1 ffffffff8128337c (__kmalloc+0xbc/0x500)
        #2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90)
        #3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0)
        #4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40)
        #5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430)
        #6 ffffffff8129b668 (finish_open+0x68/0xa0)
        #7 ffffffff812b66ac (do_last+0xb8c/0x1710)
        #8 ffffffff812b7350 (path_openat+0x120/0xb50)
        #9 ffffffff812b8884 (do_filp_open+0x54/0xb0)
        #10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0)
        #11 ffffffff8129d4b7 (SyS_open+0x37/0x50)
        #12 ffffffff81928782 (system_call_fastpath+0x16/0x1b)
      
      Shadow bytes around the buggy address:
        ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
        ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      =>ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb
        ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
        ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
        ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
      Shadow byte legend (one shadow byte represents 8 application bytes):
        Addressable:           00
        Partially addressable: 01 02 03 04 05 06 07
        Heap redzone:          fa
        Heap kmalloc redzone:  fb
        Freed heap region:     fd
        Shadow gap:            fe
      
      The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;'
      
      Although the crash happened in ftrace_regex_open() the real bug
      occurred in trace_get_user() where there's an incrementation to
      parser->idx without a check against the size. The way it is triggered
      is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop
      that reads the last character stores it and then breaks out because
      there is no more characters. Then the last character is read to determine
      what to do next, and the index is incremented without checking size.
      
      Then the caller of trace_get_user() usually nulls out the last character
      with a zero, but since the index is equal to the size, it writes a nul
      character after the allocated space, which can corrupt memory.
      
      Luckily, only root user has write access to this file.
      
      Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home
      
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      057db848
  3. 10 Oct, 2013 1 commit
  4. 01 Oct, 2013 1 commit
    • Frederic Weisbecker's avatar
      irq: Force hardirq exit's softirq processing on its own stack · ded79754
      Frederic Weisbecker authored
      The commit facd8b80
      
      
      ("irq: Sanitize invoke_softirq") converted irq exit
      calls of do_softirq() to __do_softirq() on all architectures,
      assuming it was only used there for its irq disablement
      properties.
      
      But as a side effect, the softirqs processed in the end
      of the hardirq are always called on the inline current
      stack that is used by irq_exit() instead of the softirq
      stack provided by the archs that override do_softirq().
      
      The result is mostly safe if the architecture runs irq_exit()
      on a separate irq stack because then softirqs are processed
      on that same stack that is near empty at this stage (assuming
      hardirq aren't nesting).
      
      Otherwise irq_exit() runs in the task stack and so does the softirq
      too. The interrupted call stack can be randomly deep already and
      the softirq can dig through it even further. To add insult to the
      injury, this softirq can be interrupted by a new hardirq, maximizing
      the chances for a stack overrun as reported in powerpc for example:
      
      	do_IRQ: stack overflow: 1920
      	CPU: 0 PID: 1602 Comm: qemu-system-ppc Not tainted 3.10.4-300.1.fc19.ppc64p7 #1
      	Call Trace:
      	[c0000000050a8740] .show_stack+0x130/0x200 (unreliable)
      	[c0000000050a8810] .dump_stack+0x28/0x3c
      	[c0000000050a8880] .do_IRQ+0x2b8/0x2c0
      	[c0000000050a8930] hardware_interrupt_common+0x154/0x180
      	--- Exception: 501 at .cp_start_xmit+0x3a4/0x820 [8139cp]
      		LR = .cp_start_xmit+0x390/0x820 [8139cp]
      	[c0000000050a8d40] .dev_hard_start_xmit+0x394/0x640
      	[c0000000050a8e00] .sch_direct_xmit+0x110/0x260
      	[c0000000050a8ea0] .dev_queue_xmit+0x260/0x630
      	[c0000000050a8f40] .br_dev_queue_push_xmit+0xc4/0x130 [bridge]
      	[c0000000050a8fc0] .br_dev_xmit+0x198/0x270 [bridge]
      	[c0000000050a9070] .dev_hard_start_xmit+0x394/0x640
      	[c0000000050a9130] .dev_queue_xmit+0x428/0x630
      	[c0000000050a91d0] .ip_finish_output+0x2a4/0x550
      	[c0000000050a9290] .ip_local_out+0x50/0x70
      	[c0000000050a9310] .ip_queue_xmit+0x148/0x420
      	[c0000000050a93b0] .tcp_transmit_skb+0x4e4/0xaf0
      	[c0000000050a94a0] .__tcp_ack_snd_check+0x7c/0xf0
      	[c0000000050a9520] .tcp_rcv_established+0x1e8/0x930
      	[c0000000050a95f0] .tcp_v4_do_rcv+0x21c/0x570
      	[c0000000050a96c0] .tcp_v4_rcv+0x734/0x930
      	[c0000000050a97a0] .ip_local_deliver_finish+0x184/0x360
      	[c0000000050a9840] .ip_rcv_finish+0x148/0x400
      	[c0000000050a98d0] .__netif_receive_skb_core+0x4f8/0xb00
      	[c0000000050a99d0] .netif_receive_skb+0x44/0x110
      	[c0000000050a9a70] .br_handle_frame_finish+0x2bc/0x3f0 [bridge]
      	[c0000000050a9b20] .br_nf_pre_routing_finish+0x2ac/0x420 [bridge]
      	[c0000000050a9bd0] .br_nf_pre_routing+0x4dc/0x7d0 [bridge]
      	[c0000000050a9c70] .nf_iterate+0x114/0x130
      	[c0000000050a9d30] .nf_hook_slow+0xb4/0x1e0
      	[c0000000050a9e00] .br_handle_frame+0x290/0x330 [bridge]
      	[c0000000050a9ea0] .__netif_receive_skb_core+0x34c/0xb00
      	[c0000000050a9fa0] .netif_receive_skb+0x44/0x110
      	[c0000000050aa040] .napi_gro_receive+0xe8/0x120
      	[c0000000050aa0c0] .cp_rx_poll+0x31c/0x590 [8139cp]
      	[c0000000050aa1d0] .net_rx_action+0x1dc/0x310
      	[c0000000050aa2b0] .__do_softirq+0x158/0x330
      	[c0000000050aa3b0] .irq_exit+0xc8/0x110
      	[c0000000050aa430] .do_IRQ+0xdc/0x2c0
      	[c0000000050aa4e0] hardware_interrupt_common+0x154/0x180
      	 --- Exception: 501 at .bad_range+0x1c/0x110
      		 LR = .get_page_from_freelist+0x908/0xbb0
      	[c0000000050aa7d0] .list_del+0x18/0x50 (unreliable)
      	[c0000000050aa850] .get_page_from_freelist+0x908/0xbb0
      	[c0000000050aa9e0] .__alloc_pages_nodemask+0x21c/0xae0
      	[c0000000050aaba0] .alloc_pages_vma+0xd0/0x210
      	[c0000000050aac60] .handle_pte_fault+0x814/0xb70
      	[c0000000050aad50] .__get_user_pages+0x1a4/0x640
      	[c0000000050aae60] .get_user_pages_fast+0xec/0x160
      	[c0000000050aaf10] .__gfn_to_pfn_memslot+0x3b0/0x430 [kvm]
      	[c0000000050aafd0] .kvmppc_gfn_to_pfn+0x64/0x130 [kvm]
      	[c0000000050ab070] .kvmppc_mmu_map_page+0x94/0x530 [kvm]
      	[c0000000050ab190] .kvmppc_handle_pagefault+0x174/0x610 [kvm]
      	[c0000000050ab270] .kvmppc_handle_exit_pr+0x464/0x9b0 [kvm]
      	[c0000000050ab320]  kvm_start_lightweight+0x1ec/0x1fc [kvm]
      	[c0000000050ab4f0] .kvmppc_vcpu_run_pr+0x168/0x3b0 [kvm]
      	[c0000000050ab9c0] .kvmppc_vcpu_run+0xc8/0xf0 [kvm]
      	[c0000000050aba50] .kvm_arch_vcpu_ioctl_run+0x5c/0x1a0 [kvm]
      	[c0000000050abae0] .kvm_vcpu_ioctl+0x478/0x730 [kvm]
      	[c0000000050abc90] .do_vfs_ioctl+0x4ec/0x7c0
      	[c0000000050abd80] .SyS_ioctl+0xd4/0xf0
      	[c0000000050abe30] syscall_exit+0x0/0x98
      
      Since this is a regression, this patch proposes a minimalistic
      and low-risk solution by blindly forcing the hardirq exit processing of
      softirqs on the softirq stack. This way we should reduce significantly
      the opportunities for task stack overflow dug by softirqs.
      
      Longer term solutions may involve extending the hardirq stack coverage to
      irq_exit(), etc...
      Reported-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: #3.9.. <stable@vger.kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@au1.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Mackerras <paulus@au1.ibm.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      ded79754
  5. 30 Sep, 2013 3 commits
  6. 28 Sep, 2013 1 commit
  7. 27 Sep, 2013 1 commit
  8. 25 Sep, 2013 10 commits
  9. 20 Sep, 2013 4 commits
  10. 16 Sep, 2013 1 commit
  11. 13 Sep, 2013 1 commit
  12. 12 Sep, 2013 6 commits