    ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() · 06a8566b
    Lv Zheng authored
    
    
    This patch fixes the issue, shown by the test results below, that
    ipmi_msg_handler() is invoked in atomic context but calls mutex_lock(),
    which may sleep.
    
    BUG: scheduling while atomic: kipmi0/18933/0x10000100
    Modules linked in: ipmi_si acpi_ipmi ...
    CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G       AW    3.10.0-rc7+ #2
    Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010
     ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8
     ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4
     0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8
    Call Trace:
     <IRQ>  [<ffffffff814c4a1e>] dump_stack+0x19/0x1b
     [<ffffffff814bfbab>] __schedule_bug+0x46/0x54
     [<ffffffff814c73e0>] __schedule+0x83/0x59c
     [<ffffffff81058853>] __cond_resched+0x22/0x2d
     [<ffffffff814c794b>] _cond_resched+0x14/0x1d
     [<ffffffff814c6d82>] mutex_lock+0x11/0x32
     [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58
     [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si]
     [<ffffffff812bf6e4>] deliver_response+0x55/0x5a
     [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65
     [<ffffffff81007ad1>] ? read_tsc+0x9/0x19
     [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc
     [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si]
     ...
    
    Also Tony Camuso says:
    
     We were getting occasional "Scheduling while atomic" call traces
     during boot on some systems. Problem was first seen on a Cisco C210
     but we were able to reproduce it on a Cisco c220m3. Setting
     CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around
     tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device.
    
     =================================
     [ INFO: inconsistent lock state ]
     2.6.32-415.el6.x86_64-debug-splck #1
     ---------------------------------
     inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
     ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes:
      (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126
     {SOFTIRQ-ON-W} state was registered at:
       [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570
       [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120
       [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400
       [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60
       [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234
       [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be
    
    The fix implemented by this change has been tested by Tony:
    
     Tested the patch in a boot loop with lockdep debug enabled and never
     saw the problem in over 400 reboots.
    
    Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com>
    Signed-off-by: Lv Zheng <lv.zheng@intel.com>
    Reviewed-by: Huang Ying <ying.huang@intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>