1. 27 Jul, 2017 3 commits
    • Petr Mladek's avatar
      printk/console: Enhance the check for consoles using init memory · 5a814231
      Petr Mladek authored
      printk_late_init() is responsible for disabling boot consoles that
      use init memory. It checks the address of struct console for this.
      
      But this is not enough. For example, there are several early
      consoles that have write() method in the init section and
      struct console in the normal section. They are not disabled
      and could cause fancy and hard to debug system states.
      
      It is even more complicated by the macros EARLYCON_DECLARE() and
      OF_EARLYCON_DECLARE() where various struct members are set at
      runtime by the provided setup() function.
      
      I have tried to reproduce this problem and forced the classic uart
      early console to stay using keep_bootcon parameter. In particular
      I used earlycon=uart,io,0x3f8 keep_bootcon console=ttyS0,115200.
      The system did not boot:
      
      [    1.570496] PM: Image not found (code -22)
      [    1.570496] PM: Image not found (code -22)
      [    1.571886] PM: Hibernation image not present or could not be loaded.
      [    1.571886] PM: Hibernation image not present or could not be loaded.
      [    1.576407] Freeing unused kernel memory: 2528K
      [    1.577244] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
      
      The double lines are caused by having both early uart console and
      ttyS0 console enabled at the same time. The early console stopped
      working when the init memory was freed. Fortunately, the invalid
      call was caught by the NX-protexted page check and did not cause
      any silent fancy problems.
      
      This patch adds a check for many other addresses stored in
      struct console. It omits setup() and match() that are used
      only when the console is registered. Therefore they have
      already been used at this point and there is no reason
      to use them again.
      
      Link: http://lkml.kernel.org/r/1500036673-7122-3-git-send-email-pmladek@suse.com
      
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: "Fabio M. Di Nitto" <fdinitto@redhat.com>
      Cc: linux-serial@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      5a814231
    • Matt Redfearn's avatar
      printk/console: Always disable boot consoles that use init memory before it is freed · 2b1be689
      Matt Redfearn authored
      Commit 4c30c6f5 ("kernel/printk: do not turn off bootconsole in
      printk_late_init() if keep_bootcon") added a check on keep_bootcon to
      ensure that boot consoles were kept around until the real console is
      registered.
      
      This can lead to problems if the boot console data and code are in the
      init section, since it can be freed before the boot console is
      unregistered.
      
      Commit 81cc26f2 ("printk: only unregister boot consoles when
      necessary") fixed this a better way. It allowed to keep boot consoles
      that did not use init data. Unfortunately it did not remove the check
      of keep_bootcon.
      
      This can lead to crashes and weird panics when the bootconsole is
      accessed after free, especially if page poisoning is in use and the
      code / data have been overwritten with a poison value.
      
      To prevent this, always free the boot console if it is within the init
      section. In addition, print a warning about that the console is removed
      prematurely.
      
      Finally there is a new comment how to avoid the warning. It replaced
      an explanation that duplicated a more comprehensive function
      description few lines above.
      
      Fixes: 4c30c6f5 ("kernel/printk: do not turn off bootconsole in printk_late_init() if keep_bootcon")
      Link: http://lkml.kernel.org/r/1500036673-7122-2-git-send-email-pmladek@suse.com
      
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: "Fabio M. Di Nitto" <fdinitto@redhat.com>
      Cc: linux-serial@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarMatt Redfearn <matt.redfearn@imgtec.com>
      [pmladek@suse.com: print the warning, code and comments clean up]
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      2b1be689
    • Pierre Kuo's avatar
      printk: Modify operators of printed_len and text_len · aec47caa
      Pierre Kuo authored
      With commit <ddb9baa8> ("printk: report lost messages in printk
      safe/nmi contexts") and commit <8b1742c9> ("printk: remove zap_locks()
      function"), it seems we can remove initialization, "=0", of text_len and
      directly assign result of log_output to printed_len.
      
      Link: http://lkml.kernel.org/r/1499755255-6258-1-git-send-email-vichy.kuo@gmail.com
      
      
      Cc: rostedt@goodmis.org
      Cc: linux-kernel@vger.kernel.org
      Cc: joe@perches.com
      Signed-off-by: default avatarPierre Kuo <vichy.kuo@gmail.com>
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      aec47caa
  2. 08 Jun, 2017 1 commit
    • Petr Mladek's avatar
      Revert "printk: fix double printing with earlycon" · dac8bbba
      Petr Mladek authored
      This reverts commit cf39bf58.
      
      The commit regression to users that define both console=ttyS1
      and console=ttyS0 on the command line, see
      https://lkml.kernel.org/r/20170509082915.GA13236@bistromath.localdomain
      
      The kernel log messages always appeared only on one serial port. It is
      even documented in Documentation/admin-guide/serial-console.rst:
      
      "Note that you can only define one console per device type (serial,
      video)."
      
      The above mentioned commit changed the order in which the command line
      parameters are searched. As a result, the kernel log messages go to
      the last mentioned ttyS* instead of the first one.
      
      We long thought that using two console=ttyS* on the command line
      did not make sense. But then we realized that console= parameters
      were handled also by systemd, see
      http://0pointer.de/blog/projects/serial-console.html
      
      
      
      "By default systemd will instantiate one serial-getty@.service on
      the main kernel console, if it is not a virtual terminal."
      
      where
      
      "[4] If multiple kernel consoles are used simultaneously, the main
      console is the one listed first in /sys/class/tty/console/active,
      which is the last one listed on the kernel command line."
      
      This puts the original report into another light. The system is running
      in qemu. The first serial port is used to store the messages into a file.
      The second one is used to login to the system via a socket. It depends
      on systemd and the historic kernel behavior.
      
      By other words, systemd causes that it makes sense to define both
      console=ttyS1 console=ttyS0 on the command line. The kernel fix
      caused regression related to userspace (systemd) and need to be
      reverted.
      
      In addition, it went out that the fix helped only partially.
      The messages still were duplicated when the boot console was
      removed early by late_initcall(printk_late_init). Then the entire
      log was replayed when the same console was registered as a normal one.
      
      Link: 20170606160339.GC7604@pathway.suse.cz
      Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
      Cc: Sabrina Dubroca <sd@queasysnail.net>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Robin Murphy <robin.murphy@arm.com>,
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "Nair, Jayachandran" <Jayachandran.Nair@cavium.com>
      Cc: linux-serial@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Reported-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Acked-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      dac8bbba
  3. 23 May, 2017 1 commit
  4. 19 May, 2017 1 commit
    • Petr Mladek's avatar
      printk: Use the main logbuf in NMI when logbuf_lock is available · 719f6a70
      Petr Mladek authored
      The commit 42a0bb3f ("printk/nmi: generic solution for safe
      printk in NMI") caused that printk stores messages into a temporary
      buffer in NMI context.
      
      The buffer is per-CPU and therefore the size is rather limited.
      It works quite well for NMI backtraces. But there are longer logs
      that might get printed in NMI context, for example, lockdep
      warnings, ftrace_dump_on_oops.
      
      The temporary buffer is used to avoid deadlocks caused by
      logbuf_lock. Also it is needed to avoid races with the other
      temporary buffer that is used when PRINTK_SAFE_CONTEXT is entered.
      But the main buffer can be used in NMI if the lock is available
      and we did not interrupt PRINTK_SAFE_CONTEXT.
      
      The lock is checked using raw_spin_is_locked(). It might cause
      false negatives when the lock is taken on another CPU and
      this CPU is in the safe context from other reasons. Note that
      the safe context is used also to get console semaphore or when
      calling console drivers. For this reason, we do the check in
      printk_nmi_enter(). It makes the handling consistent for
      the entire NMI handler and avoids reshuffling of the messages.
      
      The patch also defines special printk context that allows
      to use printk_deferred() in NMI. Note that we could not flush
      the messages to the consoles because console drivers might use
      many other internal locks.
      
      The newly created vprintk_deferred() disables the preemption
      only around the irq work handling. It is needed there to keep
      the consistency between the two per-CPU variables. But there
      is no reason to disable preemption around vprintk_emit().
      
      Finally, the patch puts back explicit serialization of the NMI
      backtraces from different CPUs. It was removed by the
      commit a9edc880 ("x86/nmi: Perform a safe
      NMI stack trace on all CPUs"). It was not needed because
      the flushing of the temporary per-CPU buffers was serialized.
      
      Link: http://lkml.kernel.org/r/1493912763-24873-1-git-send-email-pmladek@suse.com
      
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <rack+kernel@arm.linux.org.uk>
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: x86@kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Suggested-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      719f6a70
  5. 09 May, 2017 1 commit
    • Hari Bathini's avatar
      crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE · 692f66f2
      Hari Bathini authored
      Patch series "kexec/fadump: remove dependency with CONFIG_KEXEC and
      reuse crashkernel parameter for fadump", v4.
      
      Traditionally, kdump is used to save vmcore in case of a crash.  Some
      architectures like powerpc can save vmcore using architecture specific
      support instead of kexec/kdump mechanism.  Such architecture specific
      support also needs to reserve memory, to be used by dump capture kernel.
      crashkernel parameter can be a reused, for memory reservation, by such
      architecture specific infrastructure.
      
      This patchset removes dependency with CONFIG_KEXEC for crashkernel
      parameter and vmcoreinfo related code as it can be reused without kexec
      support.  Also, crashkernel parameter is reused instead of
      fadump_reserve_mem to reserve memory for fadump.
      
      The first patch moves crashkernel parameter parsing and vmcoreinfo
      related code under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE.  The
      second patch reuses the definitions of append_elf_note() & final_note()
      functions under CONFIG_CRASH_CORE in IA64 arch code.  The third patch
      removes dependency on CONFIG_KEXEC for firmware-assisted dump (fadump)
      in powerpc.  The next patch reuses crashkernel parameter for reserving
      memory for fadump, instead of the fadump_reserve_mem parameter.  This
      has the advantage of using all syntaxes crashkernel parameter supports,
      for fadump as well.  The last patch updates fadump kernel documentation
      about use of crashkernel parameter.
      
      This patch (of 5):
      
      Traditionally, kdump is used to save vmcore in case of a crash.  Some
      architectures like powerpc can save vmcore using architecture specific
      support instead of kexec/kdump mechanism.  Such architecture specific
      support also needs to reserve memory, to be used by dump capture kernel.
      crashkernel parameter can be a reused, for memory reservation, by such
      architecture specific infrastructure.
      
      But currently, code related to vmcoreinfo and parsing of crashkernel
      parameter is built under CONFIG_KEXEC_CORE.  This patch introduces
      CONFIG_CRASH_CORE and moves the above mentioned code under this config,
      allowing code reuse without dependency on CONFIG_KEXEC.  There is no
      functional change with this patch.
      
      Link: http://lkml.kernel.org/r/149035338104.6881.4550894432615189948.stgit@hbathini.in.ibm.com
      
      Signed-off-by: default avatarHari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: default avatarDave Young <dyoung@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      692f66f2
  6. 18 Apr, 2017 1 commit
  7. 12 Apr, 2017 3 commits
  8. 04 Apr, 2017 1 commit
    • Petr Mladek's avatar
      printk: Correctly handle preemption in console_unlock() · 257ab443
      Petr Mladek authored
      Some console drivers code calls console_conditional_schedule()
      that looks at @console_may_schedule. The value must be cleared
      when the drivers are called from console_unlock() with
      interrupts disabled. But rescheduling is fine when the same
      code is called, for example, from tty operations where the
      console semaphore is taken via console_lock().
      
      This is why @console_may_schedule is cleared before calling console
      drivers. The original value is stored to decide if we could sleep
      between lines.
      
      Now, @console_may_schedule is not cleared when we call
      console_trylock() and jump back to the "again" goto label.
      This has become a problem, since the commit 6b97a20d
      ("printk: set may_schedule for some of console_trylock() callers").
      @console_may_schedule might get enabled now.
      
      There is also the opposite problem. console_lock() can be called
      only from preemptive context. It can always enable scheduling in
      the console code. But console_trylock() is not able to detect it
      when CONFIG_PREEMPT_COUNT is disabled. Therefore we should use the
      original @console_may_schedule value after re-acquiring
      the console semaphore in console_unlock().
      
      This patch solves both problems by moving the "again" goto label.
      
      Alternative solution was to clear and restore the value around
      call_console_drivers(). Then console_conditional_schedule() could
      be used also inside console_unlock(). But there was a potential race
      with console_flush_on_panic() as reported by Sergey Senozhatsky.
      That function should be called only where there is only one CPU
      and with interrupts disabled. But better be on the safe side
      because stopping CPUs might fail.
      
      Fixes: 6b97a20d ("printk: set may_schedule for some of console_trylock() callers")
      Link: http://lkml.kernel.org/r/1490372045-22288-1-git-send-email-pmladek@suse.com
      
      Suggested-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: linux-fbdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      257ab443
  9. 24 Mar, 2017 1 commit
    • Sergey Senozhatsky's avatar
      printk: use console_trylock() in console_cpu_notify() · 64ca752d
      Sergey Senozhatsky authored
      There is no need to always call blocking console_lock() in
      console_cpu_notify(), it's quite possible that console_sem can
      be locked by other CPU on the system, either already printing
      or soon to begin printing the messages. console_lock() in this
      case can simply block CPU hotplug for unknown period of time
      (console_unlock() is time unbound). Not that hotplug is very
      fast, but still, with other CPUs being online and doing
      printk() console_cpu_notify() can stuck.
      
      Use console_trylock() instead and opt-out if console_sem is
      already acquired from another CPU, since that CPU will do
      the printing for us.
      
      Link: http://lkml.kernel.org/r/20170121104729.8585-1-sergey.senozhatsky@gmail.com
      
      
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      64ca752d
  10. 02 Mar, 2017 3 commits
  11. 19 Feb, 2017 1 commit
    • Sergey Senozhatsky's avatar
      printk: use rcuidle console tracepoint · fc98c3c8
      Sergey Senozhatsky authored
      Use rcuidle console tracepoint because, apparently, it may be issued
      from an idle CPU:
      
        hw-breakpoint: Failed to enable monitor mode on CPU 0.
        hw-breakpoint: CPU 0 failed to disable vector catch
      
        ===============================
        [ ERR: suspicious RCU usage.  ]
        4.10.0-rc8-next-20170215+ #119 Not tainted
        -------------------------------
        ./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage!
      
        other info that might help us debug this:
      
        RCU used illegally from idle CPU!
        rcu_scheduler_active = 2, debug_locks = 0
        RCU used illegally from extended quiescent state!
        2 locks held by swapper/0/0:
         #0:  (cpu_pm_notifier_lock){......}, at: [<c0237e2c>] cpu_pm_exit+0x10/0x54
         #1:  (console_lock){+.+.+.}, at: [<c01ab350>] vprintk_emit+0x264/0x474
      
        stack backtrace:
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119
        Hardware name: Generic OMAP4 (Flattened Device Tree)
          console_unlock
          vprintk_emit
          vprintk_default
          printk
          reset_ctrl_regs
          dbg_cpu_pm_notify
          notifier_call_chain
          cpu_pm_exit
          omap_enter_idle_coupled
          cpuidle_enter_state
          cpuidle_enter_state_coupled
          do_idle
          cpu_startup_entry
          start_kernel
      
      This RCU warning, however, is suppressed by lockdep_off() in printk().
      lockdep_off() increments the ->lockdep_recursion counter and thus
      disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want
      lockdep to be enabled "current->lockdep_recursion == 0".
      
      Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com
      
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reported-by: default avatarTony Lindgren <tony@atomide.com>
      Tested-by: default avatarTony Lindgren <tony@atomide.com>
      Acked-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Lindgren <tony@atomide.com>
      Cc: Russell King <rmk@armlinux.org.uk>
      Cc: <stable@vger.kernel.org> [3.4+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fc98c3c8
  12. 08 Feb, 2017 7 commits
    • Sergey Senozhatsky's avatar
      printk: drop call_console_drivers() unused param · d9c23523
      Sergey Senozhatsky authored
      We do suppress_message_printing() check before we call
      call_console_drivers() now, so `level' param is not needed
      anymore.
      
      Link: http://lkml.kernel.org/r/20161224140902.1962-2-sergey.senozhatsky@gmail.com
      
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      d9c23523
    • Sergey Senozhatsky's avatar
      printk: convert the rest to printk-safe · de6fcbdb
      Sergey Senozhatsky authored
      This patch converts the rest of logbuf users (which are
      out of printk recursion case, but can deadlock in printk).
      To make printk-safe usage easier the patch introduces 4
      helper macros:
      - logbuf_lock_irq()/logbuf_unlock_irq()
        lock/unlock the logbuf lock and disable/enable local IRQ
      
      - logbuf_lock_irqsave(flags)/logbuf_unlock_irqrestore(flags)
        lock/unlock the logbuf lock and saves/restores local IRQ state
      
      Link: http://lkml.kernel.org/r/20161227141611.940-9-sergey.senozhatsky@gmail.com
      
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      de6fcbdb
    • Sergey Senozhatsky's avatar
      printk: remove zap_locks() function · 8b1742c9
      Sergey Senozhatsky authored
      We use printk-safe now which makes printk-recursion detection code
      in vprintk_emit() unreachable. The tricky thing here is that, apart
      from detecting and reporting printk recursions, that code also used
      to zap_locks() in case of panic() from the same CPU. However,
      zap_locks() does not look to be needed anymore:
      
      1) Since commit 08d78658 ("panic: release stale console lock to
         always get the logbuf printed out") panic flushing of `logbuf' to
         console ignores the state of `console_sem' by doing
         	panic()
      		console_trylock();
      		console_unlock();
      
      2) Since commit cf9b1106 ("printk/nmi: flush NMI messages on the
         system panic") panic attempts to zap the `logbuf_lock' spin_lock to
         successfully flush nmi messages to `logbuf'.
      
      Basically, it seems that we either already do what zap_locks() used to
      do but in other places or we ignore the state of the lock. The only
      reaming difference is that we don't re-init the console semaphore in
      printk_safe_flush_on_panic(), but this is not necessary because we
      don't call console drivers from printk_safe_flush_on_panic() due to
      the fact that we are using a deferred printk() version (as was
      suggested by Petr Mladek).
      
      Link: http://lkml.kernel.org/r/20161227141611.940-8-sergey.senozhatsky@gmail.com
      
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      8b1742c9
    • Sergey Senozhatsky's avatar
      printk: use printk_safe buffers in printk · f975237b
      Sergey Senozhatsky authored
      Use printk_safe per-CPU buffers in printk recursion-prone blocks:
      -- around logbuf_lock protected sections in vprintk_emit() and
         console_unlock()
      -- around down_trylock_console_sem() and up_console_sem()
      
      Note that this solution addresses deadlocks caused by printk()
      recursive calls only. That is vprintk_emit() and console_unlock().
      The rest will be converted in a followup patch.
      
      Another thing to note is that we now keep lockdep enabled in printk,
      because we are protected against the printk recursion caused by
      lockdep in vprintk_emit() by the printk-safe mechanism - we first
      switch to per-CPU buffers and only then access the deadlock-prone
      locks.
      
      Examples:
      
      1) printk() from logbuf_lock spin_lock section
      
      Assume the following code:
        printk()
          raw_spin_lock(&logbuf_lock);
          WARN_ON(1);
          raw_spin_unlock(&logbuf_lock);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 366 at kernel/printk/printk.c:1811 vprintk_emit
       CPU: 0 PID: 366 Comm: bash
       Call Trace:
         warn_slowpath_null+0x1d/0x1f
         vprintk_emit+0x1cd/0x438
         vprintk_default+0x1d/0x1f
         printk+0x48/0x50
        [..]
      
      2) printk() from semaphore sem->lock spin_lock section
      
      Assume the following code
      
        printk()
          console_trylock()
            down_trylock()
              raw_spin_lock_irqsave(&sem->lock, flags);
              WARN_ON(1);
              raw_spin_unlock_irqrestore(&sem->lock, flags);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 363 at kernel/locking/semaphore.c:141 down_trylock
       CPU: 1 PID: 363 Comm: bash
       Call Trace:
         warn_slowpath_null+0x1d/0x1f
         down_trylock+0x3d/0x62
         ? vprintk_emit+0x3f9/0x414
         console_trylock+0x31/0xeb
         vprintk_emit+0x3f9/0x414
         vprintk_default+0x1d/0x1f
         printk+0x48/0x50
        [..]
      
      3) printk() from console_unlock()
      
      Assume the following code:
      
        printk()
          console_unlock()
            raw_spin_lock(&logbuf_lock);
            WARN_ON(1);
            raw_spin_unlock(&logbuf_lock);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 329 at kernel/printk/printk.c:2384 console_unlock
       CPU: 1 PID: 329 Comm: bash
       Call Trace:
         warn_slowpath_null+0x18/0x1a
         console_unlock+0x12d/0x559
         ? trace_hardirqs_on_caller+0x16d/0x189
         ? trace_hardirqs_on+0xd/0xf
         vprintk_emit+0x363/0x374
         vprintk_default+0x18/0x1a
         printk+0x43/0x4b
        [..]
      
      4) printk() from try_to_wake_up()
      
      Assume the following code:
      
        printk()
          console_unlock()
            up()
              try_to_wake_up()
                raw_spin_lock_irqsave(&p->pi_lock, flags);
                WARN_ON(1);
                raw_spin_unlock_irqrestore(&p->pi_lock, flags);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 3 PID: 363 at kernel/sched/core.c:2028 try_to_wake_up
       CPU: 3 PID: 363 Comm: bash
       Call Trace:
         warn_slowpath_null+0x1d/0x1f
         try_to_wake_up+0x7f/0x4f7
         wake_up_process+0x15/0x17
         __up.isra.0+0x56/0x63
         up+0x32/0x42
         __up_console_sem+0x37/0x55
         console_unlock+0x21e/0x4c2
         vprintk_emit+0x41c/0x462
         vprintk_default+0x1d/0x1f
         printk+0x48/0x50
        [..]
      
      5) printk() from call_console_drivers()
      
      Assume the following code:
        printk()
          console_unlock()
            call_console_drivers()
            ...
                WARN_ON(1);
      
      which now produces:
      
       ------------[ cut here ]------------
       WARNING: CPU: 2 PID: 305 at kernel/printk/printk.c:1604 call_console_drivers
       CPU: 2 PID: 305 Comm: bash
       Call Trace:
         warn_slowpath_null+0x18/0x1a
         call_console_drivers.isra.6.constprop.16+0x3a/0xb0
         console_unlock+0x471/0x48e
         vprintk_emit+0x1f4/0x206
         vprintk_default+0x18/0x1a
         vprintk_func+0x6e/0x70
         printk+0x3e/0x46
        [..]
      
      6) unsupported placeholder in printk() format now prints an actual
         warning from vscnprintf(), instead of
         	'BUG: recent printk recursion!'.
      
       ------------[ cut here ]------------
       WARNING: CPU: 5 PID: 337 at lib/vsprintf.c:1900 format_decode
       Please remove unsupported %
        in format string
       CPU: 5 PID: 337 Comm: bash
       Call Trace:
         dump_stack+0x4f/0x65
         __warn+0xc2/0xdd
         warn_slowpath_fmt+0x4b/0x53
         format_decode+0x22c/0x308
         vsnprintf+0x89/0x3b7
         vscnprintf+0xd/0x26
         vprintk_emit+0xb4/0x238
         vprintk_default+0x1d/0x1f
         vprintk_func+0x6c/0x73
         printk+0x43/0x4b
        [..]
      
      Link: http://lkml.kernel.org/r/20161227141611.940-7-sergey.senozhatsky@gmail.com
      
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Reviewed-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      f975237b
    • Sergey Senozhatsky's avatar
      printk: report lost messages in printk safe/nmi contexts · ddb9baa8
      Sergey Senozhatsky authored
      Account lost messages in pritk-safe and printk-safe-nmi
      contexts and report those numbers during printk_safe_flush().
      
      The patch also moves lost message counter to struct
      `printk_safe_seq_buf' instead of having dedicated static
      counters - this simplifies the code.
      
      Link: http://lkml.kernel.org/r/20161227141611.940-6-sergey.senozhatsky@gmail.com
      
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      ddb9baa8
    • Sergey Senozhatsky's avatar
      printk: introduce per-cpu safe_print seq buffer · 099f1c84
      Sergey Senozhatsky authored
      This patch extends the idea of NMI per-cpu buffers to regions
      that may cause recursive printk() calls and possible deadlocks.
      Namely, printk() can't handle printk calls from schedule code
      or printk() calls from lock debugging code (spin_dump() for instance);
      because those may be called with `sem->lock' already taken or any
      other `critical' locks (p->pi_lock, etc.). An example of deadlock
      can be
      
       vprintk_emit()
        console_unlock()
         up()                        << raw_spin_lock_irqsave(&sem->lock, flags);
          wake_up_process()
           try_to_wake_up()
            ttwu_queue()
             ttwu_activate()
              activate_task()
               enqueue_task()
                enqueue_task_fair()
                 cfs_rq_of()
                  task_of()
                   WARN_ON_ONCE(!entity_is_task(se))
                    vprintk_emit()
                     console_trylock()
                      down_trylock()
                       raw_spin_lock_irqsave(&sem->lock, flags)
                       ^^^^ deadlock
      
      and some other cases.
      
      Just like in NMI implementation, the solution uses a per-cpu
      `printk_func' pointer to 'redirect' printk() calls to a 'safe'
      callback, that store messages in a per-cpu buffer and flushes
      them back to logbuf buffer later.
      
      Usage example:
      
       printk()
        printk_safe_enter_irqsave(flags)
        //
        //  any printk() call from here will endup in vprintk_safe(),
        //  that stores messages in a special per-CPU buffer.
        //
        printk_safe_exit_irqrestore(flags)
      
      The 'redirection' mechanism, though, has been reworked, as suggested
      by Petr Mladek. Instead of using a per-cpu @print_func callback we now
      keep a per-cpu printk-context variable and call either default or nmi
      vprintk function depending on its value. printk_nmi_entrer/exit and
      printk_safe_enter/exit, thus, just set/celar corresponding bits in
      printk-context functions.
      
      The patch only adds printk_safe support, we don't use it yet.
      
      Link: http://lkml.kernel.org/r/20161227141611.940-4-sergey.senozhatsky@gmail.com
      
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Reviewed-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      099f1c84
    • Sergey Senozhatsky's avatar
      printk: use vprintk_func in vprintk() · bd66a892
      Sergey Senozhatsky authored
      vprintk(), just like printk(), better be using per-cpu printk_func
      instead of direct vprintk_emit() call. Just in case if vprintk()
      will ever be called from NMI, or from any other context that can
      deadlock in printk().
      
      Link: http://lkml.kernel.org/r/20161227141611.940-2-sergey.senozhatsky@gmail.com
      
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Calvin Owens <calvinowens@fb.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      bd66a892
  13. 24 Dec, 2016 1 commit
  14. 15 Dec, 2016 4 commits
    • Geert Uytterhoeven's avatar
      printk: Remove no longer used second struct cont · 8fa9a697
      Geert Uytterhoeven authored
      If CONFIG_PRINTK=n:
      
          kernel/printk/printk.c:1893: warning: ‘cont’ defined but not used
      
      Note that there are actually two different struct cont definitions and
      objects: the first one is used if CONFIG_PRINTK=y, the second one became
      unused by removing console_cont_flush().
      
      Fixes: 5c2992ee
      
       ("printk: remove console flushing special cases for partial buffered lines")
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: default avatarPetr Mladek <pmladek@suse.com>
      [ I do the occasional "allnoconfig" builds, but apparently not often
        enough  - Linus ]
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8fa9a697
    • Linus Torvalds's avatar
      printk: remove console flushing special cases for partial buffered lines · 5c2992ee
      Linus Torvalds authored
      
      
      It actively hurts proper merging, and makes for a lot of special cases.
      There was a good(ish) reason for doing it originally, but it's getting
      too painful to maintain.  And most of the original reasons for it are
      long gone.
      
      So instead of having special code to flush partial lines to the console
      (as opposed to the record buffers), do _all_ the console writing from
      the record buffer, and be done with it.
      
      If an oops happens (or some other synchronous event), we will flush the
      partial lines due to the oops printing activity, so this does not affect
      that.  It does mean that if you have a completely hung machine, a
      partial preceding line may not have been printed out.
      
      That was some of the original reason for this complexity, in fact, back
      when we used to test for the historical i386 "halt" instruction problem
      by doing
      
      	pr_info("Checking 'hlt' instruction... ");
      
      	if (!boot_cpu_data.hlt_works_ok) {
      		pr_cont("disabled\n");
      		return;
      	}
      	halt();
      	halt();
      	halt();
      	halt();
      	pr_cont("OK\n");
      
      and that model no longer works (it the 'hlt' instruction kills the
      machine, the partial line won't have been flushed, so you won't even see
      it).
      
      Of course, that was also back in the days when people actually had
      textual console output rather than a graphical splash-screen at bootup.
      How times change..
      
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Tested-by: default avatarPetr Mladek <pmladek@suse.com>
      Tested-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5c2992ee
    • Linus Torvalds's avatar
      printk: remove games with previous record flags · 5aa068ea
      Linus Torvalds authored
      
      
      The record logging code looks at the previous record flags in various
      ways, and they are all wrong.
      
      You can't use the previous record flags to determine anything about the
      next record, because they may simply not be related.  In particular, the
      reason the previous record was a continuation record may well be exactly
      _because_ the new record was printed by a different process, which is
      why the previous record was flushed.
      
      So all those games are simply wrong, and make the code hard to
      understand (because the code fundamentally cdoes not make sense).
      
      So remove it.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5aa068ea
    • Petr Mladek's avatar
      kdb: call vkdb_printf() from vprintk_default() only when wanted · 34aaff40
      Petr Mladek authored
      kdb_trap_printk allows to pass normal printk() messages to kdb via
      vkdb_printk().  For example, it is used to get backtrace using the
      classic show_stack(), see kdb_show_stack().
      
      vkdb_printf() tries to avoid a potential infinite loop by disabling the
      trap.  But this approach is racy, for example:
      
      CPU1					CPU2
      
      vkdb_printf()
        // assume that kdb_trap_printk == 0
        saved_trap_printk = kdb_trap_printk;
        kdb_trap_printk = 0;
      
      					kdb_show_stack()
      					  kdb_trap_printk++;
      
      Problem1: Now, a nested printk() on CPU0 calls vkdb_printf()
      	  even when it should have been disabled. It will not
      	  cause a deadlock but...
      
         // using the outdated saved value: 0
         kdb_trap_printk = saved_trap_printk;
      
      					  kdb_trap_printk--;
      
      Problem2: Now, kdb_trap_printk == -1 and will stay like this.
         It means that all messages will get passed to kdb from
         now on.
      
      This patch removes the racy saved_trap_printk handling.  Instead, the
      recursion is prevented by a check for the locked CPU.
      
      The solution is still kind of racy.  A non-related printk(), from
      another process, might get trapped by vkdb_printf().  And the wanted
      printk() might not get trapped because kdb_printf_cpu is assigned.  But
      this problem existed even with the original code.
      
      A proper solution would be to get_cpu() before setting kdb_trap_printk
      and trap messages only from this CPU.  I am not sure if it is worth the
      effort, though.
      
      In fact, the race is very theoretical.  When kdb is running any of the
      commands that use kdb_trap_printk there is a single active CPU and the
      other CPUs should be in a holding pen inside kgdb_cpu_enter().
      
      The only time this is violated is when there is a timeout waiting for
      the other CPUs to report to the holding pen.
      
      Finally, note that the situation is a bit schizophrenic.  vkdb_printf()
      explicitly allows recursion but only from KDB code that calls
      kdb_printf() directly.  On the other hand, the generic printk()
      recursion is not allowed because it might cause an infinite loop.  This
      is why we could not hide the decision inside vkdb_printf() easily.
      
      Link: http://lkml.kernel.org/r/1480412276-16690-4-git-send-email-pmladek@suse.com
      
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      34aaff40
  15. 05 Dec, 2016 1 commit
    • Al Viro's avatar
      [iov_iter] new primitives - copy_from_iter_full() and friends · cbbd26b8
      Al Viro authored
      
      
      copy_from_iter_full(), copy_from_iter_full_nocache() and
      csum_and_copy_from_iter_full() - counterparts of copy_from_iter()
      et.al., advancing iterator only in case of successful full copy
      and returning whether it had been successful or not.
      
      Convert some obvious users.  *NOTE* - do not blindly assume that
      something is a good candidate for those unless you are sure that
      not advancing iov_iter in failure case is the right thing in
      this case.  Anything that does short read/short write kind of
      stuff (or is in a loop, etc.) is unlikely to be a good one.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      cbbd26b8
  16. 17 Nov, 2016 1 commit
  17. 14 Nov, 2016 1 commit
  18. 11 Nov, 2016 1 commit
    • Hans de Goede's avatar
      Revert "console: don't prefer first registered if DT specifies stdout-path" · c6c7d83b
      Hans de Goede authored
      This reverts commit 05fd007e ("console: don't prefer first
      registered if DT specifies stdout-path").
      
      The reverted commit changes existing behavior on which many ARM boards
      rely.  Many ARM small-board-computers, like e.g.  the Raspberry Pi have
      both a video output and a serial console.  Depending on whether the user
      is using the device as a more regular computer; or as a headless device
      we need to have the console on either one or the other.
      
      Many users rely on the kernel behavior of the console being present on
      both outputs, before the reverted commit the console setup with no
      console= kernel arguments on an ARM board which sets stdout-path in dt
      would look like this:
      
        [root@localhost ~]# cat /proc/consoles
        ttyS0                -W- (EC p a)    4:64
        tty0                 -WU (E  p  )    4:1
      
      Where as after the reverted commit, it looks like this:
      
        [root@localhost ~]# cat /proc/consoles
        ttyS0                -W- (EC p a)    4:64
      
      This commit reverts commit 05fd007e ("console: don't prefer first
      registered if DT specifies stdout-path") restoring the original
      behavior.
      
      Fixes: 05fd007e ("console: don't prefer first registered if DT specifies stdout-path")
      Link: http://lkml.kernel.org/r/20161104121135.4780-2-hdegoede@redhat.com
      
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Cc: Thorsten Leemhuis <regressions@leemhuis.info>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c6c7d83b
  19. 09 Nov, 2016 1 commit
  20. 19 Oct, 2016 1 commit
    • Linus Torvalds's avatar
      printk: suppress empty continuation lines · 8835ca59
      Linus Torvalds authored
      
      
      We have a fairly common pattern where you print several things as
      continuations on one single line in a loop, and then at the end you do
      
      	printk(KERN_CONT "\n");
      
      to flush the buffered output.
      
      But if the output was flushed by something else (concurrent printk
      activity, or just system logging), we don't want that final flushing to
      just print an empty line.
      
      So just suppress empty continuation lines when they couldn't be merged
      into the line they are a continuation of.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8835ca59
  21. 09 Oct, 2016 4 commits
    • Linus Torvalds's avatar
      printk: make reading the kernel log flush pending lines · bfd8d3f2
      Linus Torvalds authored
      
      
      That will mean that any possible subsequent continuation will now be
      broken up onto a line of its own (since reading the log has finalized
      the beginning og the line), but if user space has activated system
      logging (or if there's a kernel message dump going on) that is the right
      thing to do.
      
      And now that we actually get the continuation flags _right_ for this
      all, the user space logger that is reading the kernel messages can
      actually see the continuation marker.  Not that anybody seems to really
      bother with it (or care), but in theory user space can do its own
      message stitching.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bfd8d3f2
    • Linus Torvalds's avatar
      printk: re-organize log_output() to be more legible · 5e467652
      Linus Torvalds authored
      
      
      Avoid some duplicate logic now that we can return early, and update the
      comments for the new LOG_CONT world order.
      
      This also stops the continuation flushing from just using random record
      flags for the flushing action, instead taking the flags from the proper
      original line and updating them as we add continuations to it.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5e467652
    • Linus Torvalds's avatar
      printk: split out core logging code into helper function · c362c7ff
      Linus Torvalds authored
      
      
      The code that actually decides how to log the message (whether to put it
      directly into the record log, whether to append it to an existing
      buffered log, or whether to start a new buffered log) is fairly
      non-obvious code in the middle of the vprintk_emit() function.
      
      Splitting that code up into a helper function makes it easier to
      understand, but perhaps more importantly also allows for the code to
      just return early out of the helper function once it has made the
      decision about where the new log content goes.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c362c7ff
    • Linus Torvalds's avatar
      printk: reinstate KERN_CONT for printing continuation lines · 4bcc595c
      Linus Torvalds authored
      Long long ago the kernel log buffer was a buffered stream of bytes, very
      much like stdio in user space.  It supported log levels by scanning the
      stream and noticing the log level markers at the beginning of each line,
      but if you wanted to print a partial line in multiple chunks, you just
      did multiple printk() calls, and it just automatically worked.
      
      Except when it didn't, and you had very confusing output when different
      lines got all mixed up with each other.  Then you got fragment lines
      mixing with each other, or with non-fragment lines, because it was
      traditionally impossible to tell whether a printk() call was a
      continuation or not.
      
      To at least help clarify the issue of continuation lines, we added a
      KERN_CONT marker back in 2007 to mark continuation lines:
      
        47492527 ("printk: add KERN_CONT annotation").
      
      That continuation marker was initially an empty string, and didn't
      actuall make any semantic difference.  But it at least made it possible
      to annotate the source code, and have check-patch notice that a printk()
      didn't need or want a log level marker, because it was a continuation of
      a previous line.
      
      To avoid the ambiguity between a continuation line that had that
      KERN_CONT marker, and a printk with no level information at all, we then
      in 2009 made KERN_CONT be a real log level marker which meant that we
      could now reliably tell the difference between the two cases.
      
        5fd29d6c ("printk: clean up handling of log-levels and newlines")
      
      and we could take advantage of that to make sure we didn't mix up
      continuation lines with lines that just didn't have any loglevel at all.
      
      Then, in 2012, the kernel log buffer was changed to be a "record" based
      log, where each line was a record that has a loglevel and a timestamp.
      
      You can see the beginning of that conversion in commits
      
        e11fea92 ("kmsg: export printk records to the /dev/kmsg interface")
        7ff9554b ("printk: convert byte-buffer to variable-length record buffer")
      
      with a number of follow-up commits to fix some painful fallout from that
      conversion.  Over all, it took a couple of months to sort out most of
      it.  But the upside was that you could have concurrent readers (and
      writers) of the kernel log and not have lines with mixed output in them.
      
      And one particular pain-point for the record-based kernel logging was
      exactly the fragmentary lines that are generated in smaller chunks.  In
      order to still log them as one recrod, the continuation lines need to be
      attached to the previous record properly.
      
      However the explicit continuation record marker that is actually useful
      for this exact case was actually removed in aroundm the same time by commit
      
        61e99ab8
      
       ("printk: remove the now unnecessary "C" annotation for KERN_CONT")
      
      due to the incorrect belief that KERN_CONT wasn't meaningful.  The
      ambiguity between "is this a continuation line" or "is this a plain
      printk with no log level information" was reintroduced, and in fact
      became an even bigger pain point because there was now the whole
      record-level merging of kernel messages going on.
      
      This patch reinstates the KERN_CONT as a real non-empty string marker,
      so that the ambiguity is fixed once again.
      
      But it's not a plain revert of that original removal: in the four years
      since we made KERN_CONT an empty string again, not only has the format
      of the log level markers changed, we've also had some usage changes in
      this area.
      
      For example, some ACPI code seems to use KERN_CONT _together_ with a log
      level, and now uses both the KERN_CONT marker and (for example) a
      KERN_INFO marker to show that it's an informational continuation of a
      line.
      
      Which is actually not a bad idea - if the continuation line cannot be
      attached to its predecessor, without the log level information we don't
      know what log level to assign to it (and we traditionally just assigned
      it the default loglevel).  So having both a log level and the KERN_CONT
      marker is not necessarily a bad idea, but it does mean that we need to
      actually iterate over potentially multiple markers, rather than just a
      single one.
      
      Also, since KERN_CONT was still conceptually needed, and encouraged, but
      didn't actually _do_ anything, we've also had the reverse problem:
      rather than having too many annotations it has too few, and there is bit
      rot with code that no longer marks the continuation lines with the
      KERN_CONT marker.
      
      So this patch not only re-instates the non-empty KERN_CONT marker, it
      also fixes up the cases of bit-rot I noticed in my own logs.
      
      There are probably other cases where KERN_CONT will be needed to be
      added, either because it is new code that never dealt with the need for
      KERN_CONT, or old code that has bitrotted without anybody noticing.
      
      That said, we should strive to avoid the need for KERN_CONT.  It does
      result in real problems for logging, and should generally not be seen as
      a good feature.  If we some day can get rid of the feature entirely,
      because nobody does any fragmented printk calls, that would be lovely.
      
      But until that point, let's at mark the code that relies on the hacky
      multi-fragment kernel printk's.  Not only does it avoid the ambiguity,
      it also annotates code as "maybe this would be good to fix some day".
      
      (That said, particularly during single-threaded bootup, the downsides of
      KERN_CONT are very limited.  Things get much hairier when you have
      multiple threads going on and user level reading and writing logs too).
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4bcc595c
  22. 08 Oct, 2016 1 commit
    • Paul Burton's avatar
      console: don't prefer first registered if DT specifies stdout-path · 05fd007e
      Paul Burton authored
      If a device tree specifies a preferred device for kernel console output
      via the stdout-path or linux,stdout-path chosen node properties or the
      stdout alias then the kernel ought to honor it & output the kernel
      console to that device.  As it stands, this isn't the case.  Whilst we
      parse the stdout-path properties & set an of_stdout variable from
      of_alias_scan(), and use that from of_console_check() to determine
      whether to add a console device as a preferred console whilst
      registering it, we also prefer the first registered console if no other
      has been selected at the time of its registration.
      
      This means that if a console other than the one the device tree selects
      via stdout-path is registered first, we will switch to using it & when
      the stdout-path console is later registered the call to
      add_preferred_console() via of_console_check() is too late to do
      anything useful.  In practice this seems to mean that we switch to the
      dummy console device fairly early & see no further console output:
      
          Console: colour dummy device 80x25
          console [tty0] enabled
          bootconsole [ns16550a0] disabled
      
      Fix this by not automatically preferring the first registered console if
      one is specified by the device tree.  This allows consoles to be
      registered but not enabled, and once the driver for the console selected
      by stdout-path calls of_console_check() the driver will be added to the
      list of preferred consoles before any other console has been enabled.
      When that console is then registered via register_console() it will be
      enabled as expected.
      
      Link: http://lkml.kernel.org/r/20160809151937.26118-1-paul.burton@imgtec.com
      
      Signed-off-by: default avatarPaul Burton <paul.burton@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Ivan Delalande <colona@arista.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Jan Kara <jack@suse.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      05fd007e