1. 07 Nov, 2015 2 commits
  2. 10 Sep, 2015 2 commits
    • fs: Don't dump core if the corefile would become world-readable. · 40f705a7
      Jann Horn authored
      
      
      On a filesystem like vfat, all files are created with the same owner
      and mode independent of who created the file. When a vfat filesystem
      is mounted with root as owner of all files and read access for everyone,
      root's processes left world-readable coredumps on it (but other
      users' processes only left empty corefiles when given write access
      because of the uid mismatch).
      
      Given that the old behavior was inconsistent and insecure, I don't see
      a problem with changing it. Now, all processes refuse to dump core unless
      the resulting corefile will only be readable by their owner.
      Signed-off-by: Jann Horn <jann@thejh.net>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs: if a coredump already exists, unlink and recreate with O_EXCL · fbb18169
      Jann Horn authored
      
      
      It was possible for an attacking user to trick root (or another user) into
      writing his coredumps into an attacker-readable, pre-existing file using
      rename() or link(), causing the disclosure of secret data from the victim
      process' virtual memory.  Depending on the configuration, it was also
      possible to trick root into overwriting system files with coredumps.  Fix
      that issue by never writing coredumps into existing files.
      
      Requirements for the attack:
       - The attack only applies if the victim's process has a nonzero
         RLIMIT_CORE and is dumpable.
       - The attacker can trick the victim into coredumping into an
         attacker-writable directory D, either because the core_pattern is
         relative and the victim's cwd is attacker-writable or because an
         absolute core_pattern pointing to a world-writable directory is used.
       - The attacker has one of these:
        A: on a system with protected_hardlinks=0:
           execute access to a folder containing a victim-owned,
           attacker-readable file on the same partition as D, and the
           victim-owned file will be deleted before the main part of the attack
           takes place. (In practice, there are lots of files that fulfill
           this condition, e.g. entries in Debian's /var/lib/dpkg/info/.)
           This does not apply to most Linux systems because most distros set
           protected_hardlinks=1.
        B: on a system with protected_hardlinks=1:
           execute access to a folder containing a victim-owned,
           attacker-readable and attacker-writable file on the same partition
           as D, and the victim-owned file will be deleted before the main part
           of the attack takes place.
           (This seems to be uncommon.)
        C: on any system, independent of protected_hardlinks:
           write access to a non-sticky folder containing a victim-owned,
           attacker-readable file on the same partition as D
           (This seems to be uncommon.)
      
      The basic idea is that the attacker moves the victim-owned file to where
      he expects the victim process to dump its core.  The victim process dumps
      its core into the existing file, and the attacker reads the coredump from
      it.
      
      If the attacker can't move the file because he does not have write access
      to the containing directory, he can instead link the file to a directory
      he controls, then wait for the original link to the file to be deleted
      (because the kernel checks that the link count of the corefile is 1).
      
      A less reliable variant that requires D to be non-sticky works with link()
      and does not require deletion of the original link: link() the file into
      D, but then unlink() it directly before the kernel performs the link count
      check.
      
      On systems with protected_hardlinks=0, this variant allows an attacker to
      not only gain information from coredumps, but also clobber existing,
      victim-writable files with coredumps.  (This could theoretically lead to a
      privilege escalation.)
      Signed-off-by: Jann Horn <jann@thejh.net>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
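The "unlink and recreate with O_EXCL" idea from the commit title can be illustrated with ordinary system calls. This is a userspace sketch under stated assumptions, not the kernel code: the function name `create_fresh_corefile` and the 0600 mode are chosen here for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: never write into a pre-existing file. Any old file is
 * unlinked first; O_EXCL then makes open() fail if something reappears at
 * `path` in between, for example through an attacker's rename() or link().
 * With O_CREAT | O_EXCL, open() also fails if `path` is a symbolic link.
 */
int create_fresh_corefile(const char *path)
{
    if (unlink(path) != 0 && errno != ENOENT)
        return -1;      /* the old file could not be removed */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

Because the old name is removed rather than reused, a file that the attacker planted (or hard-linked elsewhere) keeps its old contents and never receives the dump.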
  3. 26 Jun, 2015 2 commits
  4. 23 Jun, 2015 1 commit
  5. 12 Apr, 2015 1 commit
  6. 06 Mar, 2015 1 commit
  7. 20 Feb, 2015 1 commit
  8. 14 Oct, 2014 1 commit
    • coredump: add %i/%I in core_pattern to report the tid of the crashed thread · b03023ec
      Oleg Nesterov authored
      format_corename() can only pass the leader's pid to the core handler,
      but there is no simple way to figure out which thread originated the
      coredump.
      
      As Jan explains, this also means that there is no simple way to create
      the backtrace of the crashed process:
      
      As programs are mostly compiled with implicit gcc -fomit-frame-pointer
      one needs program's .eh_frame section (equivalently PT_GNU_EH_FRAME
      segment) or .debug_frame section.  .debug_frame usually is present only
      in separate debug info files usually not even installed on the system.
      While .eh_frame is a part of the executable/library (and it is even
      always mapped for C++ exceptions unwinding) it no longer has to be
      present anywhere on the disk as the program could be upgraded in the
      meantime and the running instance has its executable file already
      unlinked from disk.
      
      One possibility is to echo 0x3f >/proc/*/coredump_filter and dump all
      the file-backed memory including the executable's .eh_frame section.
      But that can create huge core files, for example even due to mmapped
      data files.
      
      Another possibility would be to read .eh_frame from /proc/PID/mem at the
      core_pattern handler time of the core dump.  For the backtrace one needs
      to read the register state first which can be done from core_pattern
      handler:
      
          ptrace(PTRACE_SEIZE, tid, 0, PTRACE_O_TRACEEXIT)
          close(0);    // close pipe fd to resume the sleeping dumper
          waitpid();   // should report EXIT
          PTRACE_GETREGS or other requests
      
      The remaining problem is how to get the 'tid' value of the crashed
      thread.  It could be read from the first NT_PRSTATUS note of the core
      file but that makes the core_pattern handler complicated.
      
      Unfortunately %t is already used so this patch uses %i/%I.
      
      Automatic Bug Reporting Tool (https://github.com/abrt/abrt/wiki/overview)
      is experimenting with this.  It is using the elfutils
      (https://fedorahosted.org/elfutils/) unwinder for generating the
      backtraces.  Apart from not needing matching executables as mentioned
      above, another advantage is that we can get the backtrace without saving
      the core (which might be quite large) to disk.
      
      [mmilata@redhat.com: final paragraph of changelog]
      Signed-off-by: Jan Kratochvil <jan.kratochvil@redhat.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
      Cc: Mark Wielaard <mjw@redhat.com>
      Cc: Martin Milata <mmilata@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
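With %i available, a pipe handler can receive the crashed thread's tid as a plain command-line argument instead of digging it out of the NT_PRSTATUS note. The sketch below assumes a core_pattern along the lines of `|/path/to/handler %p %i` (the handler path is a placeholder), so that argv[1] is the leader's pid and argv[2] is the tid. The helper name `parse_task_id` is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper for a core_pattern pipe handler: convert one numeric
 * argument (such as the expansion of %p or %i) into a pid_t.
 * Returns 0 on success, -1 if the argument is empty, has trailing junk,
 * is out of range, or is not positive.
 */
int parse_task_id(const char *arg, pid_t *out)
{
    char *end = NULL;
    long value;

    if (arg == NULL || *arg == '\0')
        return -1;
    errno = 0;
    value = strtol(arg, &end, 10);
    if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX)
        return -1;
    *out = (pid_t)value;
    return 0;
}
```

The parsed tid is the value the handler would then pass to the ptrace(PTRACE_SEIZE, tid, ...) sequence quoted in the changelog.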
  9. 23 Jul, 2014 1 commit
  10. 19 Apr, 2014 1 commit
    • coredump: fix va_list corruption · 404ca80e
      Eric Dumazet authored
      A va_list needs to be copied in case it needs to be used twice.
      
      Thanks to Hugh for debugging this issue, leading to various panics.
      
      Tested:
      
        lpq84:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
      
      'produce_core' is simply : main() { *(int *)0 = 1;}
      
        lpq84:~# ./produce_core
        Segmentation fault (core dumped)
        lpq84:~# dmesg | tail -1
        [  614.352947] Core dump to |/foobar12345 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 lpq84 (null) pipe failed
      
      Notice the last argument was replaced by a NULL (we were lucky enough to
      not crash, but do not try this on your production machine!)
      
      After fix :
      
        lpq83:~# echo "|/foobar12345 %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h %h" >/proc/sys/kernel/core_pattern
        lpq83:~# ./produce_core
        Segmentation fault
        lpq83:~# dmesg | tail -1
        [  740.800441] Core dump to |/foobar12345 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 lpq83 pipe failed
      
      Fixes: 5fe9d8ca ("coredump: cn_vprintf() has no reason to call vsnprintf() twice")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Diagnosed-by: Hugh Dickins <hughd@google.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: stable@vger.kernel.org # 3.11+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
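The rule stated at the top of this changelog (copy a va_list before using it a second time) applies to any formatter that retries with a larger buffer. The following userspace sketch shows the pattern with standard C library calls only. `format_alloc` is a hypothetical name, and the code is an illustration of the technique rather than the kernel's cn_vprintf().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a heap buffer, growing it and retrying when it is too small.
 * Each vsnprintf() call consumes a fresh copy of the argument list, because
 * a va_list is in an indeterminate state after it has been traversed once.
 */
char *format_alloc(const char *fmt, ...)
{
    size_t size = 8;    /* deliberately small so that a retry happens */
    char *buf = malloc(size);
    va_list ap;

    if (buf == NULL)
        return NULL;
    va_start(ap, fmt);
    for (;;) {
        va_list attempt;
        char *bigger;
        int need;

        va_copy(attempt, ap);
        need = vsnprintf(buf, size, fmt, attempt);
        va_end(attempt);
        if (need < 0) {             /* encoding error */
            free(buf);
            buf = NULL;
            break;
        }
        if ((size_t)need < size)    /* the whole string fit */
            break;
        size = (size_t)need + 1;
        bigger = realloc(buf, size);
        if (bigger == NULL) {
            free(buf);
            buf = NULL;
            break;
        }
        buf = bigger;
    }
    va_end(ap);
    return buf;
}
```

Passing `ap` itself to vsnprintf() on both passes is the kind of reuse that produced the `(null)` argument in the "before" log above.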
  11. 24 Jan, 2014 1 commit
  12. 16 Nov, 2013 2 commits
  13. 09 Nov, 2013 5 commits
  14. 25 Oct, 2013 1 commit
  15. 11 Sep, 2013 1 commit
  16. 03 Jul, 2013 6 commits
  17. 04 May, 2013 1 commit
  18. 01 May, 2013 8 commits
    • coredump: change wait_for_dump_helpers() to use wait_event_interruptible() · dc7ee2aa
      Oleg Nesterov authored
      
      
      wait_for_dump_helpers() calls wake_up/kill_fasync from inside the
      wait_event-like loop.  This is not needed and in fact this is not
      strictly correct, we can/should do this only once after we change
      pipe->writers.  We could even check if it becomes zero.
      
      Change this code to use wait_event_interruptible(); this can also
      help to make this wait freezable.
      
      With this patch we check pipe->readers without pipe_lock(), this is
      fine.  Once we see pipe->readers == 1 we know that the handler
      decremented the counter, this is all we need.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Mandeep Singh Baines <msb@chromium.org>
      Cc: Neil Horman <nhorman@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • coredump: factor out the setting of PF_DUMPCORE · 079148b9
      Oleg Nesterov authored
      
      
      Cleanup.  Every linux_binfmt->core_dump() sets PF_DUMPCORE, move this into
      zap_threads() called by do_coredump().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Mandeep Singh Baines <msb@chromium.org>
      Cc: Neil Horman <nhorman@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • coredump: introduce dump_interrupted() · 528f827e
      Oleg Nesterov authored
      
      
      By discussion with Mandeep.
      
      Change dump_write(), dump_seek() and do_coredump() to check
      signal_pending() and abort if it is true.  dump_seek() does this only
      before f_op->llseek(), otherwise it relies on dump_write().
      
      We need this change to ensure that the coredump won't delay suspend, and
      to ensure it reacts to SIGKILL "quickly enough"; a core dump can take a
      lot of time.  In particular this can help oom-killer.
      
      We add the new trivial helper, dump_interrupted() to add the comments and
      to simplify the potential freezer changes.  Perhaps it will have more
      callers.
      
      Ideally it should do try_to_freeze() but then we need the unpleasant
      changes in dump_write() and wait_for_dump_helpers().  It is not trivial to
      change dump_write() to restart if f_op->write() fails because of
      freezing().  We need to handle the short writes, we need to clear
      TIF_SIGPENDING (and we can't rely on recalc_sigpending() unless we change
      it to check PF_DUMPCORE).  And if the buggy f_op->write() sets
      TIF_SIGPENDING we can not distinguish this case from the race with
      freeze_task() + __thaw_task().
      
      So we simply accept the fact that the freezer can truncate a core-dump but
      at least you can reliably suspend.  Hopefully we can tolerate this
      unlikely case and the necessary complications are not worth the trouble.
      But if we decide to make the coredumping freezable later we can do this on
      top of this change.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Mandeep Singh Baines <msb@chromium.org>
      Cc: Neil Horman <nhorman@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
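The "check for a pending signal and abort" behaviour described in this changelog has a simple userspace analogue. The sketch below is only that, an analogue: a flag set by a signal handler stands in for the kernel's signal_pending(), and the names `user_dump_interrupted`, `user_dump_write` and `note_signal` are invented for this illustration.

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Set from a signal handler; a stand-in for the kernel's signal_pending(). */
volatile sig_atomic_t dump_signal_seen = 0;

void note_signal(int signo)
{
    (void)signo;
    dump_signal_seen = 1;
}

/* Userspace stand-in for dump_interrupted(): has a signal arrived? */
bool user_dump_interrupted(void)
{
    return dump_signal_seen != 0;
}

/*
 * Write `len` bytes in chunks, checking for interruption before each chunk.
 * Returns true only if everything was written. A truncated dump is accepted
 * as the price of reacting quickly.
 */
bool user_dump_write(int fd, const char *data, size_t len)
{
    while (len > 0) {
        size_t chunk = len < 4096 ? len : 4096;
        ssize_t n;

        if (user_dump_interrupted())
            return false;
        n = write(fd, data, chunk);
        if (n <= 0)
            return false;
        data += n;
        len -= (size_t)n;
    }
    return true;
}
```

Keeping the check in one small helper mirrors the motivation given above: there is a single place to document the policy and a single place to change if the dump is later made freezable.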
    • coredump: sanitize the setting of signal->group_exit_code · acdedd99
      Oleg Nesterov authored
      
      
      Now that the coredumping process can be SIGKILL'ed, the setting of
      ->group_exit_code in do_coredump() can race with complete_signal() and
      SIGKILL or 0x80 can be "lost", or wait(status) can report status ==
      SIGKILL | 0x80.
      
      But the main problem is that it is not clear to me what we should do if
      binfmt->core_dump() succeeds but SIGKILL was sent, that is why this patch
      comes as a separate change.
      
      This patch adds 0x80 if ->core_dump() succeeds and the process was not
      killed.  But perhaps we can (should?) re-set ->group_exit_code changed by
      SIGKILL back to "siginfo->si_signo |= 0x80" in case when core_dumped == T.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Tested-by: Mandeep Singh Baines <msb@chromium.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Neil Horman <nhorman@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
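From a parent's point of view, the 0x80 bit discussed here is the "core dumped" marker in the status returned by wait(). The sketch below decodes such a status in userspace; the helper names are hypothetical, and the raw 0x80 test is written out to match the changelog (on Linux it is the bit that WCOREDUMP() examines).

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/* Bit 0x80 of a wait status marks "core dumped". */
#define CORE_FLAG 0x80

/* True if the status says the child was terminated by signal `signo`. */
bool killed_by(int status, int signo)
{
    return WIFSIGNALED(status) && WTERMSIG(status) == signo;
}

/* True if the status says the child was terminated by a signal and dumped core. */
bool dumped_core(int status)
{
    return WIFSIGNALED(status) && (status & CORE_FLAG) != 0;
}
```

A status of `SIGKILL | 0x80` satisfies both helpers at once, which is the confusing combination the race described above could produce.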
    • coredump: ensure that SIGKILL always kills the dumping thread · 6cd8f0ac
      Oleg Nesterov authored
      
      
      prepare_signal() blesses SIGKILL sent to the dumping process but this
      signal can be "lost" anyway.  The problem is that complete_signal() sees
      SIGNAL_GROUP_EXIT and skips the "kill them all" logic.  And even if the
      dumping process is single-threaded (so the target is always "correct"),
      the group-wide SIGKILL is not recorded in task->pending and thus
      __fatal_signal_pending() won't be true.  A multi-threaded case has even
      more problems.
      
      And even ignoring all technical details, SIGNAL_GROUP_EXIT doesn't look
      right to me.  This coredumping process is not exiting yet, it can do a lot
      of work dumping the core.
      
      With this patch the dumping process doesn't have SIGNAL_GROUP_EXIT, we set
      signal->group_exit_task instead.  This makes signal_group_exit() true and
      thus this should equally close the races with exit/exec/stop, but it
      allows the dumping thread to be killed reliably.
      
      Notes:
      	- It is not clear what we should do with ->group_exit_code
      	  if the dumper was killed, see the next change.
      
      	- we need more (hopefully straightforward) changes to ensure
      	  that SIGKILL actually interrupts the coredump. Basically we
      	  need to check __fatal_signal_pending() in dump_write() and
      	  dump_seek().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Tested-by: Mandeep Singh Baines <msb@chromium.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Neil Horman <nhorman@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • coredump: only SIGKILL should interrupt the coredumping task · 403bad72
      Oleg Nesterov authored
      
      
      There are 2 well known and ancient problems with coredump/signals, and a
      lot of related bug reports:
      
      - do_coredump() clears TIF_SIGPENDING but of course this can't help
        if, say, SIGCHLD comes after that.
      
        In this case the coredump can fail unexpectedly. See for example the
        signal_pending() check in wait_for_dump_helpers(), but there are
        other reasons.
      
      - At the same time, dumping a huge core on slow media can take a
        lot of time/resources and there is no way to kill the coredumping
        task reliably. In particular this is not oom_kill-friendly.
      
      This patch tries to fix the 1st problem, and makes the preparation for the
      next changes.
      
      We add the new SIGNAL_GROUP_COREDUMP flag set by zap_threads() to indicate
      that this process dumps the core.  prepare_signal() checks this flag and
      nacks any signal except SIGKILL.
      
      Note that this check tries to be conservative, in the long term we should
      probably treat the SIGNAL_GROUP_EXIT case equally but this needs more
      discussion.  See marc.info/?l=linux-kernel&m=120508897917439
      
      Notes:
      	- recalc_sigpending() doesn't check SIGNAL_GROUP_COREDUMP.
      	  The patch assumes that dump_write/etc paths should never
      	  call it, but we can change it as well.
      
      	- There is another source of TIF_SIGPENDING, freezer. This
      	  will be addressed separately.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Tested-by: Mandeep Singh Baines <msb@chromium.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Neil Horman <nhorman@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
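The filtering rule stated in this changelog (while a process dumps core, every signal except SIGKILL is refused) can be written down as a tiny model. This is a toy in plain C, not kernel code: `MODEL_GROUP_COREDUMP` and `model_prepare_signal` are invented names, and the flag value is arbitrary rather than taken from the kernel.

```c
#include <signal.h>
#include <stdbool.h>

/* Arbitrary flag value for this model; not taken from the kernel headers. */
#define MODEL_GROUP_COREDUMP 0x1u

/*
 * Returns true if the signal should be delivered, false if it is dropped.
 * Only the coredump case is modelled; everything else is let through.
 */
bool model_prepare_signal(unsigned int flags, int sig)
{
    if (flags & MODEL_GROUP_COREDUMP)
        return sig == SIGKILL;  /* only SIGKILL reaches a dumping process */
    return true;
}
```

In this model a SIGCHLD that arrives during the dump is dropped rather than left pending, which is how the first problem listed above (an unexpected failure caused by a late signal) is avoided.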
    • usermodehelper: split remaining calls to call_usermodehelper_fns() · 907ed132
      Lucas De Marchi authored
      
      
      These are the only users of call_usermodehelper_fns().  This function
      suffers from not being able to determine if the cleanup is called.  Even
      if in these places the cleanup pointer is NULL, convert them to use the
      separate call_usermodehelper_setup() + call_usermodehelper_exec()
      functions so we can remove the _fns variant.
      Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • coredump: remove trailling whitespace · fb96c475
      Lucas De Marchi authored
      
      Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 09 Apr, 2013 2 commits