1. 30 Nov, 2018 1 commit
    • Andrea Arcangeli's avatar
      userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails · 9e368259
      Andrea Arcangeli authored
      Patch series "userfaultfd shmem updates".
      
      Jann found two bugs in the userfaultfd shmem MAP_SHARED backend: the
      lack of the VM_MAYWRITE check and the lack of i_size checks.
      
      Then looking into the above we also fixed the MAP_PRIVATE case.
      
      Hugh by source review also found a data loss source if UFFDIO_COPY is
      used on shmem MAP_SHARED PROT_READ mappings (the production usages
      incidentally run with PROT_READ|PROT_WRITE, so the data loss couldn't
      happen in those production usages like with QEMU).
      
      The whole patchset is marked for stable.
      
      We verified QEMU postcopy live migration with guest running on shmem
      MAP_PRIVATE run as well as before after the fix of shmem MAP_PRIVATE.
      Regardless if it's shmem or hugetlbfs or MAP_PRIVATE or MAP_SHARED, QEMU
      unconditionally invokes a punch hole if the guest mapping is filebacked
      and a MADV_DONTNEED too (needed to get rid of the MAP_PRIVATE COWs and
      for the anon backend).
      
      This patch (of 5):
      
      We internally used EFAULT to communicate with the caller, switch to
      ENOENT, so EFAULT can be used as a non internal retval.
      
      Link: http://lkml.kernel.org/r/20181126173452.26955-2-aarcange@redhat.com
      Fixes: 4c27fe4c
      
       ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9e368259
  2. 08 Jun, 2018 1 commit
    • Mike Rapoport's avatar
      userfaultfd: prevent non-cooperative events vs mcopy_atomic races · df2cc96e
      Mike Rapoport authored
      If a process monitored with userfaultfd changes it's memory mappings or
      forks() at the same time as uffd monitor fills the process memory with
      UFFDIO_COPY, the actual creation of page table entries and copying of
      the data in mcopy_atomic may happen either before of after the memory
      mapping modifications and there is no way for the uffd monitor to
      maintain consistent view of the process memory layout.
      
      For instance, let's consider fork() running in parallel with
      userfaultfd_copy():
      
      process        		         |	uffd monitor
      ---------------------------------+------------------------------
      fork()        		         | userfaultfd_copy()
      ...        		         | ...
          dup_mmap()        	         |     down_read(mmap_sem)
          down_write(mmap_sem)         |     /* create PTEs, copy data */
              dup_uffd()               |     up_read(mmap_sem)
              copy_page_range()        |
              up_write(mmap_sem)       |
              dup_uffd_complete()      |
                  /* notify monitor */ |
      
      If the userfaultfd_copy() takes the mmap_sem first, the new page(s) will
      be present by the time copy_page_range() is called and they will appear
      in the child's memory mappings.  However, if the fork() is the first to
      take the mmap_sem, the new pages won't be mapped in the child's address
      space.
      
      If the pages are not present and child tries to access them, the monitor
      will get page fault notification and everything is fine.  However, if
      the pages *are present*, the child can access them without uffd
      noticing.  And if we copy them into child it'll see the wrong data.
      Since we are talking about background copy, we'd need to decide whether
      the pages should be copied or not regardless #PF notifications.
      
      Since userfaultfd monitor has no way to determine what was the order,
      let's disallow userfaultfd_copy in parallel with the non-cooperative
      events.  In such case we return -EAGAIN and the uffd monitor can
      understand that userfaultfd_copy() clashed with a non-cooperative event
      and take an appropriate action.
      
      Link: http://lkml.kernel.org/r/1527061324-19949-1-git-send-email-rppt@linux.vnet.ibm.com
      
      Signed-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: default avatarPavel Emelyanov <xemul@virtuozzo.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Andrei Vagin <avagin@virtuozzo.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      df2cc96e
  3. 07 Feb, 2018 1 commit
  4. 07 Sep, 2017 2 commits
  5. 09 Mar, 2017 1 commit
  6. 02 Mar, 2017 1 commit
  7. 25 Feb, 2017 1 commit
  8. 23 Feb, 2017 6 commits
  9. 04 Apr, 2016 1 commit
    • Kirill A. Shutemov's avatar
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      
      
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  10. 17 Mar, 2016 1 commit
  11. 16 Jan, 2016 2 commits
  12. 04 Sep, 2015 2 commits