1. 18 Feb, 2019 1 commit
    • libceph: handle an empty authorize reply · 0fd3fd0a
      Ilya Dryomov authored

      The authorize reply can be empty, for example when the ticket used to
      build the authorizer is too old and TAG_BADAUTHORIZER is returned from
      the service.  Calling ->verify_authorizer_reply() results in an attempt
      to decrypt and validate (somewhat) random data in au->buf (most likely
      the signature block from calc_signature()), which fails and ends up in
      con_fault_finish() with !con->auth_retry.  The ticket isn't invalidated
      and the connection is retried again and again until a new ticket is
      obtained from the monitor:
      
        libceph: osd2 192.168.122.1:6809 bad authorize reply
        libceph: osd2 192.168.122.1:6809 bad authorize reply
        libceph: osd2 192.168.122.1:6809 bad authorize reply
        libceph: osd2 192.168.122.1:6809 bad authorize reply
      
      Let the TAG_BADAUTHORIZER handler kick in and increment con->auth_retry.
      
      Cc: stable@vger.kernel.org
      Fixes: 5c056fdc ("libceph: verify authorize reply on connect")
      Link: https://tracker.ceph.com/issues/20164
      
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
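The control flow this commit describes can be pictured with a small userspace model. This is only a sketch under stated assumptions: `struct conn`, `verify_reply()`, `handle_connect_reply()`, the tag values and the outcome enum are all hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag values for this sketch; the real protocol constants differ. */
enum reply_tag { TAG_READY = 1, TAG_BADAUTHORIZER = 2 };

enum outcome { OUT_CONNECTED, OUT_RETRY_WITH_NEW_AUTHORIZER, OUT_FAULT };

struct conn {
	int auth_retry;    /* authorizer retries so far */
	int verify_calls;  /* how often the verifier ran (illustration only) */
};

/* Stand-in for ->verify_authorizer_reply(): accepts only the bytes "ok". */
static int verify_reply(struct conn *con, const char *buf, size_t len)
{
	assert(con != NULL);
	con->verify_calls++;
	return (len == 2 && buf[0] == 'o' && buf[1] == 'k') ? 0 : -1;
}

static enum outcome handle_connect_reply(struct conn *con, enum reply_tag tag,
					 const char *buf, size_t len)
{
	/* An empty reply carries nothing to decrypt, so skip verification. */
	if (len > 0 && verify_reply(con, buf, len) < 0)
		return OUT_FAULT;

	if (tag == TAG_BADAUTHORIZER) {
		con->auth_retry++;  /* the tag handler gets to run */
		return OUT_RETRY_WITH_NEW_AUTHORIZER;
	}
	return OUT_CONNECTED;
}
```

With an empty reply and `TAG_BADAUTHORIZER`, the verifier is never called and the retry counter advances, which is the behavior the commit message asks for.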
  2. 21 Jan, 2019 1 commit
    • libceph: avoid KEEPALIVE_PENDING races in ceph_con_keepalive() · 4aac9228
      Ilya Dryomov authored

      con_fault() can transition the connection into STANDBY right after
      ceph_con_keepalive() clears STANDBY in clear_standby():
      
          libceph user thread               ceph-msgr worker
      
      ceph_con_keepalive()
        mutex_lock(&con->mutex)
        clear_standby(con)
        mutex_unlock(&con->mutex)
                                      mutex_lock(&con->mutex)
                                      con_fault()
                                        ...
                                        if KEEPALIVE_PENDING isn't set
                                          set state to STANDBY
                                        ...
                                      mutex_unlock(&con->mutex)
        set KEEPALIVE_PENDING
        set WRITE_PENDING
      
      This triggers warnings in clear_standby() the next time either
      ceph_con_send() or ceph_con_keepalive() gets to clearing STANDBY.
      
      I don't see a reason to condition queue_con() call on the previous
      value of KEEPALIVE_PENDING, so move the setting of KEEPALIVE_PENDING
      into the critical section -- unlike WRITE_PENDING, KEEPALIVE_PENDING
      could have been a non-atomic flag.
      
      Reported-by: syzbot+acdeb633f6211ccdf886@syzkaller.appspotmail.com
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Tested-by: Myungho Jung <mhjungk@gmail.com>
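The interleaving in the diagram above can be replayed deterministically in a single-threaded model. This is a sketch, not the kernel code: the struct, the two-state enum and the `keepalive_old()`/`keepalive_new()` helpers are hypothetical, and the "fault" argument simply marks the point where the worker would take the mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum con_state { CON_OPEN, CON_STANDBY };

struct conn {
	enum con_state state;
	bool keepalive_pending;
};

/* Stand-in for clear_standby(): leave STANDBY if we are in it. */
static void clear_standby(struct conn *con)
{
	assert(con != NULL);
	if (con->state == CON_STANDBY)
		con->state = CON_OPEN;
}

/* Stand-in for the relevant part of con_fault(), run under the mutex. */
static void con_fault(struct conn *con)
{
	if (!con->keepalive_pending)
		con->state = CON_STANDBY;
}

/* Old ordering: the flag is set after the critical section, so a fault
 * that runs in the window does not see it. */
static enum con_state keepalive_old(struct conn *con, bool fault_in_window)
{
	clear_standby(con);             /* inside the critical section */
	if (fault_in_window)
		con_fault(con);         /* worker takes the mutex here */
	con->keepalive_pending = true;  /* set too late */
	return con->state;
}

/* New ordering: the flag is set inside the critical section, so a fault
 * can only run afterwards and sees the flag. */
static enum con_state keepalive_new(struct conn *con, bool fault_after)
{
	clear_standby(con);
	con->keepalive_pending = true;  /* still inside the critical section */
	if (fault_after)
		con_fault(con);
	return con->state;
}
```

In the old ordering the connection ends up in STANDBY with KEEPALIVE_PENDING set, the inconsistent combination that later trips the warning; in the new ordering it stays open.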
  3. 26 Dec, 2018 4 commits
    • libceph: switch more to bool in ceph_tcp_sendmsg() · 87349cda
      Ilya Dryomov authored

      Unlike in ceph_tcp_sendpage(), "more" is a plain bool in
      ceph_tcp_sendmsg().

      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: use MSG_SENDPAGE_NOTLAST with ceph_tcp_sendpage() · 433b0a12
      Ilya Dryomov authored

      Prevent do_tcp_sendpages() from calling tcp_push() (at least) once per
      page.  Instead, arrange for tcp_push() to be called (at least) once per
      data payload.  This results in more MSS-sized packets and fewer packets
      overall (5-10% reduction in my tests with typical OSD request sizes).
      See commits 2f533844 ("tcp: allow splice() to build full TSO
      packets"), 35f9c09f ("tcp: tcp_sendpages() should call tcp_push()
      once") and ae62ca7b ("tcp: fix MSG_SENDPAGE_NOTLAST logic") for
      details.
      
      Here is an example of a packet size histogram for 128K OSD requests
      (MSS = 1448, top 5):
      
      Before:
      
           SIZE    COUNT
           1448   777700
            952   127915
           1200    39238
           1219     9806
             21     5675
      
      After:
      
           SIZE    COUNT
           1448   897280
             21     6201
           1019     2797
            643     2739
            376     2479
      
      We could do slightly better by explicitly corking the socket but it's
      not clear it's worth it.

      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
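The "once per data payload" idea can be sketched as a per-piece flag choice. The flag values and both helper functions below are hypothetical (the real MSG_* constants live in kernel headers), and the exact flag combination used by the kernel client is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch only. */
#define SK_MSG_MORE             0x1
#define SK_MSG_SENDPAGE_NOTLAST 0x2

/* Flags for sending `len` bytes when `resid` payload bytes remain,
 * counting this piece. */
static int sendpage_flags(size_t len, size_t resid)
{
	assert(len <= resid);
	if (len == resid)
		return SK_MSG_MORE;  /* last piece of the payload; a footer follows */
	return SK_MSG_MORE | SK_MSG_SENDPAGE_NOTLAST;
}

/* Count the pieces after which a push would be allowed (NOTLAST not set)
 * for a payload sent one page at a time. */
static unsigned int count_push_points(size_t total, size_t page_size)
{
	unsigned int pushes = 0;
	size_t resid = total;

	assert(page_size > 0);
	while (resid > 0) {
		size_t len = resid < page_size ? resid : page_size;

		if (!(sendpage_flags(len, resid) & SK_MSG_SENDPAGE_NOTLAST))
			pushes++;
		resid -= len;
	}
	return pushes;
}
```

For a 128K payload split into 4K pages, only the final page leaves the "not last" hint unset, so the model allows a single push point instead of one per page.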
    • libceph: use sock_no_sendpage() as a fallback in ceph_tcp_sendpage() · 3239eb52
      Ilya Dryomov authored

      sock_no_sendpage() makes the code cleaner.
      
      Also, don't set MSG_EOR.  sendpage doesn't act on MSG_EOR on its own,
      it just honors the setting from the preceding sendmsg call by looking
      at ->eor in tcp_skb_can_collapse_to().

      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    • libceph: drop last_piece logic from write_partial_message_data() · 1f6b821a
      Ilya Dryomov authored

      last_piece is for the last piece in the current data item, not in the
      entire data payload of the message.  This is harmful for messages with
      multiple data items.  On top of that, we don't need to signal the end
      of a data payload either because it is always followed by a footer.
      
      We used to signal "more" unconditionally, until commit fe38a2b6
      ("libceph: start defining message data cursor").  Part of a large
      series, it introduced cursor->last_piece and also mistakenly inverted
      the hint by passing last_piece for "more".  This was corrected with
      commit c2cfa194 ("libceph: Fix ceph_tcp_sendpage()'s more boolean
      usage").
      
      As it is, last_piece is not helping at all: because Nagle algorithm is
      disabled, for a simple message with two 512-byte data items we end up
      emitting three packets: front + first data item, second data item and
      footer.  Go back to the original pre-fe38a2b6 behavior -- a single
      packet in most cases.

      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
  4. 19 Nov, 2018 1 commit
    • libceph: fall back to sendmsg for slab pages · 7e241f64
      Ilya Dryomov authored

      skb_can_coalesce() allows coalescing neighboring slab objects into
      a single frag:
      
        return page == skb_frag_page(frag) &&
               off == frag->page_offset + skb_frag_size(frag);
      
      ceph_tcp_sendpage() can be handed slab pages.  One example of this is
      XFS: it passes down sector sized slab objects for its metadata I/O.  If
      the kernel client is co-located on the OSD node, the skb may go through
      loopback and pop on the receive side with the exact same set of frags.
      When tcp_recvmsg() attempts to copy out such a frag, hardened usercopy
      complains because the size exceeds the object's allocated size:
      
        usercopy: kernel memory exposure attempt detected from ffff9ba917f20a00 (kmalloc-512) (1024 bytes)
      
      Although skb_can_coalesce() could be taught to return false if the
      resulting frag would cross a slab object boundary, we already have
      a fallback for non-refcounted pages.  Utilize it for slab pages too.
      
      Cc: stable@vger.kernel.org # 4.8+
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
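A userspace model makes the coalescing problem concrete. This is a sketch under assumptions: `struct frag`, `append_piece()` and `must_copy()` are hypothetical names, and `must_copy()` only illustrates the kind of check the commit describes (fall back for non-refcounted pages and for slab pages).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct frag {
	const void *page;    /* identity of the backing page */
	size_t page_offset;
	size_t size;
};

/* Mirrors the condition quoted in the commit message. */
static bool can_coalesce(const struct frag *frag, const void *page, size_t off)
{
	assert(frag != NULL);
	return page == frag->page && off == frag->page_offset + frag->size;
}

/* Append a piece to the frag if it is contiguous; true if merged. */
static bool append_piece(struct frag *frag, const void *page, size_t off,
			 size_t len)
{
	if (!can_coalesce(frag, page, off))
		return false;
	frag->size += len;
	return true;
}

/* Hypothetical fallback decision: copy instead of referencing the page
 * when it is not refcounted or belongs to the slab allocator. */
static bool must_copy(int page_refcount, bool is_slab)
{
	return page_refcount < 1 || is_slab;
}
```

Two neighboring 512-byte objects on the same page merge into one 1024-byte frag, which matches the sizes in the usercopy report quoted above (a kmalloc-512 object, 1024 bytes copied).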
  5. 23 Oct, 2018 1 commit
    • iov_iter: Separate type from direction and use accessor functions · aa563d7b
      David Howells authored

      In the iov_iter struct, separate the iterator type from the iterator
      direction and use accessor functions to access them in most places.
      
      Convert a bunch of places to use switch-statements to access them rather
      than chains of bitwise-AND statements.  This makes it easier to add further
      iterator types.  It can also be more efficient: to implement a switch over
      small contiguous integers, the compiler can use ~50% fewer compare
      instructions than the bitwise-AND instructions it would otherwise need.
      
      Further, cease passing the iterator type into the iterator setup function.
      The iterator function can set that itself.  Only the direction is required.

      Signed-off-by: David Howells <dhowells@redhat.com>
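The shape of this change can be sketched outside the kernel. Everything below is hypothetical (the enum names, `struct iter`, and the helpers); it only illustrates keeping type and direction in separate fields, reading them through accessors, dispatching with a switch, and having a per-type setup function that takes only the direction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iterator types as small contiguous integers. */
enum iter_type { ITER_T_IOVEC, ITER_T_KVEC, ITER_T_BVEC, ITER_T_PIPE };
enum iter_dir  { ITER_D_READ, ITER_D_WRITE };

struct iter {
	enum iter_type type;
	enum iter_dir dir;   /* kept separate from the type */
};

/* Accessors, so callers do not poke at the fields directly. */
static enum iter_type iter_type_of(const struct iter *i)
{
	assert(i != NULL);
	return i->type;
}

static enum iter_dir iter_dir_of(const struct iter *i)
{
	assert(i != NULL);
	return i->dir;
}

/* Setup function for one type: it sets the type itself, the caller
 * passes only the direction. */
static void iter_kvec_init(struct iter *i, enum iter_dir dir)
{
	i->type = ITER_T_KVEC;
	i->dir = dir;
}

/* Dispatch with a switch rather than a chain of bitwise-AND tests. */
static const char *iter_type_name(const struct iter *i)
{
	switch (iter_type_of(i)) {
	case ITER_T_IOVEC: return "iovec";
	case ITER_T_KVEC:  return "kvec";
	case ITER_T_BVEC:  return "bvec";
	case ITER_T_PIPE:  return "pipe";
	}
	return "unknown";
}
```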
  6. 22 Oct, 2018 2 commits
    • libceph: preallocate message data items · 0d9c1ab3
      Ilya Dryomov authored

      Currently message data items are allocated with ceph_msg_data_create()
      in setup_request_data() inside send_request().  send_request() has never
      been allowed to fail, so each allocation is followed by a BUG_ON:
      
        data = ceph_msg_data_create(...);
        BUG_ON(!data);
      
      It's been this way since support for multiple message data items was
      added in commit 6644ed7b ("libceph: make message data be a pointer")
      in 3.10.
      
      There is no reason to delay the allocation of message data items until
      the last possible moment and we certainly don't need a linked list of
      them as they are only ever appended to the end and never erased.  Make
      ceph_msg_new2() take max_data_items and adapt the rest of the code.

      Reported-by: Jerry Lee <leisurelysw24@gmail.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
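The preallocation idea can be sketched in userspace C. The names `struct msg`, `msg_new()` and `msg_data_add()` are hypothetical, and the layout (a flexible array allocated together with the message) is an assumption of this sketch rather than the layout the kernel uses.

```c
#include <assert.h>
#include <stdlib.h>

struct data_item {
	int type;
	size_t length;
};

struct msg {
	int num_data_items;
	int max_data_items;
	struct data_item data[];  /* slots allocated together with the message */
};

/* All allocation happens here, where failure can still be reported. */
static struct msg *msg_new(int max_data_items)
{
	struct msg *m;

	assert(max_data_items >= 0);
	m = calloc(1, sizeof(*m) +
		      (size_t)max_data_items * sizeof(struct data_item));
	if (!m)
		return NULL;
	m->max_data_items = max_data_items;
	return m;
}

/* Claim the next slot.  No allocation, so the send path cannot fail for
 * lack of memory; NULL only if the caller exceeds what it declared. */
static struct data_item *msg_data_add(struct msg *m)
{
	if (m->num_data_items >= m->max_data_items)
		return NULL;
	return &m->data[m->num_data_items++];
}
```

Because items are only ever appended, an index into a fixed array is enough and no linked list is needed.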
    • libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist() · 89486833
      Ilya Dryomov authored

      Because send_mds_reconnect() wants to send a message with a pagelist
      and pass the ownership to the messenger, ceph_msg_data_add_pagelist()
      consumes a ref which is then put in ceph_msg_data_destroy().  This
      makes managing pagelists in the OSD client (where they are wrapped in
      ceph_osd_data) unnecessarily hard because the handoff only happens in
      ceph_osdc_start_request() instead of when the pagelist is passed to
      ceph_osd_data_pagelist_init().  I counted several memory leaks on
      various error paths.
      
      Fix up ceph_msg_data_add_pagelist() and carry a pagelist ref in
      ceph_osd_data.

      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
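The ownership rule after this change can be modeled with a plain counter. This is a sketch with hypothetical names (`struct pagelist`, `msg_data_add_pagelist()`, and so on); the `released` field stands in for actually freeing the object so the outcome can be observed.

```c
#include <assert.h>
#include <stddef.h>

struct pagelist {
	int refcnt;
	int released;  /* set when the last reference is dropped */
};

static void pagelist_get(struct pagelist *pl)
{
	assert(pl != NULL && pl->refcnt > 0);
	pl->refcnt++;
}

static void pagelist_put(struct pagelist *pl)
{
	assert(pl != NULL && pl->refcnt > 0);
	if (--pl->refcnt == 0)
		pl->released = 1;
}

struct msg_data {
	struct pagelist *pl;
};

/* Takes its own reference instead of consuming the caller's. */
static void msg_data_add_pagelist(struct msg_data *d, struct pagelist *pl)
{
	pagelist_get(pl);
	d->pl = pl;
}

/* Drops the reference taken in msg_data_add_pagelist(). */
static void msg_data_destroy(struct msg_data *d)
{
	pagelist_put(d->pl);
	d->pl = NULL;
}
```

With this rule every holder drops exactly the reference it took, so an error path that never reaches the messenger can still release the pagelist with a single put.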
  7. 02 Aug, 2018 5 commits
  8. 04 Jun, 2018 2 commits
  9. 26 Apr, 2018 1 commit
    • libceph: validate con->state at the top of try_write() · 9c55ad1c
      Ilya Dryomov authored

      ceph_con_workfn() validates con->state before calling try_read() and
      then try_write().  However, try_read() temporarily releases con->mutex,
      notably in process_message() and ceph_con_in_msg_alloc(), opening the
      window for ceph_con_close() to sneak in, close the connection and
      release con->sock.  When try_write() is called on the assumption that
      con->state is still valid (i.e. not STANDBY or CLOSED), a NULL sock
      gets passed to the networking stack:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
        IP: selinux_socket_sendmsg+0x5/0x20
      
      Make sure con->state is valid at the top of try_write() and add an
      explicit BUG_ON for this, similar to try_read().
      
      Cc: stable@vger.kernel.org
      Link: https://tracker.ceph.com/issues/23706
      
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Jason Dillaman <dillaman@redhat.com>
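The window described above can be replayed in a small single-threaded model. This is a sketch only: the three-state enum, `struct conn` and the `close_during_read` switch are hypothetical simplifications of the real connection state machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum con_state { CON_OPEN, CON_STANDBY, CON_CLOSED };

struct conn {
	enum con_state state;
	void *sock;   /* NULL once the connection is closed */
	int sends;    /* how often we handed the socket to the "network" */
};

/* Stand-in for try_read(): the mutex is dropped while a message is
 * processed, so a concurrent close may run in that window. */
static void try_read(struct conn *con, bool close_during_read)
{
	if (close_during_read) {
		con->state = CON_CLOSED;
		con->sock = NULL;
	}
}

/* Stand-in for try_write() with the state check at the top. */
static int try_write(struct conn *con)
{
	/* Re-validate: the state may have changed since the caller checked. */
	if (con->state == CON_STANDBY || con->state == CON_CLOSED)
		return 0;

	assert(con->sock != NULL);  /* explicit check, like the added BUG_ON */
	con->sends++;
	return 1;
}
```

When the close happens during the read, the write side returns early instead of passing a NULL socket down the stack.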
  10. 02 Apr, 2018 4 commits
  11. 13 Nov, 2017 1 commit
  12. 02 Nov, 2017 1 commit
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored

      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information.
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.

      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
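The two mechanical decisions the message mentions for files without license text (which identifier to use, and which comment type to wrap it in) can be sketched as follows. The function names are hypothetical and this is not the script Thomas and Greg used; the comment styles assume the convention from the kernel's license rules, a block comment in headers and a line comment in .c files.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Default identifier for a file with no license text: UAPI paths get the
 * syscall note, everything else plain GPL-2.0. */
static const char *default_spdx_id(const char *path)
{
	assert(path != NULL);
	if (strstr(path, "/uapi/"))
		return "GPL-2.0 WITH Linux-syscall-note";
	return "GPL-2.0";
}

/* Format the tag line into `out`; returns the snprintf() result. */
static int spdx_tag_line(const char *path, char *out, size_t outlen)
{
	size_t n = strlen(path);
	const char *id = default_spdx_id(path);

	/* Headers get a block comment, other files a line comment. */
	if (n >= 2 && strcmp(path + n - 2, ".h") == 0)
		return snprintf(out, outlen,
				"/* SPDX-License-Identifier: %s */", id);
	return snprintf(out, outlen, "// SPDX-License-Identifier: %s", id);
}
```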
  13. 01 Aug, 2017 1 commit
  14. 17 Jul, 2017 1 commit
  15. 07 Jul, 2017 2 commits
  16. 24 May, 2017 1 commit
  17. 23 May, 2017 1 commit
  18. 09 May, 2017 1 commit
  19. 23 Mar, 2017 1 commit
    • libceph: force GFP_NOIO for socket allocations · 633ee407
      Ilya Dryomov authored

      sock_alloc_inode() allocates socket+inode and socket_wq with
      GFP_KERNEL, which is not allowed on the writeback path:
      
          Workqueue: ceph-msgr con_work [libceph]
          ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000
          0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00
          ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148
          Call Trace:
          [<ffffffff816dd629>] schedule+0x29/0x70
          [<ffffffff816e066d>] schedule_timeout+0x1bd/0x200
          [<ffffffff81093ffc>] ? ttwu_do_wakeup+0x2c/0x120
          [<ffffffff81094266>] ? ttwu_do_activate.constprop.135+0x66/0x70
          [<ffffffff816deb5f>] wait_for_completion+0xbf/0x180
          [<ffffffff81097cd0>] ? try_to_wake_up+0x390/0x390
          [<ffffffff81086335>] flush_work+0x165/0x250
          [<ffffffff81082940>] ? worker_detach_from_pool+0xd0/0xd0
          [<ffffffffa03b65b1>] xlog_cil_force_lsn+0x81/0x200 [xfs]
          [<ffffffff816d6b42>] ? __slab_free+0xee/0x234
          [<ffffffffa03b4b1d>] _xfs_log_force_lsn+0x4d/0x2c0 [xfs]
          [<ffffffff811adc1e>] ? lookup_page_cgroup_used+0xe/0x30
          [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
          [<ffffffffa03b4dcf>] xfs_log_force_lsn+0x3f/0xf0 [xfs]
          [<ffffffffa039a723>] ? xfs_reclaim_inode+0xa3/0x330 [xfs]
          [<ffffffffa03a62c6>] xfs_iunpin_wait+0xc6/0x1a0 [xfs]
          [<ffffffff810aa250>] ? wake_atomic_t_function+0x40/0x40
          [<ffffffffa039a723>] xfs_reclaim_inode+0xa3/0x330 [xfs]
          [<ffffffffa039ac07>] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs]
          [<ffffffffa039bb13>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
          [<ffffffffa03ab745>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
          [<ffffffff811c0c18>] super_cache_scan+0x178/0x180
          [<ffffffff8115912e>] shrink_slab_node+0x14e/0x340
          [<ffffffff811afc3b>] ? mem_cgroup_iter+0x16b/0x450
          [<ffffffff8115af70>] shrink_slab+0x100/0x140
          [<ffffffff8115e425>] do_try_to_free_pages+0x335/0x490
          [<ffffffff8115e7f9>] try_to_free_pages+0xb9/0x1f0
          [<ffffffff816d56e4>] ? __alloc_pages_direct_compact+0x69/0x1be
          [<ffffffff81150cba>] __alloc_pages_nodemask+0x69a/0xb40
          [<ffffffff8119743e>] alloc_pages_current+0x9e/0x110
          [<ffffffff811a0ac5>] new_slab+0x2c5/0x390
          [<ffffffff816d71c4>] __slab_alloc+0x33b/0x459
          [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
          [<ffffffff8164bda1>] ? inet_sendmsg+0x71/0xc0
          [<ffffffff815b906d>] ? sock_alloc_inode+0x2d/0xd0
          [<ffffffff811a21f2>] kmem_cache_alloc+0x1a2/0x1b0
          [<ffffffff815b906d>] sock_alloc_inode+0x2d/0xd0
          [<ffffffff811d8566>] alloc_inode+0x26/0xa0
          [<ffffffff811da04a>] new_inode_pseudo+0x1a/0x70
          [<ffffffff815b933e>] sock_alloc+0x1e/0x80
          [<ffffffff815ba855>] __sock_create+0x95/0x220
          [<ffffffff815baa04>] sock_create_kern+0x24/0x30
          [<ffffffffa04794d9>] con_work+0xef9/0x2050 [libceph]
          [<ffffffffa04aa9ec>] ? rbd_img_request_submit+0x4c/0x60 [rbd]
          [<ffffffff81084c19>] process_one_work+0x159/0x4f0
          [<ffffffff8108561b>] worker_thread+0x11b/0x530
          [<ffffffff81085500>] ? create_worker+0x1d0/0x1d0
          [<ffffffff8108b6f9>] kthread+0xc9/0xe0
          [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
          [<ffffffff816e1b98>] ret_from_fork+0x58/0x90
          [<ffffffff8108b630>] ? flush_kthread_worker+0x90/0x90
      
      Use memalloc_noio_{save,restore}() to temporarily force GFP_NOIO here.
      
      Cc: stable@vger.kernel.org # 3.10+, needs backporting
      Link: http://tracker.ceph.com/issues/19309
      
      Reported-by: Sergey Jerusalimov <wintchester@gmail.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
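The save/restore pattern named above can be modeled in userspace to show why it helps when the allocating code cannot be changed to pass different flags. All names and bit values here are hypothetical (`SK_PF_NOIO`, `SK_GFP_*`, `task_flags`, `effective_gfp()`); the sketch only mimics a scoped per-task flag that masks what an allocation may do.

```c
#include <assert.h>

#define SK_PF_NOIO    0x1u  /* hypothetical per-task flag bit */
#define SK_GFP_IO     0x1u  /* hypothetical allocation capability bits */
#define SK_GFP_FS     0x2u
#define SK_GFP_KERNEL (SK_GFP_IO | SK_GFP_FS)

static unsigned int task_flags;  /* stands in for the current task's flags */

/* Enter a no-I/O scope; returns the previous state of the flag. */
static unsigned int noio_save(void)
{
	unsigned int old = task_flags & SK_PF_NOIO;

	task_flags |= SK_PF_NOIO;
	return old;
}

/* Leave the scope, restoring whatever the flag was before. */
static void noio_restore(unsigned int old)
{
	assert((old & ~SK_PF_NOIO) == 0);
	task_flags = (task_flags & ~SK_PF_NOIO) | old;
}

/* What the allocator would effectively honor for a requested mask. */
static unsigned int effective_gfp(unsigned int gfp)
{
	if (task_flags & SK_PF_NOIO)
		gfp &= ~(SK_GFP_IO | SK_GFP_FS);
	return gfp;
}

/* The callee still asks for the permissive mask; the scope overrides it. */
static unsigned int create_socket_gfp(void)
{
	unsigned int noio = noio_save();
	unsigned int used = effective_gfp(SK_GFP_KERNEL);

	noio_restore(noio);
	return used;
}
```

Returning the previous state from the save call is what makes the scopes nest: an inner restore does not clear a flag that an outer scope set.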
  20. 14 Jan, 2017 1 commit
    • locking/atomic, kref: Add kref_read() · 2c935bc5
      Peter Zijlstra authored

      Since we need to change the implementation, stop exposing internals.
      
      Provide kref_read() to read the current reference count; typically
      used for debug messages.
      
      Kills two anti-patterns:
      
      	atomic_read(&kref->refcount)
      	kref->refcount.counter

      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
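The point of the accessor can be shown with a userspace analogue built on C11 atomics. `struct kref_like` and its helpers are hypothetical; this is not the kernel's kref implementation, only an illustration of reading the count through a function so the field's type can change later.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct kref_like {
	atomic_uint refcount;  /* internal; callers should not touch it */
};

static void kref_like_init(struct kref_like *k)
{
	assert(k != NULL);
	atomic_init(&k->refcount, 1);
}

static void kref_like_get(struct kref_like *k)
{
	atomic_fetch_add(&k->refcount, 1);
}

/* Read accessor, typically for debug messages.  Callers that use this
 * instead of reaching into the struct keep working if the field changes. */
static unsigned int kref_like_read(struct kref_like *k)
{
	return atomic_load(&k->refcount);
}
```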
  21. 27 Dec, 2016 2 commits
  22. 12 Dec, 2016 3 commits
  23. 04 Apr, 2016 1 commit
    • mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored

      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
      ago with the promise that one day it would be possible to implement the
      page cache with bigger chunks than PAGE_SIZE.

      This promise never materialized, and it is unlikely that it ever will.

      We have many places where PAGE_CACHE_SIZE is assumed to be equal to
      PAGE_SIZE.  And it's a constant source of confusion on whether a
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.

      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      the script below.  For some reason, coccinelle doesn't patch header
      files.  I've called spatch for them manually.

      The only adjustment after coccinelle is a revert of the changes to the
      PAGE_CACHE_ALIGN definition: we are going to drop it later.

      There are a few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation will
      also be addressed in a separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
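Why the shift rewrites in the list above are safe can be checked with a few lines of C. The `SK_`-prefixed macros are hypothetical stand-ins for the kernel macros, and the 4 KiB page size is an assumption of this sketch.

```c
#include <assert.h>

#define SK_PAGE_SHIFT        12                      /* assumption: 4 KiB pages */
#define SK_PAGE_SIZE         (1UL << SK_PAGE_SHIFT)
#define SK_PAGE_CACHE_SHIFT  SK_PAGE_SHIFT           /* the two were always equal */
#define SK_PAGE_CACHE_SIZE   (1UL << SK_PAGE_CACHE_SHIFT)

/* The premise of the cleanup, checked at compile time. */
static_assert(SK_PAGE_CACHE_SIZE == SK_PAGE_SIZE,
	      "page cache chunks are assumed to be page sized");

/* Before: convert a page-cache index to a page index by a shift of zero. */
static unsigned long index_old(unsigned long idx)
{
	return idx << (SK_PAGE_CACHE_SHIFT - SK_PAGE_SHIFT);
}

/* After: the shift is dropped, as in the first rewrite rule. */
static unsigned long index_new(unsigned long idx)
{
	return idx;
}
```

Since the two shift values are equal, the shift amount is zero and both forms return the same index for every input.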
  24. 25 Mar, 2016 1 commit