1. 21 Nov, 2018 1 commit
  2. 05 Sep, 2018 1 commit
  3. 11 Jul, 2018 4 commits
  4. 26 Jun, 2018 1 commit
    • ext4: correctly handle a zero-length xattr with a non-zero e_value_offs · 21542545
      Theodore Ts'o authored
      commit 8a2b307c upstream.
      Ext4 will always create ext4 extended attributes which do not have a
      value (where e_value_size is zero) with e_value_offs set to zero.  In
      most places e_value_offs will not be used in a substantive way if
      e_value_size is zero.
      There was one exception to this, which is in ext4_xattr_set_entry():
      on a maliciously crafted file system containing an extended attribute
      whose e_value_offs is non-zero and whose e_value_size is 0, an attempt
      to remove this xattr results in a negative value being passed to
      memmove, leading to the following sadness:
      [   41.225365] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
      [   44.538641] BUG: unable to handle kernel paging request at ffff9ec9a3000000
      [   44.538733] IP: __memmove+0x81/0x1a0
      [   44.538755] PGD 1249bd067 P4D 1249bd067 PUD 1249c1067 PMD 80000001230000e1
      [   44.538793] Oops: 0003 [#1] SMP PTI
      [   44.539074] CPU: 0 PID: 1470 Comm: poc Not tainted 4.16.0-rc1+ #1
      [   44.539475] Call Trace:
      [   44.539832]  ext4_xattr_set_entry+0x9e7/0xf80
      [   44.539972]  ext4_xattr_block_set+0x212/0xea0
      [   44.540041]  ext4_xattr_set_handle+0x514/0x610
      [   44.540065]  ext4_xattr_set+0x7f/0x120
      [   44.540090]  __vfs_removexattr+0x4d/0x60
      [   44.540112]  vfs_removexattr+0x75/0xe0
      [   44.540132]  removexattr+0x4d/0x80
      [   44.540279]  path_removexattr+0x91/0xb0
      [   44.540300]  SyS_removexattr+0xf/0x20
      [   44.540322]  do_syscall_64+0x71/0x120
      [   44.540344]  entry_SYSCALL_64_after_hwframe+0x21/0x86
      This addresses CVE-2018-10840.
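      The failure mode described above can be sketched in user space. This is
      an illustration only, not the kernel code: struct sketch_entry,
      bytes_to_shift() and entry_has_value_to_remove() are hypothetical names,
      and the offsets are plain integers rather than pointers into a block. It
      shows how an entry with a zero size but a stray offset produces a
      negative length, which becomes an enormous size_t once handed to
      memmove(), and how checking the size first avoids the computation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an on-disk xattr entry. */
struct sketch_entry {
    uint16_t value_offs;  /* offset of the value inside the block */
    uint32_t value_size;  /* length of the value, 0 if there is none */
};

/*
 * Number of value bytes that lie between the lowest value offset in the
 * block (min_offs) and the value being removed. The result is signed so
 * that an inconsistent entry shows up as a negative number instead of
 * silently wrapping around.
 */
static ptrdiff_t bytes_to_shift(const struct sketch_entry *e, size_t min_offs)
{
    assert(e != NULL);
    return (ptrdiff_t)e->value_offs - (ptrdiff_t)min_offs;
}

/*
 * An entry without a value is expected to carry a zero offset, so only
 * entries with a non-zero size have value bytes that need to be removed.
 */
static bool entry_has_value_to_remove(const struct sketch_entry *e)
{
    assert(e != NULL);
    return e->value_size != 0 && e->value_offs != 0;
}
```

      With a crafted entry such as { .value_offs = 16, .value_size = 0 } and a
      min_offs of 4000, bytes_to_shift() is negative, and casting it to size_t
      gives a length far larger than the block.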
      Reported-by: "Xu, Wen" <wen.xu@gatech.edu>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Cc: stable@kernel.org
      Fixes: dec214d0 ("ext4: xattr inode deduplication")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  5. 24 Apr, 2018 4 commits
  6. 02 Nov, 2017 1 commit
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boilerplate text.
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      How this work was done:
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information.
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      The analysis to determine which SPDX License Identifier should be applied
      to a file was done in a spreadsheet of side-by-side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      Criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
      All documentation files were explicitly excluded.
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
         For non */uapi/* files that summary was:
         SPDX license identifier                            # files
         GPL-2.0                                              11139
         and resulted in the first patch in this series.
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  The results were:
         SPDX license identifier                            # files
         GPL-2.0 WITH Linux-syscall-note                        930
         and resulted in the second patch in this series.
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
         SPDX license identifier                            # files
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
         and that resulted in the third patch in this series.
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
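      The tagging step can be illustrated with a small sketch. This is not the
      script Thomas and Greg used; ends_with(), default_spdx_id() and
      format_spdx_line() are hypothetical helpers. It assumes the comment
      styles the kernel uses for SPDX lines (a line comment in .c sources, a
      block comment in headers) and covers only the "no license information"
      case described above, where uapi paths get the Linux-syscall-note
      variant.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns non-zero if s ends with suffix. */
static int ends_with(const char *s, const char *suffix)
{
    size_t ls = strlen(s), lx = strlen(suffix);

    return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

/*
 * Identifier for a file that carries no license information: files under
 * a uapi directory get the syscall-note variant, everything else gets
 * plain GPL-2.0.
 */
static const char *default_spdx_id(const char *path)
{
    return strstr(path, "/uapi/") ? "GPL-2.0 WITH Linux-syscall-note"
                                  : "GPL-2.0";
}

/*
 * Writes the SPDX line for path into out. Headers take a block comment,
 * .c sources take a line comment. Returns the snprintf() result, or -1
 * for file types this sketch does not handle.
 */
static int format_spdx_line(const char *path, char *out, size_t outlen)
{
    const char *id;

    assert(path != NULL && out != NULL);
    id = default_spdx_id(path);
    if (ends_with(path, ".h"))
        return snprintf(out, outlen, "/* SPDX-License-Identifier: %s */", id);
    if (ends_with(path, ".c"))
        return snprintf(out, outlen, "// SPDX-License-Identifier: %s", id);
    return -1;
}
```

      For example, format_spdx_line("fs/ext4/xattr.c", buf, sizeof(buf))
      produces a line-comment tag with GPL-2.0, while a header under
      include/uapi/ gets a block comment with the syscall note.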
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. 24 Aug, 2017 1 commit
    • ext4: backward compatibility support for Lustre ea_inode implementation · a6d05676
      Tahsin Erdogan authored
      The original Lustre ea_inode feature did not have ref counts on xattr
      inodes because there was always one parent that referenced each of them.
      The new implementation expects the ref count to be initialized, which is
      not true in the Lustre case. Handle this by detecting a Lustre-created
      xattr inode and setting its ref count to 1.
      The quota handling of xattr inodes has also changed with deduplication
      support. The new implementation manually manages quotas to support
      sharing across multiple users. A consequence is that a referencing inode
      incorporates the blocks of the xattr inode into its own i_block field.
      We need to know how a xattr inode was created so that we can reverse the
      block charges during reference removal. This is handled by introducing an
      EXT4_STATE_LUSTRE_EA_INODE flag. The flag is set on a xattr inode if the
      inode appears to have been created by Lustre. During xattr inode reference
      removal, the manual quota uncharge is skipped if the flag is set.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  8. 14 Aug, 2017 1 commit
  9. 06 Aug, 2017 5 commits
  10. 31 Jul, 2017 1 commit
  11. 06 Jul, 2017 1 commit
    • ext4: fix __ext4_new_inode() journal credits calculation · af65207c
      Tahsin Erdogan authored
      ea_inode feature allows creating extended attributes that are up to
      64k in size. Update __ext4_new_inode() to pick increased credit limits.
      To avoid overallocating journal credits, update
      __ext4_xattr_set_credits() to make a distinction between xattr create
      vs update. This helps __ext4_new_inode() because all attributes are
      known to be new, so we can save credits that are normally needed to
      delete old values.
      Also, have fscrypt specify its maximum context size so that we don't
      end up allocating credits for 64k size.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  12. 22 Jun, 2017 19 commits
    • ext4: add nombcache mount option · cdb7ee4c
      Tahsin Erdogan authored
      The main purpose of mb cache is to achieve deduplication in
      extended attributes. In use cases where opportunity for deduplication
      is unlikely, it only adds overhead.
      Add a mount option to explicitly turn off mb cache.
      Suggested-by: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: strong binding of xattr inode references · b9fc761e
      Tahsin Erdogan authored
      To verify that a xattr entry is not pointing to the wrong xattr inode,
      we currently check that the target inode has EXT4_EA_INODE_FL flag set and
      also the entry size matches the target inode size.
      For stronger validation, also incorporate crc32c hash of the value into
      the e_hash field. This is done regardless of whether the entry lives in
      the inode body or external attribute block.
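      The three checks described above can be sketched as follows. This is a
      simplified illustration under stated assumptions, not the ext4 code:
      struct binding_target, struct binding_ref, BINDING_EA_INODE_FL and the
      functions are hypothetical, FNV-1a stands in for crc32c, and the stored
      hash is treated as a plain value hash rather than being folded into
      e_hash together with other data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BINDING_EA_INODE_FL 0x1u  /* stand-in for EXT4_EA_INODE_FL */

/* Hypothetical view of the inode that an xattr entry points at. */
struct binding_target {
    uint32_t flags;
    const unsigned char *data;
    size_t size;
};

/* Hypothetical view of what the xattr entry records about its value. */
struct binding_ref {
    size_t value_size;
    uint32_t value_hash;
};

/* FNV-1a, used here only as a stand-in for crc32c. */
static uint32_t binding_hash(const unsigned char *p, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* True only if the flag, the size and the value hash all agree. */
static bool binding_ref_matches(const struct binding_ref *ref,
                                const struct binding_target *t)
{
    assert(ref != NULL && t != NULL);
    if (!(t->flags & BINDING_EA_INODE_FL))
        return false;
    if (t->size != ref->value_size)
        return false;
    return binding_hash(t->data, t->size) == ref->value_hash;
}
```

      The hash comparison is the new part: the flag and size checks alone
      would accept any other xattr inode that happens to have the same length.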
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: eliminate xattr entry e_hash recalculation for removes · daf83281
      Tahsin Erdogan authored
      When an extended attribute block is modified, ext4_xattr_hash_entry()
      recalculates e_hash for the entry that is pointed to by s->here. This is
      unnecessary if the modification is to remove an entry.
      Currently, if the removed entry is the last one and there are other
      entries remaining, hash calculation targets the just erased entry which
      has been filled with zeroes and effectively does nothing.  If the removed
      entry is not the last one and there are more entries, this time it will
      recalculate hash on the next entry which is totally unnecessary.
      Fix these by moving the decision on when to recalculate hash to
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: reserve space for xattr entries/names · 9c6e7853
      Tahsin Erdogan authored
      New ea_inode feature allows putting large xattr values into external
      inodes.  struct ext4_xattr_entry and the attribute name however have to
      remain in the inode extra space or external attribute block.  Once that
      space is exhausted, no further entries can be added.  Some of that space
      could also be used by values that fit in there at the time of addition.
      So, a single xattr entry whose value barely fits in the external block
      could prevent further entries being added.
      To mitigate the problem, this patch introduces a notion of reserved
      space in the external attribute block that cannot be used by value data.
      This reserve is enforced when ea_inode feature is enabled.  The amount
      of reserve is arbitrarily chosen to be min(block_size/8, 1024).  The
      table below shows how much space is reserved for each block size and the
      guaranteed minimum number of entries that can be placed in the external
      attribute block.
      block size     reserved bytes  entries (name length = 16)
       1k            128              3
       2k            256              7
       4k            512             15
       8k            1024            31
      16k            1024            31
      32k            1024            31
      64k            1024            31
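      The table can be reproduced with a short calculation. The reserve
      formula min(block_size/8, 1024) comes from the text above; the rest is
      an assumption made for this sketch: a 32-byte block header, a 16-byte
      fixed part per entry, and names rounded up to a multiple of 4 bytes.
      With those values the results match the table, but the helper names
      and constants are illustrative.

```c
#include <assert.h>

#define SKETCH_HEADER_SIZE 32u  /* assumed size of the xattr block header */
#define SKETCH_ENTRY_FIXED 16u  /* assumed fixed part of one xattr entry */

/* Reserved bytes for a given block size: min(block_size / 8, 1024). */
static unsigned int xattr_block_reserve(unsigned int block_size)
{
    unsigned int r = block_size / 8;

    return r < 1024u ? r : 1024u;
}

/* Footprint of one entry: fixed part plus name, rounded up to 4 bytes. */
static unsigned int entry_len(unsigned int name_len)
{
    return (SKETCH_ENTRY_FIXED + name_len + 3u) & ~3u;
}

/* Entries that fit in the reserve once the block header is accounted for. */
static unsigned int min_entries(unsigned int block_size, unsigned int name_len)
{
    unsigned int reserve = xattr_block_reserve(block_size);

    assert(reserve >= SKETCH_HEADER_SIZE);
    return (reserve - SKETCH_HEADER_SIZE) / entry_len(name_len);
}
```

      For a 4k block, xattr_block_reserve(4096) is 512 and
      min_entries(4096, 16) is (512 - 32) / 32 = 15, matching the 4k row.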
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • quota: add get_inode_usage callback to transfer multi-inode charges · 7a9ca53a
      Tahsin Erdogan authored
      Ext4 ea_inode feature allows storing xattr values in external inodes to
      be able to store values that are bigger than a block in size. Ext4 also
      has deduplication support for this type of inode. With deduplication,
      the actual storage waste is eliminated but the users of such inodes are
      still charged full quota for the inodes as if there was no sharing
      happening in the background.
      This design requires ext4 to manually charge the users because the
      inodes are shared.
      An implication of this is that if someone calls chown on a file that
      has such references, we need to transfer the quota for the file and xattr
      inodes. Current dquot_transfer() function implicitly transfers one inode
      charge. With ea_inode feature, we would like to transfer multiple inode
      Add get_inode_usage callback which can interrogate the total number of
      inodes that were charged for a given inode.
      [ Applied fix from Colin King to make sure the 'ret' variable is
        initialized on the successful return path.  Detected by
        CoverityScan, CID#1446616 ("Uninitialized scalar variable") --tytso]
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Acked-by: Jan Kara <jack@suse.cz>
    • ext4: xattr inode deduplication · dec214d0
      Tahsin Erdogan authored
      Ext4 now supports xattr values that are up to 64k in size (vfs limit).
      Large xattr values are stored in external inodes each one holding a
      single value. Once written the data blocks of these inodes are immutable.
      The real world use cases are expected to have a lot of value duplication
      such as inherited acls etc. To reduce data duplication on disk, this patch
      implements a deduplicator that allows sharing of xattr inodes.
      The deduplication is based on an in-memory hash lookup that is a best
      effort sharing scheme. When a xattr inode is read from disk (i.e.
      getxattr() call), its crc32c hash is added to a hash table. Before
      creating a new xattr inode for a value being set, the hash table is
      checked to see if an existing inode holds an identical value. If such an
      inode is found, the ref count on that inode is incremented. On value
      removal the ref count is decremented and if it reaches zero the inode is
      The quota charging for such inodes is manually managed. Every reference
      holder is charged the full size as if there was no sharing happening.
      This is consistent with how xattr blocks are also charged.
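      The lookup-and-share scheme can be sketched with a tiny in-memory table.
      This is an illustration under assumptions, not the ext4 implementation:
      struct sketch_ea_inode, sketch_table and the functions are hypothetical,
      FNV-1a stands in for crc32c, and the table is filled when a value is
      created, whereas the text above describes hashes being added when an
      xattr inode is read. Releasing the slot when the count reaches zero is
      likewise an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_SLOTS 8
#define SKETCH_MAX_VALUE 64

/* Hypothetical stand-in for an xattr inode holding a single value. */
struct sketch_ea_inode {
    int in_use;
    uint32_t hash;
    uint32_t ref_count;
    size_t len;
    unsigned char value[SKETCH_MAX_VALUE];
};

static struct sketch_ea_inode sketch_table[SKETCH_SLOTS];

/* FNV-1a, used here only as a stand-in for crc32c. */
static uint32_t sketch_hash(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Stores a value and returns its slot. If an identical value is already
 * present, its ref count is incremented and the existing slot is shared.
 * Returns -1 if the value is too large or the table is full.
 */
static int sketch_value_set(const void *data, size_t len)
{
    uint32_t h;
    int free_slot = -1;

    assert(data != NULL);
    if (len > SKETCH_MAX_VALUE)
        return -1;
    h = sketch_hash(data, len);
    for (int i = 0; i < SKETCH_SLOTS; i++) {
        struct sketch_ea_inode *n = &sketch_table[i];

        if (!n->in_use) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        /* A hash match is only a hint; compare the bytes before sharing. */
        if (n->hash == h && n->len == len &&
            memcmp(n->value, data, len) == 0) {
            n->ref_count++;
            return i;
        }
    }
    if (free_slot < 0)
        return -1;
    sketch_table[free_slot] = (struct sketch_ea_inode){
        .in_use = 1, .hash = h, .ref_count = 1, .len = len,
    };
    memcpy(sketch_table[free_slot].value, data, len);
    return free_slot;
}

/* Drops one reference; the slot is released when the count reaches zero. */
static void sketch_value_remove(int slot)
{
    struct sketch_ea_inode *n;

    assert(slot >= 0 && slot < SKETCH_SLOTS);
    n = &sketch_table[slot];
    assert(n->in_use && n->ref_count > 0);
    if (--n->ref_count == 0)
        n->in_use = 0;
}
```

      Setting the same value twice returns the same slot with a ref count of
      2, so only one copy is kept while both holders retain a reference.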
      [ Fixed up journal credits calculation to handle inline data and the
        rare case where a shared xattr block can get freed when two threads
        race on breaking the xattr block sharing. --tytso ]
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: cleanup transaction restarts during inode deletion · 30a7eb97
      Tahsin Erdogan authored
      During inode deletion, the number of journal credits that will be
      needed is hard to determine.  For that reason we have journal
      extend/restart calls in several places.  Whenever a transaction is
      restarted, filesystem must be in a consistent state because there is
      no atomicity guarantee beyond a restart call.
      Add ext4_xattr_ensure_credits() helper function which takes care of
      journal extend/restart logic.  It also handles getting jbd2 write
      access and dirty metadata calls.  This function is called at every
      iteration of handling an ea_inode reference.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext2, ext4: make mb block cache names more explicit · 47387409
      Tahsin Erdogan authored
      There will be a second mb_cache instance that tracks ea_inodes. Make
      existing names more explicit so that it is clear that they refer to
      xattr block cache.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • mbcache: make mbcache naming more generic · c07dfcb4
      Tahsin Erdogan authored
      Make names more generic so that mbcache usage is not limited to
      block sharing. In a subsequent patch in the series
      ("ext4: xattr inode deduplication"), we start using the mbcache code
      for sharing xattr inodes. With that patch, old mb_cache_entry.e_block
      field could be holding either a block number or an inode number.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: modify ext4_xattr_ino_array to hold struct inode * · 0421a189
      Tahsin Erdogan authored
      Tracking struct inode * rather than the inode number eliminates the
      repeated ext4_xattr_inode_iget() call later. The second call cannot
      fail in practice but still requires explanation when it wants to ignore
      the return value. Avoid the trouble and make things simple.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: improve journal credit handling in set xattr paths · c1a5d5f6
      Tahsin Erdogan authored
      Both ext4_set_acl() and ext4_set_context() need to be made aware of
      ea_inode feature when it comes to credits calculation.
      Also add a sufficient credits check in ext4_xattr_set_handle() right
      after xattr write lock is grabbed. Original credits calculation is done
      outside the lock so there is a possibility that the initially calculated
      credits are not sufficient anymore.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: ext4_xattr_delete_inode() should return accurate errors · 65d30005
      Tahsin Erdogan authored
      In a few places the function returns without trying to pass the actual
      error code to the caller. Fix those.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: retry storing value in external inode with xattr block too · b347e2bc
      Tahsin Erdogan authored
      When value size is <= EXT4_XATTR_MIN_LARGE_EA_SIZE(), and it
      doesn't fit in either inline or xattr block, a second try is made to
      store it in an external inode while storing the entry itself in inline
      area. There should also be an attempt to store the entry in xattr block.
      This patch adds a retry loop to do that. It also makes the caller the
      sole decider on whether to store a value in an external inode.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix credits calculation for xattr inode · b3155298
      Tahsin Erdogan authored
      When there is no space for a value in xattr block, it may be stored
      in an xattr inode even if the value length is less than
      EXT4_XATTR_MIN_LARGE_EA_SIZE(). So the current assumption in credits
      calculation is wrong.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix ext4_xattr_cmp() · 7cec1918
      Tahsin Erdogan authored
      When a xattr entry refers to an external inode, the value data is not
      available in the inline area so we should not attempt to read it using
      value offset.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix ext4_xattr_move_to_block() · f6109100
      Tahsin Erdogan authored
      When moving xattr entries from inline area to a xattr block, entries
      that refer to external xattr inodes need special handling because
      value data is not available in the inline area but rather should be
      read from its external inode.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: fix ext4_xattr_make_inode_space() value size calculation · 9bb21ced
      Tahsin Erdogan authored
      ext4_xattr_make_inode_space() is interested in calculating the inline
      space used in an inode. When a xattr entry refers to an external inode
      the value size indicates the external inode size, not the value size in
      the inline area. Change the function to take this into account.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: ext4_xattr_value_same() should return false for external data · 0bd454c0
      Tahsin Erdogan authored
      ext4_xattr_value_same() is used as a quick optimization in case the new
      xattr value is identical to the previous value. When xattr value is
      stored in a xattr inode the check becomes expensive so it is better to
      just assume that they are not equal.
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    • ext4: add missing le32_to_cpu(e_value_inum) conversions · 990461dd
      Tahsin Erdogan authored
      Two places in code missed converting xattr inode number using
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>