  1. 05 Mar, 2014 3 commits
    • dm thin: fix deadlock in __requeue_bio_list · 18adc577
      Joe Thornber authored
      
      
      The spin lock in requeue_io() was held for too long, allowing deadlock.
      Don't worry, due to other issues addressed in the following "dm thin:
      fix noflush suspend IO queueing" commit, this code was never called.
      
      Fix this by taking the spin lock for a much shorter period of time.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      18adc577
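
      The idea of holding a lock only long enough to detach a list can be
      sketched in userspace C. This is an illustration under stated
      assumptions, not the dm-thin code: struct item, struct queue,
      toy_lock(), toy_unlock() and drain_queue() are hypothetical names, and
      a C11 atomic_flag stands in for the kernel spin lock.

      ```c
      #include <assert.h>
      #include <stdatomic.h>
      #include <stddef.h>

      /* Hypothetical stand-ins; not the kernel's types. */
      struct item {
          struct item *next;
          int requeued;
      };

      struct queue {
          atomic_flag lock;     /* toy spin lock, stands in for a kernel spinlock */
          struct item *head;    /* singly linked list protected by the lock */
      };

      static void toy_lock(atomic_flag *l)
      {
          while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
              ; /* spin until the flag was previously clear */
      }

      static void toy_unlock(atomic_flag *l)
      {
          atomic_flag_clear_explicit(l, memory_order_release);
      }

      /*
       * Detach the whole list while holding the lock, then process the
       * detached items with the lock released.  Returns the number processed.
       */
      static int drain_queue(struct queue *q)
      {
          struct item *local;
          int n = 0;

          assert(q != NULL);

          toy_lock(&q->lock);
          local = q->head;      /* grab everything ... */
          q->head = NULL;       /* ... and leave the shared list empty */
          toy_unlock(&q->lock);

          /* Per-item work happens outside the lock. */
          for (struct item *it = local; it != NULL; it = it->next) {
              it->requeued = 1;
              n++;
          }
          return n;
      }
      ```

      Because the per-item work runs after toy_unlock(), any lock that work
      needs is never taken while the queue lock is held.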
    • dm thin: fix out of data space handling · 3e1a0699
      Joe Thornber authored
      
      
      Ideally a thin pool would never run out of data space; the low water
      mark would trigger userland to extend the pool before we completely run
      out of space.  However, many small random IOs to unprovisioned space can
      consume data space at an alarming rate.  Adjust your low water mark if
      you're frequently seeing "out-of-data-space" mode.
      
      Before this fix, if data space ran out the pool would be put in
      PM_READ_ONLY mode which also aborted the pool's current metadata
      transaction (data loss for any changes in the transaction).  This had a
      side-effect of needlessly compromising data consistency.  And retry of
      queued unserviceable bios, once the data pool was resized, could
      initiate changes to potentially inconsistent pool metadata.
      
      Now, when the pool's data space is exhausted, the pool transitions to
      a new pool mode (PM_OUT_OF_DATA_SPACE) that allows metadata to be
      changed but does not allow data to be allocated.  This allows users to
      remove thin volumes or discard data to recover data space.
      
      The pool is no longer put in PM_READ_ONLY mode in response to the pool
      running out of data space.  And PM_READ_ONLY mode no longer aborts the
      pool's current metadata transaction.  Also, set_pool_mode() will now
      notify userspace when the pool mode is changed.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      3e1a0699
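
      What each pool mode permits can be summarised in a small sketch.
      PM_READ_ONLY and PM_OUT_OF_DATA_SPACE are named in the message above;
      PM_WRITE is an assumed name for the normal read-write mode, the rule
      that read-only mode refuses metadata changes is likewise an
      assumption, and the helper functions are hypothetical.

      ```c
      #include <assert.h>
      #include <stdbool.h>

      /* PM_WRITE is an assumed name; the other two come from the commit text. */
      enum pool_mode {
          PM_WRITE,
          PM_OUT_OF_DATA_SPACE,
          PM_READ_ONLY,
      };

      /* May pool metadata be changed (e.g. remove a thin volume, discard)? */
      static bool metadata_changes_allowed(enum pool_mode m)
      {
          /* Assumption: read-only mode refuses metadata changes. */
          return m == PM_WRITE || m == PM_OUT_OF_DATA_SPACE;
      }

      /* May new data blocks be allocated? */
      static bool data_allocation_allowed(enum pool_mode m)
      {
          return m == PM_WRITE;
      }

      /* Mode the pool moves to when an allocation finds no free data blocks. */
      static enum pool_mode on_data_space_exhausted(enum pool_mode m)
      {
          assert(m == PM_WRITE || m == PM_OUT_OF_DATA_SPACE || m == PM_READ_ONLY);
          /* Only a writable pool changes mode; it no longer goes read-only. */
          return m == PM_WRITE ? PM_OUT_OF_DATA_SPACE : m;
      }
      ```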
    • dm thin: ensure user takes action to validate data and metadata consistency · 07f2b6e0
      Mike Snitzer authored
      
      
      If a thin metadata operation fails, the current transaction will abort,
      which may cause IO layers up the stack (e.g. filesystems) to lose
      data.  As such, set THIN_METADATA_NEEDS_CHECK_FLAG in the thin
      metadata's superblock, which:
      1) requires the user verify the thin metadata is consistent (e.g. use
         thin_check, etc)
      2) suggests the user verify the thin data is consistent (e.g. use fsck)
      
      The only way to clear the superblock's THIN_METADATA_NEEDS_CHECK_FLAG is
      to run thin_repair.
      
      On metadata operation failure: abort current metadata transaction, set
      pool in read-only mode, and now set the needs_check flag.
      
      As part of this change, constraints are introduced or relaxed:
      * don't allow a pool to transition to write mode if needs_check is set
      * don't allow data or metadata space to be resized if needs_check is set
      * if a thin pool's metadata space is exhausted: the kernel will now
        force the user to take the pool offline for repair before the kernel
        will allow the metadata space to be extended.
      
      Also, update Documentation to include information about when the thin
      provisioning target commits metadata, how it handles metadata failures
      and running out of space.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      07f2b6e0
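
      The constraints listed above amount to a guard on the needs_check
      flag. A minimal sketch, assuming a hypothetical struct pool_state and
      hypothetical helper names (none of these are the kernel's own):

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical in-memory view of the pool; not the kernel's structures. */
      struct pool_state {
          bool needs_check;   /* mirrors THIN_METADATA_NEEDS_CHECK_FLAG */
          bool read_only;
      };

      /* Metadata operation failed: go read-only and record that a check is due. */
      static void on_metadata_failure(struct pool_state *p)
      {
          assert(p != NULL);
          p->read_only = true;
          p->needs_check = true;
      }

      /* Returns true if the pool was switched to write mode. */
      static bool try_enter_write_mode(struct pool_state *p)
      {
          assert(p != NULL);
          if (p->needs_check)
              return false;   /* refused until the metadata has been repaired */
          p->read_only = false;
          return true;
      }

      /* Returns true if a data or metadata resize may proceed. */
      static bool resize_allowed(const struct pool_state *p)
      {
          assert(p != NULL);
          return !p->needs_check;
      }
      ```

      In this sketch, clearing needs_check models a successful thin_repair
      run; nothing else resets the flag.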
  2. 04 Mar, 2014 1 commit
    • dm thin: synchronize the pool mode during suspend · cdc2b415
      Mike Snitzer authored

      Commit b5330655 ("dm thin: handle metadata failures more consistently")
      increased potential for the pool's mode to be changed in response to
      metadata operation failures.
      
      When the pool mode is changed it isn't synchronized with the mode in
      pool_features stored in the target's context (ti->private) that is used
      as the basis for (re)establishing the pool mode during resume via
      bind_control_target.
      
      It is important that we synchronize the pool mode when it is changed;
      otherwise the pool may experience an unexpected mode transition on the
      next resume (especially if there was no new table load).
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
      cdc2b415
  3. 03 Mar, 2014 2 commits
    • dm snapshot: fix metadata corruption · 2c945820
      Mikulas Patocka authored

      Commit 55494bf2 ("dm snapshot: use dm-bufio") broke snapshots.
      Before that 3.14-rc1 commit, loading a snapshot's list of exceptions
      involved reading exception areas one by one into ps->area and inserting
      those exceptions into the hash table.  Commit 55494bf2 changed it so
      that dm-bufio with prefetch is used to load exceptions in batches.
      Exceptions are loaded correctly, but ps->area is left uninitialized.
      When a new exception is allocated, it is stored in this uninitialized
      ps->area which will be written to the disk.  This causes metadata
      corruption.
      
      Fix this corruption by copying the last area that was read via dm-bufio
      into ps->area.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      2c945820
    • dm: fix Kconfig indentation · c64d240d
      Mike Snitzer authored
      
      
      Since DM_DEBUG_BLOCK_STACK_TRACING is a DM_PERSISTENT_DATA config option
      move it from drivers/md/Kconfig to drivers/md/persistent-data/Kconfig.
      
      Doing so fixes indentation for other DM config options.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      c64d240d
  4. 28 Feb, 2014 2 commits
    • dm cache mq: fix memory allocation failure for large cache devices · 14f398ca
      Heinz Mauelshagen authored
      
      
      The memory allocated for the multiqueue policy's hash table doesn't need
      to be physically contiguous.  Use vzalloc() instead of kzalloc().
      Fedora has been carrying this fix since 10/10/2013.
      
      Failure seen during creation of a 10TB cached device with a 2048 sector
      block size and 411GB cache size:
      
       dmsetup: page allocation failure: order:9, mode:0x10c0d0
       CPU: 11 PID: 29235 Comm: dmsetup Not tainted 3.10.4 #3
       Hardware name: Supermicro X8DTL/X8DTL, BIOS 2.1a       12/30/2011
        000000000010c0d0 ffff880090941898 ffffffff81387ab4 ffff880090941928
        ffffffff810bb26f 0000000000000009 000000000010c0d0 ffff880090941928
        ffffffff81385dbc ffffffff815f3840 ffffffff00000000 000002000010c0d0
       Call Trace:
        [<ffffffff81387ab4>] dump_stack+0x19/0x1b
        [<ffffffff810bb26f>] warn_alloc_failed+0x110/0x124
        [<ffffffff81385dbc>] ? __alloc_pages_direct_compact+0x17c/0x18e
        [<ffffffff810bda2e>] __alloc_pages_nodemask+0x6c7/0x75e
        [<ffffffff810bdad7>] __get_free_pages+0x12/0x3f
        [<ffffffff810ea148>] kmalloc_order_trace+0x29/0x88
        [<ffffffff810ec1fd>] __kmalloc+0x36/0x11b
        [<ffffffffa031eeed>] ? mq_create+0x1dc/0x2cf [dm_cache_mq]
        [<ffffffffa031efc0>] mq_create+0x2af/0x2cf [dm_cache_mq]
        [<ffffffffa0314605>] dm_cache_policy_create+0xa7/0xd2 [dm_cache]
        [<ffffffffa0312530>] ? cache_ctr+0x245/0xa13 [dm_cache]
        [<ffffffffa031263e>] cache_ctr+0x353/0xa13 [dm_cache]
        [<ffffffffa012b916>] dm_table_add_target+0x227/0x2ce [dm_mod]
        [<ffffffffa012e8e4>] table_load+0x286/0x2ac [dm_mod]
        [<ffffffffa012e65e>] ? dev_wait+0x8a/0x8a [dm_mod]
        [<ffffffffa012e324>] ctl_ioctl+0x39a/0x3c2 [dm_mod]
        [<ffffffffa012e35a>] dm_ctl_ioctl+0xe/0x12 [dm_mod]
        [<ffffffff81101181>] vfs_ioctl+0x21/0x34
        [<ffffffff811019d3>] do_vfs_ioctl+0x3b1/0x3f4
        [<ffffffff810f4d2e>] ? ____fput+0x9/0xb
        [<ffffffff81050b6c>] ? task_work_run+0x7e/0x92
        [<ffffffff81101a68>] SyS_ioctl+0x52/0x82
        [<ffffffff81391d92>] system_call_fastpath+0x16/0x1b
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
      14f398ca
    • dm cache: fix truncation bug when mapping I/O to >2TB fast device · e0d849fa
      Heinz Mauelshagen authored
      
      
      When remapping a block to the cache's fast device that is larger than
      2TB we must not truncate the destination sector to 32bits.  The 32bit
      temporary result of from_cblock() was being overflowed in
      remap_to_cache() due to the logical left shift.
      
      Use an intermediate 64bit type to store the 32bit from_cblock() result
      to fix the overflow.
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
      e0d849fa
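
      The overflow described above is the usual "shift before widening"
      mistake. The sketch below reproduces it in plain C with hypothetical
      function names; it is not the remap_to_cache() code, and the shift of
      11 (2048 sectors per block) used in the usage note is only an example
      value.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /* Buggy form: the shift is evaluated in 32 bits, so high bits are lost. */
      static uint64_t block_to_sector_truncating(uint32_t cblock, unsigned shift)
      {
          assert(shift < 32);
          uint32_t tmp = cblock << shift;   /* wraps modulo 2^32 */
          return tmp;
      }

      /* Fixed form: widen to 64 bits first, then shift. */
      static uint64_t block_to_sector(uint32_t cblock, unsigned shift)
      {
          assert(shift < 32);
          uint64_t block = cblock;          /* intermediate 64-bit copy */
          return block << shift;
      }
      ```

      With 2048-sector blocks, block number 1 << 21 starts at sector 2^32,
      which is exactly the 2TB boundary for 512-byte sectors: the truncating
      version yields 0, the widened version yields 4294967296.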
  5. 27 Feb, 2014 1 commit
    • dm thin: allow metadata space larger than supported to go unused · 7d48935e
      Mike Snitzer authored
      
      
      It was always intended that a user could provide a thin metadata device
      that is larger than the max supported by the on-disk format.  The extra
      space would just go unused.
      
      Unfortunately that never worked.  If the user attempted to use a larger
      metadata device on creation they would get an error like the following:
      
       device-mapper: space map common: space map too large
       device-mapper: transaction manager: couldn't create metadata space map
       device-mapper: thin metadata: tm_create_with_sm failed
       device-mapper: table: 252:17: thin-pool: Error creating metadata object
       device-mapper: ioctl: error adding target to table
      
      Fix this by allowing the initial metadata space map creation to cap its
      size at the max number of blocks supported (DM_SM_METADATA_MAX_BLOCKS).
      get_metadata_dev_size() must also impose DM_SM_METADATA_MAX_BLOCKS (via
      THIN_METADATA_MAX_SECTORS), otherwise extending metadata would cap at
      THIN_METADATA_MAX_SECTORS_WARNING (which is larger than supported).
      
      Also, the calculation for THIN_METADATA_MAX_SECTORS didn't account for
      the size of the disk_bitmap_header.  So the supported maximum metadata
      size is a bit smaller (reduced from 33423360 to 33292800 sectors).
      
      Lastly, remove the "excess space will not be used" warning message from
      get_metadata_dev_size(); it resulted in printing the warning multiple
      times.  Factor out warn_if_metadata_device_too_big(), call it from
      pool_ctr() and maybe_resize_metadata_dev().
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
      7d48935e
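
      The two sector counts quoted above can be reproduced with simple
      arithmetic. Only the totals (33423360 and 33292800) come from the
      commit message; the factors below are an assumed decomposition that
      happens to produce them, and all identifiers are hypothetical.

      ```c
      #include <assert.h>
      #include <stdint.h>

      /*
       * Assumed decomposition (not quoted from the commit):
       *   255 index entries, one bitmap block each;
       *   a bitmap block tracks 1 << 14 blocks if the header is ignored,
       *   and 64 fewer once the header's space is subtracted;
       *   one 4 KiB metadata block is 8 sectors of 512 bytes.
       */
      enum {
          INDEX_ENTRIES      = 255,
          ENTRIES_PER_BITMAP = 1 << 14,
          HEADER_ENTRIES     = 64,
          SECTORS_PER_BLOCK  = 8,
      };

      static_assert(ENTRIES_PER_BITMAP > HEADER_ENTRIES,
                    "header must be smaller than a bitmap block");

      /* Maximum computed without accounting for the bitmap header. */
      static uint64_t old_max_sectors(void)
      {
          return (uint64_t)INDEX_ENTRIES * ENTRIES_PER_BITMAP * SECTORS_PER_BLOCK;
      }

      /* Maximum once the header is accounted for. */
      static uint64_t new_max_sectors(void)
      {
          return (uint64_t)INDEX_ENTRIES * (ENTRIES_PER_BITMAP - HEADER_ENTRIES)
                 * SECTORS_PER_BLOCK;
      }

      /* Space beyond the supported maximum is simply left unused. */
      static uint64_t usable_metadata_sectors(uint64_t dev_sectors)
      {
          uint64_t max = new_max_sectors();
          return dev_sectors > max ? max : dev_sectors;
      }
      ```

      Capping rather than rejecting is the behaviour the commit describes:
      an oversized metadata device is accepted and the excess is ignored.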
  6. 26 Feb, 2014 1 commit
  7. 24 Feb, 2014 1 commit
    • dm thin: fix the error path for the thin device constructor · 1acacc07
      Mike Snitzer authored
      
      
      dm_pool_close_thin_device() must be called if dm_set_target_max_io_len()
      fails in thin_ctr().  Otherwise __pool_destroy() will fail because the
      pool will still have an open thin device:
      
       device-mapper: thin metadata: attempt to close pmd when 1 device(s) are still open
       device-mapper: thin: __pool_destroy: dm_pool_metadata_close() failed.
      
      Also, must establish error code if failing thin_ctr() because the pool
      is in fail_io mode.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
      Cc: stable@vger.kernel.org
      1acacc07
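
      The shape of the fix is the familiar goto-based unwind: every failure
      after a successful open must close what was opened. A minimal sketch
      with hypothetical stand-ins (struct pool_md, open_thin_device(),
      close_thin_device(), set_max_io_len(), thin_ctr_sketch()); it does not
      reproduce the real thin_ctr().

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical stand-in for the pool metadata and its open-device count. */
      struct pool_md {
          int open_devices;
      };

      static int open_thin_device(struct pool_md *md)
      {
          md->open_devices++;
          return 0;
      }

      static void close_thin_device(struct pool_md *md)
      {
          md->open_devices--;
      }

      /* Stand-in for a setup step that can fail; fails when max_io_len is 0. */
      static int set_max_io_len(unsigned max_io_len)
      {
          return max_io_len == 0 ? -1 : 0;
      }

      /* Constructor sketch: a failure after the open must undo the open. */
      static int thin_ctr_sketch(struct pool_md *md, unsigned max_io_len)
      {
          int r;

          assert(md != NULL);

          r = open_thin_device(md);
          if (r)
              return r;

          r = set_max_io_len(max_io_len);
          if (r)
              goto bad_close;   /* without this, the device would stay open */

          return 0;

      bad_close:
          close_thin_device(md);
          return r;             /* always hand a real error code back */
      }
      ```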
  8. 18 Feb, 2014 1 commit
  9. 17 Feb, 2014 4 commits
    • dm io: fix I/O to multiple destinations · d73f9907
      Mikulas Patocka authored

      Commit 003b5c57 ("block: Convert drivers to immutable biovecs") broke
      dm-mirror due to dm-io breakage.
      
      dm-io had three possible iterators (DM_IO_PAGE_LIST, DM_IO_BVEC,
      DM_IO_VMA) that iterate over pages where the I/O should be performed.
      
      The switch to immutable biovecs changed the DM_IO_BVEC iterator to
      DM_IO_BIO.  Before this change the iterator stored the pointer to a bio
      vector in the dpages structure.  The iterator incremented the pointer in
      the dpages structure as it advanced over the pages.  After the immutable
      biovecs change, the DM_IO_BIO iterator stores a pointer to the bio in
      the dpages structure and uses bio_advance to change the bio as it
      advances.
      
      The problem is that the function dispatch_io stores the content of the
      dpages structure into the variable old_pages and restores it before
      issuing I/O to each of the devices.  Before the change, the statement
      "*dp = old_pages;" restored the iterator to its starting position.
      After the change, struct dpages holds a pointer to the bio, thus the
      statement "*dp = old_pages;" doesn't restore the iterator.
      
      Consequently, in the context of dm-mirror: only the first mirror leg is
      written correctly, the kernel locks up when trying to write the other
      mirror legs because the number of sectors to write in the where->count
      variable doesn't match the number of sectors returned by the iterator.
      
      This patch fixes the bug by partially reverting the original patch - it
      changes the code so that struct dpages holds a pointer to the bio vector,
      so that the statement "*dp = old_pages;" restores the iterator correctly.
      
      The field "context_u" holds the offset from the beginning of the current
      bio vector entry, just like the "bio->bi_iter.bi_bvec_done" field.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      d73f9907
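
      Why "*dp = old_pages;" stopped rewinding the iterator can be shown
      with a toy model: copying a struct that holds a pointer copies the
      pointer, not the state behind it. All types and functions below are
      hypothetical and only model the two arrangements described above.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Shared cursor, standing in for position state kept inside a bio. */
      struct cursor {
          size_t pos;
      };

      /* Iterator that only points at shared state (the broken arrangement). */
      struct iter_by_pointer {
          struct cursor *cur;
      };

      /* Iterator that carries its own position (the restored arrangement). */
      struct iter_by_value {
          size_t pos;
      };

      /* Walk n units for each of dests destinations, rewinding by struct copy. */
      static size_t final_pos_pointer(struct cursor *c, int dests, size_t n)
      {
          assert(c != NULL);

          struct iter_by_pointer it = { c };
          struct iter_by_pointer saved = it;  /* snapshot taken before any I/O */

          for (int i = 0; i < dests; i++) {
              it = saved;                     /* copies the pointer only */
              it.cur->pos += n;               /* advances the shared cursor */
          }
          return c->pos;
      }

      static size_t final_pos_value(int dests, size_t n)
      {
          struct iter_by_value it = { 0 };
          struct iter_by_value saved = it;    /* snapshot taken before any I/O */

          for (int i = 0; i < dests; i++) {
              it = saved;                     /* copies the position itself */
              it.pos += n;
          }
          return it.pos;
      }
      ```

      With three destinations of 8 units each, the pointer-based iterator
      ends at 24 because it is never rewound, while the value-based one ends
      at 8 because every destination starts again from the saved position.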
    • dm thin: avoid metadata commit if a pool's thin devices haven't changed · 4d1662a3
      Mike Snitzer authored

      Commit 905e51b3 ("dm thin: commit outstanding data every second")
      introduced a periodic commit.  This commit occurs regardless of whether
      any thin devices have made changes.
      
      Fix the periodic commit to check if any of a pool's thin devices have
      changed using dm_pool_changed_this_transaction().
      Reported-by: Alexander Larsson <alexl@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
      Cc: stable@vger.kernel.org
      4d1662a3
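
      The guard on the periodic commit can be sketched as follows. The
      struct and function are hypothetical; the changed field stands in for
      whatever dm_pool_changed_this_transaction() reports, and resetting it
      after a commit is an assumption about how a new transaction starts.

      ```c
      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical pool state for illustration only. */
      struct pool_sketch {
          bool changed;   /* has any thin device changed in this transaction? */
          int commits;    /* number of metadata commits issued so far */
      };

      /* Called from the periodic timer; returns true if a commit was issued. */
      static bool periodic_commit(struct pool_sketch *p)
      {
          assert(p != NULL);
          if (!p->changed)
              return false;     /* nothing to write, skip the commit */
          p->commits++;
          p->changed = false;   /* assumption: a new transaction starts clean */
          return true;
      }
      ```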
    • dm cache: do not add migration to completed list before unhooking bio · 80ae49aa
      Mike Snitzer authored
      
      
      When completing an overwrite bio, in overwrite_endio(), the associated
      migration should not be added to the 'completed_migrations' until the
      bio's fields are restored with dm_unhook_bio().
      
      Otherwise, do_worker() can race to process 'completed_migrations' before
      dm_unhook_bio() -- so the bio's bi_end_io is incorrect.  This is
      unlikely to cause any problems given the current code but should be
      fixed on the basis of correctness.
      
      Also, the cache's spinlock only needs to be held when manipulating the
      'completed_migrations' list -- other changes don't need protection.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
      80ae49aa
    • dm cache: move hook_info into common portion of per_bio_data structure · c6eda5e8
      Mike Snitzer authored

      Commit c9d28d5d ("dm cache: promotion optimisation for writes")
      incorrectly placed the 'hook_info' member in the writethrough-only
      portion of the per_bio_data structure.
      
      Given that the overwrite optimization may be used for writeback the
      'hook_info' member must be placed above the 'cache' member of the
      per_bio_data structure.  Any members above 'cache' are available from
      both writeback and writethrough modes' per_bio_data structure.
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
      Cc: stable@vger.kernel.org # 3.13+
      c6eda5e8
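
      The layout rule can be illustrated with offsetof(). The structures
      below are simplified and hypothetical; only the ordering of hook_info
      relative to cache reflects the description above, and the size macros
      are assumptions about how a prefix-sized allocation could be expressed.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical, simplified members; only the ordering matters here. */
      struct hook_info_sketch {
          void (*saved_end_io)(void *);
          void *saved_private;
      };

      struct per_bio_data_sketch {
          /* Common portion: present in both writeback and writethrough modes. */
          int tick;
          struct hook_info_sketch hook_info;

          /* Writethrough-only portion: everything from cache onwards. */
          void *cache;
          unsigned long cblock;
      };

      /* Writeback bios get only the bytes that precede cache. */
      #define PB_SIZE_WRITEBACK    offsetof(struct per_bio_data_sketch, cache)
      #define PB_SIZE_WRITETHROUGH sizeof(struct per_bio_data_sketch)

      /* hook_info must lie entirely inside the writeback-sized prefix. */
      static_assert(offsetof(struct per_bio_data_sketch, hook_info)
                    + sizeof(struct hook_info_sketch) <= PB_SIZE_WRITEBACK,
                    "hook_info must sit above cache");
      ```

      Moving hook_info below cache in this sketch makes the static_assert
      fail at compile time, which is the kind of misplacement the commit
      corrects.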
  10. 16 Feb, 2014 8 commits
    • Linux 3.14-rc3 · 6d0abeca
      Linus Torvalds authored
      6d0abeca
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 3962dfbe
      Linus Torvalds authored

      Pull btrfs fixes from Chris Mason:
       "We have a small collection of fixes in my for-linus branch.
      
        The big thing that stands out is a revert of a new ioctl.  Users
        haven't shipped yet in btrfs-progs, and Dave Sterba found a better way
        to export the information"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        Btrfs: use right clone root offset for compressed extents
        btrfs: fix null pointer deference at btrfs_sysfs_add_one+0x105
        Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol
        Btrfs: fix max_inline mount option
        Btrfs: fix a lockdep warning when cleaning up aborted transaction
        Revert "btrfs: add ioctl to export size of global metadata reservation"
      3962dfbe
    • Merge tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 4302a875
      Linus Torvalds authored

      Pull devicetree fixes from Rob Herring:
       "Fix booting on PPC boards.  Changes to of_match_node matching caused
        the serial port on some PPC boards to stop working.  Reverted the
        change and reimplement to split matching between new style compatible
        only matching and fallback to old matching algorithm"
      
      * tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        of: search the best compatible match first in __of_match_node()
        Revert "OF: base: match each node compatible against all given matches first"
      4302a875
    • of: search the best compatible match first in __of_match_node() · 06b29e76
      Kevin Hao authored
      
      
      Currently, of_match_node compares each given match against all node's
      compatible strings with of_device_is_compatible.
      
      To achieve multiple compatible strings per node with ordering from
      specific to generic, this requires given matches to be ordered from
      specific to generic. For most of the drivers this is not true and also
      an alphabetical ordering is more sane there.
      
      Therefore, this patch introduces a function to match each of the node's
      compatible strings against all given compatible matches without type and
      name first, before checking the next compatible string. This implies
      that node's compatibles are ordered from specific to generic while
      given matches can be in any order. If we fail to find such a match
      entry, then fall-back to the old method in order to keep compatibility.
      
      Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Signed-off-by: Kevin Hao <haokexin@gmail.com>
      Tested-by: Stephen Chivers <schivers@csc.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      06b29e76
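
      The matching order described above puts the node's compatible strings
      in the outer loop. A minimal sketch, assuming exact string comparison
      and NULL-terminated arrays; it leaves out the type and name checks and
      the fall-back to the old method, and best_compatible_match() is a
      hypothetical name.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <string.h>

      /*
       * Return the index into matches chosen for a node, or -1 if none.
       * node_compat is ordered from most specific to most generic;
       * matches may be in any order.  Both arrays are NULL-terminated.
       */
      static int best_compatible_match(const char *const *node_compat,
                                       const char *const *matches)
      {
          assert(node_compat != NULL && matches != NULL);

          /* Outer loop over the node's strings, so specificity wins. */
          for (size_t i = 0; node_compat[i] != NULL; i++)
              for (size_t j = 0; matches[j] != NULL; j++)
                  if (strcmp(node_compat[i], matches[j]) == 0)
                      return (int)j;
          return -1;
      }
      ```

      For a node listing a specific string followed by a generic one, a
      driver table that lists the generic entry first still resolves to the
      specific entry, because the node's first string is tried against the
      whole table before its second string is considered.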
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 946dd683
      Linus Torvalds authored

      Pull SCSI target fixes from Nicholas Bellinger:
       "Mostly minor fixes this time to v3.14-rc1 related changes.  Also
        included is one fix for a free after use regression in persistent
        reservations UNREGISTER logic that is CC'ed to >= v3.11.y stable"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
        Target/sbc: Fix protection copy routine
        IB/srpt: replace strict_strtoul() with kstrtoul()
        target: Simplify command completion by removing CMD_T_FAILED flag
        iser-target: Fix leak on failure in isert_conn_create_fastreg_pool
        iscsi-target: Fix SNACK Type 1 + BegRun=0 handling
        target: Fix missing length check in spc_emulate_evpd_83()
        qla2xxx: Remove last vestiges of qla_tgt_cmd.cmd_list
        target: Fix 32-bit + CONFIG_LBDAF=n link error w/ sector_div
        target: Fix free-after-use regression in PR unregister
      946dd683
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 2d0ef4fb
      Linus Torvalds authored

      Pull i2c fixes from Wolfram Sang:
       "i2c has a bugfix and documentation improvements for you"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        Documentation: i2c: mention ACPI method for instantiating devices
        Documentation: i2c: describe devicetree method for instantiating devices
        i2c: mv64xxx: refactor message start to ensure proper initialization
      2d0ef4fb
    • Merge branches 'irq-urgent-for-linus' and 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5a667a0c
      Linus Torvalds authored
      
      Pull irq update from Thomas Gleixner:
       "Fix from the urgent branch: a trivial oneliner adding the missing
        Kconfig dependency curing build failures which have been discovered by
        several build robots.
      
        The update in the irq-core branch provides a new function in the
        irq/devres code, which is a prerequisite for driver developers to get
        rid of boilerplate code all over the place.
      
        Not a bugfix, but it has zero impact on the current kernel due to the
        lack of users.  It's simpler to provide the infrastructure to
        interested parties via your tree than fulfilling the wishlist of
        driver maintainers on which particular commit or tag this should be
        based on"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Add devm_request_any_context_irq()
      5a667a0c
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3a19c07c
      Linus Torvalds authored

      Pull timer fixes from Thomas Gleixner:
       "The following trilogy of patches brings you:
      
         - fix for a long standing math overflow issue with HZ < 60
      
         - a one-liner fix for a corner case in the dreaded tick broadcast
           mechanism affecting a certain range of AMD machines which are
           infested with the infamous automagic C1E power control misfeature
      
         - a fix for one of the ARM platforms which allows the kernel to
           proceed and boot instead of stupidly panicking for no good reason.
           The patch is slightly larger than necessary, but it's less ugly
           than the alternative 5 liner"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tick: Clear broadcast pending bit when switching to oneshot
        clocksource: Kona: Print warning rather than panic
        time: Fix overflow when HZ is smaller than 60
      3a19c07c
  11. 15 Feb, 2014 13 commits
    • Merge tag 'trace-fixes-v3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 9bd01b9b
      Linus Torvalds authored
      
      Pull two tracing fixes from Steven Rostedt:
       "Two urgent fixes in the tracing utility.
      
        The first is a fix for the way the ring buffer stores timestamps.
        After a restructure of the code was done, the ring buffer timestamp
        logic missed the fact that the first event on a sub buffer is to have
        a zero delta, as the full timestamp is stored on the sub buffer
        itself.  But because the delta was not cleared to zero, the timestamp
        for that event will be calculated as the real timestamp + the delta
        from the last timestamp.  This can skew the timestamps of the events
        and have them say they happened when they didn't really happen.
        That's bad.
      
        The second fix is for modifying the function graph caller site.  When
        the stop machine was removed from updating the function tracing code,
        it missed updating the function graph call site location.  It is still
        modified as if it is being done via stop machine.  But it's not.  This
        can lead to a GPF and kernel crash if the function graph call site
        happens to lie between cache lines and one CPU is executing it while
        another CPU is doing the update.  It would be a very hard condition to
        hit, but the result is severe enough to have it fixed ASAP"
      
      * tag 'trace-fixes-v3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace/x86: Use breakpoints for converting function graph caller
        ring-buffer: Fix first commit on sub-buffer having non-zero delta
      9bd01b9b
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7fc92804
      Linus Torvalds authored

      Pull x86 EFI fixes from Peter Anvin:
       "A few more EFI-related fixes"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/efi: Check status field to validate BGRT header
        x86/efi: Fix 32-bit fallout
      7fc92804
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 83660b73
      Linus Torvalds authored

      Pull ARM SoC fixes from Kevin Hilman:
       "A collection of ARM SoC fixes for v3.14-rc1.
      
        Mostly a collection of Kconfig, device tree data and compilation fixes
        along with a fix to drivers/phy that fixes a boot regression on some
        Marvell mvebu platforms"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        dma: mv_xor: Silence a bunch of LPAE-related warnings
        ARM: ux500: disable msp2 device tree node
        ARM: zynq: Reserve not DMAable space in front of the kernel
        ARM: multi_v7_defconfig: Select CONFIG_SOC_DRA7XX
        ARM: imx6: Initialize low-power mode early again
        ARM: pxa: fix various compilation problems
        ARM: pxa: fix compilation problem on AM300EPD board
        ARM: at91: add Atmel's SAMA5D3 Xplained board
        spi/atmel: document clock properties
        mmc: atmel-mci: document clock properties
        ARM: at91: enable USB host on at91sam9n12ek board
        ARM: at91/dt: fix sama5d3 ohci hclk clock reference
        ARM: at91/dt: sam9263: fix compatibility string for the I2C
        ata: sata_mv: Fix probe failures with optional phys
        drivers: phy: Add support for optional phys
        drivers: phy: Make NULL a valid phy reference
        ARM: fix HAVE_ARM_TWD selection for OMAP and shmobile
        ARM: moxart: move DMA_OF selection to driver
        ARM: hisi: fix kconfig warning on HAVE_ARM_TWD
      83660b73
    • Btrfs: use right clone root offset for compressed extents · 93de4ba8
      Filipe David Borba Manana authored
      
      
      For non-compressed extents, iterate_extent_inodes() gives us offsets
      that take into account the data offset from the file extent items, while
      for compressed extents it doesn't. Therefore we have to adjust them before
      placing them in a send clone instruction. Not doing this adjustment leads to
      the receiving end passing a wrong file range to the clone ioctl,
      which results in file content that differs from the original send
      root.
      
      Issue reproducible with the following excerpt from the test I made for
      xfstests:
      
        _scratch_mkfs
        _scratch_mount "-o compress-force=lzo"
      
        $XFS_IO_PROG -f -c "truncate 118811" $SCRATCH_MNT/foo
        $XFS_IO_PROG -c "pwrite -S 0x0d -b 39987 92267 39987" $SCRATCH_MNT/foo
      
        $BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/mysnap1
      
        $XFS_IO_PROG -c "pwrite -S 0x3e -b 80000 200000 80000" $SCRATCH_MNT/foo
        $BTRFS_UTIL_PROG filesystem sync $SCRATCH_MNT
        $XFS_IO_PROG -c "pwrite -S 0xdc -b 10000 250000 10000" $SCRATCH_MNT/foo
        $XFS_IO_PROG -c "pwrite -S 0xff -b 10000 300000 10000" $SCRATCH_MNT/foo
      
        # will be used for incremental send to be able to issue clone operations
        $BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/clones_snap
      
        $BTRFS_UTIL_PROG subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/mysnap2
      
        $FSSUM_PROG -A -f -w $tmp/1.fssum $SCRATCH_MNT/mysnap1
        $FSSUM_PROG -A -f -w $tmp/2.fssum -x $SCRATCH_MNT/mysnap2/mysnap1 \
            -x $SCRATCH_MNT/mysnap2/clones_snap $SCRATCH_MNT/mysnap2
        $FSSUM_PROG -A -f -w $tmp/clones.fssum $SCRATCH_MNT/clones_snap \
            -x $SCRATCH_MNT/clones_snap/mysnap1 -x $SCRATCH_MNT/clones_snap/mysnap2
      
        $BTRFS_UTIL_PROG send $SCRATCH_MNT/mysnap1 -f $tmp/1.snap
        $BTRFS_UTIL_PROG send $SCRATCH_MNT/clones_snap -f $tmp/clones.snap
        $BTRFS_UTIL_PROG send -p $SCRATCH_MNT/mysnap1 \
            -c $SCRATCH_MNT/clones_snap $SCRATCH_MNT/mysnap2 -f $tmp/2.snap
      
        _scratch_unmount
        _scratch_mkfs
        _scratch_mount
      
        $BTRFS_UTIL_PROG receive $SCRATCH_MNT -f $tmp/1.snap
        $FSSUM_PROG -r $tmp/1.fssum $SCRATCH_MNT/mysnap1 2>> $seqres.full
      
        $BTRFS_UTIL_PROG receive $SCRATCH_MNT -f $tmp/clones.snap
        $FSSUM_PROG -r $tmp/clones.fssum $SCRATCH_MNT/clones_snap 2>> $seqres.full
      
        $BTRFS_UTIL_PROG receive $SCRATCH_MNT -f $tmp/2.snap
        $FSSUM_PROG -r $tmp/2.fssum $SCRATCH_MNT/mysnap2 2>> $seqres.full
      Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      93de4ba8
    • btrfs: fix null pointer deference at btrfs_sysfs_add_one+0x105 · f085381e
      Anand Jain authored
      
      
      bdev is NULL when a disk has disappeared and the filesystem is mounted
      with the degraded option
      
      stack trace
      ---------
      btrfs_sysfs_add_one+0x105/0x1c0 [btrfs]
      open_ctree+0x15f3/0x1fe0 [btrfs]
      btrfs_mount+0x5db/0x790 [btrfs]
      ? alloc_pages_current+0xa4/0x160
      mount_fs+0x34/0x1b0
      vfs_kern_mount+0x62/0xf0
      do_mount+0x22e/0xa80
      ? __get_free_pages+0x9/0x40
      ? copy_mount_options+0x31/0x170
      SyS_mount+0x7e/0xc0
      system_call_fastpath+0x16/0x1b
      ---------
      
      reproducer:
      -------
      mkfs.btrfs -draid1 -mraid1 /dev/sdc /dev/sdd
      (detach a disk)
      devmgt detach /dev/sdc [1]
      mount -o degraded /dev/sdd /btrfs
      -------
      
      [1] github.com/anajain/devmgt.git
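
The message does not show the fix itself. As a minimal sketch of the kind of guard that avoids this class of crash, assuming the code walks a device list and touches each device's bdev, a missing device can simply be skipped. The struct and function names below are hypothetical stand-ins, not the real btrfs types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real btrfs types. */
struct block_dev { int id; };
struct device { struct block_dev *bdev; };

/*
 * Counts the devices that would get a sysfs link. A device whose bdev is
 * NULL (a missing disk on a degraded mount) is skipped, never dereferenced.
 */
static int count_linkable_devices(const struct device *devs, size_t n)
{
    int linked = 0;

    for (size_t i = 0; i < n; i++) {
        if (!devs[i].bdev)
            continue;   /* missing device: nothing to link */
        linked++;
    }
    return linked;
}
```

With one missing and one present device, only the present one is counted.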
      Signed-off-by: Anand Jain <Anand.Jain@oracle.com>
      Tested-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      f085381e
    • Wolfram Sang's avatar
      i2c: mv64xxx: refactor message start to ensure proper initialization · 79970db2
      Wolfram Sang authored
      
      
      Because the offload mechanism can fall back to a standard transfer,
      having two separate initialization states is unfortunate. Let's just
      have one state which does things consistently. This fixes a bug where
      some preparation was missing when the fallback happened. And it makes
      the code much easier to follow. To implement this, we put the check
      if offload is possible at the top of the offload setup function.
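
The pattern described above can be sketched roughly as follows. This is an illustration only, not the driver's code: the struct, its fields, the function names, and the two-message limit are all hypothetical. The point is that preparation happens once, and the feasibility check sits at the top of the offload setup so the fallback path is already initialized:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical driver state; field names are illustrative only. */
struct xfer_state {
    bool offload_supported;  /* hardware has an offload engine */
    int  num_msgs;           /* messages in this transfer */
    bool prepared;           /* common preparation has run */
    bool used_offload;       /* transfer went through the offload path */
};

/* The feasibility check is the first thing the offload setup does. */
static int offload_setup(struct xfer_state *s)
{
    if (!s->offload_supported || s->num_msgs > 2)  /* limit is an assumption */
        return -1;           /* not possible: caller falls back */
    s->used_offload = true;
    return 0;
}

/* Single entry point: both paths share the same preparation step. */
static void start_message(struct xfer_state *s)
{
    s->prepared = true;
    s->used_offload = false;
    if (offload_setup(s) != 0) {
        /* Standard transfer; nothing is missing because preparation ran. */
    }
}
```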
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Cc: stable@vger.kernel.org # v3.12+
      Fixes: 930ab3d4 (i2c: mv64xxx: Add I2C Transaction Generator support)
      79970db2
    • Linus Torvalds's avatar
      Merge tag 'usb-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · ca033390
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here is a bunch of USB fixes for 3.14-rc3.  Most of these are xhci
        reverts, fixing a bunch of reported issues with USB 3 host controller
        issues that loads of people have been hitting (with the exception of
        kernel developers, all of our machines seem to be working fine, which
        is why these took so long to get resolved...)
      
        There are some other minor fixes and new device ids, as usual.  All
        have been in linux-next successfully"
      
      * tag 'usb-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits)
        usb: option: blacklist ZTE MF667 net interface
        Revert "usb: xhci: Link TRB must not occur within a USB payload burst"
        Revert "xhci: Avoid infinite loop when sg urb requires too many trbs"
        Revert "xhci: Set scatter-gather limit to avoid failed block writes."
        xhci 1.0: Limit arbitrarily-aligned scatter gather.
        Modpost: fixed USB alias generation for ranges including 0x9 and 0xA
        usb: core: Fix potential memory leak adding dyn USBdevice IDs
        USB: ftdi_sio: add Tagsys RFID Reader IDs
        usb: qcserial: add Netgear Aircard 340U
        usb-storage: enable multi-LUN scanning when needed
        USB: simple: add Dynastream ANT USB-m Stick device support
        usb-storage: add unusual-devs entry for BlackBerry 9000
        usb-storage: restrict bcdDevice range for Super Top in Cypress ATACB
        usb: phy: move some error messages to debug
        usb: ftdi_sio: add Mindstorms EV3 console adapter
        usb: dwc2: fix memory corruption in dwc2 driver
        usb: dwc2: fix role switch breakage
        usb: dwc2: bail out early when booting with "nousb"
        Revert "xhci: replace xhci_read_64() with readq()"
        Revert "xhci: replace xhci_write_64() with writeq()"
        ...
      ca033390
    • Linus Torvalds's avatar
      Merge tag 'tty-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 40a215fb
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are a small number of tty/serial driver fixes to resolve reported
        issues with 3.14-rc and earlier (in the case of the vt bugfix).  Some
        of these have been tested and reported by a number of people as the
        tty bugfix was pretty commonly hit on some platforms.
      
        All have been in linux-next for a while"
      
      * tag 'tty-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        vt: Fix secure clear screen
        serial: 8250: Support XR17V35x fraction divisor
        n_tty: Fix stale echo output
        serial: sirf: fix kernel panic caused by unpaired spinlock
        serial: 8250_pci: unbreak last serial ports on NetMos 9865 cards
        n_tty: Fix poll() when TIME_CHAR and MIN_CHAR == 0
        serial: omap: fix rs485 probe on deferred pinctrl
        serial: 8250_dw: fix compilation warning when !CONFIG_PM_SLEEP
        serial: omap-serial: Move info message to probe function
        tty: Set correct tty name in 'active' sysfs attribute
        tty: n_gsm: Fix for modems with brk in modem status control
        drivers/tty/hvc: don't use module_init in non-modular hyp. console code
      40a215fb
    • Linus Torvalds's avatar
      Merge tag 'staging-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · e2e481d6
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are a number (lots, I know) of fixes for staging drivers to
        resolve a bunch of reported issues.
      
        The largest patches here are one revert of a patch that is in 3.14-rc1
        to fix reported problems, and a sync of a usb host driver that
        required some ARM patches to go in before it could be accepted (which
        is why it missed -rc1)
      
        All of these have been in linux-next for a while with no reported
        issues"
      
      * tag 'staging-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (56 commits)
        staging/rtl8821ae: fix build, depends on MAC80211
        iio: max1363: Use devm_regulator_get_optional for optional regulator
        iio:accel:bma180: Use modifier instead of index in channel specification
        iio: adis16400: Set timestamp as the last element in chan_spec
        iio: ak8975: Fix calculation formula for convert micro tesla to gauss unit
        staging:iio:ad799x fix typo in ad799x_events[]
        iio: mxs-lradc: remove useless scale_available files
        iio: mxs-lradc: fix buffer overflow
        iio:magnetometer:mag3110: Fix output of decimal digits in show_int_plus_micros()
        iio:magnetometer:mag3110: Report busy in _read_raw() / write_raw() when buffer is enabled
        wlags49_h2: Fix overflow in wireless_set_essid()
        xlr_net: Fix missing trivial allocation check
        staging: r8188eu: overflow in rtw_p2p_get_go_device_address()
        staging: r8188eu: array overflow in rtw_mp_ioctl_hdl()
        staging: r8188eu: Fix typo in USB_DEVICE list
        usbip/userspace/libsrc/names.c: memory leak
        gpu: ion: dereferencing an ERR_PTR
        staging: comedi: usbduxsigma: fix unaligned dereferences
        staging: comedi: fix too early cleanup in comedi_auto_config()
        staging: android: ion: dummy: fix an error code
        ...
      e2e481d6
    • Linus Torvalds's avatar
      Merge tag 'driver-core-3.14-rc3' of... · ad07f124
      Linus Torvalds authored
      Merge tag 'driver-core-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core fix from Greg KH:
       "Here is a single driver core patch for 3.14-rc3 for the component code
        that Russell has found and fixed"
      
      * tag 'driver-core-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        drivers/base: fix devres handling for master device
      ad07f124
    • Linus Torvalds's avatar
      Merge tag 'char-misc-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · bb0a05d7
      Linus Torvalds authored
      Pull char/misc fixes from Greg KH:
       "Here are some small char/misc driver fixes, along with some
        documentation updates, for 3.14-rc3.  Nothing major, just a number of
        fixes for reported issues"
      
      * tag 'char-misc-3.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        Revert "misc: eeprom: sunxi: Add new compatibles"
        Revert "ARM: sunxi: dt: Convert to the new SID compatibles"
        misc: mic: fix possible signed underflow (undefined behavior) in userspace API
        ARM: sunxi: dt: Convert to the new SID compatibles
        misc: eeprom: sunxi: Add new compatibles
        misc: genwqe: Fix potential memory leak when pinning memory
        Documentation:Update Documentation/zh_CN/arm64/memory.txt
        Documentation:Update Documentation/zh_CN/arm64/booting.txt
        Documentation:Chinese translation of Documentation/arm64/tagged-pointers.txt
        raw: set range for MAX_RAW_DEVS
        raw: test against runtime value of max_raw_minors
        Drivers: hv: vmbus: Don't timeout during the initial connection with host
        Drivers: hv: vmbus: Specify the target CPU that should receive notification
        VME: Correct read/write alignment algorithm
        mei: don't unset read cb ptr on reset
        mei: clear write cb from waiting list on reset
      bb0a05d7
  12. 14 Feb, 2014 3 commits
    • Josef Bacik's avatar
      Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol · 3a0dfa6a
      Josef Bacik authored
      
      
      A user was running into errors from an NFS export of a subvolume that had a
      default subvol set.  When we mount a default subvol we will use d_obtain_alias()
      to find an existing dentry for the subvolume in the case that the root subvol
      has already been mounted, or a dummy one is allocated in the case that the root
      subvol has not already been mounted.  This allows us to connect the dentry later
      on if we wander into the path.  However if we don't ever wander into the path we
      will keep DCACHE_DISCONNECTED set for a long time, which angers NFS.  It doesn't
      appear to cause any problems but it is annoying nonetheless, so simply unset
      DCACHE_DISCONNECTED in the get_default_root case and switch btrfs_lookup() to
      use d_materialise_unique() instead which will make everything play nicely
      together and reconnect stuff if we wander into the default subvol path from a
      different way.  With this patch I'm no longer getting the NFS errors when
      exporting a volume that has been mounted with a default subvol set.  Thanks,
      
      cc: bfields@fieldses.org
      cc: ebiederm@xmission.com
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      3a0dfa6a
    • Mitch Harder's avatar
      Btrfs: fix max_inline mount option · feb5f965
      Mitch Harder authored
      
      
      Currently, the only mount option for max_inline that has any effect is
      max_inline=0.  Any other value that is supplied to max_inline will be
      adjusted to a minimum of 4k.  Since max_inline has an effective maximum
      of ~3900 bytes due to page size limitations, the current behaviour
      only has meaning for max_inline=0.
      
      This patch will allow the max_inline mount option to accept non-zero
      values as indicated in the documentation.
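
To make the difference concrete, here is a small sketch of the two behaviours. It assumes the old code raised any non-zero value to a 4k minimum, as described above, and that the fix caps the value instead. The function names are hypothetical and this is not the kernel code itself:

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour as described: any non-zero value is raised to at least limit. */
static uint64_t max_inline_old(uint64_t requested, uint64_t limit)
{
    if (requested == 0)
        return 0;
    return requested > limit ? requested : limit;
}

/* Assumed fixed behaviour: non-zero values are capped at limit, not raised. */
static uint64_t max_inline_new(uint64_t requested, uint64_t limit)
{
    if (requested == 0)
        return 0;
    return requested < limit ? requested : limit;
}
```

Under these assumptions, max_inline=2048 with a 4096-byte limit becomes 4096 in the old scheme but stays 2048 in the new one.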
      Signed-off-by: Mitch Harder <mitch.harder@sabayonlinux.org>
      Signed-off-by: Chris Mason <clm@fb.com>
      feb5f965
    • Liu Bo's avatar
      Btrfs: fix a lockdep warning when cleaning up aborted transaction · a9d2d4ad
      Liu Bo authored
      
      
      Now that we have 2 spinlocks for managing delayed refs,
      CONFIG_DEBUG_SPINLOCK=y helped me find this:
      
      [ 4723.413809] BUG: spinlock wrong CPU on CPU#1, btrfs-transacti/2258
      [ 4723.414882]  lock: 0xffff880048377670, .magic: dead4ead, .owner: btrfs-transacti/2258, .owner_cpu: 2
      [ 4723.417146] CPU: 1 PID: 2258 Comm: btrfs-transacti Tainted: G        W  O 3.12.0+ #4
      [ 4723.421321] Call Trace:
      [ 4723.421872]  [<ffffffff81680fe7>] dump_stack+0x54/0x74
      [ 4723.422753]  [<ffffffff81681093>] spin_dump+0x8c/0x91
      [ 4723.424979]  [<ffffffff816810b9>] spin_bug+0x21/0x26
      [ 4723.425846]  [<ffffffff81323956>] do_raw_spin_unlock+0x66/0x90
      [ 4723.434424]  [<ffffffff81689bf7>] _raw_spin_unlock+0x27/0x40
      [ 4723.438747]  [<ffffffffa015da9e>] btrfs_cleanup_one_transaction+0x35e/0x710 [btrfs]
      [ 4723.443321]  [<ffffffffa015df54>] btrfs_cleanup_transaction+0x104/0x570 [btrfs]
      [ 4723.444692]  [<ffffffff810c1b5d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
      [ 4723.450336]  [<ffffffff810c1c2d>] ? trace_hardirqs_on+0xd/0x10
      [ 4723.451332]  [<ffffffffa015e5ee>] transaction_kthread+0x22e/0x270 [btrfs]
      [ 4723.452543]  [<ffffffffa015e3c0>] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs]
      [ 4723.457833]  [<ffffffff81079efa>] kthread+0xea/0xf0
      [ 4723.458990]  [<ffffffff81079e10>] ? kthread_create_on_node+0x140/0x140
      [ 4723.460133]  [<ffffffff81692aac>] ret_from_fork+0x7c/0xb0
      [ 4723.460865]  [<ffffffff81079e10>] ? kthread_create_on_node+0x140/0x140
      [ 4723.496521] ------------[ cut here ]------------
      
      ----------------------------------------------------------------------
      
      The reason is that we call cond_resched_lock(&head_ref->lock) while
      still holding @delayed_refs->lock.

      This differs from __btrfs_run_delayed_refs(), where we do a drop-acquire
      dance before and after actually processing delayed refs.

      Here we don't drop the lock, so others cannot add new delayed refs to
      head_ref, and cond_resched_lock(&head_ref->lock) is not necessary.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      a9d2d4ad