1. 30 Oct, 2011 5 commits
    • [S390] memory leak with RCU_TABLE_FREE · e73b7fff
      Martin Schwidefsky authored
      
      
      The rcu page table free code uses a couple of bits in the page table
      pointer passed to tlb_remove_table to discern the different page table
      types. __tlb_remove_table extracts the type with an incorrect mask which
      leads to memory leaks. The correct mask is ((FRAG_MASK << 4) | FRAG_MASK).
      
      Cc: stable@kernel.org
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
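As a rough illustration of why the mask matters, the sketch below models a table pointer that carries type bits in its low byte. The `FRAG_MASK` value, the example address, and the helper names are assumptions made for this sketch; they are not taken from the s390 sources.

```python
# Userspace model of the tagged-pointer scheme described in the commit message.
FRAG_MASK = 0x03  # assumed value, for illustration only

WRONG_MASK = FRAG_MASK
RIGHT_MASK = (FRAG_MASK << 4) | FRAG_MASK  # the mask named in the commit message


def tag_pointer(addr: int, type_bits: int) -> int:
    """Store type bits in the low bits of an aligned table address."""
    assert addr & RIGHT_MASK == 0, "address must leave the tag bits free"
    return addr | type_bits


def extract_type(tagged: int, mask: int) -> int:
    """Recover the type bits from a tagged pointer."""
    return tagged & mask


table = 0x7D48D000                     # hypothetical, suitably aligned address
tagged = tag_pointer(table, 0x1 << 4)  # type encoded in the upper nibble

wrong_type = extract_type(tagged, WRONG_MASK)
right_type = extract_type(tagged, RIGHT_MASK)
print(hex(wrong_type), hex(right_type))
```

With the narrow mask the upper type bits are lost, so in this model the table would be classified as a different type than the one it was tagged with.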
    • [S390] user per registers vs. ptrace single stepping · a45aff52
      Martin Schwidefsky authored

      git commit 5e9a2692 "[S390] ptrace cleanup" introduced a regression
      for the case when both a user PER set (e.g. a storage alteration trace) and
      PTRACE_SINGLESTEP are active. The new code will overrule the user PER set
      with an instruction-fetch PER set over the whole address space for ptrace
      single stepping. The inferior process will be stopped after each instruction
      with an instruction fetch event. Any other events that may have occurred
      concurrently are not reported (e.g. storage alteration event) because the
      control bits for them are not set. The solution is to merge the PER control
      bits of the user PER set with the PER_EVENT_IFETCH control bit for
      PTRACE_SINGLESTEP.
      
      Cc: stable@kernel.org
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
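The difference between overruling and merging the PER control bits can be sketched as follows. This is a userspace model only: the bit values and function names are placeholders chosen for the sketch, and the address-range handling mentioned in the commit message is left out.

```python
# Placeholder bit values; the real definitions live in the s390 headers.
PER_EVENT_IFETCH = 0x40000000
PER_EVENT_STORE = 0x20000000


def per_control_regressed(user_control: int, single_step: bool) -> int:
    """Model of the regression: single stepping overrules the user PER set."""
    return PER_EVENT_IFETCH if single_step else user_control


def per_control_merged(user_control: int, single_step: bool) -> int:
    """Model of the fix: keep the user bits and add the ifetch bit."""
    if single_step:
        return user_control | PER_EVENT_IFETCH
    return user_control


user = PER_EVENT_STORE  # e.g. a storage alteration trace
regressed = per_control_regressed(user, single_step=True)
merged = per_control_merged(user, single_step=True)
```

In the regressed variant the storage alteration bit is gone while single stepping, which matches the symptom described above: concurrent events are not reported because their control bits are not set.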
    • [S390] topology: fix alloc_masks annotation · caa04f69
      Sebastian Ott authored
      
      
      Fix this warning:
      WARNING: vmlinux.o(.text+0x199b6): Section mismatch in reference from
      the function alloc_masks() to the function .init.text:__alloc_bootmem()
      Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
    • [S390] avoid warning in show_cpuinfo · dd4a5a31
      Martin Schwidefsky authored
      
      
      The .start function, and indirectly the .next function, of the show_cpuinfo
      sequential operation use NR_CPUS as the limit instead of nr_cpu_ids.
      This can cause warnings like this:
      
      WARNING: at /usr/src/linux/include/linux/cpumask.h:107
      Process lscpu (pid: 575, task: 000000007deb4338, ksp: 000000007794f588)
      Krnl PSW : 0704000180000000 0000000000106db4 (show_cpuinfo+0x108/0x234)
                 R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
      Krnl GPRS: 0000000000000003 0000000000791988 000000000071b478 0000000000000004
                 0000000000000001 0000000000000000 000000007d139500 0000000000000400
                 0000000000000000 000000000070e24c 000000007d48d600 0000000000000005
                 000000007d48d600 00000000004dfa10 0000000000106cf8 000000007794fcc0
      Krnl Code: 0000000000106da8: 95001000           cli     0(%r1),0
                 0000000000106dac: a774ffac           brc     7,106d04
                 0000000000106db0: a7f40001           brc     15,106db2
                >0000000000106db4: 92011000           mvi     0(%r1),1
                 0000000000106db8: a7f4ffa6           brc     15,106d04
                 0000000000106dbc: c0e5000065b4       brasl   %r14,113924
                 0000000000106dc2: c09000303a45       larl    %r9,70e24c
                 0000000000106dc8: c020001eefd4       larl    %r2,4e4d70
      
      Replacing NR_CPUS with nr_cpu_ids fixes it.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
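The effect of the two limits can be modelled in a few lines. This is a simplified sketch of a start/next style iterator, not the kernel's seq_file code; the values of `NR_CPUS` and `nr_cpu_ids` and the helper names are assumptions for illustration.

```python
NR_CPUS = 64     # compile-time maximum (assumed value)
nr_cpu_ids = 4   # CPU ids actually possible on this boot (assumed value)


def c_start(pos: int, limit: int):
    """Return the iterator position, or None once pos reaches the limit."""
    return pos if pos < limit else None


def visited(limit: int) -> list:
    """Collect every position a start/next walk would visit."""
    out, pos = [], 0
    while c_start(pos, limit) is not None:
        out.append(pos)
        pos += 1
    return out


# Positions that would be looked up in a cpumask although no such CPU id exists.
out_of_range = [cpu for cpu in visited(NR_CPUS) if cpu >= nr_cpu_ids]
in_range_only = visited(nr_cpu_ids)
```

With `NR_CPUS` as the limit, the walk reaches positions at or above `nr_cpu_ids`, which is the kind of access the cpumask check warns about.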
    • [S390] fix mismatch in summation of I/O IRQ statistics · de400d6b
      Peter Oberparleiter authored
      
      
      Current IRQ statistics support does not show detail counts for I/O
      interrupts which are processed internally only. The result is a
      summation count which is way off such as this one:
      
                 CPU0       CPU1       CPU2
      I/O:       1331        710        442
      [...]
      QAI:         15         16         16   [I/O] QDIO Adapter Interrupt
      QDI:          1          0          0   [I/O] QDIO Interrupt
      DAS:        706        645        381   [I/O] DASD
      C15:         26         10          0   [I/O] 3215
      C70:          0          0          0   [I/O] 3270
      TAP:          0          0          0   [I/O] Tape
      VMR:          0          0          0   [I/O] Unit Record Devices
      LCS:          0          0          0   [I/O] LCS
      CLW:          0          0          0   [I/O] CLAW
      CTC:          0          0          0   [I/O] CTC
      APB:          0          0          0   [I/O] AP Bus
      
      Fix this by moving I/O interrupt accounting into the common I/O layer.
      Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
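The mismatch in the sample output can be checked by hand. The sketch below sums the non-zero [I/O] detail rows per CPU and compares them with the I/O summary line, assuming the rows shown are the only [I/O] detail rows (anything hidden behind "[...]" is not known here).

```python
# Values copied from the sample output above, per CPU0..CPU2.
io_total = (1331, 710, 442)
detail = {
    "QAI": (15, 16, 16),
    "QDI": (1, 0, 0),
    "DAS": (706, 645, 381),
    "C15": (26, 10, 0),
    # C70, TAP, VMR, LCS, CLW, CTC and APB are all zero in the sample.
}

# Column-wise sum of the detail rows.
detail_sum = tuple(sum(col) for col in zip(*detail.values()))

# Interrupts counted in the summary but not attributed to any detail row.
missing = tuple(total - part for total, part in zip(io_total, detail_sum))
print(detail_sum, missing)
```

Under that assumption the detail rows account for 748, 671 and 397 interrupts, leaving 583, 39 and 45 unattributed on CPU0, CPU1 and CPU2 respectively.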
  2. 29 Oct, 2011 7 commits
  3. 28 Oct, 2011 28 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 · ec7ae517
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (204 commits)
        [SCSI] qla4xxx: export address/port of connection (fix udev disk names)
        [SCSI] ipr: Fix BUG on adapter dump timeout
        [SCSI] megaraid_sas: Fix instance access in megasas_reset_timer
        [SCSI] hpsa: change confusing message to be more clear
        [SCSI] iscsi class: fix vlan configuration
        [SCSI] qla4xxx: fix data alignment and use nl helpers
        [SCSI] iscsi class: fix link local mispelling
        [SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA
        [SCSI] aacraid: use lower snprintf() limit
        [SCSI] lpfc 8.3.27: Change driver version to 8.3.27
        [SCSI] lpfc 8.3.27: T10 additions for SLI4
        [SCSI] lpfc 8.3.27: Fix queue allocation failure recovery
        [SCSI] lpfc 8.3.27: Change algorithm for getting physical port name
        [SCSI] lpfc 8.3.27: Changed worst case mailbox timeout
        [SCSI] lpfc 8.3.27: Miscellanous logic and interface fixes
        [SCSI] megaraid_sas: Changelog and version update
        [SCSI] megaraid_sas: Add driver workaround for PERC5/1068 kdump kernel panic
        [SCSI] megaraid_sas: Add multiple MSI-X vector/multiple reply queue support
        [SCSI] megaraid_sas: Add support for MegaRAID 9360/9380 12GB/s controllers
        [SCSI] megaraid_sas: Clear FUSION_IN_RESET before enabling interrupts
        ...
    • Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-client · 97d2eb13
      Linus Torvalds authored
      * 'for-linus' of git://ceph.newdream.net/git/ceph-client:
        libceph: fix double-free of page vector
        ceph: fix 32-bit ino numbers
        libceph: force resend of osd requests if we skip an osdmap
        ceph: use kernel DNS resolver
        ceph: fix ceph_monc_init memory leak
        ceph: let the set_layout ioctl set single traits
        Revert "ceph: don't truncate dirty pages in invalidate work thread"
        ceph: replace leading spaces with tabs
        libceph: warn on msg allocation failures
        libceph: don't complain on msgpool alloc failures
        libceph: always preallocate mon connection
        libceph: create messenger with client
        ceph: document ioctls
        ceph: implement (optional) max read size
        ceph: rename rsize -> rasize
        ceph: make readpages fully async
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 68d99b2c
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (549 commits)
        ALSA: hda - Fix ADC input-amp handling for Cx20549 codec
        ALSA: hda - Keep EAPD turned on for old Conexant chips
        ALSA: hda/realtek - Fix missing volume controls with ALC260
        ASoC: wm8940: Properly set codec->dapm.bias_level
        ALSA: hda - Fix pin-config for ASUS W90V
        ALSA: hda - Fix surround/CLFE headphone and speaker pins order
        ALSA: hda - Fix typo
        ALSA: Update the sound git tree URL
        ALSA: HDA: Add new revision for ALC662
        ASoC: max98095: Convert codec->hw_write to snd_soc_write
        ASoC: keep pointer to resource so it can be freed
        ASoC: sgtl5000: Fix wrong mask in some snd_soc_update_bits calls
        ASoC: wm8996: Fix wrong mask for setting WM8996_AIF_CLOCKING_2
        ASoC: da7210: Add support for line out and DAC
        ASoC: da7210: Add support for DAPM
        ALSA: hda/realtek - Fix DAC assignments of multiple speakers
        ASoC: Use SGTL5000_LINREG_VDDD_MASK instead of hardcoded mask value
        ASoC: Set sgtl5000->ldo in ldo_regulator_register
        ASoC: wm8996: Use SND_SOC_DAPM_AIF_OUT for AIF2 Capture
        ASoC: wm8994: Use SND_SOC_DAPM_AIF_OUT for AIF3 Capture
        ...
    • Merge branch 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci · 0e59e7e7
      Linus Torvalds authored
      * 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
        PCI: Clean-up MPS debug output
        pci: Clamp pcie_set_readrq() when using "performance" settings
        PCI: enable MPS "performance" setting to properly handle bridge MPS
        PCI: Workaround for Intel MPS errata
        PCI: Add support for PASID capability
        PCI: Add implementation for PRI capability
        PCI: Export ATS functions to modules
        PCI: Move ATS implementation into own file
        PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
        PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
        PCI / PM: Extend PME polling to all PCI devices
        PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
        PCI: Make pci_setup_bridge() non-static for use by arch code
        x86: constify PCI raw ops structures
        PCI: Add quirk for known incorrect MPSS
        PCI: Add Solarflare vendor ID and SFC4000 device IDs
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc · 46b51ea2
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (83 commits)
        mmc: fix compile error when CONFIG_BLOCK is not enabled
        mmc: core: Cleanup eMMC4.5 conditionals
        mmc: omap_hsmmc: if multiblock reads are broken, disable them
        mmc: core: add workaround for controllers with broken multiblock reads
        mmc: core: Prevent too long response times for suspend
        mmc: recognise SDIO cards with SDIO_CCCR_REV 3.00
        mmc: sd: Handle SD3.0 cards not supporting UHS-I bus speed mode
        mmc: core: support HPI send command
        mmc: core: Add cache control for eMMC4.5 device
        mmc: core: Modify the timeout value for writing power class
        mmc: core: new discard feature support at eMMC v4.5
        mmc: core: mmc sanitize feature support for v4.5
        mmc: dw_mmc: modify DATA register offset
        mmc: sdhci-pci: add flag for devices that can support runtime PM
        mmc: omap_hsmmc: ensure pbias configuration is always done
        mmc: core: Add Power Off Notify Feature eMMC 4.5
        mmc: sdhci-s3c: fix potential NULL dereference
        mmc: replace printk with appropriate display macro
        mmc: core: Add default timeout value for CMD6
        mmc: sdhci-pci: add runtime pm support
        ...
    • Merge branch 'devel-stable' of... · 1fdb24e9
      Linus Torvalds authored
      Merge branch 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm
      
      * 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm: (178 commits)
        ARM: 7139/1: fix compilation with CONFIG_ARM_ATAG_DTB_COMPAT and large TEXT_OFFSET
        ARM: gic, local timers: use the request_percpu_irq() interface
        ARM: gic: consolidate PPI handling
        ARM: switch from NO_MACH_MEMORY_H to NEED_MACH_MEMORY_H
        ARM: mach-s5p64x0: remove mach/memory.h
        ARM: mach-s3c64xx: remove mach/memory.h
        ARM: plat-mxc: remove mach/memory.h
        ARM: mach-prima2: remove mach/memory.h
        ARM: mach-zynq: remove mach/memory.h
        ARM: mach-bcmring: remove mach/memory.h
        ARM: mach-davinci: remove mach/memory.h
        ARM: mach-pxa: remove mach/memory.h
        ARM: mach-ixp4xx: remove mach/memory.h
        ARM: mach-h720x: remove mach/memory.h
        ARM: mach-vt8500: remove mach/memory.h
        ARM: mach-s5pc100: remove mach/memory.h
        ARM: mach-tegra: remove mach/memory.h
        ARM: plat-tcc: remove mach/memory.h
        ARM: mach-mmp: remove mach/memory.h
        ARM: mach-cns3xxx: remove mach/memory.h
        ...
      
      Fix up mostly pretty trivial conflicts in:
       - arch/arm/Kconfig
       - arch/arm/include/asm/localtimer.h
       - arch/arm/kernel/Makefile
       - arch/arm/mach-shmobile/board-ap4evb.c
       - arch/arm/mach-u300/core.c
       - arch/arm/mm/dma-mapping.c
       - arch/arm/mm/proc-v7.S
       - arch/arm/plat-omap/Kconfig
      largely due to some CONFIG option renaming (ie CONFIG_PM_SLEEP ->
      CONFIG_ARM_CPU_SUSPEND for the arm-specific suspend code etc) and
      addition of NEED_MACH_MEMORY_H next to HAVE_IDE.
    • Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue · f362f98e
      Linus Torvalds authored
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)
        leases: fix write-open/read-lease race
        nfs: drop unnecessary locking in llseek
        ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
        vfs: add generic_file_llseek_size
        vfs: do (nearly) lockless generic_file_llseek
        direct-io: merge direct_io_walker into __blockdev_direct_IO
        direct-io: inline the complete submission path
        direct-io: separate map_bh from dio
        direct-io: use a slab cache for struct dio
        direct-io: rearrange fields in dio/dio_submit to avoid holes
        direct-io: fix a wrong comment
        direct-io: separate fields only used in the submission path from struct dio
        vfs: fix spinning prevention in prune_icache_sb
        vfs: add a comment to inode_permission()
        vfs: pass all mask flags check_acl and posix_acl_permission
        vfs: add hex format for MAY_* flag values
        vfs: indicate that the permission functions take all the MAY_* flags
        compat: sync compat_stats with statfs.
        vfs: add "device" tag to /proc/self/mountstats
        cleanup: vfs: small comment fix for block_invalidatepage
        ...
      
      Fix up trivial conflict in fs/gfs2/file.c (llseek changes)
    • Merge http://sucs.org/~rohan/git/gfs2-3.0-nmw · f793f296
      Linus Torvalds authored
      * http://sucs.org/~rohan/git/gfs2-3.0-nmw: (24 commits)
        GFS2: Move readahead of metadata during deallocation into its own function
        GFS2: Remove two unused variables
        GFS2: Misc fixes
        GFS2: rewrite fallocate code to write blocks directly
        GFS2: speed up delete/unlink performance for large files
        GFS2: Fix off-by-one in gfs2_blk2rgrpd
        GFS2: Clean up ->page_mkwrite
        GFS2: Correctly set goal block after allocation
        GFS2: Fix AIL flush issue during fsync
        GFS2: Use cached rgrp in gfs2_rlist_add()
        GFS2: Call do_strip() directly from recursive_scan()
        GFS2: Remove obsolete assert
        GFS2: Cache the most recently used resource group in the inode
        GFS2: Make resource groups "append only" during life of fs
        GFS2: Use rbtree for resource groups and clean up bitmap buffer ref count scheme
        GFS2: Fix lseek after SEEK_DATA, SEEK_HOLE have been added
        GFS2: Clean up gfs2_create
        GFS2: Use ->dirty_inode()
        GFS2: Fix bug trap and journaled data fsync
        GFS2: Fix inode allocation error path
        ...
    • Merge branch '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6 · dabcbb1b
      Linus Torvalds authored
      * '3.2-without-smb2' of git://git.samba.org/sfrench/cifs-2.6: (52 commits)
        Fix build break when freezer not configured
        Add definition for share encryption
        CIFS: Make cifs_push_locks send as many locks at once as possible
        CIFS: Send as many mandatory unlock ranges at once as possible
        CIFS: Implement caching mechanism for posix brlocks
        CIFS: Implement caching mechanism for mandatory brlocks
        CIFS: Fix DFS handling in cifs_get_file_info
        CIFS: Fix error handling in cifs_readv_complete
        [CIFS] Fixup trivial checkpatch warning
        [CIFS] Show nostrictsync and noperm mount options in /proc/mounts
        cifs, freezer: add wait_event_freezekillable and have cifs use it
        cifs: allow cifs_max_pending to be readable under /sys/module/cifs/parameters
        cifs: tune bdi.ra_pages in accordance with the rsize
        cifs: allow for larger rsize= options and change defaults
        cifs: convert cifs_readpages to use async reads
        cifs: add cifs_async_readv
        cifs: fix protocol definition for READ_RSP
        cifs: add a callback function to receive the rest of the frame
        cifs: break out 3rd receive phase into separate function
        cifs: find mid earlier in receive codepath
        ...
    • Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · 5619a693
      Linus Torvalds authored
      * 'for-linus' of git://oss.sgi.com/xfs/xfs: (69 commits)
        xfs: add AIL pushing tracepoints
        xfs: put in missed fix for merge problem
        xfs: do not flush data workqueues in xfs_flush_buftarg
        xfs: remove XFS_bflush
        xfs: remove xfs_buf_target_name
        xfs: use xfs_ioerror_alert in xfs_buf_iodone_callbacks
        xfs: clean up xfs_ioerror_alert
        xfs: clean up buffer allocation
        xfs: remove buffers from the delwri list in xfs_buf_stale
        xfs: remove XFS_BUF_STALE and XFS_BUF_SUPER_STALE
        xfs: remove XFS_BUF_SET_VTYPE and XFS_BUF_SET_VTYPE_REF
        xfs: remove XFS_BUF_FINISH_IOWAIT
        xfs: remove xfs_get_buftarg_list
        xfs: fix buffer flushing during unmount
        xfs: optimize fsync on directories
        xfs: reduce the number of log forces from tail pushing
        xfs: Don't allocate new buffers on every call to _xfs_buf_find
        xfs: simplify xfs_trans_ijoin* again
        xfs: unlock the inode before log force in xfs_change_file_space
        xfs: unlock the inode before log force in xfs_fs_nfs_commit_metadata
        ...
    • leases: fix write-open/read-lease race · f3c7691e
      J. Bruce Fields authored
      
      
      In setlease, we use i_writecount to decide whether we can give out a
      read lease.
      
      In open, we break leases before incrementing i_writecount.
      
      There is therefore a window between the break lease and the i_writecount
      increment when setlease could add a new read lease.
      
      This would leave us with a simultaneous write open and read lease, which
      shouldn't happen.
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
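The window described in this commit message can be replayed deterministically. The sketch below is a single-threaded model that runs the steps in the problematic order; the class and function names are hypothetical, and it shows only the race, not the fix.

```python
class Inode:
    """Minimal stand-in for the two pieces of inode state involved."""

    def __init__(self):
        self.i_writecount = 0
        self.read_leases = 0


def setlease_read(inode: Inode) -> bool:
    """Grant a read lease only if nobody has the file open for write."""
    if inode.i_writecount > 0:
        return False
    inode.read_leases += 1
    return True


def open_step_break_leases(inode: Inode) -> None:
    """First half of a write open: break existing leases."""
    inode.read_leases = 0


def open_step_inc_writecount(inode: Inode) -> None:
    """Second half of a write open: account for the writer."""
    inode.i_writecount += 1


inode = Inode()
open_step_break_leases(inode)     # open(): leases are broken first
granted = setlease_read(inode)    # setlease() slips into the window
open_step_inc_writecount(inode)   # open(): i_writecount is incremented last

# A write open and a read lease now coexist, which should not happen.
conflict = inode.i_writecount > 0 and inode.read_leases > 0
```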
    • nfs: drop unnecessary locking in llseek · 79835a71
      Andi Kleen authored
      
      
      This makes NFS follow the standard generic_file_llseek locking scheme.
      
      Cc: Trond.Myklebust@netapp.com
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • ext4: replace cut'n'pasted llseek code with generic_file_llseek_size · 4cce0e28
      Andi Kleen authored
      
      
      This gives ext4 the benefits of unlocked llseek.
      
      Cc: tytso@mit.edu
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • vfs: add generic_file_llseek_size · 5760495a
      Andi Kleen authored
      
      
      Add a generic_file_llseek variant to the VFS that allows passing in
      the maximum file size of the file system, instead of always
      using maxbytes from the superblock.
      
      This can be used to eliminate some cut'n'paste seek code in ext4.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
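The idea of passing the size limit as a parameter can be sketched as below. This is a simplified userspace model, not the kernel function: the name `llseek_size`, its argument list, and the error handling are assumptions for illustration, and it ignores locking and any seek modes beyond the three classic ones.

```python
import errno

SEEK_SET, SEEK_CUR, SEEK_END = 0, 1, 2


def llseek_size(f_pos: int, offset: int, whence: int,
                maxsize: int, i_size: int) -> int:
    """Return the new position, or a negative errno value on failure.

    `maxsize` is supplied by the caller instead of being read from a
    superblock-wide limit.
    """
    if whence == SEEK_END:
        offset += i_size
    elif whence == SEEK_CUR:
        offset += f_pos
    elif whence != SEEK_SET:
        return -errno.EINVAL
    if offset < 0 or offset > maxsize:
        return -errno.EINVAL
    return offset
```

A file system with a smaller per-file limit could then pass that limit directly, for example `llseek_size(0, 100, SEEK_SET, maxsize=50, i_size=10)` is rejected while the same seek with `maxsize=1000` succeeds.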
    • vfs: do (nearly) lockless generic_file_llseek · ef3d0fd2
      Andi Kleen authored
      
      
      The i_mutex lock use of generic_file_llseek hurts.  Independent processes
      accessing the same file synchronize over a single lock, even though
      they have no need for synchronization at all.
      
      Under high utilization this can cause llseek to scale very poorly on larger
      systems.
      
      This patch does some rethinking of the llseek locking model:
      
      First the 64bit f_pos is not necessarily atomic without locks
      on 32bit systems. This can already cause races with read() today.
      This was discussed on linux-kernel in the past and deemed acceptable.
      The patch does not change that.
      
      Let's look at the different seek variants:
      
      SEEK_SET: Doesn't really need any locking.
      If there's a race one writer wins, the other loses.
      
      For 32bit the non atomic update races against read()
      stay the same. Without a lock they can also happen
      against write() now.  The read() race was deemed
      acceptable in past discussions, and I think if it's
      ok for read it's ok for write too.
      
      => Don't need a lock.
      
      SEEK_END: This behaves like SEEK_SET plus it reads
      the maximum size too. Reading the maximum size would have the
      32bit atomic problem. But luckily we already have a way to read
      the maximum size without locking (i_size_read), so we
      can just use that instead.
      
      Without i_mutex there is no synchronization with write() anymore,
      however since the write() update is atomic on 64bit it just behaves
      like another racy SEEK_SET.  On non atomic 32bit it's the same
      as SEEK_SET.
      
      => Don't need a lock, but need to use i_size_read()
      
      SEEK_CUR: This has a read-modify-write race window
      on the same file. One could argue that any application
      doing unsynchronized seeks on the same file is already broken.
      But for the sake of not adding a regression here I'm
      using the file->f_lock to synchronize this. Using this
      lock is much better than the inode mutex because it doesn't
      synchronize between processes.
      
      => So still need a lock, but can use f_lock.
      
      This patch implements this new scheme in generic_file_llseek.
      I dropped generic_file_llseek_unlocked and changed all callers.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
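The per-variant locking rules from this commit message can be summarised in a small model. This is a userspace sketch with hypothetical `Inode` and `File` classes: a `threading.Lock` stands in for `f_lock`, the 32-bit atomicity caveats are not modelled, and range checks are omitted.

```python
import threading

SEEK_SET, SEEK_CUR, SEEK_END = 0, 1, 2


class Inode:
    def __init__(self, size: int):
        self._size = size

    def i_size_read(self) -> int:
        """Stand-in for a size read that needs no inode-wide lock."""
        return self._size


class File:
    def __init__(self, inode: Inode):
        self.inode = inode
        self.f_pos = 0
        self.f_lock = threading.Lock()  # per open file, not per inode

    def llseek(self, offset: int, whence: int) -> int:
        if whence == SEEK_SET:
            self.f_pos = offset              # plain store, last writer wins
        elif whence == SEEK_END:
            self.f_pos = self.inode.i_size_read() + offset
        elif whence == SEEK_CUR:
            with self.f_lock:                # protect the read-modify-write
                self.f_pos += offset
        else:
            raise ValueError("unsupported whence")
        return self.f_pos


def bump(f: File, times: int) -> None:
    for _ in range(times):
        f.llseek(1, SEEK_CUR)


f = File(Inode(size=4096))
workers = [threading.Thread(target=bump, args=(f, 1000)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
final_pos = f.f_pos
```

Because the lock belongs to the open file rather than the inode, two `File` objects on the same `Inode` never contend with each other in this model, which is the scaling point the commit message makes.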
    • direct-io: merge direct_io_walker into __blockdev_direct_IO · 847cc637
      Andi Kleen authored
      
      
      This doesn't change anything for the compiler, but hch thought it would
      make the code clearer.
      
      I moved the reference counting into its own little inline.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • direct-io: inline the complete submission path · ba253fbf
      Andi Kleen authored
      
      
      Add inlines to all the submission path functions. While this increases
      code size it also gives gcc a lot of optimization opportunities
      in this critical hotpath.
      
      In particular -- together with some other changes -- this
      allows gcc to get rid of the unnecessary clearing of
      sdio at the beginning and optimize the messy parameter passing.
      Failing to inline any function that takes an sdio parameter
      would break this optimization, because it cannot be done once the
      address of the structure is taken.
      
      Note that benefits are only seen with CONFIG_OPTIMIZE_INLINING
      and CONFIG_CC_OPTIMIZE_FOR_SIZE both set to off.
      
      This gives about 2.2% improvement on a large database benchmark
      with a high IOPS rate.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • direct-io: separate map_bh from dio · 18772641
      Andi Kleen authored
      
      
      Only a single b_private field in the map_bh buffer head is needed after
      the submission path. Move map_bh separately to avoid storing
      this information in the long term slab.
      
      This avoids the weird 104 byte hole in struct dio_submit, which also needed
      to be memset early.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • direct-io: use a slab cache for struct dio · 6e8267f5
      Andi Kleen authored
      
      
      A direct slab call is slightly faster than kmalloc and can be better cached
      per CPU. It also avoids rounding to the next kmalloc slab.
      
      In addition this enforces cache line alignment for struct dio to avoid
      any false sharing.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • direct-io: rearrange fields in dio/dio_submit to avoid holes · 0dc2bc49
      Andi Kleen authored
      
      
      Fix most problems reported by pahole.
      
      There is still a weird 104 byte hole after map_bh. I'm not sure what
      causes this.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • direct-io: fix a wrong comment · cde1ecb3
      Andi Kleen authored
      
      
      There's nothing on the stack, even before my changes.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • direct-io: separate fields only used in the submission path from struct dio · eb28be2b
      Andi Kleen authored
      
      
      This large, but largely mechanical, patch moves all fields in struct dio
      that are only used in the submission path into a separate on-stack
      data structure. This has the advantage that the memory is very likely
      cache hot, which is not guaranteed for memory fresh out of kmalloc.

      This also gives gcc more optimization potential because it can more easily
      determine that there are no external aliases for these variables.

      The sdio initialization is now an initialization instead of a memset.
      This allows gcc to break sdio into individual fields and optimize
      away unnecessary zeroing (after all the functions are inlined).
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • vfs: fix spinning prevention in prune_icache_sb · 62a3ddef
      Christoph Hellwig authored
      
      
      We need to move the inode to the end of the list to actually make the
      spinning prevention explained in the comment above it work.  With a
      plain list_move it will simply stay in place as we're always reclaiming
      from the head of the list.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
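Why a move to the head does not help can be seen in a small model of a reclaim loop. The sketch uses a `deque` in place of the kernel list; the `busy` set is an assumption standing in for whatever condition makes an inode get skipped, and `budget` simply bounds the number of scan steps.

```python
from collections import deque


def prune(lru, busy, move_to_tail: bool, budget: int) -> list:
    """Reclaim from the head of the list, skipping entries that are busy.

    Returns the entries reclaimed within `budget` scan steps.
    """
    lru = deque(lru)
    reclaimed = []
    for _ in range(budget):
        if not lru:
            break
        inode = lru.popleft()            # always look at the head
        if inode in busy:
            if move_to_tail:
                lru.append(inode)        # like list_move_tail()
            else:
                lru.appendleft(inode)    # like list_move(): back at the head
            continue
        reclaimed.append(inode)
    return reclaimed


stuck = prune(["a", "b", "c"], busy={"a"}, move_to_tail=False, budget=10)
progress = prune(["a", "b", "c"], busy={"a"}, move_to_tail=True, budget=10)
```

With the head move, the same busy entry is examined on every step and nothing behind it is reached; with the tail move, the scan gets past it.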
    • Andreas Gruenbacher's avatar
    • Andreas Gruenbacher's avatar
    • vfs: add hex format for MAY_* flag values · 8522ca58
      Aneesh Kumar K.V authored
      
      
      We are going to add more flags, and having them in hex format
      makes it simpler.
      Acked-by: J. Bruce Fields <bfields@redhat.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • Andreas Gruenbacher's avatar
    • compat: sync compat_stats with statfs. · 1448c721
      Eric W. Biederman authored
      
      
      This was found by inspection while tracking a similar
      bug in compat_statfs64, which has been fixed in mainline
      since December.
      
      - This fixes a bug where not all of the f_spare fields
        were cleared on mips and s390.
      - Add the f_flags field to struct compat_statfs
      - Copy f_flags to userspace in case someone cares.
      - Use __clear_user to copy the f_spare field to userspace
        to ensure that all of the elements of f_spare are cleared.
        On some architectures f_spare has 5 ints and on some
        architectures f_spare only has 4 ints, which makes
        the previous technique of clearing each int individually
        broken.
      
      I don't expect anyone actually uses the old statfs system
      call anymore but if they do let them benefit from having
      the compat and the native version working the same.
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
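The f_spare problem comes down to clearing a fixed number of elements versus clearing the whole field. The sketch below models that with Python lists; the function names and the `stale` marker value are made up for illustration.

```python
def clear_individually(f_spare: list) -> list:
    """Old technique: zero a fixed set of four elements one by one."""
    out = list(f_spare)
    for i in range(4):
        out[i] = 0
    return out


def clear_whole_field(f_spare: list) -> list:
    """New technique: zero the field by its full size."""
    return [0] * len(f_spare)


stale = 0xDEAD        # stands in for data that was never cleared
four = [stale] * 4    # layout with 4 ints
five = [stale] * 5    # layout with 5 ints

leftover = clear_individually(five)[4]
```

On the five-element layout the element-by-element variant leaves the last entry untouched, while clearing by size covers both layouts.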