1. 12 Dec, 2009 1 commit
  2. 08 Dec, 2009 1 commit
  3. 04 Dec, 2009 1 commit
  4. 03 Dec, 2009 2 commits
  5. 26 Nov, 2009 1 commit
    • Ilya Loginov's avatar
      block: add helpers to run flush_dcache_page() against a bio and a request's pages · 2d4dc890
      Ilya Loginov authored
      
      
      Mtdblock driver doesn't call flush_dcache_page for pages in request.  So,
      this causes problems on architectures where the icache doesn't fill from
      the dcache or with dcache aliases.  The patch fixes this.
      
      The ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE symbol was introduced to avoid
      pointless empty cache-thrashing loops on architectures for which
      flush_dcache_page() is a no-op.  Every architecture was provided with this
      flush pages on architectires where ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE is
      equal 1 or do nothing otherwise.
      
      See "fix mtd_blkdevs problem with caches on some architectures" discussion
      on LKML for more information.
      Signed-off-by: default avatarIlya Loginov <isloginov@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Peter Horton <phorton@bitbox.co.uk>
      Cc: "Ed L. Cashin" <ecashin@coraid.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      2d4dc890
  6. 15 Nov, 2009 1 commit
  7. 06 Nov, 2009 1 commit
    • Isaku Yamahata's avatar
      ia64/xen: compilation fix · 5b5d9448
      Isaku Yamahata authored
      
      
      This patch fixes the following compilation error introduced by a PCI related
      features.
      
      The change set of 5dd1af9f84c79bedd589db89e71ca733f3bf0ebd moves some
      xen related definitions from the arch header file
      (x86/include/asm/xen/hypervisor.h) to the common header file
      (include/xen/xen.h).  So ia64/xen also follows it.
      
      In file included from linux-next/include/xen/grant_table.h:41,
                       from linux-next/drivers/block/xen-blkfront.c:48:
      linux-next/arch/ia64/include/asm/xen/hypervisor.h:43: error: nested redefinition of 'enum xen_domain_type'
      linux-next/arch/ia64/include/asm/xen/hypervisor.h:43: error: redeclaration of 'enum xen_domain_type'
      linux-next/arch/ia64/include/asm/xen/hypervisor.h:44: error: redeclaration of enumerator 'XEN_NATIVE'
      linux-next/include/xen/xen.h:5: error: previous definition of 'XEN_NATIVE' was here
      linux-next/arch/ia64/include/asm/xen/hypervisor.h:45: error: redeclaration of enumerator 'XEN_PV_DOMAIN'
      linux-next/include/xen/xen.h:6: error: previous definition of 'XEN_PV_DOMAIN' was here
      linux-next/arch/ia64/include/asm/xen/hypervisor.h:46: error: redeclaration of enumerator 'XEN_HVM_DOMAIN'
      linux-next/include/xen/xen.h:7: error: previous definition of 'XEN_HVM_DOMAIN' was here
      Signed-off-by: default avatarIsaku Yamahata <yamahata@valinux.co.jp>
      Signed-off-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      5b5d9448
  8. 02 Nov, 2009 1 commit
    • Tony Luck's avatar
      Revert "[IA64] fix percpu warnings" · e8c93fc7
      Tony Luck authored
      This reverts commit b94b0808
      
      .
      
      genksyms currently cannot handle complicated types for exported
      percpu variables.  Drop this patch for now as it prevents a
      module from being loaded on sn2 systems:
      
       xpc: no symbol version for per_cpu____sn_cnodeid_to_nasid
       xpc: Unknown symbol per_cpu____sn_cnodeid_to_nasid
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      e8c93fc7
  9. 15 Oct, 2009 1 commit
  10. 14 Oct, 2009 1 commit
  11. 13 Oct, 2009 1 commit
  12. 12 Oct, 2009 1 commit
    • Neil Horman's avatar
      net: Generalize socket rx gap / receive queue overflow cmsg · 3b885787
      Neil Horman authored
      Create a new socket level option to report number of queue overflows
      
      Recently I augmented the AF_PACKET protocol to report the number of frames lost
      on the socket receive queue between any two enqueued frames.  This value was
      exported via a SOL_PACKET level cmsg.  AFter I completed that work it was
      requested that this feature be generalized so that any datagram oriented socket
      could make use of this option.  As such I've created this patch, It creates a
      new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a
      SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue
      overflowed between any two given frames.  It also augments the AF_PACKET
      protocol to take advantage of this new feature (as it previously did not touch
      sk->sk_drops, which this patch uses to record the overflow count).  Tested
      successfully by me.
      
      Notes:
      
      1) Unlike my previous patch, this patch simply records the sk_drops value, which
      is not a number of drops between packets, but rather a total number of drops.
      Deltas must be computed in user space.
      
      2) While this patch currently works with datagram oriented protocols, it will
      also be accepted by non-datagram oriented protocols. I'm not sure if thats
      agreeable to everyone, but my argument in favor of doing so is that, for those
      protocols which aren't applicable to this option, sk_drops will always be zero,
      and reporting no drops on a receive queue that isn't used for those
      non-participating protocols seems reasonable to me.  This also saves us having
      to code in a per-protocol opt in mechanism.
      
      3) This applies cleanly to net-next assuming that commit
      97775007
      
       (my af packet cmsg patch) is reverted
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b885787
  13. 07 Oct, 2009 1 commit
    • Tony Luck's avatar
      [IA64] Squeeze ticket locks back into 4 bytes. · 9d40ee20
      Tony Luck authored
      
      
      Linus pointed out that other people have spent large amounts of time
      and effort to optimize the layout of frequently used structures. Often
      these have embedded locks, and the assumption is that a lock takes
      4 bytes.  Linus also pointed out how to work with the limited options
      for atomic instructions on Itanium.
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      9d40ee20
  14. 27 Sep, 2009 1 commit
  15. 25 Sep, 2009 1 commit
    • Tony Luck's avatar
      [IA64] implement ticket locks for Itanium · 2c86963b
      Tony Luck authored
      Back in January 2008 Nick Piggin implemented "ticket" spinlocks
      for X86 (See commit 314cdbef
      
      ).
      
      IA64 implementation has a couple of differences because of the
      available atomic operations ... e.g. we have no fetchadd2 instruction
      that operates on a 16-bit quantity so we make ticket locks use
      a 32-bit word for each of the current ticket and now-serving values.
      
      Performance on uncontended locks is about 8% worse than the previous
      implementation, but this seems a good trade for determinism in the
      contended case. Performance impact on macro-level benchmarks is in
      the noise.
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      2c86963b
  16. 24 Sep, 2009 3 commits
  17. 22 Sep, 2009 2 commits
  18. 16 Sep, 2009 1 commit
    • Peter Zijlstra's avatar
      sched: Disable wakeup balancing · 182a85f8
      Peter Zijlstra authored
      
      
      Sysbench thinks SD_BALANCE_WAKE is too agressive and kbuild doesn't
      really mind too much, SD_BALANCE_NEWIDLE picks up most of the
      slack.
      
      On a dual socket, quad core, dual thread nehalem system:
      
      sysbench (--num_threads=16):
      
       SD_BALANCE_WAKE-: 13982 tx/s
       SD_BALANCE_WAKE+: 15688 tx/s
      
      kbuild (-j16):
      
       SD_BALANCE_WAKE-: 47.648295846  seconds time elapsed   ( +-   0.312% )
       SD_BALANCE_WAKE+: 47.608607360  seconds time elapsed   ( +-   0.026% )
      
      (same within noise)
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      182a85f8
  19. 15 Sep, 2009 4 commits
    • Peter Zijlstra's avatar
      sched: Reduce forkexec_idx · b8a543ea
      Peter Zijlstra authored
      
      
      If we're looking to place a new task, we might as well find the
      idlest position _now_, not 1 tick ago.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      b8a543ea
    • Mike Galbraith's avatar
      sched: Improve latencies and throughput · 0ec9fab3
      Mike Galbraith authored
      
      
      Make the idle balancer more agressive, to improve a
      x264 encoding workload provided by Jason Garrett-Glaser:
      
       NEXT_BUDDY NO_LB_BIAS
       encoded 600 frames, 252.82 fps, 22096.60 kb/s
       encoded 600 frames, 250.69 fps, 22096.60 kb/s
       encoded 600 frames, 245.76 fps, 22096.60 kb/s
      
       NO_NEXT_BUDDY LB_BIAS
       encoded 600 frames, 344.44 fps, 22096.60 kb/s
       encoded 600 frames, 346.66 fps, 22096.60 kb/s
       encoded 600 frames, 352.59 fps, 22096.60 kb/s
      
       NO_NEXT_BUDDY NO_LB_BIAS
       encoded 600 frames, 425.75 fps, 22096.60 kb/s
       encoded 600 frames, 425.45 fps, 22096.60 kb/s
       encoded 600 frames, 422.49 fps, 22096.60 kb/s
      
      Peter pointed out that this is better done via newidle_idx,
      not via LB_BIAS, newidle balancing should look for where
      there is load _now_, not where there was load 2 ticks ago.
      
      Worst-case latencies are improved as well as no buddies
      means less vruntime spread. (as per prior lkml discussions)
      
      This change improves kbuild-peak parallelism as well.
      Reported-by: default avatarJason Garrett-Glaser <darkshikari@gmail.com>
      Signed-off-by: default avatarMike Galbraith <efault@gmx.de>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1253011667.9128.16.camel@marge.simson.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      0ec9fab3
    • Peter Zijlstra's avatar
      sched: Tweak wake_idx · 78e7ed53
      Peter Zijlstra authored
      
      
      When merging select_task_rq_fair() and sched_balance_self() we lost
      the use of wake_idx, restore that and set them to 0 to make wake
      balancing more aggressive.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      78e7ed53
    • Peter Zijlstra's avatar
      sched: Merge select_task_rq_fair() and sched_balance_self() · c88d5910
      Peter Zijlstra authored
      
      
      The problem with wake_idle() is that is doesn't respect things like
      cpu_power, which means it doesn't deal well with SMT nor the recent
      RT interaction.
      
      To cure this, it needs to do what sched_balance_self() does, which
      leads to the possibility of merging select_task_rq_fair() and
      sched_balance_self().
      
      Modify sched_balance_self() to:
      
        - update_shares() when walking up the domain tree,
          (it only called it for the top domain, but it should
           have done this anyway), which allows us to remove
          this ugly bit from try_to_wake_up().
      
        - do wake_affine() on the smallest domain that contains
          both this (the waking) and the prev (the wakee) cpu for
          WAKE invocations.
      
      Then use the top-down balance steps it had to replace wake_idle().
      
      This leads to the dissapearance of SD_WAKE_BALANCE and
      SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE.
      
      SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective.
      
      Touch all topology bits to replace the old with new SD flags --
      platforms might need re-tuning, enabling SD_BALANCE_WAKE
      conditionally on a NUMA distance seems like a good additional
      feature, magny-core and small nehalem systems would want this
      enabled, systems with slow interconnects would not.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      c88d5910
  20. 14 Sep, 2009 2 commits
    • Hidetoshi Seto's avatar
      [IA64] kexec: Make INIT safe while transition to · 07a6a4ae
      Hidetoshi Seto authored
      
      
      kdump/kexec kernel
      
      Summary:
      
        Asserting INIT on the beginning of kdump/kexec kernel will result
        in unexpected behavior because INIT handler for previous kernel is
        invoked on new kernel.
      
      Description:
      
        In panic situation, we can receive INIT while kernel transition,
        i.e. from beginning of panic to bootstrap of kdump kernel.
        Since we initialize registers on leave from current kernel, no
        longer monarch/slave handlers of current kernel in virtual mode are
        called safely.  (In fact system goes hang as far as I confirmed)
      
      How to Reproduce:
      
        Start kdump
          # echo c > /proc/sysrq-trigger
        Then assert INIT while kdump kernel is booting, before new INIT
        handler for kdump kernel is registered.
      
      Expected(Desirable) result:
      
        kdump kernel boots without any problem, crashdump retrieved
      
      Actual result:
      
        INIT handler for previous kernel is invoked on kdump kernel
        => panic, hang etc. (unexpected)
      
      Proposed fix:
      
        We can unregister these init handlers from SAL before jumping into
        new kernel, however then the INIT will fallback to default behavior,
        result in warmboot by SAL (according to the SAL specification) and
        we cannot retrieve the crashdump.
      
        Therefore this patch introduces a NOP init handler and register it
        to SAL before leave from current kernel, to start kdump safely by
        preventing INITs from entering virtual mode and resulting in warmboot.
      
        On the other hand, in case of kexec that not for kdump, it also
        has same problem with INIT while kernel transition.
        This patch handles this case differently, because for kexec
        unregistering handlers will be preferred than registering NOP
        handler, since the situation "no handlers registered" is usual
        state for kernel's entry.
      Signed-off-by: default avatarHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Haren Myneni <hbabu@us.ibm.com>
      Cc: kexec@lists.infradead.org
      Acked-by: default avatarFenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      07a6a4ae
    • Hidetoshi Seto's avatar
      [IA64] kdump: Mask MCA/INIT on frozen cpus · 4295ab34
      Hidetoshi Seto authored
      
      
      Summary:
      
        INIT asserted on kdump kernel invokes INIT handler not only on a
        cpu that running on the kdump kernel, but also BSP of the panicked
        kernel, because the (badly) frozen BSP can be thawed by INIT.
      
      Description:
      
        The kdump_cpu_freeze() is called on cpus except one that initiates
        panic and/or kdump, to stop/offline the cpu (on ia64, it means we
        pass control of cpus to SAL, or put them in spinloop).  Note that
        CPU0(BSP) always go to spinloop, so if panic was happened on an AP,
        there are at least 2cpus (= the AP and BSP) which not back to SAL.
      
        On the spinning cpus, interrupts are disabled (rsm psr.i), but INIT
        is still interruptible because psr.mc for mask them is not set unless
        kdump_cpu_freeze() is not called from MCA/INIT context.
      
        Therefore, assume that a panic was happened on an AP, kdump was
        invoked, new INIT handlers for kdump kernel was registered and then
        an INIT is asserted.  From the viewpoint of SAL, there are 2 online
        cpus, so INIT will be delivered to both of them.  It likely means
        that not only the AP (= a cpu executing kdump) enters INIT handler
        which is newly registered, but also BSP (= another cpu spinning in
        panicked kernel) enters the same INIT handler.  Of course setting of
        registers in BSP are still old (for panicked kernel), so what happen
        with running handler with wrong setting will be extremely unexpected.
        I believe this is not desirable behavior.
      
      How to Reproduce:
      
        Start kdump on one of APs (e.g. cpu1)
          # taskset 0x2 echo c > /proc/sysrq-trigger
        Then assert INIT after kdump kernel is booted, after new INIT handler
        for kdump kernel is registered.
      
      Expected results:
      
        An INIT handler is invoked only on the AP.
      
      Actual results:
      
        An INIT handler is invoked on the AP and BSP.
      
      Sample of results:
      
        I got following console log by asserting INIT after prompt "root:/>".
        It seems that two monarchs appeared by one INIT, and one panicked at
        last.  And it also seems that the panicked one supposed there were
        4 online cpus and no one did rendezvous:
      
          :
          [  0 %]dropping to initramfs shell
          exiting this shell will reboot your system
          root:/> Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=0
          ia64_init_handler: Promoting cpu 0 to monarch.
          Delaying for 5 seconds...
          All OS INIT slaves have reached rendezvous
          Processes interrupted by INIT - 0 (cpu 0 task 0xa000000100af0000)
          :
          <<snip>>
          :
          Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=1
          Delaying for 5 seconds...
          mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fail.
          OS INIT slave did not rendezvous on cpu 1 2 3
          INIT swapper 0[0]: bugcheck! 0 [1]
          :
          <<snip>>
          :
          Kernel panic - not syncing: Attempted to kill the idle task!
      
      Proposed fix:
      
        To avoid this problem, this patch inserts ia64_set_psr_mc() to mask
        INIT on cpus going to be frozen.  This masking have no effect if the
        kdump_cpu_freeze() is called from INIT handler when kdump_on_init == 1,
        because psr.mc is already turned on to 1 before entering OS_INIT.
        I confirmed that weird log like above are disappeared after applying
        this patch.
      Signed-off-by: default avatarHidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Haren Myneni <hbabu@us.ibm.com>
      Cc: kexec@lists.infradead.org
      Acked-by: default avatarFenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      4295ab34
  21. 10 Sep, 2009 3 commits
  22. 09 Sep, 2009 1 commit
  23. 11 Aug, 2009 2 commits
  24. 10 Aug, 2009 1 commit
  25. 05 Aug, 2009 2 commits
    • Jan Engelhardt's avatar
      net: implement a SO_DOMAIN getsockoption · 0d6038ee
      Jan Engelhardt authored
      
      
      This sockopt goes in line with SO_TYPE and SO_PROTOCOL. It makes it
      possible for userspace programs to pass around file descriptors — I
      am referring to arguments-to-functions, but it may even work for the
      fd passing over UNIX sockets — without needing to also pass the
      auxiliary information (PF_INET6/IPPROTO_TCP).
      Signed-off-by: default avatarJan Engelhardt <jengelh@medozas.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0d6038ee
    • Jan Engelhardt's avatar
      net: implement a SO_PROTOCOL getsockoption · 49c794e9
      Jan Engelhardt authored
      
      
      Similar to SO_TYPE returning the socket type, SO_PROTOCOL allows to
      retrieve the protocol used with a given socket.
      
      I am not quite sure why we have that-many copies of socket.h, and why
      the values are not the same on all arches either, but for where hex
      numbers dominate, I use 0x1029 for SO_PROTOCOL as that seems to be
      the next free unused number across a bunch of operating systems, or
      so Google results make me want to believe. SO_PROTOCOL for others
      just uses the next free Linux number, 38.
      Signed-off-by: default avatarJan Engelhardt <jengelh@medozas.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      49c794e9
  26. 03 Aug, 2009 2 commits
  27. 28 Jul, 2009 1 commit