1. 01 Nov, 2018 3 commits
    • Philippe Gerum's avatar
      PM: ipipe: converge to Dovetail's CPUIDLE management · cb5702e0
      Philippe Gerum authored
      Handle requests for transitioning to deeper C-states the way Dovetail
      does, which prevents us from losing the timer when grabbed by a
      co-kernel, in presence of a CPUIDLE driver.
      cb5702e0
    • Philippe Gerum's avatar
      ipipe: add cpuidle control interface · caf90a26
      Philippe Gerum authored
      Add a kernel interface for sharing CPU idling control between the host
      kernel and a co-kernel. The former invokes ipipe_cpuidle_control()
      which the latter should implement, for determining whether entering a
      sleep state is ok. This hook should return boolean true if so.
      
      The co-kernel may veto such entry if need be, in order to prevent
      latency spikes, as exiting sleep states might be costly depending on
      the CPU idling operation being used.
      caf90a26
    • Philippe Gerum's avatar
      genirq: add generic I-pipe core · d9f057db
      Philippe Gerum authored
      This commit provides the arch-independent bits for implementing the
      interrupt pipeline core, a lightweight layer introducing a separate,
      high-priority execution stage for handling all IRQs in pseudo-NMI
      mode, which cannot be delayed by the regular kernel code. See
      Documentation/ipipe.rst for details about interrupt pipelining.
      
      Architectures which support interrupt pipelining should select
      HAVE_IPIPE_SUPPORT, along with implementing the required arch-specific
      code. In such a case, CONFIG_IPIPE becomes available to the user via
      the Kconfig interface for enabling the feature.
      d9f057db
  2. 18 Oct, 2018 1 commit
  3. 04 Oct, 2018 1 commit
  4. 19 Sep, 2018 2 commits
    • Eric Dumazet's avatar
      inet: frags: break the 2GB limit for frags storage · 990204dd
      Eric Dumazet authored
      
      
      Some users are willing to provision huge amounts of memory to be able
      to perform reassembly reasonnably well under pressure.
      
      Current memory tracking is using one atomic_t and integers.
      
      Switch to atomic_long_t so that 64bit arches can use more than 2GB,
      without any cost for 32bit arches.
      
      Note that this patch avoids an overflow error, if high_thresh was set
      to ~2GB, since this test in inet_frag_alloc() was never true :
      
      if (... || frag_mem_limit(nf) > nf->high_thresh)
      
      Tested:
      
      $ echo 16000000000 >/proc/sys/net/ipv4/ipfrag_high_thresh
      
      <frag DDOS>
      
      $ grep FRAG /proc/net/sockstat
      FRAG: inuse 14705885 memory 16000002880
      
      $ nstat -n ; sleep 1 ; nstat | grep Reas
      IpReasmReqds                    3317150            0.0
      IpReasmFails                    3317112            0.0
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (cherry picked from commit 3e67f106
      
      )
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      990204dd
    • Eric Dumazet's avatar
      inet: frags: use rhashtables for reassembly units · 9aee41ef
      Eric Dumazet authored
      
      
      Some applications still rely on IP fragmentation, and to be fair linux
      reassembly unit is not working under any serious load.
      
      It uses static hash tables of 1024 buckets, and up to 128 items per bucket (!!!)
      
      A work queue is supposed to garbage collect items when host is under memory
      pressure, and doing a hash rebuild, changing seed used in hash computations.
      
      This work queue blocks softirqs for up to 25 ms when doing a hash rebuild,
      occurring every 5 seconds if host is under fire.
      
      Then there is the problem of sharing this hash table for all netns.
      
      It is time to switch to rhashtables, and allocate one of them per netns
      to speedup netns dismantle, since this is a critical metric these days.
      
      Lookup is now using RCU. A followup patch will even remove
      the refcount hold/release left from prior implementation and save
      a couple of atomic operations.
      
      Before this patch, 16 cpus (16 RX queue NIC) could not handle more
      than 1 Mpps frags DDOS.
      
      After the patch, I reach 9 Mpps without any tuning, and can use up to 2GB
      of storage for the fragments (exact number depends on frags being evicted
      after timeout)
      
      $ grep FRAG /proc/net/sockstat
      FRAG: inuse 1966916 memory 2140004608
      
      A followup patch will change the limits for 64bit arches.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Alexander Aring <alex.aring@gmail.com>
      Cc: Stefan Schmidt <stefan@osg.samsung.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      (cherry picked from commit 648700f7
      
      )
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9aee41ef
  5. 17 Aug, 2018 1 commit
  6. 15 Aug, 2018 10 commits
  7. 03 Aug, 2018 4 commits
  8. 22 Jul, 2018 1 commit
  9. 17 Jul, 2018 1 commit
  10. 08 Jul, 2018 1 commit
  11. 03 Jul, 2018 2 commits
    • Vaibhav Jain's avatar
      cxl: Disable prefault_mode in Radix mode · c9debbd1
      Vaibhav Jain authored
      commit b6c84ba2 upstream.
      
      Currently we see a kernel-oops reported on Power-9 while attaching a
      context to an AFU, with radix-mode and sysfs attr 'prefault_mode' set
      to anything other than 'none'. The backtrace of the oops is of this
      form:
      
        Unable to handle kernel paging request for data at address 0x00000080
        Faulting instruction address: 0xc00800000bcf3b20
        cpu 0x1: Vector: 300 (Data Access) at [c00000037f003800]
            pc: c00800000bcf3b20: cxl_load_segment+0x178/0x290 [cxl]
            lr: c00800000bcf39f0: cxl_load_segment+0x48/0x290 [cxl]
            sp: c00000037f003a80
           msr: 9000000000009033
           dar: 80
         dsisr: 40000000
          current = 0xc00000037f280000
          paca    = 0xc0000003ffffe600   softe: 3        irq_happened: 0x01
            pid   = 3529, comm = afp_no_int
        <snip>
        cxl_prefault+0xfc/0x248 [cxl]
        process_element_entry_psl9+0xd8/0x1a0 [cxl]
        cxl_attach_dedicated_process_psl9+0x44/0x130 [cxl]
        native_attach_process+0xc0/0x130 [cxl]
        afu_ioctl+0x3f4/0x5e0 [cxl]
        do_vfs_ioctl+0xdc/0x890
        ksys_ioctl+0x68/0xf0
        sys_ioctl+0x40/0xa0
        system_call+0x58/0x6c
      
      The issue is caused as on Power-8 the AFU attr 'prefault_mode' was
      used to improve initial storage fault performance by prefaulting
      process segments. However on Power-9 with radix mode we don't have
      Storage-Segments that we can prefault. Also prefaulting process Pages
      will be too costly and fine-grained.
      
      Hence, since the prefaulting mechanism doesn't makes sense of
      radix-mode, this patch updates prefault_mode_store() to not allow any
      other value apart from CXL_PREFAULT_NONE when radix mode is enabled.
      
      Fixes: f24be42a
      
       ("cxl: Add psl9 specific code")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: default avatarVaibhav Jain <vaibhav@linux.ibm.com>
      Acked-by: default avatarFrederic Barrat <fbarrat@linux.vnet.ibm.com>
      Acked-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c9debbd1
    • Geert Uytterhoeven's avatar
      lib/vsprintf: Remove atomic-unsafe support for %pCr · ea0ac01f
      Geert Uytterhoeven authored
      commit 666902e4
      
       upstream.
      
      "%pCr" formats the current rate of a clock, and calls clk_get_rate().
      The latter obtains a mutex, hence it must not be called from atomic
      context.
      
      Remove support for this rarely-used format, as vsprintf() (and e.g.
      printk()) must be callable from any context.
      
      Any remaining out-of-tree users will start seeing the clock's name
      printed instead of its rate.
      Reported-by: default avatarJia-Ju Bai <baijiaju1990@gmail.com>
      Fixes: 900cca29 ("lib/vsprintf: add %pC{,n,r} format specifiers for clocks")
      Link: http://lkml.kernel.org/r/1527845302-12159-5-git-send-email-geert+renesas@glider.be
      
      
      To: Jia-Ju Bai <baijiaju1990@gmail.com>
      To: Jonathan Corbet <corbet@lwn.net>
      To: Michael Turquette <mturquette@baylibre.com>
      To: Stephen Boyd <sboyd@kernel.org>
      To: Zhang Rui <rui.zhang@intel.com>
      To: Eduardo Valentin <edubezval@gmail.com>
      To: Eric Anholt <eric@anholt.net>
      To: Stefan Wahren <stefan.wahren@i2se.com>
      To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: linux-doc@vger.kernel.org
      Cc: linux-clk@vger.kernel.org
      Cc: linux-pm@vger.kernel.org
      Cc: linux-serial@vger.kernel.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-renesas-soc@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: stable@vger.kernel.org # 4.1+
      Signed-off-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ea0ac01f
  12. 20 Jun, 2018 6 commits
  13. 11 Jun, 2018 1 commit
  14. 30 May, 2018 3 commits
  15. 22 May, 2018 3 commits