1. 01 Nov, 2018 2 commits
      PM: ipipe: converge to Dovetail's CPUIDLE management · cb5702e0
      Philippe Gerum authored
      Handle requests for transitioning to deeper C-states the way Dovetail
      does, which prevents us from losing the timer when it is grabbed by a
      co-kernel in the presence of a CPUIDLE driver.
      ipipe: add cpuidle control interface · caf90a26
      Philippe Gerum authored
      Add a kernel interface for sharing CPU idling control between the host
      kernel and a co-kernel. The former invokes ipipe_cpuidle_control()
      which the latter should implement, for determining whether entering a
      sleep state is ok. This hook should return boolean true if so.
      
      The co-kernel may veto such entry if need be, in order to prevent
      latency spikes, as exiting sleep states might be costly depending on
      the CPU idling operation being used.
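A minimal sketch of the veto logic a co-kernel might put behind such a hook, written as standalone C so it compiles outside the kernel. The struct, the latency budget variable and the function name are hypothetical; the commit message does not show the real ipipe_cpuidle_control() prototype, which may differ.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's idle state descriptor. */
struct sketch_idle_state {
    const char *name;
    unsigned int exit_latency_us;   /* worst-case cost of leaving the state */
};

/* Hypothetical co-kernel tunable: largest wakeup latency it tolerates. */
static unsigned int cokernel_latency_budget_us = 20;

/*
 * Return true if entering @state is acceptable to the co-kernel,
 * false to veto the transition.
 */
static bool sketch_cpuidle_control(const struct sketch_idle_state *state)
{
    return state->exit_latency_us <= cokernel_latency_budget_us;
}
```

With a 20 us budget, a shallow state with a 2 us exit latency is accepted while a deep state with a 200 us exit latency is vetoed.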
  2. 25 Dec, 2017 1 commit
  3. 10 Aug, 2017 1 commit
  4. 15 May, 2017 1 commit
  5. 01 May, 2017 1 commit
  6. 02 Mar, 2017 1 commit
  7. 06 Dec, 2016 1 commit
  8. 29 Nov, 2016 1 commit
  9. 04 Jul, 2016 1 commit
      cpuidle: Fix last_residency division · dbd1b8ea
      Shreyas B. Prabhu authored
      Snooze is a poll idle state on powernv and pseries platforms. Snooze
      has a timeout so that if a CPU stays in snooze for longer than the
      target residency of the next available idle state, it exits, giving
      the cpuidle governor a chance to re-evaluate and promote the CPU to a
      deeper idle state. Therefore, whenever snooze exits due to this
      timeout, its last_residency will be the target_residency of the next
      deeper state.
      
      Commit e93e59ce "cpuidle: Replace ktime_get() with local_clock()"
      changed the math around last_residency calculation. Specifically,
      while converting last_residency value from nano- to microseconds, it
      carries out right shift by 10. Because of that, in snooze timeout
      exit scenarios last_residency calculated is roughly 2.3% less than
      target_residency of the next available state. This pattern is picked
      up by get_typical_interval() in the menu governor and therefore
      expected_interval in menu_select() is frequently less than the
      target_residency of any state other than snooze.
      
      As a result, we enter snooze at a higher rate, which hurts
      single-thread performance.
      
      Fix this by using more precise division via ktime_us_delta().
      
      Fixes: e93e59ce "cpuidle: Replace ktime_get() with local_clock()"
      Reported-by: Anton Blanchard <anton@samba.org>
      Bisected-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
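The effect described above can be reproduced with a few lines of standalone C. This is only a sketch: the 100 us target residency is an assumed example value, plain division stands in for ktime_us_delta(), and fits_deeper_state() is a hypothetical simplification of the governor's comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Conversion used after e93e59ce: divide by 1024 via a right shift. */
static uint64_t residency_us_shift(uint64_t ns)
{
    return ns >> 10;
}

/* Exact conversion; plain division stands in for ktime_us_delta(). */
static uint64_t residency_us_exact(uint64_t ns)
{
    return ns / 1000;
}

/* Hypothetical governor check: is the interval long enough for the deeper state? */
static bool fits_deeper_state(uint64_t interval_us, uint64_t target_residency_us)
{
    return interval_us >= target_residency_us;
}
```

For a snooze timeout of exactly 100000 ns against a 100 us target residency, the shifted value comes out at 97 us and fails the check, while the exact value is 100 us and passes.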
  10. 18 May, 2016 1 commit
      cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter() · e7387da5
      Daniel Lezcano authored
      Commit 0b89e9aa (cpuidle: delay enabling interrupts until all
      coupled CPUs leave idle) rightfully fixed a regression by letting
      the coupled idle state framework handle local interrupt enabling
      when the CPU is exiting an idle state.

      The current code checks if the idle state is coupled and, if so, it
      lets the coupled code enable interrupts. This way, it can
      decrement the ready-count before handling the interrupt. This
      mechanism prevents the other CPUs from waiting for a CPU which is
      handling interrupts.
      
      But the check is done against the state index returned by the back
      end driver's ->enter functions which could be different from the
      initial index passed as parameter to the cpuidle_enter_state()
      function.
      
       entered_state = target_state->enter(dev, drv, index);
      
       [ ... ]
      
       if (!cpuidle_state_is_coupled(drv, entered_state))
      	local_irq_enable();
      
       [ ... ]
      
      If the 'index' is referring to a coupled idle state but the
      'entered_state' is *not* coupled, then the interrupts are enabled
      again. All CPUs blocked on the sync barrier may busy loop longer
      if the CPU has interrupts to handle before decrementing the
      ready-count. That consumes more energy than it saves.
      
      Fixes: 0b89e9aa (cpuidle: delay enabling interrupts until all coupled CPUs leave idle)
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
      [ rjw: Subject & changelog ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
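The difference between the two checks can be illustrated with a small standalone model. The state table and function names are hypothetical, and the "after" variant assumes, as the subject line suggests, that the fix checks the requested index rather than the index returned by the driver.

```c
#include <stdbool.h>

#define SKETCH_NR_STATES 3

/* Hypothetical per-state flags: only the deepest state is coupled. */
static const bool sketch_coupled[SKETCH_NR_STATES] = { false, false, true };

static bool sketch_state_is_coupled(int idx)
{
    return idx >= 0 && idx < SKETCH_NR_STATES && sketch_coupled[idx];
}

/* Before the fix: decide from the state the driver reports having entered. */
static bool core_enables_irqs_before_fix(int index, int entered_state)
{
    (void)index;
    return !sketch_state_is_coupled(entered_state);
}

/* After the fix (assumed): decide from the state that was requested. */
static bool core_enables_irqs_after_fix(int index, int entered_state)
{
    (void)entered_state;
    return !sketch_state_is_coupled(index);
}
```

When coupled state 2 is requested but the driver reports state 1, the first variant enables interrupts in the core, which is the problem described above; the second leaves interrupt enabling to the coupled code.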
  11. 26 Apr, 2016 1 commit
      cpuidle: Replace ktime_get() with local_clock() · e93e59ce
      Daniel Lezcano authored
      
      
      ktime_get() can have a non-negligible overhead; use local_clock()
      instead.
      
      In order to test the difference between ktime_get() and local_clock(),
      a quick hack was added to trigger, via debugfs, 10000 calls to each of
      ktime_get() and local_clock() and measure the elapsed time.

      Then the average, minimum and maximum are computed for each call.

      From userspace, the test above was run 100 times, once every 2 seconds.

      So ktime_get() and local_clock() have each been called 1000000 times in
      total.
      
      The results are:
      
      ktime_get():
      ============
       * average: 101 ns (stddev: 27.4)
       * maximum: 38313 ns
       * minimum: 65 ns
      
      local_clock():
      ==============
       * average: 60 ns (stddev: 9.8)
       * maximum: 13487 ns
       * minimum: 46 ns
      
      local_clock() is faster and more stable.

      Even if it is a drop in the ocean, replacing ktime_get() with
      local_clock() saves 80 ns per idle period (entry + exit). And
      in some circumstances, especially when there are several CPUs racing
      for the clock access, we save tens of microseconds.
      
      The idle duration resulting from a diff is converted from nanoseconds to
      microseconds. This could be done with integer division (div 1000), which
      is an expensive operation, or by a 10-bit shift (div 1024), which is fast
      but imprecise.
      
      The following table gives some results at the limits.
      
       ------------------------------------------
      |   nsec   |   div(1000)   |   div(1024)   |
       ------------------------------------------
      |   1e3    |        1 usec |      976 nsec |
       ------------------------------------------
      |   1e6    |     1000 usec |      976 usec |
       ------------------------------------------
      |   1e9    |  1000000 usec |   976562 usec |
       ------------------------------------------
      
      There is a linear deviation of 2.34%. This loss of precision is acceptable
      in the context of the resulting diff, which is used for statistics. These
      are processed to estimate the duration of the next idle period, which
      feeds into the idle state selection. The selection criterion takes into
      account the next duration based on large intervals, represented by the
      idle state's target residency.

      The 2^10 division is enough because its error relative to the 1e3
      division is lost in all the approximations done for the next idle duration
      computation.
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      [ rjw: Subject ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
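The table rows and the quoted deviation can be checked with integer arithmetic alone. The helper names below are illustrative and not taken from the kernel source.

```c
#include <stdint.h>

/* Nanoseconds to microseconds by exact integer division. */
static uint64_t us_by_div(uint64_t ns)
{
    return ns / 1000;
}

/* Nanoseconds to "microseconds" by a 10-bit right shift (divide by 1024). */
static uint64_t us_by_shift(uint64_t ns)
{
    return ns >> 10;
}

/*
 * Shortfall of the shifted result relative to the exact one,
 * in hundredths of a percent (234 means 2.34%).
 */
static uint64_t shortfall_hundredths_pct(uint64_t ns)
{
    uint64_t exact = us_by_div(ns);

    return exact ? (exact - us_by_shift(ns)) * 10000 / exact : 0;
}
```

Note that for 1e3 ns the shift yields 0 whole microseconds; the 976 nsec in the first table row is the fractional result 1000/1024 expressed in nanoseconds.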
  12. 09 Apr, 2016 1 commit
  13. 22 Jan, 2016 1 commit
  14. 19 Jan, 2016 1 commit
  15. 28 Aug, 2015 1 commit
  16. 21 Jul, 2015 1 commit
  17. 09 Jul, 2015 1 commit
  18. 30 May, 2015 1 commit
      cpuidle: Do not use CPUIDLE_DRIVER_STATE_START in cpuidle.c · 7d51d979
      Rafael J. Wysocki authored
      
      
      The CPUIDLE_DRIVER_STATE_START symbol is defined as 1 only if
      CONFIG_ARCH_HAS_CPU_RELAX is set, otherwise it is defined as 0.
      However, if CONFIG_ARCH_HAS_CPU_RELAX is set, the first (index 0)
      entry in the cpuidle driver's table of states is overwritten with
      the default "poll" entry by the core.  The "state" defined by the
      "poll" entry doesn't provide ->enter_dead and ->enter_freeze
      callbacks and its exit_latency is 0.
      
      For this reason, it is not necessary to use CPUIDLE_DRIVER_STATE_START
      in cpuidle_play_dead() (->enter_dead is NULL, so the "poll state"
      will be skipped by the loop).
      
      It is also arguably pointless to return states with exit_latency
      equal to 0 from find_deepest_state(), so the function can be modified
      to start the loop from index 0 and the "poll state" will be skipped by
      it as a result of the check against latency_req.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
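A simplified standalone model of the loop shows why starting at index 0 is harmless. The struct and function names are hypothetical, and the real find_deepest_state() checks more conditions than this sketch does.

```c
#include <stddef.h>

/* Hypothetical, reduced idle state descriptor. */
struct sketch_state {
    unsigned int exit_latency;  /* microseconds */
    int disabled;
};

/*
 * Return the index of the last enabled state whose exit latency exceeds
 * every latency accepted before it, or -1 if none qualifies. Because
 * latency_req starts at 0 and the test is "<=", a state with
 * exit_latency == 0 (such as a poll state at index 0) is never picked.
 */
static int sketch_find_deepest_state(const struct sketch_state *states,
                                     size_t count)
{
    unsigned int latency_req = 0;
    int ret = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (states[i].disabled || states[i].exit_latency <= latency_req)
            continue;
        latency_req = states[i].exit_latency;
        ret = (int)i;
    }
    return ret;
}
```

Given a table of a poll state (latency 0), a 10 us state, a disabled 100 us state and a 50 us state, the function returns index 3; a table containing only the poll state yields -1.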
  19. 19 May, 2015 1 commit
  20. 14 May, 2015 3 commits
  21. 09 May, 2015 1 commit
  22. 04 May, 2015 1 commit
  23. 29 Apr, 2015 1 commit
  24. 03 Apr, 2015 1 commit
  25. 05 Mar, 2015 1 commit
  26. 28 Feb, 2015 2 commits
  27. 15 Feb, 2015 1 commit
      PM / sleep: Make it possible to quiesce timers during suspend-to-idle · 124cf911
      Rafael J. Wysocki authored
      
      
      The efficiency of suspend-to-idle depends on being able to keep CPUs
      in the deepest available idle states for as much time as possible.
      Ideally, they should only be brought out of idle by system wakeup
      interrupts.
      
      However, timer interrupts occurring periodically prevent that from
      happening and it is not practical to chase all of the "misbehaving"
      timers in a whack-a-mole fashion.  A much more effective approach is
      to suspend the local ticks for all CPUs and the entire timekeeping
      along the lines of what is done during full suspend, which also
      helps to keep suspend-to-idle and full suspend reasonably similar.
      
      The idea is to suspend the local tick on each CPU executing
      cpuidle_enter_freeze() and to make the last of them suspend the
      entire timekeeping.  That should prevent timer interrupts from
      triggering until an IO interrupt wakes up one of the CPUs.  It
      needs to be done with interrupts disabled on all of the CPUs,
      though, because otherwise the suspended clocksource might be
      accessed by an interrupt handler which might lead to fatal
      consequences.
      
      Unfortunately, the existing ->enter callbacks provided by cpuidle
      drivers generally cannot be used for implementing that, because some
      of them re-enable interrupts temporarily and some idle entry methods
      cause interrupts to be re-enabled automatically on exit.  Also some
      of these callbacks manipulate local clock event devices of the CPUs
      which really shouldn't be done after suspending their ticks.
      
      To overcome that difficulty, introduce a new cpuidle state callback,
      ->enter_freeze, that will be guaranteed (1) to keep interrupts
      disabled all the time (and return with interrupts disabled) and (2)
      not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
      look for the deepest available idle state with ->enter_freeze present
      and to make the CPU execute that callback with suspended tick (and the
      last of the online CPUs to execute it with suspended timekeeping).
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
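The "last CPU suspends timekeeping" idea can be modelled with a counter. This is a sketch under stated assumptions: the struct and function names are hypothetical, locking is omitted (the kernel would serialize these updates), and the symmetric resume path is an assumption not spelled out in the commit message.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping shared by all CPUs. */
struct sketch_freeze {
    unsigned int online_cpus;
    unsigned int frozen_cpus;
    bool timekeeping_suspended;
};

/* Called by each CPU, interrupts off, before running ->enter_freeze. */
static void sketch_tick_freeze(struct sketch_freeze *f)
{
    f->frozen_cpus++;
    if (f->frozen_cpus == f->online_cpus)
        f->timekeeping_suspended = true;    /* last CPU in */
}

/* Called by each CPU after ->enter_freeze returns (assumed symmetric). */
static void sketch_tick_unfreeze(struct sketch_freeze *f)
{
    if (f->frozen_cpus == f->online_cpus)
        f->timekeeping_suspended = false;   /* first CPU out */
    f->frozen_cpus--;
}
```

With two online CPUs, timekeeping stays running after the first call to sketch_tick_freeze() and is marked suspended only after the second.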
  28. 13 Feb, 2015 1 commit
      PM / sleep: Re-implement suspend-to-idle handling · 38106313
      Rafael J. Wysocki authored
      
      
      In preparation for adding support for quiescing timers in the final
      stage of suspend-to-idle transitions, rework the freeze_enter()
      function making the system wait on a wakeup event, the freeze_wake()
      function terminating the suspend-to-idle loop and the mechanism by
      which deep idle states are entered during suspend-to-idle.
      
      First of all, introduce a simple state machine for suspend-to-idle
      and make the code in question use it.
      
      Second, prevent freeze_enter() from losing wakeup events due to race
      conditions and ensure that the number of online CPUs won't change
      while it is being executed.  In addition to that, make it force
      all of the CPUs to re-enter the idle loop in case they are in idle
      states already (so they can enter deeper idle states if possible).
      
      Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
      checks in cpuidle_select() and cpuidle_reflect() with a single
      suspend-to-idle state check in cpuidle_idle_call().
      
      Finally, introduce cpuidle_enter_freeze() that will simply find the
      deepest idle state available to the given CPU and enter it using
      cpuidle_enter().
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
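A simple three-state machine of the kind described could look like the sketch below. The enum values and function names are illustrative rather than taken from the patch, and the locking and wait-queue handling a real implementation needs are left out.

```c
#include <stdbool.h>

/* Hypothetical suspend-to-idle states. */
enum sketch_s2idle_state { S2I_NONE, S2I_ENTER, S2I_WAKE };

static enum sketch_s2idle_state sketch_state = S2I_NONE;

/* Start suspend-to-idle: idle CPUs should now go for the deepest state. */
static void sketch_freeze_begin(void)
{
    sketch_state = S2I_ENTER;
}

/* Checked from the idle loop to pick the suspend-to-idle path. */
static bool sketch_idle_should_freeze(void)
{
    return sketch_state == S2I_ENTER;
}

/* Record a wakeup event; ignored when no suspend-to-idle is in progress. */
static bool sketch_freeze_wake(void)
{
    if (sketch_state == S2I_NONE)
        return false;
    sketch_state = S2I_WAKE;
    return true;
}

/* Leave suspend-to-idle once the wakeup has been handled. */
static void sketch_freeze_end(void)
{
    sketch_state = S2I_NONE;
}
```

A wakeup that arrives before sketch_freeze_begin() is rejected, while one that arrives afterwards moves the machine to S2I_WAKE so the idle loop stops taking the freeze path.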
  29. 24 Sep, 2014 1 commit
      sched: Let the scheduler see CPU idle states · 442bf3aa
      Daniel Lezcano authored
      
      
      When the cpu enters idle, it stores the cpuidle state pointer in its
      struct rq instance which in turn could be used to make a better decision
      when balancing tasks.
      
      As soon as the cpu exits its idle state, the struct rq reference is
      cleared.
      
      There are a couple of situations where the idle state pointer could be changed
      while it is being consulted:
      
      1. For x86/acpi with dynamic c-states, when a laptop switches from battery
         to AC that could result in removing the deeper idle state. The acpi driver
         triggers:
      	'acpi_processor_cst_has_changed'
      		'cpuidle_pause_and_lock'
      			'cpuidle_uninstall_idle_handler'
      				'kick_all_cpus_sync'.
      
      All cpus will exit their idle state and the pointed object will be set to
      NULL.
      
      2. The cpuidle driver is unloaded. Logically that could happen but not
      in practice because the drivers are always compiled in and 95% of them are
      not coded to unregister themselves.  In any case, the unloading code must
      call 'cpuidle_unregister_device', that calls 'cpuidle_pause_and_lock'
      leading to 'kick_all_cpus_sync' as mentioned above.
      
      A race can happen if we use the pointer and then one of these two scenarios
      occurs at the same moment.
      
      In order to be safe, the idle state pointer stored in the rq must be
      used inside a rcu_read_lock section where we are protected with the
      'rcu_barrier' in the 'cpuidle_uninstall_idle_handler' function. The
      idle_get_state() and idle_put_state() accessors should be used to that
      effect.
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: linaro-kernel@lists.linaro.org
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
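The publish-and-clear pattern for the per-runqueue pointer can be sketched in userspace C11. This is only an analogy: all names are hypothetical, an atomic pointer stands in for the rq field, and the sketch does not model the RCU read-side section or the grace period that makes dereferencing safe in the kernel.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, reduced idle state descriptor. */
struct sketch_idle_state {
    unsigned int exit_latency;
};

/* Hypothetical stand-in for the runqueue field holding the pointer. */
struct sketch_rq {
    _Atomic(struct sketch_idle_state *) idle_state;
};

/* Idle entry publishes the state; idle exit clears it by passing NULL. */
static void sketch_idle_set_state(struct sketch_rq *rq,
                                  struct sketch_idle_state *state)
{
    atomic_store(&rq->idle_state, state);
}

/* Readers must always be prepared to see NULL. */
static struct sketch_idle_state *sketch_idle_get_state(struct sketch_rq *rq)
{
    return atomic_load(&rq->idle_state);
}

/* Balancer-side helper: exit latency of the idle state, 0 if not idle. */
static unsigned int sketch_idle_exit_latency(struct sketch_rq *rq)
{
    struct sketch_idle_state *state = sketch_idle_get_state(rq);

    return state ? state->exit_latency : 0;
}
```

In the kernel, the read in sketch_idle_exit_latency() would additionally have to sit inside the rcu_read_lock section the commit message describes, so that the pointed-to object cannot go away while it is being used.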
  30. 19 Sep, 2014 1 commit
  31. 09 Jul, 2014 1 commit
  32. 06 May, 2014 1 commit
  33. 30 Apr, 2014 1 commit
  34. 11 Mar, 2014 3 commits