1. 20 Sep, 2020 11 commits
  2. 13 Sep, 2020 1 commit
  3. 12 Sep, 2020 2 commits
  4. 03 Sep, 2020 7 commits
  5. 02 Sep, 2020 2 commits
• genirq: irq_pipeline: keep in-band stalled until arch_cpu_idle() · 6d5df09d
      Philippe Gerum authored
      We need the whole CPU idling path to run with the in-band stage
      stalled until the arch-specific code is invoked to enter the idle
      mode.
      
This fixes a regression introduced by commit 40e67045, which left
irq_cpuidle_enter() with the in-band stage unstalled. Instead, the
      latter should stall the in-band stage before returning, since debug
      assertions may expect this later on.
      
It should be noted that since we run with hard irqs off, the
regression did not break the logic itself; it only tripped debug
assertions checking for irqs_disabled().
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
• genirq: irq_pipeline: revert hack to ignore oob ticks in in-band time accounting · 6441a893
      Philippe Gerum authored
      
      
      The hack in steal_account_process_time() is supposed to fix the
      following issue:
      
  <in-band task A>
      (context switch)
          <oob task B>
              [timer IRQ, logged for in-band stage]
              ...
      (context switch)
  <in-band task A>
      synchronize_irqs() (i.e. play events logged for the in-band stage)
          account_process_times(current = task A)
      
IOW, the uncorrected pattern would cause task A to be charged for all
      timer ticks which were actually received by task B while it was running
      oob. The code in steal_account_process_time() tries to prevent this by
      asking the callers not to charge the CPU time upon timer ticks received
      from the oob stage (e.g. X86_IF clear in eflags).
      
For this reason, maxtime is returned when we don't want the CPU time to
be charged. The problem is that this implementation dates back to kernel
v4.7 and no longer works as intended with recent kernels (vtime may be
miscalculated, for instance).
      
      Drop this code, plan for revisiting the issue in a better way.
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
  6. 30 Aug, 2020 4 commits
• x86: irq_pipeline: defer IRQ affinity update to in-band · 939170a3
      Philippe Gerum authored
      
      
      Some irqchips require IRQ affinity to be set from the context of the
      migrated interrupt, such as Intel's IO*APIC hardware (see
      CONFIG_GENERIC_PENDING_IRQ). Since the code actually changing the
      affinity may be available to the in-band stage exclusively, we have to
      defer its execution until events are synchronized for that stage
      before returning from the interrupt frame.
      
      The IO-APIC ack handlers now schedule an irq_work routine to perform
      any pending affinity update which is guaranteed to run on top of the
      current interrupt frame, when handle_irq_pipelined_finish() eventually
synchronizes the in-band stage before leaving. This makes it possible to
      change the affinity of an interrupt otherwise delivered to the oob
      stage.
      
Such an update may be skipped when the context preempted by the
interrupt must be hidden from the in-band stage, i.e. when running
oob or when that stage is stalled. In this case, the update waits for
the next interrupt taken in the right context, such as the idle loop
or any other irq-preemptible section of kernel code.
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
• genirq: irq_pipeline: add IRQD_SETAFFINITY_BLOCKED state · 3488ef20
      Philippe Gerum authored
      
      
      Some irqchips require IRQ affinity to be set from the context of the
      migrated interrupt, such as Intel's IO*APIC hardware (see
      CONFIG_GENERIC_PENDING_IRQ). Since the code actually changing the
      affinity may be available to the in-band stage exclusively, we have to
      defer its execution until events are synchronized for that stage
      before returning from the interrupt frame.
      
      Since we might have received the original event in a context from
      which the in-band interrupt log will not be synchronized on top of
      handle_irq_pipelined_finish(), we need a way to tag IRQ descriptors on
      entry to the pipeline so that the flow handler won't schedule any
      deferred affinity update for the event. Those contexts are:
      
      - if the in-band stage is stalled
      - if running on the out-of-band stage
      
      IRQD_SETAFFINITY_BLOCKED is such a marker, telling the
      architecture-specific pipeline code not to schedule any affinity
      update when set.
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
• genirq: irq_pipeline: plug IRQ synchronization lag while idling · 40e67045
      Philippe Gerum authored
      
      
      default_idle_call() is entered with hard irqs on, in-band stage
      stalled, which means that we may have IRQs pending before calling the
      arch-specific idling code.
      
This case is properly detected by irq_cpuidle_enter(), but it fails to
synchronize the in-band log: the pending events then have to wait for
the next interrupt to unblock the CPU idling code and unstall the
stage before they can be played, which is definitely wrong.
      
      irq_cpuidle_enter() must synchronize the in-band log if IRQs are
      pending for the stage immediately. In addition, this routine should
      always leave the in-band stage unstalled on exit, to align the logic
      with the non-pipelined case.
      
While at it, also fix a couple of coding style and comment issues.
Signed-off-by: Philippe Gerum <rpm@xenomai.org>
• x86/irq: Unbreak interrupt affinity setting · bf6a77a3
  Thomas Gleixner authored and Philippe Gerum committed
      Several people reported that 5.8 broke the interrupt affinity setting
      mechanism.
      
      The consolidation of the entry code reused the regular exception entry code
      for device interrupts and changed the way how the vector number is conveyed
      from ptregs->orig_ax to a function argument.
      
      The low level entry uses the hardware error code slot to push the vector
      number onto the stack which is retrieved from there into a function
      argument and the slot on stack is set to -1.
      
      The reason for setting it to -1 is that the error code slot is at the
      position where pt_regs::orig_ax is. A positive value in pt_regs::orig_ax
      indicates that the entry came via a syscall. If it's not set to a negative
      value then a signal delivery on return to userspace would try to restart a
      syscall. But there are other places which rely on pt_regs::orig_ax being a
      valid indicator for syscall entry.
      
      But setting pt_regs::orig_ax to -1 has a nasty side effect vs. the
      interrupt affinity setting mechanism, which was overlooked when this change
      was made.
      
      Moving interrupts on x86 happens in several steps. A new vector on a
      different CPU is allocated and the relevant interrupt source is
      reprogrammed to that. But that's racy and there might be an interrupt
      already in flight to the old vector. So the old vector is preserved until
      the first interrupt arrives on the new vector and the new target CPU. Once
      that happens the old vector is cleaned up, but this cleanup still depends
      on the vector number being stored in pt_regs::orig_ax, which is now -1.
      
      That -1 makes the check for cleanup: pt_regs::orig_ax == new_vector
      always false. As a consequence the interrupt is moved once, but then it
      cannot be moved anymore because the cleanup of the old vector never
      happens.
      
      There would be several ways to convey the vector information to that place
      in the guts of the interrupt handling, but on deeper inspection it turned
      out that this check is pointless and a leftover from the old affinity model
      of X86 which supported multi-CPU affinities. Under this model it was
      possible that an interrupt had an old and a new vector on the same CPU, so
      the vector match was required.
      
      Under the new model the effective affinity of an interrupt is always a
      single CPU from the requested affinity mask. If the affinity mask changes
      then either the interrupt stays on the CPU and on the same vector when that
      CPU is still in the new affinity mask or it is moved to a different CPU, but
      it is never moved to a different vector on the same CPU.
      
Ergo the cleanup check for the matching vector number is not required and
can be removed, which makes the dependency on pt_regs::orig_ax go away.
      
The remaining check for new_cpu == smp_processor_id() is completely
      sufficient. If it matches then the interrupt was successfully migrated and
      the cleanup can proceed.
      
For paranoia's sake, add a warning into the vector assignment code to
      validate that the assumption of never moving to a different vector on
      the same CPU holds.
      
Fixes: 633260fa ("x86/irq: Convey vector as argument and not in ptregs")
Reported-by: Alex bykov <alex.bykov@scylladb.com>
Reported-by: Avi Kivity <avi@scylladb.com>
Reported-by: Alexander Graf <graf@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Graf <graf@amazon.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/87wo1ltaxz.fsf@nanos.tec.linutronix.de
  7. 24 Aug, 2020 1 commit
• x86: irq_pipeline: fix interrupt vector name for Hyper-V · f29eaf7c
  Leonid Gasheev authored and Philippe Gerum committed
      
      
      This fixes the following compilation error:
      
       CC      arch/x86/hyperv/hv_init.o
      In file included from ../arch/x86/hyperv/hv_init.c:18:0:
      ../arch/x86/hyperv/hv_init.c: In function 'sysvec_hyperv_reenlightenment':
      ../arch/x86/hyperv/hv_init.c:156:34:
      error: 'HYPERVISOR_REENLIGHTENMENT_VECTOR' undeclared (first use in this function);
      did you mean 'HYPERV_REENLIGHTENMENT_VECTOR'?
       DEFINE_IDTENTRY_SYSVEC_PIPELINED(HYPERVISOR_REENLIGHTENMENT_VECTOR,
      
Aligned with arch/x86/include/asm/irq_vectors.h.
Signed-off-by: Leonid Gasheev <vtk-powerlab@yandex.ru>
  8. 23 Aug, 2020 2 commits
  9. 15 Aug, 2020 1 commit
  10. 13 Aug, 2020 2 commits
  11. 12 Aug, 2020 7 commits