1. 03 Apr, 2019 40 commits
    • ipipe: Fix output layout of tracer · 432d2adc
      Jan Kiszka authored and Philippe Gerum committed
      A long time ago (probably in 2.6 times), someone converted spaces to
      tabs, which shuffled the layout around, and forgot to account for the
      multi-domain removal.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • ipipe: Fix panic output of tracer · 4e5bb6a4
      Jan Kiszka authored and Philippe Gerum committed
      Since 4.9, we need to declare continued lines via KERN_CONT.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • ipipe: Clean up per-CPU host timers on hotplug · cd0067e9
      Jan Kiszka authored and Philippe Gerum committed
      When a CPU is unplugged, make sure to drop all of its per-CPU ipipe
      timer devices. Otherwise, we will corrupt the device list when
      re-registering the host timer on CPU onlining.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • ipipe: Introduce and use ipipe_root_nr_syscalls · abeb1754
      Jan Kiszka authored and Philippe Gerum committed
      At least one arch, infamous x86, has a difference of NR_syscalls
      depending on compat vs. native ABI. Account for that by introducing a
      function that can deliver the currently valid syscall number if an arch
      implements such a service. In all other cases, this change makes no
      functional difference.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
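      The idea described in this commit can be illustrated with a small
      userspace sketch. This is not the I-pipe code: the structure, the
      table sizes and the names root_nr_syscalls() and syscall_nr_valid()
      are hypothetical stand-ins chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table sizes; real values depend on arch and kernel version. */
enum { NATIVE_NR_SYSCALLS = 335, COMPAT_NR_SYSCALLS = 387 };

/* Hypothetical stand-in for the per-thread ABI state. */
struct thread_abi {
	bool compat;	/* true if the task uses the compat ABI */
};

/* Return the number of syscalls valid for the ABI the task is using. */
static unsigned int root_nr_syscalls(const struct thread_abi *ti)
{
	assert(ti != NULL);
	return ti->compat ? COMPAT_NR_SYSCALLS : NATIVE_NR_SYSCALLS;
}

/* Range-check a syscall number against the ABI-specific limit. */
static bool syscall_nr_valid(const struct thread_abi *ti, unsigned int nr)
{
	return nr < root_nr_syscalls(ti);
}
```

      On an arch with a single syscall table, such a helper would simply
      return the one constant, which matches the "no functional difference"
      remark above.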
    • ipipe: Introduce ptrace resume notifier · 68f0d8b6
      Jan Kiszka authored and Philippe Gerum committed
      This I-pipe hook reports the desired resumption mode to the subscriber:
      resume all process tasks or just single-step a particular one? The use
      case is to enable synchronous stopping / resuming of all head tasks of
      a ptraced real-time process.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • ipipe: Introduce infrastructure for userspace return notifier · a3654dd9
      Jan Kiszka authored and Philippe Gerum committed
      A little bit inspired by the kernel's user return notifier, this
      introduces an I-pipe hook before the kernel jumps back to a userspace
      context from the root domain. The hook is designed to allow a switch back
      to the head domain, thus will not run through signal/preemption checks
      when returning from the callback over head. It is guaranteed to fire on
      return from interrupts and exceptions but may also fire on certain
      syscall-return paths.
      The first use case for the hook is resumption of ptraced tasks over
      head if they were stopped in that domain.
      This provides just the generic infrastructure; the invocation of
      __ipipe_notify_user_intreturn as well as the definition of
      TIP_USERINTRET are architecture-specific.
      Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    • Philippe Gerum's avatar
    • sched/core: ipipe: do not panic on failed migration to the head stage · 6eed8844
      Philippe Gerum authored
      __ipipe_migrate_head() should not BUG() unconditionally when failing
      to schedule out a thread, but rather let the real-time core handle the
      situation a bit more gracefully.
    • ipipe: timer: prevent double-ack if host timer is not grabbed · d10ec2b9
      Philippe Gerum authored
      Only timers stolen away from the host kernel should be early acked by
      the pipeline core. Otherwise, the regular IRQ handler associated with
      the timer would duplicate the action. The IRQ line is left masked,
      waiting for the IRQ flow handler to unmask it eventually.
    • ipipe: timer: notify co-kernel about entering ONESHOT_STOPPED mode · 70b9018f
      Philippe Gerum authored
      Although we don't want to disable the hardware, so as not to wreck the
      outstanding timing requests managed by the co-kernel, we should
      nevertheless notify it about entering the ONESHOT_STOPPED mode, so
      that it may disable the host tick emulation.
    • ipipe: timer: do not interpose on undefined handlers · 0929aeb5
      Philippe Gerum authored
      There is no point in interposing on clock chip handlers for which
      there was no support originally. In some cases (oneshot_stopped), we
      may even get a kernel fault, jumping to a NULL address.
      Interpose on non-NULL original handlers only.
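      The rule stated above can be sketched in plain C. This is a
      simplified model rather than the pipeline code: struct clock_chip,
      struct saved_handlers, the proxy functions and hw_set_oneshot() are
      all hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified clock chip descriptor. */
struct clock_chip {
	int (*set_oneshot)(struct clock_chip *chip);
	int (*set_oneshot_stopped)(struct clock_chip *chip);	/* may be NULL */
};

/* Hypothetical record of the original handlers we displaced. */
struct saved_handlers {
	int (*set_oneshot)(struct clock_chip *chip);
	int (*set_oneshot_stopped)(struct clock_chip *chip);
};

/* Sample original handler, standing in for a real driver routine. */
static int hw_set_oneshot(struct clock_chip *chip) { (void)chip; return 1; }

/* Proxy handlers the co-kernel side would install. */
static int proxy_set_oneshot(struct clock_chip *chip) { (void)chip; return 0; }
static int proxy_set_oneshot_stopped(struct clock_chip *chip) { (void)chip; return 0; }

/* Replace a handler with its proxy only if the chip implemented it. */
static void interpose(struct clock_chip *chip, struct saved_handlers *saved)
{
	assert(chip != NULL && saved != NULL);

	saved->set_oneshot = chip->set_oneshot;
	if (chip->set_oneshot)
		chip->set_oneshot = proxy_set_oneshot;

	saved->set_oneshot_stopped = chip->set_oneshot_stopped;
	if (chip->set_oneshot_stopped)
		chip->set_oneshot_stopped = proxy_set_oneshot_stopped;
}
```

      A handler that was NULL stays NULL after interposition, so a caller
      that checks for the operation before invoking it keeps seeing it as
      unsupported, and nothing ends up calling through a saved NULL
      pointer.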
    • ipipe: timer: resume hardware operations in oneshot handler · 3c888aa8
      Philippe Gerum authored
      Although we won't allow disabling the hardware when the clock event
      logic switches a device to stopped mode - so that we won't affect the
      timer logic running on the head stage unexpectedly -, we still have to
      enable the hardware when switched (back) to oneshot mode, since it may
      have been stopped prior to interposing on the device in
      Failing to do so would leave the hardware shut down for both regular
      and Xenomai operations, with no means to bring it up again.
    • Philippe Gerum's avatar
    • sched: idle: ipipe: drop spurious check · 67157bba
      Philippe Gerum authored
    • Philippe Gerum's avatar
    • printk: ipipe: defer vprintk() output · b7d2654e
      Philippe Gerum authored
    • Philippe Gerum's avatar
    • ipipe: tick: revive the host tick after device grab · 44db4e30
      Philippe Gerum authored
      Once the device was grabbed by ipipe_timer_start(), any pending host
      tick programmed in the hardware is basically lost, unknown to the
      co-kernel implementing the proxy handlers.
      Schedule a host event with the latest target time programmed to have
      the co-kernel know about the pending tick.
    • PM: ipipe: converge to Dovetail's CPUIDLE management · ced28fdc
      Philippe Gerum authored
      Handle requests for transitioning to deeper C-states the way Dovetail
      does, which prevents us from losing the timer when grabbed by a
      co-kernel, in presence of a CPUIDLE driver.
    • ipipe: tick: cap timer_set op to device supported max · 2d0950b8
      Philippe Gerum authored
      While at it, switch the min_delay_tick value to unsigned long to
      match the corresponding clockevent definition.
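      Capping a programmed delay to what the device supports might look
      like the sketch below. The structure and function names are
      hypothetical, and clamping against the minimum as well as the
      maximum is an assumption made for the example; the commit title only
      mentions the maximum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of a timer's limits, expressed in hardware ticks. */
struct timer_limits {
	unsigned long min_delay_ticks;
	unsigned long max_delay_ticks;
};

/* Clamp a requested delay into the range the device can be programmed with. */
static unsigned long clamp_delay(const struct timer_limits *lim,
				 unsigned long delay)
{
	assert(lim != NULL);
	assert(lim->min_delay_ticks <= lim->max_delay_ticks);

	if (delay < lim->min_delay_ticks)
		return lim->min_delay_ticks;
	if (delay > lim->max_delay_ticks)
		return lim->max_delay_ticks;
	return delay;
}
```

      Keeping both limits as unsigned long avoids mixed-width comparisons
      in the clamp.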
    • ipipe: tick: out-of-band devices require GENERIC_CLOCKEVENTS · 7e7242e9
      Philippe Gerum authored
      Drop the legacy support for architectures not enabling the generic
      clock event framework, which would only provide periodic timing.
      We don't support any of those archs, and there is no point in running
      a Xenomai co-kernel on hardware not capable of handling oneshot
      timing requests.
    • ftrace: ipipe: rely on fully atomic stop_machine() handler · a41e415e
      Philippe Gerum authored
      Now that stop_machine() guarantees fully atomic execution of the stop
      routine via hard interrupt disabling, there is no point in using
      ipipe_critical_enter/exit() for the same purpose in order to patch the
      kernel text.
    • stop_machine: ipipe: ensure atomic stop-context operations · bdb2e511
      Philippe Gerum authored
      stop_machine() guarantees that all online CPUs are spinning
      non-preemptible in a known code location before a subset of them may
      safely run a stop-context function. This service is typically useful
      for live patching the kernel code, or changing global memory mappings,
      so that no activity could run in parallel until the system has
      returned to a stable state after all stop-context operations have
      completed.
      When interrupt pipelining is enabled, we have to provide the same
      guarantee by restoring hard interrupt disabling where virtualizing the
      interrupt disable flag would defeat it.
    • lockdep: ipipe: improve detection of out-of-band contexts · 32d011d8
      Philippe Gerum authored
      trace_hardirqs_on_virt[_caller]() must be invoked instead of
      trace_hardirqs_on[_caller]() from assembly sites before returning from
      an interrupt/fault, so that the virtual IRQ disable state is checked
      for before switching the tracer's logic state to ON.
      This is required as an interrupt may be received and handled by the
      pipeline core although not forwarded to the root domain, when
      interrupts are virtually disabled. In such a case, we want to
      reconcile the tracer's logic with the effect of interrupt pipelining.
    • lockdep: ipipe: make the logic aware of interrupt pipelining · 038a0da6
      Philippe Gerum authored
      The lockdep engine will check for the current interrupt state as part
      of the locking validation process, which must encompass:
      - the CPU interrupt state
      - the current pipeline domain
      - the virtual interrupt disable flag
      so that we can traverse the tracepoints from any context sanely and
      safely.
      In addition trace_hardirqs_on_virt_caller() should be called by the
      arch-dependent code when tracking the interrupt state before returning
      to user-space after a kernel entry (exceptions, IRQ). This makes sure
      that the tracking logic only applies to the root domain, and considers
      the virtual disable flag exclusively.
      For instance, the kernel may be entered when interrupts are (only)
      virtually disabled for the root domain (i.e. stalled), and we should
      tell the IRQ tracing logic that IRQs are about to be enabled back only
      if the root domain is unstalled before leaving to user-space. In such
      a context, the state of the interrupt bit in the CPU would be
      irrelevant.
    • ipipe: add cpuidle control interface · 5b9c52b2
      Philippe Gerum authored
      Add a kernel interface for sharing CPU idling control between the host
      kernel and a co-kernel. The former invokes ipipe_cpuidle_control()
      which the latter should implement, for determining whether entering a
      sleep state is ok. This hook should return boolean true if so.
      The co-kernel may veto such entry if need be, in order to prevent
      latency spikes, as exiting sleep states might be costly depending on
      the CPU idling operation being used.
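      A co-kernel side veto could be modeled as follows. The commit only
      states that the hook returns true when entering the sleep state is
      acceptable; the structures, the function name cpuidle_control_sketch()
      and the latency-versus-next-event policy below are assumptions made
      for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical co-kernel state: time until its next timer event. */
struct cokernel_state {
	unsigned long long next_event_ns;
};

/* Hypothetical description of the sleep state the host wants to enter. */
struct idle_state {
	unsigned long long exit_latency_ns;
};

/*
 * Return true if entering the sleep state is acceptable, false to veto.
 * Assumed policy: refuse any state whose exit latency would not fit
 * before the co-kernel's next timer event.
 */
static bool cpuidle_control_sketch(const struct cokernel_state *ck,
				   const struct idle_state *st)
{
	assert(ck != NULL && st != NULL);
	return st->exit_latency_ns < ck->next_event_ns;
}
```

      The host side would call such a hook before committing to a sleep
      state and fall back to a shallower one when it returns false.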
    • ftrace: ipipe: enable tracing from the head domain · e1d9ba38
      Philippe Gerum authored
      Enabling ftrace for a co-kernel running in the head domain of a
      pipelined interrupt context means to:
      - make sure that ftrace's live kernel code patching still runs
        unpreempted by any head domain activity (so that the latter can't
        tread on invalid or half-baked changes in the .text section).
      - allow the co-kernel code running in the head domain to traverse
        ftrace's tracepoints safely.
      The changes introduced by this commit ensure this by fixing up some
      key critical sections so that interrupts are still disabled in the
      CPU, undoing the interrupt flag virtualization in those particular
    • fork: ipipe: announce mm dismantling · 60646a11
      Philippe Gerum authored
      IPIPE_KEVT_CLEANUP is emitted before a process memory context is
      entirely dropped, after all the mappings have been exited. Per-process
      resources which might be maintained by the co-kernel could be released
      there, as all tasks have exited.
    • sched: ipipe: announce CPU affinity change · bb786e53
      Philippe Gerum authored
      Emit IPIPE_KEVT_SETAFFINITY to the co-kernel when the target task is
      about to move to another CPU.
      CPU migration can only take place from the root domain, the pipeline
      does not provide any support for migrating tasks from the head domain,
      and derives several key assumptions based on this invariant.
    • sched: ipipe: announce signal receipt · b5b919a4
      Philippe Gerum authored
      Emit IPIPE_KEVT_SIGWAKE when the target task is about to receive a
      (regular) signal. The co-kernel may decide to schedule a transition of
      the recipient to the root domain in order to have it handle that
      signal asap, which is commonly required for keeping the kernel sane.
      This notification is always sent from the context of the issuer.
    • sched: ipipe: announce task exit · 5e887045
      Philippe Gerum authored
      Emit IPIPE_KEVT_EXIT from do_exit() to the co-kernel before the
      current task has dropped the files and mappings it owns.
    • KVM: ipipe: keep hypervisor state consistent across domain preemption · e5a589c9
      Philippe Gerum authored
      In order for the hypervisor to operate properly in presence of a
      co-kernel, we need:
      - the virtualization core to know when the hypervisor stalls due
        to a preemption by the co-kernel.
      - to know when the VM enters and leaves guest mode.
    • sched: ipipe: add domain debug checks to common scheduling paths · 2575a32a
      Philippe Gerum authored
      Catch invalid calls of root-only code from the head domain from common
      paths which may lead to blocking the current task linux-wise. Checks
      are enabled by CONFIG_IPIPE_DEBUG_CONTEXT.
    • sched: ipipe: enable task migration between domains · 90b6728d
      Philippe Gerum authored
      This is the basic code enabling alternate control of tasks between the
      regular kernel and an embedded co-kernel. The changes cover the
      following aspects:
      - extend the per-thread information block with a private area usable
        by the co-kernel for storing additional state information
      - provide the API enabling a scheduler exchange mechanism, so that
        tasks can run under the control of either kernel alternatively. This
        includes a service to move the current task to the head domain under
        the control of the co-kernel, and the converse service to re-enter
        the root domain once the co-kernel has released such task.
      - ensure the generic context switching code can be used from any
        domain, serializing execution as required.
      These changes have to be paired with arch-specific code further
      enabling context switching from the head domain.
    • clockevents: ipipe: connect clock chips to abstract tick device · 76cd6ae4
      Philippe Gerum authored
      Announce all clock event chips as they are registered to the
      out-of-band tick device infrastructure, so that we can interpose on
      key handlers in their descriptors.
    • timekeeping: ipipe: forward clock shift value to DSO helpers · 4a2dc10e
      Philippe Gerum authored
      In order to propagate the "host real-time update" event to a co-kernel
      (IPIPE_KEVT_HOSTRT), we need the clock shift value of the monotonic
      clock to be passed to the legacy vDSO handler, for (re)calculating the
      new wall clock time which is eventually announced to the co-kernel.
      Only architectures which still implement the legacy
      update_vsyscall_old() interface need this change.
    • ipipe: add kernel event notifiers · 6b70aed7
      Philippe Gerum authored
      Add the core API for enabling (regular) kernel event notifications to
      a co-kernel running over the head domain. For instance, such a
      co-kernel may need to know when a task is about to be resumed upon
      signal receipt, or when it gets an access fault trap.
      This commit adds the client-side API for enabling such notification
      for a class of events, but does not provide the notification points per
      se, which come later.
    • printk: ipipe: add raw console channel · 5fd82466
      Philippe Gerum authored
      A raw output handler (.write_raw) is added to the console descriptor
      for writing (short) text output unmodified, without any logging,
      header or preparation whatsoever, usable from any pipeline domain.
      The dedicated raw_printk() variant formats the output message then
      passes it on to the handler holding a hard spinlock, irqs off.
      This is a very basic debug channel for situations when resorting to
      the fairly complex printk() handling is not an option. Unlike early
      consoles, regular consoles can provide a raw output service past the
      boot sequence. Raw output handlers are typically provided by serial
      console devices.
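      The format-then-write path described above can be approximated in
      userspace C. Only the .write_raw member name comes from the commit
      message; its signature, the simplified struct console, the capturing
      handler and raw_printk_sketch() are assumptions made for the example,
      and the hard spinlock is only indicated by a comment.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified console descriptor with a raw output hook. */
struct console {
	void (*write_raw)(struct console *con, const char *s, unsigned int count);
};

static char captured[128];
static unsigned int captured_len;

/* Sample raw handler: captures the text instead of driving a UART. */
static void capture_write_raw(struct console *con, const char *s,
			      unsigned int count)
{
	unsigned int room = (unsigned int)sizeof(captured) - 1 - captured_len;

	(void)con;
	if (count > room)
		count = room;	/* silently truncate when the capture is full */
	memcpy(captured + captured_len, s, count);
	captured_len += count;
	captured[captured_len] = '\0';
}

/* Format the message, then hand it unmodified to the raw handler. */
static int raw_printk_sketch(struct console *con, const char *fmt, ...)
{
	char buf[256];
	va_list ap;
	int n;

	assert(con != NULL && fmt != NULL);

	va_start(ap, fmt);
	n = vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	if (n < 0)
		return n;
	if ((size_t)n >= sizeof(buf))
		n = (int)sizeof(buf) - 1;	/* output was truncated */

	/* In the kernel, a hard spinlock would be held here, IRQs off. */
	if (con->write_raw)
		con->write_raw(con, buf, (unsigned int)n);

	return n;
}
```

      The sketch adds no header, log level or buffering, in line with the
      "unmodified, without any logging" wording above.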
    • dump_stack: ipipe: make dump_stack() domain-aware · 005c09be
      Philippe Gerum authored
      When dumping a stack backtrace, we neither need nor want to disable
      root stage IRQs over the head stage, where CPU migration can't
      happen.
      Conversely, we neither need nor want to disable hard IRQs from the
      head stage, so that latency won't skyrocket either.
    • printk: ipipe: defer printk() from head domain · 7ff6fc47
      Philippe Gerum authored
      The printk() machinery cannot immediately invoke the console driver(s)
      when called from the head domain, since such driver code belongs to
      the root domain and cannot be shared between domains.
      Output issued from the head domain is formatted then logged into a
      staging buffer, and a dedicated virtual IRQ is posted to the root
      domain for notification. When the virtual IRQ handler runs, the
      contents of the staging buffer are flushed to the printk() interface
      anew, which may eventually pass the output on to the console drivers
      from such a context.
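      The staging scheme described above can be modeled with a small
      userspace sketch. It is not the kernel implementation: the buffer
      size, the flush_pending flag standing in for the posted virtual IRQ,
      and the head_printk()/root_flush() names are hypothetical, and
      messages are assumed to be already formatted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAGING_SIZE 256

static char staging[STAGING_SIZE];
static size_t staging_len;
static int flush_pending;	/* stands in for the posted virtual IRQ */

/* "Head domain" side: log into the staging buffer, never touch drivers. */
static int head_printk(const char *msg)
{
	size_t len;

	assert(msg != NULL);
	len = strlen(msg);
	if (len > STAGING_SIZE - staging_len)
		return -1;	/* staging buffer full, message dropped */

	memcpy(staging + staging_len, msg, len);
	staging_len += len;
	flush_pending = 1;	/* ask the root domain to flush later */
	return 0;
}

/*
 * "Root domain" side: runs when the notification is handled and drains
 * the staging buffer into out, which models the regular printk() path.
 * Text that does not fit into out is discarded in this sketch.
 */
static size_t root_flush(char *out, size_t outsz)
{
	size_t n;

	assert(out != NULL && outsz > 0);
	n = staging_len < outsz - 1 ? staging_len : outsz - 1;
	memcpy(out, staging, n);
	out[n] = '\0';
	staging_len = 0;
	flush_pending = 0;
	return n;
}
```

      A real implementation also has to serialize access to the staging
      buffer between domains, which this single-threaded model leaves out.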