05 Jun, 2021 40 commits
    • dovetail: rework address space pinning · c1599c5d
      Philippe Gerum authored
      
      
      Real-time applications controlled by the out-of-band core require some
      guarantees regarding how memory is managed for them in order to
      prevent unexpected delays:
      
      [1] paging must be disabled, all current and future pages must be
          faulted in.
      
      [2] copy-on-write must not be relied upon between a real-time parent
          and any of its children in order to share pages upon fork(). IOW,
          every child should get its own copy of the parent's pages upon
          fork(), and the latter should NOT have to be marked read-only as a
          result of this.
      
      The former implementation relied on Dovetail-specific code to address
      these requirements:
      
      - force_commit_memory() would scan all VMAs attached to the caller's
        address space in order to fault them in via commit_vma(). A new task
        attaching to the out-of-band core was expected to call
        force_commit_memory() in order to process the address space
        accordingly.
      
      - commit_vma() would populate a VMA by calling
        populate_vma_page_range() for common mappings, or pin special
        mappings via GUP such as huge pages.
      
      - commit_vma() would also be called when the protection bits of a page
        are changed, in order to catch cases which would require more
        COW-breaking as a result. This is useless: copy_pte_range() is the
        only code path where pages may have to be unCOWed.
      
      COW-breaking upon fork() was not yet performed by Dovetail.
      
      These applications can use mlockall(MCL_CURRENT|MCL_FUTURE) in order
      to enforce [1]; this ensures the mappings attached to the caller's
      mm are populated and faulted in when applicable. Locking the memory
      has been a requirement for these applications since day
      one. Therefore, force_commit_memory() is redundant with
      mlockall(MCL_CURRENT).
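
      For illustration, a minimal sketch of how an application might
      enforce [1] at startup, using only standard POSIX calls:

      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/mman.h>

      int main(void)
      {
              /* Fault in and lock all current and future mappings. */
              if (mlockall(MCL_CURRENT | MCL_FUTURE)) {
                      perror("mlockall");
                      return EXIT_FAILURE;
              }

              /* ... attach to the out-of-band core, run rt work ... */

              return EXIT_SUCCESS;
      }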
      
      [2] can be obtained by extending to Dovetail-aware memory the
      COW-breaking logic readily available to pinned pages (FOLL_PIN) in
      copy_pte_range() -> copy_present_pte() -> copy_present_page(). The
      real address space of a task which calls dovetail_init_altsched() can
      be marked as Dovetail-aware in the process, since such a call is a
      clear hint that the underlying task will require both [1] and [2].
      
      While at it, MMF_VM_PINNED is renamed MMF_DOVETAILED to fix a
      confusing name clash with the page pinning logic, which has different
      semantics.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      c1599c5d
    • evl/sched: refine tracepoints · 23d657e2
      Philippe Gerum authored
      
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      23d657e2
    • evl/syscall: remove indirection via pointer table · cce881ed
      Philippe Gerum authored
      
      
      We have only very few syscalls; prefer a plain switch over a pointer
      indirection, which ends up being fairly costly due to exploit
      mitigations.
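
      A rough sketch of the idea; the handler names and numbering below are
      made up, only the switch-vs-table structure matters:

      /* A plain switch lets the compiler emit direct calls or a jump
       * table, avoiding the retpoline cost of an indirect call. */
      static long dispatch_oob_syscall(unsigned int nr, void *args)
      {
              switch (nr) {
              case 0:
                      return do_oob_read(args);   /* assumed helper */
              case 1:
                      return do_oob_write(args);  /* assumed helper */
              default:
                      return -ENOSYS;
              }
      }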
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      cce881ed
    • evl/wait: display waitqueue name in trace · 12bbc1f7
      Philippe Gerum authored
      
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      12bbc1f7
    • evl: kconfig: introduce high per-CPU concurrency switch · a34cfccc
      Philippe Gerum authored
      
      
      EVL_HIGH_PERCPU_CONCURRENCY optimizes the implementation for
      applications with many real-time threads running concurrently on any
      given CPU core (typically when eight or more threads may be sharing a
      single CPU core). This combines the scalable scheduler and rb-tree
      timer indexing under a single configuration switch, since both
      aspects are normally coupled.
      
      If the application system runs only a few EVL threads per CPU core,
      then this option should be turned off, in order to minimize the cache
      footprint of the queuing operations performed by the scheduler and
      timer subsystems. Otherwise, it should be turned on in order to have
      constant-time queuing operations for a large number of runnable
      threads and outstanding timers.
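
      For illustration only, this is the kind of compile-time selection
      such a switch implies; the two dependent symbols below are
      assumptions, not the actual Kconfig option names:

      #ifdef CONFIG_EVL_HIGH_PERCPU_CONCURRENCY
      /* Many threads/timers per CPU: constant-time indexing. */
      #define EVL_USE_SCALABLE_RUNQUEUE  1    /* multi-level run queue */
      #define EVL_USE_TIMER_RBTREE       1    /* rb-tree timer index */
      #else
      /* Few threads/timers per CPU: smaller cache footprint. */
      #define EVL_USE_SCALABLE_RUNQUEUE  0    /* linear run queue */
      #define EVL_USE_TIMER_RBTREE       0    /* linear timer list */
      #endif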
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      a34cfccc
    • evl/sched: enable fast linear thread scheduler (non-scalable) · 22920393
      Philippe Gerum authored
      
      
      For applications with only a few runnable tasks at any point in time,
      a linear queue ordering them for scheduling delivers better
      performance on low-end systems due to a smaller CPU cache footprint,
      compared to the multi-level queue used by the scalable scheduler.

      Allow users to select between the lightning-fast and the scalable
      scheduler implementations, depending on the runtime profile of the
      application.
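
      A minimal sketch of what a linear, priority-ordered run queue boils
      down to, using generic kernel list primitives; the structure and
      field names are assumptions:

      #include <linux/list.h>

      struct sketch_thread {                  /* assumed layout */
              int prio;                       /* higher value, higher priority */
              struct list_head rq_next;
      };

      /* O(n) insertion keeps the queue ordered by priority, so picking
       * the next thread is just reading the queue head. Cheap for a
       * handful of runnable threads, costly beyond that. */
      static void sketch_runq_enqueue(struct list_head *runq,
                                      struct sketch_thread *t)
      {
              struct sketch_thread *pos;

              list_for_each_entry(pos, runq, rq_next) {
                      if (t->prio > pos->prio) {
                              list_add_tail(&t->rq_next, &pos->rq_next);
                              return;
                      }
              }
              list_add_tail(&t->rq_next, runq);
      }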
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      22920393
    • bd6858d8
    • evl/timer: add linear indexing method · 31eca2f4
      Philippe Gerum authored
      
      
      Add (back) the ability to index timers either in a rb-tree or in a
      basic linked list.

      The latter delivers lower latency to application systems with very
      few active timers at any point in time (typically less than 10 active
      timers, e.g. not more than a couple of timed loops, very few timed
      syscalls).
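
      A sketch of the linear indexing idea, with assumed type and field
      names; the list stays ordered by absolute expiry date, so the hot
      path only peeks at its head, while (re)insertion pays a short O(n)
      walk which is negligible with very few active timers:

      #include <linux/ktime.h>
      #include <linux/list.h>

      struct sketch_timer {                   /* assumed layout */
              ktime_t date;                   /* absolute expiry date */
              struct list_head link;
      };

      /* The next timer to fire is simply the head element (O(1)). */
      static struct sketch_timer *sketch_timerq_peek(struct list_head *tq)
      {
              return list_empty(tq) ? NULL :
                      list_first_entry(tq, struct sketch_timer, link);
      }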
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      31eca2f4
    • genirq: clarify naming for out-of-band IPI service · cd4344cc
      Philippe Gerum authored
      
      
      irq_pipeline_send_remote() as a name fails to convey the idea of
      sending out-of-band IPIs.
      
      Since this service can only send this type of IRQ, let's rename it to
      irq_send_oob_ipi() for the sake of clarity and consistency.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      cd4344cc
    • 050651db
    • irq_pipeline: locking: add prepare, finish helpers to hard spinlocks · dad24efb
      Philippe Gerum authored
      
      
      The companion core may make good use of a way to act upon a locking
      operation which is about to start, or an unlocking operation which has
      just taken place. Typically, some debug code could be enabled this
      way, checking for the consistency of such operations. Since hybrid
      spinlocks are based on hard spinlocks, those helpers are available in
      both cases.
      
      The locking process is now as follows:
      
      IRQ forms:
      
      * locking:     hard_disable_irqs + lock_prepare + spin_on_lock
      * try-locking: hard_disable_irqs + trylock_prepare + try_lock, trylock_fail if busy
      * unlocking:   unlock + lock_finish + hard_enable_irqs
      
      basic forms:
      
      * locking:     lock_prepare + spin_on_lock
      * try-locking: trylock_prepare + try_lock, trylock_fail if busy
      * unlocking:   unlock + lock_finish
      
      hard_spin_lock_prepare() and hard_spin_unlock_finish() are such
      helpers. An empty implementation is provided by
      include/dovetail/spinlock.h, which the core may override.
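
      A sketch of the default stubs, plus the kind of override a companion
      core could provide; the lock argument type below is an assumption:

      /* Default no-op forms, as provided by include/dovetail/spinlock.h. */
      static inline void hard_spin_lock_prepare(struct raw_spinlock *lock)
      { }

      static inline void hard_spin_unlock_finish(struct raw_spinlock *lock)
      { }

      /* A core override could, e.g., check that hard irqs are really off
       * and account for lock nesting here, for debug purposes. */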
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      dad24efb
    • evl/lock: add preemption tracking · ba943b38
      Philippe Gerum authored
      
      
      An EVL lock is now distinct from a hard lock in that it tracks and
      disables preemption in the core when held.
      
      Such a spinlock may be useful when only EVL threads running
      out-of-band can contend for the lock, to the exclusion of out-of-band
      IRQ handlers. In this case, disabling preemption before attempting to
      grab the lock may be substituted for disabling hard irqs.

      There are gotchas when using this type of lock from the in-band
      context, see comments in evl/lock.h.
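
      A simplified sketch of the intended sequence; evl_disable_preempt()
      and evl_enable_preempt() are assumed names for the core's preemption
      tracking calls, and the lock layout is illustrative only:

      struct sketch_evl_lock {
              raw_spinlock_t hard_lock;       /* underlying spinlock */
      };

      static inline void sketch_evl_spin_lock(struct sketch_evl_lock *lock)
      {
              evl_disable_preempt();          /* instead of disabling hard irqs */
              raw_spin_lock(&lock->hard_lock);
      }

      static inline void sketch_evl_spin_unlock(struct sketch_evl_lock *lock)
      {
              raw_spin_unlock(&lock->hard_lock);
              evl_enable_preempt();           /* may reschedule when count drops to 0 */
      }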
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      ba943b38
    • evl/mutex: convert mutex lock to hard lock · 3a810978
      Philippe Gerum authored
      
      
      For the most part, a thread hard lock - which requires hard irqs to be
      off - is nested with the mutex lock to access the protected
      sections. Therefore we would not benefit in the common case from the
      preemption disabling feature we are going to add to the EVL-specific
      spinlock. Make it a hard lock to clarify the intent.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      3a810978
    • evl/thread: detect sleeping call with preemption disabled · b3e50a80
      Philippe Gerum authored
      
      
      Sleeping voluntarily with EVL preemption disabled is a bug. Add the
      proper assertion to detect this.
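
      For illustration, the kind of check this implies; the helper names
      below are assumptions:

      /* In the voluntary sleep path of the core: */
      EVL_WARN_ON(CORE, evl_preempt_count() != 0);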
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      b3e50a80
    • evl/observable: convert observable, subscriber locks to hard locks · dc1edef8
      Philippe Gerum authored
      
      
      The subscriber lock is shared between both execution stages, but
      accessed from the in-band stage for the most part, which implies
      disabling hard irqs while holding it. Meanwhile, out-of-band IRQs and
      EVL threads may compete for the observable lock, which would require
      hard irqs to be disabled while holding it.  Therefore we would not
      generally benefit from the preemption disabling feature we are going
      to add to the EVL-specific spinlock in any case. Make these hard locks
      to clarify the intent.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      dc1edef8
    • evl/poll: convert poll head lock to hard lock · 45f17f40
      Philippe Gerum authored
      
      
      Out-of-band IRQs and EVL thread contexts would usually compete for
      such lock, which would require hard irqs to be disabled while holding
      it. Therefore we would not generally benefit from the preemption
      disabling feature we are going to add to the EVL-specific
      spinlock. Make it a hard lock to clarify the intent.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      45f17f40
    • evl/thread: use static allocation for the ptsync barrier · af703356
      Philippe Gerum authored
      
      
      Now that the inclusion hell is fixed with evl/wait.h, we may include
      it into mm_info.h, for defining the ptsync barrier statically into the
      out-of-band mm state.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      af703356
    • evl/wait: reduce header dependency · f14e9a8f
      Philippe Gerum authored
      
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      f14e9a8f
    • evl/wait: convert wait queue lock to hard lock · 4e2a6950
      Philippe Gerum authored
      
      
      Out-of-band IRQs and EVL thread contexts would usually compete for
      such lock, which would require hard irqs to be disabled while holding
      it. Therefore we would not generally benefit from the preemption
      disabling feature we are going to add to the EVL-specific
      spinlock. Make it a hard lock to clarify the intent.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      4e2a6950
    • evl/thread: convert thread lock to hard lock · 2b87d1df
      Philippe Gerum authored
      
      
      Out-of-band IRQs and EVL thread contexts may compete for such lock,
      which would require hard irqs to be disabled while holding it.
      Therefore we would not benefit from the preemption disabling feature
      we are going to add to the EVL-specific spinlock. Make it a hard lock
      to clarify the intent.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      2b87d1df
    • evl/sched: convert run queue lock to hard lock · ae274337
      Philippe Gerum authored
      
      
      Out-of-band IRQs and EVL thread contexts may compete for such lock,
      which would require hard irqs to be disabled while holding it.
      Therefore we would not benefit from the preemption disabling feature
      we are going to add to the EVL-specific spinlock. Make it a hard lock
      to clarify the intent.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      ae274337
    • evl/lock: stop using the oob stall bit for protection · 421b3fa5
      Philippe Gerum authored
      
      
      We don't actually need to rely on the oob stall bit, provided hard
      irqs are off in the sections deemed interrupt-free, because the
      latter is sufficient as long as the code does not traverse a pipeline
      synchronization point (sync_current_irq_stage()) while holding a
      lock, which would be a bug in and of itself in the first place.
      
      Remove the stall/unstall operations from the evl_spinlock
      implementation, fixing the few locations which were still testing the
      oob stall bit.
      
      The oob stall bit is still set by Dovetail on entry to IRQ handlers,
      which is ok: we will neither use nor affect it anymore, only relying
      on hard disabled irqs.
      
      This temporary alignment of the evl_spinlock on the hard spinlock is a
      first step to revisit the lock types in the core, before the
      evl_spinlock is changed again to manage the preemption count.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      421b3fa5
    • evl/sched: do not consider oob stall on enabling preemption · e4901cba
      Philippe Gerum authored
      
      
      Checking the oob stall bit in __evl_enable_preempt() to block the
      rescheduling is obsolete. It relates to a nested locking construct
      which is long gone, when the evl_spinlock managed the preemption count
      and the big lock was still in, i.e.:
      
      lock_irqsave(&ugly_big_lock, flags)  /* stall bit raised */
      	evl_spin_lock(&inner_lock);  /* +1 preempt */
      	   wake_up_high_prio_thread();
      	evl_spin_unlock(&inner_lock); /* -1 preempt == 0, NO schedule because stalled */
      unlock_irqrestore(&ugly_big_lock, flags) /* stall bit restored */
      
      This was a way to prevent a rescheduling to take place inadvertently
      while holding the big lock.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      e4901cba
    • 1a9644a9
    • evl/crossing: add oob vs in-band synchronization barrier · 65152de3
      Philippe Gerum authored
      
      
      This is a simple synchronization mechanism allowing an in-band caller
      to pass a point in the code making sure that no out-of-band operations
      which might traverse the same crossing are in flight.
      
      Out-of-band callers delimit the danger zone by down-ing and up-ing
      the barrier at the crossing; the in-band code should ask for passing
      the crossing.
      
      CAUTION: the caller must guarantee that evl_down_crossing() cannot be
      invoked _after_ evl_pass_crossing() is entered for a given crossing.
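
      A usage sketch; evl_up_crossing() is the assumed name of the converse
      of evl_down_crossing() described above, and struct evl_crossing is
      the assumed type of the barrier object:

      /* Out-of-band side: delimit the section which must not be in
       * flight when the in-band side passes the crossing. */
      void oob_path(struct evl_crossing *crossing)
      {
              evl_down_crossing(crossing);
              /* ... access the shared state ... */
              evl_up_crossing(crossing);
      }

      /* In-band side: wait until no oob traversal is in flight. */
      void inband_teardown(struct evl_crossing *crossing)
      {
              evl_pass_crossing(crossing);
              /* ... now safe to retire the shared state ... */
      }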
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      65152de3
    • evl/thread: add evl_current_kthread() · ee9a41fd
      Philippe Gerum authored
      
      
      Returns the current kthread descriptor or NULL if another thread
      context is running. CAUTION: does not account for IRQ context.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      ee9a41fd
    • evl/thread: allow passing user-defined arg to kthread · 9a8178b8
      Philippe Gerum authored
      
      
      We need more flexibility in the argument passed to the thread
      function. Change to an opaque pointer passed to evl_run_kthread() and
      its variants, instead of the current kthread descriptor.
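
      A usage sketch of the new convention; only the opaque argument is
      taken from this change, the context structure and the remaining
      evl_run_kthread() parameters are assumed:

      struct my_ctx {
              int channel;            /* whatever the kthread needs */
      };

      /* The thread function now receives the opaque pointer directly,
       * not its own kthread descriptor. */
      static void my_kthread_fn(void *arg)
      {
              struct my_ctx *ctx = arg;
              /* ... out-of-band work using ctx ... */
      }

      /* &ctx is then passed as the user-defined argument to
       * evl_run_kthread() (other parameters omitted here). */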
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      9a8178b8
    • evl/thread: clarify naming, complete kthread-specific interface · 6b4e7e3a
      Philippe Gerum authored
      
      
      By convention, all thread-related calls which implicitly affect
      current and therefore do not take any @thread parameter should use a
      short-form name, such as evl_delay(), evl_sleep(). For this reason,
      the following renames took place:
      
      - evl_set_thread_period -> evl_set_period
      - evl_wait_thread_period -> evl_wait_period
      - evl_delay_thread -> evl_delay
      
      In addition, complete the set of kthread-specific calls which are
      based on the inner thread interface (this one working for user and
      kernel threads indifferently):
      
      - evl_kthread_unblock
      - evl_kthread_join
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      6b4e7e3a
    • evl/clock: fix discrepancies in set_time() handler · ac658ccd
      Philippe Gerum authored
      
      
      This is an internal interface which should deal with ktime directly,
      not timespec64. In addition, rename to set() in order to match the
      converse short form read() call.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      ac658ccd
    • evl trace: fix the occasional NULL pointer references · 8f6869a6
      lio via Evl authored and Philippe Gerum committed
      
      
      The trace event *evl_sched_attrs* calls TP_printk("%s") to print the
      thread name obtained from evl_element_name().

      However, evl_element_name() may sometimes return NULL, and passing
      that result to TP_printk("%s") may cause problems.

      This patch avoids this.
      Signed-off-by: lio <carver4lio@163.com>
      8f6869a6
    • 0d0af4b4
    • evl/timer: remove timer priority logic · c91b3540
      Philippe Gerum authored
      
      
      Prioritization of timers in timer queues dates back to the Dark Ages
      of Xenomai 2.x, when multiple time bases would co-exist in the core,
      some of which representing date values as a count of periodic ticks.
      In such a case, multiple timers might elapse on the very same tick,
      hence the need for prioritizing them.
      
      With a single time base indexing timers on absolute date values, which
      are expressed as a 64bit monotonic count of nanoseconds, the
      likelihood of observing identical trigger dates is very low.
      
      Furthermore, the formerly defined priorities were assigned as
      follows:
      
      1) high priority to the per-thread periodic and resource timers
      2) medium priority to the user-defined timers
      3) low priority to the in-band tick emulation timer
      
      It turns out that forcibly prioritizing 1) over 2) is at least
      debatable, if not questionable: resource timers have no high priority
      at all; they merely tick on the (unlikely) timeout condition. On the
      other hand, user-defined timers may well deal with high priority
      events only some EVL driver code may know about.
      
      Finally, handling 3) is a fast operation on top of Dovetail, which is
      already deferred internally whenever the timer management core detects
      that some oob activity is running/pending.
      
      So we may remove the logic handling the timer priority, only relying
      on the trigger date for dispatching. This should save precious cycles
      in the hot path without any actual downside.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      c91b3540
    • evl/thread: rename evl_cancel_kthread() to evl_stop_kthread() · 7a2fbdca
      Philippe Gerum authored
      
      
      Align naming of the kthread termination-related calls on their
      in-band counterparts. While at it, further clarify the interface by
      having evl_kthread_should_stop() explicitly return a boolean status.
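
      The resulting kthread shutdown pattern mirrors the in-band
      kthread_should_stop() convention; the loop body below is
      hypothetical:

      static void my_kthread_fn(void *arg)
      {
              while (!evl_kthread_should_stop()) {
                      /* ... wait for and process out-of-band work ... */
              }
              /* evl_stop_kthread() was called on us: exit cleanly. */
      }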
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      7a2fbdca
    • evl/sched: perform early lookup into FIFO class to pick threads · 995566f9
      Philippe Gerum authored
      
      
      This change first asserts that the FIFO class is the topmost
      scheduling class by design.  From this point, we may check this class
      upfront when looking for the next runnable thread to pick, without
      going through the indirection of its .sched_pick handler.
      
      This allows the compiler to fold most of the FIFO picking code into
      the generic __pick_next_thread() routine, saving an indirect call.
      This is nicer to the I-cache in all cases, and spares the cycles which
      would otherwise be consumed by some vulnerability mitigation code like
      retpolines.
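
      A simplified sketch of the resulting pick path; every name below
      except the .sched_pick handler is an assumption:

      static struct evl_thread *sketch_pick_next_thread(struct evl_rq *rq)
      {
              struct evl_sched_class *sched_class;
              struct evl_thread *thread;

              /* FIFO is the topmost class: try it inline first, so the
               * compiler can fold the common case into the caller. */
              thread = sketch_fifo_pick(rq);          /* assumed inline helper */
              if (likely(thread))
                      return thread;

              /* Fall back to the lower classes via their handler. */
              for_each_lower_sched_class(sched_class) {  /* assumed iterator */
                      thread = sched_class->sched_pick(rq);
                      if (thread)
                              return thread;
              }

              return NULL;
      }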
      
      On a highly cache-stressed i.mx6q, the worst-case latency figures
      dropped by about 5% with this change in.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      995566f9
    • evl/sched: fifo: drop rq rotation handler · 21b285b2
      Philippe Gerum authored
      
      
      There is no user for this one.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      21b285b2
    • d364a6ee
    • evl/poll: allow for polling multiple heads concurrently · 4b429ce0
      Philippe Gerum authored
      
      
      Some drivers may need to poll multiple heads concurrently on a single
      oob_poll() invocation. Replace the single wait.next backlink to the
      poll head with an array of connectors to multiple heads. The
      oob_poll() handler of a driver can now attach the wait descriptor it
      receives (struct oob_poll_wait) to up to EVL_POLL_NR_CONNECTORS
      different poll heads (currently set to 4).
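
      A driver-side sketch of an oob_poll() handler taking advantage of
      this; evl_poll_watch() and the private state layout are assumptions,
      only struct oob_poll_wait and the connector limit come from this
      change:

      static __poll_t my_oob_poll(struct file *filp, struct oob_poll_wait *wait)
      {
              struct my_state *s = filp->private_data;    /* assumed */
              __poll_t ready = 0;

              /* Attach the same wait descriptor to two distinct poll
               * heads, which the connector array now allows (up to
               * EVL_POLL_NR_CONNECTORS, i.e. 4, per invocation). */
              evl_poll_watch(&s->rx_poll_head, wait);     /* assumed attach helper */
              evl_poll_watch(&s->tx_poll_head, wait);

              if (s->rx_pending)
                      ready |= EPOLLIN | EPOLLRDNORM;
              if (s->tx_room)
                      ready |= EPOLLOUT | EPOLLWRNORM;

              return ready;
      }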
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      4b429ce0
    • include/evl: wait: do not drag tracepoints in · c8ed3db2
      Philippe Gerum authored
      
      
      Core EVL headers should be readable from the asm-generic/ section
      (e.g. evl/wait.h), which among other things requires the tracepoint
      definitions not to be pulled in. The latter should be read from the
      files implementing them instead.
      Signed-off-by: Philippe Gerum <rpm@xenomai.org>
      c8ed3db2