1. 26 Jul, 2013 3 commits
  2. 14 Jul, 2013 1 commit
    • arm: delete __cpuinit/__CPUINIT usage from all ARM users · 8bd26e3a
      Paul Gortmaker authored
      The __cpuinit type of throwaway sections might have made sense
      some time ago when RAM was more constrained, but now the savings
      do not offset the cost and complications.  For example, the fix in
      commit 5e427ec2 ("x86: Fix bit corruption at CPU resume time")
      is a good example of the nasty type of bugs that can be created
      with improper use of the various __init prefixes.
      
      After a discussion on LKML[1] it was decided that cpuinit should go
      the way of devinit and be phased out.  Once all the users are gone,
      we can then finally remove the macros themselves from linux/init.h.
      
      Note that some harmless section mismatch warnings may result, since
      notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
      and are flagged as __cpuinit  -- so if we remove the __cpuinit from
      the arch specific callers, we will also get section mismatch warnings.
      As an intermediate step, we intend to turn the linux/init.h cpuinit
      related content into no-ops as early as possible, since that will get
      rid of these warnings.  In any case, they are temporary and harmless.
      
      This removes all the ARM uses of the __cpuinit macros from C code,
      and all __CPUINIT from assembly code. Two ".previous" section
      statements that were paired with __CPUINIT (aka .section
      ".cpuinit.text") are also removed here.
      
      [1] https://lkml.org/lkml/2013/5/20/589
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  3. 09 Jul, 2013 2 commits
  4. 04 Jul, 2013 2 commits
  5. 03 Jul, 2013 1 commit
  6. 29 Jun, 2013 1 commit
  7. 26 Jun, 2013 6 commits
  8. 24 Jun, 2013 4 commits
  9. 21 Jun, 2013 1 commit
    • sched_clock: Add temporary asm/sched_clock.h · 629a6a2b
      Stephen Boyd authored
      Some new users of the ARM sched_clock framework are going through
      the arm-soc tree. Before 38ff87f7 (sched_clock: Make ARM's
      sched_clock generic for all architectures, 2013-06-01) the header
      file was in asm, but now it's in linux. One solution would be to
      do an evil merge of the arm-soc tree and fix up the asm users,
      but it's easier to add a temporary asm header that we can remove
      along with the few stragglers after the merge window is over.
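      The temporary header is presumably nothing more than a redirect; a plausible sketch of what arch/arm/include/asm/sched_clock.h could contain (assumed content, not quoted from the patch):

```c
/* arch/arm/include/asm/sched_clock.h -- temporary shim (sketch).
 * Forwards old-style <asm/sched_clock.h> users to the new location
 * until the stragglers in arm-soc are converted, then it is deleted. */
#include <linux/sched_clock.h>
```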
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
  10. 20 Jun, 2013 2 commits
    • ARM: kernel: implement stack pointer save array through MPIDR hashing · 7604537b
      Lorenzo Pieralisi authored
      The current implementation of cpu_suspend/cpu_resume relies on the
      MPIDR to index the array of pointers where the context is saved and
      restored. This approach works as long as the MPIDR can be considered
      a linear index, so that the pointer array can simply be dereferenced
      using the MPIDR[7:0] value.
      On ARM multi-cluster systems, where the MPIDR may not be a linear
      index, a mapping function must be applied to it before it can
      properly be used for array look-ups.
      
      This patch adds code to the cpu_suspend/cpu_resume implementation
      that relies on a shifting-and-ORing hashing method to map an MPIDR
      value to a set of buckets precomputed at boot, giving a
      collision-free mapping from MPIDR to context pointers.
      
      The hashing algorithm must be simple, fast, and implementable with
      few instructions, since in the cpu_resume path the mapping is
      carried out with the MMU and I-cache off, hence code and data are
      fetched from DRAM with no caching available. Simplicity is
      counterbalanced by a small increase in (dynamically allocated)
      memory for the stack pointer buckets, which should anyway be fairly
      limited on most systems.
      
      Memory for context pointers is allocated in an early_initcall, with
      the size precomputed and stashed previously in kernel data
      structures. The memory is allocated through kmalloc; this guarantees
      contiguous physical addresses for the allocated buffer, which is
      fundamental to the correct functioning of the resume mechanism,
      since the context pointer array must be a chunk of physically
      contiguous memory. Virtual-to-physical address conversion for the
      context pointer array base is carried out at boot to avoid fiddling
      with virt_to_phys conversions in the cpu_resume path, which is quite
      fragile and should be optimized to execute as few instructions as
      possible.
      The virtual and physical base addresses of the context pointer array
      are stashed in a struct that is accessible from assembly using
      values generated through the asm-offsets.c mechanism.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Colin Cross <ccross@android.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Amit Kucheria <amit.kucheria@linaro.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Nicolas Pitre <nico@linaro.org>
      Tested-by: Shawn Guo <shawn.guo@linaro.org>
      Tested-by: Kevin Hilman <khilman@linaro.org>
      Tested-by: Stephen Warren <swarren@wwwdotorg.org>
    • ARM: kernel: build MPIDR hash function data structure · 8cf72172
      Lorenzo Pieralisi authored
      On ARM SMP systems, cores are identified by their MPIDR register.
      The MPIDR guidelines in the ARM ARM do not strictly enforce an MPIDR
      layout; they only provide recommendations that, if followed, split
      the MPIDR on 32-bit ARM platforms into three affinity levels. In
      multi-cluster systems like big.LITTLE, if the affinity guidelines
      are followed, the MPIDR can no longer be considered an index. This
      means that the association between a logical CPU in the kernel and
      the HW CPU identifier becomes somewhat more complicated, requiring
      methods like hashing to associate a given MPIDR with a CPU logical
      index so that the look-up can be carried out in an efficient and
      scalable way.
      
      This patch provides a kernel function that, starting from the
      cpu_logical_map, implements collision-free hashing of MPIDR values
      by checking all significant bits of the MPIDR affinity-level
      bitfields. The hashing can then be carried out through bit shifting
      and ORing; the resulting hash algorithm is collision-free, though
      not minimal, and can be executed with few assembly instructions. The
      MPIDR is filtered through an MPIDR mask that is built by checking
      all bits that toggle in the set of MPIDRs corresponding to possible
      CPUs. Bits that do not toggle carry no information, so they do not
      contribute to the resulting hash.
      
      Pseudocode:
      
      /* check all bits that toggle, so they are required */
      for (i = 1, mpidr_mask = 0; i < num_possible_cpus(); i++)
      	mpidr_mask |= (cpu_logical_map(i) ^ cpu_logical_map(0));
      
      /*
       * Build the shifts to be applied to aff0, aff1, aff2 values to hash the mpidr.
       * fls() returns the position of the last (most significant) bit set in a word, 0 if none.
       * ffs() returns the position of the first (least significant) bit set in a word, 0 if none.
       */
      fs0 = mpidr_mask[7:0] ? ffs(mpidr_mask[7:0]) - 1 : 0;
      fs1 = mpidr_mask[15:8] ? ffs(mpidr_mask[15:8]) - 1 : 0;
      fs2 = mpidr_mask[23:16] ? ffs(mpidr_mask[23:16]) - 1 : 0;
      ls0 = fls(mpidr_mask[7:0]);
      ls1 = fls(mpidr_mask[15:8]);
      ls2 = fls(mpidr_mask[23:16]);
      bits0 = ls0 - fs0;
      bits1 = ls1 - fs1;
      bits2 = ls2 - fs2;
      aff0_shift = fs0;
      aff1_shift = 8 + fs1 - bits0;
      aff2_shift = 16 + fs2 - (bits0 + bits1);
      u32 hash(u32 mpidr) {
      	u32 l0, l1, l2;
      	u32 mpidr_masked = mpidr & mpidr_mask;
      	l0 = mpidr_masked & 0xff;
      	l1 = mpidr_masked & 0xff00;
      	l2 = mpidr_masked & 0xff0000;
      	return (l0 >> aff0_shift | l1 >> aff1_shift | l2 >> aff2_shift);
      }
      
      The hashing algorithm relies on the properties set out in the ARM
      ARM recommendations for the MPIDR. Exotic configurations, where for
      instance the MPIDR values at a given affinity level contain large
      holes, can end up requiring big hash tables, since the compression
      of values achievable through shifting is somewhat crippled when
      holes are present. The kernel warns if the number of buckets in the
      resulting hash table exceeds the number of possible CPUs by a factor
      of 4, which is a symptom of a very sparse HW MPIDR configuration.
      
      The hash algorithm is quite simple and can easily be implemented in
      assembly code, to be used in code paths where the kernel virtual
      address space is not set up (i.e. cpu_resume) and instruction and
      data fetches are strongly ordered, so code must be compact and must
      carry out few data accesses.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Colin Cross <ccross@android.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Amit Kucheria <amit.kucheria@linaro.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Nicolas Pitre <nico@linaro.org>
      Tested-by: Shawn Guo <shawn.guo@linaro.org>
      Tested-by: Kevin Hilman <khilman@linaro.org>
      Tested-by: Stephen Warren <swarren@wwwdotorg.org>
  11. 17 Jun, 2013 4 commits
  12. 12 Jun, 2013 1 commit
  13. 07 Jun, 2013 8 commits
    • ARM: mpu: add MPU initialisation for secondary cores · eb08375e
      Jonathan Austin authored
      The MPU initialisation on the primary core is performed in two
      stages: one minimal stage to ensure the CPU can boot, and a second
      one after sanity_check_meminfo. As the memory configuration is known
      by the time we boot secondary cores, only a single step is
      necessary, provided the values for the DRSR are passed to the
      secondaries.
      
      This patch implements this arrangement. The configuration generated
      for the MPU regions is made available to the secondary core, which
      can then use the asm MPU initialisation code to program a complete
      region configuration.
      
      This is necessary for SMP configurations without an MMU, as MPU
      initialisation is the only way to ensure that memory is specified as
      'shared'.
      Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      CC: Nicolas Pitre <nico@linaro.org>
    • ARM: mpu: add early bring-up code for the ARMv7 PMSA-compliant MPU · 67c9845b
      Jonathan Austin authored
      This patch adds initial support for using the MPU, which is necessary for
      SMP operation on PMSAv7 processors because it is the only way to ensure
      memory is shared. This is an initial patch and full SMP support is added
      later in this series.
      
      The setup of the MPU is performed in a way analogous to that for the
      MMU: very early initialisation before the C environment is brought
      up, followed by a sanity check and more complete initialisation in C.
      
      This patch provides the simplest possible memory region configuration:
      MPU_PROBE_REGION: Reserved for probing MPU details, not enabled
      MPU_BG_REGION: A 'background' region that specifies all memory strongly ordered
      MPU_RAM_REGION: A single shared, cacheable, normal region for the valid RAM.
      
      In this early initialisation code we simply map the whole of the address
      space with the BG_REGION and (at least) the kernel with the RAM_REGION. The
      MPU has region alignment constraints that require us to round past the end
      of the kernel.
      
      As region 2 has a higher priority than region 1, it overrides the strongly-
      ordered behaviour for RAM only.
      
      Subsequent patches will add more complete initialisation from the C-world
      and support for bringing up secondary CPUs.
      Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      CC: Hyok S. Choi <hyok.choi@samsung.com>
    • ARM: mpu: add header for MPU register layouts and region data · a2b45b0d
      Jonathan Austin authored
      This commit adds definitions relevant to the ARMv7 PMSA-compliant
      MPU.
      
      The register layouts and region configuration data are made
      accessible to asm as well as C code so that they can be used in
      early bring-up of the MPU.
      
      The MPU region information structs assume that the properties for
      the I and D sides are the same, though the implementation could be
      trivially extended for future platforms where this is no longer
      true.
      
      The MPU_*_REGION defines are used for the basic, static MPU region
      setup.
      Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
    • ARM: mpu: add PMSA related registers and bitfields to existing headers · aca7e592
      Jonathan Austin authored
      This patch adds the following definitions relevant to the PMSA:
      
      Add SCTLR bit 17 (CR_BR, the Background Region bit) to the list of
      CR_* bitfields. This bit determines whether the architecturally
      defined memory map is used as the background region.
      
      Add the MPUIR, the MPU Type Register, to the registers available
      through the read_cpuid macro.
      Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
      Reviewed-by: Will Deacon <will.deacon@arm.com>
      CC: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
    • ARM: nommu: add stub local_flush_bp_all() for !CONFIG_MMU · 8d655d83
      Jonathan Austin authored
      Since the merging of Will's tlb-ops branch, specifically 89c7e4b8
      (ARM: 7661/1: mm: perform explicit branch predictor maintenance
      when required), building SMP without CONFIG_MMU has been broken.
      
      The local_flush_bp_all function is only called for operations
      related to changing the kernel's view of memory and ASID rollover,
      both of which are irrelevant to an !MMU kernel.
      
      This patch adds a stub local_flush_bp_all() function alongside the
      other TLB maintenance stubs and restores the ability to build an
      SMP !MMU kernel.
      Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
    • ARM: nommu: provide dummy cpu_switch_mm implementation · 02ed1c7b
      Will Deacon authored
      cpu_switch_mm is a logical nop on nommu systems, so define it as such
      when !CONFIG_MMU.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • ARM: nommu: define dummy TLB operations for nommu configurations · 5c709e69
      Will Deacon authored
      nommu platforms do not perform address translation and therefore clearly
      don't have TLBs. However, some SMP code assumes the presence of the TLB
      flushing routines and will therefore fail to compile for a nommu system.
      
      This patch defines dummy local_* TLB operations and #defines
      tlb_ops_need_broadcast() as 0, therefore causing the usual ARM SMP TLB
      operations to call the local variants instead.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      CC: Nicolas Pitre <nico@linaro.org>
    • clocksource: arch_timer: use virtual counters · 0d651e4e
      Mark Rutland authored
      Switching between reading the virtual or physical counters is
      problematic, as some core code wants a view of time before we're fully
      set up. Using a function pointer and switching the source after the
      first read can make time appear to go backwards, and having a check in
      the read function is an unfortunate block on what we want to be a fast
      path.
      
      Instead, this patch makes us always use the virtual counters. If we're a
      guest, or don't have hyp mode, we'll use the virtual timers, and as such
      don't care about CNTVOFF as long as it doesn't change in such a way as
      to make time appear to travel backwards. As the guest will use the
      virtual timers, a (potential) KVM host must use the physical timers
      (which can wake up the host even if they fire while a guest is
      executing), and hence a host must have CNTVOFF set to zero so as to have
      a consistent view of time between the physical timers and virtual
      counters.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Rob Herring <rob.herring@calxeda.com>
  14. 06 Jun, 2013 1 commit
    • arch, mm: Remove tlb_fast_mode() · 29eb7782
      Peter Zijlstra authored
      Since the introduction of the preemptible mmu_gather, TLB fast mode
      has been broken. TLB fast mode relies on there being absolutely no
      concurrency; it frees pages first and invalidates TLBs later.
      
      However, now we can get concurrency and stuff goes *bang*.
      
      This patch removes all tlb_fast_mode() code; it was found to be the
      better option versus trying to patch the hole by entangling TLB
      invalidation with the scheduler.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Luck <tony.luck@intel.com>
      Reported-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 05 Jun, 2013 1 commit
    • ARM: 7747/1: pcpu: ensure __my_cpu_offset cannot be re-ordered across barrier() · 509eb76e
      Will Deacon authored
      __my_cpu_offset is non-volatile, since we want its value to be cached
      when we access several per-cpu variables in a row with preemption
      disabled. This means that we rely on preempt_{en,dis}able to hazard
      with the operation via the barrier() macro, so that we can't end up
      migrating CPUs without reloading the per-cpu offset.
      
      Unfortunately, GCC doesn't treat a "memory" clobber on a
      non-volatile asm block as a side-effect, and will happily re-order
      it before other memory clobbers (including those in
      preempt_disable()) and cache the value. This has been observed to
      break the cmpxchg logic in the slub allocator, leading to livelock
      in kmem_cache_alloc in mainline kernels.
      
      This patch adds a dummy memory input operand to __my_cpu_offset,
      forcing it to be ordered with respect to the barrier() macro.
      
      Cc: <stable@vger.kernel.org>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Reviewed-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  16. 04 Jun, 2013 2 commits