1. 27 Nov, 2008 1 commit
    • Martin Schwidefsky's avatar
      [S390] fix system call parameter functions. · 59da2139
      Martin Schwidefsky authored
      
      
      syscall_get_nr() currently returns a valid result only if the call
      chain of the traced process includes do_syscall_trace_enter(). But
      collect_syscall() can be called for any sleeping task, the result of
      syscall_get_nr() in general is completely bogus.
      
      To make syscall_get_nr() work for any sleeping task the traps field
      in pt_regs is replace with svcnr - the system call number the process
      is executing. If svcnr == 0 the process is not on a system call path.
      
      The syscall_get_arguments and syscall_set_arguments use regs->gprs[2]
      for the first system call parameter. This is incorrect since gprs[2]
      may have been overwritten with the system call number if the call
      chain includes do_syscall_trace_enter. Use regs->orig_gprs2 instead.
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      59da2139
  2. 14 Nov, 2008 6 commits
  3. 28 Oct, 2008 6 commits
    • Christian Borntraeger's avatar
      [S390] s390: Fix build for !CONFIG_S390_GUEST + CONFIG_VIRTIO_CONSOLE · ea4bfdf5
      Christian Borntraeger authored
      The s390 kernel does not compile if virtio console is enabled, but guest
      support is disabled:
      
        LD      .tmp_vmlinux1
      arch/s390/kernel/built-in.o: In function `setup_arch':
      /space/linux-2.5/arch/s390/kernel/setup.c:773: undefined reference to
      `s390_virtio_console_init'
      
      The fix is related to
      commit 99e65c92
      
      
      Author: Christian Borntraeger <borntraeger@de.ibm.com>
      Date:   Fri Jul 25 15:50:04 2008 +0200
          KVM: s390: Fix guest kconfig
      
      Which changed the build process to build kvm_virtio.c only if CONFIG_S390_GUEST
      is set. We must ifdef the prototype in the header file accordingly.
      Reported-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      ea4bfdf5
    • Heiko Carstens's avatar
      [S390] No more 4kb stacks. · 7f5a8ba6
      Heiko Carstens authored
      
      
      We got a stack overflow with a small stack configuration on a 32 bit
      system. It just looks like as 4kb isn't enough and too dangerous.
      So lets get rid of 4kb stacks on 32 bit.
      
      But one thing I completely dislike about the call trace below is that
      just for debugging or tracing purposes sprintf gets called (cio_start_key):
      
      	/* process condition code */
      	sprintf(dbf_txt, "ccode:%d", ccode);
      	CIO_TRACE_EVENT(4, dbf_txt);
      
      But maybe its just me who thinks that this could be done better.
      
          <4>Kernel stack overflow.
          <4>Modules linked in: dm_multipath sunrpc bonding qeth_l2 dm_mod qeth ccwgroup vmur
          <4>CPU: 1 Not tainted 2.6.27-30.x.20081015-s390default #1
          <4>Process httpd (pid: 3807, task: 20ae2df8, ksp: 1666fb78)
          <4>Krnl PSW : 040c0000 8027098a (number+0xe/0x348)
          <4>           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0
          <4>Krnl GPRS: 00d43318 0027097c 1666f277 9666f270
          <4>           00000000 00000000 0000000a ffffffff
          <4>           9666f270 1666f228 1666f277 1666f098
          <4>           00000002 80270982 80271016 1666f098
          <4>Krnl Code: 8027097e: f0340dd0a7f1	srp	3536(4,%r0),2033(%r10),4
          <4>           80270984: 0f00		clcl	%r0,%r0
          <4>           80270986: a7840001		brc	8,80270988
          <4>          >8027098a: 18ef		lr	%r14,%r15
          <4>           8027098c: a7faff68		ahi	%r15,-152
          <4>           80270990: 18bf		lr	%r11,%r15
          <4>           80270992: 18a2		lr	%r10,%r2
          <4>           80270994: 1893		lr	%r9,%r3
      
      Modified calltrace with annotated stackframe size of each function:
      
      stackframe size
          |
       0 304 vsnprintf+850 [0x271016]
       1  72 sprintf+74 [0x271522]
       2  56 cio_start_key+262 [0x2d4c16]
       3  56 ccw_device_start_key+222 [0x2dfe92]
       4  56 ccw_device_start+40 [0x2dff28]
       5  48 raw3215_start_io+104 [0x30b0f8]
       6  56 raw3215_write+494 [0x30ba0a]
       7  40 con3215_write+68 [0x30bafc]
       8  40 __call_console_drivers+146 [0x12b0fa]
       9  32 _call_console_drivers+102 [0x12b192]
      10  64 release_console_sem+268 [0x12b614]
      11 168 vprintk+462 [0x12bca6]
      12  72 printk+68 [0x12bfd0]
      13 256 __print_symbol+50 [0x15a882]
      14  56 __show_trace+162 [0x103d06]
      15  32 show_trace+224 [0x103e70]
      16  48 show_stack+152 [0x103f20]
      17  56 dump_stack+126 [0x104612]
      18  96 __alloc_pages_internal+592 [0x175004]
      19  80 cache_alloc_refill+776 [0x196f3c]
      20  40 __kmalloc+258 [0x1972ae]
      21  40 __alloc_skb+94 [0x328086]
      22  32 pskb_copy+50 [0x328252]
      23  32 skb_realloc_headroom+110 [0x328a72]
      24 104 qeth_l2_hard_start_xmit+378 [0x7803bfde]
      25  56 dev_hard_start_xmit+450 [0x32ef6e]
      26  56 __qdisc_run+390 [0x3425d6]
      27  48 dev_queue_xmit+410 [0x331e06]
      28  40 ip_finish_output+308 [0x354ac8]
      29  56 ip_output+218 [0x355b6e]
      30  24 ip_local_out+56 [0x354584]
      31 120 ip_queue_xmit+300 [0x355cec]
      32  96 tcp_transmit_skb+812 [0x367da8]
      33  40 tcp_push_one+158 [0x369fda]
      34 112 tcp_sendmsg+852 [0x35d5a0]
      35 240 sock_sendmsg+164 [0x32035c]
      36  56 kernel_sendmsg+86 [0x32064a]
      37  88 sock_no_sendpage+98 [0x322b22]
      38 104 tcp_sendpage+70 [0x35cc1e]
      39  48 sock_sendpage+74 [0x31eb66]
      40  64 pipe_to_sendpage+102 [0x1c4b2e]
      41  64 __splice_from_pipe+120 [0x1c5340]
      42  72 splice_from_pipe+90 [0x1c57e6]
      43  56 generic_splice_sendpage+38 [0x1c5832]
      44  48 do_splice_from+104 [0x1c4c38]
      45  48 direct_splice_actor+52 [0x1c4c88]
      46  80 splice_direct_to_actor+180 [0x1c4f80]
      47  72 do_splice_direct+70 [0x1c5112]
      48  64 do_sendfile+360 [0x19de18]
      49  72 sys_sendfile64+126 [0x19df32]
      50 336 sysc_do_restart+18 [0x111a1a]
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      7f5a8ba6
    • Heiko Carstens's avatar
      [S390] Change default IPL method to IPL_VM. · 46e7951f
      Heiko Carstens authored
      
      
      allyesconfig and allmodconfig built kernels have a tape IPL record.
      A the vmreader record makes much more sense, since hardly anybody will
      ever IPL a kernel from tape. So change the default.
      As I side effect I can test these kernels without fiddling around with
      the kernel config ;)
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      46e7951f
    • Roel Kluin's avatar
      [S390] appldata: unsigned ops->size cannot be negative · 13f8b7c5
      Roel Kluin authored
      
      
      unsigned ops->size cannot be negative
      Signed-off-by: default avatarRoel Kluin <roel.kluin@gmail.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      13f8b7c5
    • Heiko Carstens's avatar
      [S390] Fix sysdev class file creation. · da5aae70
      Heiko Carstens authored
      
      
      Use sysdev_class_create_file() to create create sysdev class attributes
      instead of sysfs_create_file(). Using sysfs_create_file() wasn't a very
      good idea since the show and store functions have a different amount of
      parameters for sysfs files and sysdev class files.
      In particular the pointer to the buffer is the last argument and
      therefore accesses to random memory regions happened.
      Still worked surprisingly well until we got a kernel panic.
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      da5aae70
    • Christian Borntraeger's avatar
      [S390] pgtables: Fix race in enable_sie vs. page table ops · 250cf776
      Christian Borntraeger authored
      
      
      The current enable_sie code sets the mm->context.pgstes bit to tell
      dup_mm that the new mm should have extended page tables. This bit is also
      used by the s390 specific page table primitives to decide about the page
      table layout - which means context.pgstes has two meanings. This can cause
      any kind of bugs. For example  - e.g. shrink_zone can call
      ptep_clear_flush_young while enable_sie is running. ptep_clear_flush_young
      will test for context.pgstes. Since enable_sie changed that value of the old
      struct mm without changing the page table layout ptep_clear_flush_young will
      do the wrong thing.
      The solution is to split pgstes into two bits
      - one for the allocation
      - one for the current state
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      250cf776
  4. 20 Oct, 2008 3 commits
    • Matt Helsley's avatar
      container freezer: implement freezer cgroup subsystem · dc52ddc0
      Matt Helsley authored
      
      
      This patch implements a new freezer subsystem in the control groups
      framework.  It provides a way to stop and resume execution of all tasks in
      a cgroup by writing in the cgroup filesystem.
      
      The freezer subsystem in the container filesystem defines a file named
      freezer.state.  Writing "FROZEN" to the state file will freeze all tasks
      in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
      the cgroup.  Reading will return the current state.
      
      * Examples of usage :
      
         # mkdir /containers/freezer
         # mount -t cgroup -ofreezer freezer  /containers
         # mkdir /containers/0
         # echo $some_pid > /containers/0/tasks
      
      to get status of the freezer subsystem :
      
         # cat /containers/0/freezer.state
         RUNNING
      
      to freeze all tasks in the container :
      
         # echo FROZEN > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         FREEZING
         # cat /containers/0/freezer.state
         FROZEN
      
      to unfreeze all tasks in the container :
      
         # echo RUNNING > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         RUNNING
      
      This is the basic mechanism which should do the right thing for user space
      task in a simple scenario.
      
      It's important to note that freezing can be incomplete.  In that case we
      return EBUSY.  This means that some tasks in the cgroup are busy doing
      something that prevents us from completely freezing the cgroup at this
      time.  After EBUSY, the cgroup will remain partially frozen -- reflected
      by freezer.state reporting "FREEZING" when read.  The state will remain
      "FREEZING" until one of these things happens:
      
      	1) Userspace cancels the freezing operation by writing "RUNNING" to
      		the freezer.state file
      	2) Userspace retries the freezing operation by writing "FROZEN" to
      		the freezer.state file (writing "FREEZING" is not legal
      		and returns EIO)
      	3) The tasks that blocked the cgroup from entering the "FROZEN"
      		state disappear from the cgroup's set of tasks.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: export thaw_process]
      Signed-off-by: default avatarCedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: default avatarMatt Helsley <matthltc@us.ibm.com>
      Acked-by: default avatarSerge E. Hallyn <serue@us.ibm.com>
      Tested-by: default avatarMatt Helsley <matthltc@us.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dc52ddc0
    • Matt Helsley's avatar
      container freezer: add TIF_FREEZE flag to all architectures · 83224b08
      Matt Helsley authored
      
      
      This patch series introduces a cgroup subsystem that utilizes the swsusp
      freezer to freeze a group of tasks.  It's immediately useful for batch job
      management scripts.  It should also be useful in the future for
      implementing container checkpoint/restart.
      
      The freezer subsystem in the container filesystem defines a cgroup file
      named freezer.state.  Reading freezer.state will return the current state
      of the cgroup.  Writing "FROZEN" to the state file will freeze all tasks
      in the cgroup.  Subsequently writing "RUNNING" will unfreeze the tasks in
      the cgroup.
      
      * Examples of usage :
      
         # mkdir /containers/freezer
         # mount -t cgroup -ofreezer freezer  /containers
         # mkdir /containers/0
         # echo $some_pid > /containers/0/tasks
      
      to get status of the freezer subsystem :
      
         # cat /containers/0/freezer.state
         RUNNING
      
      to freeze all tasks in the container :
      
         # echo FROZEN > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         FREEZING
         # cat /containers/0/freezer.state
         FROZEN
      
      to unfreeze all tasks in the container :
      
         # echo RUNNING > /containers/0/freezer.state
         # cat /containers/0/freezer.state
         RUNNING
      
      This patch:
      
      The first step in making the refrigerator() available to all
      architectures, even for those without power management.
      
      The purpose of such a change is to be able to use the refrigerator() in a
      new control group subsystem which will implement a control group freezer.
      
      [akpm@linux-foundation.org: fix sparc]
      Signed-off-by: default avatarCedric Le Goater <clg@fr.ibm.com>
      Signed-off-by: default avatarMatt Helsley <matthltc@us.ibm.com>
      Acked-by: default avatarPavel Machek <pavel@suse.cz>
      Acked-by: default avatarSerge E. Hallyn <serue@us.ibm.com>
      Acked-by: default avatarRafael J. Wysocki <rjw@sisk.pl>
      Acked-by: default avatarNigel Cunningham <nigel@tuxonice.net>
      Tested-by: default avatarMatt Helsley <matthltc@us.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      83224b08
    • Badari Pulavarty's avatar
      mm: cleanup to make remove_memory() arch-neutral · 71088785
      Badari Pulavarty authored
      
      
      There is nothing architecture specific about remove_memory().
      remove_memory() function is common for all architectures which support
      hotplug memory remove.  Instead of duplicating it in every architecture,
      collapse them into arch neutral function.
      
      [akpm@linux-foundation.org: fix the export]
      Signed-off-by: default avatarBadari Pulavarty <pbadari@us.ibm.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Gary Hade <garyhade@us.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      71088785
  5. 16 Oct, 2008 6 commits
  6. 15 Oct, 2008 2 commits
  7. 13 Oct, 2008 1 commit
  8. 10 Oct, 2008 8 commits
  9. 03 Oct, 2008 1 commit
    • Heiko Carstens's avatar
      [S390] nohz: Fix __udelay. · d3d238c7
      Heiko Carstens authored
      This fixes a regression that came with 934b2857
      
      
      ("[S390] nohz/sclp: disable timer on synchronous waits.").
      If udelay() gets called from a disabled context it sets the clock comparator
      to a value where it expects the next interrupt. When the interrupt happens
      the clock comparator gets not reset and therefore the interrupt condition
      doesn't get cleared. The result is an endless timer interrupt loop.
      
      In addition this patch fixes also the following:
      
      rcutorture reveals that our __udelay implementation is still buggy,
      since it might schedule tasklets, but prevents their execution:
      
      NOHZ: local_softirq_pending 42
      NOHZ: local_softirq_pending 02
      NOHZ: local_softirq_pending 142
      NOHZ: local_softirq_pending 02
      
      To fix this we make sure that only the clock comparator interrupt
      is enabled when the enabled wait psw is loaded.
      Also no code gets called anymore which might schedule tasklets.
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      d3d238c7
  10. 09 Sep, 2008 1 commit
  11. 08 Sep, 2008 1 commit
    • Manfred Spraul's avatar
      kernel/cpu.c: create a CPU_STARTING cpu_chain notifier · e545a614
      Manfred Spraul authored
      
      
      Right now, there is no notifier that is called on a new cpu, before the new
      cpu begins processing interrupts/softirqs.
      Various kernel function would need that notification, e.g. kvm works around
      by calling smp_call_function_single(), rcu polls cpu_online_map.
      
      The patch adds a CPU_STARTING notification. It also adds a helper function
      that sends the message to all cpu_chain handlers.
      
      Tested on x86-64.
      All other archs are untested. Especially on sparc, I'm not sure if I got
      it right.
      Signed-off-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      e545a614
  12. 06 Sep, 2008 1 commit
  13. 25 Aug, 2008 1 commit
  14. 21 Aug, 2008 2 commits