RE: [microblaze-uclinux] kernel BUG at sched.c:687!



Hi,
assuming there is a kernel stack overflow under high IRQ load, what
would you suggest to fix it? I suppose a bigger stack wouldn't really help;
it would just move the problem to a higher IRQ load limit.
CU, F@lk

> -----Original Message-----
> From: owner-microblaze-uclinux@xxxxxxxxxxxxxx
> [mailto:owner-microblaze-uclinux@xxxxxxxxxxxxxx]On Behalf Of Alejandro
> Lucero
> Sent: Friday, April 21, 2006 1:04 PM
> To: microblaze-uclinux@xxxxxxxxxxxxxx
> Subject: Re: [microblaze-uclinux] kernel BUG at sched.c:687!
> 
> 
> Hi Prasad,
> 
> This is my fault. I sent an email an hour ago but it seems to be delayed.
> I made a mistake in my instructions: the code must go AFTER SAVE_STATE.
> Otherwise the stack pointer in use could still be the user stack pointer,
> and the check would fail with a false positive.
> 
> Sorry again.
> 
> On Friday 21 April 2006 09:41, DeviPrasad Natesan wrote:
> > Hi Alejandro,
> > I put in your stack-checking code (with the "current" pointer updated to
> > mine) and, strange as it seems, the code goes into the loop just when
> > Linux is coming up. (The system works normally without this code.) I
> > checked with our custom hardware and also the ML403 eval board, and it
> > behaves the same. I removed all the custom applications, so it is a
> > simple system with Ethernet and UART, with root mounted on jffs2. This
> > by itself seems to corrupt the kernel stack (according to this test
> > code).
> >
> > Does it mean that the last "current" task corrupts the stack? I've got to
> > continue debugging tomorrow. It seems totally illogical at this point.
> >
> > - Prasad
> >
> > On 4/20/06, Alejandro Lucero <alucero@xxxxxxxxx> wrote:
> > > On Thursday 20 April 2006 13:29, Brettschneider Falk wrote:
> > > > Hi Alejandro,
> > > > thanks! Am I right that it will loop infinitely in case of a kernel
> > > > stack overflow? If yes, how could I write a fixed value to a register
> > > > address in that loop (e.g. to switch an LED on)? Going to try that
> > > > soon...
> > > > Cheers, F@lk
> > >
> > > If the CPU executes this infinite loop, it means the kernel stack usage
> > > is more than 7500 bytes, which is very close to 8192. Moreover, the real
> > > limit is 8192 - sizeof(struct task_struct), since the process descriptor
> > > is at the bottom. This does not imply a stack overflow, but the odds are
> > > high.
> > >
> > > If you know the LED address (look at
> > > arch/microblaze/platform/uclinux-auto/autoconfig.in) you can write to it
> > > with a swi instruction:
> > >
> > >         addi r11, r0, 0x1;  /* assuming 0x00000001 switches an LED on */
> > >         swi r11, r0, led_address;
> > >
> > > You can add this before the first nop instruction.
> > >
> > > > > -----Original Message-----
> > > > > From: owner-microblaze-uclinux@xxxxxxxxxxxxxx
> > > > > [mailto:owner-microblaze-uclinux@xxxxxxxxxxxxxx]On Behalf Of
> > > > > Alejandro Lucero
> > > > > Sent: Thursday, April 20, 2006 1:50 PM
> > > > > To: microblaze-uclinux@xxxxxxxxxxxxxx
> > > > > Subject: Re: [microblaze-uclinux] kernel BUG at sched.c:687!
> > > > >
> > > > > On Thursday 20 April 2006 10:31, Brettschneider Falk wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Alejandro Lucero wrote:
> > > > > > > > I assumed you are using the entry.S without my patch reported
> > > > > > > > two days ago, aren't you?
> > > > > >
> > > > > > > I've tried JW's version of your patch but it doesn't help as a
> > > > > > > bugfix. My environment is one active user app with several
> > > > > > > threads (SCHED_RR), a high IRQ frequency (about 2 per
> > > > > > > millisecond), many thread switches, many locks/unlocks of
> > > > > > > semaphores and mutexes. From time to time one thread of that
> > > > > > > application calls pthread_cancel() on another thread.
> > > > > > > Often (after about 20 kill actions) this leads to either a Linux
> > > > > > > crash (with several versions of "kernel BUG at sched.c:***"), or
> > > > > > > just a total hang, or an exit of the app with return code 5.
> > > > > > > (The statistical distribution is: scheduler bug displayed =
> > > > > > > 0.01%, Linux hang = 60%, process exit = rest.) I don't have the
> > > > > > > problems if either the IRQ frequency is very low or no threads
> > > > > > > are cancelled. That's why I asked you if you ever tried to kill
> > > > > > > threads in your application; this increases the chance of a
> > > > > > > Linux crash enormously here.
> > > > >
> > > > > > Perhaps you could do some tests to rule out a kernel stack overflow.
> > > > > > Try putting this in your entry.S file, but update the "current"
> > > > > > pointer and make sure you are not using memory at 0x554, 0x558,
> > > > > > 0xc64 and 0xc68 (surely LMB memory). This code looks at the kernel
> > > > > > stack size, and if it is greater than 0x1d4c (7500 bytes) the
> > > > > > system will execute an endless loop with interrupts disabled. The
> > > > > > maximum kernel stack size used is stored at 0xc64.
> > > > > >
> > > > > > Remember to update "current", which in my kernel is at address
> > > > > > 0x0213472c. Use
> > > > > > objdump -t image.elf | grep current
> > > > > >
> > > > > > Try putting this in ENTRY(irq) just after swi r1, r0, ENTRY_SP and
> > > > > > before SAVE_STATE
> > > > >
> > > > >                             swi r11, r0, 0x554
> > > > >                     swi r12, r0, 0x558
> > > > >                         lwi    r11, r0, 0x0213472c;
> > > > >                         addi   r11, r11, 0x2000;
> > > > >                         rsub   r11, r1, r11;
> > > > >                         lwi    r12, r0, 0xc64;
> > > > >                         swi    r11, r0, 0xc68;
> > > > >                         rsub   r11, r11, r12;
> > > > >                         bgei   r11, 1f;
> > > > >                         lwi    r11, r0, 0xc68;
> > > > >                         swi    r11, r0, 0xc64;
> > > > >                         1:
> > > > >                         lwi    r11, r0, 0x0213472c;
> > > > >                         addi   r11, r11, 0x2000;
> > > > >                         rsub   r11, r1, r11;
> > > > >                         addi   r12, r0, 0x1d4c;
> > > > >                         rsub   r11, r12, r11;
> > > > >                         blei   r11, 2f;
> > > > >                         lwi    r11, r0, 0;
> > > > >                         mts    rmsr, r11;
> > > > >                         nop;
> > > > >                         nop;
> > > > >                         nop;
> > > > >                         bri -8;
> > > > >                         2:
> > > > >                     lwi r11, r0, 0x554
> > > > >                     lwi r12, r0, 0x558
> > > > >
> > > > > > Cheers, F@lk
> > > > > > > ___________________________
> > > > > > > microblaze-uclinux mailing list
> > > > > > > microblaze-uclinux@xxxxxxxxxxxxxx
> > > > > > > Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
> > > > > > > Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/
> > >
> > > --
> > > Alejandro Lucero
> > > Technical Director
> > > +34 665 68 71 68
> > > Valencia (SPAIN)
> > > www.os3sl.com
> >
> 
> -- 
> 
> Alejandro Lucero
> Technical Director
> +34 665 68 71 68
> Valencia (SPAIN)
> www.os3sl.com
> 
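For reference, the arithmetic of the stack check quoted above can be modelled in plain C. This is only a sketch for reasoning about the numbers, not the code from entry.S: the helper names kstack_used and kstack_check are made up for illustration, "task" stands for the value loaded from "current", and "sp" stands for r1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KSTACK_SIZE  0x2000  /* 8192 bytes, the value added to "current" */
#define KSTACK_LIMIT 0x1d4c  /* 7500 bytes, the threshold in the check */

/* Hypothetical helper: bytes of kernel stack in use. "task" is the value
 * loaded from "current" (start of the 8192-byte area), "sp" is r1. The
 * result is signed because the assembly branches on signed comparisons. */
static int32_t kstack_used(uint32_t task, uint32_t sp)
{
    return (int32_t)(task + KSTACK_SIZE - sp);
}

/* Hypothetical helper: update the high-water mark (the word at 0xc64 in the
 * assembly) and return 1 where the assembly would enter its endless loop. */
static int kstack_check(uint32_t task, uint32_t sp, int32_t *max_used)
{
    int32_t used;

    assert(max_used != NULL);
    used = kstack_used(task, sp);
    if (used > *max_used)        /* the assembly skips this when max - used >= 0 */
        *max_used = used;
    return used > KSTACK_LIMIT;  /* the assembly continues when used - 7500 <= 0 */
}
```

With an assumed task structure at 0x02200000 and r1 = 0x02200200, kstack_used gives 7680 bytes, which is above the 7500-byte threshold. If r1 still points into a user stack, for example at 0x02100000, the result is about 1 MB, which is the false positive described in the correction about SAVE_STATE.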
___________________________
microblaze-uclinux mailing list
microblaze-uclinux@xxxxxxxxxxxxxx
Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/