Re: [microblaze-uclinux] kernel BUG at sched.c:687!
Hi Alejandro,
I added your stack-checking code (with the "current" pointer updated to
my kernel's address) and, strange as it seems, the code enters the
endless loop just as Linux is coming up. (Without this code the system
seems to work normally.) I checked on our custom hardware and also on
the ML403 eval board, and both behave the same. I removed all the custom
applications, so it is a simple system with Ethernet, a UART, and the
root filesystem mounted on JFFS2. This alone seems to corrupt the kernel
stack (according to this test code).
Does it mean that the last "current" task corrupts the stack? I have to
continue debugging tomorrow. It seems totally illogical at this point.
- Prasad
On 4/20/06, Alejandro Lucero <alucero@xxxxxxxxx> wrote:
> On Thursday 20 April 2006 13:29, Brettschneider Falk wrote:
> > Hi Alejandro,
> > thanks! Am I right that it will loop forever in case of a kernel stack
> > overflow? If yes, how could I write a fixed value to a register address in
> > that loop (e.g. to switch an LED on)? Going to try that soon...
> > Cheers, F@lk
>
> If the CPU executes this infinite loop, it means the kernel stack usage is
> more than 7500 bytes, which is very close to 8192. Moreover, the real limit
> is 8192 - sizeof(struct task_struct), since the process descriptor is at the
> bottom. This does not imply a stack overflow, but the odds are high.
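
The arithmetic in the paragraph above can be sketched in C. This is only an illustration: the 8192-byte area and the 7500-byte threshold come from the thread, while TASK_STRUCT_SIZE is a made-up placeholder (the real sizeof(struct task_struct) depends on the kernel version and configuration) and the function names are hypothetical.

```c
#define STACK_AREA_SIZE   8192u  /* stack plus process descriptor (from the thread) */
#define WARN_THRESHOLD    7500u  /* 0x1d4c, the limit used by the test code */
#define TASK_STRUCT_SIZE  1024u  /* ASSUMPTION: placeholder, not the real size */

/* Bytes the stack can grow before it reaches the process descriptor,
 * which sits at the bottom of the 8192-byte area. */
static unsigned usable_stack(unsigned task_struct_size)
{
    return STACK_AREA_SIZE - task_struct_size;
}

/* 1 if the measured usage exceeds the test code's 7500-byte threshold. */
static int trips_threshold(unsigned used)
{
    return used > WARN_THRESHOLD;
}

/* 1 if the measured usage has run into the process descriptor. */
static int overflows(unsigned used, unsigned task_struct_size)
{
    return used > usable_stack(task_struct_size);
}
```

With the placeholder size, usable_stack() returns 7168, so a usage that trips the 7500-byte threshold would already be past the descriptor. Whether that holds on a real system depends on the actual descriptor size.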
>
> If you know the LED address (look at
> arch/microblaze/platform/uclinux-auto/autoconfig.in) you can write to it
> with a swi instruction:
>
> addi r11, r0, 0x1; /* assuming 0x00000001 turns an LED on */
> swi r11, r0, led_address;
>
> You can add this before the first nop instruction.
>
> > > -----Original Message-----
> > > From: owner-microblaze-uclinux@xxxxxxxxxxxxxx
> > > [mailto:owner-microblaze-uclinux@xxxxxxxxxxxxxx]On Behalf Of Alejandro
> > > Lucero
> > > Sent: Thursday, April 20, 2006 1:50 PM
> > > To: microblaze-uclinux@xxxxxxxxxxxxxx
> > > Subject: Re: [microblaze-uclinux] kernel BUG at sched.c:687!
> > >
> > > On Thursday 20 April 2006 10:31, Brettschneider Falk wrote:
> > > > Hi,
> > > >
> > > > Alejandro Lucero wrote:
> > > > > > I assume you are using the entry.S without the patch I
> > > > > > reported two days ago, aren't you?
> > > >
> > > > > I've tried JW's version of your patch but it doesn't help as a bugfix.
> > > > > My environment is one active user app with several threads (SCHED_RR),
> > > > > a high IRQ frequency (about 2 per millisecond), many thread switches,
> > > > > and many locks/unlocks of semaphores and mutexes. From time to time
> > > > > one thread of that application calls pthread_cancel() on another
> > > > > thread. Often (after about 20 kill actions) this leads to either a
> > > > > Linux crash (with several versions of "kernel BUG at sched.c:***"),
> > > > > a total hang, or an exit of the app with return code 5. (The
> > > > > statistical distribution is: scheduler bug displayed = 0.01%, Linux
> > > > > hang = 60%, process exit = rest.) I don't have the problems if
> > > > > either the IRQ frequency is very low or no threads are cancelled.
> > > > > That's why I asked you if you ever tried to kill threads in your
> > > > > application; this increases the chance of a Linux crash extremely
> > > > > here.
> > >
> > > > Perhaps you could do some tests to rule out a kernel stack overflow.
> > > > Try putting this in your entry.S file, but update the "current"
> > > > pointer and make sure you are not using memory at 0x554, 0x558, 0xc64
> > > > and 0xc68 (surely LMB memory). This code looks at the kernel stack
> > > > size, and if it is greater than 0x1d4c (7500 bytes) the system will
> > > > execute an endless loop with interrupts disabled. The maximum kernel
> > > > stack size used is stored at 0xc64.
> > > >
> > > > Remember to update current, which in my kernel is at address
> > > > 0x0213472c. Use
> > > > objdump -t image.elf | grep current
> > > >
> > > > Try putting this in ENTRY(irq), just after swi r1, r0, ENTRY_SP and
> > > > before SAVE_STATE:
> > >
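> > > > swi r11, r0, 0x554;        /* save r11 and r12 in scratch memory */
> > > > swi r12, r0, 0x558;
> > > > lwi r11, r0, 0x0213472c;   /* r11 = current (bottom of the 8 KB area) */
> > > > addi r11, r11, 0x2000;     /* r11 = top of the kernel stack */
> > > > rsub r11, r1, r11;         /* r11 = top - sp = bytes of stack in use */
> > > > lwi r12, r0, 0xc64;        /* r12 = maximum usage seen so far */
> > > > swi r11, r0, 0xc68;        /* remember the current usage */
> > > > rsub r11, r11, r12;        /* r11 = maximum - current usage */
> > > > bgei r11, 1f;              /* no new maximum: skip the update */
> > > > lwi r11, r0, 0xc68;
> > > > swi r11, r0, 0xc64;        /* store the new maximum */
> > > > 1:
> > > > lwi r11, r0, 0x0213472c;   /* recompute the bytes of stack in use */
> > > > addi r11, r11, 0x2000;
> > > > rsub r11, r1, r11;
> > > > addi r12, r0, 0x1d4c;      /* r12 = 7500 */
> > > > rsub r11, r12, r11;        /* r11 = usage - 7500 */
> > > > blei r11, 2f;              /* within the limit: continue normally */
> > > > lwi r11, r0, 0;            /* r11 = word at address 0 */
> > > > mts rmsr, r11;             /* write r11 to the MSR */
> > > > nop;
> > > > nop;
> > > > nop;
> > > > bri -8;                    /* loop forever */
> > > > 2:
> > > > lwi r11, r0, 0x554;        /* restore r11 and r12 */
> > > > lwi r12, r0, 0x558;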
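
For readers who do not read MicroBlaze assembly, the check quoted above can be modelled in C roughly as follows. This is a sketch under assumptions, not the code that runs in the kernel: struct stack_monitor and check_stack() are hypothetical names, the two struct fields stand in for the words at 0xc64 and 0xc68, and current_base is the pointer value loaded from the address of "current". The result is only meaningful if sp really points into the 8 KB area of that task.

```c
#include <stdint.h>

#define STACK_AREA_SIZE 0x2000u  /* 8192 bytes, as in "addi r11, r11, 0x2000" */
#define STACK_LIMIT     0x1d4c   /* 7500 bytes, as in "addi r12, r0, 0x1d4c" */

/* Hypothetical stand-in for the two scratch words used by the assembly. */
struct stack_monitor {
    int32_t max_used;   /* role of the word at 0xc64 */
    int32_t last_used;  /* role of the word at 0xc68 */
};

/* Returns 1 where the assembly would enter its endless loop, else 0.
 * The comparisons are signed, like bgei and blei. */
static int check_stack(struct stack_monitor *m, uint32_t current_base,
                       uint32_t sp)
{
    /* top of the stack area minus the stack pointer = bytes in use */
    int32_t used = (int32_t)((current_base + STACK_AREA_SIZE) - sp);

    m->last_used = used;
    if (used > m->max_used)      /* new high-water mark */
        m->max_used = used;

    return used > STACK_LIMIT;   /* more than 7500 bytes in use */
}
```

Calling check_stack() once per interrupt with the current stack pointer mirrors what the assembly does at ENTRY(irq); max_used then holds the high-water mark that the assembly keeps at 0xc64.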
> > >
> > > > Cheers, F@lk
> > > > > ___________________________
> > > > > microblaze-uclinux mailing list
> > > > > microblaze-uclinux@xxxxxxxxxxxxxx
> > > > > Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
> > > > > Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/
>
> --
> Alejandro Lucero
> Technical Director
> +34 665 68 71 68
> Valencia (SPAIN)
> www.os3sl.com