Re: [microblaze-uclinux] Kernel module driver crash
If this is not too far off topic, I would like to post some simple steps that trigger the errors I was talking about.
1) transfer some data from the PC to the board
2) transfer some data from the board to the PC
For 1) I use "while true; do wget http://<server_addr>/image.ub; done"
For 2) I use "while true; do rm <somefile>*; wget http://<board_addr>/somefile; done"
I use boa as the web server on the board and put the <somefile> I want to transfer in /home/httpd (a symlink should suffice).
The two tasks run simultaneously.
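
The two loops above could be wrapped in one small script, so the same file can be started on both sides. This is only a sketch under some assumptions: the names SERVER_ADDR, BOARD_ADDR, SOMEFILE, FETCH and MAX_ITER are made up for illustration, the shell on the board is assumed to support functions and $(( )) arithmetic, and the "board" loop also removes old copies before each download, which the original command for 1) does not do.

```shell
#!/bin/sh
# Sketch of the two transfer loops; all variable names are hypothetical.
FETCH="${FETCH:-wget -q}"    # download command (assumption: wget is available)
MAX_ITER="${MAX_ITER:-0}"    # 0 means loop forever, as in the original commands

# fetch_loop URL: repeatedly delete old copies and download URL again.
fetch_loop() {
    url=$1
    name=${url##*/}          # file name part of the URL
    i=0
    while [ "$MAX_ITER" -eq 0 ] || [ "$i" -lt "$MAX_ITER" ]; do
        rm -f "$name"*       # also removes wget's numbered copies (name.1, ...)
        $FETCH "$url" || echo "fetch failed: $url" >&2
        i=$((i + 1))
    done
}

# Pick the loop by role: "board" pulls from the PC, "pc" pulls from the board.
case "${1:-}" in
    board) fetch_loop "http://${SERVER_ADDR:?set SERVER_ADDR}/image.ub" ;;
    pc)    fetch_loop "http://${BOARD_ADDR:?set BOARD_ADDR}/${SOMEFILE:?set SOMEFILE}" ;;
    *)     echo "usage: $0 board|pc" >&2 ;;
esac
```

It would be started with the "board" argument on the board and with the "pc" argument on the PC at the same time, then left running until the console shows an oops.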
After some time of activity I get, for example:
Data bus error exception in kernel mode.
Oops: bus exception, sig: 7
Registers dump: mode=47AA0938
r1=0000007C, r2=00000005, r3=00000076, r4=00000000
r5=FFFFFFFC, r6=47AA0938, r7=00000005, r8=4F444946
r9=434F4E4E, r10=47AA092C, r11=45435449, r12=00000004
r13=000005A0, r14=46EAB024, r15=46EA1470, r16=00000000
r17=46EAB038, r18=00000016, r19=46EA1778, r20=46EA15B4
r21=47AA08DF, r22=47AA0000, r23=00000088, r24=00000000
r25=47AA08CA, r26=47AA0000, r27=47AA0938, r28=47AA0000
r29=0000007C, r30=00000000, r31=47AA08CA, rPC=46EA5028
msr=46EA4BF8, ear=46EA4B44, esr=46EA52FC, fsr=47AA092C
Oops: Exception in kernel mode, sig: 4
Registers dump: mode=47EFFFC0
r1=00000000, r2=4420DE68, r3=4420DE68, r4=4420DE68
r5=00000000, r6=47FE2000, r7=55CD3FFF, r8=00000000
r9=00000000, r10=000000F8, r11=00000000, r12=000005A0
r13=000000F8, r14=47EE5608, r15=47FEBC14, r16=00000082
r17=47FEBC28, r18=3BDF211A, r19=47D95820, r20=00000001
r21=4400FC9C, r22=7E3FCD80, r23=44208BF4, r24=47FEBCC8
r25=00000000, r26=440018BC, r27=FFFFFFFF, r28=47FE3EA0
r29=440018C8, r30=00000001, r31=00000000, rPC=00000000
msr=47FE3EA0, ear=47FEBC14, esr=47FEBC14, fsr=44004250
Kernel panic - not syncing: Attempted to kill init!
<0>Rebooting in 120 seconds..Machine restart...
Stack:
47fe3d20 00000004 44018818 00000000 440ca64c 440190c8 44193244 0000402f
009f7ef6 00008000 00003f90 0000009f 44009904 0000119e ffffffff 00000001
00000000 00000010 0001d4bf 4400c9b0 441945a8 00000078 44209520 00000000
Call Trace:
[<44018818>] atomic_notifier_call_chain+0x8/0x1c
[<440ca64c>] bust_spinlocks+0x58/0x78
[<440190c8>] emergency_restart+0xc/0x20
[<44009904>] panic+0x180/0x218
[<4400c9b0>] do_exit+0x360/0x8cc
[<440032f8>] die+0x68/0x70
[<440032e4>] die+0x54/0x70
[<44003370>] _exception+0x70/0x80
[<44003424>] full_exception+0xa4/0x1a8
[<44116db8>] netif_receive_skb+0x278/0x284
[<44004250>] _interrupt+0x110/0x118
[<44024004>] hrtimer_get_remaining+0x34/0x8c
[<44116e84>] process_backlog+0xc0/0x1c4
[<4410e9dc>] __alloc_skb+0x60/0x14c
[<4411701c>] net_rx_action+0x94/0x164
[<440effb8>] RecvHandler+0xec/0x1b8
[<440eff80>] RecvHandler+0xb4/0x1b8
[<4400fab4>] __do_softirq+0x28/0x3c
[<44002ecc>] handle_other_ex+0x38/0x88
[<4400fa58>] __do_softirq2+0xb8/0xec
[<4400fc9c>] irq_exit+0x40/0x54
[<440018bc>] do_IRQ+0x40/0x98
[<440018c8>] do_IRQ+0x4c/0x98
[<44004250>] _interrupt+0x110/0x118
[<4400fab4>] __do_softirq+0x28/0x3c
[<4402b158>] handle_IRQ_event+0x54/0xb8
Or, on another run:
BUG: soft lockup detected on CPU#0!
Stack:
469f3b38 00000000 00000000 00000000 0006f38d 44014030 44196730 00000000
00000000 00000000 00000001 469f2000 00000000 47bc02e4 4401407c 00000000
00000001 00000000 00000000 469f3b80 47bc02e4 44001c78 00000000 00000000
Call Trace:
[<44014030>] run_local_timers+0x18/0x2c
[<4401407c>] update_process_times+0x38/0xa4
[<44001c78>] timer_interrupt+0x54/0x8c
[<4402b158>] handle_IRQ_event+0x54/0xb8
[<4400fb14>] do_softirq+0x4c/0x60
[<4402b250>] __do_IRQ+0x94/0x12c
[<4400fc9c>] irq_exit+0x40/0x54
[<440018bc>] do_IRQ+0x40/0x98
[<440018c8>] do_IRQ+0x4c/0x98
[<4400fc9c>] irq_exit+0x40/0x54
[<44004250>] _interrupt+0x110/0x118
[<440018c8>] do_IRQ+0x4c/0x98
[<4402d5bc>] add_to_page_cache+0x134/0x144
[<4402d4ec>] add_to_page_cache+0x64/0x144
[<4402d5bc>] add_to_page_cache+0x134/0x144
[<4402f768>] generic_file_buffered_write+0x168/0x6d0
[<4402f774>] generic_file_buffered_write+0x174/0x6d0
[<4402ffe4>] __generic_file_aio_write_nolock+0x314/0x600
[<440df4d4>] n_tty_receive_buf+0x528/0xff4
[<44006184>] __wake_up+0x20/0x48
[<440df4ac>] n_tty_receive_buf+0x500/0xff4
[<4410f2e4>] __kfree_skb+0x8c/0x124
[<4413da64>] tcp_recvmsg+0x560/0x840
[<4413d554>] tcp_recvmsg+0x50/0x840
[<440560f0>] file_update_time+0xb8/0xf4
[<4402ffac>] __generic_file_aio_write_nolock+0x2dc/0x600
[<44030488>] generic_file_aio_write+0x84/0x17c
[<440407b0>] do_sync_write+0xb8/0x108
[<440ded78>] opost+0xcc/0x208
[<44040934>] vfs_write+0x134/0x140
[<440e0cd4>] write_chan+0x0/0x390
[<440db35c>] tty_write+0x214/0x25c
[<440207b4>] autoremove_wake_function+0x0/0x48
[<44040934>] vfs_write+0x134/0x140
[<440408a4>] vfs_write+0xa4/0x140
[<44040a40>] sys_write+0x54/0xac
[<44004730>] work_pending+0xc/0x3c
[<44004758>] work_pending+0x34/0x3c
[<44004758>] work_pending+0x34/0x3c
Some people have confirmed this same behaviour in the past, but the question was left unresolved.
I could not find any easy way to trigger the errors on demand. With the method described above it takes a few hours before the crash shows up.
If anybody could confirm this behaviour again, maybe we could narrow the problem down to a particular configuration or setting.
Anyway, thanks again for your time.
Giulio Mazzoleni
On Mon, 27/04/2009 at 15.24 +0200, Giulio Mazzoleni wrote:
> Hi Wendy,
> you are right.
>
> Furthermore, if the init function is defined as "int
> init_module(void)" (and the call to module_init is removed), it is put in
> the .text section by the compiler instead of the .init section and the
> errors disappear.
>
> I still wonder if there could be any relation between this kind of
> error and the ones I get during normal operation (they seem to be
> triggered more frequently under heavy network activity).
> The error messages printed by the kernel are the same, so I was hoping...
>
> Giulio
>
> On Fri, 24/04/2009 at 18.20 +1000, Wendy Liang wrote:
> > Hi Giulio and Ian,
> >
> > By dumping the .ko, I found that init_module can only jump to
> > the top of the .text section of the .ko.
> >
> > If testd/c/b() are each called more than once, the compiler generates
> > executable code for testd/c/b/a() in the .text section of the .ko, in
> > the same order in which they are defined.
> >
> > Otherwise, because they are all static functions, the compiler by
> > default generates executable code only for testa() in the .text
> > section of the .ko.
> >
> > We are still investigating why it cannot jump to testa() when
> > it is not at the top of the .text section.
> >
> > Regards,
> > Wendy
>
>
>
>
> ___________________________
> microblaze-uclinux mailing list
> microblaze-uclinux@xxxxxxxxxxxxxx
> Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
> Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/
>