
Re: [microblaze-uclinux] Ethernet performance with xps_ll_temac



Hi Terry,

> Ok I found that kernel debugging was in fact enabled, so that must be
> causing the problem (thanks!) - I believe it's enabled by default when
> using petalinux-new-platform to create an MMU system. 

John: can we disable that? I haven't seen a problem there, but of course it
would be better to find out why the problem occurs there.

> I disabled and
> get the results below.. I'm still trying to figure out what the problem
> is with the UDP_STREAM test (thoughts?) - that's the one I'm really
> interested in.

Yes, of course. Could you please send your .config file?
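
For checking a .config for debug options that are still switched on, a
rough sketch could look like the one below. The list of symbols is only an
assumption about which options matter here (they are ordinary kernel
Kconfig symbols, not something confirmed in this thread), and the function
name is made up for illustration.

```python
import re

# Assumed set of debug-related Kconfig symbols to look for; adjust to taste.
DEBUG_OPTIONS = (
    "CONFIG_DEBUG_KERNEL",
    "CONFIG_DEBUG_SLAB",
    "CONFIG_DEBUG_SPINLOCK",
    "CONFIG_DEBUG_INFO",
)

def enabled_debug_options(config_text, names=DEBUG_OPTIONS):
    """Return the symbols from `names` that are set to y or m in a .config."""
    enabled = []
    for line in config_text.splitlines():
        # Enabled options look like "CONFIG_FOO=y"; disabled ones are
        # comments of the form "# CONFIG_FOO is not set" and do not match.
        match = re.match(r"^(CONFIG_\w+)=(y|m)\s*$", line)
        if match and match.group(1) in names:
            enabled.append(match.group(1))
    return enabled

# Usage (path is an example):
#   with open(".config") as f:
#       print(enabled_debug_options(f.read()))
```

An empty list would suggest that none of the listed options are enabled;
it says nothing about options outside the list.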

Thanks,
Michal

> 
> This system has the MMU enabled and the ll_temac buffers are 2k each,
> and it's in 100Mb/s mode. This is on the S3ADSP 3400 VSK.
> 
> *******************************************************************************
> ~ # ./netmeasure.sh -h 192.168.0.1 -c 10
> 192.168.0.1; count=10
> Linux uclinux 2.6.20-uc0 #9 Thu Nov 6 07:32:45 PST 2008 microblaze
> CPU-Family:     MicroBlaze
> FPGA-Arch:      Unknown
> CPU-Ver:        7.10.d
> CPU-MHz:        62.500000
> BogoMips:       31.02
> HW-Div:         yes
> HW-Shift:       yes
> Icache:        16kB
> Dcache:        16kB
> HW-Debug:       yes
>            CPU0      
>   0:     119487     level OPB-INTC  timer
>   1:      45493     level OPB-INTC  xilinx_dma_tx_int
>   2:      59787     level OPB-INTC  xilinx_dma_rx_int
>   3:          0     level OPB-INTC  eth0
>   4:       1252      edge OPB-INTC  uartlite
> MemTotal:       257664 kB
> MemFree:        236872 kB
> Buffers:             0 kB
> 
> |TCP_STREAM| 10506 10571 10486 10451 10632 10642 10659 10701 10613 10691|Average|10595|
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> recv_response: partial response received: 0 bytes
> |UDP_STREAM|          |Average|failed|
> |TCP_MAERTS| 3601 3668 3681 3682 3675 3647 3719 3715 3704 3669|Average|3676|
> |TCP_RR| 148 148 149 145 148 149 148 147 147 149|Average|147|
> |TCP_CRR| 44 43 43 44 44 44 44 44 44 44|Average|43|
> |UDP_RR| 162 163 163 160 163 163 160 164 164 163|Average|162|
> eth0      Link encap:Ethernet  HWaddr 00:0A:35:05:05:08 
>           inet addr:192.168.0.20  Bcast:192.168.0.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING  MTU:1500  Metric:1
>           RX packets:286592 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:343492 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:119607693 (114.0 MiB)  TX bytes:315564188 (300.9 MiB)
>           Interrupt:3
> 
> lo        Link encap:Local Loopback 
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> 
> 
> *************************************************************************************
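
To compare runs like the one above, the result lines can be pulled apart
with a few lines of Python. This is only a sketch: it assumes each result
sits on a single line of the form |NAME| v1 v2 ...|Average|avg|, and the
function names are hypothetical. It does not interpret the units, which
the output itself does not state.

```python
def parse_netmeasure(text):
    """Map test name -> list of integer samples from result lines."""
    results = {}
    for raw in text.splitlines():
        # Drop any leading quote markers and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        parts = line.split("|")
        # Expected shape: ['', NAME, 'v1 v2 ...', 'Average', avg, '']
        if len(parts) < 5 or parts[3] != "Average":
            continue
        results[parts[1]] = [int(v) for v in parts[2].split()]
    return results

def mean(samples):
    """Arithmetic mean, or None when a test produced no samples."""
    return sum(samples) / len(samples) if samples else None
```

A failed test such as UDP_STREAM comes back as an empty list, so it shows
up as None rather than as a misleading zero. Recomputing the TCP_RR mean
from the ten samples gives 147.8 against the reported 147, so the script
appears to truncate its averages rather than round them.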
> On Wed, Nov 5, 2008 at 2:12 PM, Michal Simek <monstr@xxxxxxxxx> wrote:
> 
>     Hi all,
> 
>     I ran some tests a few days ago, before this discussion started, and
>     put the results for the ml505 board with ll_temac on a page. Several
>     configurations are covered. Based on your discussion, I can say that
>     the MMU kernel is a little slower than noMMU, but the difference is
>     not big.
>     You can look at it
>     (http://monstr.eu/wiki/doku.php?id=kernel:testing:net).
> 
>     There is a testing script there too. Thanks in advance for sending
>     your results; then we can look at where your problem is.
> 
>     If anyone sends me results for any development board, hardware
>     configuration, etc., I'll add them there. Just send the output from
>     the script and some information about your hardware configuration.
> 
>     Cheers,
>     Michal
> 
> 
> 
>     Terry ONeal wrote:
>     > I tried again on the spartan3E1600 board  with the MMU enabled -
>     > performance was just as bad..  I then took the original S3A3400 board
>     > system, disabled the MMU, rebuilt the kernel, and got about a 10x
>     > improvement:
>     >
>     > #netperf -H 192.168.0.1
>     > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.1 (192.168.0.1) port 0 AF_INET
>     > Recv   Send    Send
>     > Socket Socket  Message  Elapsed
>     > Size   Size    Size     Time     Throughput
>     > bytes  bytes   bytes    secs.    10^6bits/sec
>     >
>     >      0  16384  16384    10.00      18.81
>     >
>     >
>     >
>     > So it appears to me that this is related to the MMU being enabled - or
>     > the way the ll_temac driver is written..
>     >
>     > Terry
>     >
>     >
>     >
>     > On Tue, Nov 4, 2008 at 10:48 AM, Terry ONeal <terryoneal3@xxxxxxxxx> wrote:
>     >
>     >     I'm using the Spartan3A DSP 3400 board.. That is a good point, this
>     >     could be a physical layer issue, I've seen/heard of the same thing
>     >     in the past... I'm going to give a couple of the other reference
>     >     designs a try and see if other boards give better results..
>     >
>     >     Thanks,
>     >     Terry
>     >
>     >
>     >
>     >     On Tue, Nov 4, 2008 at 4:14 AM, Whitmore Ian J <IJWHITMORE@xxxxxxxxxxx> wrote:
>     >
>     >         Hello Terry,
>     >
>     >
>     >
>     >         What Dev board/FPGA are you using?
>     >
>     >
>     >
>     >         If it is Spartan 3A DSP 1800 (possibly other similar models are
>     >         affected also) there is a problem with the physical layer DCM
>     >         phase shift in the GMII – currently under investigation by
>     >         Xilinx (I've only experimented at gigabit speeds though).
>     >
>     >
>     >
>     >         You could try adding the following line to the
>     >         implementation\system.ucf – in the GMII Receiver side DCM
>     >         constraints section
>     >
>     >                     "INST *gmii_rxc_dcm PHASE_SHIFT = 76"
>     >
>     >
>     >
>     >         This has yielded significant performance increases at
>     >         gigabit for me – but I still get some dropped frames of data.
>     >
>     >
>     >
>     >         With various tests (again at gigabit) I have found that
>     >         generally the bigger the ICache and DCache the better, mainly
>     >         ICache, (32K for both allows ~4.7MBytes/sec. ).
>     >
>     >         The most significant increase I have seen is by increasing the
>     >         MTU to 8982 (~19.6 MBytes/sec) but this involves increasing your
>     >         TEMAC fifos to 16K to handle the larger frames, and also means
>     >         your infrastructure must support Jumbo frames.
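
One way to read the two figures quoted above is to turn them into frame
rates. The sketch below does only that arithmetic, under assumptions that
are not stated in the thread: that MBytes means 10^6 bytes, that the
4.7 MBytes/sec figure was measured with the default 1500-byte MTU, and
that every frame carries a full MTU of data.

```python
def frames_per_second(throughput_bytes_per_s, mtu):
    """Approximate frame rate, treating each frame as carrying `mtu` bytes."""
    return throughput_bytes_per_s / mtu

# Figures quoted in the thread; the 1500-byte MTU is an assumption.
standard = frames_per_second(4.7e6, 1500)   # 32K caches, assumed default MTU
jumbo = frames_per_second(19.6e6, 8982)     # MTU raised to 8982

print(f"standard MTU: about {standard:.0f} frames/s")
print(f"jumbo frames: about {jumbo:.0f} frames/s")
```

Under those assumptions both cases land in the same range of a few
thousand frames per second, which would be consistent with the per-frame
CPU cost, rather than the link, being the limit. It is not proof of that.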
>     >
>     >
>     >
>     >         Hope this helps.  I would probably try the larger ICache &
>     >         DCache first!
>     >
>     >
>     >
>     >         If you do find any more performance let me know – always nice to
>     >         have a bit more bandwidth headroom!
>     >
>     >
>     >
>     >         Ian
>     >
>     >        
>     ------------------------------------------------------------------------
>     >
>     >         *From:* owner-microblaze-uclinux@xxxxxxxxxxxxxx
>     >         [mailto:owner-microblaze-uclinux@xxxxxxxxxxxxxx]
>     >         *On Behalf Of* Terry ONeal
>     >         *Sent:* 03 November 2008 22:21
>     >         *To:* microblaze-uclinux@xxxxxxxxxxxxxx
>     >         *Subject:* [microblaze-uclinux] Ethernet performance with
>     >         xps_ll_temac
>     >
>     >
>     >
>     >         Hi All, I'm working on a Microblaze system (MMU enabled) that
>     >         uses xps_ll_temac, and I'm seeing poor network performance when
>     >         running netperf.  The phy is in 100Mb mode and I'm seeing around
>     >         2.5Mb/s with the netperf TCP_STREAM test. Microblaze caches are
>     >         16K, temac buffers are 4k, barrel shifter and HW multiplier are
>     >         turned on. I'm using petalinux sources I pulled from the svn
>     >         repository on 10/16/08.
>     >
>     >         What is interesting is that it appears that the kernel is losing
>     >         timer ticks - if I configure netperf to run the test for 10
>     >         seconds, it actually takes 20 seconds (per my watch) to run, but
>     >         the kernel only thinks it's been running for 10 seconds. So it
>     >         appears that the ll_temac driver is hogging the CPU.
>     >
>     >         I also put together my own application that simply creates a UDP
>     >         socket and sends data through it as fast as it can - the
>     >         performance is a bit better than the netperf TCP test but still
>     >         way off from what I'm expecting (at least 25Mb/s).
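
A test program of the kind described above, one UDP socket sending as fast
as it can, might look roughly like this. It is a Python stand-in and not
Terry's application; the function names, payload size and duration are
placeholders chosen for illustration.

```python
import socket
import time

def udp_blast(host, port, duration_s=10.0, payload_size=1024):
    """Send UDP datagrams back to back; return (bytes_sent, elapsed_s)."""
    payload = b"\x00" * payload_size
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.monotonic()
        deadline = start + duration_s
        while time.monotonic() < deadline:
            # sendto() returns the number of bytes handed to the stack.
            sent += sock.sendto(payload, (host, port))
        elapsed = time.monotonic() - start
    return sent, elapsed

def mbit_per_s(nbytes, seconds):
    """Convert a byte count over an interval into 10^6 bits per second."""
    return nbytes * 8 / seconds / 1e6

# Usage (address and port are examples):
#   sent, elapsed = udp_blast("192.168.0.1", 5001)
#   print(mbit_per_s(sent, elapsed))
```

Two caveats apply. The count is what the sender handed to the stack, not
what arrived, so the receiving side should be checked as well. And the
elapsed time comes from the sender's own clock: if the kernel really is
losing timer ticks as described above, the computed rate will be too high
and should be checked against an external clock.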
>     >
>     >         Has anyone had better luck with xps_ll_temac performance? Any
>     >         suggestions as to what may be going on that is limiting the
>     >         performance?
>     >
>     >         Thanks,
>     >         Terry
>     >
>     >
>     >
>     >
>     ___________________________
>     microblaze-uclinux mailing list
>     microblaze-uclinux@xxxxxxxxxxxxxx
>     Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
>     Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/
> 
> 