
[microblaze-uclinux] optimised memcpy & friends



Hi everyone,

I've just checked some optimised memcpy, memmove and memset routines into 
CVS.  If you freshen your arch/microblaze/lib and include/asm-microblaze 
directories you'll get the new code.

Previously we were using the naive byte-at-a-time implementations 
provided by the kernel - it turns out that these are horribly 
inefficient on microblaze, primarily due to the write-through caching - 
each byte write delays for a full OPB bus transaction.  Nasty.  If you 
didn't have any cache it was even worse - each byte read would also go 
to the bus.

The new implementations are generic C, so could be optimised further, 
but are at least alignment aware.  They do byte copies until the 
destination address is word aligned, then do as many full-word copies 
as possible, before cleaning up the left-overs again with byte copies. 
The basic idea is explained here:

http://www.embedded.com/showArticle.jhtml?articleID=19205567
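
For anyone who wants to see the shape of it without pulling CVS, here is a 
minimal sketch of the byte/word/byte pattern described above.  It is not the 
code that was committed: the function name is made up, a 32-bit word is 
assumed, and it only takes the word path when the source happens to end up 
word aligned as well (otherwise it just falls through to byte copies).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a sketch only, not the routine in CVS.
 * Note: the kernel is built with -fno-strict-aliasing, which is what
 * makes the pointer casts below acceptable there. */
void *sketch_memcpy_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* 1. Byte copies until the destination is word aligned. */
    while (n > 0 && ((uintptr_t)d & 3) != 0) {
        *d++ = *s++;
        n--;
    }

    /* 2. Full-word copies, taken only if the source is aligned too. */
    if (((uintptr_t)s & 3) == 0) {
        uint32_t *dw = (uint32_t *)d;
        const uint32_t *sw = (const uint32_t *)s;

        while (n >= 4) {
            *dw++ = *sw++;
            n -= 4;
        }
        d = (unsigned char *)dw;
        s = (const unsigned char *)sw;
    }

    /* 3. Clean up the left-overs with byte copies. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

The point of step 2 is that each word store costs one bus write where the 
byte version costs four.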

There is a slight overhead that could be avoided by testing for short 
(probably <20 byte) operations, and using byte copies for those - if 
anyone wants to add them and do some tests I'll be happy to commit.

Note the new memcpy and memmove use bit shifts extensively, so I've made 
their inclusion conditional on CONFIG_MICROBLAZE0_USE_BARREL.  If your 
platform doesn't have a barrel shifter, you'll stay with the byte-copy 
versions.
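
To show where the shifts come from: when the destination is word aligned but 
the source sits 1 to 3 bytes past a word boundary, each destination word has 
to be assembled from two neighbouring aligned source words.  The sketch below 
is an illustration, not the committed code.  The helper name and macros are 
made up, a big-endian target is assumed, and the little-endian branch exists 
only so the example also runs on a typical development host (it relies on the 
GCC/Clang predefined byte-order macros).

```c
#include <stddef.h>
#include <stdint.h>

/* TOWARD_LOW moves bytes toward lower addresses, TOWARD_HIGH the reverse. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define TOWARD_LOW(w, bits)  ((w) >> (bits))
#define TOWARD_HIGH(w, bits) ((w) << (bits))
#else /* big-endian, the case assumed for the target */
#define TOWARD_LOW(w, bits)  ((w) << (bits))
#define TOWARD_HIGH(w, bits) ((w) >> (bits))
#endif

/* Hypothetical helper: fill nwords aligned destination words from a
 * possibly misaligned source.  It reads whole aligned source words, so
 * the aligned words enclosing the source range must be readable. */
void sketch_copy_shifted(uint32_t *dst, const unsigned char *src,
                         size_t nwords)
{
    unsigned off = (unsigned)((uintptr_t)src & 3);
    const uint32_t *sw = (const uint32_t *)(src - off);
    unsigned lo = 8 * off;   /* bits to discard from the first word */
    unsigned hi = 32 - lo;   /* bits to borrow from the next word   */
    uint32_t hold;

    if (nwords == 0)
        return;

    if (off == 0) {          /* aligned source: no shifting needed */
        while (nwords-- > 0)
            *dst++ = *sw++;
        return;
    }

    /* Prime the holding buffer with the tail of the first word. */
    hold = TOWARD_LOW(*sw++, lo);
    while (nwords-- > 0) {
        uint32_t next = *sw++;

        *dst++ = hold | TOWARD_HIGH(next, hi);
        hold = TOWARD_LOW(next, lo);
    }
}
```

Every destination word costs two multi-bit shifts (by 8, 16 or 24 bits), 
which is why this path only makes sense when the barrel shifter is present.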

I haven't done any hard measurements, but simple bus timing analysis 
suggests that these routines should be on average about 3 times faster 
than the old ones, for medium to large transfers.

Regards,

John



___________________________
microblaze-uclinux mailing list
microblaze-uclinux@itee.uq.edu.au
Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/