[microblaze-uclinux] optimised memcpy & friends
Hi everyone,
I've just checked some optimised memcpy, memmove and memset routines into
CVS. If you freshen your arch/microblaze/lib and include/asm-microblaze
directories you'll get the new code.
Previously we were using the naive byte-at-a-time implementations
provided by the kernel - it turns out that these are horribly
inefficient on microblaze, primarily due to the write-through caching -
each byte write delays for a full OPB bus transaction. Nasty. If you
didn't have any cache it was even worse - each byte read would also go
to the bus.
The new implementations are generic C, so could be optimised further,
but at least are alignment aware. They do byte copies until the
destination address is word aligned, then do as many full-word copies
as possible, before cleaning up the left-overs again with byte copies.
The basic idea is explained here:
http://www.embedded.com/showArticle.jhtml?articleID=19205567
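For anyone who would rather see the three phases in code, here is a
minimal sketch. It is not the routine that went into CVS: the name
sketch_memcpy is made up for illustration, a 32-bit word is assumed, and
the word loop only runs when the source happens to be word aligned once
the destination is. When it is not, this sketch simply falls back to byte
copies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration only; not the code in arch/microblaze/lib. */
void *sketch_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Phase 1: byte copies until the destination is word aligned. */
    while (n > 0 && ((uintptr_t)d & 3) != 0) {
        *d++ = *s++;
        n--;
    }

    /*
     * Phase 2: as many full-word copies as possible.
     * Simplification (assumption of this sketch): only taken when the
     * source is word aligned as well, so no unaligned loads are issued.
     */
    if (((uintptr_t)s & 3) == 0) {
        uint32_t *dw = (uint32_t *)d;
        const uint32_t *sw = (const uint32_t *)s;

        while (n >= 4) {
            *dw++ = *sw++;
            n -= 4;
        }
        d = (unsigned char *)dw;
        s = (const unsigned char *)sw;
    }

    /* Phase 3: clean up the left-overs with byte copies. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }

    return dest;
}
```

With a write-through cache the gain comes from phase 2: each word store
costs one bus transaction instead of four.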
There is a slight overhead that could be avoided by testing for short
(probably <20 bytes) operations, and using byte copies for those - if
anyone wants to add them and do some tests I'll be happy to commit.
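A rough sketch of what such a test might look like follows. The function
name and the SKETCH_SHORT_COPY_MAX constant are hypothetical, the value 20
is only the guess given above and has not been measured, and the standard
memcpy stands in for the word-wise routine.

```c
#include <stddef.h>
#include <string.h>

/* Assumed cut-off; the post only guesses "probably <20 bytes". */
#define SKETCH_SHORT_COPY_MAX 20

/* Hypothetical wrapper: short copies skip the alignment bookkeeping. */
void *sketch_memcpy_with_short_path(void *dest, const void *src, size_t n)
{
    if (n < SKETCH_SHORT_COPY_MAX) {
        unsigned char *d = dest;
        const unsigned char *s = src;

        /* Plain byte loop for short operations. */
        while (n-- > 0)
            *d++ = *s++;
        return dest;
    }

    /* Stand-in for the optimised word-wise routine. */
    return memcpy(dest, src, n);
}
```

The right cut-off would need to be found by timing both paths on real
hardware.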
Note the new memcpy and memmove use bit shifts extensively, so I've made
their inclusion conditional on CONFIG_MICROBLAZE0_USE_BARREL. If your
platform doesn't have a barrel shifter, you'll stay with the byte-copy
versions.
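To show why a barrel shifter matters, here is a sketch of one common way
shifts appear in word-wise copies: when the destination is word aligned
but the source is 1 to 3 bytes off, each output word is assembled from two
aligned source words. This is an assumption about the general technique,
not a copy of the committed code, and both function names are made up. The
endianness probe is only there so the sketch runs on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 on a big-endian CPU, 0 on a little-endian one. */
static int sketch_is_big_endian(void)
{
    const uint32_t probe = 1;

    return *(const unsigned char *)&probe == 0;
}

/*
 * Hypothetical helper: copy nwords words to the aligned dst from the
 * byte address (src_aligned + offset), where offset is 1, 2 or 3.
 * Only aligned loads are issued. Note that nwords + 1 source words are
 * read; the last one always contains at least one wanted byte.
 */
void sketch_copy_words_shifted(uint32_t *dst, const uint32_t *src_aligned,
                               unsigned offset, size_t nwords)
{
    const unsigned lo = 8 * offset;   /* bits belonging to skipped bytes */
    const unsigned hi = 32 - lo;      /* bits taken from the next word */
    const int big = sketch_is_big_endian();
    uint32_t cur = *src_aligned++;

    while (nwords-- > 0) {
        uint32_t next = *src_aligned++;

        /* Merge the tail of cur with the head of next. */
        *dst++ = big ? (cur << lo) | (next >> hi)
                     : (cur >> lo) | (next << hi);
        cur = next;
    }
}
```

Every output word costs two multi-bit shifts, which is a single
instruction each with a barrel shifter and a loop of one-bit shifts
without one.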
I haven't done any hard measurements, but simple bus timing analysis
suggests that these routines should be on average about 3 times faster
than the old ones, for medium to large transfers.
Regards,
John
___________________________
microblaze-uclinux mailing list
microblaze-uclinux@itee.uq.edu.au
Project Home Page : http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux
Mailing List Archive : http://www.itee.uq.edu.au/~listarch/microblaze-uclinux/