A fundamental concern in today's computer world is where learning curves are taking us. Moore's Law results in CPU speed doubling every 12-18 months, while memory speed is not keeping up. What can be done to address the resulting memory wall (see the 1995 issues of Computer Architecture News for some debate using this term)?
Work done here addresses two concerns: how the memory hierarchy could be changed to take trends into account, and how to change our thinking about programming.
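To make the memory wall concrete, here is a rough sketch of the arithmetic behind it. It uses the standard single-level average access time formula; the 18-month CPU doubling period comes from the range quoted above, while the 7% yearly DRAM improvement, the 99% hit rate and the starting miss cost of 4 cycles are illustrative assumptions, not measurements. The function names are hypothetical.

```python
def average_access_cycles(hit_rate, cache_cycles, dram_cycles):
    """Average cycles per memory access with a single level of cache."""
    return hit_rate * cache_cycles + (1.0 - hit_rate) * dram_cycles


def dram_cycles_after(years, initial_dram_cycles,
                      cpu_doubling_months=18.0, dram_yearly_gain=0.07):
    """Cost of a DRAM access, in CPU cycles, after a number of years.

    Assumes the CPU clock doubles every cpu_doubling_months and DRAM
    access time improves by dram_yearly_gain per year (both assumed rates).
    """
    cpu_speedup = 2.0 ** (12.0 * years / cpu_doubling_months)
    dram_speedup = (1.0 + dram_yearly_gain) ** years
    return initial_dram_cycles * cpu_speedup / dram_speedup


if __name__ == "__main__":
    # Even with a 99% hit rate, the miss term grows until it dominates.
    for years in (0, 5, 10, 15):
        miss_cost = dram_cycles_after(years, initial_dram_cycles=4.0)
        average = average_access_cycles(0.99, 1.0, miss_cost)
        print(f"year {years:2d}: miss = {miss_cost:8.1f} cycles, "
              f"average = {average:6.2f} cycles")
```

Under these assumed rates the miss cost in cycles grows exponentially, so a fixed hit rate eventually stops helping: the average access time tracks the DRAM term rather than the cache term.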
Architecture Projects
- RAMpage memory hierarchy
Should main memory move to what would today be considered the lowest level of cache, and DRAM be treated as a paging device?
- DRAM as a slow peripheral
DRAM, although losing the battle to keep up with the CPU, is not yet a slow peripheral but shouldn't we plan for the possibility that it might need to be seen that way soon?
- HRAM algorithm analysis
How far can adaptation of algorithm analysis go in adding a memory hierarchy component? There is some work going on in this field, called Algorithm Engineering, but it is probably more accurate to call it "performance tuning" until there is some real science behind making performance predictions. Watch this space for links.
Other Areas of Research
I have also recently branched out into networks. The underlying concern is the same: how do we exploit trends in increasing bandwidth (analogous to increasing CPU throughput) versus relatively static latency (analogous to DRAM access time) in networks?
- FastTrack
How far can we go in aggregating like traffic, to reduce congestion? Can we also use the relatively easy growth in bandwidth to achieve latency goals with lateral thinking?
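The bandwidth-versus-latency trend can be illustrated with a simple transfer time model: time = latency + size / bandwidth. The sketch below is only an illustration under that assumed model, and it is not the FastTrack design, which this page does not describe; the function names and the batching comparison are hypothetical.

```python
def transfer_time(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Time to deliver one message: fixed latency plus transmission time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def latency_fraction(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Share of the total transfer time that is pure latency."""
    total = transfer_time(size_bytes, bandwidth_bytes_per_s, latency_s)
    return latency_s / total


def separate_transfers(count, size_bytes, bandwidth_bytes_per_s, latency_s):
    """Total time for count items fetched one after another."""
    return count * transfer_time(size_bytes, bandwidth_bytes_per_s, latency_s)


def batched_transfer(count, size_bytes, bandwidth_bytes_per_s, latency_s):
    """Total time for count items sent as one batch, paying latency once."""
    return transfer_time(count * size_bytes, bandwidth_bytes_per_s, latency_s)


if __name__ == "__main__":
    size = 1_000_000   # 1 MB item (assumed)
    latency = 0.05     # 50 ms latency (assumed, held constant)
    for bandwidth in (1e6, 1e7, 1e8, 1e9):
        share = latency_fraction(size, bandwidth, latency)
        print(f"{bandwidth:10.0e} B/s: latency is {share:.1%} of transfer time")
    print(separate_transfers(10, size, 1e8, latency),
          batched_transfer(10, size, 1e8, latency))
```

With latency held fixed, each tenfold increase in bandwidth makes latency a larger share of the total, which is the network analogue of the DRAM access time problem. Batching is one way spare bandwidth can be traded for fewer latency penalties.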
Earlier Work
An Object-Oriented Library for Shared-Memory Parallel Simulations, PhD Thesis, University of Cape Town, 1996. (abstract and PostScript)
[D R Cheriton, H A Goosen, H Holbrook and P Machanick] Restructuring a Parallel Simulation to Improve Cache Behaviour in a Shared-Memory Multiprocessor: The Value of Distributed Synchronization, Proc. 7th Workshop on Parallel and Distributed Simulation, San Diego, May 1993, pp. 159-162 (abstract and full paper)
[D R Cheriton, H A Goosen and P Machanick] Restructuring a Parallel Simulation to Improve Cache Behavior in a Shared-Memory Multiprocessor: A First Experience, Proc. Int. Symp. on Shared Memory Multiprocessing, Tokyo, April 1991, pp. 109-118; update at 1991 Topaz User's Group Conference, DEC SRC, Palo Alto, 1991
History
Check out my memory history pages.
Links
There are many interesting links at the WWW Computer Architecture Home Page. Here are some more specific architecture links:
- The Stanford Hydra project makes a strong case for single-chip multiprocessors
- A lot of interesting work has been done at the University of Washington on simultaneous multithreading (SMT)
- The Berkeley IRAM project looks at the benefits of integrating RAM and CPU on one chip
- There are several studies on instruction-level parallelism, including:
  - Wall's classic 1993 study
  - Limits of Control Flow on Parallelism by Lam et al.
- Here is some work on scalable video on demand:
  - a server solution, Tiger
  - an allegedly comprehensive solution
For a more general academic search engine, CiteSeer is great. It gives you a citation index as well as abstracts, full text (where available) and a good search engine that doesn't turn up a bunch of non-academic material.
Philip Machanick philip.machanick-AT.NO.SPAM-gmail.com

