Exploiting Deep Parallel Memory Hierarchies for Ray Casting Volume Rendering

Proceedings of the 1997 Parallel Rendering Symposium, ACM SIGGRAPH, Oct. 1997, pp. 15-22, 115-116.

Authors

Abstract

Previous work in single-processor ray casting methods for volume rendering has concentrated on algorithmic optimizations to reduce computational work. Previous work in parallel volume rendering has concentrated on partitioning, with the goals of maximizing load balance and minimizing communication between distributed nodes. Building on our previous work at lower levels of the hierarchy, we present techniques to efficiently exploit all levels of the deep memory hierarchy of a distributed Power Challenge Array, on which we implement a logical global address space for volume blocks with caching. This focus on the optimal exploitation of the entire memory hierarchy, from the processor cache to the interconnection network between distributed nodes, allows us to efficiently render a 7.1 GB dataset. Our results have implications for the parallel solution of other problems which, like ray casting, require a global gather operation and contain coherence. We discuss implications for the design of a parallel architecture suited to solving this class of problems.

Keywords

volume rendering, memory hierarchies, distributed architectures.

Errata

Unfortunately, in the symposium proceedings, this paper was printed with one of the figures missing from the first color plate. The second color plate had a typo ("frame 49" should be "frame 40"). You can download the corrected color plates here; they are also included in the versions of the paper below.

Download

All files are in gzipped PostScript format.

You can download the paper as one file:

Or split into color and black-and-white pages for separate printing:

Slides from Symposium Presentation

The PowerPoint slides from my symposium presentation are also available here.

You can also view the slides as HTML web pages, although this doesn't provide adequate resolution for some of the figures.

Acknowledgements

Machine resources for this work were provided by Silicon Graphics Corporation, the National Center for Supercomputing Applications (NCSA), and Peter Schröder at Caltech. This research is sponsored by the Defense Advanced Research Projects Agency (DARPA) under contract number DABT63-95-C-0116, and AASERT award number N0014-93-1-0843. The Visible Male and Visible Female datasets were courtesy of the National Library of Medicine. The Vorticity dataset was courtesy of the Laboratory for Computational Science and Engineering at the University of Minnesota.