SIGPLAN Notices (ACM Special Interest Group on Programming Languages)
The uniform representation of 2-dimensional arrays serially in Morton order (or И order) supports both their iterative scan with Cartesian indices and their divide-and-conquer manipulation as quaternary trees. This data structure is important because it relaxes serious problems of locality and latency, and the tree helps to schedule multi-processing. Results here show how it facilitates algorithms that avoid cache misses and page faults at all levels in hierarchical memory, independently of a specific runtime environment. We have built a rudimentary C-to-C translator that implements matrices in Morton order from source that presumes a row-major implementation. Early performance from LAPACK's reference implementation of dgesv (linear solver) and all its supporting routines (including dgemm matrix multiplication) forms a successful research demonstration. Its performance predicts improvements from new algebra in back-end optimizers. We also present results from a more stylish dgemm algorithm that takes better advantage of this representation. With only routine back-end optimizations inserted by hand (unfolding the base case and passing arguments in registers), we achieve machine performance exceeding that of the manufacturer-crafted dgemm running at 67% of peak flops. The same code performs similarly on several machines. Together, these results show how existing codes and future block-recursive algorithms can work well together on this matrix representation. Locality is key to future performance, and the new representation has a remarkable impact.
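As a rough illustration of how one serial layout can serve both Cartesian indexing and quaternary-tree decomposition, the following C sketch maps a (row, column) pair to a Morton-order offset by interleaving bits. It is not taken from the paper: the names `spread_bits` and `morton_index` are hypothetical, placing row bits in the odd positions and column bits in the even positions is an assumed convention, and the 16-bit limit on each index is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: spread the low 16 bits of x into the even bit
 * positions of a 32-bit word (bit k of x moves to bit 2k). */
static uint32_t spread_bits(uint32_t x)
{
    x &= 0x0000FFFFu;
    x = (x | (x << 8)) & 0x00FF00FFu;
    x = (x | (x << 4)) & 0x0F0F0F0Fu;
    x = (x | (x << 2)) & 0x33333333u;
    x = (x | (x << 1)) & 0x55555555u;
    return x;
}

/* Hypothetical helper: Morton-order offset of element (row, col).
 * Assumed convention: row bits occupy the odd positions and column
 * bits the even positions. Each index must fit in 16 bits. */
static uint32_t morton_index(uint32_t row, uint32_t col)
{
    assert(row < 65536u && col < 65536u);
    return (spread_bits(row) << 1) | spread_bits(col);
}
```

Under these assumptions, a 4-by-4 matrix is laid out with offsets 0 1 4 5 / 2 3 6 7 / 8 9 12 13 / 10 11 14 15 (rows separated by slashes). Each quadrant occupies four consecutive offsets (0 to 3, 4 to 7, 8 to 11, 12 to 15), which is what lets a block-recursive algorithm treat a contiguous run of memory as one subtree.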
Wise, David S.; Alexander, Gregory A.; and Frens, Jeremy D., "Language support for Morton-order matrices" (2001). University Faculty Publications and Creative Works. 558.