Design of a data layout and memory access abstraction layer for heterogeneous architectures

This is a PhD thesis written by Bernhard Manfred Gruber (ORCID: 0000-0001-7848-1690) to obtain the academic degree Doktoringenieur (Dr.-Ing.) at the Technical University Dresden. It was submitted on 2024-08-20, defended on 2025-04-17, and published on 2025-10-20.

This repository contains all LaTeX sources needed to build the thesis, as well as all data and scripts used to produce the included plots. It also includes the slide decks for the status talk (a kind of pre-defence) and for the final public defence. The final rendered document is available in two versions: a digital version and a print version.

The published thesis is also available on the Qucosa server of the Saxon State and University Library Dresden (SLUB): https://nbn-resolving.org/urn:nbn:de:bsz:14-qucosa2-989028

If not indicated otherwise, all materials published in this repository are licensed under the Creative Commons Attribution 4.0 International License.

Abstract

Efficient parallel programs increasingly rely on memory-related optimizations and on respecting the target hardware's internal structure. This presents a challenge for portable codes running on a variety of architectures. A single-source approach is highly desirable while, ideally, retaining full control over target-specific optimizations.

Memory-related optimizations are manifold, and generally require full control over data layout, memory access, storage format, memory allocation, and physical memory location. These aspects are ideally decoupled from a program and its data structures, and unified into a coherent zero-overhead abstraction layer.

By abstracting multidimensional arrays of nested structures, the foundation of data structure design, as indexable spaces, portable programs can be written against a generic interface. The low-level abstraction of memory access (LLAMA) implements this concept as a C++ abstraction library, underneath which every performance-relevant aspect can be customized with minimal effort and without any change to user code.
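To make the idea of decoupling data layout from user code more concrete, here is a minimal sketch in plain C++. It is not LLAMA's API: the names `AoS`, `SoA`, `View`, `Field`, and `fill_and_sum` are hypothetical and chosen only for illustration, and the record is assumed to have three `float` fields. The sketch only shows the general technique of hiding the layout behind a mapping from (array index, field) to a storage slot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; none of these names come from LLAMA.
constexpr std::size_t kFields = 3;  // assumed record: three float fields
enum Field : std::size_t { X = 0, Y = 1, Z = 2 };

// Array of structs: x0 y0 z0 x1 y1 z1 ...
struct AoS {
    static std::size_t slot(std::size_t i, std::size_t f, std::size_t /*n*/) {
        return i * kFields + f;
    }
};

// Struct of arrays: x0 x1 ... y0 y1 ... z0 z1 ...
struct SoA {
    static std::size_t slot(std::size_t i, std::size_t f, std::size_t n) {
        return f * n + i;
    }
};

// A view owns flat storage and delegates all addressing to the mapping.
template <typename Mapping>
class View {
public:
    explicit View(std::size_t n) : n_(n), storage_(n * kFields) {}

    float& operator()(std::size_t i, Field f) {
        assert(i < n_);  // debug-only bounds check
        return storage_[Mapping::slot(i, f, n_)];
    }

    std::size_t size() const { return n_; }
    const std::vector<float>& storage() const { return storage_; }

private:
    std::size_t n_;
    std::vector<float> storage_;
};

// User code: written once against the view interface, unaware of the layout.
template <typename V>
float fill_and_sum(V& view) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < view.size(); ++i) {
        view(i, X) = static_cast<float>(i);
        view(i, Y) = 10.0f;
        view(i, Z) = 100.0f;
        sum += view(i, X) + view(i, Y) + view(i, Z);
    }
    return sum;
}
```

Calling `fill_and_sum` on a `View<AoS>` and on a `View<SoA>` yields the same result, while the underlying storage is ordered differently. Switching the layout changes only the template argument, not the function body.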

LLAMA shows no overhead in most analyzed code bases, including real-world software, and generally produces machine code equivalent to manual data layout or SIMD implementations, while running portably on all relevant contemporary hardware architectures. The abstraction it provides is a solid foundation for systematic optimization, including instrumentation, profiling, and rapid data layout exploration.

LLAMA shows that a unification of existing memory optimization approaches is entirely possible, while making no compromises on the portability of code and supported hardware platforms, providing a novel tool for the development of high-performance C++ applications in a heterogeneous environment.

