Stephan Herhut

Science is organized knowledge.
Wisdom is organized life.

About

I am working at NVIDIA as a compiler architect and distinguished compiler engineer. My interests span all things compilers, across architectures, products and research.

Before joining NVIDIA in 2023, I worked as a software engineer at Google, which allowed me to indulge my passion for building compilers. Google also provided a great environment to build my skills in technical leadership, most recently as the technical lead for Google’s ML Compilers for CPU and GPU that power frameworks like TensorFlow and JAX. My technical contributions span XLA and the MLIR compiler infrastructure.

Before working on TensorFlow, I participated in various compiler projects at Google. Initially, I joined the Dart team and helped extend the compiler from Dart to JavaScript. I was also involved in shrinking the Dart runtime to work on embedded systems.

Next, I helped build the new compiler toolchain for Android, with particular focus on optimizing and shrinking applications.

As part of the V8 team, I worked on making WebAssembly faster, with a focus on improved register allocation.

My interest in dynamic languages, especially those for the web, was sparked while I was a Research Scientist at Intel Labs in Santa Clara. During my stay, I was part of the team behind River Trail, an effort to bring the power of data-parallel programming to the web.

Even earlier, until 2014, I enjoyed the position of a Research Fellow at the University of Hertfordshire. More precisely, I was part of the Compiler Technology and Computer Architecture group.

There, my work was focussed on concurrent systems and novel parallel execution models. In particular, I worked for the Apple-CORE project on extending the auto-parallelizing capabilities of the SaC compiler. In the setting of novel many-core architectures like the microgrid architecture with its hundreds of hardware threads, existing approaches to memory management and workload mapping do not scale. On the positive side, barriers that seemed insuperable suddenly become less challenging.

To kick-start my Ph.D., I investigated datatype-generic programming in the context of array types. This led to the desire to enrich the type information that can be exploited for generic programming. After a short venture into dependent types, I realized that if values are types, then types should really be just values, or parts thereof. Or, in other words, that there should be no fixed phase divide between types and values. In my Ph.D. thesis, which I defended in 2010, I devised a system to program with properties rather than types. The key idea is to treat types and values uniformly as data and use partial evaluation to extract static knowledge.

Before my Ph.D., I was an undergraduate student at the University of Kiel, Germany. My studies revolved around concurrent systems, embedded and real-time systems, communication systems and programming languages. My thesis on fully separated namespaces in the presence of subtyping-based function overloading was supervised by Michael Hanus and Sven-Bodo Scholz. I graduated in 2005 with the degree Diplom Informatiker and was awarded the b+m prize for an outstanding thesis.

I was a core contributor to the SaC project. Single Assignment C is a high-level, functional, yet high-performance array-programming language. Apart from language design, I am interested in the challenges involved in compiling a high-level language to high-performance code.

Lastly, I participated in the development of S-Net, a declarative coordination language for concurrent systems grounded in the theory of asynchronous stream processing.

Details of my work appear in the publications section as soon as they become available. If you are interested in what I am doing, or have suggestions or ideas, feel free to contact me.