!!Torsten Hoefler - Selected Publications
All publications and bibliometric indices can be found at [https://scholar.google.com/citations?hl=en&user=DdBvcBEAAAAJ].\\ \\
1) Slim Fly: A cost effective low-diameter network topology (2014) - establishing a lower bound and an optimal construction for high-performance topologies\\ \\
2) Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results (2015) - establishing theoretical and practical methods for benchmarking computing systems; basis for the 2020 BenchCouncil Rising Star Award\\ \\
3) Using advanced MPI: Modern features of the message-passing interface (2014) - the MPI-3 book; Torsten contributed major parts of version 3 of the Message Passing Interface standard, the de facto programming model of HPC\\ \\
4) DARE: High-performance state machine replication on RDMA networks (2015) - introduced protocols for implementing replicated state machines on datacenter RDMA networks; has been implemented in practice\\ \\
5) The Portals 4.2 Network Programming Interface (2018) - contribution to the design and specification of the Portals networking interface, which was the basis of libfabric and Intel's HPC networking products (OFA)\\ \\
6) Stateful dataflow multigraphs: A data-centric model for performance portability on heterogeneous architectures (2019) - a fundamentally new programming model (result of an ERC Starting Grant) based on graphical performance tuning to replace MPI\\ \\
7) Neural code comprehension: A learnable representation of code semantics (2018) - a novel deep learning method to understand program code properties\\ \\
8) A data-centric approach to extreme-scale ab initio dissipative quantum transport simulations (2019) - using data-centric techniques to optimize a full application for quantum nanotransport simulation\\ \\
9) sPIN: High-performance streaming Processing in the Network (2017) - a novel network acceleration interface for offloading data processing to the network\\ \\
10) To push or to pull: On reducing communication and synchronization in graph computations (2017) - fundamental analysis of parallel processing techniques for graph computations