Publications of Torsten Hoefler
Copyright Notice:

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Citation Listings: DBLP   CSB   Google Scholar   ACM Digital Library   Semantic Scholar   ORCID

IEEE CiSE
[1] Torsten Hoefler, Marcin Copik, Pete Beckman, Andrew Jones, Ian Foster, Manish Parashar, Daniel Reed, Matthias Troyer, Thomas Schulthess, Dan Ernst, Jack Dongarra:
 XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing Computing in Science and Engineering (CiSE). Dec. 2024,
NeurIPS'24
[2] Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo Li, Martin Jaggi, Dan Alistarh, Torsten Hoefler, James Hensman:
 QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs In Proceedings of the Neural Information Processing Systems, presented in Vancouver, Canada, Dec. 2024,
TACO
[3] Andrea Lepori, Alexandru Calotoiu, Torsten Hoefler:
 Iterating Pointers: Enabling Static Analysis for Loop-based Pointers ACM Transactions on Architecture and Code Optimization. Oct. 2024,
arXiv
[4] Maciej Besta, Robert Gerstenberger, Patrick Iff, Pournima Sonawane, Juan Gómez Luna, Raghavendra Kanakagiri, Rui Min, Onur Mutlu, Torsten Hoefler, Raja Appuswamy, Aidan O'Mahony:
 Hardware Acceleration for Knowledge Graph Processing: Challenges & Recent Developments arXiv:2408.12173. Aug. 2024,
USENIX ATC'24
[5] Mikhail Khalilov, Marcin Chrapek, Siyuan Shen, Alessandro Vezzu, Thomas Benz, Salvatore Di Girolamo, Timo Schneider, Daniele De Sensi, Luca Benini, Torsten Hoefler:
 OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs Jul. 2024, (acceptance rate 15.9%, 77/482)
IEEE Computer
[6] Torsten Hoefler, Duncan Roweth, Keith Underwood, Bob Alverson, Mark Griswold, Vahid Tabatabaee, Mohan Kalkunte, Surendra Anubolu, Siyuan Shen, Abdul Kabbani, Moray McLaren, Steve Scott:
 Datacenter Ethernet and RDMA: Issues at Hyperscale IEEE Computer. Vol 56, Nr. 7, pages 67-77, Jul. 2024, Cover Feature Technology Predictions
ICML
[7] Langwen Huang, Lukas Gianinazzi, Yuejiang Yu, Peter D. Dueben, Torsten Hoefler:
 DiffDA: a Diffusion model for weather-scale Data Assimilation Jul. 2024,
arXiv
[8] Maciej Besta, Florian Scheidl, Lukas Gianinazzi, Shachar Klaiman, Jürgen Müller, Torsten Hoefler:
 Demystifying Higher-Order Graph Neural Networks arXiv:2406.12841. Jun. 2024,
arXiv
[9] Maciej Besta, Ales Kubicek, Roman Niggli, Robert Gerstenberger, Lucas Weitzendorf, Mingyuan Chi, Patrick Iff, Joanna Gajda, Piotr Nyczyk, Jürgen Müller, Hubert Niewiadomski, Marcin Chrapek, Michał Podstawski, Torsten Hoefler:
 Multi-Head RAG: Solving Multi-Aspect Problems with LLMs arXiv:2406.05085. Jun. 2024,
arXiv
[10] Maciej Besta, Lorenzo Paleari, Ales Kubicek, Piotr Nyczyk, Robert Gerstenberger, Patrick Iff, Tomasz Lehmann, Hubert Niewiadomski, Torsten Hoefler:
 CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks arXiv:2406.02524. Jun. 2024,
SPAA'24
[11] Kartik Lakhotia, Laura Monroe, Kelly Isham, Maciej Besta, Nils Blach, Torsten Hoefler, Fabrizio Petrini:
 PolarStar: Expanding the Horizon of Diameter-3 Networks In Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'24), presented in Nantes, France, pages 345–357, Association for Computing Machinery, ISBN: 9798400704161, Jun. 2024, (acceptance rate 29.9%, 35/117)
IEEE TPAMI
[12] Maciej Besta, Torsten Hoefler:
 Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol 46, Nr. 5, pages 2584-2606, IEEE Press, May 2024,
ICLR'24
[13] Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman:
 SliceGPT: Compress Large Language Models by Deleting Rows and Columns In The Twelfth International Conference on Learning Representations, May 2024,
IPDPS'24
[14] Yves Baumann, Tal Ben-Nun, Maciej Besta, Lukas Gianinazzi, Torsten Hoefler, Piotr Luczynski:
 Low-Depth Spatial Tree Algorithms In Proceedings of the 38th IEEE International Parallel and Distributed Processing Symposium (IPDPS'24), presented in San Francisco, CA, USA, pages 180-192, IEEE Press, May 2024, (acceptance rate 26.1%, 88/337)
HPDC'24
[15] Piotr Luczynski, Lukas Gianinazzi, Patrick Iff, Leighton Wilson, Daniele De Sensi, Torsten Hoefler:
 Near-Optimal Wafer-Scale Reduce In Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing (HPDC'24), presented in Pisa, Italy, Association for Computing Machinery, May 2024,
IPDPS'24
[16] Marcin Copik, Marcin Chrapek, Larissa Schmid, Alexandru Calotoiu, Torsten Hoefler:
  Software Resource Disaggregation for HPC with Serverless Computing In Proceedings of the 38th IEEE International Parallel and Distributed Processing Symposium (IPDPS'24), presented in San Francisco, CA, USA, IEEE, May 2024,
ICLR'24
[17] Tim Dettmers, Ruslan A. Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh:
 SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression In The Twelfth International Conference on Learning Representations, May 2024,
NSDI'24
[18] Nils Blach, Maciej Besta, Daniele De Sensi, Jens Domke, Hussein Harake, Shigang Li, Patrick Iff, Marek Konieczny, Kartik Lakhotia, Ales Kubicek, Marcel Ferrari, Fabrizio Petrini, Torsten Hoefler:
 A High-Performance Design, Implementation, Deployment, and Evaluation of the Slim Fly Network In 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI '24), presented in Santa Clara, CA, USA, pages 1025-1044, USENIX Association, ISBN: 978-1-939133-39-7, Apr. 2024,
ESSD
[19] Bjorn Stevens et al.:
 Earth Virtualization Engines (EVE) Earth System Science Data. Vol 16, Nr. 4, pages 2113-2122, Apr. 2024,
PPoPP'24
[21] Lukas Gianinazzi, Alexandros Nikolaos Ziogas, Piotr Luczynski, Langwen Huang, Saleh Ashkboos, Florian Scheidl, Armon Carigiet, Chio Ge, Nabil Abubaker, Maciej Besta, Tal Ben-Nun, Torsten Hoefler:
 Arrow Matrix Decomposition: A Novel Approach for Communication-Efficient Sparse Matrix Multiplication In Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming (PPoPP'24), presented in Edinburgh, United Kingdom, pages 404-416, Association for Computing Machinery, Mar. 2024,
AAAI'24
[22] Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Michał Podstawski, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Hubert Niewiadomski, Piotr Nyczyk, Torsten Hoefler:
 Graph of Thoughts: Solving Elaborate Problems with Large Language Models Proceedings of the AAAI Conference on Artificial Intelligence. Vol 38, Nr. 16, presented in Vancouver, Canada, pages 17682-17690, AAAI Press, Mar. 2024, (acceptance rate 23.75%, 2342/9862)
Nature CompSci
[23] Peter Bauer, Torsten Hoefler, Bjorn Stevens, Wilco Hazeleger:
 Digital twins of Earth and the computing challenge of human interaction Nature Computational Science. Vol 4, Nr. 1, pages 154-157, Mar. 2024,
arXiv
[24] Maciej Besta, Florim Memedi, Zhenyu Zhang, Robert Gerstenberger, Guangyuan Piao, Nils Blach, Piotr Nyczyk, Marcin Copik, Grzegorz Kwaśniewski, Jürgen Müller, Lukas Gianinazzi, Ales Kubicek, Hubert Niewiadomski, Aidan O'Mahony, Onur Mutlu, Torsten Hoefler:
 Demystifying Chains, Trees, and Graphs of Thoughts arXiv:2401.14295. Jan. 2024,
arXiv
[25] Lukas Möller, Marcin Copik, Alexandru Calotoiu, Torsten Hoefler:
 Cppless: Productive and Performant Serverless Programming in C++ arXiv:2401.10834. Jan. 2024,
Big Data'23
[26] Wei Qiu, Marcin Copik, Yun Wang, Alexandru Calotoiu, Torsten Hoefler:
 User-guided Page Merging for Memory Deduplication in Serverless Systems In 2023 IEEE International Conference on Big Data (Big Data), Dec. 2023, (acceptance rate 17.5%, 92/526)
SC'23
[27] Marcin Chrapek, Mikhail Khalilov, Torsten Hoefler:
 HEAR: Homomorphically Encrypted Allreduce In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'23), presented in Denver, CO, USA, Association for Computing Machinery, ISBN: 979-8-400701-09-2, Nov. 2023, (acceptance rate 23.9%, 90/376) SC23 Best student paper, SC23 Reproducibility Advancement Award
SC'23
[28] Roberto L. Castro, Andrei Ivanov, Diego Andrade, Tal Ben-Nun, Basilio B. Fraguela, Torsten Hoefler:
 VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'23), presented in Denver, CO, USA, Association for Computing Machinery, ISBN: 979-8-400701-09-2, Nov. 2023, (acceptance rate 23.9%, 90/376)
SC'23
[29] Wenqi Jiang, Shigang Li, Yu Zhu, Johannes de Fine Licht, Zhenhao He, Runbin Shi, Cedric Renggli, Shuai Zhang, Theodoros Rekatsinas, Torsten Hoefler, Gustavo Alonso:
 Co-design Hardware and Algorithm for Vector Search In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'23), presented in Denver, CO, USA, Association for Computing Machinery, ISBN: 979-8-400701-09-2, Nov. 2023, (acceptance rate 23.9%, 90/376)
LOG'23
[30] Maciej Besta, Afonso Claudino Catarino, Lukas Gianinazzi, Nils Blach, Piotr Nyczyk, Hubert Niewiadomski, Torsten Hoefler:
 HOT: Higher-Order Dynamic Graph Representation Learning with Efficient Transformers In Proceedings of the Second Learning on Graphs Conference (LOG'23), presented in Virtual, PMLR, Nov. 2023,
SC'23
[31] Maciej Besta, Paweł Renc, Robert Gerstenberger, Paolo Sylos Labini, Alexandros Ziogas, Tiancheng Chen, Lukas Gianinazzi, Florian Scheidl, Kalman Szenes, Armon Carigiet, Patrick Iff, Grzegorz Kwasniewski, Raghavendra Kanakagiri, Chio Ge, Sammy Jaeger, Jarosław Wąs, Flavio Vella, Torsten Hoefler:
 High-Performance and Programmable Attentional Graph Neural Networks with Global Tensor Formulations In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'23), presented in Denver, CO, USA, Association for Computing Machinery, ISBN: 979-8-400701-09-2, Nov. 2023, (acceptance rate 23.9%, 90/376)
SC'23
[32] Philipp Schaad, Timo Schneider, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Alexandru Calotoiu, Torsten Hoefler:
 FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'23), Association for Computing Machinery, ISBN: 979-8-400701-09-2, Nov. 2023, (acceptance rate 23.9%, 90/376)
arXiv
[33] Patrick Iff, Benigna Bruggmann, Maciej Besta, Luca Benini, Torsten Hoefler:
 RapidChiplet: A Toolchain for Rapid Design Space Exploration of Chiplet Architectures arXiv:2311.06081. Nov. 2023,
SC'23
[34] Maciej Besta, Robert Gerstenberger, Marc Fischer, Michał Podstawski, Nils Blach, Berke Egeli, Georgy Mitenkov, Wojciech Chlapek, Marek Michalewicz, Hubert Niewiadomski, Jürgen Müller, Torsten Hoefler:
 The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of Cores In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'23), presented in Denver, CO, USA, Association for Computing Machinery, ISBN: 979-8-400701-09-2, Nov. 2023, (acceptance rate 23.9%, 90/376) Best Paper Finalist
EMNLP24
[35] Saleh Ashkboos, Ilia Markov, Elias Frantar, Tingxuan Zhong, Xincheng Wang, Jie Ren, Torsten Hoefler, Dan Alistarh:
 QUIK: Towards End-to-End 4-Bit Inference on Generative Large Language Models Nov. 2023,
ICCV'23
[36] Yunqiang Li, Jan C van Gemert, Torsten Hoefler, Bert Moons, Evangelos Eleftheriou, Bram-Ernst Verhoef:
 Differentiable Transportation Pruning 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Oct. 2023,
TPDS
[37] Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
 Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra IEEE Trans. Parallel Distrib. Syst. Vol 34, Nr. 12, pages 3147-3161, Oct. 2023,
ACM CSUR
[38] Maciej Besta, Robert Gerstenberger, Emanuel Peter, Marc Fischer, Michał Podstawski, Claude Barthels, Gustavo Alonso, Torsten Hoefler:
 Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries ACM Comput. Surv. Vol 56, Nr. 2, Association for Computing Machinery, ISSN: 0360-0300, Sep. 2023,
MODSIM'23
[39] Torsten Hoefler:
 Towards smart(er) High-Performance Networking Driving Future Simulations (Presentation) presented in Seattle, WA, USA, Aug. 2023, Invited talk at the MODSIM'23 workshop
ICPP'23
[40] Torsten Hoefler:
 Scalable and Efficient AI: From Supercomputers to Smartphones (Presentation) presented in Salt Lake City, UT, USA, Aug. 2023, Keynote talk at the 52nd International Conference on Parallel Processing
arXiv
[41] Julia Bazinska, Andrei Ivanov, Tal Ben-Nun, Nikoli Dryden, Maciej Besta, Siyuan Shen, Torsten Hoefler:
 Cached Operator Reordering: A Unified View for Fast GNN Training arXiv:2308.12093. Aug. 2023,
Nature Earth
[42] Peter Bauer, Peter D. Dueben, Matthew Chantry, Francisco Doblas-Reyes, Torsten Hoefler, Amy McGovern, Bjorn Stevens:
 Deep learning and a changing economy in weather and climate prediction Nature Reviews Earth and Environment. Vol 4, Nr. 1, pages 507-509, Aug. 2023,
ATC'23
[43] Andrei Ivanov, Benjamin Rothenberger, Arnaud Dethise, Marco Canini, Torsten Hoefler, Adrian Perrig:
 SAGE: Software-based Attestation for GPU Execution In 2023 USENIX Annual Technical Conference (USENIX ATC 23), pages 485-499, USENIX Association, ISBN: 978-1-939133-35-9, Jul. 2023,
DAC'23
[44] Patrick Iff, Maciej Besta, Matheus Cavalcante, Tim Fischer, Luca Benini, Torsten Hoefler:
 HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement In Proceedings of the 60th Annual Design Automation Conference, Jul. 2023,
DAC'23
[45] Patrick Iff, Maciej Besta, Matheus Cavalcante, Tim Fischer, Luca Benini, Torsten Hoefler:
 Sparse Hamming Graph: A Customizable Network-on-Chip Topology In Proceedings of the 60th Annual Design Automation Conference, Jul. 2023,
SPAA'23
[46] Kartik Lakhotia, Kelly Isham, Laura Monroe, Maciej Besta, Torsten Hoefler, Fabrizio Petrini:
 In-network Allreduce with Multiple Spanning Trees on PolarFly In Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'23), presented in Orlando, FL, USA, pages 165–176, Association for Computing Machinery, ISBN: 9781450395458, Jun. 2023,
ICS'23
[47] Lukas Truemper, Tal Ben-Nun, Philipp Schaad, Alexandru Calotoiu, Torsten Hoefler:
 Performance Embeddings: A Similarity-Based Transfer Tuning Approach to Performance Optimization Jun. 2023,
IEEE TPDS
[48] Maciej Besta, Marc Fischer, Vasiliki Kalavri, Michael Kapralov, Torsten Hoefler:
 Practice of Streaming Processing of Dynamic Graphs: Concepts, Models, and Systems IEEE Transactions on Parallel and Distributed Systems. Vol 34, Nr. 6, pages 1860-1876, IEEE, Jun. 2023,
ICS'23
[49] Marcin Copik, Roman Böhringer, Alexandru Calotoiu, Torsten Hoefler:
 FMI: Fast and Cheap Message Passing for Serverless Functions Jun. 2023,
CIAC'23
[50] Tal Ben-Nun, Lukas Gianinazzi, Torsten Hoefler, Yishai Oltchik:
 Maximum Flows in Parametric Graph Templates In Algorithms and Complexity - 13th International Conference, Jun. 2023,
RISC-V Summit
[51] Andrei Ivanov, Timo Schneider, Luca Benini, Torsten Hoefler:
 RIVETS: An Efficient Training and Inference Library for RISC-V with Snitch Extensions In RISC-V Summit Europe, Jun. 2023,
HPDC'23
[52] Tiziano De Matteis, Lukas Gianinazzi, Johannes de Fine Licht, Torsten Hoefler:
 Streaming Task Graph Scheduling for Dataflow Architectures Jun. 2023,
FCRC'23
[53] Torsten Hoefler:
 Scalable and Efficient AI: From Supercomputers to Smartphones (Presentation) presented in Orlando, FL, USA, Jun. 2023, Keynote talk at the 2023 Federated Computing Research Conference
ICLR'23
[54] Langwen Huang, Torsten Hoefler:
 Compressing multidimensional weather and climate data into neural networks In The Eleventh International Conference on Learning Representations, May 2023, Notable Top 5% (Oral)
CiSE EVE
[55] Torsten Hoefler, Bjorn Stevens, Andreas F. Prein, Johanna Baehr, Thomas Schulthess, Thomas F. Stocker, John Taylor, Daniel Klocke, Pekka Manninen, Piers M. Forster, Tobias Kölling, Nicolas Gruber, Hartwig Anzt, Claudia Frauen, Florian Ziemen, Milan Klöwer, Karthik Kashinath, Christoph Schär, Oliver Fuhrer, Bryan N. Lawrence:
 Earth Virtualization Engines -- A Technical Perspective Computing in Science and Engineering (CiSE). Vol 25, Nr. 3, IEEE Computer Society, ISSN: 1521-9615, May 2023,
IPDPS'23 PhD Forum
[56] Marcin Copik, Torsten Hoefler:
 High-Performance Serverless for HPC and Clouds In 37th IEEE International Parallel & Distributed Processing Symposium (IPDPS), PhD Forum, May 2023,
ICLR'23
[57] Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh:
 GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers In The Eleventh International Conference on Learning Representations, May 2023,
CACM
[58] Torsten Hoefler, Thomas Häner, Matthias Troyer:
 Disentangling hype from practicality: On realistically achieving quantum advantage Communications of the ACM. Vol 66, Nr. 5, pages 82-87, ACM, May 2023,
IPDPS'23
[59] Marcin Copik, Konstantin Taranov, Alexandru Calotoiu, Torsten Hoefler:
 rFaaS: Enabling High Performance Serverless with RDMA and Leases In Proceedings of the 37th IEEE International Parallel and Distributed Processing Symposium, May 2023,
CGO'23
[60] Tal Ben-Nun, Berke Ates, Alexandru Calotoiu, Torsten Hoefler:
 Bridging Control-Centric and Data-Centric Optimization In 2023 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 173-185, Feb. 2023,
NeurIPS'22
[61] Nikoli Dryden, Torsten Hoefler:
 Spatial Mixture-of-Experts In Advances in Neural Information Processing Systems 35, presented in New Orleans, Louisiana, Dec. 2022,
NeurIPS'22
[62] Saleh Ashkboos, Langwen Huang, Nikoli Dryden, Tal Ben-Nun, Peter Dueben, Lukas Gianinazzi, Luca Kummer, Torsten Hoefler:
 ENS-10: A Dataset For Post-Processing Ensemble Weather Forecasts In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, presented in New Orleans, Louisiana, Dec. 2022,
SIGMETRICS
[63] Daniele De Sensi, Tiziano De Matteis, Konstantin Taranov, Salvatore Di Girolamo, Tobias Rahn, Torsten Hoefler:
 Noise in the Clouds: Influence of Network Performance Variability on Application Scalability Proc. ACM Meas. Anal. Comput. Syst. Vol 6, Nr. 3, presented in New York, NY, USA, Association for Computing Machinery, Dec. 2022,
LOG'22
[64] Maciej Besta, Patrick Iff, Florian Scheidl, Kazuki Osawa, Nikoli Dryden, Michal Podstawski, Tiancheng Chen, Torsten Hoefler:
 Neural Graph Databases In Proceedings of the Learning on Graphs Conference (LOG'22), presented in Virtual, PMLR, Dec. 2022,
SC'22
[65] Torsten Hoefler, Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Shigang Li, Marco Heddes, Jon Belk, Deepak Goel, Miguel Castro, Steve Scott:
 HammingMesh: A Network Topology for Large-Scale Deep Learning In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'22), Nov. 2022, SC22 Reproducibility Advancement Award and Invited as CACM Research Highlight
ExaMPI22
[66] Shiyi Cao, Salvatore Di Girolamo, Torsten Hoefler:
 Accelerating Data Serialization/Deserialization Protocols with In-Network Compute In 2022 IEEE/ACM International Workshop on Exascale MPI (ExaMPI), Nov. 2022,
SC'22
[67] Maciej Besta, Cesare Miglioli, Paolo Sylos Labini, Jakub Tětek, Patrick Iff, Raghavendra Kanakagiri, Saleh Ashkboos, Kacper Janda, Michal Podstawski, Grzegorz Kwasniewski, Niels Gleinig, Flavio Vella, Onur Mutlu, Torsten Hoefler:
 ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'22), Nov. 2022, SC22 Best Paper (1/82)
arXiv
[68] Michael E Beverland, Prakash Murali, Matthias Troyer, Krysta M Svore, Torsten Hoefler, Vadym Kliuchnikov, Guang Hao Low, Mathias Soeken, Aarthi Sundaram, Alexander Vaschillo:
 Assessing requirements to scale to practical quantum advantage Nov. 2022, Presented at the Quantum Information Processing (QIP) conference
SC'22
[69] Alexandros Nikolaos Ziogas, Grzegorz Kwasniewski, Tal Ben-Nun, Timo Schneider, Torsten Hoefler:
 Deinsum: Practically I/O Optimal Multilinear Algebra In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'22), Nov. 2022,
SC'22
[70] Philipp Schaad, Tal Ben-Nun, Torsten Hoefler:
 Boosting Performance Optimization with Interactive Data Movement Visualization In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'22), ISBN: 9784665454445, Nov. 2022,
SC'22
[71] Tal Ben-Nun, Linus Groner, Florian Deconinck, Tobias Wicky, Eddie Davis, Johann Dahm, Oliver Elbert, Rhea George, Jeremy McGibbon, Lukas Trümper, Elynn Wu, Oliver Fuhrer, Thomas Schulthess, Torsten Hoefler:
 Productive Performance Engineering for Weather and Climate Modeling with Python In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'22), ISBN: 9784665454445, Nov. 2022,
SC'22
[72] Salvatore Di Girolamo, Daniele De Sensi, Konstantin Taranov, Milos Malesevic, Maciej Besta, Timo Schneider, Severin Kistler, Torsten Hoefler:
 Building Blocks for Network-Accelerated Distributed File Systems In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'22), Nov. 2022, Best Paper Finalist
CCS'22
[73] Konstantin Taranov, Benjamin Rothenberger, Daniele De Sensi, Adrian Perrig, Torsten Hoefler:
 NeVerMore: Exploiting RDMA Mistakes in NVMe-oF Storage Applications In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (CCS '22), Nov. 2022, Best Paper Honorable Mention
SC'22
[74] Kartik Lakhotia, Maciej Besta, Laura Monroe, Kelly Isham, Patrick Iff, Torsten Hoefler, Fabrizio Petrini:
 PolarFly: A Cost-Effective and Flexible Low-Diameter Topology In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'22), Nov. 2022,
SC'22
[75] Shigang Li, Kazuki Osawa, Torsten Hoefler:
 Efficient Quantized Sparse Matrix Operations on Tensor Cores Nov. 2022, Best Paper Finalist
ICCAD'22
[76] Carl-Johannes Johnsen, Tiziano De Matteis, Tal Ben-Nun, Johannes de Fine Licht, Torsten Hoefler:
 Temporal Vectorization: A Compiler Approach to Automatic Multi-Pumping In 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD), Oct. 2022,
KDD'22
[77] Maciej Besta, Raphael Grob, Cesare Miglioli, Nicola Bernold, Grzegorz Kwasniewski, Gabriel Gjini, Raghavendra Kanakagiri, Saleh Ashkboos, Lukas Gianinazzi, Nikoli Dryden, Torsten Hoefler:
 Motif Prediction with Graph Neural Networks In Proceedings of the 28th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'22), presented in Washington DC, USA, pages 35–45, Association for Computing Machinery, ISBN: 9781450393850, Aug. 2022,
IEEE Computer
[78] Torsten Hoefler:
 Benchmarking data science: Twelve ways to lie with statistics and performance on parallel computers IEEE Computer. Vol 55, pages 49-56, Aug. 2022, Cover Feature Research Reproducibility
ICS'22
[79] Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler:
 A Data-Centric Optimization Framework for Machine Learning In Proceedings of the 2022 International Conference on Supercomputing (ICS'22), Jul. 2022,
ICS'22
[80] Alexandru Calotoiu, Tal Ben-Nun, Grzegorz Kwasniewski, Johannes de Fine Licht, Timo Schneider, Philipp Schaad, Torsten Hoefler:
 Lifting C Semantics for Dataflow Optimization In Proceedings of the 2022 International Conference on Supercomputing (ICS'22), Jul. 2022,
IEEE Computer
[81] Torsten Hoefler, Ariel Hendel, Duncan Roweth:
 The Convergence of Hyperscale Data Center and High-Performance Computing Networks IEEE Computer. Vol 55, Nr. 7, pages 29-37, Jul. 2022, Cover Feature Technology Predictions
SNN
[82] Andrei Ivanov, Nikoli Dryden, Torsten Hoefler:
 STen: An Interface for Efficient Sparsity in PyTorch In Sparsity in Neural Networks workshop, Jul. 2022,
ICS'22
[83] Larissa Schmid, Marcin Copik, Alexandru Calotoiu, Dominik Werle, Andreas Reiter, Michael Selzer, Anne Koziolek, Torsten Hoefler:
 Performance-Detective: Automatic Deduction of Cheap and Accurate Performance Models In Proceedings of the 2022 International Conference on Supercomputing (ICS'22), Jul. 2022,
IPDPS'22
[84] András Strausz, Flavio Vella, Salvatore Di Girolamo, Maciej Besta, Torsten Hoefler:
 Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching In Proceedings of the 36th IEEE International Parallel and Distributed Processing Symposium (to appear), Jun. 2022,
ICST'22
[85] Andrei Lascu, Alastair F. Donaldson, Tobias Grosser, Torsten Hoefler:
 Metamorphic Fuzzing of C++ Libraries In IEEE International Conference on Software Testing, Verification and Validation, Jun. 2022,
SIGMOD'22
[86] Konstantin Taranov, Steve Byan, Virendra Marathe, Torsten Hoefler:
 KafkaDirect: Zero-copy Data Access for Apache Kafka over RDMA Networks In Proceedings of the 2022 ACM SIGMOD International Conference on Management of Data, Jun. 2022,
IPDPS'22
[87] Niels Gleinig, Maciej Besta, Torsten Hoefler:
 I/O-Optimal Cache-Oblivious Sparse Matrix-Sparse Matrix Multiplication In Proceedings of the 36th IEEE International Parallel and Distributed Processing Symposium (to appear), Jun. 2022,
SIROCCO'22
[88] Niels Gleinig, Torsten Hoefler:
 The Red-Blue Pebble Game on Trees and DAGs with Large Input In Structural Information and Communication Complexity - 29th International Colloquium, SIROCCO 2022, Proceedings (to appear), Jun. 2022,
FCCM'22
[89] Johannes de Fine Licht, Christopher A. Pattison, Alexandros Nikolaos Ziogas, David Simmons-Duffin, Torsten Hoefler:
 Fast Arbitrary Precision Floating Point on FPGA In Proceedings of the 30th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM'22), May 2022,
arXiv
[90] Lukas Gianinazzi, Tal Ben-Nun, Maciej Besta, Saleh Ashkboos, Yves Baumann, Piotr Luczynski, Torsten Hoefler:
 The spatial computer: A model for energy-efficient parallel computation arXiv:2205.04934. May 2022,
PPoPP'22
[91] Shigang Li, Torsten Hoefler:
 Near-Optimal Sparse Allreduce for Distributed Deep Learning In Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Apr. 2022,
ICLR'22
[92] Bryan A. Plummer, Nikoli Dryden, Julius Frost, Torsten Hoefler, Kate Saenko:
 Neural Parameter Allocation Search In Tenth International Conference on Learning Representations, Apr. 2022,
arXiv
[93] Marcin Copik, Alexandru Calotoiu, Konstantin Taranov, Torsten Hoefler:
 FaaSKeeper: Learning from Building Serverless Services with ZooKeeper as an Example Mar. 2022,
IEEE TPDS
[94] Marcin Copik, Tobias Grosser, Torsten Hoefler, Paolo Bientinesi, Benjamin Berkels:
 Work-Stealing Prefix Scan: Addressing Load Imbalance in Large-Scale Image Registration IEEE Transactions on Parallel and Distributed Systems. Vol 33, Nr. 3, pages 523-535, IEEE, Mar. 2022,
ADAC11
[95] Torsten Hoefler, Linus Groner, Tal Ben-Nun, Tobias Wicky:
 Weather and Climate Simulations in Python using GT4Py and DaCe (Presentation) presented in Virtual, Jan. 2022,
Report
[96] Marcin Copik, Alexandru Calotoiu, Rodrigo Bruno, Gyorgy Rethy, Roman Böhringer, Torsten Hoefler:
 Process-as-a-Service: Elastic and Stateful Serverless with Cloud Processes Jan. 2022,
DATE
[97] Andrea Cossettini, Konstantin Taranov, Christian Vogt, Michele Magno, Torsten Hoefler, Luca Benini:
 A RDMA Interface for Ultra-Fast Ultrasound Data-Streaming over an Optical Link In Proceedings of Design, Automation, and Test in Europe (DATE), 2022,
DAC'21
[98] Niels Gleinig, Torsten Hoefler:
 An Efficient Algorithm for Sparse Quantum State Preparation In Proceedings of the 58th Annual Design Automation Conference, presented in San Francisco, CA, USA, ACM, Dec. 2021, (acceptance rate 23%)
Middleware 2021
[99] Marcin Copik, Grzegorz Kwasniewski, Maciej Besta, Michal Podstawski, Torsten Hoefler:
 SeBS: A Serverless Benchmark Suite for Function-as-a-Service Computing In Proceedings of the 22nd International Middleware Conference, presented in Québec City, Canada, ACM, ISBN: 9781450385343, Dec. 2021,
SC21
[100] Nikoli Dryden, Roman Böhringer, Tal Ben-Nun, Torsten Hoefler:
 Clairvoyant Prefetching for Distributed Machine Learning I/O In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21), presented in St. Louis, Missouri, ACM, Nov. 2021, (acceptance rate 25.9%, 98/379)
SC21
[101] Shigang Li, Torsten Hoefler:
 Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21), presented in St. Louis, Missouri, ACM, Nov. 2021, (acceptance rate 25.9%, 98/379) Best Paper Finalist
SC21
[102] Grzegorz Kwasniewski, Marko Kabić, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Jens Eirik Saethre, André Gaillard, Timo Schneider, Maciej Besta, Anton Kozhevnikov, Joost VandeVondele, Torsten Hoefler:
 On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal Matrix Factorizations In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21), Nov. 2021, (acceptance rate 25.9%, 98/379)
SC21
[103] Alexandros Nikolaos Ziogas, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Tiziano De Matteis, Johannes de Fine Licht, Luca Lavarini, Torsten Hoefler:
 Productivity, Portability, Performance: Data-Centric Python In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21), Nov. 2021, (acceptance rate 25.9%, 98/379)
SC21
[104] Thomas Häner, Damian S. Steiger, Torsten Hoefler, Matthias Troyer:
 Distributed Quantum Computing with QMPI In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21), Nov. 2021, (acceptance rate 25.9%, 98/379)
OOPSLA'21
[105] Arjun Pitchanathan, Christian Ulmann, Michel Weber, Torsten Hoefler, Tobias Grosser:
 FPL: fast Presburger arithmetic through transprecision OOPSLA '21: Proceedings of the ACM international conference on Object oriented programming systems languages and applications. ACM, Nov. 2021, OOPSLA distinguished paper award (6/71)
SC21
[106] Daniele De Sensi, Salvatore Di Girolamo, Saleh Ashkboos, Shigang Li, Torsten Hoefler:
 Flare: Flexible In-Network Allreduce In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC21), presented in St. Louis, Missouri, ACM, Nov. 2021, (acceptance rate 25.9%, 98/379)
MICRO'21
[107] Maciej Besta, Raghavendra Kanakagiri, Grzegorz Kwasniewski, Rachata Ausavarungnirun, Jakub Beránek, Konstantinos Kanellopoulos, Kacper Janda, Zur Vonarburg-Shmaria, Lukas Gianinazzi, Ioana Stefan, Juan Gómez Luna, Marcin Copik, Lukas Kapp-Schwoerer, Salvatore Di Girolamo, Nils Blach, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
 SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems In Proceedings of the 54th IEEE/ACM International Symposium on Microarchitecture (MICRO), Oct. 2021,
JMLR
[108] Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste:
 Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks Journal of Machine Learning Research. Vol 22, Nr. 241, pages 1-124, Sep. 2021,
VLDB'21
[109] Maciej Besta, Zur Vonarburg-Shmaria, Yannick Schaffner, Leonardo Schwarz, Grzegorz Kwasniewski, Lukas Gianinazzi, Jakub Beránek, Kacper Janda, Tobias Holenstein, Sebastian Leisinger, Peter Tatkowski, Esref Ozdemir, Adrian Balla, Marcin Copik, Philipp Lindenberger, Pavel Kalvoda, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
 GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra In Proceedings of the 47th International Conference on Very Large Data Bases (VLDB'21), Aug. 2021,
TQC'21
[110] David Ittah, Thomas Häner, Vadym Kliuchnikov, Torsten Hoefler:
 QIRO: A Static Single Assignment-Based Quantum Program Representation for Optimization In ACM Transactions on Quantum Computing, Association for Computing Machinery, ISSN: 2643-6809, Aug. 2021,
ICML'21
[111] Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Michael O’Boyle, Hugh Leather:
 ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations In Thirty-eighth International Conference on Machine Learning, presented in Virtual, PMLR, Jul. 2021, (acceptance rate 21%)
USENIX ATC'21
[112] Maksym Planeta, Jan Bierbaum, Leo Sahaya Daphne Antony, Torsten Hoefler, Hermann Härtig:
 MigrOS: Transparent Live-Migration Support for Containerised RDMA Applications In Proceedings of the 2021 USENIX Annual Technical Conference, USENIX, Jul. 2021, (acceptance rate 18.8%, 64/341)
USENIX ATC'21
[113] Konstantin Taranov, Rodrigo Bruno, Gustavo Alonso, Torsten Hoefler:
 Naos: Serialization-free RDMA networking in Java In Proceedings of the 2021 USENIX Annual Technical Conference, USENIX, Jul. 2021, (acceptance rate 18.8%, 64/341)
SPAA'21
[114] Grzegorz Kwasniewski, Tal Ben-Nun, Lukas Gianinazzi, Alexandru Calotoiu, Timo Schneider, Alexandros Nikolaos Ziogas, Maciej Besta, Torsten Hoefler:
 Pebbles, Graphs, and a Pinch of Combinatorics: Towards Tight I/O Lower Bounds for Statically Analyzable Programs In Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'21), Jul. 2021, (acceptance rate 14.9%)
SPAA'21
[115] Lukas Gianinazzi, Maciej Besta, Yannick Schaffner, Torsten Hoefler:
 Parallel Algorithms for Finding Large Cliques in Sparse Graphs In Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'21), ACM, Jul. 2021,
ICS'21
[116] Alexandros Nikolaos Ziogas, Tal Ben-Nun, Timo Schneider, Torsten Hoefler:
 NPBench: A Benchmarking Suite for High-Performance NumPy In Proceedings of the 2021 International Conference on Supercomputing (ICS'21), Jun. 2021,
SIGMOD'21
[117] Konstantin Taranov, Salvatore Di Girolamo, Torsten Hoefler:
 CoRM: Compactable Remote Memory over RDMA In Proceedings of the 2021 ACM SIGMOD International Conference on Management of Data, Jun. 2021,
arXiv
[118] Lukas Gianinazzi, Maximilian Fries, Nikoli Dryden, Tal Ben-Nun, Maciej Besta, Torsten Hoefler:
 Learning Combinatorial Node Labeling Algorithms arXiv:2106.03594. Jun. 2021,
ISCA'21
[119] Salvatore Di Girolamo, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, Jakub Beránek, Luca Benini, Torsten Hoefler:
 A RISC-V in-network accelerator for flexible high-performance low-power packet processing In Proceedings of the 48th Annual International Symposium on Computer Architecture (ISCA'21), Jun. 2021,
IEEE TPDS
[120] Johannes de Fine Licht, Maciej Besta, Simon Meierhans, Torsten Hoefler:
 Transformations of High-Level Synthesis Codes for High-Performance Computing IEEE Transactions on Parallel and Distributed Systems. Vol 32, Nr. 5, pages 1014-1029, IEEE, May 2021,
IPDPS'21
[121] Marcus Ritter, Alexander Geiss, Johannes Wehrstein, Alexandru Calotoiu, Thorsten Reimann, Torsten Hoefler, Felix Wolf:
 Noise-Resilient Empirical Performance Modeling with Deep Neural Networks In IPDPS '21: Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, May 2021,
arXiv
[122] Maciej Besta, Marcel Schneider, Salvatore Di Girolamo, Ankit Singla, Torsten Hoefler:
 Towards Million-Server Network Simulations on Just a Laptop May 2021,
IEEE TPDS
[123] Maciej Besta, Jens Domke, Marcel Schneider, Marek Konieczny, Salvatore Di Girolamo, Timo Schneider, Ankit Singla, Torsten Hoefler:
 High-Performance Routing with Multipathing and Path Diversity in Ethernet and HPC Networks IEEE Transactions on Parallel and Distributed Systems. Vol 32, Nr. 4, pages 943-959, IEEE, Apr. 2021,
MLSys'21
[124] Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler:
 Data Movement Is All You Need: A Case Study on Optimizing Transformers In Proceedings of Machine Learning and Systems 3 (MLSys 2021), Apr. 2021, (acceptance rate: 23.5% (52/221)) Outstanding Paper Award (5/52)
RSTA
[125] Peter Grönquist, Chengyuan Yao, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Shigang Li, Torsten Hoefler:
 Deep Learning for Post-Processing Ensemble Weather Forecasts Philosophical Transactions of the Royal Society A. Vol 379, Nr. 2194, The Royal Society, Feb. 2021,
Nature CompSci
[126] Peter Bauer, Peter D. Dueben, Torsten Hoefler, Tiago Quintino, Thomas C. Schulthess, Nils P. Wedi:
 The digital revolution of Earth-system science Nature Computational Science. Vol 1, Nr. 1, pages 104-113, Feb. 2021,
PPoPP'21
[127] Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler:
 Extracting Clean Performance Models from Tainted Programs In PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Feb. 2021, (acceptance rate: 21% (31/150))
DATE
[128] Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
 Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra In Proceedings of Design, Automation, and Test in Europe (DATE), 2021,
USENIX Security'21
[129] Benjamin Rothenberger, Konstantin Taranov, Adrian Perrig, Torsten Hoefler:
 ReDMArk: Bypassing RDMA Security Mechanisms In Proceedings of the 2021 USENIX Security Symposium, USENIX, 2021,
IEEE TPDS
[130] Shigang Li, Tal Ben-Nun, Giorgi Nadiradze, Salvatore Di Girolamo, Nikoli Dryden, Dan Alistarh, Torsten Hoefler:
 Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging IEEE Transactions on Parallel and Distributed Systems. Vol 32, Nr. 7, pages 1725-1739, IEEE, 2021,
TACO21
[131] Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser:
 Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-Accelerated Climate Simulation ACM Trans. Archit. Code Optim. Vol 18, Nr. 4, Association for Computing Machinery, ISSN: 1544-3566, 2021,
CGO'21
[132] Johannes de Fine Licht, Andreas Kuster, Tiziano De Matteis, Tal Ben-Nun, Dominic Hofer, Torsten Hoefler:
 StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems In Proceedings of the 19th ACM/IEEE International Symposium on Code Generation and Optimization (CGO'21), 2021,
CIUK
[133] Torsten Hoefler:
 A Data-Centric Approach to Performance Portability (Presentation) presented in Virtual, Dec. 2020, Keynote talk at the Computing Insight UK 2020 Conference (CIUK'20)
SC20
[134] Maciej Besta, Marcel Schneider, Marek Konieczny, Karolina Cynk, Erik Henriksson, Salvatore Di Girolamo, Ankit Singla, Torsten Hoefler:
 FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20), Nov. 2020, (acceptance rate: 25.1% (95/378))
SC20
[135] Maciej Besta, Armon Carigiet, Kacper Janda, Zur Vonarburg-Shmaria, Lukas Gianinazzi, Torsten Hoefler:
 High-Performance Parallel Graph Coloring with Strong Guarantees on Work, Depth, and Quality In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20), Nov. 2020, (acceptance rate: 25.1% (95/378))
OOPSLA'20
[136] Thomas Häner, Matthias Troyer, Torsten Hoefler:
 Assertion-based optimization of quantum programs OOPSLA '20: Proceedings of the ACM international conference on Object oriented programming systems languages and applications. ACM, Nov. 2020,
OOPSLA'20
[137] Tobias Grosser, Theodoros Theodoridis, Maximilian Falkenstein, Arjun Pitchanathan, Michael Kruse, Manuel Rigger, Zhendong Su, Torsten Hoefler:
 Fast Linear Programming through Transprecision Computing on Small and Sparse Data OOPSLA '20: Proceedings of the ACM international conference on Object oriented programming systems languages and applications. ACM, Nov. 2020,
IEEE TCAD
[138] Asif Ali Khan, Hauke Mewes, Tobias Grosser, Torsten Hoefler, Jeronimo Castrillon:
 Polyhedral Compilation for Racetrack Memories IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. Vol 39, Nr. 11, IEEE, Nov. 2020,
SC20
[139] Yuyang Jin, Haojie Wang, Teng Yu, Xiongchao Tang, Torsten Hoefler, Xu Liu, Jidong Zhai:
 SCALANA: Automating Scaling Loss Detection with Graph Analysis In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20), Nov. 2020, (acceptance rate 25.1% (95/378))
Bench'20
[140] Torsten Hoefler:
 Scientific Benchmarking of Parallel Computing Systems (Presentation) presented in virtual, Nov. 2020, Keynote talk at the 2020 BenchCouncil International Symposium on Benchmarking, Measuring and Optimizing (Bench'20)
ProTools @ SC'20
[141] Alexandru Calotoiu, Markus Geisenhofer, Florian Kummer, Marcus Ritter, Jens Weber, Torsten Hoefler, Martin Oberlack, Felix Wolf:
 Empirical Modeling of Spatially Diverging Performance In 2020 IEEE/ACM International Workshop on HPC User Support Tools (HUST) and Workshop on Programming and Performance Visualization, Nov. 2020,
SC20
[142] Tiziano De Matteis, Johannes de Fine Licht, Torsten Hoefler:
 FBLAS: Streaming Linear Algebra on FPGA In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20), IEEE Press, ISBN: 9781728199986, Nov. 2020, (acceptance rate: 25.1% (95/378))
SC20
[143] Daniele De Sensi, Salvatore Di Girolamo, Kim H. McMahon, Duncan Roweth, Torsten Hoefler:
 An In-Depth Analysis of the Slingshot Interconnect In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC20), Nov. 2020, (acceptance rate: 25.1% (95/378))
HPC China
[144] Torsten Hoefler:
 General in-network processing - time is ripe! (Presentation) presented in hybrid/virtual, Oct. 2020, Keynote talk at the High-performance Interconnects Forum (in conjunction with HPC China 2020)
DISC'20
[145] Torsten Hoefler:
 High-performance distributed memory systems – from supercomputers to data centers (Presentation) presented in virtual, Oct. 2020, Keynote talk at the 2020 International Symposium on DIStributed Computing (DISC)
PACT20
[146] Lorenzo Chelini, Tobias Gysi, Tobias Grosser, Martin Kong, Henk Corporaal:
 Automatic Generation of Multi-Objective Polyhedral Compiler Transformations In Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, presented in Virtual, ACM, Oct. 2020,
arXiv
[147] Grzegorz Kwasniewski, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Timo Schneider, Maciej Besta, Torsten Hoefler:
 On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal LU Factorization arXiv:2010.05975. Oct. 2020,
EuroMPI'20
[148] Alexandr Nigay, Lukas Mosimann, Timo Schneider, Torsten Hoefler:
 Communication and Timing Issues with MPI Virtualization In 27th European MPI Users' Group Meeting, presented in Austin, TX, USA, pages 11–20, Association for Computing Machinery, ISBN: 9781450388801, Sep. 2020,
IEEE TOC
[149] Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
 Snitch: A tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads IEEE Transactions on Computers (TOC). IEEE, Sep. 2020, Featured Paper in November 2021 issue
PVLDB'20
[150] Claude Barthels, Ingo Müller, Konstantin Taranov, Torsten Hoefler, Gustavo Alonso:
 Strong consistency is not hard to get: Two-Phase Locking and Two-Phase Commit on Thousands of Cores In Proceedings of the VLDB Endowment, Vol. 12, No. 13, VLDB Endowment, Sep. 2020,
SPAA'20
[151] Lukas Gianinazzi, Torsten Hoefler:
 Parallel Planar Subgraph Isomorphism and Vertex Connectivity In Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'20), ACM, Jul. 2020, Best Paper Finalist (5/68)
USENIX ATC'20
[152] Konstantin Taranov, Benjamin Rothenberger, Adrian Perrig, Torsten Hoefler:
 sRDMA – Efficient NIC-based Authentication and Encryption for Remote Direct Memory Access In Proceedings of the 2020 USENIX Annual Technical Conference, USENIX, Jul. 2020, (acceptance rate 18.6%, 65/348)
HPBD&IS
[153] Torsten Hoefler:
 High-Performance Communication in Machine Learning (Presentation) presented in virtual, Jun. 2020, Keynote talk at the 2020 International Conference on High Performance Big Data and Intelligent Systems (HPBD&IS 2020)
ESIWACE'20
[154] Torsten Hoefler:
 Deep Learning for Post-Processing Ensemble Weather Forecasts (Presentation) presented in virtual, Jun. 2020, Invited talk at the 2020 ESIWACE Workshop
DAC'20
[155] Andreas Kurth, Samuel Riedel, Florian Zaruba, Torsten Hoefler, Luca Benini:
 ATUNs: Modular and Scalable Support for Atomic Operations in a Shared Memory Multiprocessor In Proceedings of the 57th Annual Design Automation Conference, ACM, Jun. 2020, Best Paper Finalist (6/228)
CVPR'20
[156] Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry:
 Increasing batch size through instance repetition improves generalization In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2020,
IPDPS'20
[157] Maciej Besta, Raghavendra Kanakagiri, Harun Mustafa, Mikhail Karasikov, Gunnar Rätsch, Torsten Hoefler, Edgar Solomonik:
 Communication-Efficient Jaccard Similarity for High-Performance Distributed Genome Comparisons May 2020, In Proceedings of the 34th IEEE International Parallel and Distributed Processing Symposium
SuperFri
[158] Carlos Osuna, Tobias Wicky, Fabian Thuering, Torsten Hoefler, Oliver Fuhrer:
 Dawn: a High Level Domain-Specific Language Compiler Toolchain for Weather and Climate Applications Supercomputing Frontiers and Innovation. Vol 7, Nr. 2, May 2020,
IPDPS'20
[159] Marcus Ritter, Alexandru Calotoiu, Thorsten Reimann, Torsten Hoefler, Felix Wolf:
 Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling presented in New Orleans, LA, USA, IEEE, May 2020, The 34th IEEE International Parallel & Distributed Processing Symposium (IPDPS'20)
IEEE TOC
[160] Fabian Schuiki, Florian Zaruba, Torsten Hoefler, Luca Benini:
 Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores IEEE Transactions on Computers (TOC). IEEE, Apr. 2020,
FPGA'20
[161] Johannes de Fine Licht, Grzegorz Kwasniewski, Torsten Hoefler:
 Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis Feb. 2020, In Proceedings of the 28th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
PPoPP'20
[162] Shigang Li, Tal Ben-Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler:
 Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations In Proceedings of the 25th Symposium on Principles and Practice of Parallel Programming (PPoPP'20), Feb. 2020, (acceptance rate: 23.1% (28/121)) Best Paper Nomination (5/28)
TRETS'20
[163] Maciej Besta, Marc Fischer, Tal Ben-Nun, Dimitri Stanojevic, Johannes de Fine Licht, Torsten Hoefler:
 Substream-Centric Maximum Matchings on FPGA Jan. 2020, In Proceedings of the ACM Trans. Reconfig. Technol. Syst Special Issue, Invited Paper
SPPEXA
[164] Alexandru Calotoiu, Marcin Copik, Torsten Hoefler, Marcus Ritter, Sergei Shudler, Felix Wolf:
 ExtraPeak: Advanced Automatic Performance Modeling for HPC Applications Springer. In Software for Exascale Computing - SPPEXA 2016-2019, pages 453–482, 2020,
BAMS
[165] Christoph Schär, Oliver Fuhrer, Andrea Arteaga, Nikolina Ban, Christophe Charpilloz, Salvatore Di Girolamo, Laureline Hentgen, Torsten Hoefler, Xavier Lapillonne, David Leutwyler, Katherine Osterried, Davide Panosetti, Stefan Rüdisühli, Linda Schlemmer, Thomas Schulthess, Michael Sprenger, Stefano Ubbiali, Heini Wernli:
 Kilometer-scale climate models: Prospects and challenges Bulletin of the American Meteorological Society. Vol 100, Nr. 12, American Meteorological Society, Dec. 2019, Early Online Release
ML4PS'19
[166] Peter Grönquist, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Luca Lavarini, Shigang Li, Torsten Hoefler:
 Predicting Weather Uncertainty with Deep Convnets In Machine Learning and the Physical Sciences Workshop at the 33rd Conference on Neural Information Processing Systems (NeurIPS), presented in Vancouver, BC, Canada, Dec. 2019,
SC19
[167] Cedric Renggli, Dan Alistarh, Mehdi Aghagolzadeh, Torsten Hoefler:
 SparCML: High-Performance Sparse Communication for Machine Learning In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, (acceptance rate: 22.7% (78/344))
SC19
[168] Tal Ben-Nun, Johannes de Fine Licht, Alexandros Nikolaos Ziogas, Timo Schneider, Torsten Hoefler:
 Stateful Dataflow Multigraphs: A Data-Centric Model for Performance Portability on Heterogeneous Architectures In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, (acceptance rate: 22.7% (78/344))
SC19
[169] Tiziano De Matteis, Johannes de Fine Licht, Jakub Beránek, Torsten Hoefler:
 Streaming Message Interface: High-Performance Distributed Memory Programming on Reconfigurable Hardware In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, (acceptance rate: 22.7% (78/344))
SC19
[170] Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Torsten Hoefler:
 Optimizing the Data Movement in Quantum Transport Simulations via Data-Centric Parallel Programming In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, (acceptance rate: 22.7% (78/344))
SC19
[171] Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Torsten Hoefler:
 A Data-Centric Approach to Extreme-Scale Ab initio Dissipative Quantum Transport Simulations In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, Won ACM Gordon Bell Prize
SC19
[172] Daniele De Sensi, Salvatore Di Girolamo, Torsten Hoefler:
 Mitigating Network Noise on Dragonfly Networks through Application-Aware Routing In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, (acceptance rate: 22.7% (78/344))
SC19
[173] Salvatore Di Girolamo, Konstantin Taranov, Andreas Kurth, Michael Schaffner, Timo Schneider, Jakub Beránek, Maciej Besta, Luca Benini, Duncan Roweth, Torsten Hoefler:
 Network-Accelerated Non-Contiguous Memory Transfers In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, (acceptance rate: 22.7% (78/344))
SC19
[174] Maciej Besta, Simon Weber, Lukas Gianinazzi, Robert Gerstenberger, Andrey Ivanov, Yishai Oltchik, Torsten Hoefler:
 Slim Graph: Practical Lossy Graph Compression for Approximate Graph Processing, Storage, and Analytics In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, (acceptance rate: 22.7% (78/344)) Best Paper Finalist, Best Student Paper Finalist
MLHPC
[175] Torsten Hoefler:
 HPC for ML and ML for HPC - Scalability, Communication, and Programming (Presentation) presented in Denver, CO, USA, Nov. 2019, Keynote talk at the International Machine Learning in High-Performance Computing (MLHPC'19 in conjunction with ACM/IEEE Supercomputing, SC19)
H2RC'19
[176] Johannes de Fine Licht, Torsten Hoefler:
 hlslib: Software Engineering for Hardware Design In Fifth International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC'19), presented in Denver, CO, United States, IEEE, Nov. 2019,
SC19
[177] Grzegorz Kwasniewski, Marko Kabić, Maciej Besta, Joost VandeVondele, Raffaele Solcà, Torsten Hoefler:
 Red-Blue Pebbling Revisited: Near Optimal Parallel Matrix-Matrix Multiplication In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Nov. 2019, (acceptance rate: 22.7% (78/344)) Best Paper Finalist, SC19 Best Student Paper (1/87)
PACT'19
[178] Tobias Gysi, Tobias Grosser, Torsten Hoefler:
 Absinthe: Learning an Analytical Performance Model to Fuse and Tile Stencil Codes in One Shot In Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques (PACT), presented in Seattle, WA, USA, IEEE, Sep. 2019,
PARCO
[179] Torsten Hoefler:
 Data-Centric Parallel Programming (Presentation) presented in Prague, Czech Republic, Sep. 2019, Keynote talk at the 18th International Parallel Computing conference (ParCo'19)
PPAM
[180] Torsten Hoefler:
 High-Performance Communication in Machine Learning (Presentation) presented in Bialystok, Poland, Sep. 2019, Keynote talk at the 13th International Conference on Parallel Processing and Applied Mathematics (PPAM'19)
arXiv
[181] Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry:
 Mix & match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency Aug. 2019,
IEEE TPDS
[182] Sergei Shudler, Yannick Berens, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf:
 Engineering Algorithms for Scalability through Continuous Validation of Performance Expectations IEEE Transactions on Parallel and Distributed Systems. Vol 30, Nr. 8, pages 1768-1785, IEEE, Aug. 2019,
ACM CSUR
[183] Tal Ben-Nun, Torsten Hoefler:
 Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis ACM Comput. Surv. Vol 52, Nr. 4, pages 65:1-65:43, ACM, ISSN: 0360-0300, Aug. 2019,
ICIAM'19
[184] Maciej Besta, Torsten Hoefler:
 Towards high-performance processing, storage, and analytics of extreme-scale graphs (Presentation) presented in Valencia, Spain, Jun. 2019, Invited talk at the 2019 International Congress on Industrial and Applied Mathematics (ICIAM'19)
PASC'19
[185] Felix Thaler, Stefan Moosbrugger, Carlos Osuna, Mauro Bianco, Hannes Vogt, Anton Afanasyev, Lukas Mosimann, Oliver Fuhrer, Thomas Schulthess, Torsten Hoefler:
 Porting the COSMO Weather Model to Intel KNL presented in Zurich, Switzerland, ACM, Jun. 2019, Accepted at the ACM Platform for Advanced Scientific Computing Conference (PASC19)
NRE'19
[186] Torsten Hoefler:
 Performance Reproducibility in HPC and Deep Learning (Presentation) presented in Frankfurt, Germany, Jun. 2019, Keynote talk at the Numerical Reproducibility at Exascale Workshop (NRE2019), ISC’19
ISC'19 ML
[187] Torsten Hoefler, Tal Ben-Nun:
 Optimizing and Benchmarking Large-Scale Deep Learning (Presentation) presented in Frankfurt, Germany, Jun. 2019, Invited talk at the Machine Learning day at the International Supercomputing Conference (ISC'19)
GG500
[188] Torsten Hoefler:
 The Green Graph500 List (June 2019) (Presentation) presented in Frankfurt, Germany, Jun. 2019, Presented at the Green Graph 500 BoF at the International Supercomputing Conference (ISC'19)
ISC'19
[189] Torsten Hoefler, Alexandros Nikolaos Ziogas, Tal Ben-Nun, Guillermo Indalecio Fernández, Timo Schneider, Mathieu Luisier, Johannes de Fine Licht:
 Data-Centric Parallel Programming (Presentation) presented in Frankfurt, Germany, Jun. 2019, Invited talk at the International Supercomputing Conference (ISC'19)
ICS'19
[190] Paul R. Eller, Torsten Hoefler, William Gropp:
 Using Performance Models to Understand Scalable Krylov Solver Performance at Scale for Structured Grid Problems In Proceedings of the 2019 ACM International Conference on Supercomputing (ICS'19), presented in Phoenix, AZ, ACM, Jun. 2019,
PLDI'19
[191] Tobias Gysi, Tobias Grosser, L. Brandner, Torsten Hoefler:
 A Fast Analytical Model of Fully Associative Caches In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, presented in Phoenix, AZ, USA, pages 816-829, ACM, ISBN: 978-1-4503-6712-7, Jun. 2019,
DAC'19
[192] Niels Gleinig, Frances Ann Hubis, Torsten Hoefler:
 Embedding Functions Into Reversible Circuits: A Probabilistic Approach to the Number of Lines In Proceedings of the 56th Annual Design Automation Conference, presented in Las Vegas, NV, USA, ACM, ISBN: 978-1-4503-6725-7/19/06, Jun. 2019,
IPDPS'19
[193] Salvatore Di Girolamo, P. Schmid, Thomas Schulthess, Torsten Hoefler:
 SimFS: A Simulation Data Virtualizing File System Interface In Proceedings of the 33rd IEEE International Parallel & Distributed Processing Symposium (IPDPS'19), presented in Rio de Janeiro, Brazil, IEEE, May 2019,
IPDPS'19
[194] Tal Ben-Nun, Maciej Besta, S. Huber, Alexandros Nikolaos Ziogas, D. Peter, Torsten Hoefler:
 A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning IEEE, May 2019, Accepted at the 33rd IEEE International Parallel & Distributed Processing Symposium (IPDPS'19)
AsHES
[195] Torsten Hoefler:
 Performance Portability with Data-Centric Parallel Programming (Presentation) presented in Rio de Janeiro, Brazil, May 2019, Keynote talk at the Ninth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES) (delayed online)
EMiT
[196] Torsten Hoefler:
 High-Performance Communication for Machine Learning (Presentation) presented in Huddersfield, UK, Apr. 2019, Keynote talk at the 5th Conference on Emerging Technologies – EMiT2019
HPCAC
[197] Torsten Hoefler:
 RDMA, Scalable MPI-3 RMA, and Next-Generation Post-RDMA Interconnects (Presentation) Apr. 2019, Best talk award winner at Swiss HPC Advisory Council Conference 2019
SCFE'19
[198] Torsten Hoefler:
 Extreme-Scale Graphs (Presentation) presented in Warsaw, Poland, Mar. 2019, Invited talk at Supercomputing Frontiers Europe 2019
PPoPP'19
[199] Martin Kuettler, Maksym Planeta, Jan Bierbaum, Carsten Weinhold, Hermann Haertig, Amnon Barak, Torsten Hoefler:
 Corrected Trees for Reliable Group Communication Feb. 2019, Accepted at The ACM Conference Principles and Practice of Parallel Programming 2019 (PPoPP'19) (acceptance rate: 19% (29/152))
FPGA'19
[200] Maciej Besta, Marc Fischer, Tal Ben-Nun, Johannes de Fine Licht, Torsten Hoefler:
 Substream-Centric Maximum Matchings on FPGA Feb. 2019, In Proceedings of the 27th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (acceptance rate: 23%) Best Paper Finalist (4/30)
arXiv
[201] Maciej Besta, Dimitri Stanojevic, Johannes de Fine Licht, Tal Ben-Nun, Torsten Hoefler:
 Graph Processing on FPGAs: Taxonomy, Survey, Challenges CoRR. Vol abs/1903.06697, Feb. 2019,
AHPC'19
[202] Torsten Hoefler:
 High-Performance Communication in Machine Learning (Presentation) presented in Grundlsee, Austria, Feb. 2019, Keynote at the Austrian HPC meeting 2019
ICL
[203] Torsten Hoefler:
 High-Performance Communication in Machine Learning (Presentation) presented in Knoxville, TN, Feb. 2019,
MB3
[204] Alexandr Nigay, Timo Schneider, Torsten Hoefler:
 TinyMPI tasking prototype Feb. 2019,
TU Darmstadt
[205] Torsten Hoefler:
 An HPC Systems Guy’s View of Quantum Computing (Presentation) presented in Darmstadt, Germany, Jan. 2019,
RWTH Aachen
[206] Torsten Hoefler:
 MPI Remote Memory Access Programming and Scientific Benchmarking of Parallel Codes (Presentation) presented in Aachen, Germany, Jan. 2019,
CiSE
[207] Thomas Schulthess, P. Bauer, Oliver Fuhrer, Torsten Hoefler, C. Schaer, N. Wedi:
 Reflecting on the goal and baseline for exascale computing: a roadmap based on weather and climate simulations Computing in Science and Engineering (CiSE). Vol 21, Nr. 1, IEEE Computer Society, ISSN: 1521-9615, Jan. 2019,
RWTH Aachen
[208] Torsten Hoefler:
 High-Performance Communication for Machine Learning (Presentation) presented in Aachen, Germany, Jan. 2019,
NIPS'18
[209] Tal Ben-Nun, Alice Shoshana Jakobovits, Torsten Hoefler:
 Neural Code Comprehension: A Learnable Representation of Code Semantics In Advances in Neural Information Processing Systems 31, presented in Montreal, Canada, pages 3589-3601, Curran Associates, Inc., Dec. 2018,
NIPS'18
[210] Dan Alistarh, Torsten Hoefler, Mikael Johansson, Sarit Khirirat, Nikola Konstantinov, Cedric Renggli:
 The Convergence of Sparsified Gradient Methods In Advances in Neural Information Processing Systems 31, presented in Montreal, Canada, Curran Associates, Inc., Dec. 2018,
PACT'18
[211] Maciej Besta, Dimitri Stanojevic, T. Zivic, J. Singh, M. Hoerold, Torsten Hoefler:
 Log(Graph): A Near-Optimal High-Performance Graph Representation presented in Limassol, Cyprus, ACM, Nov. 2018, Accepted at the 27th International Conference on Parallel Architectures and Compilation Techniques (PACT'18)
SC18
[212] Heng Lin, Xiaowei Zhu, Bowen Yu, Xiongchao Tang, Wei Xue, Wenguang Chen, Lufei Zhang, Torsten Hoefler, Xiaosong Ma, Xin Liu, Weimin Zheng, Jingfang Xu:
 ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC18), presented in Dallas, TX, USA, ACM, Nov. 2018, Gordon Bell Award Finalist
IPAM UCLA
[213] Torsten Hoefler:
 Twelve ways to fool the masses when reporting performance of deep learning workloads (Presentation) presented in Los Angeles, CA, Nov. 2018, Workshop III: HPC for Computationally and Data-Intensive Problems
IPAM UCLA
[214] Torsten Hoefler:
 High-Performance Communication for Machine Learning (Presentation) presented in Los Angeles, CA, Nov. 2018, Workshop III: HPC for Computationally and Data-Intensive Problems
SC18
[215] Torsten Hoefler:
 High Level Programming Languages for Quantum Computation (Presentation) presented in Dallas, TX, USA, Nov. 2018,
SC18
[216] Torsten Hoefler:
 RDMA, Scalable MPI-3 RMA, and Next-Generation Post-RDMA Interconnects (Presentation) presented in Dallas, TX, USA, Nov. 2018, Keynote at ExaMPI 2018 Workshop (in conjunction with SC18)
SC18
[217] Torsten Hoefler:
 Will FPGAs make it this time? (Presentation) presented in Dallas, TX, USA, Nov. 2018,
SC18
[218] Torsten Hoefler:
 Deep500: An HPC Deep Learning Benchmark and Competition (Presentation) presented in Dallas, TX, USA, Nov. 2018,
CACM
[219] Robert Gerstenberger, Maciej Besta, Torsten Hoefler:
 Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided In Communications of the ACM, ACM, Oct. 2018, Research Highlights
Cluster'18
[220] Alexandru Calotoiu, Alexander Graf, Torsten Hoefler, Daniel Lorenz, Sebastian Rinke, Felix Wolf:
 Lightweight Requirements Engineering for Exascale Co-design In IEEE International Conference on Cluster Computing, CLUSTER 2018, Belfast, UK, September 10-13, 2018, presented in Belfast, UK, IEEE, ISBN: 978-1-5386-8319-4, Sep. 2018, (acceptance rate: 28% (44/154))
Cluster'18
[221] Y. Oyama, Tal Ben-Nun, Torsten Hoefler, Satoshi Matsuoka:
 Accelerating Deep Learning Frameworks with Micro-batches In IEEE International Conference on Cluster Computing, CLUSTER 2018, Belfast, UK, September 10-13, 2018, presented in Belfast, UK, IEEE, ISBN: 978-1-5386-8319-4, Sep. 2018, (acceptance rate: 28% (44/154))
FacSum
[222] Torsten Hoefler:
 An HPC System's guy's view of Quantum Computing (Presentation) presented in Redmond, WA, Aug. 2018, Presentation at the Microsoft Faculty Summit 2018
Tsinghua
[223] Torsten Hoefler:
 Performance Modeling for Future Computing Technologies (Presentation) Jun. 2018, Invited talk at 60 years of CS @ Tsinghua celebration
GMD
[225] Oliver Fuhrer, T. Chadha, Torsten Hoefler, Grzegorz Kwasniewski, X. Lapillonne, D. Leutwyler, D. Luethi, Carlos Osuna, C. Schaer, Thomas Schulthess, Hannes Vogt:
 Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0 Geoscientific Model Development. Vol 11, Nr. 4, Copernicus Publications, May 2018,
EuroSys' 18
[226] Konstantin Taranov, Gustavo Alonso, Torsten Hoefler:
 Fast and strongly-consistent per-item resilience in key-value stores ISBN: 978-1-4503-5584-1/18/04, Apr. 2018, EuroSys '18: Thirteenth EuroSys Conference 2018, April 23--26, 2018, Porto, Portugal (acceptance rate: 16% (43/262))
HPCAC
[227] Torsten Hoefler:
 Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis (Presentation) Apr. 2018, Keynote at Swiss HPC Advisory Council Conference 2018
WAMS'18
[228] Maciej Besta, Erik Henriksson, Torsten Hoefler:
 Lowering Diameter Enables Cost-Effective and High-Performance Networks (Presentation) presented in Williamsburg, USA, Mar. 2018, Presentation at the 2018 Warehouse-scale Memory Systems (WAMS) Workshop
ASPLOS'18
[229] Maciej Besta, S. M. Hassan, S. Yalamanchili, R. Ausavarungnirun, Onur Mutlu, Torsten Hoefler:
 Slim NoC: A Low-Diameter On-Chip Network Topology for High Energy Efficiency and Scalability Mar. 2018, Accepted at the 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'18)
IEEE TPDS
[230] Shigang Li, Yunquan Zhang, Torsten Hoefler:
 Cache-Oblivious MPI All-to-All Communications Based on Morton Order IEEE Transactions on Parallel and Distributed Systems. Vol 29, Nr. 3, pages 542-555, IEEE, Mar. 2018,
SOS'18
[231] Torsten Hoefler:
 Performance Portability - An Oxymoron? (Presentation) presented in Kona, HI, USA, Mar. 2018, Invited talk at SOS'18 Workshop
Multicore @ Siemens
[232] Torsten Hoefler:
 Developing high-performance software, from modeling to programming (Presentation) presented in Nuremberg, Germany, Feb. 2018, Invited opening presentation at the Multicore@Siemens conference
VMCAI
[233] Cedric Baumann, Andrei Marian Dan, Yuri Meshman, Torsten Hoefler, Martin Vechev:
 Automatic Verification of RMA Programs via Abstraction Extrapolation Springer International Publishing, Feb. 2018,
HiPINEB @ HPCA'18
[234] Torsten Hoefler:
 The three L's in modern high-performance networking: low latency, low cost, low processing load (Presentation) presented in Vienna, Austria, Feb. 2018, Keynote at the HiPINEB workshop at HPCA'18
PPoPP'18
[235] Johannes de Fine Licht, M. Blott, Torsten Hoefler:
 Designing scalable FPGA architectures using high-level synthesis In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, presented in Vienna, Austria, pages 403--404, ACM, ISBN: 978-1-4503-4982-6, Feb. 2018,
ICDE'18
[236] Ingo Mueller, Andrea Arteaga, Torsten Hoefler, Gustavo Alonso:
 Reproducible Floating-Point Aggregation in RDBMSs Feb. 2018, In Proceedings of the 2018 IEEE 34th International Conference on Data Engineering
PPoPP'18
[237] Lukas Gianinazzi, Pavel Kalvoda, Alessandro De Palma, Maciej Besta, Torsten Hoefler:
 Communication-Avoiding Parallel Minimum Cuts and Connected Components In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, presented in Vienna, Austria, pages 219-232, ACM, ISBN: 978-1-4503-4982-6, Feb. 2018, (acceptance rate: 20% (28/138))
HPCDC
[238] Torsten Hoefler, Sabela Ramos, Carlos Osuna, Felix Thaler, S. Moosbrugger, Oliver Fuhrer:
 Capability Models for Manycore Memory Systems: A Case-Study with Xeon Phi KNL and the COSMO Weather Code (Presentation) presented in Denver, CO, Nov. 2017, Presentation at the Intel HPC Developer's Conference 2017
SC17
[239] Torsten Hoefler, Salvatore Di Girolamo, Konstantin Taranov, R. E. Grant, Ron Brightwell:
 sPIN: High-performance streaming Processing in the Network In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC17), Nov. 2017, (acceptance rate: 18% (61/327)) Best Paper Finalist at SC17 (5/61)
SC17
[240] Edgar Solomonik, Maciej Besta, F. Vella, Torsten Hoefler:
 Scaling Betweenness Centrality using Communication-Efficient Sparse Matrix Multiplication Nov. 2017, Accepted at The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'17) (acceptance rate: 18% (61/327))
IEEE TPDS
[241] Didem Unat, Anshu Dubey, Torsten Hoefler, John Shalf, Mark Abraham, Mauro Bianco, Bradford L. Chamberlain, Romain Cledat, H. Carter Edwards, Hal Finkel, Karl Fuerlinger, Frank Hannig, Emmanuel Jeannot, Amir Kamil, Jeff Keasler, Paul H J Kelly, Vitus Leung, Hatem Ltaief, Naoya Maruyama, Chris J. Newburn, Miquel Pericas:
 Trends in Data Locality Abstractions for HPC Systems IEEE Transactions on Parallel and Distributed Systems. Vol 28, Nr. 10, pages 3007-3020, IEEE, Oct. 2017,
Co-Design
[242] Torsten Hoefler, Sabela Ramos, Tal Ben-Nun:
 HPC Performance Optimization Advances at Extreme Scale (Presentation) presented in Hefei, China, Oct. 2017, Invited talk at the Co-Design workshop (HPC China 2017)
25 Years MPI
[243] Torsten Hoefler:
 A View on MPI's Recent Past, Present, and Future (Presentation) presented in Chicago, IL, Sep. 2017, Invited talk at 25 Years of MPI Symposium
VLDB'17
[244] C. Barthels, Timo Schneider, Ingo Mueller, Gustavo Alonso, Torsten Hoefler:
 Distributed Join Algorithms on Thousands of Cores Vol 10, Nr. 5, In Proc. VLDB Endow., presented in Munich, Germany, pages 517--528, VLDB Endowment, ISSN: 2150-8097, Aug. 2017,
HOTI'17
[245] Timo Schneider, J. Dinan, M. Flajslik, K. D. Underwood, Torsten Hoefler:
 Fast Networks and Slow Memories: A Mechanism for Mitigating Bandwidth Mismatches In Proceedings of the 25th Annual Symposium on High-Performance Interconnects (HOTI'17), Aug. 2017,
HOTI'17
[246] P. Yebenes, J. Escudero-Sahuquillo, P. J. Garcia, F. J. Quiles, Torsten Hoefler:
 Improving Non-Minimal and Adaptive Routing Algorithms in Slim Fly Networks In Proceedings of the 25th Annual Symposium on High-Performance Interconnects (HOTI'17), Aug. 2017, Best Student Paper at HOTI'17
ICCS'17
[247] Andrea Arteaga, Oliver Fuhrer, Torsten Hoefler, Thomas Schulthess:
 Model-Driven Choice of Numerical Methods for the Solution of the Linear Advection Equation In Proceedings of the International Conference on Computational Science (ICCS'17), presented in Zurich, Switzerland, Elsevier, Jun. 2017,
Uni Saarland
[248] Torsten Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Orlando, FL, Jun. 2017, Invited talk at IPDRM Workshop (IPDPS'17)
IPDRM
[249] Torsten Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Orlando, FL, Jun. 2017, Invited talk at IPDRM Workshop (IPDPS'17)
EMBRACE
[250] Torsten Hoefler:
 Scientific Benchmarking of Parallel Computing Systems (Presentation) presented in Orlando, FL, Jun. 2017, Keynote talk at EMBRACE Workshop (IPDPS'17)
HPDC'17
[251] Maciej Besta, M. Podstawski, L. Groner, Edgar Solomonik, Torsten Hoefler:
 To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations In Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'17), presented in Washington, DC, USA, ACM, Jun. 2017, (acceptance rate: 19%)
HPDC'17
[252] Marius Poke, Torsten Hoefler, C. W. Glass:
 AllConcur: Leaderless Concurrent Atomic Broadcast presented in Washington, DC, USA, ACM, Jun. 2017, (acceptance rate: 19%)
SPAA'17
[253] Edgar Solomonik, Grey Ballard, James Demmel, Torsten Hoefler:
 A Communication-Avoiding Parallel Algorithm for the Symmetric Eigenvalue Problem Nr. 11, In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'17), presented in Washington, DC, USA, pages 111--121, ACM, ISBN: 978-1-4503-4593-4, Jun. 2017,
IPDPS'17
[254] Sabela Ramos, Torsten Hoefler:
 Capability Models for Manycore Memory Systems: A Case-Study with Xeon Phi KNL In Proceedings of the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS'17), presented in Orlando, FL, USA, IEEE, May 2017, (acceptance rate: 22%, 116/516)
CIAC'17
[255] K. T. Foerster, L. Groner, Torsten Hoefler, M. Koenig, S. Schmid, R. Wattenhofer:
 Multi-agent Pathfinding with n Agents on Graphs with n Vertices: Combinatorial Classification and Tight Algorithmic Bounds In Algorithms and Complexity - 10th International Conference, CIAC 2017, Athens, Greece, May 24-26, 2017, Proceedings, presented in Athens, Greece, May 2017,
IPDPS'17
[256] Torsten Hoefler, Amnon Barak, A. Shiloh, Z. Drezner:
 Corrected Gossip Algorithms for Fast Reliable Broadcast on Unreliable Systems In Proceedings of the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS'17), presented in Orlando, FL, USA, IEEE, May 2017, (acceptance rate: 22%, 116/516)
IPDPS'17
[257] Maciej Besta, F. Marending, Edgar Solomonik, Torsten Hoefler:
 SlimSell: A Vectorized Graph Representation for Breadth-First Search In Proceedings of the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS'17), presented in Orlando, FL, USA, IEEE, May 2017, (acceptance rate: 22%, 116/516)
IPDPS'17
[258] T. Wicky, Edgar Solomonik, Torsten Hoefler:
 Communication-Avoiding Parallel Algorithms for Solving Triangular Systems of Linear Equations In Proceedings of the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS'17), presented in Orlando, FL, USA, IEEE, May 2017, (acceptance rate: 22%, 116/516)
IPDPS'17
[259] Salvatore Di Girolamo, F. Vella, Torsten Hoefler:
 Transparent Caching for RMA Systems In Proceedings of the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS'17), presented in Orlando, FL, USA, IEEE, May 2017, (acceptance rate: 22%, 116/516)
TCDE
[261] C. Barthels, Gustavo Alonso, Torsten Hoefler:
 Designing Databases for Future High-Performance Networks IEEE Technical Committee on Data Engineering. Vol 40, Nr. 1, IEEE, Mar. 2017,
HiPINEB
[262] Torsten Hoefler:
 HiPINEB Panel (highly exaggerated) (Presentation) presented in Austin, TX, USA, Feb. 2017,
FAU
[263] Torsten Hoefler:
 Accelerating weather and climate simulations on heterogeneous architectures (Presentation) presented in Erlangen, Germany, Feb. 2017, Colloquium at the Friedrich-Alexander-Universitaet Erlangen-Nuernberg
UT Austin
[264] Torsten Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Austin, TX, Feb. 2017, Seminar at University of Texas at Austin
ARM
[265] Torsten Hoefler:
 High-Performance Distributed RMA Locks (Presentation) presented in Austin, TX, Feb. 2017, Seminar at ARM Research
PPoPP'17
[266] Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Felix Wolf:
 Isoefficiency in Practice: Configuring and Understanding the Performance of Task-based Applications In Proceedings of the 22nd ACM SIGPLAN symposium on Principles and practice of parallel programming, presented in College Station, TX, ACM, Feb. 2017, (acceptance rate: 21%, 29/139)
ICT/CAS
[267] Torsten Hoefler:
 Accelerating weather and climate simulations on heterogeneous architectures (Presentation) presented in Beijing, China, Jan. 2017, Distinguished colloquium at the Institute of Computing Technology at the Chinese Academy of Sciences, Beijing, China
ANL
[268] Torsten Hoefler:
 High-Performance Distributed RMA Locks (Presentation) presented in Champaign, IL, Jan. 2017, Seminar at University of Illinois at Urbana-Champaign/NCSA
UIUC
[269] Torsten Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Champaign, IL, Jan. 2017, Seminar at University of Illinois at Urbana-Champaign/NCSA
Tsinghua
[270] Torsten Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Beijing, China, Jan. 2017, Seminar at Tsinghua University, Beijing, China
SC16
[271] M. Martinasso, Grzegorz Kwasniewski, S. R. Alam, Thomas Schulthess, Torsten Hoefler:
 A PCIe Congestion-Aware Performance Model for Densely Populated Accelerator Servers In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 63:1--63:11, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))
SC16
[272] W. Tang, B. Wang, S. Ethier, Grzegorz Kwasniewski, Torsten Hoefler, K. Z. Ibrahim, K. Madduri, S. Williams, Leonid Oliker, C. Rosales-Fernandez, T. Williams:
 Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 43:1--43:12, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))
LLVM-HPC'16
[273] Torsten Hoefler:
 Polly-ACC: Transparent Compilation to Heterogeneous Hardware. (Presentation) presented in Salt Lake City, UT, Nov. 2016, Invited talk at the LLVM-HPC workshop and TiTech Booth at SC16
SC16
[274] Jens Domke, Torsten Hoefler:
 Scheduling-Aware Routing for Supercomputers Nov. 2016, Accepted at The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'16) (acceptance rate: 18% (82/446))
SC16
[275] Tobias Gysi, J. Baer, Torsten Hoefler:
 dCUDA: Hardware Supported Overlap of Computation and Communication In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 52:1--52:12, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))
OOPSLA'16
[276] Andrei Marian Dan, Patrick Lam, Torsten Hoefler, Martin Vechev:
 Modeling and Analysis of Remote Memory Access Programming In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, presented in Amsterdam, Netherlands, pages 129--144, ACM, ISBN: 978-1-4503-4444-9, Nov. 2016, Outstanding Paper Award at OOPSLA'16 (4/52)
IEEE TPDS
[277] Sabela Ramos, Torsten Hoefler:
 Cache Line Aware Algorithm Design for Cache-Coherent Architectures IEEE Transactions on Parallel and Distributed Systems. Vol 27, Nr. 10, pages 2824-2837, IEEE, Oct. 2016,
CCDSC'16
[278] Torsten Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Lyon, France, Oct. 2016, Invited talk at the CCDSC meeting
CoDesign'16
[279] Torsten Hoefler:
 Accelerating weather and climate simulations on heterogeneous architectures (Presentation) presented in Xi'an, China, Oct. 2016, Invited talk at the CoDesign Meeting at HPC China 2016
HPC China'16
[280] Torsten Hoefler:
 Theory and Practice in HPC: Modeling, Programming, and Networking (Presentation) presented in Xi'an, China, Oct. 2016, Keynote talk at HPC China 2016
Cluster'16
[281] Alexandru Calotoiu, D. Beckingsale, C. W. Earl, Torsten Hoefler, I. Karlin, M. Schulz, Felix Wolf:
 Fast Multi-Parameter Performance Modeling Oct. 2016, Accepted at IEEE International Conference on Cluster Computing (Cluster'16) (acceptance rate: 24% (39/162))
Wuxi'16
[282] Torsten Hoefler:
 High-Performance Distributed RMA Locks (Presentation) presented in Wuxi, China, Sep. 2016, Seminar talk at Intl. Workshop on High-Performance Systems
Guangzhou'16
[283] Torsten Hoefler:
 MODESTO: Data-centric Analytic Optimization of Complex Stencil Programs on Heterogeneous Architectures (Presentation) presented in Guangzhou, China, Sep. 2016, Seminar talk at Intl. Workshop on High-Performance Systems
Cluster'16
[284] Torsten Hoefler:
 Theory and Practice in HPC: Modeling, Programming, and Networking (Presentation) presented in Taipei, Taiwan, Sep. 2016, Opening keynote talk at IEEE Cluster 2016
HP
[285] Torsten Hoefler:
 Towards scalable RDMA locking on a NIC (Presentation) presented in Palo Alto, CA, USA, Aug. 2016,
HOTI'16
[286] Timo Schneider, O. Bibartiu, Torsten Hoefler:
 Ensuring Deadlock-Freedom in Low-Diameter InfiniBand Networks In Proceedings of the 24th Annual Symposium on High-Performance Interconnects (HOTI'16), Aug. 2016, Best Student Paper at HOTI'16
UTK
[287] Torsten Hoefler:
 Scientific Benchmarking of Parallel Computing Systems (Presentation) presented in Knoxville, TN, USA, Aug. 2016,
HotI'16
[288] Torsten Hoefler:
 Network topologies for large-scale compute centers: It's the diameter, stupid! (Presentation) presented in San Jose, CA, USA, Aug. 2016, Invited talk at the IEEE Hot Interconnects 2016
IEEE MICRO
[289] Salvatore Di Girolamo, P. Jolivet, K. D. Underwood, Torsten Hoefler:
 Exploiting Offload Enabled Network Interfaces IEEE MICRO. Vol 36, Nr. 4, IEEE, Jul. 2016,
ICS'16
[290] Tobias Grosser, Torsten Hoefler:
 Polly-ACC: Transparent compilation to heterogeneous hardware In Proceedings of the 30th International Conference on Supercomputing (ICS'16), Jun. 2016, (acceptance rate: 24% (43/178))
HPDC'16
[291] Jens Domke, Torsten Hoefler, Satoshi Matsuoka:
 Routing on the Dependency Graph: A New Approach to Deadlock-Free High-Performance Routing In Proceedings of the 25th Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Jun. 2016, (acceptance rate: 16% (20/129))
HPDC'16
[292] P. Schmid, Maciej Besta, Torsten Hoefler:
 High-Performance Distributed RMA Locks In Proceedings of the 25th Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Jun. 2016, (acceptance rate: 16% (20/129)) Karsten Schwan Best Paper Award at HPDC'16 (1/20)
Cetraro'16
[293] Torsten Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Cetraro, Italy, Jun. 2016, Invited talk at the Cetraro HPC conference
PASC'16
[294] Torsten Hoefler:
 Selecting Technical Papers for an Interdisciplinary Conference: The PASC Review Process In Proceedings of the 3rd Platform of Advanced Scientific Computing Conference (PASC'16), Jun. 2016,
ISC'16
[295] Torsten Hoefler:
 An Overview of Static & Dynamic Techniques for Automatic Performance Modeling (Presentation) presented in Frankfurt, Germany, Jun. 2016, Invited talk at International Supercomputing Conference
ISC'16
[296] Torsten Hoefler:
 The Eighth Green Graph500 (Presentation) presented in Frankfurt, Germany, Jun. 2016,
Technion'16
[297] Torsten Hoefler:
 Progress in automatic GPU compilation and why you want to run MPI on your GPU. (Presentation) presented in Haifa, Israel, Jun. 2016, Seminar talk at Israel Institute of Technology (Technion)
Salishan
[298] Torsten Hoefler:
 Active RDMA - new tricks for an old dog (Presentation) presented in Gleneden Beach, OR, USA, Apr. 2016, Invited talk at Salishan Meeting
HLRS
[299] Torsten Hoefler:
 Scientific Benchmarking of Parallel Computing Systems (Presentation) presented in Stuttgart, Germany, Apr. 2016,
IJHPCA
[300] P. M. Widener, S. Levy, K. B. Ferreira, Torsten Hoefler:
 On noise and the performance benefit of nonblocking collectives The International Journal of High Performance Computing Applications. Vol 30, Nr. 1, pages 121-133, Sage, ISSN: 1094-3420, Jan. 2016, accepted for publication on Nov. 2nd 2015
SC15
[301] G. Kathareios, C. Minkenberg, B. Prisacari, G. Rodriguez, Torsten Hoefler:
 Cost-Effective Diameter-Two Topologies: Analysis and Evaluation presented in Austin, TX, USA, ACM, ISBN: 978-1-4503-3723-6, Nov. 2015, In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC15) (acceptance rate: 22%, 79/358)
SC15
[302] Torsten Hoefler, Roberto Belli:
 Scientific Benchmarking of Parallel Computing Systems presented in Austin, TX, USA, pages 73:1--73:12, ACM, ISBN: 978-1-4503-3723-6, Nov. 2015, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC15) (acceptance rate: 22%, 79/358)
Tsinghua
[303] Torsten Hoefler:
 Remote Memory Access Programming: Faster Parallel Computing Without Messages (Presentation) presented in Tsinghua University, Beijing, China, Nov. 2015,
SC15
[304] Torsten Hoefler:
 The Seventh Green Graph500 (Presentation) presented in Austin, TX, USA, Nov. 2015,
SC15
[305] Torsten Hoefler:
 Performance Reproducibility Birds of a Feather (Presentation) presented in Austin, TX, USA, Nov. 2015,
CoDesign
[306] Torsten Hoefler:
 Automatic Performance Models for the Masses: Static and dynamic techniques for application performance modeling (Presentation) presented in CoDesign Workshop, Wuxi, China, Nov. 2015,
PACT'15
[307] H. Schweizer, Maciej Besta, Torsten Hoefler:
 Evaluating the Cost of Atomic Operations on Modern Architectures presented in San Francisco, CA, USA, ACM, Oct. 2015, Accepted at the 24th International Conference on Parallel Architectures and Compilation (PACT'15) (acceptance rate: 21%, 38/179)
PACT'15
[308] A. Bhattacharyya, Grzegorz Kwasniewski, Torsten Hoefler:
 Using Compiler Techniques to Improve Automatic Performance Modeling presented in San Francisco, CA, USA, ACM, Oct. 2015, Accepted at the 24th International Conference on Parallel Architectures and Compilation (PACT'15) (acceptance rate: 21%, 38/179)
HOTI'15
[309] Salvatore Di Girolamo, P. Jolivet, K. D. Underwood, Torsten Hoefler:
 Exploiting Offload Enabled Network Interfaces In Proceedings of the 23rd Annual Symposium on High-Performance Interconnects (HOTI'15), presented in Oracle Santa Clara Campus, CA, USA, IEEE, Aug. 2015, Best Student Paper at HOTI'15
ISC'15
[310] Torsten Hoefler:
 The Sixth Green Graph500 List (Presentation) presented in Frankfurt, Germany, Jul. 2015,
UChicago
[311] Torsten Hoefler:
 Towards Remote Memory Access Programming for Data Analytics (Presentation) presented in Chicago, IL, USA, Jul. 2015,
ICS'15
[313] Tobias Gysi, Tobias Grosser, Torsten Hoefler:
 MODESTO: Data-centric Analytic Optimization of Complex Stencil Programs on Heterogeneous Architectures In Proceedings of the 29th International Conference on Supercomputing (ICS'15), presented in Newport Beach, CA, USA, pages 177--186, ACM, ISBN: 978-1-4503-3559-1, Jun. 2015, (acceptance rate: 25% (40/160))
ICS'15
[314] Maciej Besta, Torsten Hoefler:
 Active Access: A Mechanism for High-Performance Distributed Data-Centric Computations In Proceedings of the 29th International Conference on Supercomputing (ICS'15), presented in Newport Beach, CA, USA, pages 155--164, ACM, ISBN: 978-1-4503-3559-1, Jun. 2015, (acceptance rate: 25% (40/160))
HPDC'15
[315] Maciej Besta, Torsten Hoefler:
 Accelerating Irregular Computations with Hardware Transactional Memory and Active Messages In Proceedings of the 24th Symposium on High-Performance Parallel and Distributed Computing (HPDC'15), presented in Portland, OR, USA, pages 161--172, ACM, ISBN: 978-1-4503-3550-8, Jun. 2015, (acceptance rate: 16% (19/116)) Best Paper at HPDC'15 (1/19)
HPDC'15
[316] Sabela Ramos, Torsten Hoefler:
 Cache Line Aware Optimizations for ccNUMA Systems In Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'15) (short paper), presented in Portland, OR, USA, pages 85--88, ACM, ISBN: 978-1-4503-3550-8, Jun. 2015,
LBNL
[317] Torsten Hoefler:
 Towards Remote Memory Access Programming for Data Analytics (Presentation) presented in Berkeley, CA, USA, Jun. 2015,
UCSD
[318] Torsten Hoefler:
 Remote Memory Access Programming: Faster Parallel Computing Without Messages (Presentation) presented in San Diego, CA, USA, Jun. 2015,
HP
[319] Torsten Hoefler:
 Efficient networking and programming of large-scale computing systems (Presentation) presented in Palo Alto, CA, USA, Jun. 2015,
ICS'15
[320] Sergei Shudler, Alexandru Calotoiu, Torsten Hoefler, Alexandre Strube, Felix Wolf:
 Exascaling Your Library: Will Your Implementation Meet Your Expectations? In Proceedings of the 29th International Conference on Supercomputing (ICS'15), presented in Newport Beach, CA, USA, pages 161--175, ACM, ISBN: 978-1-4503-3559-1, Jun. 2015, (acceptance rate: 25% (40/160))
GATech
[321] Torsten Hoefler:
 Remote Memory Access Programming: Faster Parallel Computing Without Messages (Presentation) presented in Atlanta, GA, USA, Jun. 2015,
HPDC'15
[323] Marius Poke, Torsten Hoefler:
 DARE: High-Performance State Machine Replication on RDMA Networks In Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'15), presented in Portland, OR, USA, pages 107--118, ACM, ISBN: 978-1-4503-3550-8, Jun. 2015, (acceptance rate: 16% (19/116))
IPDPS'15
[324] Roberto Belli, Torsten Hoefler:
 Notified Access: Extending Remote Memory Access Programming Models for Producer-Consumer Synchronization In Proceedings of the 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS'15), presented in Hyderabad, India, IEEE, May 2015, (acceptance rate: 21.8%, 108/496) Best Paper at IPDPS'15 (4/108)
HotOS XV
[325] Torsten Hoefler, R. Ross, T. Roscoe:
 Distributing the Data Plane for Remote Storage Access presented in Kartause Ittingen, Switzerland, USENIX, May 2015, Proceedings of the 15th Workshop on Hot Topics in Operating Systems (acceptance rate: 32% (29/90))
Simula
[326] Torsten Hoefler, Jens Domke:
 Fail-in-Place Network Design presented in Oslo, Norway, May 2015,
HIPS/LSPP
[327] Torsten Hoefler:
 How fast will your application go? Static and dynamic techniques for application performance modeling. (Presentation) presented in Hyderabad, India, May 2015, Keynote talk at HIPS'15/LSPP'15 in conjunction with IPDPS'15
CFI'15
[329] T. Lee, C. Pappas, C. Basescu, J. Han, Torsten Hoefler, A. Perrig:
 Source-Based Path Selection: The Data Plane Perspective In Proceedings of the 10th International Conference on Future Internet, presented in Seoul, Republic of Korea, pages 41--45, ACM, ISBN: 978-1-4503-3564-5, May 2015,
ACM TOPC
[330] Torsten Hoefler, J. Dinan, Rajeev Thakur, Brian Barrett, P. Balaji, William Gropp, K. Underwood:
 Remote Memory Access Programming in MPI-3 ACM Transactions on Parallel Computing (TOPC). ACM, Jan. 2015, accepted for publication on Dec. 4th
FFMK
[331] Torsten Hoefler:
 Resilience Overheads at Scale and Scalability (Presentation) presented in Dresden, Germany, Dec. 2014,
Adv MPI
[332] William Gropp, Torsten Hoefler, Rajeev Thakur, E. Lusk:
 Using Advanced MPI: Modern Features of the Message-Passing Interface presented in Cambridge, MA, MIT Press, ISBN: 978-0262527637, Nov. 2014,
SC14
[333] Jens Domke, Torsten Hoefler, Satoshi Matsuoka:
 Fail-in-Place Network Design: Interaction between Topology, Routing Algorithm and Failures presented in New Orleans, LA, USA, Nov. 2014, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC14) (acceptance rate: 21%, 82/394)
SC14
[334] K. B. Ferreira, P. Widener, S. Levy, D. Arnold, Torsten Hoefler:
 Understanding the Effects of Communication and Coordination on Checkpointing at Scale presented in New Orleans, LA, USA, Nov. 2014, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC14) (acceptance rate: 21%, 82/394)
SC14
[335] Maciej Besta, Torsten Hoefler:
 Slim Fly: A Cost Effective Low-Diameter Network Topology presented in New Orleans, LA, USA, Nov. 2014, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC14) (acceptance rate: 21%, 82/394) SC14 Best Student Paper (1/82)
SC14
[336] Torsten Hoefler:
 LogGOPSim - Simple and Fast Large-Scale Simulations (Presentation) presented in New Orleans, Louisiana, USA, Nov. 2014,
SC14
[337] Torsten Hoefler:
 The Fourth Green Graph500 List (Presentation) presented in New Orleans, Louisiana, USA, Nov. 2014,
SC14
[338] Torsten Hoefler:
 IA3 Panel: HPC vs. Irregular Applications (Presentation) presented in New Orleans, Louisiana, USA, Nov. 2014,
SC14
[339] Torsten Hoefler:
 A case for runtime recompilation in HPC (Presentation) presented in New Orleans, Louisiana, USA, Nov. 2014, The LLVM Compiler Infrastructure in HPC, Keynote Presentation
SC14
[340] Torsten Hoefler:
 What about MPI + LLVM? (Presentation) presented in New Orleans, Louisiana, USA, Nov. 2014,
JSFI
[341] Torsten Hoefler, D. Moor:
 Energy, Memory, and Runtime Tradeoffs for Implementing Collective Communication Operations Journal of Supercomputing Frontiers and Innovations. Vol 1, Nr. 2, pages 58--75, SuperFri Open Journal, Oct. 2014,
EuroMPI'14
[342] P. Widener, K. Ferreira, S. Levy, Torsten Hoefler:
 Exploring the effect of noise on the performance benefit of nonblocking allreduce In Proceedings of the 21st European MPI Users' Group Meeting, presented in Kyoto, Japan, pages 77:77--77:82, ACM, ISBN: 978-1-4503-2875-3, Sep. 2014, Invited to a journal special issue on top picks from EuroMPI'14.
PACT'14
[343] A. Bhattacharyya, Torsten Hoefler:
 PEMOGEN: Automatic Adaptive Performance Modeling During Program Runtime In Proceedings of the 23rd International Conference on Parallel Architectures and Compilation (PACT'14), presented in Edmonton, Alberta, Canada, pages 393-404, ACM, ISBN: 978-1-4503-2809-8, Aug. 2014,
MSU
[344] Torsten Hoefler:
 Remote Memory Access Programming - Tools and Fault Tolerance (Presentation) presented in Moscow, Russia, Jul. 2014,
HPDC'14
[345] B. Prisacari, G. Rodriguez, P. Heidelberger, D. Chen, C. Minkenberg, Torsten Hoefler:
 Efficient Task Placement and Routing in Dragonfly Networks In Proceedings of the 23rd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'14), presented in Vancouver, Canada, ACM, Jun. 2014, (acceptance rate: 16%, 21/130)
ISC'14
[346] Torsten Hoefler:
 Using Simulation to Evaluate the Performance of Resilience Strategies at Scale (Presentation) In ISC workshop on International Cooperation, presented in Leipzig, Germany, Jun. 2014,
ISC'14
[347] Torsten Hoefler:
 The Green Graph500 List (Presentation) presented in Leipzig, Germany, Jun. 2014,
SPAA'14
[348] Torsten Hoefler, Grzegorz Kwasniewski:
 Automatic Complexity Analysis of Explicitly Parallel Programs In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'14), presented in Prague, Czech Republic, ACM, Jun. 2014, (acceptance rate: 25%, 30/122)
HPDC'14
[349] Maciej Besta, Torsten Hoefler:
 Fault Tolerance for Remote Memory Access Programming Models In Proceedings of the 23rd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'14), presented in Vancouver, Canada, ACM, Jun. 2014, (acceptance rate: 16%, 21/130) Best Paper Nominee at HPDC'14 (3/21)
Computing
[350] Timo Schneider, Robert Gerstenberger, Torsten Hoefler:
 Application-oriented ping-pong benchmarking: how to assess the real communication overheads Journal of Computing. Vol 96, Nr. 4, pages 279-292, Springer Vienna, ISSN: 0010-485X, Apr. 2014, Special issue on top picks from EuroMPI'12.
IPDPS'14
[351] Andrea Arteaga, Oliver Fuhrer, Torsten Hoefler:
 Designing Bit-Reproducible Portable High-Performance Applications In Proceedings of the 28th IEEE International Parallel and Distributed Processing Symposium (IPDPS), presented in Phoenix, AZ, USA, IEEE Computer Society, Apr. 2014, (acceptance rate: 21.1%, 114/541)
PADAL
[352] Didem Unat, John Shalf, Torsten Hoefler, Thomas Schulthess, Anshu Dubey (Editors), Maciej Besta, and others:
 Programming Abstractions for Data Locality Technical Report. presented in Lugano, Switzerland, Apr. 2014,
Cluster Computing
[353] Shigang Li, Torsten Hoefler, C. Hu, Marc Snir:
 Improved MPI collectives for MPI processes in shared address spaces Journal of Cluster Computing. pages 1-17, Springer US, ISSN: 1386-7857, Mar. 2014,
Euro-Par'14
[354] Felix Wolf, Christian Bischof, Torsten Hoefler, Bernd Mohr, Gabriel Wittum, Alexandru Calotoiu, Christian Iwainsky, Alexandre Strube, Andreas Vogel:
 Catwalk: A Quick Development Path for Performance Models Springer. In Euro-Par 2014: Parallel Processing Workshops, pages 589-600, 2014,
ACM TACO
[355] B. Prisacari, G. Rodriguez, C. Minkenberg, Torsten Hoefler:
 Fast Pattern-Specific Routing for Fat Tree Networks ACM Transactions on Architecture and Code Optimization. Vol 10, Nr. 4, presented in New York, NY, USA, pages 36:1--36:25, ACM, ISSN: 1544-3566, Dec. 2013, (acceptance rate: 24% (2011))
SC13
[356] Robert Gerstenberger, Maciej Besta, Torsten Hoefler:
 Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, presented in Denver, Colorado, USA, pages 53:1--53:12, ACM, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457) Best Student Paper Finalist (8/92) and SC13 Best Paper (1/92)
SC13
[357] Alexandru Calotoiu, Torsten Hoefler, Marius Poke, Felix Wolf:
 Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), presented in Denver, Colorado, USA, pages 45:1--45:12, ACM, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457)
SC13
[358] Torsten Hoefler:
 The Green Graph500 List (Presentation) presented in Denver, Colorado, Nov. 2013,
SC13
[359] A. Friedley, G. Bronevetsky, Andrew Lumsdaine, Torsten Hoefler:
 Hybrid MPI: Efficient Message Passing for Multi-core Systems In IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), presented in Denver, Colorado, USA, pages 18:1--18:11, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457)
PMBS'13
[360] S. Levy, B. Topp, K. Ferreira, D. Arnold, Torsten Hoefler, P. Widener:
 Using Simulation to Evaluate the Performance of Resilience Strategies at Scale presented in Denver, CO, USA, Nov. 2013, Proceedings of the 4th International Workshop in Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS13)
ExaMPI'13
[361] Torsten Hoefler:
 MPI Beyond 3.0 and Towards Larger-Scale Computing (Presentation) presented in Denver, CO, USA, Nov. 2013, Keynote at ExaMPI 2013 Workshop (in conjunction with SC13)
ICPP'13
[362] Timo Schneider, Torsten Hoefler, R. Grant, Brian Barrett, Ron Brightwell:
 Protocols for Fully Offloaded Collective Operations on Accelerated Network Adapters In Parallel Processing (ICPP), 2013 42nd International Conference on, presented in Lyon, France, pages 593-602, ISSN: 0190-3918, Oct. 2013,
LCPC'13
[363] Timo Schneider, Robert Gerstenberger, Torsten Hoefler:
 Compiler Optimizations for Non-Contiguous Remote Data Movement presented in Santa Clara, CA, USA, Sep. 2013, Proceedings of the 26th International Workshop on Languages and Compilers for Parallel Computing
EuroMPI'13
[364] Timo Schneider, F. Kjolstad, Torsten Hoefler:
 MPI Datatype Processing using Runtime Compilation In Proceedings of the 20th European MPI Users' Group Meeting, presented in Madrid, Spain, pages 19--24, ACM, ISBN: 978-1-4503-1903-4, Sep. 2013, Best Paper Award at EuroMPI'13 (1/25)
ICS'13
[365] B. Prisacari, G. Rodriguez, C. Minkenberg and Torsten Hoefler:
 Bandwidth-optimal All-to-all Exchanges in Fat Tree Networks In Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, presented in Eugene, OR, USA, pages 139--148, ACM, ISBN: 978-1-4503-2130-3, Jun. 2013, (acceptance rate: 21%, 41/198)
ISC'13
[366] Torsten Hoefler:
 The Green Graph500 List (Presentation) presented in Leipzig, Germany, Jun. 2013,
HPDC'13
[367] Shigang Li, Torsten Hoefler and Marc Snir:
 NUMA-Aware Shared Memory Collective Communication for MPI In Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, presented in New York City, NY, USA, pages 85--96, ACM, ISBN: 978-1-4503-1910-2, Jun. 2013, (acceptance rate: 15%, 20/131) Nominated for Best Paper Award at HPDC'13 (3/20)
HPDC'13
[368] Sabela Ramos and Torsten Hoefler:
 Modeling Communication in Cache-Coherent SMP Systems - A Case-Study with Xeon Phi In Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, presented in New York City, NY, USA, pages 97--108, ACM, ISBN: 978-1-4503-1910-2, Jun. 2013, (acceptance rate: 15%, 20/131)
Computing
[369] Torsten Hoefler, J. Dinan, D. Buntinas, P. Balaji, Brian Barrett, Ron Brightwell, William Gropp, V. Kale and Rajeev Thakur:
 MPI + MPI: a new hybrid approach to parallel programming with MPI plus shared memory Journal of Computing. Springer, May 2013, doi: 10.1007/s00607-013-0324-2
EASC'13
[370] Torsten Hoefler:
 Application-Centric Benchmarking and Modeling for Co-Design (Presentation) presented in Edinburgh, Great Britain, Apr. 2013, Presented at the Exascale Applications and Software Conference (EASC'13)
TR
[371] Sabela Ramos and Torsten Hoefler:
 Modelling Communications in Cache Coherent Systems Technical Report. SPCL, ETH Zurich. presented in Zurich, Switzerland, Feb. 2013,
PPoPP'13
[372] A. Friedley, Torsten Hoefler, G. Bronevetsky, Andrew Lumsdaine:
 Ownership Passing: Efficient Distributed Memory Programming on Multi-core Systems In Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, presented in Shenzhen, China, pages 177--186, ACM, ISBN: 978-1-4503-1922-5, Feb. 2013, (acceptance rate: 18%, 26/146)
SC12
[373] Torsten Hoefler, Timo Schneider:
 Optimization Principles for Collective Neighborhood Communications In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, presented in Salt Lake City, Utah, USA, pages 98:1--98:10, IEEE Computer Society Press, ISBN: 978-1-4673-0804-5, Nov. 2012, (acceptance rate: 21%, 100/472)
MPI-3.0
[374] Message Passing Interface Forum:
 MPI: A Message-Passing Interface Standard Version 3.0 Sep. 2012, Chapter author for Collective Communication, Process Topologies, and One Sided Communications
Cluster'12
[375] Simone Pellegrini, Torsten Hoefler, T. Fahringer:
 On the Effects of CPU Caches on MPI Point-to-Point Communications In Proceedings of the 2012 IEEE International Conference on Cluster Computing, presented in Beijing, China, pages 495--503, IEEE Computer Society, ISBN: 978-0-7695-4807-4, Sep. 2012, (acceptance rate: 28.9%, 58/200)
PACT'12
[376] Torsten Hoefler, Timo Schneider:
 Runtime Detection and Optimization of Collective Communication Patterns In Proceedings of the 21st international conference on Parallel Architectures and Compilation Techniques (PACT), presented in Minneapolis, MN, USA, pages 263--272, ACM, ISBN: 978-1-4503-1182-3, Sep. 2012, (acceptance rate: 18.9%, 39/207)
EuroMPI'12
[377] Torsten Hoefler, J. Dinan, D. Buntinas, P. Balaji, Brian Barrett, Ron Brightwell, William Gropp, V. Kale, Rajeev Thakur:
 Leveraging MPI's One-Sided Communication Interface for Shared-Memory Programming Vol 7490, In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, Springer, ISBN: 978-3-642-33517-4, Sep. 2012, Invited to journal special issue on top picks from EuroMPI'12.
EuroMPI'12
[378] Simone Pellegrini, Torsten Hoefler, T. Fahringer:
 Exact Dependence Analysis for Increased Communication Overlap In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, Springer, ISBN: 978-3-642-33517-4, Sep. 2012,
EuroMPI'12
[379] Timo Schneider, Robert Gerstenberger, Torsten Hoefler:
 Micro-Applications for Communication Data Access Patterns and MPI Datatypes Vol 7490, In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, pages 121-131, Springer, ISBN: 978-3-642-33517-4, Sep. 2012, Invited to a journal special issue on top picks from EuroMPI'12.
MCC'12
[380] Torsten Hoefler:
 MPI-3.0: A Response to New Challenges in Hardware and Software (Presentation) presented in Stuttgart, Germany, Sep. 2012, Keynote at Multicore Challenge 2012
ISC'12
[381] Torsten Hoefler:
 The Green Graph500 (Presentation) presented in Hamburg, Germany, Jul. 2012,
TiTech'12
[382] Torsten Hoefler:
 Optimized routing and process mapping for arbitrary network topologies (Presentation) presented in Tokyo, Japan, Jun. 2012, Tokyo Institute of Technology
CCGrid'12
[383] G. Bauer, S. Gottlieb and Torsten Hoefler:
 Performance Modeling and Comparative Analysis of the MILC Lattice QCD Application su3_rmd In Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), presented in Ottawa, Canada, pages 652--659, IEEE Computer Society, ISBN: 978-0-7695-4691-9, May 2012, (acceptance rate: 27%, 83/302)
CUG 2012
[384] G. Bauer, Torsten Hoefler, W. Kramer and B. Fiedler:
 Analyses and Modeling of Applications Used to Demonstrate Sustained Petascale Performance on Blue Waters (Presentation) presented in Stuttgart, Germany, May 2012, Cray User Group
CCGrid'12
[385] P. Gottschling and Torsten Hoefler:
 Productive Parallel Linear Algebra Programming with Unstructured Topology Adaption In Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), presented in Ottawa, Canada, pages 9--16, IEEE Computer Society, ISBN: 978-0-7695-4691-9, May 2012, (acceptance rate: 27%, 83/302)
TUM'12
[386] Torsten Hoefler:
 New and old Features in MPI-3.0: The Past, the Standard, and the Future (Presentation) University of Illinois at Urbana-Champaign. presented in Munich, Germany, Apr. 2012,
RWTH'12
[387] Torsten Hoefler:
 Performance Modeling for Systematic Performance Tuning (Presentation) presented in Aachen, Germany, Mar. 2012,
SIAM award
[388] Torsten Hoefler:
 Performance-oriented Parallel Programming Integrating Hardware, Middleware, and Applications (Presentation) presented in Savannah, GA, USA, Feb. 2012, SIAM SIAG/SC Junior Scientist Award Lecture
PPoPP'12
[389] F. Kjolstad, Torsten Hoefler and Marc Snir:
 Automatic Datatype Generation and Optimization In Proceedings of the 17th ACM symposium on Principles and practice of parallel programming, Feb. 2012, (poster paper) (acceptance rate (posters): 17%, 32/185)
PPoPP'12
[390] Torsten Hoefler and Timo Schneider:
 Communication-Centric Optimizations by Dynamically Detecting Collective Operations In Proceedings of the 17th ACM symposium on Principles and practice of parallel programming, Feb. 2012, (poster paper) (acceptance rate (posters): 17%, 32/185)
PDP'12
[391] K. Kharbas, D. Kim, Torsten Hoefler and F. Mueller:
 Assessing HPC Failure Detectors for MPI Jobs In Proceedings of the 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing, presented in Munich, Germany, pages 81--88, IEEE Computer Society, ISBN: 978-0-7695-4633-9, Feb. 2012,
Utah'12
[392] Torsten Hoefler:
 Energy-aware Software Development for Massive-Scale Systems (Presentation) presented in Salt Lake City, Utah, USA, Jan. 2012,
SC11 panel
[393] Torsten Hoefler:
 Performance Modeling for the Masses (Presentation) presented in Seattle, WA, USA, Nov. 2011,
SC11
[394] Torsten Hoefler, William Gropp, Marc Snir and W. Kramer:
 Performance Modeling for Systematic Performance Tuning In International Conference for High Performance Computing, Networking, Storage and Analysis (SC'11), SotP Session, Nov. 2011,
EuroMPI'11
[395] William Gropp, Torsten Hoefler, Rajeev Thakur and Jesper Larsson Träff:
 Performance Expectations and Guidelines for MPI Derived Datatypes Vol 6960, In Recent Advances in the Message Passing Interface (EuroMPI'11), presented in Santorini, Greece, pages 150-159, Springer, ISBN: 978-3-642-24448-3, Sep. 2011,
EuroMPI'11
[396] V. Venkatesan, M. Chaarawi, E. Gabriel and Torsten Hoefler:
 Design and Evaluation of Nonblocking Collective I/O Operations Vol 6960, In Recent Advances in the Message Passing Interface (EuroMPI'11), presented in Santorini, Greece, pages 90-98, Springer, ISBN: 978-3-642-24448-3, Sep. 2011,
EuroMPI'11
[397] Torsten Hoefler:
 Writing Parallel Libraries with MPI - The Good, the Bad, and the Ugly presented in Santorini, Greece, Sep. 2011, Keynote talk at the 18th European MPI Users' Group Meeting (EuroMPI 2011).
EuroMPI'11
[398] Torsten Hoefler and Marc Snir:
 Writing Parallel Libraries with MPI - Common Practice, Issues, and Extensions Vol 6960, In Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, EuroMPI 2011, Santorini, Greece, September 18-21, 2011. Proceedings, presented in Santorini, Greece, pages 345--355, Springer, ISBN: 978-3-642-24448-3, Sep. 2011, Keynote paper at IMUDI/EuroMPI 2011.
EnA-HPC'11
[399] Torsten Hoefler:
 Energy-aware Software Development for Massive-Scale Systems (Presentation) presented in Hamburg, Germany, Sep. 2011, Keynote at the International Conference on Energy-Aware High Performance Computing (EnA-HPC'11)
EuroPar'11
[400] Timo Schneider, Sven Eckelmann, Torsten Hoefler, and Wolfgang Rehm:
 Kernel-Based Offload of Collective Operations - Implementation, Evaluation and Lessons Learned In Proceedings of the 17th international conference on Parallel processing - Volume Part II, presented in Bordeaux, France, pages 264--275, Springer-Verlag, ISBN: 978-3-642-23396-8, Aug. 2011, (acceptance rate 29.9%, 81/271)
TG'11
[401] S. Harrell, P. Smith, D. Smith, Torsten Hoefler, A. Labutina and T. Overmeyer:
 Methods of Creating Student Cluster Competition Teams In Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery, presented in Salt Lake City, Utah, pages 50:1--50:6, ACM, Jul. 2011,
LSAP'11
[402] Torsten Hoefler and Marc Snir:
 Performance Engineering: A Must for Petaflops and Beyond Jun. 2011, Extended abstract for the keynote at the Large-scale System and Application Performance Workshop 2011 (LSAP'11)
ICS'11
[403] Jeremiah Willcock, Torsten Hoefler, Nicholas Edmonds and Andrew Lumsdaine:
 Active Pebbles: Parallel Programming for Data-Driven Applications In Proceedings of the 2011 ACM International Conference on Supercomputing (ICS'11), presented in Tucson, AZ, pages 235--245, ACM, ISBN: 978-1-4503-0102-2, Jun. 2011, (acceptance rate 21.7%, 35/161)
ICS'11
[404] Torsten Hoefler and Marc Snir:
 Generic Topology Mapping Strategies for Large-scale Parallel Architectures In Proceedings of the 2011 ACM International Conference on Supercomputing (ICS'11), presented in Tucson, AZ, pages 75--85, ACM, ISBN: 978-1-4503-0102-2, Jun. 2011, (acceptance rate 21.7%, 35/161)
IPDPS'11
[405] Jens Domke, Torsten Hoefler and W. Nagel:
 Deadlock-Free Oblivious Routing for Arbitrary Topologies In Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS), presented in Anchorage, AK, USA, pages 613--624, IEEE Computer Society, ISBN: 0-7695-4385-7, May 2011, (acceptance rate: 19.6%, 112/571)
Juelich
[406] Torsten Hoefler:
 Model-Driven, Performance-Centric HPC Software and System Design and Optimization (Presentation) presented in Juelich, Germany, Apr. 2011, Talk at Juelich Supercomputing Center (JSC)
Aachen
[407] Torsten Hoefler:
 Characterizing the Influence of System Noise on Large-Scale Parallel Applications (Presentation) presented in Aachen, Germany, Apr. 2011, Talk at RWTH Aachen University
PPL
[408] P. Balaji, D. Buntinas, D. Goodell, William Gropp, Torsten Hoefler, S. Kumar, E. Lusk, Rajeev Thakur and Jesper Larsson Träff:
 MPI on Millions of Cores Parallel Processing Letters (PPL). Vol 21, Nr. 1, pages 45-60, World Scientific Publishing Company, Mar. 2011,
CSE'11
[409] William Gropp, Torsten Hoefler and Marc Snir:
 Performance Modeling for Systematic Performance Tuning (Presentation) In SIAM Conference on Computational Science and Engineering 2011 (Abstracts), presented in Reno, NV, SIAM, Feb. 2011,
PPoPP'11
[410] Jeremiah Willcock, Torsten Hoefler, Nicholas Edmonds and Andrew Lumsdaine:
 Active Pebbles: A Programming Model For Highly Parallel Fine-Grained Data-Driven Computations In Proceedings of the 16th ACM symposium on Principles and practice of parallel programming, pages 305--306, ISBN: 978-1-4503-0119-0, Feb. 2011, (poster paper) (acceptance rate: 25%, 26/165 papers + 16/165 poster) PPoPP'11 Best Poster Award
PADL'11
[411] E. Holk, W. E. Byrd, Jeremiah Willcock, Torsten Hoefler, A. Chauhan and Andrew Lumsdaine:
 Kanor -- A Declarative Language for Explicit Communication In Proceedings of the 13th international conference on Practical aspects of declarative languages, presented in Austin, TX, USA, pages 190--204, Springer-Verlag, ISBN: 978-3-642-18377-5, Jan. 2011,
PROPER'10
[412] Torsten Hoefler:
 Bridging Performance Analysis Tools and Analytic Performance Modeling for HPC In Proceedings of Workshop on Productivity and Performance (PROPER 2010), presented in Ischia, Italy, Springer, Dec. 2010, Keynote extended abstract for PROPER'10.
HiPC'10
[413] Nicholas Edmonds, Torsten Hoefler and Andrew Lumsdaine:
 A Space-Efficient Parallel Algorithm for Computing Betweenness Centrality in Distributed Memory In International Conference on High Performance Computing, presented in Goa, India, pages 1-10, ISBN: 978-1-4244-8518-5, Dec. 2010, (acceptance rate: 19.2%)
HiPC'10
[414] Nicholas Edmonds, Jeremiah Willcock, Torsten Hoefler and Andrew Lumsdaine:
 Design of a Large-Scale Hybrid-Parallel Graph Library In International Conference on High Performance Computing, Student Research Symposium, presented in Goa, India, IEEE, Dec. 2010,
CiSE
[415] Torsten Hoefler:
 Software and Hardware Techniques for Power-Efficient HPC Networking Computing in Science and Engineering (CiSE). Vol 12, Nr. 6, pages 30-37, IEEE Computer Society, ISSN: 0740-7475, Dec. 2010,
SC10
[416] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 Characterizing the Influence of System Noise on Large-Scale Applications by Simulation In International Conference for High Performance Computing, Networking, Storage and Analysis (SC'10), Nov. 2010, (acceptance rate 19.8%, 50/253) SC10 Best Paper Award
NCSA
[417] Torsten Hoefler:
 Optimizing Communication on Blue Waters (Presentation) In Talk at the Blue Waters PRAC Workshop, presented in Urbana, IL, USA, Oct. 2010,
PACT'10
[418] Jeremiah Willcock, Torsten Hoefler, Nicholas Edmonds and Andrew Lumsdaine:
 AM++: A Generalized Active Message Framework In Proceedings of the 19th international conference on Parallel architectures and compilation techniques, presented in Vienna, Austria, pages 401--410, ACM, ISBN: 978-1-4503-0178-7, Sep. 2010, (acceptance rate: 17%, 46/266)
EuroMPI'10
[419] Torsten Hoefler and S. Gottlieb:
 Parallel Zero-Copy Algorithms for Fast Fourier Transform and Conjugate Gradient using MPI Datatypes Vol LNCS 6305, In Recent Advances in the Message Passing Interface (EuroMPI'10), presented in Stuttgart, Germany, pages 132--141, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010,
EuroMPI'10
[420] Torsten Hoefler, William Gropp, Rajeev Thakur and Jesper Larsson Träff:
 Toward Performance Models of MPI Implementations for Understanding Application Scaling Issues Vol LNCS 6305, In Recent Advances in the Message Passing Interface (EuroMPI'10), presented in Stuttgart, Germany, pages 21--30, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010,
EuroMPI'10
[421] Torsten Hoefler, G. Bronevetsky, Brian Barrett, Bronis R. de Supinski and Andrew Lumsdaine:
 Efficient MPI Support for Advanced Hybrid Programming Models Vol LNCS 6305, In Recent Advances in the Message Passing Interface (EuroMPI'10), presented in Stuttgart, Germany, pages 50--61, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010,
HotI'10
[422] B. Arimilli, R. Arimilli, V. Chung, S. Clark, W. Denzel, B. Drerup, Torsten Hoefler, J. Joyner, J. Lewis, J. Li, N. Ni and R. Rajamony:
 The PERCS High-Performance Interconnect IBM. In Proceedings of 18th Symposium on High-Performance Interconnects (Hot Interconnects 2010), IEEE, Aug. 2010,
PROPER'10
[423] Torsten Hoefler:
 Analytical Performance Modeling and Simulation for Blue Waters (Presentation) In Keynote at Workshop on Productivity and Performance (PROPER 2010), presented in Ischia, Italy, Aug. 2010, PROPER'10 Keynote Presentation
CCPE
[424] Torsten Hoefler, Rolf Rabenseifner, H. Ritzdorf, Bronis R. de Supinski, Rajeev Thakur and Jesper Larsson Träff:
 The Scalable Process Topology Interface of MPI 2.2 Concurrency and Computation: Practice and Experience. Vol 23, Nr. 4, pages 293-310, John Wiley & Sons, Ltd., ISSN: 1532-0634, Aug. 2010,
IJPEDS
[425] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 Accurately Measuring Overhead, Communication Time and Progression of Blocking and Nonblocking Collective Operations at Massive Scale International Journal of Parallel, Emergent and Distributed Systems. Vol 25, Nr. 4, pages 241-258, Taylor & Francis Group, ISSN: 1744-5779, Jul. 2010,
SciDAC'10
[426] Rajeev Thakur, P. Balaji, D. Buntinas, D. Goodell, William Gropp, Torsten Hoefler, S. Kumar, E. Lusk and Jesper Larsson Träff:
 MPI at Exascale In Proceedings of SciDAC 2010, presented in Chattanooga, Tennessee, Jun. 2010,
AMP'10
[427] Torsten Hoefler, Jeremiah Willcock, A. Chauhan and Andrew Lumsdaine:
 The Case for Collective Pattern Specification Jun. 2010, Accepted at the 1st ACM Workshop on Advances in Message Passing (AMP'10)
LSAP'10
[428] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 LogGOPSim - Simulating Large-Scale Applications in the LogGOPS Model In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing, presented in Chicago, Illinois, pages 597--604, ACM, ISBN: 978-1-60558-942-8, Jun. 2010, LSAP'10 Best Paper Award
ANL
[429] Torsten Hoefler:
 Nonblocking and Sparse Collective Operations on Petascale Computers (Presentation) presented in Argonne National Laboratory, Jun. 2010,
NCSA
[430] Torsten Hoefler:
 2010 Blue Waters Performance Modeling Workshop -- Opening and Introduction (Presentation) In Opening Slides for the Blue Waters Modeling Workshop, presented in Urbana, IL, USA, Mar. 2010,
PPoPP'10
[431] Torsten Hoefler, Christian Siebert and Andrew Lumsdaine:
 Scalable Communication Protocols for Dynamic Sparse Data Exchange In Proceedings of the 2010 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'10), presented in Bangalore, India, pages 159--168, ACM, ISBN: 978-1-60558-708-0, Jan. 2010, (acceptance rate 16.8%, 29/173)
HiPC'09
[432] P. Kambadur, A. Gupta, Torsten Hoefler and Andrew Lumsdaine:
 Demand-driven Execution of Static Directed Acyclic Graphs Using Task Parallelism presented in Kochi, India, pages 284-293, ISBN: 978-1-4244-4922-4, Dec. 2009, (acceptance rate 11%, 35/320)
MPICH BoF
[433] Torsten Hoefler:
 Selected MPI-2.2 and MPI-3 Features (Presentation) presented in Portland, OR, USA, Nov. 2009, MPICH Birds of a Feather Supercomputing 2009 (SC09), host: Darius Buntinas
TUM
[434] Torsten Hoefler:
 Improving Parallel Computing Platforms (Presentation) presented in Munich, Germany, Oct. 2009, Presentation at the Technical University of Munich, Host: Prof. M. Gerndt
SIMPAT
[435] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 LogGP in Theory and Practice - An In-depth Analysis of Modern Interconnection Networks and Benchmarking Methods for Collective Operations. Elsevier Journal of Simulation Modelling Practice and Theory (SIMPAT). Vol 17, Nr. 9, pages 1511-1521, Elsevier, ISSN: 1569-190X, Oct. 2009,
EuroMPI'09
[436] Torsten Hoefler, Andrew Lumsdaine and Jack Dongarra:
 Towards Efficient MapReduce Using MPI In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 16th European PVM/MPI Users' Group Meeting, presented in Helsinki, Finland, Springer, Sep. 2009,
MPI-2.2
[437] Message Passing Interface Forum:
 MPI: A Message-Passing Interface Standard Version 2.2 Sep. 2009, Chapter author for Collective Communication and Process Topologies
ICPP'09
[438] Torsten Hoefler, Christian Siebert and Andrew Lumsdaine:
 Group Operation Assembly Language - A Flexible Way to Express Collective Communication In ICPP-2009 - The 38th International Conference on Parallel Processing, presented in Vienna, Austria, IEEE, ISBN: 978-0-7695-3802-0, Sep. 2009, (acceptance rate 32%, 71/220)
HotI'09
[439] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 Optimized Routing for Large-Scale InfiniBand Networks In 17th Annual IEEE Symposium on High Performance Interconnects (HOTI 2009), presented in New York, NY, Aug. 2009,
PPL
[440] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 The Effect of Network Noise on Large-Scale Collective Communications Parallel Processing Letters (PPL). Vol 19, Nr. 4, pages 573-593, World Scientific Publishing Company, Aug. 2009,
LSPP'09
[441] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 The Impact of Network Noise at Large-Scale Communication Performance In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, LSPP'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009, Invited to a journal special issue on top picks from LSPP'09.
CAC'09
[442] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 A Power-Aware, Application-Based, Performance Study Of Modern Commodity Cluster Interconnection Networks In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, CAC'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009,
CAC'09
[443] C. Kaiser, Torsten Hoefler, B. Bierbaum and T. Bemmerl:
 Implementation and Analysis of Nonblocking Collective Operations on SCI Networks In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, CAC'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009,
HIPS'09
[444] Torsten Hoefler and Jesper Larsson Träff:
 Sparse Collective Operations for MPI In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, HIPS'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009,
MPI Forum
[445] Torsten Hoefler on behalf of the MPI Forum:
 MPI: A Message-Passing Interface Standard -- Working-Draft for Nonblocking Collective Operations MPI Forum. MPI Forum, Apr. 2009,
LCI'09
[446] J. Mueller, Timo Schneider, Jens Domke, R. Geyer, M. Haesing, Torsten Hoefler, S. Hoehlig, G. Juckeland, Andrew Lumsdaine, M. Mueller and W. Nagel:
 Cluster Challenge 2008: Optimizing Cluster Configuration and Applications to Maximize Power Efficiency In Proceedings of the 10th LCI International Conference on High-Performance Clustered Computing, presented in Boulder, CO, Mar. 2009, LCI'09 Best Paper Award
IUCS-TR
[447] Timo Schneider, Torsten Hoefler and Andrew Lumsdaine:
 ORCS: An Oblivious Routing Congestion Simulator Indiana University. Nr. 675, Indiana University Computer Science, Feb. 2009,
MPI Forum
[448] D. Gregor, Torsten Hoefler, Brian Barrett and Andrew Lumsdaine:
 Fixing Probe for Multi-Threaded MPI Applications Indiana University. Nr. 674, Indiana University Computer Science, Jan. 2009,
MPI Forum
[449] Torsten Hoefler:
 MPI-3 Collective Working Group - December'08 Meeting (Presentation) Indiana University. presented in Menlo Park, CA, USA, Dec. 2008, Activity Report to the MPI Forum
Cluster'08
[450] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 Multistage Switches are not Crossbars: Effects of Static Routing in High-Performance Networks In Proceedings of the 2008 IEEE International Conference on Cluster Computing, presented in Tsukuba, Japan, IEEE Computer Society, ISSN: 1552-5244, ISBN: 978-1-4244-2640, Oct. 2008, (acceptance rate 30%, 28/92)
Cluster'08
[451] Torsten Hoefler and Andrew Lumsdaine:
 Message Progression in Parallel Computing - To Thread or not to Thread? In Proceedings of the 2008 IEEE International Conference on Cluster Computing, presented in Tsukuba, Japan, IEEE Computer Society, ISSN: 1552-5244, ISBN: 978-1-4244-2640, Oct. 2008, (acceptance rate 30%, 28/92)
MPI Forum
[452] Torsten Hoefler:
 MPI-3 Collective Working Group - October'08 Meeting (Presentation) Indiana University. presented in Chicago, IL, USA, Oct. 2008, Activity Report to the MPI Forum
MPI Forum
[453] Torsten Hoefler:
 MPI-3 Collective Working Group - September'08 Meeting (Presentation) Indiana University. presented in Dublin, Ireland, Sep. 2008, Activity Report to the MPI Forum
Ph.D.'08
[454] Torsten Hoefler:
 Principles for Coordinated Optimization of Computation and Communication in Large-Scale Parallel Systems Indiana University. presented in Bloomington, IN, USA, Sep. 2008,
EuroMPI'08
[455] Torsten Hoefler, F. Lorenzen and Andrew Lumsdaine:
 Sparse Non-Blocking Collectives in Quantum Mechanical Calculations Vol LNCS 5205, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, presented in Dublin, Ireland, pages 55-63, Springer, ISSN: 0302-9743, ISBN: 978-3-540-87474-4, Sep. 2008,
EuroMPI'08
[456] Torsten Hoefler, M. Schellmann, S. Gorlatch and Andrew Lumsdaine:
 Communication Optimization for Medical Image Reconstruction Algorithms Vol LNCS 5205, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, presented in Dublin, Ireland, pages 75-83, Springer, ISSN: 0302-9743, ISBN: 978-3-540-87474-4, Sep. 2008,
LLNL
[457] Torsten Hoefler:
 Non-blocking Collective Operations for MPI (Presentation) Lawrence Livermore National Lab. presented in Livermore, CA, USA, Aug. 2008,
Cisco
[458] Torsten Hoefler:
 The effects of common communication patterns in large-scale networks with switch-based static routing (Presentation) Nerd Lunch at Cisco Systems. presented in San Jose, CA, USA, Aug. 2008,
HotI'08
[459] P. Geoffray and Torsten Hoefler:
 Adaptive Routing Strategies for Modern High Performance Networks In 16th Annual IEEE Symposium on High Performance Interconnects, HOTI'08, presented in Stanford, CA, USA, pages 165-172, IEEE Computer Society, ISBN: 978-0-7695-3380-3, Aug. 2008, (acceptance rate 30%, 14/47)
LBNL
[460] Torsten Hoefler:
 Multistage Interconnection Networks are not Crossbars (Presentation) Lawrence Berkeley National Lab. presented in Berkeley, CA, USA, Aug. 2008,
MPI Forum
[461] Torsten Hoefler, Jesper Larsson Träff, Christian Siebert and Andrew Lumsdaine:
 MPI-3 Collective Working Group - June'08 Meeting (Presentation) Indiana University. presented in Menlo Park, CA, USA, Jun. 2008, Activity Report to the MPI Forum
TUM'08
[462] Torsten Hoefler:
 Towards coordinated optimization of computation and communication in parallel applications (Presentation) Fakultaet fuer Informatik, Universität Münster. presented in Muenster, Germany, Jun. 2008,
SPAA'08
[463] Torsten Hoefler, P. Gottschling and Andrew Lumsdaine:
 Brief Announcement: Leveraging Non-blocking Collective Communication in High-performance Applications In Proceedings of the Twentieth Annual Symposium on Parallelism in Algorithms and Architectures, SPAA'08, presented in Munich, Germany, pages 113-115, Association for Computing Machinery (ACM), ISBN: 978-1-59593-973-9, Jun. 2008, (short paper) (acceptance rate: 28%, 36/128)
IWR-TUD
[464] Torsten Hoefler:
 Non-Blocking Collectives for MPI (Presentation) Institut fuer Wissenschaftliches Rechnen, Technische Universitaet Dresden. presented in Dresden, Germany, May 2008,
CCGrid'08
[465] Torsten Hoefler and Andrew Lumsdaine:
 Overlapping Communication and Computation with High Level Communication Routines In Proceedings of the 8th IEEE Symposium on Cluster Computing and the Grid (CCGrid 2008), presented in Lyon, France, May 2008, (acceptance rate: 32%)
PMEO'08
[466] Torsten Hoefler, Timo Schneider and Andrew Lumsdaine:
 Accurately Measuring Collective Operations at Massive Scale In Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium, PMEO'08 Workshop, presented in Miami, FL, ISSN: 1530-2075, ISBN: 978-1-4244-1694-3, Apr. 2008, Invited to a journal special issue on top picks from PMEO'08.
CAC'08
[467] Torsten Hoefler and Andrew Lumsdaine:
 Optimizing non-blocking Collective Operations for InfiniBand In Proceedings of the 22nd IEEE International Parallel & Distributed Processing Symposium, CAC'08 Workshop, presented in Miami, FL, ISSN: 1530-2075, ISBN: 978-1-4244-1694-3, Apr. 2008,
MPI Forum
[468] Torsten Hoefler and Andrew Lumsdaine:
 MPI-3 Collective Working Group - April'08 Meeting (Presentation) Indiana University. presented in Chicago, IL, USA, Apr. 2008, Slides with proposals to the MPI-3 collective WG, all preliminary, published on request
MPI Forum
[469] Torsten Hoefler and Andrew Lumsdaine:
 MPI-3 Collective Working Group - January'08 Meeting (Presentation) Indiana University. presented in Chicago, IL, USA, Mar. 2008, Slides with proposals to the MPI-3 collective WG, all preliminary, published on request
MPI Forum
[470] D. Gregor, Torsten Hoefler and Andrew Lumsdaine:
 Dynamically-Sized Messages in MPI-3 Open Systems Lab, Indiana University. MPI Forum, Feb. 2008,
PASA'08
[471] Timo Schneider, Torsten Hoefler, Simon Wunderlich, Torsten Mehlan and Wolfgang Rehm:
 An optimized ZGEMM implementation for the Cell BE In Proceedings of the 9th Workshop on Parallel Systems and Algorithms (PASA), presented in Dresden, Germany, ISSN: 1617-5468, ISBN: 978-3-88579-218-5, Feb. 2008,
MPI Forum
[472] Torsten Hoefler, F. Lorenzen, D. Gregor and Andrew Lumsdaine:
 Topological Collectives for MPI-2 Open Systems Lab, Indiana University. MPI Forum, Feb. 2008,
KiCC'07
[473] Torsten Hoefler, M. Mosch, Torsten Mehlan, Wolfgang Rehm:
 CollGM - A Myrinet/GM optimized collective component for Open MPI In Proceedings of 3rd KiCC Workshop 2007, presented in Aachen, Germany, RWTH Aachen, Dec. 2007,
KiCC'07
[474] A. Friedley, Torsten Hoefler, M. Leininger, Andrew Lumsdaine:
 Scalable High Performance Message Passing over InfiniBand for Open MPI In Proceedings of 3rd KiCC Workshop 2007, presented in Aachen, Germany, RWTH Aachen, Dec. 2007,
NEC'07
[475] Torsten Hoefler:
 Accurately Measuring Collective Operations at Massive Scale (Presentation) C&C Research Laboratories, NEC Europe Ltd. presented in Sankt Augustin, Germany, Dec. 2007,
KiCC'07
[476] Frank Mietke, Torsten Mehlan, Torsten Hoefler, Wolfgang Rehm:
 Design and Evaluation of a 2048 Core Cluster System In Proceedings of 3rd KiCC Workshop 2007, presented in Aachen, Germany, RWTH Aachen, Dec. 2007,
HLRS
[477] Torsten Hoefler:
 Non-blocking Collectives for MPI-2 (Presentation) High Performance Computing Center Stuttgart (HLRS). presented in Stuttgart, Germany, Dec. 2007,
SC07
[478] Torsten Hoefler, Andrew Lumsdaine and Wolfgang Rehm:
 Implementation and Performance Analysis of Non-Blocking Collective Operations for MPI In Proceedings of the 2007 International Conference on High Performance Computing, Networking, Storage and Analysis, SC07, presented in Reno, USA, IEEE Computer Society/ACM, Nov. 2007, (acceptance rate 20%, 54/268)
CASCON'07
[479] Timo Schneider, Simon Wunderlich, Wolfgang Rehm, Torsten Hoefler and H. Schick:
 Code Optimization for Cell/B.E. - Opportunities for ABINIT In IBM CASCON 2006 Symposium, presented in Dublin, Ireland, IBM, Oct. 2007, Research Poster at the IBM CASCON 2006 Symposium, Dublin, Ireland
ZiH-TUD
[480] Torsten Hoefler:
 Non-blocking Collectives for MPI-2 (Presentation) Dresden University of Technology, Center for Information Services and High Performance Computing (ZIH). presented in Dresden, Germany, Oct. 2007,
EuroMPI'07
[481] Torsten Hoefler, P. Kambadur, R. L. Graham, G. Shipman and Andrew Lumsdaine:
 A Case for Standard Non-Blocking Collective Operations Vol 4757, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, EuroPVM/MPI 2007, presented in Paris, France, pages 125-134, Springer, ISSN: 0302-9743, ISBN: 978-3-540-75415-2, Oct. 2007,
HPCC'07
[482] Torsten Hoefler, Torsten Mehlan, Andrew Lumsdaine and Wolfgang Rehm:
 Netgauge: A Network Performance Measurement Framework Vol 4782, In Proceedings of High Performance Computing and Communications, HPCC'07, presented in Houston, USA, pages 659-671, Springer, ISBN: 978-3-540-75443-5, Sep. 2007,
PARCO
[483] Torsten Hoefler, P. Gottschling, Andrew Lumsdaine and Wolfgang Rehm:
 Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations Elsevier Journal of Parallel Computing (PARCO). Vol 33, Nr. 9, pages 624-633, Elsevier, ISSN: 0167-8191, Sep. 2007,
Talk
[484] Frank Mietke, Torsten Hoefler, Torsten Mehlan and Wolfgang Rehm:
 Diskless Cluster und Lustre - Erfahrungsbericht zum CHiC (Presentation) TU Chemnitz. presented in Chemnitz, Germany, Apr. 2007,
ZKI'07
[485] Frank Mietke, Torsten Mehlan, Torsten Hoefler and Wolfgang Rehm:
 Stand HPC Cluster CHiC (Presentation) TU Chemnitz. presented in Chemnitz, Germany, Apr. 2007,
CAC'07
[486] Torsten Hoefler, Christian Siebert and Wolfgang Rehm:
 A practically constant-time MPI Broadcast Algorithm for large-scale InfiniBand Clusters with Multicast TU Chemnitz. In Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium (CAC'07 Workshop), presented in Long Beach, CA, USA, pages 232, IEEE Computer Society, ISBN: 1-4244-0909-8, Mar. 2007,
PMEO'07
[487] Torsten Hoefler, Andre Lichei and Wolfgang Rehm:
 Low-Overhead LogGP Parameter Assessment for Modern Interconnection Networks TU Chemnitz. In Proceedings of the 21st IEEE International Parallel & Distributed Processing Symposium, PMEO'07 Workshop, presented in Long Beach, CA, USA, IEEE Computer Society, ISBN: 1-4244-0909-8, Mar. 2007,
KiCC'07
[488] Frank Mietke, D. Dunger, Torsten Mehlan, Torsten Hoefler and Wolfgang Rehm:
 A native InfiniBand Transporter for MySQL Cluster TU Chemnitz. In Proceedings of the 2nd Workshop 'Kommunikation in Clusterrechnern und Clusterverbundsystemen' (KiCC'07), presented in Chemnitz, Germany, Feb. 2007,
CEA-TR
[489] Torsten Hoefler and G. Zerah:
 Transforming the high-performance 3d-FFT in ABINIT to enable the use of non-blocking collective operations Commissariat a l'Energie Atomique - Direction des applications militaires (CEA-DAM). presented in Bruyeres-le-Chatel, France, Feb. 2007,
CEA'07
[490] Torsten Hoefler:
 Non-Blocking Collectives for MPI-2 (Presentation) Commissariat a l'Energie Atomique - Direction des applications militaires (CEA-DAM). presented in Bruyeres-le-Chatel, France, Jan. 2007,
CEA'06
[491] Torsten Hoefler:
 Application Optimization with non-blocking Collectives (Presentation) Commissariat a l'Energie Atomique - Direction des applications militaires (CEA-DAM). presented in Bruyeres-le-Chatel, France, Jan. 2007,
ABINIT'07
[492] Torsten Hoefler and G. Zerah:
 Optimization of a parallel 3d-FFT with non-blocking collective operations (Presentation) Invited to the 3rd International ABINIT Developer Workshop. presented in Liege, Belgium, Jan. 2007,
FHPCN'06
[493] Torsten Hoefler, J. Squyres, Wolfgang Rehm and Andrew Lumsdaine:
 A Case for Non-Blocking Collective Operations Vol 4331/2006, In Frontiers of High Performance Computing and Networking - ISPA'06 Workshops, presented in Sorrento, Italy, pages 155-164, Springer Berlin / Heidelberg, ISBN: 978-3-540-49860-5, Dec. 2006,
NEC'06
[494] Torsten Hoefler:
 Non-blocking Collectives for MPI-2 (Presentation) C&C Research Laboratories, NEC Europe Ltd. presented in Sankt Augustin, Germany, Nov. 2006,
HPCNano'06
[495] Torsten Hoefler, R. Janisch and Wolfgang Rehm:
 Parallel scaling of Teter's minimization for Ab Initio calculations presented in Tampa, FL, USA, Nov. 2006, Presented at the workshop HPC Nano in conjunction with the IEEE international conference on Supercomputing (SC'06)
DAPSYS'06
[496] Torsten Hoefler, J. Squyres, G. Fagg, G. Bosilca, Wolfgang Rehm and Andrew Lumsdaine:
 A New Approach to MPI Collective Communication Implementations In Distributed and Parallel Systems - From Cluster to Grid Computing (DAPSYS'06), presented in Innsbruck, Austria, pages 45-54, Springer, ISBN: 978-0-387-69857-1, Sep. 2006,
EuroMPI'06
[497] Torsten Hoefler, P. Gottschling, Wolfgang Rehm and Andrew Lumsdaine:
 Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 13th European PVM/MPI User's Group Meeting, Proceedings, LNCS 4192, presented in Bonn, Germany, pages 374-382, Springer, ISSN: 0302-9743, ISBN: 3-540-39110-X, Sep. 2006, Invited to a journal special issue on top picks from EuroMPI'06.
PARELEC'06
[498] Torsten Hoefler, C. Viertel, Torsten Mehlan, Frank Mietke, Wolfgang Rehm:
 Assessing Single-Message and Multi-Node Communication Performance of InfiniBand In Proceedings of IEEE International Conference on Parallel Computing in Electrical Engineering (PARELEC'06), presented in Bialystok, Poland, pages 227-232, IEEE Computer Society, ISBN: 0-7695-2554-7, Sep. 2006,
PARELEC'06
[499] Torsten Mehlan, J. Strunk, Torsten Hoefler, Frank Mietke and Wolfgang Rehm:
 IRS - A portable Interface for Reconfigurable Systems In Proceedings of IEEE International Conference on Parallel Computing in Electrical Engineering (PARELEC'06), presented in Bialystok, Poland, pages 187-191, IEEE Computer Society, ISBN: 0-7695-2554-7, Sep. 2006,
IUCS-TR
[500] Torsten Hoefler, J. Squyres, G. Bosilca, G. Fagg, Andrew Lumsdaine and Wolfgang Rehm:
 Non-Blocking Collective Operations for MPI-2 Open Systems Lab, Indiana University. presented in Bloomington, IN, USA, School of Informatics, Aug. 2006,
EuroPar'06
[501] Frank Mietke, R. Baumgartl, R. Rex, Torsten Mehlan, Torsten Hoefler and Wolfgang Rehm:
 Analysis of the Memory Registration Process in the Mellanox InfiniBand Software Stack In Proceedings of Euro-Par 2006 Parallel Processing, presented in Dresden, Germany, pages 124-133, Springer-Verlag Berlin, ISBN: 3-540-37783-2, Aug. 2006, (acceptance rate 37.9%, 110/290)
IUCS-TR
[502] Torsten Hoefler and Andrew Lumsdaine:
 Design, Implementation, and Usage of LibNBC Open Systems Lab, Indiana University. presented in Bloomington, IN, USA, School of Informatics, Aug. 2006,
CIB-06-06
[503] Torsten Hoefler, M. Reinhardt, Frank Mietke, Torsten Mehlan, Wolfgang Rehm:
 Low Overhead Ethernet Communication for Open MPI on Linux Clusters TU Chemnitz. Vol CSR-06, Nr. 06, In Chemnitzer Informatik Berichte, presented in Chemnitz, TU Chemnitz, ISSN: 0947-5125, Jul. 2006,
CUG'06
[504] R. Riesen, C. Vaughan and Torsten Hoefler:
 What if MPI Collective Operations Were Instantaneous? Cray Inc. In Proceedings of the 2006 Cray User Group Meeting, presented in Lugano, Switzerland, May 2006,
OMPI'06
[505] Torsten Hoefler:
 Open MPI - Collv2 Design (Presentation) Cisco Systems. presented in San Jose, CA, USA, Apr. 2006,
PMEO'06
[506] Torsten Hoefler, Torsten Mehlan, Frank Mietke and Wolfgang Rehm:
 LogfP - A Model for small Messages in InfiniBand In Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS), PMEO-PDS'06 Workshop, presented in Rhodes, Greece, ISBN: 1-4244-0054-6, Apr. 2006,
CAC'06
[507] Torsten Hoefler, Torsten Mehlan, Frank Mietke and Wolfgang Rehm:
 Fast Barrier Synchronization for InfiniBand In Proceedings of the 20th IEEE International Parallel & Distributed Processing Symposium (IPDPS), CAC'06 Workshop, presented in Rhodes, Greece, ISBN: 1-4244-0054-6, Apr. 2006,
ARCS'06
[508] Torsten Hoefler, Torsten Mehlan, Frank Mietke and Wolfgang Rehm:
 Adding Low-Cost Hardware Barrier Support to Small Commodity Clusters In Proceedings of 19th International Conference on Architecture and Computing Systems - ARCS'06, presented in Frankfurt, Germany, pages 343-250, ISBN: 3-88579-175-7, Mar. 2006,
ABINIT'06
[509] Torsten Hoefler:
 Parallelization Options for the Band-by-Band Minimization of Teter et al. (Presentation) Universite catholique de Louvain. presented in Louvain-la-Neuve, Belgium, Feb. 2006,
Report
[510] R. Kullmann, Torsten Hoefler:
 A short Performance Analysis of Abinit under different build environments TU Chemnitz. presented in Chemnitz, Germany, Jan. 2006,
HPCE'05
[511] Torsten Hoefler, R. Janisch and Wolfgang Rehm:
 Improving the parallel scaling of ABINIT CINECA Consorzio Interuniversitario. In Science and Supercomputing in Europe - Report 2005, presented in Casalecchio di Reno, Italy, pages 551-559, CINECA Consorzio Interuniversitario, ISBN: 88-86037-17-1, Dec. 2005,
22C3
[512] Torsten Hoefler:
 The Cell Processor 22. Chaos Communication Congress. In 22C3 Proceedings, presented in Berlin, Germany, pages 286-292, ISBN: 3-934636-04-7, Dec. 2005,
Book Chapter
[513] Torsten Hoefler, R. Janisch and Wolfgang Rehm:
 A Performance Analysis of ABINIT on a Cluster System TU Chemnitz. In Parallel Algorithms and Cluster Computing, presented in Chemnitz, Germany, pages 37-51, Springer, Lecture Notes in Computational Science and Engineering, ISBN: 3-540-33539-0, Dec. 2005,
KiCC'05
[514] Torsten Hoefler, J. Squyres, Torsten Mehlan, Frank Mietke and Wolfgang Rehm:
 Implementing a Hardware-based Barrier in Open MPI TU Chemnitz. In Proceedings of 2005 KiCC Workshop, Chemnitzer Informatik Berichte, presented in Chemnitz, Germany, ISSN: 0947-5152, Nov. 2005,
KiCC'05
[515] Frank Mietke, R. Rex, Torsten Hoefler, Torsten Mehlan and Wolfgang Rehm:
 Reducing the Impact of Memory Registration in InfiniBand. TU Chemnitz. In Proceedings of 2005 KiCC Workshop, Chemnitzer Informatik Berichte, presented in Chemnitz, Germany, ISSN: 0947-5152, Nov. 2005,
SFB'05
[516] Torsten Hoefler and Wolfgang Rehm:
 Communication/Computation Overlap in MPI (Presentation) Technical University of Chemnitz. presented in Chemnitz, Germany, Nov. 2005,
KiCC'05
[517] Torsten Mehlan, Torsten Hoefler, Frank Mietke and Wolfgang Rehm:
 Integration of the SISCI Shared Memory Interface into Open MPI TU Chemnitz. In Proceedings of 2005 KiCC Workshop, Chemnitzer Informatik Berichte, presented in Chemnitz, Germany, ISSN: 0947-5152, Nov. 2005,
TUM'05
[518] Torsten Hoefler:
 Fast Barrier Synchronization for InfiniBand (Presentation) Technical University of Chemnitz. presented in Munich, Germany, Sep. 2005,
CAT-05-02
[519] Torsten Hoefler and Wolfgang Rehm:
 A short Performance Analysis of Abinit on a Cluster System Computer Architecture Technical Report. Technical University of Chemnitz. presented in Chemnitz, Germany, Jul. 2005,
PARS'05
[520] Frank Mietke, M. Steiger, Torsten Mehlan, Torsten Hoefler and Wolfgang Rehm:
 SHIBA Shared Memory Support for InfiniBand MPICH2 Device In PARS Mitteilungen 2005, presented in Luebeck, Germany, pages 14-23, ISSN: 0177-0454, Jun. 2005,
PARS'06
[521] Torsten Hoefler and Wolfgang Rehm:
 A Communication Model for Small Messages with InfiniBand PARS. In PARS Mitteilungen, presented in Luebeck, Germany, pages 32-41, PARS, ISSN: 0177-0454, Jun. 2005, PARS Junior Researcher Prize
ICPP-W'05
[522] Torsten Hoefler, L. Cerquetti, Torsten Mehlan, Frank Mietke and Wolfgang Rehm:
 A practical approach to the rating of barrier algorithms using the LogP model and Open-MPI In Proceedings of the 2005 International Conference on Parallel Processing Workshops, presented in Oslo, Norway, pages 562-569, ISBN: 0-7659-2381-1, Jun. 2005,
M.Sc.'05
[523] Torsten Hoefler:
 Evaluation of publicly available Barrier-Algorithms and Improvement of the Barrier-Operation for large-scale Cluster-Systems with special Attention on InfiniBand Networks Technical University of Chemnitz. presented in Chemnitz, Germany, Apr. 2005, TU Chemnitz Best Student Award, 2005
21C3
[524] Torsten Hoefler:
 Remote Network Analysis 21. Chaos Communication Congress. In 21C3 Proceedings, presented in Berlin, Germany, pages 33-37, ISBN: 3-934636-02-0, Dec. 2004,
CIB-04-03
[525] Torsten Hoefler, Torsten Mehlan, Frank Mietke and Wolfgang Rehm:
 A Survey of Barrier Algorithms for Coarse Grained Supercomputers Chemnitzer Informatik Berichte. Technical University of Chemnitz. Vol 04, Nr. 03, presented in Chemnitz, Germany, ISSN: 0947-5152, Dec. 2004,
CIB-04-04
[526] Torsten Hoefler and Wolfgang Rehm:
 A Meta Analysis of Gigabit Ethernet over Copper Solutions for Cluster-Networking Chemnitzer Informatik Berichte. Technical University of Chemnitz. Vol 04, Nr. 04, presented in Chemnitz, Germany, ISSN: 0947-5152, Dec. 2004,


© Torsten Hoefler