| 2011 |
| 44 | Hardware and Software Tradeoffs for Task Synchronization on Manycore Architectures. Yonghong Yan, Sanjay Chatterjee, Daniel A. Orozco, Elkin Garcia, Zoran Budimlic, Jun Shirako, Robert S. Pavel, Guang R. Gao, Vivek Sarkar. Euro-Par (2) 2011, 112-123. |
| 43 | DEEP: an iterative FPGA-based many-core emulation system for chip verification and architecture research. Juergen Ributzka, Yuhei Hayashi, Fei Chen, Guang R. Gao. FPGA 2011, 115-118. |
| 2010 |
| 42 | A Study of a Software Cache Implementation of the OpenMP Memory Model for Multicore and Manycore Architectures. Chen Chen, Joseph B. Manzano, Ge Gan, Guang R. Gao, Vivek Sarkar. Euro-Par (2) 2010, 341-352. |
| 2009 |
| 41 | Iterative layer-based raytracing on CUDA. Alejandro Segovia, Xiaoming Li, Guang R. Gao. IPCCC 2009, 248-255. |
| 2008 |
| 40 | Minimum Lock Assignment: A Method for Exploiting Concurrency among Critical Sections. Yuan Zhang, Vugranam C. Sreedhar, Weirong Zhu, Vivek Sarkar, Guang R. Gao. LCPC 2008, 141-155. |
| 2007 |
| 39 | Experience of Optimizing FFT on Intel Architectures. Daniel A. Orozco, Liping Xue, Murat Bolat, Xiaoming Li, Guang R. Gao. IPDPS 2007, 1-8. |
| 38 | Automatic Program Segment Similarity Detection in Targeted Program Performance Improvement. Haiping Wu, Eunjung Park, Mihailo Kaplarevic, Yingping Zhang, Murat Bolat, Xiaoming Li, Guang R. Gao. IPDPS 2007, 1-8. |
| 37 | Optimized lock assignment and allocation: a method for exploiting concurrency among critical sections. Yuan Zhang, Vugranam C. Sreedhar, Weirong Zhu, Vivek Sarkar, Guang R. Gao. PPOPP 2007, 146-147. |
| 2006 |
| 36 | A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture. Yingping Zhang, Taikyeong Jeong, Fei Chen, Haiping Wu, R. Nitzsche, Guang R. Gao. IPDPS 2006. |
| 35 | Exploring Financial Applications on Many-Core-on-a-Chip Architecture: A First Experiment. Weirong Zhu, Parimala Thulasiraman, Ruppa K. Thulasiram, Guang R. Gao. ISPA Workshops 2006, 221-230. |
| 2005 |
| 34 | Sequential Consistency Revisit: The Sufficient Condition and Method to Reason the Consistency Model of a Multiprocessor-on-a-Chip Architecture. Yuan Zhang, Weirong Zhu, Fei Chen, Ziang Hu, Guang R. Gao. Parallel and Distributed Computing and Networks 2005, 13-19. |
| 2004 |
| 33 | Implementing parallel conjugate gradient on the EARTH multithreaded architecture. Fei Chen, Kevin B. Theobald, Guang R. Gao. CLUSTER 2004, 459-469. |
| 32 | A fine-grain load-adaptive algorithm of the 2D discrete wavelet transform for multithreaded architectures. Parimala Thulasiraman, Ashfaq A. Khokhar, Gerd Heber, Guang R. Gao. J. Parallel Distrib. Comput. (64): 68-78 (2004). |
| 2002 |
| 31 | Implementation and evaluation of a communication intensive application on the EARTH multithreaded system. Kevin B. Theobald, Rishi Kumar, Gagan Agrawal, Gerd Heber, Ruppa K. Thulasiram, Guang R. Gao. Concurrency and Computation: Practice and Experience (14): 183-201 (2002). |
| 30 | Compiling Several Classes of Communication Patterns on a Multithreaded Architecture. Rishi Kumar, Gagan Agrawal, Guang R. Gao. IPDPS 2002. |
| 2001 |
| 29 | Speculative Prefetching of Induction Pointers. Artour Stoutchinin, José Nelson Amaral, Guang R. Gao, James C. Dehnert, Suneel Jain, Alban Douillet. CC 2001, 289-303. |
| 28 | Exploiting Locality in Single Assignment Data Structures Updated Through Split-Phase Transactions. José Nelson Amaral, Wen-Yen Lin, Jean-Luc Gaudiot, Guang R. Gao. Cluster Computing (4): 281-293 (2001). |
| 27 | Topic 08+13: Instruction-Level Parallelism and Computer Architecture. Eduard Ayguadé, Fredrik Dahlgren, Christine Eisenbeis, Roger Espasa, Guang R. Gao, Henk L. Muller, Rizos Sakellariou, André Seznec. Euro-Par 2001, 385. |
| 26 | Multithreaded Algorithms for Pricing a Class of Complex Options. Ruppa K. Thulasiram, Lubomir Litov, Hassan Nojumi, Christopher T. Downing, Guang R. Gao. IPDPS 2001, 18. |
| 2000 |
| 25 | Self-Avoiding Walks over Adaptive Unstructured Grids. Gerd Heber, Rupak Biswas, Guang R. Gao. Concurrency - Practice and Experience (12): 85-109 (2000). |
| 24 | Developing a Communication Intensive Application on the EARTH Multithreaded Architecture (Distinguished Paper). Kevin B. Theobald, Rishi Kumar, Gagan Agrawal, Gerd Heber, Ruppa K. Thulasiram, Guang R. Gao. Euro-Par 2000, 625-637. |
| 23 | Automatic compiler techniques for thread coarsening for multithreaded architectures. Gary M. Zoppetti, Gagan Agrawal, Lori L. Pollock, José Nelson Amaral, Xinan Tang, Guang R. Gao. ICS 2000, 306-315. |
| 22 | Location Consistency - A New Memory Model and Cache Consistency Protocol. Guang R. Gao, Vivek Sarkar. IEEE Trans. Computers (49): 798-813 (2000). |
| 21 | Caching Single-Assignment Structures to Build a Robust Fine-Grain Multi-Threading System. Wen-Yen Lin, Jean-Luc Gaudiot, José Nelson Amaral, Guang R. Gao. IPDPS 2000, 589-594. |
| 20 | Parallel FEM Simulation of Crack Propagation - Challenges, Status, and Perspectives. Bruce Carter, Chuin-Shan Chen, L. Paul Chew, Nikos Chrisochoides, Guang R. Gao, Gerd Heber, Anthony R. Ingraffea, Roland Krause, Chris Myers, Démian Nave, Keshav Pingali, Paul Stodghill, Stephen A. Vavasis, Paul A. Wawrzynek. IPDPS Workshops 2000, 443-449. |
| 19 | Recursive and Iterative Multithreaded Algorithms for Pricing American Securities. Ruppa K. Thulasiram, Christopher T. Downing, Guang R. Gao. PDPTA 2000. |
| 18 | Landing CG on EARTH: A Case Study of Fine-Grained Multithreading on an Evolutionary Path. Kevin B. Theobald, Gagan Agrawal, Rishi Kumar, Gerd Heber, Guang R. Gao, Paul Stodghill, Keshav Pingali. SC 2000. |
| 17 | Multithreaded algorithms for the fast Fourier transform. Parimala Thulasiraman, Kevin B. Theobald, Ashfaq A. Khokhar, Guang R. Gao. SPAA 2000, 176-185. |
| 1999 |
| 16 | Load Adaptive Algorithms and Implementations for the 2D Discrete Wavelet Transform on Fine-Grain Multithreaded Architectures. Ashfaq A. Khokhar, Gerd Heber, Parimala Thulasiraman, Guang R. Gao. IPPS/SPDP 1999, 458-462. |
| 15 | A New Approach to Parallel Dynamic Partitioning for Adaptive Unstructured Meshes. Gerd Heber, Guang R. Gao, Rupak Biswas. IPPS/SPDP 1999, 360-364. |
| 14 | Self-Avoiding Walks over Adaptive Unstructured Grids. Gerd Heber, Rupak Biswas, Guang R. Gao. IPPS/SPDP Workshops 1999, 968-977. |
| 13 | Advances in the dataflow computational model. Walid A. Najjar, Edward A. Lee, Guang R. Gao. Parallel Computing (25): 1907-1929 (1999). |
| 12 | Self-Avoiding Walks Over Adaptive Triangular Grids. Gerd Heber, Rupak Biswas, Guang R. Gao. PPSC 1999. |
| 1998 |
| 11 | Partial Sampling with Reverse State Reconstruction: A New Technique for Branch Predictor Performance Estimation. Darren Erik Vengroff, Guang R. Gao. HPCA 1998, 342-351. |
| 10 | Using Multithreading for the Automatic Load Balancing of Adaptive Finite Element Meshes. Gerd Heber, Rupak Biswas, Parimala Thulasiraman, Guang R. Gao. IRREGULAR 1998, 132-143. |
| 1997 |
| 9 | A Register Pressure Sensitive Instruction Scheduler for Dynamic Issue Processors. Rad Silvera, Jian Wang, Ramaswamy Govindarajan, Guang R. Gao. IEEE PACT 1997, 78-89. |
| 8 | On the Importance of an End-To-End View of Memory Consistency in Future Computer Systems. Guang R. Gao, Vivek Sarkar. ISHPC 1997, 30-41. |
| 7 | Thread Partitioning and Scheduling Based on Cost Model. Xinan Tang, Jing Wang, Kevin B. Theobald, Guang R. Gao. SPAA 1997, 272-281. |
| 1996 |
| 6 | Pipelining-Dovetailing: A Transformation to Enhance Software Pipelining for Nested Loops. Jian Wang, Guang R. Gao. CC 1996, 1-17. |
| 5 | Locality Analysis for Distributed Shared-Memory Multiprocessors. Vivek Sarkar, Guang R. Gao, Shaohua Han. LCPC 1996, 20-40. |
| 1995 |
| 4 | Location Consistency: Stepping Beyond the Memory Coherence Barrier. Guang R. Gao, Vivek Sarkar. ICPP (2) 1995, 73-76. |
| 1993 |
| 3 | Special Issue on DataFlow and Multithreaded Architectures - Guest Editors' Introduction. Guang R. Gao, Jean-Luc Gaudiot, Lubomir Bic. J. Parallel Distrib. Comput. (18): 271-272 (1993). |
| 1992 |
| 2 | Collective Loop Fusion for Array Contraction. Guang R. Gao, R. Olsen, Vivek Sarkar, Radhika Thekkath. LCPC 1992, 281-295. |
| 1991 |
| 1 | Optimization of array accesses by collective loop transformations. Vivek Sarkar, Guang R. Gao. ICS 1991, 194-205. |