The historical improvements in the performance of general-purpose processors have long provided opportunities for application innovation. Word processing, spreadsheets, desktop publishing, networking and various game genres are just some of the many applications that have arisen because of the increasing capabilities and the versatility of general-purpose processors. Key to these innovations is the fact that general-purpose processors do not predefine the applications that they are going to run.
Currently, the capabilities of individual general-purpose processors are encountering challenges, such as power limits and diminishing returns in exploiting instruction-level parallelism. As a consequence, a variety of approaches are being employed to address this situation, including adding myriad dedicated accelerators. Unfortunately, while this improves performance, it sacrifices generality. More specifically, the time, difficulty and cost of special-purpose design preclude dedicated logic from serving as a viable avenue for application innovation.
Recently, there has been progress in addressing this tension between programmability and higher performance via an interesting middle ground between fully general-purpose computing and dedicated logic. Specifically, spatial computing, in which the computation is mapped spatially onto an array of small programmable processing elements, addresses many of the cost-related liabilities of dedicated logic and is increasingly being applied to general computation problems. While field-programmable gate arrays (FPGAs) are the best-known spatial computing platform, there are also a number of coarse-grained variants.
In this talk, we will examine the range of spatial computing alternatives and explore in more depth the concept of triggered instructions, a novel control paradigm for arrays of processing elements (PEs) aimed at exploiting spatial parallelism. Triggered instructions completely eliminate the program counter and allow programs to transition concisely between states without explicit branch instructions. They also allow efficient reactivity to inter-PE communication traffic. The approach provides a unified mechanism to avoid over-serialized execution, essentially achieving the effect of techniques such as dynamic instruction reordering and multithreading, which each require distinct hardware mechanisms in a traditional sequential architecture.
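As a rough illustration of the control paradigm described above, the following Python sketch models a single PE that has no program counter: on each cycle it fires the first instruction whose trigger holds, and instructions steer control by updating predicate bits instead of branching. This is a toy model and not the architecture presented in the talk. The names `PE`, `Triggered`, `step`, `run` and `merge`, the `EOS` end-of-stream marker, and the fixed-priority choice among ready instructions are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional

EOS = None  # assumed end-of-stream marker placed at the tail of each input queue


@dataclass
class PE:
    """Hypothetical processing element: predicate bits and queues, no program counter."""
    preds: Dict[str, bool] = field(default_factory=dict)
    inputs: Dict[str, Deque] = field(default_factory=dict)
    output: List[int] = field(default_factory=list)


@dataclass
class Triggered:
    """An instruction paired with the condition (trigger) under which it may fire."""
    name: str
    trigger: Callable[[PE], bool]
    action: Callable[[PE], None]


def has_data(pe: PE, ch: str) -> bool:
    q = pe.inputs[ch]
    return len(q) > 0 and q[0] is not EOS


def at_eos(pe: PE, ch: str) -> bool:
    q = pe.inputs[ch]
    return len(q) > 0 and q[0] is EOS


def step(pe: PE, program: List[Triggered]) -> Optional[str]:
    """Fire the first ready instruction (assumed priority order); None if none is ready."""
    for inst in program:
        if inst.trigger(pe):
            inst.action(pe)
            return inst.name
    return None


def run(pe: PE, program: List[Triggered], max_cycles: int = 1000) -> List[str]:
    """Step until no trigger holds or the cycle budget runs out; return the firing trace."""
    trace = []
    for _ in range(max_cycles):
        fired = step(pe, program)
        if fired is None:
            break
        trace.append(fired)
    return trace


def compare(pe: PE) -> None:
    # The comparison result lands in a predicate bit; no branch is taken.
    pe.preds["take_a"] = pe.inputs["a"][0] <= pe.inputs["b"][0]
    pe.preds["compared"] = True


def send(ch: str) -> Callable[[PE], None]:
    def action(pe: PE) -> None:
        pe.output.append(pe.inputs[ch].popleft())
        pe.preds["compared"] = False
    return action


# Merge of two sorted streams, written as triggers over predicates and queue status.
MERGE = [
    Triggered("compare",
              lambda pe: not pe.preds["compared"] and has_data(pe, "a") and has_data(pe, "b"),
              compare),
    Triggered("send_a", lambda pe: pe.preds["compared"] and pe.preds["take_a"], send("a")),
    Triggered("send_b", lambda pe: pe.preds["compared"] and not pe.preds["take_a"], send("b")),
    Triggered("drain_a",
              lambda pe: not pe.preds["compared"] and has_data(pe, "a") and at_eos(pe, "b"),
              send("a")),
    Triggered("drain_b",
              lambda pe: not pe.preds["compared"] and has_data(pe, "b") and at_eos(pe, "a"),
              send("b")),
]


def merge(xs, ys):
    pe = PE(preds={"compared": False, "take_a": False},
            inputs={"a": deque(list(xs) + [EOS]), "b": deque(list(ys) + [EOS])})
    run(pe, MERGE)
    return pe.output
```

In this sketch, `merge([1, 4, 9], [2, 3, 10])` alternates between `compare` and one of the `send` instructions purely through predicate updates, and an empty queue leaves the PE idle until data arrives, which loosely mirrors the reactivity to inter-PE traffic mentioned above.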
About the speaker
Joel Emer is an Intel Fellow and Director of Microarchitecture Research at Intel in Hudson, Massachusetts, and is a Professor of the Practice at MIT. Previously he worked at Compaq and Digital Equipment Corporation, where he held various research and advanced development positions investigating processor microarchitecture for a variety of VAX and Alpha processors and developing performance modeling and evaluation techniques. His research included pioneering efforts in simultaneous multithreading, processor reliability analysis and early contributions to the now pervasive quantitative approach to processor evaluation. His current research interests include memory hierarchy design, reconfigurable logic-based computation and performance modeling. He received his PhD in electrical engineering from the University of Illinois at Urbana-Champaign in 1979. He is a Fellow of both the ACM and the IEEE. He has also received a number of awards, including the 2009 Eckert-Mauchly Award for lifetime contributions in computer architecture.