Branch Prediction, Instruction-Window Size, and Cache Size: Performance Tradeoffs and Sampling Techniques

Report ID: TR-578-98
Author: Martonosi, Margaret / Skadron, Kevin / Clark, Douglas W. / Ahuja, Pritpal S.
Date: 1998-04-00
Pages: 38
Download Formats: Postscript
Abstract:

Design parameters interact in complex ways in modern processors, especially because out-of-order issue and decoupling buffers sometimes allow latencies to be overlapped. Tradeoffs among instruction-window size, branch-prediction accuracy, and instruction- and data-cache size can change as these parameters move through different domains. For example, modeling unrealistic caches can understate or overstate the benefits of better prediction or a larger instruction window. Avoiding such pitfalls requires understanding how all these parameters interact.

Because such methodological mistakes are common, this paper provides a comprehensive set of SimpleScalar simulation results from SPECint95 programs, showing the interactions among these major structures. In addition to presenting this database of simulation results, the paper describes the major mechanisms driving the observed tradeoffs. It also considers appropriate simulation techniques when sampling full-length runs with the SPEC reference inputs.

In particular, the results show that branch mispredictions limit the benefits of larger instruction windows, that better branch prediction and better instruction-cache behavior have synergistic effects, and that larger instruction windows and larger data caches trade off and have overlapping effects. In addition, simulations only 50 million instructions in length can yield representative results if these short windows are carefully selected.

This technical report has been published as:
Branch Prediction, Instruction-Window Size, and Cache Size: Performance Tradeoffs and Simulation Techniques. K. Skadron, P.S. Ahuja, M. Martonosi, and D.W. Clark. IEEE Transactions on Computers, 48(11):1260-81, Nov. 1999.