Abstract: Control hazards are a significant impediment to performance in modern microprocessors. Microprocessors mitigate control hazards by using branch predictors to speculatively fetch and execute instructions beyond conditional branches. The penalty of a mispredicted branch is proportional to the pipeline depth and often exceeds the penalty of a first-level cache miss. Thus, improving branch predictor accuracy has the potential to significantly improve overall performance. I present two recent techniques for improving branch predictor accuracy. The first is a compiler optimization that improves performance on a real processor, the Intel Pentium 4. The second is a microarchitectural optimization evaluated through simulation.
Pattern history table partitioning improves branch predictor accuracy by reducing the effect of destructive interference between unrelated branches contending for the same branch predictor resources. The compiler arranges branch addresses to minimize conflicts between branches. The technique improves accuracy by an average of 3.5% on the Intel Pentium 4, leading to a performance improvement of up to 16% and 4.5% on average.
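To make the interference problem concrete, the sketch below models a small gshare-style pattern history table in which the index is formed from low-order branch-address bits XORed with global history. The table size, indexing function, and addresses are illustrative assumptions, not the Pentium 4's actual predictor; the point is only that two unrelated branches can alias to one counter, and that relocating one branch (as the compiler optimization does) removes the conflict.

```python
# Hypothetical sketch: a tiny gshare-style PHT. All sizes and
# addresses are illustrative, not real Pentium 4 parameters.
PHT_BITS = 4                      # 16-entry table for illustration
PHT_SIZE = 1 << PHT_BITS

def pht_index(branch_addr, history):
    # Index = low-order address bits XOR global history (assumption).
    return (branch_addr ^ history) & (PHT_SIZE - 1)

history = 0b1010
b1, b2 = 0x40, 0x50               # two unrelated branches

# Both branches map to the same counter: destructive interference.
assert pht_index(b1, history) == pht_index(b2, history)

# If the compiler relocates one branch (e.g. by inserting padding),
# the two no longer contend for the same PHT entry.
b2_moved = 0x52
assert pht_index(b1, history) != pht_index(b2_moved, history)
```

The compiler cannot change the hardware's hash function, but it can choose where branches land in memory, which is enough to steer hot branches away from each other's counters.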
Traditional branch predictors exploit correlations between pattern history and branch outcome to predict branches, but there is a stronger and more natural correlation between path history and branch outcome. I exploit this correlation with piecewise linear branch prediction. This technique develops a set of linear functions, one for each program path to the branch to be predicted, that separate predicted-taken from predicted-not-taken branches. Taken together, these linear functions form a piecewise linear decision surface. A practical version of this predictor improves performance by 8% over previous neural branch predictors.
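The idea can be sketched in a few lines of perceptron-style pseudocode. Here each weight is selected by both the branch being predicted and the address of the branch at that position in the path, so different paths to the same branch use different linear functions; together they trace out a piecewise linear decision surface. The history length, weight storage, and update rule below are simplified assumptions for illustration, not the tuned parameters of the practical predictor.

```python
# Hypothetical sketch of piecewise linear branch prediction.
# Names and sizes are illustrative, not the thesis parameters.
from collections import defaultdict

H = 8                              # history length (assumption)

# W[(branch, path_addr)][i]: weight for the i-th history bit as
# contributed by the branch fetched at path_addr on the current path.
W = defaultdict(lambda: [0] * (H + 1))

def predict(branch, path, history):
    # path: addresses of the last H branches; history: their
    # outcomes encoded as +1 (taken) / -1 (not taken).
    y = W[(branch, branch)][0]     # bias term
    for i in range(H):
        y += W[(branch, path[i])][i + 1] * history[i]
    return y >= 0                  # taken side of the decision surface

def train(branch, path, history, taken):
    # Perceptron-style update: push the selected weights
    # toward the observed outcome.
    t = 1 if taken else -1
    W[(branch, branch)][0] += t
    for i in range(H):
        W[(branch, path[i])][i + 1] += t * history[i]
```

Because the weight lookup keys on the path element's address, two different paths leading to the same branch consult different weights and thus different linear functions, which is what lets the aggregate predictor represent a piecewise linear, rather than a single linear, separation of taken and not-taken outcomes.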