Traditional branch predictors exploit correlations between pattern history and branch outcome to predict branches, but there is a stronger and more natural correlation between path history and branch outcome. I exploit this correlation with piecewise linear branch prediction, an idealized branch predictor that develops a set of linear functions, one for each program path to the branch to be predicted.
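The abstract only sketches the idea, so the following is a minimal Python sketch of one plausible idealized formulation: a separate perceptron-style weight is kept for each (branch, path element, position) triple, so every program path to a branch effectively gets its own linear function. The parameter names and values (HIST_LEN, THETA) are illustrative assumptions, not figures from the paper.

```python
from collections import defaultdict

# Sketch of an idealized piecewise linear branch predictor (assumed formulation).
# One weight per (branch, path-element address, history position), so each
# program path leading to a branch contributes its own linear function.
HIST_LEN = 32      # how many (address, outcome) pairs of path history to use
THETA = 64         # training threshold on the magnitude of the dot product

weights = defaultdict(int)          # (branch, path_addr, position) -> small int
ghist = [(0, 1)] * HIST_LEN         # recent (branch address, taken=+1 / not taken=-1)

def predict(branch):
    """Return (predicted_taken, dot_product) for the current path history."""
    y = weights[(branch, 0, 0)]     # bias weight for this branch
    for i, (addr, outcome) in enumerate(ghist, start=1):
        y += weights[(branch, addr, i)] * outcome
    return y >= 0, y

def update(branch, taken, y):
    """Perceptron-style training: only on a misprediction or a weak prediction."""
    t = 1 if taken else -1
    if (y >= 0) != taken or abs(y) < THETA:
        weights[(branch, 0, 0)] += t
        for i, (addr, outcome) in enumerate(ghist, start=1):
            weights[(branch, addr, i)] += t * outcome
    # shift the new outcome into the path history
    ghist.pop()
    ghist.insert(0, (branch, t))
```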
International Symposium on High-Performance Computer Architecture, 2003
To sustain instruction throughput rates in more aggressively clocked microarchitectures, microarchitects have incorporated larger and more complex branch predictors into their designs, taking advantage of the increasing numbers of transistors available on a chip. Unfortunately, because of penalties associated with their implementations, the extra accuracy provided by many branch predictors does not produce a proportionate increase in performance. Specifically, we show that the techniques used to hide the latency of a large and complex branch predictor do not scale well and will be unable to sustain IPC for deeper pipelines.
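To make the trade-off concrete, here is a back-of-the-envelope model, not taken from the paper, of how predictor latency and pipeline depth can erode the benefit of extra accuracy. All numbers (branch frequency, depths, latencies) are illustrative assumptions.

```python
# First-order IPC model (illustrative only): each mispredicted branch costs
# roughly a pipeline refill, and a multi-cycle predictor adds front-end delay.
def ipc(mispredict_rate, branch_freq=0.2, pipeline_depth=20,
        predictor_latency=0, base_ipc=2.0):
    penalty = pipeline_depth + predictor_latency
    cycles_per_insn = 1.0 / base_ipc + branch_freq * mispredict_rate * penalty
    return 1.0 / cycles_per_insn

# A more accurate but slower predictor can lose its advantage as pipelines deepen.
print(ipc(0.05, pipeline_depth=10, predictor_latency=0))  # small, fast predictor
print(ipc(0.03, pipeline_depth=20, predictor_latency=3))  # large, slow predictor
```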
Improved Latency and Accuracy for Neural Branch Prediction. Daniel A. Jiménez, Department of Computer Science, Rutgers University. Microarchitectural prediction based on neural learning has received increasing attention in recent years. ...
This article presents a new and highly accurate method for branch prediction. The key idea is to use one of the simplest possible neural methods, the perceptron, as an alternative to the commonly used two-bit counters. The source of our predictor's accuracy is its ability to use long history lengths, because the hardware resources for our method scale linearly, rather than exponentially, with the history length. We describe two versions of perceptron predictors, and we evaluate these predictors with respect to five well-known predictors. We show that for a 4 KB hardware budget, a simple version of our method that uses a global history achieves a misprediction rate of 4.6% on the SPEC 2000 integer benchmarks, an improvement of 26% over gshare. We also introduce a global/local version of our predictor that is 14% more accurate than the McFarling-style hybrid predictor of the Alpha 21264. We show that for hardware budgets of up to 256 KB, this global/local perceptron predictor is more accurate than Evers' multicomponent predictor, so we conclude that ours is the most accurate dynamic predictor currently available. To explore the feasibility of our ideas, we provide a circuit-level design of the perceptron predictor and describe techniques that allow our complex predictor to operate quickly. Finally, we show how the relatively complex perceptron predictor can be used in modern CPUs by having it override a simpler, quicker Smith predictor, providing IPC improvements of 15.8% over gshare and 5.7% over the McFarling hybrid predictor.
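The global-history variant described above is simple enough to sketch. The Python below is a minimal illustration, not the paper's hardware design: table size, history length, and the training threshold are assumed values, and the point it demonstrates is that storage grows linearly with history length (one extra weight per perceptron per history bit) rather than exponentially as in a pattern-history table.

```python
import numpy as np

# Sketch of a global-history perceptron branch predictor (illustrative sizes).
NUM_PERCEPTRONS = 1024
HIST_LEN = 32
THETA = int(1.93 * HIST_LEN + 14)   # one commonly used training threshold

weights = np.zeros((NUM_PERCEPTRONS, HIST_LEN + 1), dtype=np.int32)  # +1 bias
ghr = np.ones(HIST_LEN, dtype=np.int32)   # global history: +1 taken, -1 not taken

def predict(pc):
    """Predict by taking the sign of the dot product of weights and history."""
    w = weights[pc % NUM_PERCEPTRONS]
    y = int(w[0]) + int(np.dot(w[1:], ghr))
    return y >= 0, y

def update(pc, taken, y):
    """Train only on a misprediction or when the output magnitude is below THETA."""
    global ghr
    t = 1 if taken else -1
    w = weights[pc % NUM_PERCEPTRONS]
    if (y >= 0) != taken or abs(y) < THETA:
        w[0] += t
        w[1:] += t * ghr        # push each weight toward agreement with the outcome
    ghr = np.roll(ghr, 1)
    ghr[0] = t
```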