Hardware multithreading is becoming a generally applied technique in the next generation of microprocessors. Several multithreaded processors have been announced by industry or are already in production in the areas of high-performance...
      OPERATING SYSTEM, Thread Level Speculation, Network Processor, High performance
The Cydra 5 is a VLIW minisupercomputer with hardware designed to accelerate a broad class of inner loops, presenting unique challenges to its compilers. We discuss the organization of its Fortran/77 compiler and several of the key...
      Distributed Computing, Fortran, Hardware Design, Instruction Scheduling
Improving architectural energy efficiency is important to address diminishing energy efficiency gains from technology scaling. At the same time, limiting hardware complexity is also important. This paper presents a new processor...
      Compiler Construction, Pipeline, Processor Architecture, Speculative Execution
Over the last 20 years, the open-source community has provided more and more software on which the world's high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and years...
      Distributed Computing, High Performance Computing Applications development for Atmosphere modeling, Transactional Memory, Open Source
... an NSF Graduate Research Fellowship and NSF and DARPA grants to the Fugu and Raw projects. ... provided a vital support network. Most of all, I have relied on my wife, Kathleen Shannon, and my children, Karissa and Anya. Their love has...
      Flow Control, Automatic Parallelization, Speculative Execution, Hardware Implementation of Algorithms
      Computer Architecture, Computer Hardware, Hewlett Packard, Speculative Execution
The emergence and wide adoption of web applications have moved the client-side component, often written in JavaScript, to the forefront of computing on the web. Web application developers try to move more computation to the client side to...
      Optimization, Performance Improvement, Speculative Execution, Multicore processors
This paper proposes a new hardware technique for using one core of a CMP to prefetch data for a thread running on another core. Our approach simply executes a copy of all non-control instructions in the prefetching core after they have...
      Speculative Execution, Parallel Architectures, Multiprocessor System on Chip (MPSoC)
The AMD-K6 MMX-enabled processor is plug-compatible with the industry-standard Socket 7 and is binary compatible with the existing base of legacy X86 software. The microarchitecture is based on an out-of-order, superscalar execution engine...
      High performance, Circuit Design, Speculative Execution, Solid State Devices and Circuits
Current microprocessors exploit instruction-level parallelism through a deep processor pipeline and the superscalar instruction issue technique. VLSI technology offers several solutions for aggressive exploitation of the instruction-level...
      Computer Hardware, Data Dependence, Control Dependence, Speculative Execution
The paper presents an approach that helps developers keep source code identifiers and comments consistent with high-level artifacts. Specifically, the approach computes and shows the textual similarity between source code and related...
      Computer Architecture, Computer Graphics, Digital Signal Processing, Computer Hardware
The branch predictor (BP) is an essential component in modern processors, since high BP accuracy can improve performance and reduce energy by decreasing the number of instructions executed on the wrong path. However, reducing latency and storage...
      Computer Architecture, Computer Engineering, Literature Review, Survey
Early web content was expressed statically, making it amenable to straightforward prefetching to reduce user-perceived network delay. In contrast, today's rich web applications often hide content behind JavaScript event handlers,...
      Speculative Execution, Low Latency, Web Browsing
To narrow the widening gap between processor and memory performance, the authors propose improving the cache locality of pointer-manipulating programs and bolstering performance by careful placement of structure elements.
      Scheduling, OPERATING SYSTEM, Programming, Data Structure
Designers face many choices when planning a new high-performance, general-purpose microprocessor. Options include superscalar organization (the ability to dispatch and execute more than one instruction at a time), out-of-order issue of...
      Computer Hardware, Model validation, Performance Model, Design Space Exploration
The Multiflow compiler uses the trace scheduling algorithm to find and exploit instruction-level parallelism beyond basic blocks. The compiler generates code for VLIW computers that issue up to 28 operations each cycle and maintain more...
      Distributed Computing, Scheduling, Compiler, Performance Analysis
Improving MapReduce Performance in Heterogeneous Environments. ... If a node crashes, MapReduce re-runs its tasks on a different machine. ...
      Harmonic Analysis, Data Mining, Scheduling, Resource Allocation
Performance of multithreaded programs is heavily influenced by the latencies of the thread management and synchronization operations. Improving these latencies becomes especially important when the parallelization is performed at fine...
      Speculative Execution, Programming Model, Multi-threading, Instruction Sets
In modern superscalar microarchitectures that speculatively execute a large quantity of code, it would not be possible to aggressively exploit instruction-level parallelism in programs without performing branch prediction. Both the...
      Computer Architecture, Information Technology, Speculative Execution, Interactive graphics
As the difference in speed between processor and memory system continues to increase, it is becoming crucial to develop and refine techniques that enhance the effectiveness of cache hierarchies. Two such techniques are data prefetching and...
      Shared memory, Speculative Execution
In modern superscalar microarchitectures that speculatively execute a large quantity of code, it would not be possible to aggressively exploit a program's instruction-level parallelism without performing branch prediction. Both the...
      Computer Architecture, Information Technology, Speculative Execution, Interactive graphics
Performance of multithreaded programs is heavily influenced by the latencies of the thread management and synchronization operations. Improving these latencies becomes especially important when the parallelization is performed at fine...
      Sensorimotor synchronisation, Speculative Execution, Programming Model, Multi-threading
There have been a number of successes in the past few years in the use of formal methods for verification of real-time systems, and also in source-to-source transformation of these systems for improved analysis, performance, and...
      Engineering, Programming Languages, Compilers, Control system
In modern superscalar microarchitectures that speculatively execute a large quantity of code, it would not be possible to aggressively exploit a program's instruction-level parallelism without performing branch prediction. Both the...
      Computer Science, Computer Architecture, Information Technology, Simulation
Over the last 20 years, the open-source community has provided more and more software on which the world’s high-performance computing systems depend for performance and productivity. The community has invested millions of dollars and...
      Computer Science, Distributed Computing, High Performance Computing Applications development for Atmosphere modeling, Transactional Memory
Two-level predictors deliver highly accurate conditional branch prediction, indirect branch target prediction and value prediction. Accurate prediction enables speculative execution of instructions, a technique that increases instruction...
      Combinatorial Optimization, Problem Solving, Case Study, Low Power
Speculative Multithreading (SpMT) increases performance by executing multiple threads speculatively to exploit thread-level parallelism. By combining software and hardware approaches, we have improved the capabilities of...
      Transactional Memory, COMPUTER SCIENCE & ENGINEERING COMPUTER APPLICATIONS INFORMATION TECHNOLOGY, Speculative Execution, Multi-threading
This paper presents Threaded Multi-Path Execution (TME), which exploits existing hardware on a Simultaneous Multithreading (SMT) processor to speculatively execute multiple paths of execution. When there are fewer threads in an SMT...
      Speculative Execution, Synchronisation
Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine. For example, when targeting shared-memory multiprocessors, parallel programs are compiled to...
      Distributed Computing, Parallel Programming, Processor Architecture, Computer Software
Speculative locking (SL) protocols have been proposed in the literature for improving the performance of read-only transactions (ROTs) without correctness and data currency issues. In these protocols, ROTs carry out speculative executions...
      Performance Evaluation, Phase Locking, Speculative Execution, Concurrency Control
Irregular algorithms are organized around pointer-based data structures such as graphs and trees, and they are ubiquitous in applications. Recent work by the Galois project has provided a systematic approach for parallelizing irregular...
      Data Structure, Speculative Execution
To improve the utilization of machine resources in superscalar processors, the instructions have to be carefully scheduled by the compiler. As internal parallelism and pipelining increase, it becomes evident that scheduling should be...
      Instruction Scheduling, Data Dependence, Speculative Execution
... Among others, this support was provided by Rachel Allen, Scott Blomquist, Michael Chan, Cornelia Colyer, Mary Ann Ladd, Anne McCarthy, Marilyn Pierce, Lila Rhoades, Ty Sealy ... Most of all, I have relied on my wife, Kathleen...
      Flow Control, Automatic Parallelization, Speculative Execution, Hardware Implementation of Algorithms
Recent research in thread-level speculation (TLS) has proposed several mechanisms for optimistic execution of difficult-to-analyze serial codes in parallel. Though it has been shown that TLS helps to achieve higher levels of parallelism,...
      Performance Evaluation, Thread Level Speculation, Standard Deviation, Data Dependence
Replicated state machines are an important and widely-studied methodology for tolerating a wide range of faults. Unfortunately, while replicas should be distributed geographically for maximum fault tolerance, current replicated state...
      Operating Systems, Fault Tolerance, Speculative Execution
Irregular algorithms are organized around pointer-based data structures such as graphs and trees, and they are ubiquitous in applications. Recent work by the Galois project has provided a systematic approach for parallelizing irregular...
      Languages, Computer Science, Algorithms, Distributed Computing
WaveScalar is the first dataflow architecture that can efficiently provide the sequential memory semantics required by imperative languages. This work presents an alternative memory ordering mechanism for this architecture, the...
      Transactional Memory, Cluster Computing, Dataflow, Wavescalar
This paper presents new achievements on the automatic mapping of abstract algorithms, written in imperative software programming languages, to custom computing machines. The reconfigurable hardware element of the target architecture...
      Reconfigurable Hardware, Field Programmable Gate Array, Speculative Execution, Bayesian hierarchical model
Instruction-level parallelism in a single stream of code for non-numerical applications has been the subject of much recent research. This work extends the analysis to symbolic applications described with logic programming. In...
      Logic Programming, Performance, Scheduling, Parallel Processing
PEN-CHUNG YEW and ROY DZ-CHING JU, TIN-FOOK NGAI, SUN CHAN. Speculative execution, such as control speculation or data speculation, is an effective way to improve...
      Computer Software, Program Transformation, Instruction Scheduling, Speculative Execution
Speculative execution, such as control speculation and data speculation, is an effective way to improve program performance. Using edge/path profile information or simple heuristic rules, existing compiler frameworks can adequately...
      Program Analysis, Program Transformation, Instruction Scheduling, Compiler Optimization
The contribution of memory latency to execution time continues to increase, and latency hiding mechanisms become ever more important for efficient processor design. While high-end processors can use elaborate techniques like multiple...
      Embedded Systems, Hardware, Microarchitecture, Process Design
Predicated execution is an effective technique for dealing with conditional branches in application programs. However, there are several problems associated with conventional compiler support for predicated execution. First, all paths of...
      Automatic Control, Parallel Processing, Frequency, Perfect
Cloud computing systems use distributed file systems (DFSs) to store and process the large volumes of data generated in organizations. Users of web-based information systems perform read operations very frequently and infrequently carry out...
      Replacement Algorithms, Prefetching, Speculative Execution, Distributed File System
A long-running transaction is an interactive component of a distributed system which must be executed as if it were a single atomic action. In principle, it should not be interrupted or fail in the middle, and it must not be interleaved...
      Communicating Sequential Processes, Process Algebra, Speculative Execution, Birthday