Estimating the execution time of nested loops or the volume of data transferred between processors is necessary to make appropriate processor or data allocations. To achieve this goal one needs to estimate the execution time of the body and...
      Automatic Parallelization, Parallelizing Compilers, IPDPS
Adaptive Mesh Refinement (AMR) calculations carried out on structured meshes play an exceedingly important role in several areas of science and engineering. This is so not just because AMR techniques allow us to carry out calculations very...
      Cognitive Science, Distributed Computing, Parallel Computing, Distributed Shared Memory System
Distributed-memory multicomputers such as the Intel Paragon, the IBM SP-2, and the Thinking Machines CM-5 offer significant advantages over shared-memory multiprocessors in terms of cost and scalability. Unfortunately, extracting all...
      Business, Computer Architecture, Computer Graphics, Computational Modeling
This paper extends the algorithms which were developed in Part I to cases in which there is no affine schedule, i.e. to problems whose parallel complexity is polynomial but not linear. The natural generalization is to multidimensional...
      Computer Architecture, Distributed Computing, Parallel Programming, Scheduling
      Computer Science, Distributed Computing, Compilers, Parallel Programming
      Programming, Automatic Parallelization, Data Access, Data Distribution
Static scheduling of a program represented by a directed task graph on a multiprocessor system to minimize the program completion time is a well-known problem in parallel processing. Since finding an optimal schedule is an NP-complete...
      Graph Theory, Parallel Processing, Genetic Algorithm, Optimal mine design and scheduling
In this paper we present JaMP, an adaptation of the OpenMP standard. JaMP is fitted to Jackal, a software-based DSM implementation for Java.
      Distributed Computing, Distributed Shared Memory System, Computer Software, Fortran
an NSF Graduate Research Fellowship and NSF and DARPA grants to the Fugu and Raw projects. While provided a vital support network. Most of all, I have relied on my wife, Kathleen Shannon, and my children, Karissa and Anya. Their love has...
      Flow Control, Automatic Parallelization, Speculative Execution, Hardware Implementation of Algorithms
Existing theories of multiple object tracking (MOT) offer different predictions concerning the role of higher level cognitive processes, individual differences, effortful attention and parallel processing in MOT. Pylyshyn's model (1989)...
      Psychology, Cognitive Science, Perception, Spatial Memory
GPUs are a class of specialized parallel architectures with tremendous computational power. The new Compute Unified Device Architecture (CUDA) programming model from NVIDIA facilitates programming of general purpose applications on their...
      Program Transformation, Shared memory, Automatic Parallelization, Data Dependence
Automatic parallelization in the polyhedral model is based on affine transformations from an original computation domain (iteration space) to a target space-time domain, often with a different transformation for each variable. Code...
      Distributed Computing, Parallel Programming, Computer Software, Automatic Parallelization
per. A software package built upon the proposed algorithm is described. Several practical examples of mesh generation on multiprocessor computational systems are given. It is shown that the developed parallel algorithm enables us to reduce mesh...
      Mesh generation, Automatic Parallelization, Parallel Algorithm, Three Dimensional
We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and...
      Static Analysis, Data Analysis, Provenance, Flow Control
A robot-controlled wafer bonding machine was developed for bonding wafers of different sizes, up to 8 inches in diameter. The features of this equipment are such that: (1) After the automatic parallel adjustment for 8-inch...
      Electronic, Automatic Parallelization, Wafer Bonding, Infrared
      Cognitive Science, Computer Science, Memory Management, Parallel Processing
The Support Vector Machine (SVM) is a supervised learning algorithm used for recognizing patterns in data. It is a very popular technique in Machine Learning and has been successfully used in applications such as image classification,...
      Distributed Computing, Computer Software, Automatic Parallelization, Support vector machine
The Set-Sharing domain has been widely used to infer at compile time interesting properties of logic programs such as occurs-check reduction, automatic parallelization, and finite-tree analysis. However, performing abstract unification in...
      Automatic Parallelization
Parallelizing compilers have traditionally focussed mainly on parallelizing loops. This paper presents a new framework for automatically parallelizing recursive procedures that typically appear in divide-and-conquer algorithms. We present...
      Distributed Computing, Parallel Programming, Timing Analysis, Computer Software
With the advent of digitization and growing abundance of graphic and image processing tools, use cases for clipping using circular windows have grown considerably. This paper presents an efficient clipping algorithm for line segments...
      Alfred the Great and the Alfredian Circle, Vienna Circle, Automatic Parallelization, Quality Circles
The widespread use of multicore processors is not a consequence of significant advances in parallel programming.
      Cognitive Science, Distributed Computing, Parallel Computing, Automatic Parallelization
Data-oriented workflows are often used in scientific applications for executing a set of dependent tasks across multiple computers. We discuss how these can be modeled using lambda calculus, and how ideas from functional programming are...
      Distributed Computing, Functional Programming, Lambda Calculus, Scientific Workflows
MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function,...
      Computer Science, Distributed Computing, Automatic Parallelization, Data Processing
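The map-and-reduce split described in the abstract above can be illustrated with a small, purely sequential sketch. This is not the implementation from the paper: the names `map_reduce`, `word_count_map`, and `word_count_reduce` are hypothetical, and the distribution across machines that a real MapReduce system performs is omitted.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Sequential sketch of the map, group-by-key, and reduce phases.

    Hypothetical helper for illustration only; a real system would run
    map_fn and reduce_fn in parallel on many workers.
    """
    groups = defaultdict(list)
    # Map phase: each record yields zero or more (key, value) pairs.
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Reduce phase: all values that share a key are combined.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_count_map(line):
    # Emit (word, 1) for every word in the line.
    for word in line.split():
        yield word.lower(), 1


def word_count_reduce(word, counts):
    # Sum the partial counts collected for one word.
    return sum(counts)


counts = map_reduce(["a b a", "b c"], word_count_map, word_count_reduce)
print(counts)
```

Because each call to `map_fn` touches one record and each call to `reduce_fn` touches one key, both phases could in principle be run in parallel without changing the result, which is why the model is listed under automatic parallelization.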
      Software Engineering, Parallel Processing, Compiler, Automatic Parallelization
Recent advances in polyhedral compilation technology have made it feasible to automatically transform affine sequential loop nests for tiled parallel execution on multi-core processors. However, for multi-statement input programs with...
      Linear Algebra, Automatic Parallelization, Code Generation, Multicore processors
Previous literature in alphabetic languages suggests that the occipital-temporal region (the ventral pathway) is specialized for automatic parallel word recognition, whereas the parietal region (the dorsal pathway) is specialized for...
      Personality, Character Recognition, Vocabulary, Magnetic Resonance Imaging
We describe pHPF, a research prototype HPF compiler for the IBM SP series parallel machines. The compiler accepts as input Fortran 90 and Fortran 77 programs, augmented with HPF directives; sequential loops are automatically...
      Computer Architecture, Data Analysis, Distributed Shared Memory System, Parallel Processing
This paper develops and experimentally demonstrates a robust automatic parallel parking algorithm for parking in tight spaces. Novel fuzzy logic controllers are designed for each step of the maneuvering process. The controllers are first...
      Mechanical Engineering, Genetics, Fuzzy Logic, Genetic Algorithm
Tiling is a well-known loop transformation used to reduce communication overhead in distributed memory machines. Although a lot of theoretical research has been done concerning the selection of proper tile shapes that reduce processor...
      Cognitive Science, Distributed Computing, Parallel Computing, Clusters
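As a rough illustration of the tiling transformation named in the abstract above, the sketch below rewrites a two-dimensional loop nest so that iterations are visited one rectangular tile at a time. It assumes square tiles and a hypothetical generator name `tiled_indices`; the tile-shape selection the paper discusses is not modeled here.

```python
def tiled_indices(n, m, tile):
    """Yield the (i, j) pairs of an n-by-m loop nest, one tile at a time.

    Hypothetical helper for illustration; rectangular tile x tile blocks
    are an assumption, not the tile shapes studied in the paper.
    """
    for ii in range(0, n, tile):          # loop over tile rows
        for jj in range(0, m, tile):      # loop over tile columns
            # Visit every iteration that falls inside the current tile,
            # clamping at the edges of the iteration space.
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, m)):
                    yield i, j


order = list(tiled_indices(4, 4, 2))
print(order[:4])
```

Each tile can be assigned to one processor as a unit, so data is exchanged once per tile boundary rather than once per iteration, which is the communication saving the abstract refers to.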
The problem of writing software for multicore processors would be greatly simplified if we could automatically parallelize sequential programs. Although auto-parallelization has been studied for many decades, it has succeeded only in a few...
      Software Development, Abstraction, Data Structure, Graphics
We report on a detailed study of the application and effectiveness of program analysis based on abstract interpretation to automatic program parallelization. We study the case of parallelizing logic programs using the notion of strict...
      Information Systems, Abstract Interpretation, Logic Programming, Program Analysis
Current approaches to parallelizing compilation perform a purely structural analysis of the sequential code. Conversely, a semantic analysis performing concept assignment for code sections can support the recognition of the algorithms...
      Semantic Analysis, Parallel Processing, Reverse Engineering, Computer Software
Automatic parallelization is a promising strategy to improve application performance in the multicore era. However, common programming practices such as the reuse of data structures introduce artificial constraints that obstruct automatic...
      Data Structure, Speculation, Automatic Parallelization, runtime system
HEXAR, a new software product developed at Cray Research, Inc., automatically generates good quality meshes directly from surface data produced by computer-aided design (CAD) packages. The HEXAR automatic mesh generator is based on a...
      Pattern Recognition, Mesh generation, Automatic Parallelization, Product Development
We discuss the parallelization and object-oriented implementation of Monte Carlo simulations for physical problems. We present a C++ Monte Carlo class library for the automatic parallelization of Monte Carlo simulations. Besides...
      Object Oriented Programming, Solid State Physics, Parallel Programming, Monte Carlo Simulation
We describe and evaluate a novel approach for the automatic parallelization of programs that use pointer-based dynamic data structures, written in Java. The approach exploits parallelism among methods by creating an asynchronous thread of...
      Distributed Computing, Timing Analysis, Java Programming, Automatic Parallelization
The evolution of high performance computers is progressing toward increasingly heterogeneous systems. These new architectures pose new challenges, particularly in the field of programming languages. New tools and languages are needed if...
      Case Study, Automatic Parallelization, Code Generation, Heterogeneous Systems
automatically generates good quality meshes directly from surface data produced by computer-aided design (CAD) packages. The HEXAR automatic mesh generator is based on a proprietary and parallel algorithm that relies on pattern...
      Computer Science, Pattern Recognition, Mesh generation, Automatic Parallelization
Two key steps in the compilation of strict functional languages are the conversion of higher-order functions to data structures (closures) and the transformation to tail-recursive style. We show how to perform both steps at once by...
      Object Oriented Programming, Functional Programming, Time Use, Global Analysis
This paper introduces an analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures. Commutativity analysis views computations as composed of operations...
      Computer Science, Parallel Computing, Object Oriented Programming, Data Mining
Atomic operations are a key primitive in parallel computing systems. The standard implementation mechanism for atomic operations uses mutual exclusion locks. In an object-based programming system, the natural granularity is to give each...
      Distributed Computing, Parallel & Distributed Computing, Automatic Parallelization, Mutual Exclusion
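The lock-based mechanism mentioned in the abstract above can be sketched as follows. The example assumes one lock per object, which is only one possible granularity and is not confirmed by the truncated text; the class `Counter` and the helper `hammer` are hypothetical names chosen for illustration.

```python
import threading


class Counter:
    """Object whose update is made atomic with its own mutual exclusion lock.

    One lock per object is an assumption made for this sketch.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # The read-modify-write runs while holding the lock, so no other
        # thread can interleave with it on this object.
        with self._lock:
            self.value += 1


def hammer(counter, n):
    # Hypothetical worker: perform n atomic increments.
    for _ in range(n):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

With a lock per object, threads that update different objects never contend with each other, at the cost of one lock acquisition for every atomic operation.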
A flexible compiler framework for distributed-memory multicomputers automatically parallelizes sequential programs. A unified approach efficiently supports regular and irregular computations using data and functional parallelism.
      Distributed Computing, High Performance Computing, Parallel Programming, Distributed Shared Memory System
The desire to simulate more and more geometrical and physical features of technical structures and the availability of parallel computers and parallel numerical solvers which can exploit the power of these machines have led to a steady...
      Applied Mathematics, Mathematical Physics, Mesh generation, Automatic Parallelization
      Parallel Processing, Computation Fluid Dynamics, Automatic Parallelization, Parallel Computer
Tree contraction algorithms, whose idea was first proposed by Miller and Reif, are important parallel algorithms for implementing efficient parallel programs that manipulate trees. Despite their efficiency, the tree contraction algorithms have...
      Automatic Parallelization, Code Generation, Parallel Algorithm
Divide-and-conquer algorithms are suitable for modern parallel machines, tending to have large amounts of inherent parallelism and working well with caches and deep memory hierarchies. Among others, list homomorphisms are a class of...
      Program Transformation, Automatic Parallelization, Automatic code generation, Parallel Machines