Questions tagged [parallel-computing]
Questions about algorithms or programs that compute on multiple processing units simultaneously. Not to be confused with concurrent or distributed computing!
348 questions
0
votes
0
answers
25
views
How can we include the missing values in our computation?
Please refer to the paper Migacz, S., et al. (2019). Parallel implementation of a sequential Markov chain in Monte Carlo simulations of physical systems with pairwise interactions. Journal of Chemical ...
1
vote
0
answers
45
views
Can you explain how the update should work?
Please refer to the paper Migacz, S., et al. (2019). Parallel implementation of a sequential Markov chain in Monte Carlo simulations of physical systems with pairwise interactions. Journal of Chemical ...
0
votes
0
answers
15
views
Static two-issue RISC-V pipeline: how is this sd and blt data hazard handled?
I was following through this example (From the subheading Simple Multiple-Issue Code Scheduling in chapter 4.10 of
Computer Organization and Design RISC-V Edition by
Hennesy, Patterson), an example ...
0
votes
0
answers
43
views
Is checking GCD in NC or strongly P?
Given two integers $m$ and $n$ computing $\mathsf{GCD}(m,n)$ is not known to be either in $NC$ or in strongly Polynomial time.
Given three integers $m$, $n$ and $g$, is testing $g=\mathsf{GCD}(m,n)$ ...
0
votes
1
answer
29
views
Need a book recommentation for converting a sequential program into a parallel
Suppose I have a sequential program of biological, physical, or chemical simulation.
I need a step-by-step algorithm for translating that sequential program into a parallel program.
Can you recommend ...
0
votes
1
answer
28
views
Find the minimum number of send/recv actions in directed communication graph
Given p processes and n packages. Every process contains k = n / p distinct packages. Now, I ...
1
vote
1
answer
40
views
Are WAR and WAW hazards only possible if we have parallel computing
I was reading on WAW and WAR data hazards and I think that these can only happen if we have parallel computing ,in contrast with RAW data hazard which can happen even if we dont have parallel ...
0
votes
1
answer
90
views
How is possible to the computer make multiple downloads at same time
My intuition is that the computer is receiving parts a little part of every download, so if I am download three files i receive like , part 1 file one , part 1 file two , part 2 file one ...
This ...
0
votes
1
answer
38
views
Complexity of Scheduling n Tasks on m Machines with Identical Execution Times, Dependencies, and Time Lags
This is a scheduling problem with $n$ tasks across $m$ machines. Tasks have dependencies (DAG) and can be divided into two types: A) need resources $R$ and have an identical execution time. B) do not ...
0
votes
2
answers
32
views
Calculating the Maximum Speedup for a Program with Sequential and Parallel Components
A program $P$ consists of two methods: one sequential, which takes $ n + 15050 $ seconds to execute, and another that can be parallelized with full efficiency, taking $\frac{n^2}{100p}$ seconds, where ...
1
vote
2
answers
49
views
Can multi-threading help improve an IO-bound process perforamance?
I am trying to understand the difference between CPU Bound vs IO Bound process. ChatGPT suggested that multi-threading/parallel processing can help CPU bound process; However, I think that having ...
3
votes
1
answer
65
views
Parallel prefix sum/scan on trees
Let $T$ be a rooted, finitely branching tree, with values from $A$ in its nodes; $T(j)$ is the value of a node called $j$. $A$ is a commutative monoid. The "bottom-up scan tree" of $T$ is a ...
0
votes
0
answers
13
views
Hyperplanes and loop parallelization
Is it possible for the hyperplane method for loop parallelization, to return more than one hyperplanes whose points are scheduled to be executed at the same time? In an example I studied (Shang & ...
0
votes
1
answer
244
views
Cumulative sum algorithm that worked on parallel computing
Compared to "vectors addition" which can naturally work independently for the summation of each of its corresponding elements, it allows parallel work by processors. Example:
...
0
votes
0
answers
15
views
Dependency testing and parallel compilers
I currently study well known dependency testing for automatic loop parallization from parallel compilers (ZIV, SIV, MIV, Single Variable Constraint, Multiple Variable Constraint, Acyclic Test, Fourier ...
1
vote
1
answer
183
views
Why bisection width of star connected network is 1?
This may seem like a simple question, but I can neither find the answer on the internet nor AI tools can give a proper answer. I was reading a book about static networks and I saw that the bisection ...
0
votes
1
answer
144
views
How do you understand Systolic Arrays?
I'm working on a problem from the Digital Design and Computer Architecture course on Systolic arrays.
The question set up is as follows:
The following diagram is a systolic array that performs the ...
0
votes
1
answer
65
views
What is the relation of branch instruction in single core CPU v.s. multicore CPU?
In the Computer organization and design book, there is an exercise in chapter 1 that stated:
...
0
votes
1
answer
30
views
Parallel reductions with complex objects
Typically (from what I can tell) reduction operations produce a "number". This makes them easy to deal with since there isn't really any memory overhead.
However, I have something that I'm ...
0
votes
1
answer
73
views
Work and depth of DAG - problem in understanding exam question
I have some problems understanding the following 2 question from an exam last year, which I am preparing for this week in Parallel Programming.
Determine the Work -> $W(n) = 7$ (#vertices)
...
1
vote
1
answer
97
views
Parallel Algorithm Analysis: Loops
I have come across what seems to be a collapsed loop.
parallel-for i, j = 1...n
What would be the work and span/depth/critical path length be of loops like this? ...
0
votes
1
answer
654
views
Parallel Algorithm Pseudocode: Helman-JaJa Listrank
What would Helman-JaJa listrank pseudocode be like? I tried looking around but all I found were "prosecode" descriptions (eg pp. 18-19 here) which I find kinda hard to follow.
3
votes
1
answer
246
views
Are quantum computer strictly "faster" than any massively parallel computer in terms of computational complexity?
I've seen that quantum computing calculations have their own complexity classes https://en.wikipedia.org/wiki/Quantum_complexity_theory, namely BQP and QMA.
I've heard that this would beat any ...
1
vote
1
answer
180
views
fastest algorithm to count leaf nodes (i.e. terminal nodes)
With the following recursive code to count leaf nodes of a binary tree, is there any way to make it faster or parallel-computing optimized in time?
Python code - (mag(P) = number of leaf nodes of tree ...
0
votes
0
answers
44
views
Difference between Distributed File System, Cluster File System and Parallel File System
On the internet, I am unable to find concrete definitions of these three types of file-systems. Can someone clearly explain the difference between these?
0
votes
0
answers
26
views
Analyzing Parallel Matching Algorithm - Why it takes O(n+m) time and work?
Using the algorithm provided by this paper, they said that:
The algorithm defines a single phase of the local max algorithm. Each step of the phase takes at most O(log(m + n)) = O(logn) time and O(n +...
0
votes
1
answer
21
views
What is the associative operator ⊕ in graph matching? and How does it works?
I read a paper about Parallel Matching, and I didn't understand what the associative operator ⊕ in the following lemma/proof and how does it works in vertices and edges in the graph?
Lemma 3. Using ...
0
votes
2
answers
55
views
Desirable fraction of non-parallelizable code in Amdahl's law
I read that if $\alpha$ is the fraction of the code that cannot be made parallel then it is desirable that $\alpha < \frac 1 p$ where $p$ is the number of processors.
Why is it the case?
The speed-...
0
votes
0
answers
113
views
Understanding the faster multiplication hardware in riscv
I found a one question I wonder, but there is something to be clear about.
The link is https://electronics.stackexchange.com/questions/56488/parallel-multiplication-hardware/56518#56518
Q1. In his ...
0
votes
3
answers
88
views
Trading time for work; does this concurrency phenomenon have a name?
Recently, my girlfriend and I were trying to get out of the house, when I encountered a phenomenon which I thought might be analogous to a tradeoff in concurrent systems.
Here's the real world setup.
...
0
votes
1
answer
2k
views
Any ideas for Parallel Computing Project?
I have to make an application as my Parallel Computing university course project.
The application should make use of parallel processing. Any ideas or examples for these kinds of applications ??
3
votes
0
answers
364
views
Is there a way to parallelise find and inserts for a binary search tree?
Background: I'm working on a data structure benchmark tool to benchmark insert and search time and I am trying to improve my own implementation of a BST to support parallelism.
I have implemented a ...
0
votes
3
answers
239
views
Super-linear parallelism or speedup in parallel matrix multiplication algorithms
I'm reading this slides from a MIT course on parallel software performance. They introduced the concepts of Work $T_1$, Span $T_\infty$ and Parallelism (ratio $T_1/T_\infty$). What is called ''...
1
vote
0
answers
49
views
What is the circuit depth of asymptotically fast multiplication algorithms?
I know the circuit depth (i.e. amount of sequential operations on the longest path) for long multiplication can easily be made $O(\log(n))$ (possibly at the cost of increasing the computational ...
2
votes
0
answers
70
views
Parallel algorithm for Polynomial GCD
It is known that half-GCD (https://www.csd.uwo.ca/~mmorenom/CS424/Lectures/FastDivisionAndGcd.html/node6.html) is can be used to compute the GCD of two polynomials over a finite field. Is it possible ...
0
votes
1
answer
44
views
How to represent mutually exclusive events in CSP
I am studying Communicating Sequential Processes (CSP), and I understand that an event can be "emmited" ($\overline{e}$) and an event can be "waited on" ($e$), so that an event ...
1
vote
0
answers
157
views
Multithreaded Regex?
I once had an idea for a multithreaded regex engine.
The first method that occured to me (there may be other ways to do it that I haven't thought of) was to split up the string being matched against ...
3
votes
0
answers
65
views
GPU compute parallel radix sort optimizations
I'm looking at [1] that describes a GPU parallel radix sort algorithm. They describe multiple optimizations but in section 3 where the whole algorithm is described some things elude me.
First, the ...
2
votes
2
answers
157
views
How do I prove the following invariant of this program?
I have been studying different topics within the realm of concurrent programming and came across "Lamport's bakery algorithm" which is based on the original version of the bakery algorithm ...
3
votes
0
answers
271
views
What is the etymology of "swizzle"?
The word swizzle can refer to an operation performed in GPU algorithms:
[The] ability to compose vectors by arbitrarily rearranging and combining components of other vectors.
Swizzle (computer ...
1
vote
1
answer
100
views
Sorting a list over rounds of binary comparisons
You have a list of $n$ items you want to sort over $r$ rounds using binary comparisons. In each round, you specify $k$ binary comparisons to be made in parallel. The objective is to fully sort the ...
0
votes
0
answers
25
views
Resources to self-study counting/sorting networks
I am reading The Art of Multiprocessor Programming. I was able to grasp almost everything up to Counting Networks in chapter 12 - Counting Sorting and Distributed Coordination. However I am having ...
1
vote
0
answers
56
views
(Branchless) Bitonic Sorting Network for a Set of Floating Point Numbers
In the past I've implemented a branchless Bitonic Sorting Network on a gpu using CUDA, for integers.
I am facing a related problem:
In my Order Independent Transparency implementation, I would like to ...
1
vote
1
answer
37
views
Efficient Way To Compute Points Where Convolution Equals Zero
I want to model the following procedure. Given a shape $L\subseteq\mathbf{R}^2$ (imagine e.g. a letter) and another (could be convex if it makes life easier) shape $K\subseteq\mathbf{R}^2$, I want ...
0
votes
1
answer
35
views
Recursive decomposition if a sequence can be parsed in both directions
Given a sequence/string of values and an algorithm that can process said sequence in both directions and obtain the same result - does it follow that the problem can in fact be decomposed in a divide-...
2
votes
0
answers
25
views
Minimum number of registers for mutual exclusion
I was reading distributed algorithms (by Nancy lynch) and there is a part that I would kindly like to get a simpler explanation for.
I know the book mentioned no algorithm can solve mutual exclusion ...
3
votes
2
answers
3k
views
How is this example sequentially consistent but not linearizable?
Background information that I know(But not understand with examples, I will be glad if you could explain it to me with examples)
A replicated service is said to be linearizable if for any executin ...
3
votes
2
answers
252
views
Books about data structures and algorithms but specifically focusing on parallel gpgpu programming
I'm looking for a book on the subject of algorithms and data structures, but I want to specifically learn about the parallel implementations. Books that teach CUDA and OpenCL usually have small ...
2
votes
1
answer
218
views
Is doing BFS over transitive reduction of a directed acyclic graph equivalent to topological ordering of that graph?
I have a directed acyclic graph. Where each node is a task and each edge denotes a dependency.
I want a method to effectively parallelize these tasks. One way would be to topological sort them based ...
1
vote
0
answers
47
views
Puzzled by this interview problem of scheduling a computation graph on a single-processor under a memory constraint
I recently went through a interview session for a SWE/CS role at a well known company.
It wasn't specifically a "coding-round" but was titled a "domain interview" session, so I ...