Skip to main content

Questions tagged [parallel-computing]

Questions about algorithms or programs that compute on multiple processing units simultaneously. Not to be confused with concurrent or distributed computing!

Filter by
Sorted by
Tagged with
0 votes
0 answers
25 views

How can we include the missing values in our computation?

Please refer to the paper Migacz, S., et al. (2019). Parallel implementation of a sequential Markov chain in Monte Carlo simulations of physical systems with pairwise interactions. Journal of Chemical ...
user366312's user avatar
1 vote
0 answers
45 views

Can you explain how the update should work?

Please refer to the paper Migacz, S., et al. (2019). Parallel implementation of a sequential Markov chain in Monte Carlo simulations of physical systems with pairwise interactions. Journal of Chemical ...
user366312's user avatar
0 votes
0 answers
15 views

Static two-issue RISC-V pipeline: how is this sd and blt data hazard handled?

I was following through this example (From the subheading Simple Multiple-Issue Code Scheduling in chapter 4.10 of Computer Organization and Design RISC-V Edition by Hennesy, Patterson), an example ...
albin's user avatar
  • 1
0 votes
0 answers
43 views

Is checking GCD in NC or strongly P?

Given two integers $m$ and $n$ computing $\mathsf{GCD}(m,n)$ is not known to be either in $NC$ or in strongly Polynomial time. Given three integers $m$, $n$ and $g$, is testing $g=\mathsf{GCD}(m,n)$ ...
Turbo's user avatar
  • 2,919
0 votes
1 answer
29 views

Need a book recommentation for converting a sequential program into a parallel

Suppose I have a sequential program of biological, physical, or chemical simulation. I need a step-by-step algorithm for translating that sequential program into a parallel program. Can you recommend ...
user366312's user avatar
0 votes
1 answer
28 views

Find the minimum number of send/recv actions in directed communication graph

Given p processes and n packages. Every process contains k = n / p distinct packages. Now, I ...
Green grün 绿色 vert зеленый's user avatar
1 vote
1 answer
40 views

Are WAR and WAW hazards only possible if we have parallel computing

I was reading on WAW and WAR data hazards and I think that these can only happen if we have parallel computing ,in contrast with RAW data hazard which can happen even if we dont have parallel ...
Root Groves's user avatar
0 votes
1 answer
90 views

How is possible to the computer make multiple downloads at same time

My intuition is that the computer is receiving parts a little part of every download, so if I am download three files i receive like , part 1 file one , part 1 file two , part 2 file one ... This ...
LeoC's user avatar
  • 3
0 votes
1 answer
38 views

Complexity of Scheduling n Tasks on m Machines with Identical Execution Times, Dependencies, and Time Lags

This is a scheduling problem with $n$ tasks across $m$ machines. Tasks have dependencies (DAG) and can be divided into two types: A) need resources $R$ and have an identical execution time. B) do not ...
Aurelia's user avatar
0 votes
2 answers
32 views

Calculating the Maximum Speedup for a Program with Sequential and Parallel Components

A program $P$ consists of two methods: one sequential, which takes $ n + 15050 $ seconds to execute, and another that can be parallelized with full efficiency, taking $\frac{n^2}{100p}$ seconds, where ...
2019's user avatar
  • 1
1 vote
2 answers
49 views

Can multi-threading help improve an IO-bound process perforamance?

I am trying to understand the difference between CPU Bound vs IO Bound process. ChatGPT suggested that multi-threading/parallel processing can help CPU bound process; However, I think that having ...
Sahil's user avatar
  • 149
3 votes
1 answer
65 views

Parallel prefix sum/scan on trees

Let $T$ be a rooted, finitely branching tree, with values from $A$ in its nodes; $T(j)$ is the value of a node called $j$. $A$ is a commutative monoid. The "bottom-up scan tree" of $T$ is a ...
Patrick Nicodemus's user avatar
0 votes
0 answers
13 views

Hyperplanes and loop parallelization

Is it possible for the hyperplane method for loop parallelization, to return more than one hyperplanes whose points are scheduled to be executed at the same time? In an example I studied (Shang & ...
Athanasios Margaris's user avatar
0 votes
1 answer
244 views

Cumulative sum algorithm that worked on parallel computing

Compared to "vectors addition" which can naturally work independently for the summation of each of its corresponding elements, it allows parallel work by processors. Example: ...
Muhammad Ikhwan Perwira's user avatar
0 votes
0 answers
15 views

Dependency testing and parallel compilers

I currently study well known dependency testing for automatic loop parallization from parallel compilers (ZIV, SIV, MIV, Single Variable Constraint, Multiple Variable Constraint, Acyclic Test, Fourier ...
Athanasios Margaris's user avatar
1 vote
1 answer
183 views

Why bisection width of star connected network is 1?

This may seem like a simple question, but I can neither find the answer on the internet nor AI tools can give a proper answer. I was reading a book about static networks and I saw that the bisection ...
user164631's user avatar
0 votes
1 answer
144 views

How do you understand Systolic Arrays?

I'm working on a problem from the Digital Design and Computer Architecture course on Systolic arrays. The question set up is as follows: The following diagram is a systolic array that performs the ...
Connor's user avatar
  • 101
0 votes
1 answer
65 views

What is the relation of branch instruction in single core CPU v.s. multicore CPU?

In the Computer organization and design book, there is an exercise in chapter 1 that stated: ...
user153245's user avatar
0 votes
1 answer
30 views

Parallel reductions with complex objects

Typically (from what I can tell) reduction operations produce a "number". This makes them easy to deal with since there isn't really any memory overhead. However, I have something that I'm ...
user2978125's user avatar
0 votes
1 answer
73 views

Work and depth of DAG - problem in understanding exam question

I have some problems understanding the following 2 question from an exam last year, which I am preparing for this week in Parallel Programming. Determine the Work -> $W(n) = 7$ (#vertices) ...
Zagatho's user avatar
  • 101
1 vote
1 answer
97 views

Parallel Algorithm Analysis: Loops

I have come across what seems to be a collapsed loop. parallel-for i, j = 1...n What would be the work and span/depth/critical path length be of loops like this? ...
jon doyle's user avatar
0 votes
1 answer
654 views

Parallel Algorithm Pseudocode: Helman-JaJa Listrank

What would Helman-JaJa listrank pseudocode be like? I tried looking around but all I found were "prosecode" descriptions (eg pp. 18-19 here) which I find kinda hard to follow.
jon doyle's user avatar
3 votes
1 answer
246 views

Are quantum computer strictly "faster" than any massively parallel computer in terms of computational complexity?

I've seen that quantum computing calculations have their own complexity classes https://en.wikipedia.org/wiki/Quantum_complexity_theory, namely BQP and QMA. I've heard that this would beat any ...
Gere's user avatar
  • 170
1 vote
1 answer
180 views

fastest algorithm to count leaf nodes (i.e. terminal nodes)

With the following recursive code to count leaf nodes of a binary tree, is there any way to make it faster or parallel-computing optimized in time? Python code - (mag(P) = number of leaf nodes of tree ...
Ten's user avatar
  • 119
0 votes
0 answers
44 views

Difference between Distributed File System, Cluster File System and Parallel File System

On the internet, I am unable to find concrete definitions of these three types of file-systems. Can someone clearly explain the difference between these?
Tarun Gupta's user avatar
0 votes
0 answers
26 views

Analyzing Parallel Matching Algorithm - Why it takes O(n+m) time and work?

Using the algorithm provided by this paper, they said that: The algorithm defines a single phase of the local max algorithm. Each step of the phase takes at most O(log(m + n)) = O(logn) time and O(n +...
Reem Aljunaid's user avatar
0 votes
1 answer
21 views

What is the associative operator ⊕ in graph matching? and How does it works?

I read a paper about Parallel Matching, and I didn't understand what the associative operator ⊕ in the following lemma/proof and how does it works in vertices and edges in the graph? Lemma 3. Using ...
RJ94's user avatar
  • 1
0 votes
2 answers
55 views

Desirable fraction of non-parallelizable code in Amdahl's law

I read that if $\alpha$ is the fraction of the code that cannot be made parallel then it is desirable that $\alpha < \frac 1 p$ where $p$ is the number of processors. Why is it the case? The speed-...
user1868607's user avatar
  • 2,204
0 votes
0 answers
113 views

Understanding the faster multiplication hardware in riscv

I found a one question I wonder, but there is something to be clear about. The link is https://electronics.stackexchange.com/questions/56488/parallel-multiplication-hardware/56518#56518 Q1. In his ...
Jin's user avatar
  • 1
0 votes
3 answers
88 views

Trading time for work; does this concurrency phenomenon have a name?

Recently, my girlfriend and I were trying to get out of the house, when I encountered a phenomenon which I thought might be analogous to a tradeoff in concurrent systems. Here's the real world setup. ...
lmonninger's user avatar
0 votes
1 answer
2k views

Any ideas for Parallel Computing Project?

I have to make an application as my Parallel Computing university course project. The application should make use of parallel processing. Any ideas or examples for these kinds of applications ??
Raman Kumar's user avatar
3 votes
0 answers
364 views

Is there a way to parallelise find and inserts for a binary search tree?

Background: I'm working on a data structure benchmark tool to benchmark insert and search time and I am trying to improve my own implementation of a BST to support parallelism. I have implemented a ...
Eterm's user avatar
  • 131
0 votes
3 answers
239 views

Super-linear parallelism or speedup in parallel matrix multiplication algorithms

I'm reading this slides from a MIT course on parallel software performance. They introduced the concepts of Work $T_1$, Span $T_\infty$ and Parallelism (ratio $T_1/T_\infty$). What is called ''...
fulem's user avatar
  • 23
1 vote
0 answers
49 views

What is the circuit depth of asymptotically fast multiplication algorithms?

I know the circuit depth (i.e. amount of sequential operations on the longest path) for long multiplication can easily be made $O(\log(n))$ (possibly at the cost of increasing the computational ...
Nic's user avatar
  • 111
2 votes
0 answers
70 views

Parallel algorithm for Polynomial GCD

It is known that half-GCD (https://www.csd.uwo.ca/~mmorenom/CS424/Lectures/FastDivisionAndGcd.html/node6.html) is can be used to compute the GCD of two polynomials over a finite field. Is it possible ...
Sean's user avatar
  • 21
0 votes
1 answer
44 views

How to represent mutually exclusive events in CSP

I am studying Communicating Sequential Processes (CSP), and I understand that an event can be "emmited" ($\overline{e}$) and an event can be "waited on" ($e$), so that an event ...
EmmanuelMess's user avatar
1 vote
0 answers
157 views

Multithreaded Regex?

I once had an idea for a multithreaded regex engine. The first method that occured to me (there may be other ways to do it that I haven't thought of) was to split up the string being matched against ...
Sam's user avatar
  • 133
3 votes
0 answers
65 views

GPU compute parallel radix sort optimizations

I'm looking at [1] that describes a GPU parallel radix sort algorithm. They describe multiple optimizations but in section 3 where the whole algorithm is described some things elude me. First, the ...
PentaKon's user avatar
  • 141
2 votes
2 answers
157 views

How do I prove the following invariant of this program?

I have been studying different topics within the realm of concurrent programming and came across "Lamport's bakery algorithm" which is based on the original version of the bakery algorithm ...
NoName123's user avatar
  • 171
3 votes
0 answers
271 views

What is the etymology of "swizzle"?

The word swizzle can refer to an operation performed in GPU algorithms: [The] ability to compose vectors by arbitrarily rearranging and combining components of other vectors. Swizzle (computer ...
Chance Snow's user avatar
1 vote
1 answer
100 views

Sorting a list over rounds of binary comparisons

You have a list of $n$ items you want to sort over $r$ rounds using binary comparisons. In each round, you specify $k$ binary comparisons to be made in parallel. The objective is to fully sort the ...
chausies's user avatar
  • 542
0 votes
0 answers
25 views

Resources to self-study counting/sorting networks

I am reading The Art of Multiprocessor Programming. I was able to grasp almost everything up to Counting Networks in chapter 12 - Counting Sorting and Distributed Coordination. However I am having ...
Thanuja Dilhan's user avatar
1 vote
0 answers
56 views

(Branchless) Bitonic Sorting Network for a Set of Floating Point Numbers

In the past I've implemented a branchless Bitonic Sorting Network on a gpu using CUDA, for integers. I am facing a related problem: In my Order Independent Transparency implementation, I would like to ...
Vectorizer's user avatar
1 vote
1 answer
37 views

Efficient Way To Compute Points Where Convolution Equals Zero

I want to model the following procedure. Given a shape $L\subseteq\mathbf{R}^2$ (imagine e.g. a letter) and another (could be convex if it makes life easier) shape $K\subseteq\mathbf{R}^2$, I want ...
fweth's user avatar
  • 239
0 votes
1 answer
35 views

Recursive decomposition if a sequence can be parsed in both directions

Given a sequence/string of values and an algorithm that can process said sequence in both directions and obtain the same result - does it follow that the problem can in fact be decomposed in a divide-...
adrianton3's user avatar
2 votes
0 answers
25 views

Minimum number of registers for mutual exclusion

I was reading distributed algorithms (by Nancy lynch) and there is a part that I would kindly like to get a simpler explanation for. I know the book mentioned no algorithm can solve mutual exclusion ...
user206904's user avatar
3 votes
2 answers
3k views

How is this example sequentially consistent but not linearizable?

Background information that I know(But not understand with examples, I will be glad if you could explain it to me with examples) A replicated service is said to be linearizable if for any executin ...
user avatar
3 votes
2 answers
252 views

Books about data structures and algorithms but specifically focusing on parallel gpgpu programming

I'm looking for a book on the subject of algorithms and data structures, but I want to specifically learn about the parallel implementations. Books that teach CUDA and OpenCL usually have small ...
alagris's user avatar
  • 143
2 votes
1 answer
218 views

Is doing BFS over transitive reduction of a directed acyclic graph equivalent to topological ordering of that graph?

I have a directed acyclic graph. Where each node is a task and each edge denotes a dependency. I want a method to effectively parallelize these tasks. One way would be to topological sort them based ...
thambi's user avatar
  • 125
1 vote
0 answers
47 views

Puzzled by this interview problem of scheduling a computation graph on a single-processor under a memory constraint

I recently went through a interview session for a SWE/CS role at a well known company. It wasn't specifically a "coding-round" but was titled a "domain interview" session, so I ...
Joe Black's user avatar
  • 161

1
2 3 4 5
7