73 questions
0 votes, 0 answers, 265 views
How do I specify that my assembly code runs on a specific core using MPIDR_EL1 in ARMv8 architecture
I'm trying to understand bare-metal startup code for the Raspberry Pi 4. In that code, the first thing is to check whether execution is done by core 0.
start:
// Check processor ID is zero (executing on ...
0 votes, 0 answers, 26 views
Merge two sorted subsequences of an array in parallel
Consider that we have array $A$ of integers with length $n$. Each element of this array is tagged either with "blue" or "red". We know blue elements are sorted, and red elements ...
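As a sequential baseline for the merge described above, a minimal sketch might look like the following. It assumes both colour classes are already sorted (as the title says) and that the tags arrive as a parallel list of strings; the names `merge_tagged`, `values`, and `tags` are hypothetical, and no parallelism is attempted here.

```python
from heapq import merge

def merge_tagged(values, tags):
    """Merge the sorted "blue" and sorted "red" subsequences into one sorted list."""
    # Split the array into its two tagged subsequences, preserving order.
    blue = [v for v, t in zip(values, tags) if t == "blue"]
    red = [v for v, t in zip(values, tags) if t == "red"]
    # heapq.merge combines already-sorted inputs without re-sorting them.
    return list(merge(blue, red))

example = merge_tagged([1, 2, 5, 3, 9, 4],
                       ["blue", "red", "blue", "red", "blue", "red"])
print(example)
```

A parallel version would still need to decide how to partition the two subsequences between workers; this sketch only fixes the expected result.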
2 votes, 2 answers, 4k views
How to run multiprocess Chroma.from_documents() in Langchain
Can we somehow pass an option to run multiple threads/processes when we call Chroma.from_documents() in Langchain?
I am trying to embed 980 documents (embedding model is mpnet on CUDA), and it take ...
1 vote, 0 answers, 88 views
Is there a way to jump to long mode using an indirect jump?
I am writing an operating system and have just started the other CPUs, so I am in assembly on each now. I have set up long mode and now I just need to perform a far jump. However, when I try to do any ...
1 vote, 0 answers, 29 views
Multiprocessing with Pool in Python, and returned variables
I'm using some Python scripts to process images. An image is somewhere between 500x500 and 4000x4000 pixels. The scripts iterate over every pixel, so they're time consuming, so I set up the ...
0 votes, 0 answers, 300 views
Can numba, multiprocessor and random number generators work together?
I'm trying to get numba, multiprocessor and random number generators to work together. I have downsized my real problem to the following piece of code containing the important elements. The following ...
0 votes, 0 answers, 217 views
Parallelizing creating list of lists using multiprocessor
I am creating a list of lists using a function that takes a long time to calculate each of the list elements. Since this is slowing down my whole process, I'm trying to make it run faster by using the ...
-1 votes, 1 answer, 70 views
Comparing fairness in single processor vs multiprocessor
The guarded-command looping construct
do
Condition ⇒ Command
...
Condition ⇒ Command
od
involves nondeterministic choice, as explained in the text. An important theoretical concept related to ...
0 votes, 0 answers, 125 views
Why does booting APs need an indirect call in the mit6.828 example OS kernel?
I'm studying the mit6.828 course and don't understand why booting APs in lab4 must use an indirect call. I tried calling mp_main directly (the source code is: movl $mp_main, %eax; call *%eax), but it causes ...
1 vote, 1 answer, 294 views
Will my server be able to run only one client if it's a single-threaded process? If yes, why?
I have googled enough to understand threads and processes. One thing I am confused about is the single-threaded process.
The scenario is the Server-Client application process where each client is ...
0 votes, 0 answers, 62 views
How is synchronization performed in a multiprocessor system between threads that share data but run on different processors?
1.a) How is synchronization performed in a multiprocessor system between threads that share data but run on different processors?
1.b) Is single processor thread synchronization system ...
0 votes, 0 answers, 95 views
Process becomes zombie - Python3 Multiprocessing
First of all I shall explain the structure of my scripts:
Script1 --(call)--> Script2 function and this function further calls 12-15 functions in different scripts through multiprocessing, and in ...
0 votes, 0 answers, 664 views
Is there a way to simulate an M/M/C queue where each process requires more than one processor?
I have a CSV with 100k processes with their respective interarrival time, service time and number of processors required. I ran a simulation in Python but it was not even close to the theoretical results.
...
0 votes, 1 answer, 329 views
How can OpenMP's round robin scheduling hurt ccNUMA's performance?
I'm trying to understand ccNUMA systems but I'm a little bit confused about how OpenMP's scheduling can hurt performance. Let's say we have the code below. What happens if c1 is smaller than c0 ...
2 votes, 0 answers, 467 views
How to configure the Slurm workload manager for a single-node machine on CentOS 7
We just bought a single-node server with 2x18=36 cores and 200 GB of RAM for scientific calculations. There are several users and it is very hard to track jobs and to schedule them between users. This is ...
4 votes, 1 answer, 5k views
multiprocessing not achieving full CPU usage on dual-processor Windows machine
I am working on a dual-processor Windows machine and am trying to run several independent Python processes using the multiprocessing library. Of course, I am aiming to maximize the use of both CPUs ...
1 vote, 1 answer, 1k views
Atomic operation definition and multiprocessor
I'm learning about synchronization and now I'm confused about the definition of an atomic operation. Through searching, I could only find out that an atomic operation is an uninterruptible operation.
Then, ...
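To make the idea concrete, here is a minimal Python sketch (an illustration, not taken from the question) of a read-modify-write that is made to behave atomically with respect to other threads by taking a lock. The names `Counter` and `hammer` are hypothetical.

```python
import threading

class Counter:
    """A counter whose increment looks indivisible to other threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, the load, add and store of `value += 1`
        # could interleave with another thread's increment.
        with self._lock:
            self.value += 1

def hammer(counter, times):
    # Repeatedly increment the shared counter from one thread.
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

Here the lock provides mutual exclusion in software; hardware atomic instructions give a similar guarantee for a single memory operation without a lock.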
1 vote, 0 answers, 366 views
Multiprocessing Images and Pipe communication
I'm trying to speed up an imaging pipeline considering two processors.
Take images and write it to a pipe
read image and compute
Therefore, I've read something about pipes:
import os
from ...
0 votes, 1 answer, 645 views
Sending data (BytesIO buffer) through a Pipe works but causes a Fatal Python exception
Using Python 2.7 on Windows, the following code works but causes a problem with msvc.
import io
import matplotlib.pyplot as plt
import matplotlib.pyplot as plt2
from multiprocessing import Process, ...
0 votes, 1 answer, 603 views
What is meant by the scalability of an SMP system?
I'm currently learning for my final exam in operating systems and I'm stuck with a (probably very easy) question from earlier exams. The problem is, that we've never had that topic in the lecture and ...
0 votes, 0 answers, 482 views
Slow computational time with scipy odeint on a multiprocessor machine
I am running a python script on two computers of different configurations: PC-1 is more powerful than PC-2. The issue is that PC-1 has a computational time which is twice that of PC-2...obviously I ...
1 vote, 1 answer, 798 views
Linux: effect of signal on multiple threads
I don't think this is a duplicate. I have a very specific question about what happens to other threads when a signal handler is invoked.
I have a multithreaded program that plays with hardware. On ...
0 votes, 1 answer, 43 views
multiprocessing Issues: Different behaviours on different systems, ranging from correct to absolutely incorrect
I am processing large text files for data(>40MB) and doing it serially was taking a lot of time. I decided to use the python 3.5 multiprocessor package. When it works, it is significantly faster but ...
0 votes, 0 answers, 166 views
How to pass a socket descriptor to another process through shared memory?
How to pass a socket descriptor to another process through shared memory?
I pass the socket handle to control the process (getting there right handle value), then passes the handle of the control to ...
0 votes, 1 answer, 189 views
Does the pthread API provide synchronization in a multiprocessor environment?
I've just started to study the pthread API. I've been using different books and websites, and judging from what they all report, pthread synchronization functions (e.g. those involving mutexes) all ...
0 votes, 2 answers, 318 views
How to parallelise nested for loops in python
I have this function containing nested loops. I need to parallelise for faster execution of code.
def euclid_distance(X,BOW_X):
    d3=[]
    d2=[]
    for l in range(len(X)):
        for n in ...
1 vote, 2 answers, 2k views
Does Python3 automatically use all cores?
I just bumped into some weird performance "issue"/"gain" with python3. The following code loads 5 weight matrices and applies them to a fairly large dataset. While doing so it writes each row out to ...
0 votes, 1 answer, 2k views
First Come First Serve Multiprocessor (6 processors) scheduler in C code
I am developing an FCFS scheduler algorithm, but it only works with one processor. How can I divide the tasks among 6 processors? I would need a waiting queue, a ready queue, etc.
Each processor should ...
2 votes, 1 answer, 3k views
How does multithreaded kernel work?
I have read that linux kernel is multi threaded and there can be multiple threads running concurrently in each core. In a SMP (symmetric multiprocessing) environment where a single OS manages all the ...
0 votes, 1 answer, 920 views
Sharing websockets object between tornado processes
I start the tornado server with multiple processes:
server.bind(8000)
server.start(0)
Assuming I have a 4 processor system this should create 4 processes. For any client that connects I start ...
0 votes, 1 answer, 237 views
Why call join() process after results are needed?
I saw this code posted somewhere and was having trouble understanding how it could possibly work properly:
out_q = Queue()
chunksize = int(math.ceil(len(nums) / float(nprocs)))
procs = []
for i in ...
0 votes, 0 answers, 358 views
multiprocessing python code
When I try to parallelize Python code I get an assertion error. Here is the code:
check = Parallel(n_jobs=ncpu)(delayed(removeident)(h) for h in splitframe)
individually, each element (h) in ...
1 vote, 1 answer, 2k views
Error when using an Abaqus subroutine to read a file with multiple processors (CPUs)
I get an error when I use an Abaqus subroutine to read a file with multiple processors (CPUs). Could you help me deal with this error? Thanks a lot.
I want to read variables from a file; when one CPU is ...
1 vote, 3 answers, 884 views
Is synchronization needed with multiprocessor in Python?
When using code like this
def execute_run(list_out):
    ... do something
pool = ThreadPoolExecutor(6)
for i in list1:
    for j in list2:
        pool.submit(myfunc, list_out)
pool.join()
...
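A minimal sketch of one way to sidestep sharing a mutable list between workers, assuming each task can return its result instead: the tasks return values and the main thread collects them from the futures. `myfunc`, `list1` and `list2` below are hypothetical stand-ins for the elided code. Note that `ThreadPoolExecutor` has no `join()` method; leaving the `with` block waits for all submitted tasks to finish.

```python
from concurrent.futures import ThreadPoolExecutor

def myfunc(i, j):
    # Hypothetical work: return a result instead of appending to a shared list.
    return i * j

list1 = [1, 2, 3]
list2 = [10, 20]

# Exiting the `with` block calls shutdown(wait=True) on the pool.
with ThreadPoolExecutor(max_workers=6) as pool:
    futures = [pool.submit(myfunc, i, j) for i in list1 for j in list2]
    # Results are gathered in submission order by the main thread only.
    list_out = [f.result() for f in futures]

print(list_out)
```

Because only the main thread writes to `list_out`, no explicit lock is needed in this variant.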
2 votes, 3 answers, 200 views
Perl ithreads :shared variables - multiprocessor kernel threads - visibility
perlthrtut excerpt:
Note that a shared variable guarantees that if two or more threads try
to modify it at the same time, the internal state of the variable will
not become corrupted. However, ...
0 votes, 1 answer, 1k views
Multi-thread program (process) on multi-core processor(s) with hyperthreading
For multicore computing, one thing that has confused me from the beginning is that the model of multicore hardware is too abstracted from the real machine.
I worked on a laptop with a single Intel processor, ...
1 vote, 1 answer, 189 views
Hierarchical CLH lock behaviour
Could anyone explain how an HCLH lock handles the new nodes that are created in the local cluster after the Cluster Master has merged the local queue into the global queue?
0 votes, 2 answers, 453 views
Why are 2 registers required in the construction of a Regular Boolean MRSW Register?
public class RegBoolMRSWRegister implements Register<Boolean>
private boolean old;
private SafeBoolMRSWRegister value;
public void write(boolean x ) {
if (old != x ) {
...
1 vote, 2 answers, 540 views
Synchronization of data in multiple-core environment (Java-based)
this is my first question ever so please be gentle on me.
What happens when two threads, say t1 and t2, running on separate CPU cores invoke a synchronized method on a shared object AT THE SAME TIME, ...
33 votes, 4 answers, 27k views
Gradle android build for different processor architectures
I want to build 4 separate apks for 4 different Android CPU processor architectures (armeabi armeabi-v7a x86 mips) using Gradle.
I have native OpenCV libraries built for 4 CPU architectures in the ...
0 votes, 2 answers, 145 views
How does a processor know about the latest copy of a cache line in a multiprocessor system?
In a multiprocessor system where each processor has its own cache, how does a processor come to know where to get the copy of the data from?
As it will be present in its own cache, and also in the caches of other ...
0 votes, 0 answers, 61 views
How to protect critical sections of processes executing in different cores?
I have assigned the processes I create to different cores by using sched_setaffinity() and created mutex as process shared:
pthread_mutexattr_setpshared(&psharedm,PTHREAD_PROCESS_SHARED);
This ...
0 votes, 1 answer, 2k views
Memory access in multi-core processors vs multiple CPUs
I have a question:
is it possible for a multiple-processor machine to access data from RAM (single-RAM system)?
E.g., a machine has 2 processors, p1 and p2, which are executing in parallel; is it ...
1 vote, 1 answer, 262 views
Moving or designating thread stack space in Windows
I'm doing parallel programming on a NUMA computer (I do not have the computer yet, it's scheduled to arrive soon™).
I have a pool of worker threads on each NUMA node (with processor affinity set) and ...
2 votes, 1 answer, 274 views
Grid of thread blocks and Multiprocessor
The CUDA programming guide states:
The CUDA architecture is built around a scalable array of multithreaded Streaming Multiprocessors (SMs). When a CUDA program on the host CPU invokes a kernel grid, ...
0 votes, 1 answer, 81 views
Block processing patterns of GPU cards using their SM cores
I have a question about the scheduling processes of compute capability 1.3 and 2.0 GPU cards.
The maximum number of blocks scheduled at a time on a Streaming Multiprocessor is 8 in both cases, at least that's ...
0 votes, 1 answer, 61 views
How to close file so that it can be used by other processor?
I'm trying to remove file text.pckl with command os.remove('text.pckl'). I have created the file by other processor and I get error:
WindowsError: [Error 32] The process cannot access the file ...
0 votes, 1 answer, 2k views
Cache coherency: snooping vs directory based
From what I understand, a directory-based system is a more server-centric design and snooping is more peer-to-peer centric.
That is why directory based requires fewer messages for any read-miss as it can ...
3 votes, 2 answers, 167 views
How does Nvidia's Fermi GPU issue threadblocks to streaming multiprocessors?
Assume I have 8 threadblocks and my GPU has 8 SMs. Then how does the GPU issue these threadblocks to the SMs?
I found some programs or articles that suggest a breadth-first manner, that is, each SM runs a ...
0 votes, 1 answer, 2k views
How to measure Streaming Multiprocessor use/idle times in CUDA?
A simple question, really: I have a kernel which runs with the maximum number of blocks per Streaming Multiprocessor (SM) possible and would like to know how much more performance I could ...