19 questions
0
votes
0
answers
8
views
How to Set Dask Dashboard Address with SLURMRunner (Jobqueue) and Access It via SSH Port Forwarding?
I am trying to run a Dask Scheduler and Workers on a remote cluster using SLURMRunner from dask-jobqueue. I want to bind the Dask dashboard to 0.0.0.0 (so it’s accessible via port forwarding) and ...
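For the port-forwarding half of this question: once the dashboard is listening on the scheduler node (on the cluster classes this is requested with `scheduler_options={"dashboard_address": ":8787"}`; check the SLURMRunner signature in your dask-jobqueue version for the equivalent), a standard SSH tunnel reaches it from a local machine. Host and user names below are placeholders:

```shell
# Forward local port 8787 to the dashboard on the node running the scheduler.
# "compute-node-01" and "login.cluster.example" are placeholder host names.
ssh -N -L 8787:compute-node-01:8787 user@login.cluster.example
# Then open http://localhost:8787 in a local browser.
```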
0
votes
0
answers
104
views
Worker log file gets mangled when log_directory is set
I used to have the worker logs like this:
./slurm-<id>.out
...
So I wanted SLURMCluster to write the worker logs to a separate directory (as opposed to the current working dir), so I ...
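For context, dask-jobqueue's cluster classes accept a `log_directory` keyword that redirects each job's stdout/stderr. A minimal sketch; the directory name is a placeholder and must exist before jobs are submitted:

```python
from dask_jobqueue import SLURMCluster

# Write each job's stdout/stderr under ./dask-logs instead of the CWD.
# The directory name is a placeholder; create it before scaling the cluster.
cluster = SLURMCluster(
    cores=4,
    memory="16GB",
    log_directory="dask-logs",
)
```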
2
votes
1
answer
630
views
Difference between dask node and compute node for slurm configuration
First off, apologies if I use confusing or incorrect terminology, I am still learning.
I am trying to set up configuration for a Slurm-enabled adaptive cluster.
Documentation of the supercomputer and ...
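The mapping behind this terminology question: one dask-jobqueue "job" is one submission to Slurm (typically one compute node's allocation), and the `processes` argument splits that allocation into several Dask workers. A sketch with illustrative numbers:

```python
from dask_jobqueue import SLURMCluster

# One Slurm job = one allocation of 48 cores / 192 GB (illustrative numbers).
# processes=6 splits it into 6 Dask workers of 8 threads / 32 GB each.
cluster = SLURMCluster(cores=48, processes=6, memory="192GB")
cluster.scale(jobs=2)  # 2 Slurm jobs -> 12 Dask workers in total
```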
1
vote
1
answer
604
views
Does Dask LocalCluster Shut Down When the Kernel Restarts?
If I restart my Jupyter kernel, will any existing LocalCluster shut down, or will the dask worker processes keep running?
I know that when I used a SLURM Cluster the processes keep running if I restart my ...
1
vote
1
answer
192
views
Logging in Dask
I am using a SLURM cluster and want to be able to add custom logs inside my task that should appear in the logs on the dashboard when inspecting a particular worker.
Alternatively I would ...
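One pattern that usually works here: log to the `distributed.worker` logger, which the workers configure, so messages land in the per-worker logs that the dashboard shows. A minimal, locally runnable sketch; submitting `process` with `client.submit` is assumed to happen on the real cluster:

```python
import logging

# Workers configure the "distributed.worker" logger, so messages sent to it
# show up in each worker's log (viewable from the dashboard's worker page).
logger = logging.getLogger("distributed.worker")

def process(x):
    logger.warning("processing %r", x)  # warning level is emitted by default
    return x * 2

# on the cluster: client.submit(process, 10)
```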
0
votes
1
answer
1k
views
Job, Worker, and Task in dask_jobqueue
I am using a SLURM cluster with Dask and don't quite understand the configuration part. The documentation talks of jobs and workers and even has a section on the difference:
In dask-distributed, a ...
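Concretely: a job is what dask-jobqueue submits to the batch system, each job starts one or more workers, and tasks are the units of work the scheduler later assigns to those workers. A sketch with illustrative numbers; `job_script()` shows what one job actually submits:

```python
from dask_jobqueue import SLURMCluster

# One "job" is one sbatch submission; processes=2 makes it start 2 Dask
# workers, each with cores/processes = 4 threads.
cluster = SLURMCluster(cores=8, processes=2, memory="16GB")
print(cluster.job_script())  # inspect the generated sbatch script
```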
0
votes
1
answer
182
views
How to change the dask job_name in SGECluster
I am using dask_jobqueue.SGECluster(), and when I submit jobs to the grid they are all listed as dask-worker. I want to have a different name for each submitted job.
Here is one example:
futures = []
...
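A hedged sketch: the submitted job name can be overridden through the extra-directives hook (`job_extra_directives` in recent dask-jobqueue releases; older releases call it `job_extra`), passing SGE's `-N` flag. The name below is a placeholder:

```python
from dask_jobqueue import SGECluster

# Pass SGE's -N flag so submitted jobs are named "my-analysis" instead of
# "dask-worker". On older dask-jobqueue releases use job_extra=[...] instead.
cluster = SGECluster(
    cores=4,
    memory="16GB",
    job_extra_directives=["-N my-analysis"],
)
```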
0
votes
2
answers
552
views
Dask workers get stuck in SLURM queue and won't start until the master hits the walltime
Lately, I've been trying to do some machine learning work with Dask on an HPC cluster which uses the SLURM scheduler. Importantly, on this cluster SLURM is configured to have a hard wall-time limit of ...
0
votes
1
answer
131
views
How to speed up launching workers when the number of workers is large?
Currently, I use dask_jobqueue to parallelize my code, and I have difficulty setting up a cluster quickly when the number of workers is large.
When I scale up the number of workers (say more than ...
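One common lever for this: submit fewer, larger jobs, so the batch scheduler handles dozens of submissions rather than hundreds, and let `processes` provide many workers per job. A sketch with illustrative numbers:

```python
from dask_jobqueue import SLURMCluster

# 25 Slurm jobs x 8 workers each = 200 workers, instead of 200 separate
# single-worker submissions that the queue must schedule one by one.
cluster = SLURMCluster(cores=32, processes=8, memory="128GB")
cluster.scale(jobs=25)
```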
1
vote
1
answer
120
views
Reconfigure Dask jobqueue on the fly
I have a jobqueue configuration for Slurm which looks something like:
cluster = SLURMCluster(cores=20,
processes=2,
memory='62GB',
...
1
vote
1
answer
728
views
Dask Jobqueue - Why does using processes result in cancelled jobs?
Main issue
I'm using Dask Jobqueue on a Slurm supercomputer. My workload includes a mix of threaded (e.g. numpy) and pure-Python workloads, so I think a balance of threads and processes would be best for ...
5
votes
0
answers
1k
views
Dask distributed KeyError
I am trying to learn Dask using a small example. Basically I read in a file and calculate row means.
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(cores=4, memory='24 GB')
cluster....
1
vote
1
answer
462
views
Dask jobqueue job killed due to permission
I'm trying to use Dask jobqueue on our HPC system, and this is the code I'm using:
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(cores=2, memory='20GB', processes=1,
...
0
votes
1
answer
273
views
Dask: Would storage network speed cause a worker to die?
I am running a process that writes large files across the storage network. I can run the process using a simple loop and I get no failures. I can run using distributed and jobqueue during off peak ...
0
votes
1
answer
170
views
How can I keep a PBSCluster running?
I have access to a cluster running PBS Pro and would like to keep a PBSCluster instance running on the headnode. My current (obviously broken) script is:
import dask_jobqueue
from paths import ...
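The underlying issue: the cluster is torn down when the Python process that owns it exits, so the usual workaround is a persistent process (under tmux/screen or nohup) that publishes the scheduler address for other scripts to connect to. A sketch; the file name is arbitrary:

```python
import time
from dask_jobqueue import PBSCluster

cluster = PBSCluster(cores=4, memory="16GB")
cluster.scale(jobs=2)

# Publish the scheduler address so other processes can connect with
# Client("tcp://..."). The file name is arbitrary.
with open("scheduler-address.txt", "w") as f:
    f.write(cluster.scheduler_address)

while True:      # the cluster dies with this process, so keep it alive
    time.sleep(60)
```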
4
votes
1
answer
275
views
Is there a way of using dask jobqueue over SSH?
Dask jobqueue seems to be a very nice solution for distributing jobs to PBS/Slurm managed clusters. However, if I'm understanding its use correctly, you must create an instance of "PBSCluster/...
0
votes
1
answer
222
views
Trouble with setting PBS Cluster using dask that finds my own modules
I am running into some errors when trying to set up my own client using a jobqueue PBSCluster instead of the default local cluster (i.e., client = Client()).
When setting the default, my own ...
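A likely fix, sketched under the assumption that the failure is an import error on the workers: prepend the module path to `PYTHONPATH` in the generated job script via `job_script_prologue` (called `env_extra` in older dask-jobqueue releases). The path below is a placeholder:

```python
from dask_jobqueue import PBSCluster

# Make the worker jobs see the same modules as the client; the path is a
# placeholder. Older dask-jobqueue releases spell this keyword env_extra.
cluster = PBSCluster(
    cores=4,
    memory="16GB",
    job_script_prologue=["export PYTHONPATH=/home/me/mymodules:$PYTHONPATH"],
)
```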
0
votes
2
answers
1k
views
Create local_directory for dask_jobqueue
I'm trying to run dask on an HPC system that uses NFS for storage. As such, I want to configure dask to use local storage for scratch space. Each cluster node has a /scratch/ folder that all users can ...
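A sketch for this setup: point `local_directory` at node-local scratch and create the per-user directory in the job prologue so it exists before the worker starts. Paths are placeholders:

```python
from dask_jobqueue import SLURMCluster

# Spill/scratch to node-local storage instead of NFS; paths are placeholders.
cluster = SLURMCluster(
    cores=4,
    memory="24GB",
    local_directory="/scratch/myuser/dask",
    job_script_prologue=["mkdir -p /scratch/myuser/dask"],
)
```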
0
votes
1
answer
632
views
Custom job script submission to PBS via Dask?
I have a PBS job script with an executable that writes results to an out file.
### some lines
PBS_O_EXEDIR="path/to/software"
EXECUTABLE="executablefile"
OUTFILE="out"
### Copy application directory ...
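dask-jobqueue generates its own job script to start workers, so an existing hand-written script cannot be submitted as-is; extra shell lines (environment setup, copies) can, however, be injected ahead of the worker command via `job_script_prologue` (`env_extra` in older releases). A sketch reusing the question's placeholder path:

```python
from dask_jobqueue import PBSCluster

# Inject custom shell lines before the dask worker command; the lines below
# stand in for the original script's setup steps.
cluster = PBSCluster(
    cores=4,
    memory="16GB",
    job_script_prologue=[
        'PBS_O_EXEDIR="path/to/software"',
        "cp -r $PBS_O_EXEDIR $TMPDIR",
    ],
)
print(cluster.job_script())  # inspect the generated script before scaling
```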