0 votes
0 answers
8 views

How to Set Dask Dashboard Address with SLURMRunner (Jobqueue) and Access It via SSH Port Forwarding?

I am trying to run a Dask scheduler and workers on a remote cluster using SLURMRunner from dask-jobqueue. I want to bind the Dask dashboard to 0.0.0.0 (so it’s accessible via port forwarding) and ...
user1834164
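
A minimal sketch of one way to do this, assuming a dask-jobqueue release (0.9+) whose SLURMRunner accepts scheduler_options; paths, hostnames, and ports are placeholders:

    from dask.distributed import Client
    from dask_jobqueue.slurm import SLURMRunner

    # Launched under sbatch/srun: rank 0 becomes the scheduler, the rest workers.
    with SLURMRunner(
        scheduler_file="scheduler-{job_id}.json",
        scheduler_options={"dashboard_address": "0.0.0.0:8787"},  # bind all interfaces
    ) as runner:
        with Client(runner) as client:
            client.wait_for_workers(runner.n_workers)
            # From a laptop: ssh -L 8787:<scheduler-node>:8787 user@cluster
            # then browse http://localhost:8787
            ...
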
0 votes
0 answers
104 views

Worker log file gets mangled when log_directory is set

I used to have the worker logs as such: ./slurm-<id>.out ... So I wanted to have SLURMCluster write the worker logs in a separate directory (as opposed to the current working dir), so I ...
michaelgbj
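
A minimal sketch of the intended setup, assuming current dask-jobqueue; the directory name and resources are placeholders:

    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(
        cores=4,
        memory="16GB",
        log_directory="dask-logs",  # worker stdout/stderr land here, not ./slurm-<id>.out
    )
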
2 votes
1 answer
630 views

Difference between dask node and compute node for slurm configuration

First off, apologies if I use confusing or incorrect terminology; I am still learning. I am trying to set up the configuration for a Slurm-enabled adaptive cluster. Documentation of the supercomputer and ...
pgierz · 744
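
A sketch of how the dask-jobqueue knobs map onto Slurm concepts; all numbers are placeholders:

    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(
        cores=48,        # cores requested per Slurm job (often one compute node)
        processes=6,     # Dask worker processes inside each job
        memory="192GB",  # per-job memory, split evenly across the 6 workers
    )
    cluster.scale(jobs=4)  # 4 Slurm jobs -> 24 Dask workers, 8 threads each
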
1 vote
1 answer
604 views

Does a Dask LocalCluster shut down when the kernel restarts?

If I restart my Jupyter kernel, will any existing LocalCluster shut down, or will the Dask worker processes keep running? I know that when I used a SLURMCluster the processes keep running if I restart my ...
HashBr0wn
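
A sketch illustrating the lifecycle: LocalCluster workers are child processes of the kernel, so a kernel restart tears them down, unlike batch jobs submitted by a SLURMCluster:

    from dask.distributed import LocalCluster, Client

    cluster = LocalCluster(n_workers=4)  # children of this Python process
    client = Client(cluster)
    # ... work ...
    client.close()
    cluster.close()  # explicit cleanup; a kernel restart has the same effect
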
1 vote
1 answer
192 views

Logging in Dask

I am using a SLURM cluster and want to be able to add custom logs inside my task that should appear in the logs on the dashboard when inspecting a particular worker. Alternatively I would ...
HashBr0wn
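
One common pattern, sketched here: log through the distributed.worker logger, whose output is captured in each worker's log and is visible from the dashboard's worker pages (the task body is a placeholder):

    import logging

    logger = logging.getLogger("distributed.worker")

    def my_task(x):
        logger.info("processing %s", x)  # shows up in that worker's log
        return x * 2
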
0 votes
1 answer
1k views

Job, Worker, and Task in dask_jobqueue

I am using a SLURM cluster with Dask and don't quite understand the configuration part. The documentation talks of jobs and workers and even has a section on the difference: In dask-distributed, a ...
HashBr0wn
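
A sketch of the three terms in code, with placeholder numbers: a Slurm job is one batch submission, each job hosts one or more Dask workers, and tasks are spread across all workers:

    from dask.distributed import Client
    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(cores=8, processes=2, memory="32GB")
    cluster.scale(jobs=3)  # 3 Slurm jobs -> 6 workers, 4 threads each
    client = Client(cluster)
    futures = client.map(lambda x: x + 1, range(1000))  # 1000 tasks
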
0 votes
1 answer
182 views

How to change the Dask job_name for SGECluster

I am using dask_jobqueue.SGECluster() and when I submit jobs to the grid they are all listed as dask-worker. I want a different name for each submitted job. Here is one example: futures = [] ...
IvanV · 1
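
A sketch of one way to rename the submitted jobs by passing SGE's -N flag through; newer dask-jobqueue spells the argument job_extra_directives (older releases use job_extra), and "my-job" is a placeholder:

    from dask_jobqueue import SGECluster

    cluster = SGECluster(
        cores=4,
        memory="16GB",
        job_extra_directives=["-N my-job"],  # sets the SGE job name
    )
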
0 votes
2 answers
552 views

Dask workers get stuck in SLURM queue and won't start until the master hits the walltime

Lately, I've been trying to do some machine learning work with Dask on an HPC cluster which uses the SLURM scheduler. Importantly, on this cluster SLURM is configured to have a hard wall-time limit of ...
Marta Moreno
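
A hedged sketch of one mitigation: request shorter worker jobs that Slurm's backfill scheduler can slot in, and let workers retire cleanly via dask-worker's --lifetime flags (all values are placeholders):

    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(
        cores=8,
        memory="32GB",
        walltime="01:00:00",
        worker_extra_args=["--lifetime", "55m", "--lifetime-stagger", "4m"],
    )
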
0 votes
1 answer
131 views

How to speed up launching workers when the number of workers is large?

Currently, I use dask_jobqueue to parallelize my code, and I have difficulty setting up a cluster quickly when the number of workers is large. When I scale up the number of workers (say more than ...
Yuki · 1
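
A sketch of one lever: pack several workers into each Slurm job so far fewer jobs have to be submitted and scheduled (numbers are placeholders):

    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(cores=32, processes=8, memory="128GB")
    cluster.scale(jobs=16)  # 16 submissions instead of 128 one-worker jobs
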
1 vote
1 answer
120 views

Reconfigure Dask jobqueue on the fly

I have a jobqueue configuration for Slurm which looks something like: cluster = SLURMCluster(cores=20, processes=2, memory='62GB', ...
Albatross · 1,055
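
A sketch of the blunt but reliable approach, since cluster objects are cheap to build: close the old cluster and create a new one with the changed parameters (values are placeholders):

    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(cores=20, processes=2, memory="62GB")
    # ... requirements change ...
    cluster.close()
    cluster = SLURMCluster(cores=20, processes=10, memory="62GB")
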
1 vote
1 answer
728 views

Dask Jobqueue - Why does using processes result in cancelled jobs?

Main issue: I'm using Dask Jobqueue on a Slurm supercomputer. My workload includes a mix of threaded (i.e. numpy) and Python workloads, so I think a balance of threads and processes would be best for ...
Albatross · 1,055
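
A sketch of the memory arithmetic that often bites here: with processes > 1 the per-job memory is divided among the worker processes, so each worker's limit is smaller than the headline number (values are placeholders):

    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(
        cores=20,
        processes=4,     # 4 workers x 5 threads per Slurm job
        memory="120GB",  # each worker is budgeted ~30GB of this
    )
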
5 votes
0 answers
1k views

Dask distributed KeyError

I am trying to learn Dask using a small example. Basically I read in a file and calculate row means.
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(cores=4, memory='24 GB')
cluster....
Phoenix Mu
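
A sketch of the row-means computation itself, assuming a CSV of numeric columns (the filename is a placeholder):

    import dask.dataframe as dd

    df = dd.read_csv("data.csv")
    row_means = df.mean(axis=1).compute()  # per-row mean as a pandas Series
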
1 vote
1 answer
462 views

Dask jobqueue job killed due to permission

I'm trying to use Dask job-queue on our HPC system. This is the code I'm using:
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(cores=2, memory='20GB', processes=1, ...
Phoenix Mu
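
A sketch of the usual fix for permission-related worker deaths: point scratch and log paths at directories the batch job can write to (both paths are placeholders):

    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(
        cores=2,
        memory="20GB",
        processes=1,
        local_directory="/tmp",              # worker scratch space
        log_directory="/home/me/dask-logs",  # must be writable from compute nodes
    )
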
0 votes
1 answer
273 views

Dask: Would storage network speed cause a worker to die?

I am running a process that writes large files across the storage network. I can run the process using a simple loop and I get no failures. I can run using distributed and jobqueue during off-peak ...
schierkolk
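
A sketch of one diagnostic step: loosen the communication timeouts so slow storage is less likely to be mistaken for a dead worker (the values are placeholders; the keys are standard distributed config settings):

    import dask

    dask.config.set({
        "distributed.comm.timeouts.connect": "60s",
        "distributed.comm.timeouts.tcp": "120s",
    })
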
0 votes
1 answer
170 views

How can I keep a PBSCluster running?

I have access to a cluster running PBS Pro and would like to keep a PBSCluster instance running on the headnode. My current (obviously broken) script is:
import dask_jobqueue
from paths import ...
Trauer · 2,090
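
A sketch of the minimal keep-alive pattern: the scheduler lives inside this Python process, so the script must block instead of exiting (resources are placeholders):

    import time
    from dask_jobqueue import PBSCluster

    cluster = PBSCluster(cores=24, memory="100GB")
    cluster.scale(jobs=2)
    try:
        while True:
            time.sleep(60)  # the scheduler dies when this process exits
    except KeyboardInterrupt:
        cluster.close()
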
4 votes
1 answer
275 views

Is there a way of using dask jobqueue over ssh?

Dask jobqueue seems to be a very nice solution for distributing jobs to PBS/Slurm-managed clusters. However, if I'm understanding its use correctly, you must create an instance of "PBSCluster/...
Phil Reinhold
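
A sketch of the usual workaround: the cluster object must run where qsub/sbatch exist (the login node), and a local client attaches through an SSH tunnel; hostnames and ports are placeholders:

    # On the login node:
    #     from dask_jobqueue import PBSCluster
    #     cluster = PBSCluster(cores=8, memory="32GB",
    #                          scheduler_options={"port": 8786})
    #
    # Locally, after:  ssh -L 8786:localhost:8786 user@login-node
    from dask.distributed import Client

    client = Client("tcp://localhost:8786")
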
0 votes
1 answer
222 views

Trouble with setting PBS Cluster using dask that finds my own modules

I am running into some errors when trying to set up my own client using a jobqueue PBSCluster instead of using a default local cluster (i.e., client = Client()). When setting the default, my own ...
DanS · 1
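
A sketch of the common fix: extend PYTHONPATH in the generated job script so workers can import your modules. Newer dask-jobqueue spells this job_script_prologue (older releases: env_extra); the path is a placeholder:

    from dask_jobqueue import PBSCluster

    cluster = PBSCluster(
        cores=4,
        memory="16GB",
        job_script_prologue=["export PYTHONPATH=/path/to/my/modules:$PYTHONPATH"],
    )
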
0 votes
2 answers
1k views

Create local_directory for dask_jobqueue

I'm trying to run dask on an HPC system that uses NFS for storage. As such, I want to configure dask to use local storage for scratch space. Each cluster node has a /scratch/ folder that all users can ...
lsterzinger
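
A sketch of pointing worker scratch space at node-local storage instead of NFS; the path is a placeholder, and the prologue line assumes a job_script_prologue-era dask-jobqueue:

    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(
        cores=8,
        memory="32GB",
        local_directory="/scratch/$USER",                 # expanded in the job script
        job_script_prologue=["mkdir -p /scratch/$USER"],  # make sure it exists
    )
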
0 votes
1 answer
632 views

Custom job script submission to PBS via Dask?

I have a PBS job script with an executable that writes results to an out file.
### some lines
PBS_O_EXEDIR="path/to/software"
EXECUTABLE="executablefile"
OUTFILE="out"
### Copy application directory ...
ranjith · 135
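
A sketch of how custom script lines can be injected and the generated submission script inspected before anything runs; the directives and shell lines are placeholders mirroring the snippet above:

    from dask_jobqueue import PBSCluster

    cluster = PBSCluster(
        cores=4,
        memory="16GB",
        job_extra_directives=["-o out"],                  # extra #PBS lines
        job_script_prologue=['cp -r "$PBS_O_EXEDIR" .'],  # runs before dask-worker
    )
    print(cluster.job_script())  # review before scaling up
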