This is my approach using queues on an 8-GPU machine.
First, I set up a queue for each GPU. This only needs to be done once.
for i in {0..7}
do
  guild run --background --yes queue gpus=$i
done
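If the GPU count varies between machines, the loop can derive it from nvidia-smi instead of hard-coding 0..7. A sketch of that variation (it assumes nvidia-smi is on the PATH):

n_gpus=$(nvidia-smi --list-gpus | wc -l)
for i in $(seq 0 $((n_gpus - 1)))
do
  # one queue pinned to each GPU index
  guild run --background --yes queue gpus=$i
done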
I can check that my queues are running with
guild runs -Fo queue
To submit jobs, I use, e.g.,
guild run --stage-trials --quiet train.py learning_rate='logspace[-3:0:4]' batch_size='[1,2,5,10,20]'
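As a sanity check on what that grid expands to: logspace[-3:0:4] should give four log-spaced learning rates (0.001, 0.01, 0.1, 1.0, assuming base 10 as in numpy.logspace), so crossed with the five batch sizes it stages 4 x 5 = 20 trials. If you prefer to spell the grid out, a nested loop with --stage is an equivalent sketch:

for lr in 0.001 0.01 0.1 1.0
do
  for bs in 1 2 5 10 20
  do
    # stage one trial; an idle queue will pick it up
    guild run --stage --yes train.py learning_rate=$lr batch_size=$bs
  done
done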
The queues automatically pick up the staged trials and run them, each on its assigned GPU.
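When the sweep is done and I want the GPUs back, I stop the queues. Something like the following should work, assuming guild stop accepts the same operation filter as guild runs (check guild stop --help for your version):

guild stop -Fo queue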