Article 6MC3J trixie, slurmctld problem, troubleshooting loop

by kaz2100, from LinuxQuestions.org (#6MC3J)
Dear Debian experts:

System
testing (trixie) amd64. A while ago, this penguin crashed after a dist-upgrade and was then resuscitated (a minimal system was clean-installed, then many packages were reinstalled following the apt log).

Problem
The cluster system (slurm) stopped working.

Troubleshooting
1. According to the logs, slurmctld fails to start.
2. systemctl status slurmctld.service reports "slurmctld: fatal: Can't find plugin for select/cons_res".
3. For some reason, /usr/lib/x86_64-linux-gnu/slurm-wlm/select_cons_res.so is missing from the current (trixie) repo, while the bookworm repo has it.
4. To rule out a mix-up, I tried to purge slurmctld and reinstall it, but aptitude purge slurmctld says that the pre-removal script fails.
5. I looked at /var/lib/dpkg/info/slurmctld.prerm; this script complains "slurm_load_jobs error: Unable to contact slurm controller (connect failure)".
6. Go to step 1 above.
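
Steps 2 and 3 above can be cross-checked with a small script. This is only a rough sketch, not a fix: it reads the SelectType setting from a slurm.conf and tests whether the matching plugin file exists in a plugin directory. The function name check_select_plugin is made up for illustration, and the mapping from select/cons_res to select_cons_res.so is inferred from the error message and the path quoted above. It only matches the key spelled exactly "SelectType".

```shell
#!/bin/sh
# check_select_plugin CONF PLUGIN_DIR   (hypothetical helper, illustration only)
# Prints whether the plugin named by SelectType in CONF exists in PLUGIN_DIR.
# Returns 0 if found, 1 if missing, 2 if no SelectType line is set.
check_select_plugin() {
    conf=$1
    plugin_dir=$2

    # Take the last uncommented "SelectType=..." line, e.g. select/cons_res.
    select_type=$(sed -n 's/^[[:space:]]*SelectType[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$conf" | tail -n 1)

    if [ -z "$select_type" ]; then
        echo "no SelectType set in $conf"
        return 2
    fi

    # Assumed naming scheme: select/cons_res -> select_cons_res.so
    plugin_file="$plugin_dir/$(printf '%s' "$select_type" | tr '/' '_').so"

    if [ -e "$plugin_file" ]; then
        echo "found $plugin_file"
        return 0
    fi
    echo "missing $plugin_file"
    return 1
}
```

On the system described here it might be called as check_select_plugin /etc/slurm/slurm.conf /usr/lib/x86_64-linux-gnu/slurm-wlm, where the slurm.conf location is an assumption (it may live elsewhere) and the plugin directory is the one from step 3. A "missing" result would mean the configuration asks for a plugin that the installed package does not ship.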

Question
Am I on the right track?
Is there a good way to get out of this loop?
Another system upgrade does not help.

Any information will be appreciated.