I'm looking for best practices for optimizing smartmontools performance on a large system with >1000 drives. I'd hate for this simple tool to swamp the CPUs or memory, or to slow disk I/O, especially under regular operational loads, but I don't have a lab system big enough to test it. Are there any metrics that would let me extrapolate the system load smartmontools imposes at this scale?
My goal is to just monitor disk health basics in the background, which would let me flag a pass/fail event and then let me check logs or run more extensive tests manually when there’s a problem.
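smartd config (logging plus short self-tests, meant to be weekly; as written, the -s pattern below matches every day at 02:00, and a weekly run would need a day-of-week value such as S/../../7/02):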
/dev/sda -H -o on -S on -l selftest -l error -f -s (S/../.././02)
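One idea I'm considering, as a rough sketch rather than a tested setup: generate one smartd.conf line per drive and spread the weekly short tests over several day/hour slots, so that 1000+ drives do not all start a self-test in the same hour. The -s pattern follows the T/MM/DD/d/HH form from smartd.conf, with d as the day of week (1 = Monday to 7 = Sunday). The device names, the slot ranges, and the function name below are placeholders I made up, and I'm only assuming that staggering helps with load at all.

```python
def staggered_smartd_lines(devices, days=range(1, 8), hours=range(1, 6)):
    """Build one smartd.conf line per device with a staggered weekly short test.

    days:  day-of-week values for the -s pattern (1 = Monday ... 7 = Sunday).
    hours: start hours (0-23); the defaults are an arbitrary quiet window.
    """
    # Every (day, hour) pair is one slot; devices are assigned round-robin.
    slots = [(day, hour) for day in days for hour in hours]
    lines = []
    for index, device in enumerate(devices):
        day, hour = slots[index % len(slots)]
        # Same directives as my original line, but with an explicit
        # day-of-week field so the short test runs once a week.
        lines.append(
            f"{device} -H -o on -S on -l selftest -l error -f "
            f"-s S/../../{day}/{hour:02d}"
        )
    return lines


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    example_devices = [f"/dev/disk{n}" for n in range(1000)]
    for line in staggered_smartd_lines(example_devices)[:3]:
        print(line)
```

With the default 7 days x 5 hours this gives 35 slots, so 1000 drives would work out to at most 29 short tests starting in any one hour.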