Finding nvml.h

File this entry under "preventing others from banging their head against a wall as long as I did".

I've been playing with NVIDIA's NVML library recently. It provides a bunch of handy functions for querying the health and configuration of Tesla GPUs, setting compute mode and ECC state, etc.

However, while the Python bindings (and presumably the Perl ones) work just fine on their own, I ran into problems when I tried writing something in C. Specifically, I couldn't find the nvml.h header file.

Turns out it lives in the Tesla Deployment Kit, along with a handy program called nvidia-healthmon which fits neatly into your NHC or prologue/epilogue scripts for doing GPU sanity checks. You'll also need nvml.h if you want to do something like build Torque with NVIDIA GPU support. (Torque's website completely fails to mention you need the TDK, incidentally.)
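
For the record, here's roughly the kind of thing I was trying to build. Treat it as a minimal sketch rather than anything official: the function name count_gpus is made up for illustration, and the __has_include guard is my own addition (a GCC/Clang extension, standard in C23) so that the file still compiles on a box without the TDK and tells you what's missing instead of dying with "nvml.h: No such file or directory". The NVML calls themselves (nvmlInit, nvmlDeviceGetCount, nvmlErrorString, nvmlShutdown) are the real ones from the header.

```c
#include <stdio.h>

/* Pull in nvml.h only if the compiler can actually find it.
   This guard is an assumption of this sketch, not something NVML requires. */
#if defined(__has_include)
#  if __has_include(<nvml.h>)
#    include <nvml.h>
#    define HAVE_NVML 1
#  endif
#endif

/* Hypothetical helper: returns the number of GPUs NVML can see,
   or -1 if NVML was unavailable at compile time or a call failed. */
int count_gpus(void)
{
#ifdef HAVE_NVML
    unsigned int count = 0;

    nvmlReturn_t rc = nvmlInit();
    if (rc != NVML_SUCCESS) {
        fprintf(stderr, "nvmlInit failed: %s\n", nvmlErrorString(rc));
        return -1;
    }

    rc = nvmlDeviceGetCount(&count);
    if (rc != NVML_SUCCESS) {
        fprintf(stderr, "nvmlDeviceGetCount failed: %s\n", nvmlErrorString(rc));
        nvmlShutdown();
        return -1;
    }

    nvmlShutdown();
    return (int)count;
#else
    /* Header was not on the include path when this file was compiled. */
    fprintf(stderr, "nvml.h not found at compile time; "
                    "install the TDK and pass its include directory with -I\n");
    return -1;
#endif
}
```

Call count_gpus() from a main() of your own and build with something like "gcc -I/path/to/tdk/include yourprog.c -lnvidia-ml". The include path there is a placeholder, since where the TDK unpacks nvml.h depends on how you installed it; libnvidia-ml itself ships with the driver.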

OK, public service announcement achieved. Back to hacking... :)