Better SMT support in Univa Grid Engine 8.1.2 (2012-10-11)

The current release, Univa Grid Engine 8.1.2, not only improves stability further; it also includes a small enhancement for better support of heterogeneous clusters that mix hyper-threaded and non-hyper-threaded hosts (this enhancement has been available since 8.1.1). The request originally came from a larger research institute that makes use of Univa Grid Engine's core binding feature.

The situation was as follows: jobs should always be bound to cores according to the number of slots they request. This is easily solved by adding -binding linear:<n> to the submission, where <n> denotes the number of slots requested per host. However, some of the hosts had hyper-threading enabled while others did not, and the jobs were allowed to run on both host types. The institute's parallel jobs were to run two threads per core on hyper-threaded hosts, while on non-hyper-threaded machines only one thread per core was allowed.

Hence a new scheduler parameter was introduced: COUNT_CORES_AS_THREADS
It can be set globally by opening the scheduler configuration with qconf -msconf and appending the parameter to the params field (for example: params COUNT_CORES_AS_THREADS=1).

When the parameter is enabled, the scheduler does the following: on every host it requests only as many cores as are needed to have <n> processing units (hardware threads) bound to the job. In other words, the core count request is transformed on each host into a thread count request.

The following example demonstrates this:

qsub -b y -binding linear:4 sleep 123

The scheduler will allow this job to run on a hyper-threaded host when that host has 2 cores (and thus 4 hardware threads) unbound. On a non-hyper-threaded host the job needs 4 cores, so the scheduler dispatches it there only when the host offers 4 unbound cores.
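
The arithmetic behind this example can be sketched in a few lines of shell. This is only an illustration of the conversion described above, not the scheduler's actual code: the function name cores_needed is made up for this sketch, and rounding up when <n> is not a multiple of the threads per core is an assumption.

```shell
# Illustrative sketch only: cores_needed is a hypothetical helper, and the
# round-up behaviour is an assumption, not taken from the scheduler itself.
# Usage: cores_needed <requested_units> <threads_per_core>
cores_needed() {
    requested=$1
    threads_per_core=$2
    # Integer ceiling division: a partially used core still counts as one core.
    echo $(( (requested + threads_per_core - 1) / threads_per_core ))
}

cores_needed 4 2   # hyper-threaded host, 2 hardware threads per core: prints 2
cores_needed 4 1   # non-hyper-threaded host, 1 thread per core: prints 4
```

For -binding linear:4 this gives the 2 cores and 4 cores from the example above.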