I have tried many methods to pin OpenBLAS threads (for microbenchmarking purposes), but none of the implied methods work. I have tried every combination of:
- NO_AFFINITY=1 or NO_AFFINITY=0
- OPENBLAS_NUM_THREADS
- GOTOBLAS_MAIN_FREE, OPENBLAS_MAIN_FREE
- openblas_set_num_threads before any call, right before setting affinity, and after setting affinity.
- openblas_setaffinity with cpu_set_t masks that are known to be correct in other microbenchmarks.
- Whether I pin the main thread or not.
No matter what I do, OpenBLAS uses more threads than I allow, except in the one case of setting OPENBLAS_NUM_THREADS. It seems to automatically increase the number of threads until it uses all hyperthreads on the physical cores, and sometimes more.
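One way to check how many threads actually exist, independent of OpenBLAS itself, is to list the kernel's view of the process. The following is only a sketch and assumes Linux with glibc; the helper name dump_thread_affinity is hypothetical and not part of OpenBLAS. It walks /proc/self/task and prints the CPUs each thread is allowed to run on, returning the number of threads found.

```c
#define _GNU_SOURCE
#include <ctype.h>
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper (not an OpenBLAS API): print the allowed CPUs of every
 * thread in the calling process and return the thread count, or -1 if
 * /proc/self/task cannot be opened. Linux only. */
static int dump_thread_affinity(FILE *out) {
    DIR *dir = opendir("/proc/self/task");
    if (!dir) {
        return -1;
    }
    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Each thread is a directory named after its numeric TID; skip "." and "..". */
        if (!isdigit((unsigned char)entry->d_name[0])) {
            continue;
        }
        pid_t tid = (pid_t)strtol(entry->d_name, NULL, 10);
        cpu_set_t set;
        CPU_ZERO(&set);
        /* On Linux, sched_getaffinity accepts a thread ID in place of a PID. */
        if (sched_getaffinity(tid, sizeof(set), &set) == 0) {
            fprintf(out, "tid %d may run on %d CPU(s):", (int)tid, CPU_COUNT(&set));
            for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
                if (CPU_ISSET(cpu, &set)) {
                    fprintf(out, " %d", cpu);
                }
            }
            fputc('\n', out);
        } else {
            fprintf(out, "tid %d: sched_getaffinity failed\n", (int)tid);
        }
        ++count;
    }
    closedir(dir);
    return count;
}
```

Calling dump_thread_affinity(stderr) once before the first BLAS call and once after it would show whether extra threads appeared and which CPUs each one may use. The count includes the main thread and any threads created by other libraries.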
How exactly does OpenBLAS thread affinity work? Am I supposed to set NO_AFFINITY to 1 or 0? Does openblas_set_num_threads have any effect at run time?
Here's my config:
OpenBLAS 0.3.30 NO_AFFINITY HASWELL MAX_THREADS=28
This is how I pin the threads (using hwloc):
for (int i = 0; i < threads; ++i) {
    int core = i;
    int ht = pin_offset;
    hwloc_obj_t core_obj = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, core);
    if (!core_obj) {
        fprintf(stderr, "Failed to get core\n");
        exit(1);
    }
    hwloc_obj_t ht_obj = hwloc_get_obj_below_by_type(topo, HWLOC_OBJ_CORE,
                                                     core, HWLOC_OBJ_PU, ht);
    if (!ht_obj) {
        fprintf(stderr, "Failed to get thread\n");
        exit(1);
    }
    fprintf(stderr, "Logical core [%d:%d] is physical [%d:%d]\n", core, ht,
            core_obj->os_index, ht_obj->os_index);
    cpu_set_t cpu_set;
    CPU_ZERO(&cpu_set);
    hwloc_cpuset_to_glibc_sched_affinity(topo, ht_obj->cpuset, &cpu_set,
                                         sizeof(cpu_set));
    if (openblas_setaffinity(i, sizeof(cpu_set_t), &cpu_set)) {
        perror("openblas_setaffinity()");
        exit(1);
    }
}
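To rule out the mask itself as the problem, the same kind of single-CPU cpu_set_t can be applied to the calling thread with plain glibc calls and read back. This is a minimal sketch assuming Linux with glibc; pin_calling_thread and first_allowed_cpu are hypothetical helper names, and the code does not involve OpenBLAS or hwloc.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: return the lowest CPU index the calling thread is
 * currently allowed to run on, or -1 on error. */
static int first_allowed_cpu(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        return -1;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) {
            return cpu;
        }
    }
    return -1;
}

/* Hypothetical helper: pin the calling thread to one OS CPU index, then read
 * the mask back to confirm the kernel applied exactly that mask.
 * Returns 0 on success, -1 on failure. */
static int pin_calling_thread(int os_cpu) {
    if (os_cpu < 0 || os_cpu >= CPU_SETSIZE) {
        return -1;
    }
    cpu_set_t want, got;
    CPU_ZERO(&want);
    CPU_SET(os_cpu, &want);
    /* A pid of 0 means "the calling thread". */
    if (sched_setaffinity(0, sizeof(want), &want) != 0) {
        return -1;
    }
    CPU_ZERO(&got);
    if (sched_getaffinity(0, sizeof(got), &got) != 0) {
        return -1;
    }
    return CPU_EQUAL(&want, &got) ? 0 : -1;
}
```

If pin_calling_thread succeeds for the os_index values printed by the loop above, the masks are at least acceptable to the kernel, which narrows the question to how OpenBLAS applies them to its worker threads.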
Is this a bug? A misuse? I have no idea.