
It seems impossible to pin the threads #5590

@rongcuid

I have tried many methods to pin OpenBLAS threads (for microbenchmarking purposes), but none of the methods the documentation implies seem to work. I have tried every combination of:

  • NO_AFFINITY=1 or NO_AFFINITY=0
  • OPENBLAS_NUM_THREADS
  • GOTOBLAS_MAIN_FREE, OPENBLAS_MAIN_FREE
  • openblas_set_num_threads before any call, right before setting affinity, and after setting affinity.
  • openblas_setaffinity with cpu_set_t masks that are known to be correct in other microbenchmarks.
  • Whether I pin the main thread or not.

No matter what I do, OpenBLAS uses more threads than I allow, with one exception: setting OPENBLAS_NUM_THREADS. It seems to automatically increase the number of threads until it uses all hyperthreads on the physical cores, and sometimes more.
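To make "uses more threads than I allow" measurable, one option is to list the threads of the process together with the CPUs each one is allowed to run on. The following is a minimal Linux-only sketch that uses only glibc and /proc, not any OpenBLAS API. The helper name `dump_thread_affinities` is hypothetical and introduced here only for illustration.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper (not part of OpenBLAS): prints every thread of the
 * calling process with the CPUs it may run on. Returns the number of threads
 * found, or -1 if /proc/self/task cannot be opened. Linux-only. */
static int dump_thread_affinities(FILE *out) {
  DIR *dir = opendir("/proc/self/task");
  if (!dir) return -1;

  int count = 0;
  struct dirent *ent;
  while ((ent = readdir(dir)) != NULL) {
    char *end;
    long tid = strtol(ent->d_name, &end, 10);
    if (*end != '\0' || tid <= 0) continue; /* skips "." and ".." */
    ++count;

    cpu_set_t set;
    CPU_ZERO(&set);
    /* On Linux, sched_getaffinity accepts a thread id here. */
    if (sched_getaffinity((pid_t)tid, sizeof(set), &set) != 0) {
      fprintf(out, "tid %ld: affinity unavailable\n", tid);
      continue;
    }
    fprintf(out, "tid %ld:", tid);
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
      if (CPU_ISSET(cpu, &set)) fprintf(out, " %d", cpu);
    fputc('\n', out);
  }
  closedir(dir);
  return count;
}
```

Calling `dump_thread_affinities(stderr)` once after the first BLAS call and again after the pinning loop would show whether the thread count and the masks are what was requested. It reports allowed CPUs only, not where a thread is currently running.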

How exactly does OpenBLAS thread affinity work? Am I supposed to set NO_AFFINITY to 1 or 0? Does openblas_set_num_threads have any effect at run time?

Here's my config:

OpenBLAS 0.3.30 NO_AFFINITY HASWELL MAX_THREADS=28

This is how I pin the threads (using hwloc):

    /* Pin OpenBLAS thread i to hardware thread `pin_offset` of core i. */
    for (int i = 0; i < threads; ++i) {
      int core = i;
      int ht = pin_offset;
      /* Look up the i-th core in hwloc's logical ordering. */
      hwloc_obj_t core_obj = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, core);
      if (!core_obj) {
        fprintf(stderr, "Failed to get core\n");
        exit(1);
      }
      /* Look up the ht-th processing unit (hyperthread) below that core. */
      hwloc_obj_t ht_obj = hwloc_get_obj_below_by_type(topo, HWLOC_OBJ_CORE,
                                                       core, HWLOC_OBJ_PU, ht);
      if (!ht_obj) {
        fprintf(stderr, "Failed to get thread\n");
        exit(1);
      }
      /* os_index is unsigned, so print it with %u. */
      fprintf(stderr, "Logical core [%d:%d] is physical [%u:%u]\n", core, ht,
              core_obj->os_index, ht_obj->os_index);
      /* Convert the hwloc cpuset into a glibc mask and hand it to OpenBLAS. */
      cpu_set_t cpu_set;
      CPU_ZERO(&cpu_set);
      hwloc_cpuset_to_glibc_sched_affinity(topo, ht_obj->cpuset, &cpu_set,
                                           sizeof(cpu_set));
      if (openblas_setaffinity(i, sizeof(cpu_set_t), &cpu_set)) {
        perror("openblas_setaffinity()");
        exit(1);
      }
    }
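As a sanity check on the masks themselves, independent of OpenBLAS, the same kind of single-CPU mask can be applied to an ordinary pthread and read back. This is only a sketch under that assumption. The helper names `first_allowed_cpu` and `pin_and_verify` are hypothetical, and nothing here says how OpenBLAS manages its own worker threads.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: returns the lowest CPU the calling thread may run on,
 * or -1 if the mask cannot be read. */
static int first_allowed_cpu(void) {
  cpu_set_t set;
  CPU_ZERO(&set);
  if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
    if (CPU_ISSET(cpu, &set)) return cpu;
  return -1;
}

/* Hypothetical helper: pins `thread` to the single CPU `cpu`, then reads the
 * mask back. Returns 0 if the mask is exactly {cpu}, the error number if a
 * pthread call fails, or -1 if the mask read back differs. */
static int pin_and_verify(pthread_t thread, int cpu) {
  cpu_set_t want, got;
  CPU_ZERO(&want);
  CPU_SET(cpu, &want);

  /* pthread_*affinity_np return an error number; they do not set errno. */
  int rc = pthread_setaffinity_np(thread, sizeof(want), &want);
  if (rc != 0) return rc;

  CPU_ZERO(&got);
  rc = pthread_getaffinity_np(thread, sizeof(got), &got);
  if (rc != 0) return rc;

  return CPU_EQUAL(&want, &got) ? 0 : -1;
}
```

For example, `pin_and_verify(pthread_self(), first_allowed_cpu())` pins the calling thread and confirms the kernel accepted the mask. If this succeeds with the masks produced by hwloc, the masks are probably not the problem.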

Is this a bug? A misuse? I have no idea.
