
[Graph API] Is compiled_partition thread-safe for concurrent calls on same instance? #4381

@lxq2t

Description

We observe crashes (double-free, heap corruption) when multiple threads concurrently call execute() on the same compiled_partition instance. The crash occurs inside matmul_t::execute_impl during scratchpad deallocation.

oneDNN version: 3.6.0
Graph contains: MatMul operations
OS: Linux 6.8.0-060800rc6-generic

Can you clarify:

  1. Is compiled_partition::execute() guaranteed thread-safe for concurrent calls on the same instance?
  2. If not, should we maintain a pool of compiled_partition instances (one per concurrent execution)?
  3. Is there an allocator or scratchpad configuration that enables safe concurrent execution? We already tried dnnl::graph::make_engine_with_allocator(dnnl::engine::kind::cpu, 0, allocator); with callbacks that forward to the system malloc/free, but it had no effect.

```cpp
// Setup
dnnl::graph::graph g(dnnl::engine::kind::cpu);
// ... add matmul operations ...
g.finalize();

auto partitions = g.get_partitions();
auto cp = partitions[0].compile(inputs, outputs, engine);

// Concurrent execution - CRASHES
std::vector<std::thread> threads;
for (int i = 0; i < 4; ++i) {
    threads.emplace_back([&]() {
        for (int iter = 0; iter < 100; ++iter) {
            // Each thread has its own input/output tensors
            std::vector<dnnl::graph::tensor> my_inputs = /* thread-local */;
            std::vector<dnnl::graph::tensor> my_outputs = /* thread-local */;

            cp.execute(stream, my_inputs, my_outputs);  // CRASH here
        }
    });
}
for (auto& t : threads) t.join();
```
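
To make question 2 concrete, the pooling approach could look roughly like the sketch below. It uses only the C++ standard library: `instance_pool`, `fake_partition`, and `run_workers` are hypothetical names for illustration, not part of oneDNN. `fake_partition` stands in for dnnl::graph::compiled_partition so the sketch builds on its own. The sketch assumes each pooled instance is independent of the others. Whether copying a compiled_partition is enough for that, or each instance has to be compiled separately, is not established here.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper (not a oneDNN API): a blocking pool that gives each
// caller exclusive use of one instance for the duration of a callback.
template <typename T>
class instance_pool {
public:
    explicit instance_pool(std::vector<T> instances)
        : free_(std::move(instances)) {
        assert(!free_.empty()); // an empty pool would block forever
    }

    // Runs fn(instance) with an instance no other thread is using.
    template <typename F>
    void run(F &&fn) {
        T obj = take();
        try {
            fn(obj);
        } catch (...) {
            give_back(std::move(obj)); // return the instance even on error
            throw;
        }
        give_back(std::move(obj));
    }

    // Number of instances currently not in use.
    std::size_t idle() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

    // Visits every idle instance while holding the pool lock.
    template <typename F>
    void for_each_idle(F fn) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto &obj : free_) fn(obj);
    }

private:
    T take() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !free_.empty(); });
        T obj = std::move(free_.back());
        free_.pop_back();
        return obj;
    }

    void give_back(T obj) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            free_.push_back(std::move(obj));
        }
        cv_.notify_one();
    }

    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::vector<T> free_;
};

// Stand-in for a compiled partition; execute() is deliberately not
// thread-safe, so it relies on the pool for exclusivity.
struct fake_partition {
    int calls = 0;
    void execute() { ++calls; }
};

// Starts `threads` workers that each perform `iters` executions through the
// pool, then returns the total number of executions the instances recorded.
int run_workers(instance_pool<fake_partition> &pool, int threads, int iters) {
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&pool, iters] {
            for (int i = 0; i < iters; ++i)
                pool.run([](fake_partition &cp) { cp.execute(); });
        });
    }
    for (auto &w : workers) w.join();

    int total = 0;
    pool.for_each_idle([&total](fake_partition &cp) { total += cp.calls; });
    return total;
}
```

In the real code, the callback passed to `run` would call execute(stream, my_inputs, my_outputs) on the leased instance. A pool smaller than the thread count makes threads wait for a free instance, trading some parallelism for fewer compiled instances.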

Thank you in advance.
