[TSan] Fix deadlocks during TSan error reporting on Apple platforms #151495

Open
wants to merge 4 commits into main

Conversation

DanBlackwell
Contributor

Changes:

  • All platforms: move OutputReport (and thus symbolization) after the locked region when reporting.
    • The ScopedReportBase constructor requires the thread_registry lock to be held, so the regions where ScopedReports get created must stay inside the locked region; to do this I have introduced a new unnamed scope (sketched below).
    • This has been done for all functions calling OutputReport, hence the diff here is large; this could be split into multiple PRs.
  • Apple only: Bail early from MemoryAccess if we are symbolizing.

This is a follow-up to #151120, which delayed symbolization of ScopedReports.

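In essence, every patched Report* function now follows the same shape. A minimal sketch distilled from the diff below (ReportSomething and ReportTypeSomething are placeholder names, not real TSan identifiers):

void ReportSomething(ThreadState *thr) {
  // alloca is async-signal-safe; heap allocation here is not (see the
  // review discussion below).
  ScopedReport *rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
  {
    ThreadRegistryLock l(&ctx->thread_registry);
    // The ScopedReportBase constructor requires thread_registry to be
    // locked, so construct the report in place while holding the lock.
    new (rep) ScopedReport(ReportTypeSomething);
    // ... populate *rep (stacks, mutexes, locations) under the lock ...
  }  // Locks released here, before any symbolization.

  // OutputReport triggers symbolization; on Apple platforms, doing this
  // with the locks above still held can deadlock.
  OutputReport(thr, *rep);
  rep->~ScopedReport();  // placement new means manual destruction
}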
@llvmbot
Member

llvmbot commented Jul 31, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Dan Blackwell (DanBlackwell)


Full diff: https://github.com/llvm/llvm-project/pull/151495.diff

7 Files Affected:

  • (modified) compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp (+18-7)
  • (modified) compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp (+19-10)
  • (modified) compiler-rt/lib/tsan/rtl/tsan_mman.cpp (+13-4)
  • (modified) compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp (+5)
  • (modified) compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp (+80-52)
  • (modified) compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp (+60-48)
  • (modified) compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp (+18-7)
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
index 795e05394d71a..fa160126485ba 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
@@ -22,6 +22,7 @@
 #include "sanitizer_common/sanitizer_internal_defs.h"
 #include "sanitizer_common/sanitizer_libc.h"
 #include "sanitizer_common/sanitizer_linux.h"
+#include "sanitizer_common/sanitizer_placement_new.h"
 #include "sanitizer_common/sanitizer_platform_interceptors.h"
 #include "sanitizer_common/sanitizer_platform_limits_netbsd.h"
 #include "sanitizer_common/sanitizer_platform_limits_posix.h"
@@ -2141,13 +2142,23 @@ static void ReportErrnoSpoiling(ThreadState *thr, uptr pc, int sig) {
   // StackTrace::GetNestInstructionPc(pc) is used because return address is
   // expected, OutputReport() will undo this.
   ObtainCurrentStack(thr, StackTrace::GetNextInstructionPc(pc), &stack);
-  ThreadRegistryLock l(&ctx->thread_registry);
-  ScopedReport rep(ReportTypeErrnoInSignal);
-  rep.SetSigNum(sig);
-  if (!IsFiredSuppression(ctx, ReportTypeErrnoInSignal, stack)) {
-    rep.AddStack(stack, true);
-    OutputReport(thr, rep);
-  }
+  ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
+  // Take a new scope as Apple platforms require the below locks released
+  // before symbolizing in order to avoid a deadlock
+  {
+    ThreadRegistryLock l(&ctx->thread_registry);
+    new (_rep) ScopedReport(ReportTypeErrnoInSignal);
+    ScopedReport &rep = *_rep;
+    rep.SetSigNum(sig);
+    if (!IsFiredSuppression(ctx, ReportTypeErrnoInSignal, stack)) {
+      rep.AddStack(stack, true);
+      OutputReport(thr, rep);
+    }
+  }  // Close this scope to release the locks
+
+  OutputReport(thr, *_rep);
+  // Need to manually destroy this because we used placement new to allocate
+  _rep->~ScopedReport();
 }
 
 static void CallUserSignalHandler(ThreadState *thr, bool sync, bool acquire,
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp b/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
index befd6a369026d..fa9899b982d25 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
@@ -437,16 +437,25 @@ void __tsan_mutex_post_divert(void *addr, unsigned flagz) {
 }
 
 static void ReportMutexHeldWrongContext(ThreadState *thr, uptr pc) {
-  ThreadRegistryLock l(&ctx->thread_registry);
-  ScopedReport rep(ReportTypeMutexHeldWrongContext);
-  for (uptr i = 0; i < thr->mset.Size(); ++i) {
-    MutexSet::Desc desc = thr->mset.Get(i);
-    rep.AddMutex(desc.addr, desc.stack_id);
-  }
-  VarSizeStackTrace trace;
-  ObtainCurrentStack(thr, pc, &trace);
-  rep.AddStack(trace, true);
-  OutputReport(thr, rep);
+  ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
+  // Take a new scope as Apple platforms require the below locks released
+  // before symbolizing in order to avoid a deadlock
+  {
+    ThreadRegistryLock l(&ctx->thread_registry);
+    new (_rep) ScopedReport(ReportTypeMutexHeldWrongContext);
+    ScopedReport &rep = *_rep;
+    for (uptr i = 0; i < thr->mset.Size(); ++i) {
+      MutexSet::Desc desc = thr->mset.Get(i);
+      rep.AddMutex(desc.addr, desc.stack_id);
+    }
+    VarSizeStackTrace trace;
+    ObtainCurrentStack(thr, pc, &trace);
+    rep.AddStack(trace, true);
+  }  // Close this scope to release the locks
+
+  OutputReport(thr, *_rep);
+  // Need to manually destroy this because we used placement new to allocate
+  _rep->~ScopedReport();
 }
 
 INTERFACE_ATTRIBUTE
diff --git a/compiler-rt/lib/tsan/rtl/tsan_mman.cpp b/compiler-rt/lib/tsan/rtl/tsan_mman.cpp
index 0ea83fb3b5982..fce1d70de1d6f 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_mman.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_mman.cpp
@@ -182,10 +182,19 @@ static void SignalUnsafeCall(ThreadState *thr, uptr pc) {
   ObtainCurrentStack(thr, pc, &stack);
   if (IsFiredSuppression(ctx, ReportTypeSignalUnsafe, stack))
     return;
-  ThreadRegistryLock l(&ctx->thread_registry);
-  ScopedReport rep(ReportTypeSignalUnsafe);
-  rep.AddStack(stack, true);
-  OutputReport(thr, rep);
+  ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
+  // Take a new scope as Apple platforms require the below locks released
+  // before symbolizing in order to avoid a deadlock
+  {
+    ThreadRegistryLock l(&ctx->thread_registry);
+    new (_rep) ScopedReport(ReportTypeSignalUnsafe);
+    ScopedReport &rep = *_rep;
+    rep.AddStack(stack, true);
+  }  // Close this scope to release the locks
+
+  OutputReport(thr, *_rep);
+  // Need to manually destroy this because we used placement new to allocate
+  _rep->~ScopedReport();
 }
 
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
index 487fa490636eb..02719beac9f07 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
@@ -419,6 +419,11 @@ NOINLINE void TraceRestartMemoryAccess(ThreadState* thr, uptr pc, uptr addr,
 
 ALWAYS_INLINE USED void MemoryAccess(ThreadState* thr, uptr pc, uptr addr,
                                      uptr size, AccessType typ) {
+#if SANITIZER_APPLE && !SANITIZER_GO
+  // Swift symbolizer can be intercepted and deadlock without this
+  if (thr->in_symbolizer)
+    return;
+#endif
   RawShadow* shadow_mem = MemToShadow(addr);
   UNUSED char memBuf[4][64];
   DPrintf2("#%d: Access: %d@%d %p/%zd typ=0x%x {%s, %s, %s, %s}\n", thr->tid,
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp
index 2a2bf42c92ecb..4026f7a8e0c14 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp
@@ -11,14 +11,15 @@
 //===----------------------------------------------------------------------===//
 
 #include <sanitizer_common/sanitizer_deadlock_detector_interface.h>
+#include <sanitizer_common/sanitizer_placement_new.h>
 #include <sanitizer_common/sanitizer_stackdepot.h>
 
-#include "tsan_rtl.h"
 #include "tsan_flags.h"
-#include "tsan_sync.h"
+#include "tsan_platform.h"
 #include "tsan_report.h"
+#include "tsan_rtl.h"
 #include "tsan_symbolize.h"
-#include "tsan_platform.h"
+#include "tsan_sync.h"
 
 namespace __tsan {
 
@@ -55,14 +56,23 @@ static void ReportMutexMisuse(ThreadState *thr, uptr pc, ReportType typ,
     return;
   if (!ShouldReport(thr, typ))
     return;
-  ThreadRegistryLock l(&ctx->thread_registry);
-  ScopedReport rep(typ);
-  rep.AddMutex(addr, creation_stack_id);
-  VarSizeStackTrace trace;
-  ObtainCurrentStack(thr, pc, &trace);
-  rep.AddStack(trace, true);
-  rep.AddLocation(addr, 1);
-  OutputReport(thr, rep);
+  ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
+  // Take a new scope as Apple platforms require the below locks released
+  // before symbolizing in order to avoid a deadlock
+  {
+    ThreadRegistryLock l(&ctx->thread_registry);
+    new (_rep) ScopedReport(typ);
+    ScopedReport &rep = *_rep;
+    rep.AddMutex(addr, creation_stack_id);
+    VarSizeStackTrace trace;
+    ObtainCurrentStack(thr, pc, &trace);
+    rep.AddStack(trace, true);
+    rep.AddLocation(addr, 1);
+  }  // Close this scope to release the locks
+
+  OutputReport(thr, *_rep);
+  // Need to manually destroy this because we used placement new to allocate
+  _rep->~ScopedReport();
 }
 
 static void RecordMutexLock(ThreadState *thr, uptr pc, uptr addr,
@@ -528,53 +538,71 @@ void AfterSleep(ThreadState *thr, uptr pc) {
 void ReportDeadlock(ThreadState *thr, uptr pc, DDReport *r) {
   if (r == 0 || !ShouldReport(thr, ReportTypeDeadlock))
     return;
-  ThreadRegistryLock l(&ctx->thread_registry);
-  ScopedReport rep(ReportTypeDeadlock);
-  for (int i = 0; i < r->n; i++) {
-    rep.AddMutex(r->loop[i].mtx_ctx0, r->loop[i].stk[0]);
-    rep.AddUniqueTid((int)r->loop[i].thr_ctx);
-    rep.AddThread((int)r->loop[i].thr_ctx);
-  }
-  uptr dummy_pc = 0x42;
-  for (int i = 0; i < r->n; i++) {
-    for (int j = 0; j < (flags()->second_deadlock_stack ? 2 : 1); j++) {
-      u32 stk = r->loop[i].stk[j];
-      StackTrace stack;
-      if (stk && stk != kInvalidStackID) {
-        stack = StackDepotGet(stk);
-      } else {
-        // Sometimes we fail to extract the stack trace (FIXME: investigate),
-        // but we should still produce some stack trace in the report.
-        stack = StackTrace(&dummy_pc, 1);
+  ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
+  // Take a new scope as Apple platforms require the below locks released
+  // before symbolizing in order to avoid a deadlock
+  {
+    ThreadRegistryLock l(&ctx->thread_registry);
+    new (_rep) ScopedReport(ReportTypeDeadlock);
+    ScopedReport &rep = *_rep;
+    for (int i = 0; i < r->n; i++) {
+      rep.AddMutex(r->loop[i].mtx_ctx0, r->loop[i].stk[0]);
+      rep.AddUniqueTid((int)r->loop[i].thr_ctx);
+      rep.AddThread((int)r->loop[i].thr_ctx);
+    }
+    uptr dummy_pc = 0x42;
+    for (int i = 0; i < r->n; i++) {
+      for (int j = 0; j < (flags()->second_deadlock_stack ? 2 : 1); j++) {
+        u32 stk = r->loop[i].stk[j];
+        StackTrace stack;
+        if (stk && stk != kInvalidStackID) {
+          stack = StackDepotGet(stk);
+        } else {
+          // Sometimes we fail to extract the stack trace (FIXME: investigate),
+          // but we should still produce some stack trace in the report.
+          stack = StackTrace(&dummy_pc, 1);
+        }
+        rep.AddStack(stack, true);
       }
-      rep.AddStack(stack, true);
     }
-  }
-  OutputReport(thr, rep);
+  }  // Close this scope to release the locks
+
+  OutputReport(thr, *_rep);
+  // Need to manually destroy this because we used placement new to allocate
+  _rep->~ScopedReport();
 }
 
 void ReportDestroyLocked(ThreadState *thr, uptr pc, uptr addr,
                          FastState last_lock, StackID creation_stack_id) {
-  // We need to lock the slot during RestoreStack because it protects
-  // the slot journal.
-  Lock slot_lock(&ctx->slots[static_cast<uptr>(last_lock.sid())].mtx);
-  ThreadRegistryLock l0(&ctx->thread_registry);
-  Lock slots_lock(&ctx->slot_mtx);
-  ScopedReport rep(ReportTypeMutexDestroyLocked);
-  rep.AddMutex(addr, creation_stack_id);
-  VarSizeStackTrace trace;
-  ObtainCurrentStack(thr, pc, &trace);
-  rep.AddStack(trace, true);
-
-  Tid tid;
-  DynamicMutexSet mset;
-  uptr tag;
-  if (!RestoreStack(EventType::kLock, last_lock.sid(), last_lock.epoch(), addr,
-                    0, kAccessWrite, &tid, &trace, mset, &tag))
-    return;
-  rep.AddStack(trace, true);
-  rep.AddLocation(addr, 1);
-  OutputReport(thr, rep);
+  ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
+  // Take a new scope as Apple platforms require the below locks released
+  // before symbolizing in order to avoid a deadlock
+  {
+    // We need to lock the slot during RestoreStack because it protects
+    // the slot journal.
+    Lock slot_lock(&ctx->slots[static_cast<uptr>(last_lock.sid())].mtx);
+    ThreadRegistryLock l0(&ctx->thread_registry);
+    Lock slots_lock(&ctx->slot_mtx);
+    new (_rep) ScopedReport(ReportTypeMutexDestroyLocked);
+    ScopedReport &rep = *_rep;
+    rep.AddMutex(addr, creation_stack_id);
+    VarSizeStackTrace trace;
+    ObtainCurrentStack(thr, pc, &trace);
+    rep.AddStack(trace, true);
+
+    Tid tid;
+    DynamicMutexSet mset;
+    uptr tag;
+    if (!RestoreStack(EventType::kLock, last_lock.sid(), last_lock.epoch(),
+                      addr, 0, kAccessWrite, &tid, &trace, mset, &tag))
+      return;
+    rep.AddStack(trace, true);
+    rep.AddLocation(addr, 1);
+  }  // Close this scope to release the locks
+
+  OutputReport(thr, *_rep);
+  // Need to manually destroy this because we used placement new to allocate
+  _rep->~ScopedReport();
 }
 
 }  // namespace __tsan
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp
index e6f0fda9c72af..c2e9ece3ad09e 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cpp
@@ -16,6 +16,7 @@
 #include "sanitizer_common/sanitizer_placement_new.h"
 #include "sanitizer_common/sanitizer_stackdepot.h"
 #include "sanitizer_common/sanitizer_stacktrace.h"
+#include "tsan_defs.h"
 #include "tsan_fd.h"
 #include "tsan_flags.h"
 #include "tsan_mman.h"
@@ -806,65 +807,76 @@ void ReportRace(ThreadState *thr, RawShadow *shadow_mem, Shadow cur, Shadow old,
   DynamicMutexSet mset1;
   MutexSet *mset[kMop] = {&thr->mset, mset1};
 
-  // We need to lock the slot during RestoreStack because it protects
-  // the slot journal.
-  Lock slot_lock(&ctx->slots[static_cast<uptr>(s[1].sid())].mtx);
-  ThreadRegistryLock l0(&ctx->thread_registry);
-  Lock slots_lock(&ctx->slot_mtx);
-  if (SpuriousRace(old))
-    return;
-  if (!RestoreStack(EventType::kAccessExt, s[1].sid(), s[1].epoch(), addr1,
-                    size1, typ1, &tids[1], &traces[1], mset[1], &tags[1])) {
-    StoreShadow(&ctx->last_spurious_race, old.raw());
-    return;
-  }
+  ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
+  // Take a new scope as Swift symbolizer the below locks released before
+  // symbolizing in order to avoid a deadlock
+  {
+    // We need to lock the slot during RestoreStack because it protects
+    // the slot journal.
+    Lock slot_lock(&ctx->slots[static_cast<uptr>(s[1].sid())].mtx);
+    ThreadRegistryLock l0(&ctx->thread_registry);
+    Lock slots_lock(&ctx->slot_mtx);
+    if (SpuriousRace(old))
+      return;
+    if (!RestoreStack(EventType::kAccessExt, s[1].sid(), s[1].epoch(), addr1,
+                      size1, typ1, &tids[1], &traces[1], mset[1], &tags[1])) {
+      StoreShadow(&ctx->last_spurious_race, old.raw());
+      return;
+    }
 
-  if (IsFiredSuppression(ctx, rep_typ, traces[1]))
-    return;
+    if (IsFiredSuppression(ctx, rep_typ, traces[1]))
+      return;
 
-  if (HandleRacyStacks(thr, traces))
-    return;
+    if (HandleRacyStacks(thr, traces))
+      return;
 
-  // If any of the accesses has a tag, treat this as an "external" race.
-  uptr tag = kExternalTagNone;
-  for (uptr i = 0; i < kMop; i++) {
-    if (tags[i] != kExternalTagNone) {
-      rep_typ = ReportTypeExternalRace;
-      tag = tags[i];
-      break;
+    // If any of the accesses has a tag, treat this as an "external" race.
+    uptr tag = kExternalTagNone;
+    for (uptr i = 0; i < kMop; i++) {
+      if (tags[i] != kExternalTagNone) {
+        rep_typ = ReportTypeExternalRace;
+        tag = tags[i];
+        break;
+      }
     }
-  }
 
-  ScopedReport rep(rep_typ, tag);
-  for (uptr i = 0; i < kMop; i++)
-    rep.AddMemoryAccess(addr, tags[i], s[i], tids[i], traces[i], mset[i]);
+    new (_rep) ScopedReport(rep_typ, tag);
+    ScopedReport &rep = *_rep;
+    for (uptr i = 0; i < kMop; i++)
+      rep.AddMemoryAccess(addr, tags[i], s[i], tids[i], traces[i], mset[i]);
 
-  for (uptr i = 0; i < kMop; i++) {
-    ThreadContext *tctx = static_cast<ThreadContext *>(
-        ctx->thread_registry.GetThreadLocked(tids[i]));
-    rep.AddThread(tctx);
-  }
+    for (uptr i = 0; i < kMop; i++) {
+      ThreadContext *tctx = static_cast<ThreadContext *>(
+          ctx->thread_registry.GetThreadLocked(tids[i]));
+      rep.AddThread(tctx);
+    }
 
-  rep.AddLocation(addr_min, addr_max - addr_min);
-
-  if (flags()->print_full_thread_history) {
-    const ReportDesc *rep_desc = rep.GetReport();
-    for (uptr i = 0; i < rep_desc->threads.Size(); i++) {
-      Tid parent_tid = rep_desc->threads[i]->parent_tid;
-      if (parent_tid == kMainTid || parent_tid == kInvalidTid)
-        continue;
-      ThreadContext *parent_tctx = static_cast<ThreadContext *>(
-          ctx->thread_registry.GetThreadLocked(parent_tid));
-      rep.AddThread(parent_tctx);
+    rep.AddLocation(addr_min, addr_max - addr_min);
+
+    if (flags()->print_full_thread_history) {
+      const ReportDesc *rep_desc = rep.GetReport();
+      for (uptr i = 0; i < rep_desc->threads.Size(); i++) {
+        Tid parent_tid = rep_desc->threads[i]->parent_tid;
+        if (parent_tid == kMainTid || parent_tid == kInvalidTid)
+          continue;
+        ThreadContext *parent_tctx = static_cast<ThreadContext *>(
+            ctx->thread_registry.GetThreadLocked(parent_tid));
+        rep.AddThread(parent_tctx);
+      }
     }
-  }
 
 #if !SANITIZER_GO
-  if (!((typ0 | typ1) & kAccessFree) &&
-      s[1].epoch() <= thr->last_sleep_clock.Get(s[1].sid()))
-    rep.AddSleep(thr->last_sleep_stack_id);
+    if (!((typ0 | typ1) & kAccessFree) &&
+        s[1].epoch() <= thr->last_sleep_clock.Get(s[1].sid()))
+      rep.AddSleep(thr->last_sleep_stack_id);
 #endif
-  OutputReport(thr, rep);
+
+  }  // Close this scope to release the locks
+
+  OutputReport(thr, *_rep);
+
+  // Need to manually destroy this because we used placement new to allocate
+  _rep->~ScopedReport();
 }
 
 void PrintCurrentStack(ThreadState *thr, uptr pc) {
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
index c6a8fd2acb6a8..d346798c7dfdc 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_thread.cpp
@@ -88,15 +88,26 @@ void ThreadFinalize(ThreadState *thr) {
 #if !SANITIZER_GO
   if (!ShouldReport(thr, ReportTypeThreadLeak))
     return;
-  ThreadRegistryLock l(&ctx->thread_registry);
   Vector<ThreadLeak> leaks;
-  ctx->thread_registry.RunCallbackForEachThreadLocked(CollectThreadLeaks,
-                                                      &leaks);
+  {
+    ThreadRegistryLock l(&ctx->thread_registry);
+    ctx->thread_registry.RunCallbackForEachThreadLocked(CollectThreadLeaks,
+                                                        &leaks);
+  }
+
+  ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
   for (uptr i = 0; i < leaks.Size(); i++) {
-    ScopedReport rep(ReportTypeThreadLeak);
-    rep.AddThread(leaks[i].tctx, true);
-    rep.SetCount(leaks[i].count);
-    OutputReport(thr, rep);
+    {
+      ThreadRegistryLock l(&ctx->thread_registry);
+      new (_rep) ScopedReport(ReportTypeThreadLeak);
+      ScopedReport &rep = *_rep;
+      rep.AddThread(leaks[i].tctx, true);
+      rep.SetCount(leaks[i].count);
+    }  // Close this scope to release the locks
+
+    OutputReport(thr, *_rep);
+    // Need to manually destroy this because we used placement new to allocate
+    _rep->~ScopedReport();
   }
 #endif
 }

@DanBlackwell
Contributor Author

DanBlackwell commented Jul 31, 2025

@thurstond here is the follow up to #151120.

I realised that while I saw the deadlock in ReportRace, in theory it could happen in any of the Report* functions, so I have patched them all.

@thetruestblue thetruestblue requested review from yln, wrotki, thurstond, thetruestblue, MaskRay, vitalybuka and dvyukov and removed request for MaskRay July 31, 2025 11:39
@thurstond thurstond left a comment (Contributor)

  • All platforms: move OutputReport (and thus symbolization) after the locked region when reporting.

What is the intuition on why this is safe on all platforms?
(I don't have a counterexample, but it's not immediately obvious to me that it is safe. If this change had been conditionally #if SANITIZER_APPLE, it would be a straightforward app(rov)al.)

} // Close this scope to release the locks

OutputReport(thr, *_rep);
// Need to manually destroy this because we used placement new to allocate
Contributor

Is there a way to avoid manual memory management, perhaps with dynamic allocation and a unique_ptr?
Reporting usually is rare enough that hyper-optimization isn't necessary.

Contributor Author

I think I did it this way originally because I couldn't see unique_ptr being used anywhere else in the sanitizer runtimes (just tests). Is it OK to use std::unique_ptr, or is there some internal equivalent? I agree it would make this much cleaner.

Contributor

I don't know of any reason not to use unique_ptr. If the tests have it, that implies the sanitizers assume a libc++ with unique_ptr support.

Any thoughts/objections from other reviewers?

Contributor Author

// - No system headers included in header files (*).

Based on this, I think we can actually use unique_ptr in the .cpp files.
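For concreteness, the variant under discussion would presumably have looked something like this (my sketch only; as the next comment notes, it was ruled out because it heap-allocates):

std::unique_ptr<ScopedReport> rep;
{
  ThreadRegistryLock l(&ctx->thread_registry);
  // make_unique allocates on the heap, unlike the alloca in the PR.
  rep = std::make_unique<ScopedReport>(ReportTypeErrnoInSignal);
  // ... populate *rep under the lock ...
}
OutputReport(thr, *rep);
// unique_ptr destroys the report automatically; no manual ~ScopedReport().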

Contributor Author

Unfortunately we can't allocate during signal handling (tests signal_malloc.cpp and signal_errno.cpp) as it causes another deadlock. I think this will have to stay as the ugly looking alloca.
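For background, the hazard here is the classic malloc-in-signal-handler deadlock (an illustration, not code from this PR):

void handler(int sig) {
  // If the signal interrupted this thread mid-malloc, the allocator lock
  // is already held by this same thread; the call below never returns.
  void *p = malloc(128);  // not async-signal-safe
}
// __builtin_alloca only adjusts the stack pointer, so the alloca +
// placement-new pattern keeps report construction allocation-free.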

Contributor

Ack

Contributor

Unfortunately we can't allocate during signal handling (tests signal_malloc.cpp and signal_errno.cpp) as it causes another deadlock. I think this will have to stay as the ugly looking alloca.

Please add a comment next to allocas explaining this.

return;
}
ScopedReport *_rep = (ScopedReport *)__builtin_alloca(sizeof(ScopedReport));
// Take a new scope as Swift symbolizer the below locks released before
Contributor

Grammar

@DanBlackwell
Contributor Author

  • All platforms: move OutputReport (and thus symbolization) after the locked region when reporting.

What is the intuition on why this is safe on all platforms? (I don't have a counterexample, but it's not immediately obvious to me that it is safe. If this change had been conditionally #if SANITIZER_APPLE, it would be a straightforward app(rov)al.)

I don't think I know enough about the other platforms to give an opinion - I will alter this PR to guard any behaviour divergence behind #if SANITIZER_APPLE to be safe.

rep.SetSigNum(sig);
if (!IsFiredSuppression(ctx, ReportTypeErrnoInSignal, stack)) {
rep.AddStack(stack, true);
OutputReport(thr, rep);
Member

Should this OutputReport be removed?

Member

Or maybe the OutputReport below needs gating.

Contributor Author

Good spot! I needed to move that outside of the locked region.

@DanBlackwell
Contributor Author

Ok @thurstond, I have guarded this all so that only Apple platforms release locks before outputting. It's quite a big diff, let me know if you think it needs splitting up.

Comment on lines +2156 to +2164
#if SANITIZER_APPLE
} // Close this scope to release the locks before writing report
if (!suppressed)
OutputReport(thr, *rep);
#else
if (!suppressed)
OutputReport(thr, *rep);
}
#endif
Contributor

Nit: there's less duplication with:

#if SANITIZER_APPLE
  }  // Close this scope to release the locks before writing report
#endif

  if (!suppressed)
    OutputReport(thr, *rep);

#if !SANITIZER_APPLE
  }
#endif

OTOH this suggestion is arguably less clear because the code indentation is wrong. WDYT?

