Skip to content

[BOLT] Fix debug line emission for functions in multiple compilation units #151230

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

grigorypas
Copy link
Contributor

This patch fixes a bug in BOLT's debug line emission where functions that belong to multiple compilation units (such as inline functions in header files) were not handled correctly. Previously, BOLT incorrectly assumed that a binary function could belong to only one compilation unit, leading to incomplete or incorrect debug line information.

Problem

When a function appears in multiple compilation units (common scenarios include):

  • Template instantiated functions
  • Inline functions defined in header files included by multiple source files

BOLT would only emit debug line information for one compilation unit, losing debug information for other CUs where the function was compiled. This resulted in incomplete debugging information and could cause debuggers to fail to set breakpoints or show incorrect source locations.

Root Cause

The issue was in BOLT's assumption that each binary function maps to exactly one compilation unit. However, when the same function (e.g., an inline function from a header) is compiled into multiple object files, it legitimately belongs to multiple CUs in the final binary.

@llvmbot
Copy link
Member

llvmbot commented Jul 29, 2025

@llvm/pr-subscribers-bolt

Author: Grigory Pastukhov (grigorypas)

Changes

This patch fixes a bug in BOLT's debug line emission where functions that belong to multiple compilation units (such as inline functions in header files) were not handled correctly. Previously, BOLT incorrectly assumed that a binary function could belong to only one compilation unit, leading to incomplete or incorrect debug line information.

Problem

When a function appears in multiple compilation units (common scenarios include):

  • Template instantiated functions
  • Inline functions defined in header files included by multiple source files

BOLT would only emit debug line information for one compilation unit, losing debug information for other CUs where the function was compiled. This resulted in incomplete debugging information and could cause debuggers to fail to set breakpoints or show incorrect source locations.

Root Cause

The issue was in BOLT's assumption that each binary function maps to exactly one compilation unit. However, when the same function (e.g., an inline function from a header) is compiled into multiple object files, it legitimately belongs to multiple CUs in the final binary.


Patch is 44.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151230.diff

15 Files Affected:

  • (modified) bolt/include/bolt/Core/BinaryContext.h (+6)
  • (modified) bolt/include/bolt/Core/BinaryFunction.h (+18-9)
  • (modified) bolt/include/bolt/Core/DebugData.h (+95-21)
  • (modified) bolt/lib/Core/BinaryContext.cpp (+38-29)
  • (modified) bolt/lib/Core/BinaryEmitter.cpp (+120-84)
  • (modified) bolt/lib/Core/BinaryFunction.cpp (+31-24)
  • (modified) bolt/lib/Core/DebugData.cpp (-2)
  • (added) bolt/test/Inputs/multi-cu-common.h (+10)
  • (added) bolt/test/Inputs/multi-cu-file1.c (+9)
  • (added) bolt/test/Inputs/multi-cu-file2.c (+8)
  • (added) bolt/test/Inputs/process-debug-line.sh (+101)
  • (added) bolt/test/X86/multi-cu-debug-line.test (+108)
  • (modified) bolt/test/perf2bolt/Inputs/perf_test.lds (+5-6)
  • (modified) bolt/unittests/Core/CMakeLists.txt (+1)
  • (added) bolt/unittests/Core/ClusteredRows.cpp (+141)
diff --git a/bolt/include/bolt/Core/BinaryContext.h b/bolt/include/bolt/Core/BinaryContext.h
index 91ecf89da618c..48bc9a5d1f92c 100644
--- a/bolt/include/bolt/Core/BinaryContext.h
+++ b/bolt/include/bolt/Core/BinaryContext.h
@@ -288,6 +288,12 @@ class BinaryContext {
   /// overwritten, but it is okay to re-generate debug info for them.
   std::set<const DWARFUnit *> ProcessedCUs;
 
+  /// DWARF-related container to manage lifecycle of groups of rows from line
+  /// tables associated with instructions. Since binary functions can span
+  /// multiple compilation units, instructions may reference debug line
+  /// information from multiple CUs.
+  ClasteredRowsContainer ClasteredRows;
+
   // Setup MCPlus target builder
   void initializeTarget(std::unique_ptr<MCPlusBuilder> TargetBuilder) {
     MIB = std::move(TargetBuilder);
diff --git a/bolt/include/bolt/Core/BinaryFunction.h b/bolt/include/bolt/Core/BinaryFunction.h
index ae580520b9110..966559e0c6fa6 100644
--- a/bolt/include/bolt/Core/BinaryFunction.h
+++ b/bolt/include/bolt/Core/BinaryFunction.h
@@ -423,8 +423,8 @@ class BinaryFunction {
   /// Original LSDA type encoding
   unsigned LSDATypeEncoding{dwarf::DW_EH_PE_omit};
 
-  /// Containing compilation unit for the function.
-  DWARFUnit *DwarfUnit{nullptr};
+  /// All compilation units this function belongs to.
+  SmallVector<DWARFUnit *, 1> DwarfUnitVec;
 
   /// Last computed hash value. Note that the value could be recomputed using
   /// different parameters by every pass.
@@ -2414,15 +2414,24 @@ class BinaryFunction {
   void
   computeBlockHashes(HashFunction HashFunction = HashFunction::Default) const;
 
-  void setDWARFUnit(DWARFUnit *Unit) { DwarfUnit = Unit; }
+  void addDWARFUnit(DWARFUnit *Unit) { DwarfUnitVec.push_back(Unit); }
 
-  /// Return DWARF compile unit for this function.
-  DWARFUnit *getDWARFUnit() const { return DwarfUnit; }
+  void removeDWARFUnit(DWARFUnit *Unit) {
+    auto *It = std::find(DwarfUnitVec.begin(), DwarfUnitVec.end(), Unit);
+    // If found, erase it
+    if (It != DwarfUnitVec.end()) {
+      DwarfUnitVec.erase(It);
+    }
+  }
+
+  /// Return DWARF compile units for this function.
+  const SmallVector<DWARFUnit *, 1> getDWARFUnits() const {
+    return DwarfUnitVec;
+  }
 
-  /// Return line info table for this function.
-  const DWARFDebugLine::LineTable *getDWARFLineTable() const {
-    return getDWARFUnit() ? BC.DwCtx->getLineTableForUnit(getDWARFUnit())
-                          : nullptr;
+  const DWARFDebugLine::LineTable *
+  getDWARFLineTableForUnit(DWARFUnit *Unit) const {
+    return BC.DwCtx->getLineTableForUnit(Unit);
   }
 
   /// Finalize profile for the function.
diff --git a/bolt/include/bolt/Core/DebugData.h b/bolt/include/bolt/Core/DebugData.h
index 6ea3b1af1024f..048594946d8a9 100644
--- a/bolt/include/bolt/Core/DebugData.h
+++ b/bolt/include/bolt/Core/DebugData.h
@@ -135,8 +135,6 @@ struct DebugLineTableRowRef {
   uint32_t DwCompileUnitIndex;
   uint32_t RowIndex;
 
-  const static DebugLineTableRowRef NULL_ROW;
-
   bool operator==(const DebugLineTableRowRef &Rhs) const {
     return DwCompileUnitIndex == Rhs.DwCompileUnitIndex &&
            RowIndex == Rhs.RowIndex;
@@ -145,24 +143,6 @@ struct DebugLineTableRowRef {
   bool operator!=(const DebugLineTableRowRef &Rhs) const {
     return !(*this == Rhs);
   }
-
-  static DebugLineTableRowRef fromSMLoc(const SMLoc &Loc) {
-    union {
-      decltype(Loc.getPointer()) Ptr;
-      DebugLineTableRowRef Ref;
-    } U;
-    U.Ptr = Loc.getPointer();
-    return U.Ref;
-  }
-
-  SMLoc toSMLoc() const {
-    union {
-      decltype(SMLoc().getPointer()) Ptr;
-      DebugLineTableRowRef Ref;
-    } U;
-    U.Ref = *this;
-    return SMLoc::getFromPointer(U.Ptr);
-  }
 };
 
 /// Common buffer vector used for debug info handling.
@@ -210,7 +190,7 @@ class DebugRangesSectionWriter {
   static bool classof(const DebugRangesSectionWriter *Writer) {
     return Writer->getKind() == RangesWriterKind::DebugRangesWriter;
   }
-  
+
   /// Append a range to the main buffer.
   void appendToRangeBuffer(const DebugBufferVector &CUBuffer);
 
@@ -852,6 +832,100 @@ class DwarfLineTable {
   // Returns DWARF Version for this line table.
   uint16_t getDwarfVersion() const { return DwarfVersion; }
 };
+
+/// ClusteredRows represents a collection of debug line table row references.
+/// Since a Binary function can belong to multiple compilation units (CUs),
+/// a single MCInst can have multiple debug line table rows associated with it
+/// from different CUs. This class manages such clustered row references.
+///
+/// MEMORY LAYOUT AND DESIGN:
+/// This class uses a flexible array member pattern to store all
+/// DebugLineTableRowRef elements in a single contiguous memory allocation.
+/// The memory layout is:
+///
+/// +------------------+
+/// | ClusteredRows    |  <- Object header (Size + first element)
+/// | - Size           |
+/// | - Raws (element) |  <- First DebugLineTableRowRef element
+/// +------------------+
+/// | element[1]       |  <- Additional DebugLineTableRowRef elements
+/// | element[2]       |     stored immediately after the object
+/// | ...              |
+/// | element[Size-1]  |
+/// +------------------+
+///
+/// PERFORMANCE BENEFITS:
+/// - Single memory allocation: All elements are stored in one contiguous block,
+///   eliminating the need for separate heap allocations for the array.
+/// - No extra dereferencing: Elements are accessed directly via pointer
+///   arithmetic (beginPtr() + offset) rather than through an additional
+///   pointer indirection.
+/// - Cache locality: All elements are guaranteed to be adjacent in memory,
+///   improving cache performance during iteration.
+/// - Memory efficiency: No overhead from separate pointer storage or
+///   fragmented allocations.
+///
+/// The 'Raws' member serves as both the first element storage and the base
+/// address for pointer arithmetic to access subsequent elements.
+class ClusteredRows {
+public:
+  ArrayRef<DebugLineTableRowRef> getRows() const {
+    return ArrayRef<DebugLineTableRowRef>(beginPtrConst(), Size);
+  }
+  uint64_t size() const { return Size; }
+  static const ClusteredRows *fromSMLoc(const SMLoc &Loc) {
+    return reinterpret_cast<const ClusteredRows *>(Loc.getPointer());
+  }
+  SMLoc toSMLoc() const {
+    return SMLoc::getFromPointer(reinterpret_cast<const char *>(this));
+  }
+
+  template <typename T> void populate(const T Vec) {
+    assert(Vec.size() == Size && "");
+    DebugLineTableRowRef *CurRawPtr = beginPtr();
+    for (DebugLineTableRowRef RowRef : Vec) {
+      *CurRawPtr = RowRef;
+      ++CurRawPtr;
+    }
+  }
+
+private:
+  uint64_t Size;
+  DebugLineTableRowRef Raws;
+
+  ClusteredRows(uint64_t Size) : Size(Size) {}
+  static uint64_t getTotalSize(uint64_t Size) {
+    assert(Size > 0 && "Size must be greater than 0");
+    return sizeof(ClusteredRows) + (Size - 1) * sizeof(DebugLineTableRowRef);
+  }
+  const DebugLineTableRowRef *beginPtrConst() const {
+    return reinterpret_cast<const DebugLineTableRowRef *>(&Raws);
+  }
+  DebugLineTableRowRef *beginPtr() {
+    return reinterpret_cast<DebugLineTableRowRef *>(&Raws);
+  }
+
+  friend class ClasteredRowsContainer;
+};
+
+/// ClasteredRowsContainer manages the lifecycle of ClusteredRows objects.
+class ClasteredRowsContainer {
+public:
+  ClusteredRows *createClusteredRows(uint64_t Size) {
+    auto *CR = new (std::malloc(ClusteredRows::getTotalSize(Size)))
+        ClusteredRows(Size);
+    Clusters.push_back(CR);
+    return CR;
+  }
+  ~ClasteredRowsContainer() {
+    for (auto *CR : Clusters)
+      std::free(CR);
+  }
+
+private:
+  std::vector<ClusteredRows *> Clusters;
+};
+
 } // namespace bolt
 } // namespace llvm
 
diff --git a/bolt/lib/Core/BinaryContext.cpp b/bolt/lib/Core/BinaryContext.cpp
index 84f1853469709..df151f398bd54 100644
--- a/bolt/lib/Core/BinaryContext.cpp
+++ b/bolt/lib/Core/BinaryContext.cpp
@@ -1568,23 +1568,19 @@ unsigned BinaryContext::addDebugFilenameToUnit(const uint32_t DestCUID,
   DWARFCompileUnit *SrcUnit = DwCtx->getCompileUnitForOffset(SrcCUID);
   const DWARFDebugLine::LineTable *LineTable =
       DwCtx->getLineTableForUnit(SrcUnit);
-  const std::vector<DWARFDebugLine::FileNameEntry> &FileNames =
-      LineTable->Prologue.FileNames;
+  const DWARFDebugLine::FileNameEntry &FileNameEntry =
+      LineTable->Prologue.getFileNameEntry(FileIndex);
   // Dir indexes start at 1, as DWARF file numbers, and a dir index 0
   // means empty dir.
-  assert(FileIndex > 0 && FileIndex <= FileNames.size() &&
-         "FileIndex out of range for the compilation unit.");
   StringRef Dir = "";
-  if (FileNames[FileIndex - 1].DirIdx != 0) {
+  if (FileNameEntry.DirIdx != 0) {
     if (std::optional<const char *> DirName = dwarf::toString(
-            LineTable->Prologue
-                .IncludeDirectories[FileNames[FileIndex - 1].DirIdx - 1])) {
+            LineTable->Prologue.IncludeDirectories[FileNameEntry.DirIdx - 1])) {
       Dir = *DirName;
     }
   }
   StringRef FileName = "";
-  if (std::optional<const char *> FName =
-          dwarf::toString(FileNames[FileIndex - 1].Name))
+  if (std::optional<const char *> FName = dwarf::toString(FileNameEntry.Name))
     FileName = *FName;
   assert(FileName != "");
   DWARFCompileUnit *DstUnit = DwCtx->getCompileUnitForOffset(DestCUID);
@@ -1697,22 +1693,35 @@ void BinaryContext::preprocessDebugInfo() {
 
     auto It = llvm::partition_point(
         AllRanges, [=](CURange R) { return R.HighPC <= FunctionAddress; });
-    if (It != AllRanges.end() && It->LowPC <= FunctionAddress)
-      Function.setDWARFUnit(It->Unit);
+    if (It == AllRanges.end() || It->LowPC > FunctionAddress) {
+      continue;
+    }
+    Function.addDWARFUnit(It->Unit);
+
+    // Go forward and add all units from ranges that cover the function
+    while (++It != AllRanges.end()) {
+      if (It->LowPC <= FunctionAddress && FunctionAddress < It->HighPC) {
+        Function.addDWARFUnit(It->Unit);
+      } else {
+        break;
+      }
+    }
   }
 
   // Discover units with debug info that needs to be updated.
   for (const auto &KV : BinaryFunctions) {
     const BinaryFunction &BF = KV.second;
-    if (shouldEmit(BF) && BF.getDWARFUnit())
-      ProcessedCUs.insert(BF.getDWARFUnit());
+    if (shouldEmit(BF) && !BF.getDWARFUnits().empty())
+      for (const DWARFUnit *Unit : BF.getDWARFUnits())
+        ProcessedCUs.insert(Unit);
   }
-
   // Clear debug info for functions from units that we are not going to process.
   for (auto &KV : BinaryFunctions) {
     BinaryFunction &BF = KV.second;
-    if (BF.getDWARFUnit() && !ProcessedCUs.count(BF.getDWARFUnit()))
-      BF.setDWARFUnit(nullptr);
+    for (auto *Unit : BF.getDWARFUnits()) {
+      if (!ProcessedCUs.count(Unit))
+        BF.removeDWARFUnit(Unit);
+    }
   }
 
   if (opts::Verbosity >= 1) {
@@ -1907,25 +1916,25 @@ bool BinaryContext::isMarker(const SymbolRef &Symbol) const {
 static void printDebugInfo(raw_ostream &OS, const MCInst &Instruction,
                            const BinaryFunction *Function,
                            DWARFContext *DwCtx) {
-  DebugLineTableRowRef RowRef =
-      DebugLineTableRowRef::fromSMLoc(Instruction.getLoc());
-  if (RowRef == DebugLineTableRowRef::NULL_ROW)
+  const ClusteredRows *LineTableRows =
+      ClusteredRows::fromSMLoc(Instruction.getLoc());
+  if (LineTableRows == nullptr)
     return;
 
-  const DWARFDebugLine::LineTable *LineTable;
-  if (Function && Function->getDWARFUnit() &&
-      Function->getDWARFUnit()->getOffset() == RowRef.DwCompileUnitIndex) {
-    LineTable = Function->getDWARFLineTable();
-  } else {
-    LineTable = DwCtx->getLineTableForUnit(
-        DwCtx->getCompileUnitForOffset(RowRef.DwCompileUnitIndex));
-  }
-  assert(LineTable && "line table expected for instruction with debug info");
+  // File name and line number should be the same for all CUs.
+  // So it is sufficient to check the first one.
+  DebugLineTableRowRef RowRef = LineTableRows->getRows().front();
+  const DWARFDebugLine::LineTable *LineTable = DwCtx->getLineTableForUnit(
+      DwCtx->getCompileUnitForOffset(RowRef.DwCompileUnitIndex));
+
+  if (!LineTable)
+    return;
 
   const DWARFDebugLine::Row &Row = LineTable->Rows[RowRef.RowIndex - 1];
   StringRef FileName = "";
+
   if (std::optional<const char *> FName =
-          dwarf::toString(LineTable->Prologue.FileNames[Row.File - 1].Name))
+          dwarf::toString(LineTable->Prologue.getFileNameEntry(Row.File).Name))
     FileName = *FName;
   OS << " # debug line " << FileName << ":" << Row.Line;
   if (Row.Column)
diff --git a/bolt/lib/Core/BinaryEmitter.cpp b/bolt/lib/Core/BinaryEmitter.cpp
index 7b5cd276fee89..8862f0680cb7e 100644
--- a/bolt/lib/Core/BinaryEmitter.cpp
+++ b/bolt/lib/Core/BinaryEmitter.cpp
@@ -177,7 +177,8 @@ class BinaryEmitter {
   /// Note that it does not automatically result in the insertion of the EOS
   /// marker in the line table program, but provides one to the DWARF generator
   /// when it needs it.
-  void emitLineInfoEnd(const BinaryFunction &BF, MCSymbol *FunctionEndSymbol);
+  void emitLineInfoEnd(const BinaryFunction &BF, MCSymbol *FunctionEndSymbol,
+                       DWARFUnit *Unit);
 
   /// Emit debug line info for unprocessed functions from CUs that include
   /// emitted functions.
@@ -436,8 +437,9 @@ bool BinaryEmitter::emitFunction(BinaryFunction &Function,
     Streamer.emitELFSize(StartSymbol, SizeExpr);
   }
 
-  if (opts::UpdateDebugSections && Function.getDWARFUnit())
-    emitLineInfoEnd(Function, EndSymbol);
+  if (opts::UpdateDebugSections && !Function.getDWARFUnits().empty())
+    for (DWARFUnit *Unit : Function.getDWARFUnits())
+      emitLineInfoEnd(Function, EndSymbol, Unit);
 
   // Exception handling info for the function.
   emitLSDA(Function, FF);
@@ -486,7 +488,7 @@ void BinaryEmitter::emitFunctionBody(BinaryFunction &BF, FunctionFragment &FF,
         // A symbol to be emitted before the instruction to mark its ___location.
         MCSymbol *InstrLabel = BC.MIB->getInstLabel(Instr);
 
-        if (opts::UpdateDebugSections && BF.getDWARFUnit()) {
+        if (opts::UpdateDebugSections && !BF.getDWARFUnits().empty()) {
           LastLocSeen = emitLineInfo(BF, Instr.getLoc(), LastLocSeen,
                                      FirstInstr, InstrLabel);
           FirstInstr = false;
@@ -679,74 +681,104 @@ void BinaryEmitter::emitConstantIslands(BinaryFunction &BF, bool EmitColdPart,
 SMLoc BinaryEmitter::emitLineInfo(const BinaryFunction &BF, SMLoc NewLoc,
                                   SMLoc PrevLoc, bool FirstInstr,
                                   MCSymbol *&InstrLabel) {
-  DWARFUnit *FunctionCU = BF.getDWARFUnit();
-  const DWARFDebugLine::LineTable *FunctionLineTable = BF.getDWARFLineTable();
-  assert(FunctionCU && "cannot emit line info for function without CU");
-
-  DebugLineTableRowRef RowReference = DebugLineTableRowRef::fromSMLoc(NewLoc);
-
-  // Check if no new line info needs to be emitted.
-  if (RowReference == DebugLineTableRowRef::NULL_ROW ||
+  if (NewLoc.getPointer() == nullptr ||
       NewLoc.getPointer() == PrevLoc.getPointer())
     return PrevLoc;
+  const ClusteredRows *Cluster = ClusteredRows::fromSMLoc(NewLoc);
+
+  auto addToLineTable = [&](DebugLineTableRowRef RowReference,
+                            const DWARFUnit *TargetCU, unsigned Flags,
+                            MCSymbol *InstrLabel,
+                            const DWARFDebugLine::Row &CurrentRow) {
+    const uint64_t TargetUnitIndex = TargetCU->getOffset();
+    unsigned TargetFilenum = CurrentRow.File;
+    const uint32_t CurrentUnitIndex = RowReference.DwCompileUnitIndex;
+    // If the CU id from the current instruction ___location does not
+    // match the target CU id, it means that we have come across some
+    // inlined code (by BOLT).  We must look up the CU for the instruction's
+    // original function and get the line table from that.
+    if (TargetUnitIndex != CurrentUnitIndex) {
+      // Add filename from the inlined function to the current CU.
+      TargetFilenum = BC.addDebugFilenameToUnit(
+          TargetUnitIndex, CurrentUnitIndex, CurrentRow.File);
+    }
+    BC.Ctx->setCurrentDwarfLoc(TargetFilenum, CurrentRow.Line,
+                               CurrentRow.Column, Flags, CurrentRow.Isa,
+                               CurrentRow.Discriminator);
+    const MCDwarfLoc &DwarfLoc = BC.Ctx->getCurrentDwarfLoc();
+    BC.Ctx->clearDwarfLocSeen();
+    auto &MapLineEntries = BC.getDwarfLineTable(TargetUnitIndex)
+                               .getMCLineSections()
+                               .getMCLineEntries();
+    const auto *It = MapLineEntries.find(Streamer.getCurrentSectionOnly());
+    auto NewLineEntry = MCDwarfLineEntry(InstrLabel, DwarfLoc);
+
+    // Check if line table exists and has entries before doing comparison
+    if (It != MapLineEntries.end() && !It->second.empty()) {
+      // Check if the new line entry has the same debug info as the last one
+      // to avoid duplicates. We don't compare labels since different
+      // instructions can have the same line info.
+      const auto &LastEntry = It->second.back();
+      if (LastEntry.getFileNum() == NewLineEntry.getFileNum() &&
+          LastEntry.getLine() == NewLineEntry.getLine() &&
+          LastEntry.getColumn() == NewLineEntry.getColumn() &&
+          LastEntry.getFlags() == NewLineEntry.getFlags() &&
+          LastEntry.getIsa() == NewLineEntry.getIsa() &&
+          LastEntry.getDiscriminator() == NewLineEntry.getDiscriminator())
+        return;
+    }
 
-  unsigned CurrentFilenum = 0;
-  const DWARFDebugLine::LineTable *CurrentLineTable = FunctionLineTable;
-
-  // If the CU id from the current instruction ___location does not
-  // match the CU id from the current function, it means that we
-  // have come across some inlined code.  We must look up the CU
-  // for the instruction's original function and get the line table
-  // from that.
-  const uint64_t FunctionUnitIndex = FunctionCU->getOffset();
-  const uint32_t CurrentUnitIndex = RowReference.DwCompileUnitIndex;
-  if (CurrentUnitIndex != FunctionUnitIndex) {
-    CurrentLineTable = BC.DwCtx->getLineTableForUnit(
-        BC.DwCtx->getCompileUnitForOffset(CurrentUnitIndex));
-    // Add filename from the inlined function to the current CU.
-    CurrentFilenum = BC.addDebugFilenameToUnit(
-        FunctionUnitIndex, CurrentUnitIndex,
-        CurrentLineTable->Rows[RowReference.RowIndex - 1].File);
-  }
-
-  const DWARFDebugLine::Row &CurrentRow =
-      CurrentLineTable->Rows[RowReference.RowIndex - 1];
-  if (!CurrentFilenum)
-    CurrentFilenum = CurrentRow.File;
-
-  unsigned Flags = (DWARF2_FLAG_IS_STMT * CurrentRow.IsStmt) |
-                   (DWARF2_FLAG_BASIC_BLOCK * CurrentRow.BasicBlock) |
-                   (DWARF2_FLAG_PROLOGUE_END * CurrentRow.PrologueEnd) |
-                   (DWARF2_FLAG_EPILOGUE_BEGIN * CurrentRow.EpilogueBegin);
-
-  // Always emit is_stmt at the beginning of function fragment.
-  if (FirstInstr)
-    Flags |= DWARF2_FLAG_IS_STMT;
-
-  BC.Ctx->setCurrentDwarfLoc(CurrentFilenum, CurrentRow.Line, CurrentRow.Column,
-                             Flags, CurrentRow.Isa, CurrentRow.Discriminator);
-  const MCDwarfLoc &DwarfLoc = BC.Ctx->getCurrentDwarfLoc();
-  BC.Ctx->clearDwarfLocSeen();
+    BC.getDwarfLineTable(TargetUnitIndex)
+        .getMCLineSections()
+        .addLineEntry(NewLineEntry, Streamer.getCurrentSectionOnly());
+  };
 
   if (!InstrLabel)
     InstrLabel = BC.Ctx->createTempSymbol();
-
-  BC.getDwarfLineTable(FunctionUnitIndex)
-      .getMCLineSections()
-      .addLineEntry(MCDwarfLineEntry(InstrLabel, DwarfLoc),
-                    Streamer.getCurrentSectionOnly());
+  for (DebugLineTableRowRef RowReference : Cluster->getRows()) {
+    const DWARFDebugLine::LineTable *CurrentLineTable =
+        BC.DwCtx->getLineTableForUnit(
+            BC.DwCtx->getCompileUnitForOffset(RowReference.DwCompileUnitIndex));
+    const DWARFDebugLine::Row &CurrentRow =
+        CurrentLineTable->Rows[RowReference.RowIndex - 1];
+    unsigned Flags = (DWARF2_FLAG_IS_STMT * CurrentRow.IsStmt) |
+                     (DWARF2_FLAG_BASIC_BLOCK * CurrentRow.BasicBlock) |
+                     (DWARF2_FLAG_PROLOGUE_END * CurrentRow.PrologueEnd) |
+                     (DWARF2_FLAG_EPILOGUE_BEGIN * CurrentRow.EpilogueBegin);
+
+  ...
[truncated]

Copy link

github-actions bot commented Jul 29, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions cpp,c,h -- bolt/test/Inputs/multi-cu-common.h bolt/test/Inputs/multi-cu-file1.c bolt/test/Inputs/multi-cu-file2.c bolt/unittests/Core/ClusteredRows.cpp bolt/include/bolt/Core/BinaryContext.h bolt/include/bolt/Core/BinaryFunction.h bolt/include/bolt/Core/DebugData.h bolt/lib/Core/BinaryContext.cpp bolt/lib/Core/BinaryEmitter.cpp bolt/lib/Core/BinaryFunction.cpp bolt/lib/Core/DebugData.cpp
View the diff from clang-format here.
diff --git a/bolt/include/bolt/Core/BinaryFunction.h b/bolt/include/bolt/Core/BinaryFunction.h
index ec56ff3e3..91a7c8340 100644
--- a/bolt/include/bolt/Core/BinaryFunction.h
+++ b/bolt/include/bolt/Core/BinaryFunction.h
@@ -2425,7 +2425,7 @@ public:
   }
 
   /// Return DWARF compile units for this function.
-  const SmallVector<DWARFUnit *, 1>& getDWARFUnits() const {
+  const SmallVector<DWARFUnit *, 1> &getDWARFUnits() const {
     return DwarfUnitVec;
   }
 
diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 8635cba00..8993c031d 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -1488,25 +1488,25 @@ Error BinaryFunction::disassemble() {
     }
 
 add_instruction:
-    if (!getDWARFUnits().empty()) {
-      SmallVector<DebugLineTableRowRef, 1> Rows;
-      for (DWARFUnit *Unit : getDWARFUnits()) {
-        const DWARFDebugLine::LineTable *LineTable =
-            getDWARFLineTableForUnit(Unit);
-        if (!LineTable)
-          continue;
-        if (std::optional<DebugLineTableRowRef> RowRef =
-                findDebugLineInformationForInstructionAt(AbsoluteInstrAddr,
-                                                         Unit, LineTable))
-          Rows.emplace_back(*RowRef);
-      }
-      if (!Rows.empty()) {
-        ClusteredRows *Cluster =
-            BC.ClusteredRows.createClusteredRows(Rows.size());
-        Cluster->populate(Rows);
-        Instruction.setLoc(Cluster->toSMLoc());
-      }
+  if (!getDWARFUnits().empty()) {
+    SmallVector<DebugLineTableRowRef, 1> Rows;
+    for (DWARFUnit *Unit : getDWARFUnits()) {
+      const DWARFDebugLine::LineTable *LineTable =
+          getDWARFLineTableForUnit(Unit);
+      if (!LineTable)
+        continue;
+      if (std::optional<DebugLineTableRowRef> RowRef =
+              findDebugLineInformationForInstructionAt(AbsoluteInstrAddr, Unit,
+                                                       LineTable))
+        Rows.emplace_back(*RowRef);
+    }
+    if (!Rows.empty()) {
+      ClusteredRows *Cluster =
+          BC.ClusteredRows.createClusteredRows(Rows.size());
+      Cluster->populate(Rows);
+      Instruction.setLoc(Cluster->toSMLoc());
     }
+  }
 
     // Record offset of the instruction for profile matching.
     if (BC.keepOffsetForInstruction(Instruction))

Copy link
Member

@paschalis-mpeis paschalis-mpeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patch @grigorypas.

I've only had a quick look, so this isn't a full review.

I've added a few commetns to help reshape this big work, and hopefully get it closer to being merged.

It looks like parts of 74747d1 could be in a standalone PR?
A few more comments below.

Comment on lines 857 to 866
/// PERFORMANCE BENEFITS:
/// - Single memory allocation: All elements are stored in one contiguous block,
/// eliminating the need for separate heap allocations for the array.
/// - No extra dereferencing: Elements are accessed directly via pointer
/// arithmetic (beginPtr() + offset) rather than through an additional
/// pointer indirection.
/// - Cache locality: All elements are guaranteed to be adjacent in memory,
/// improving cache performance during iteration.
/// - Memory efficiency: No overhead from separate pointer storage or
/// fragmented allocations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could keep maybe in the commit message and add a short 1-2 lines summary here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reduced the size of the comment. I kept the part that shows memory layout. I believe, it is helpful to understand how the class is supposed to work.

Comment on lines +1571 to +1572
const DWARFDebugLine::FileNameEntry &FileNameEntry =
LineTable->Prologue.getFileNameEntry(FileIndex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some of the refactoring changes could be separate PRs that come before your big change?
Like this example here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created refactoring PR: #151401

@@ -0,0 +1,105 @@
#!/bin/sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could place this script in bolt/test directory (parent one), and link it in lit.cfg.py as a tool, similarly with how link_fdata is done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -0,0 +1,108 @@
## Test that BOLT correctly handles debug line information for functions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a quick glance, this test is not tied to X86, right?
Can you make it also run on AArch64?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the test to the parent directory. It should be architecture independent now.


// Go forward and add all units from ranges that cover the function
while (++It != AllRanges.end()) {
if (It->LowPC <= FunctionAddress && FunctionAddress < It->HighPC) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplify logic

/// Containing compilation unit for the function.
DWARFUnit *DwarfUnit{nullptr};
/// All compilation units this function belongs to.
SmallVector<DWARFUnit *, 1> DwarfUnitVec;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DenseSet, if they don't need to be ordered?

const ClusteredRows *Cluster = ClusteredRows::fromSMLoc(NewLoc);

auto addToLineTable = [&](DebugLineTableRowRef RowReference,
const DWARFUnit *TargetCU, unsigned Flags,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pass by reference if it can't be null, otherwise check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants