Commit f890591

[lldb][Expression] Encode Module and DIE UIDs into function AsmLabels (#148877)
LLDB currently attaches `AsmLabel`s to `FunctionDecl`s so that the `IRExecutionUnit` can determine which mangled name to call. (We can't rely on Clang deriving the correct mangled name because the debug-info AST doesn't contain all the information that is encoded in the DWARF linkage names.) However, we don't attach `AsmLabel`s to structors because they have multiple variants, so it's not clear which mangled name to use.

In the [RFC on fixing expression evaluation of abi-tagged structors](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816) we discussed encoding the structor variant into the `AsmLabel`s. Specifically, in [this thread](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7) we discussed that the contents of the `AsmLabel` are completely under LLDB's control, and that we could use it to uniquely identify a function by encoding the exact module and DIE the function is associated with (mangled names alone may not be enough, since two identical mangled symbols may live in different modules). Once we have a custom `AsmLabel` format, we can encode the structor variant in a follow-up (the current idea is to append the structor variant as a suffix to our custom `AsmLabel` when Clang emits the mangled name into the JITted IR). Then we would just have to teach the `IRExecutionUnit` to pick the correct structor variant DIE during symbol resolution. The draft of this is available [here](#149827).

This patch sets up the infrastructure for the custom `AsmLabel` format by encoding the module ID, DIE ID and mangled name in it.

**Implementation**

The flow is as follows:
1. Create the label in `DWARFASTParserClang`. The format is: `$__lldb_func:module_id:die_id:mangled_name`
2. When resolving external symbols in `IRExecutionUnit`, we parse this label and then do a lookup by DIE ID (or by mangled name into the module if the encoded DIE is a declaration).

Depends on #151355
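To make the label format concrete, here is a minimal sketch of encoding and decoding `$__lldb_func:module_id:die_id:mangled_name` using only the C++ standard library. The names `LabelSketch`, `EncodeLabel`, `DecodeLabel` and `ParseId` are hypothetical stand-ins; the patch itself implements this as `FunctionCallLabel::toString`/`fromString` on top of `llvm::formatv`, `llvm::StringRef::split` and `llvm::to_integer`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical stand-in for lldb_private::FunctionCallLabel.
struct LabelSketch {
  std::uint64_t module_id;
  std::uint64_t die_id;
  std::string mangled_name;
};

const std::string kPrefix = "$__lldb_func";

// Encodes the IDs as 0x-prefixed hex, e.g. "$__lldb_func:0x1:0x2a:_Z3foov".
std::string EncodeLabel(const LabelSketch &label) {
  std::ostringstream os;
  os << kPrefix << ":0x" << std::hex << label.module_id << ":0x"
     << label.die_id << ':' << label.mangled_name;
  return os.str();
}

// Parses an unsigned integer, auto-detecting the base (so "0x2a" works).
std::optional<std::uint64_t> ParseId(const std::string &text) {
  if (text.empty())
    return std::nullopt;
  try {
    std::size_t consumed = 0;
    unsigned long long value = std::stoull(text, &consumed, 0);
    if (consumed != text.size())
      return std::nullopt;
    return value;
  } catch (const std::exception &) {
    return std::nullopt;
  }
}

// Splits on the first three ':' only. The name comes last, so any ':'
// inside it is preserved rather than treated as a separator.
std::optional<LabelSketch> DecodeLabel(const std::string &label) {
  std::string parts[4];
  std::size_t begin = 0;
  for (int i = 0; i < 3; ++i) {
    std::size_t colon = label.find(':', begin);
    if (colon == std::string::npos)
      return std::nullopt; // fewer than four components
    parts[i] = label.substr(begin, colon - begin);
    begin = colon + 1;
  }
  parts[3] = label.substr(begin);

  if (parts[0] != kPrefix)
    return std::nullopt;
  std::optional<std::uint64_t> module_id = ParseId(parts[1]);
  std::optional<std::uint64_t> die_id = ParseId(parts[2]);
  if (!module_id || !die_id)
    return std::nullopt;
  return LabelSketch{*module_id, *die_id, parts[3]};
}
```

Keeping the name as the last component is what allows the decoder to stop splitting after three separators, which mirrors the `MaxSplit=3` argument used in the patch.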
1 parent e20413c commit f890591

File tree

23 files changed: +639 -6 lines changed


lldb/include/lldb/Core/Module.h

Lines changed: 2 additions & 1 deletion
@@ -86,7 +86,8 @@ struct ModuleFunctionSearchOptions {
 ///
 /// The module will parse more detailed information as more queries are made.
 class Module : public std::enable_shared_from_this<Module>,
-               public SymbolContextScope {
+               public SymbolContextScope,
+               public UserID {
 public:
   class LookupInfo;
   // Static functions that can track the lifetime of module objects. This is

lldb/include/lldb/Core/ModuleList.h

Lines changed: 8 additions & 0 deletions
@@ -352,6 +352,14 @@ class ModuleList {
   // UUID values is very efficient and accurate.
   lldb::ModuleSP FindModule(const UUID &uuid) const;

+  /// Find a module by LLDB-specific unique identifier.
+  ///
+  /// \param[in] uid The UID of the module assigned to it on construction.
+  ///
+  /// \returns ModuleSP of module with \c uid. Returns nullptr if no such
+  /// module could be found.
+  lldb::ModuleSP FindModule(lldb::user_id_t uid) const;
+
   /// Finds the first module whose file specification matches \a module_spec.
   lldb::ModuleSP FindFirstModule(const ModuleSpec &module_spec) const;

lldb/include/lldb/Expression/Expression.h

Lines changed: 57 additions & 0 deletions
@@ -13,6 +13,7 @@
 #include <string>
 #include <vector>

+#include "llvm/Support/FormatProviders.h"

 #include "lldb/Expression/ExpressionTypeSystemHelper.h"
 #include "lldb/lldb-forward.h"
@@ -96,6 +97,62 @@ class Expression {
                                ///invalid.
 };

+/// Holds parsed information about a function call label that
+/// LLDB attaches as an AsmLabel to function AST nodes it parses
+/// from debug-info.
+///
+/// The format being:
+///
+///   <prefix>:<module uid>:<symbol uid>:<name>
+///
+/// The label string needs to stay valid for the entire lifetime
+/// of this object.
+struct FunctionCallLabel {
+  /// Unique identifier of the lldb_private::Module
+  /// which contains the symbol identified by \c symbol_id.
+  lldb::user_id_t module_id;
+
+  /// Unique identifier of the function symbol on which to
+  /// perform the function call. For example, for DWARF this would
+  /// be the DIE UID.
+  lldb::user_id_t symbol_id;
+
+  /// Name to use when searching for the function symbol in
+  /// \c module_id. For most function calls this will be a
+  /// mangled name. In cases where a mangled name can't be used,
+  /// this will be the function name.
+  ///
+  /// NOTE: kept as last element so we don't have to worry about
+  /// ':' in the mangled name when parsing the label.
+  llvm::StringRef lookup_name;
+
+  /// Decodes the specified function \c label into a \c FunctionCallLabel.
+  static llvm::Expected<FunctionCallLabel> fromString(llvm::StringRef label);
+
+  /// Encode this FunctionCallLabel into its string representation.
+  ///
+  /// The representation roundtrips through \c fromString:
+  /// \code{.cpp}
+  /// llvm::StringRef encoded = "$__lldb_func:0x0:0x0:_Z3foov";
+  /// FunctionCallLabel label = *fromString(label);
+  ///
+  /// assert (label.toString() == encoded);
+  /// assert (*fromString(label.toString()) == label);
+  /// \endcode
+  std::string toString() const;
+};
+
+/// LLDB attaches this prefix to mangled names of functions that get called
+/// from JITted expressions.
+inline constexpr llvm::StringRef FunctionCallLabelPrefix = "$__lldb_func";
+
 } // namespace lldb_private

+namespace llvm {
+template <> struct format_provider<lldb_private::FunctionCallLabel> {
+  static void format(const lldb_private::FunctionCallLabel &label,
+                     raw_ostream &OS, StringRef Style);
+};
+} // namespace llvm
+
 #endif // LLDB_EXPRESSION_EXPRESSION_H

lldb/include/lldb/Symbol/SymbolFile.h

Lines changed: 13 additions & 0 deletions
@@ -18,6 +18,7 @@
 #include "lldb/Symbol/CompilerType.h"
 #include "lldb/Symbol/Function.h"
 #include "lldb/Symbol/SourceModule.h"
+#include "lldb/Symbol/SymbolContext.h"
 #include "lldb/Symbol/Type.h"
 #include "lldb/Symbol/TypeList.h"
 #include "lldb/Symbol/TypeSystem.h"
@@ -328,6 +329,18 @@ class SymbolFile : public PluginInterface {
   GetMangledNamesForFunction(const std::string &scope_qualified_name,
                              std::vector<ConstString> &mangled_names);

+  /// Resolves the function corresponding to the specified LLDB function
+  /// call \c label.
+  ///
+  /// \param[in] label The FunctionCallLabel to be resolved.
+  ///
+  /// \returns An llvm::Error if the specified \c label couldn't be resolved.
+  ///          Returns the resolved function (as a SymbolContext) otherwise.
+  virtual llvm::Expected<SymbolContext>
+  ResolveFunctionCallLabel(const FunctionCallLabel &label) {
+    return llvm::createStringError("Not implemented");
+  }
+
   virtual void GetTypes(lldb_private::SymbolContextScope *sc_scope,
                         lldb::TypeClass type_mask,
                         lldb_private::TypeList &type_list) = 0;

lldb/source/Core/Module.cpp

Lines changed: 6 additions & 3 deletions
@@ -130,8 +130,10 @@ Module *Module::GetAllocatedModuleAtIndex(size_t idx) {
   return nullptr;
 }

+static std::atomic<lldb::user_id_t> g_unique_id = 1;
+
 Module::Module(const ModuleSpec &module_spec)
-    : m_unwind_table(*this), m_file_has_changed(false),
+    : UserID(g_unique_id++), m_unwind_table(*this), m_file_has_changed(false),
       m_first_file_changed_log(false) {
   // Scope for locker below...
   {
@@ -236,7 +238,8 @@ Module::Module(const ModuleSpec &module_spec)
 Module::Module(const FileSpec &file_spec, const ArchSpec &arch,
                ConstString object_name, lldb::offset_t object_offset,
                const llvm::sys::TimePoint<> &object_mod_time)
-    : m_mod_time(FileSystem::Instance().GetModificationTime(file_spec)),
+    : UserID(g_unique_id++),
+      m_mod_time(FileSystem::Instance().GetModificationTime(file_spec)),
       m_arch(arch), m_file(file_spec), m_object_name(object_name),
       m_object_offset(object_offset), m_object_mod_time(object_mod_time),
       m_unwind_table(*this), m_file_has_changed(false),
@@ -257,7 +260,7 @@ Module::Module(const FileSpec &file_spec, const ArchSpec &arch,
 }

 Module::Module()
-    : m_unwind_table(*this), m_file_has_changed(false),
+    : UserID(g_unique_id++), m_unwind_table(*this), m_file_has_changed(false),
       m_first_file_changed_log(false) {
   std::lock_guard<std::recursive_mutex> guard(
       GetAllocationModuleCollectionMutex());
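The ID assignment in the constructors above can be illustrated in isolation. This is a simplified sketch with hypothetical class names (`UserIDSketch`, `ModuleSketch`), assuming only that each object should receive a distinct, non-zero ID from a process-wide atomic counter, as `g_unique_id` does in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for lldb_private::UserID: stores an immutable ID.
class UserIDSketch {
public:
  explicit UserIDSketch(std::uint64_t id) : m_id(id) {}
  std::uint64_t GetID() const { return m_id; }

private:
  std::uint64_t m_id;
};

// Process-wide counter. Starting at 1 keeps 0 free to mean "no module".
static std::atomic<std::uint64_t> g_next_id{1};

// Every constructor takes the next value. The post-increment on
// std::atomic is a single atomic read-modify-write, so two objects
// constructed concurrently still get different IDs.
class ModuleSketch : public UserIDSketch {
public:
  ModuleSketch() : UserIDSketch(g_next_id++) {}
};
```

Because the IDs come from a counter rather than from file contents, they are only meaningful within one debugger process, which is sufficient for a label that is created and resolved during the same session.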

lldb/source/Core/ModuleList.cpp

Lines changed: 14 additions & 0 deletions
@@ -584,6 +584,20 @@ ModuleSP ModuleList::FindModule(const UUID &uuid) const {
   return module_sp;
 }

+ModuleSP ModuleList::FindModule(lldb::user_id_t uid) const {
+  ModuleSP module_sp;
+  ForEach([&](const ModuleSP &m) {
+    if (m->GetID() == uid) {
+      module_sp = m;
+      return IterationAction::Stop;
+    }
+
+    return IterationAction::Continue;
+  });
+
+  return module_sp;
+}
+
 void ModuleList::FindTypes(Module *search_first, const TypeQuery &query,
                            TypeResults &results) const {
   std::lock_guard<std::recursive_mutex> guard(m_modules_mutex);
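The new `FindModule(lldb::user_id_t)` overload is a linear scan that stops at the first match. A rough standalone equivalent, using a plain `std::vector` and the hypothetical names `ModuleSketch` and `FindModuleByID` in place of `ModuleList::ForEach`, might look like this:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for lldb_private::Module; only the ID matters here.
struct ModuleSketch {
  std::uint64_t id;
};
using ModuleSketchSP = std::shared_ptr<ModuleSketch>;

// Returns the first module whose ID equals uid, or nullptr if none matches.
ModuleSketchSP FindModuleByID(const std::vector<ModuleSketchSP> &modules,
                              std::uint64_t uid) {
  for (const ModuleSketchSP &module : modules)
    if (module && module->id == uid)
      return module; // stop at the first match
  return nullptr;
}
```

The sketch omits the locking that the real `ModuleList` performs around iteration.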

lldb/source/Expression/Expression.cpp

Lines changed: 49 additions & 0 deletions
@@ -10,6 +10,11 @@
 #include "lldb/Target/ExecutionContextScope.h"
 #include "lldb/Target/Target.h"

+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Error.h"
+
 using namespace lldb_private;

 Expression::Expression(Target &target)
@@ -26,3 +31,47 @@ Expression::Expression(ExecutionContextScope &exe_scope)
       m_jit_end_addr(LLDB_INVALID_ADDRESS) {
   assert(m_target_wp.lock());
 }
+
+llvm::Expected<FunctionCallLabel>
+lldb_private::FunctionCallLabel::fromString(llvm::StringRef label) {
+  llvm::SmallVector<llvm::StringRef, 4> components;
+  label.split(components, ":", /*MaxSplit=*/3);
+
+  if (components.size() != 4)
+    return llvm::createStringError("malformed function call label.");
+
+  if (components[0] != FunctionCallLabelPrefix)
+    return llvm::createStringError(llvm::formatv(
+        "expected function call label prefix '{0}' but found '{1}' instead.",
+        FunctionCallLabelPrefix, components[0]));
+
+  llvm::StringRef module_label = components[1];
+  llvm::StringRef die_label = components[2];
+
+  lldb::user_id_t module_id = 0;
+  if (!llvm::to_integer(module_label, module_id))
+    return llvm::createStringError(
+        llvm::formatv("failed to parse module ID from '{0}'.", module_label));
+
+  lldb::user_id_t die_id;
+  if (!llvm::to_integer(die_label, die_id))
+    return llvm::createStringError(
+        llvm::formatv("failed to parse symbol ID from '{0}'.", die_label));
+
+  return FunctionCallLabel{/*.module_id=*/module_id,
+                           /*.symbol_id=*/die_id,
+                           /*.lookup_name=*/components[3]};
+}
+
+std::string lldb_private::FunctionCallLabel::toString() const {
+  return llvm::formatv("{0}:{1:x}:{2:x}:{3}", FunctionCallLabelPrefix,
+                       module_id, symbol_id, lookup_name)
+      .str();
+}
+
+void llvm::format_provider<FunctionCallLabel>::format(
+    const FunctionCallLabel &label, raw_ostream &OS, StringRef Style) {
+  OS << llvm::formatv("FunctionCallLabel{ module_id: {0:x}, symbol_id: {1:x}, "
+                      "lookup_name: {2} }",
+                      label.module_id, label.symbol_id, label.lookup_name);
+}

lldb/source/Expression/IRExecutionUnit.cpp

Lines changed: 65 additions & 0 deletions
@@ -13,13 +13,15 @@
 #include "llvm/IR/DiagnosticInfo.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/Module.h"
+#include "llvm/Support/Error.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/raw_ostream.h"

 #include "lldb/Core/Debugger.h"
 #include "lldb/Core/Disassembler.h"
 #include "lldb/Core/Module.h"
 #include "lldb/Core/Section.h"
+#include "lldb/Expression/Expression.h"
 #include "lldb/Expression/IRExecutionUnit.h"
 #include "lldb/Expression/ObjectFileJIT.h"
 #include "lldb/Host/HostInfo.h"
@@ -36,6 +38,7 @@
 #include "lldb/Utility/LLDBAssert.h"
 #include "lldb/Utility/LLDBLog.h"
 #include "lldb/Utility/Log.h"
+#include "lldb/lldb-defines.h"

 #include <optional>

@@ -771,6 +774,40 @@ class LoadAddressResolver {
   lldb::addr_t m_best_internal_load_address = LLDB_INVALID_ADDRESS;
 };

+/// Returns address of the function referred to by the special function call
+/// label \c label.
+static llvm::Expected<lldb::addr_t>
+ResolveFunctionCallLabel(const FunctionCallLabel &label,
+                         const lldb_private::SymbolContext &sc,
+                         bool &symbol_was_missing_weak) {
+  symbol_was_missing_weak = false;
+
+  if (!sc.target_sp)
+    return llvm::createStringError("target not available.");
+
+  auto module_sp = sc.target_sp->GetImages().FindModule(label.module_id);
+  if (!module_sp)
+    return llvm::createStringError(
+        llvm::formatv("failed to find module by UID {0}", label.module_id));
+
+  auto *symbol_file = module_sp->GetSymbolFile();
+  if (!symbol_file)
+    return llvm::createStringError(
+        llvm::formatv("no SymbolFile found on module {0:x}.", module_sp.get()));
+
+  auto sc_or_err = symbol_file->ResolveFunctionCallLabel(label);
+  if (!sc_or_err)
+    return llvm::joinErrors(
+        llvm::createStringError("failed to resolve function by UID"),
+        sc_or_err.takeError());
+
+  SymbolContextList sc_list;
+  sc_list.Append(*sc_or_err);
+
+  LoadAddressResolver resolver(*sc.target_sp, symbol_was_missing_weak);
+  return resolver.Resolve(sc_list).value_or(LLDB_INVALID_ADDRESS);
+}
+
 lldb::addr_t
 IRExecutionUnit::FindInSymbols(const std::vector<ConstString> &names,
                                const lldb_private::SymbolContext &sc,
@@ -906,6 +943,34 @@ lldb::addr_t IRExecutionUnit::FindInUserDefinedSymbols(

 lldb::addr_t IRExecutionUnit::FindSymbol(lldb_private::ConstString name,
                                          bool &missing_weak) {
+  if (name.GetStringRef().starts_with(FunctionCallLabelPrefix)) {
+    auto label_or_err = FunctionCallLabel::fromString(name);
+    if (!label_or_err) {
+      LLDB_LOG_ERROR(GetLog(LLDBLog::Expressions), label_or_err.takeError(),
+                     "failed to create FunctionCallLabel from '{1}': {0}",
+                     name.GetStringRef());
+      return LLDB_INVALID_ADDRESS;
+    }
+
+    if (auto addr_or_err =
+            ResolveFunctionCallLabel(*label_or_err, m_sym_ctx, missing_weak)) {
+      return *addr_or_err;
+    } else {
+      LLDB_LOG_ERROR(GetLog(LLDBLog::Expressions), addr_or_err.takeError(),
+                     "Failed to resolve function call label '{1}': {0}",
+                     name.GetStringRef());
+
+      // Fall back to lookup by name despite error in resolving the label.
+      // May happen in practice if the definition of a function lives in
+      // a different lldb_private::Module than it's declaration. Meaning
+      // we couldn't pin-point it using the information encoded in the label.
+      name.SetString(label_or_err->lookup_name);
+    }
+  }
+
+  // TODO: now with function call labels, do we still need to
+  // generate alternate manglings?
+
   std::vector<ConstString> candidate_C_names;
   std::vector<ConstString> candidate_CPlusPlus_names;
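The control flow added to `FindSymbol` has three outcomes: a malformed label aborts the lookup, a resolvable label returns its address directly, and a label that parses but cannot be resolved falls back to the ordinary name-based search using the embedded name. The sketch below models just that decision logic. `FindSymbolSketch`, `ParsedLabel`, `kInvalidAddress` and the three callbacks are hypothetical placeholders for the LLDB machinery, and error logging is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>

// Stand-in for LLDB_INVALID_ADDRESS.
constexpr std::uint64_t kInvalidAddress = UINT64_MAX;

// Hypothetical parsed form of "$__lldb_func:<module>:<symbol>:<name>".
struct ParsedLabel {
  std::uint64_t module_id;
  std::uint64_t symbol_id;
  std::string lookup_name;
};

using LabelParser =
    std::function<std::optional<ParsedLabel>(const std::string &)>;
using LabelResolver =
    std::function<std::optional<std::uint64_t>(const ParsedLabel &)>;
using NameLookup = std::function<std::uint64_t(const std::string &)>;

std::uint64_t FindSymbolSketch(std::string name, const LabelParser &parse,
                               const LabelResolver &resolve,
                               const NameLookup &lookup_by_name) {
  const std::string prefix = "$__lldb_func";
  if (name.compare(0, prefix.size(), prefix) == 0) {
    std::optional<ParsedLabel> label = parse(name);
    if (!label)
      return kInvalidAddress; // malformed label: give up immediately

    if (std::optional<std::uint64_t> addr = resolve(*label))
      return *addr; // pinpointed via module ID and symbol ID

    // Resolution failed (for example, the definition lives in another
    // module): continue with the plain name carried inside the label.
    name = label->lookup_name;
  }
  return lookup_by_name(name);
}
```

Names that do not start with the prefix skip the label path entirely, so existing lookups behave as before.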

lldb/source/Expression/IRInterpreter.cpp

Lines changed: 3 additions & 1 deletion
@@ -259,7 +259,9 @@ class InterpreterStackFrame {
       break;
     case Value::FunctionVal:
       if (const Function *constant_func = dyn_cast<Function>(constant)) {
-        lldb_private::ConstString name(constant_func->getName());
+        lldb_private::ConstString name(
+            llvm::GlobalValue::dropLLVMManglingEscape(
+                constant_func->getName()));
         bool missing_weak = false;
         lldb::addr_t addr = m_execution_unit.FindSymbol(name, missing_weak);
         if (addr == LLDB_INVALID_ADDRESS)
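The interpreter change matters because IR names that come from an asm label typically carry a leading `\1` byte, which tells LLVM not to mangle the name any further. Without stripping it, the `starts_with(FunctionCallLabelPrefix)` check in `FindSymbol` would presumably not match. The following sketch shows the effect of that stripping with a hypothetical helper, `DropManglingEscape`, written against `std::string_view`; it is not the LLVM implementation.

```cpp
#include <cassert>
#include <string_view>

// Removes a single leading '\1' escape byte if present; otherwise returns
// the name unchanged. Hypothetical equivalent of the LLVM helper's effect.
std::string_view DropManglingEscape(std::string_view name) {
  if (!name.empty() && name.front() == '\1')
    return name.substr(1);
  return name;
}
```

The helper returns a view into the original buffer, so the caller must keep the underlying string alive while using the result.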

lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp

Lines changed: 35 additions & 1 deletion
@@ -24,6 +24,7 @@
 #include "Plugins/Language/ObjC/ObjCLanguage.h"
 #include "lldb/Core/Module.h"
 #include "lldb/Core/Value.h"
+#include "lldb/Expression/Expression.h"
 #include "lldb/Host/Host.h"
 #include "lldb/Symbol/CompileUnit.h"
 #include "lldb/Symbol/Function.h"
@@ -254,7 +255,40 @@ static std::string MakeLLDBFuncAsmLabel(const DWARFDIE &die) {
   if (!name)
     return {};

-  return name;
+  SymbolFileDWARF *dwarf = die.GetDWARF();
+  if (!dwarf)
+    return {};
+
+  auto get_module_id = [&](SymbolFile *sym) {
+    if (!sym)
+      return LLDB_INVALID_UID;
+
+    auto *obj = sym->GetMainObjectFile();
+    if (!obj)
+      return LLDB_INVALID_UID;
+
+    auto module_sp = obj->GetModule();
+    if (!module_sp)
+      return LLDB_INVALID_UID;
+
+    return module_sp->GetID();
+  };
+
+  lldb::user_id_t module_id = get_module_id(dwarf->GetDebugMapSymfile());
+  if (module_id == LLDB_INVALID_UID)
+    module_id = get_module_id(dwarf);
+
+  if (module_id == LLDB_INVALID_UID)
+    return {};
+
+  const auto die_id = die.GetID();
+  if (die_id == LLDB_INVALID_UID)
+    return {};
+
+  return FunctionCallLabel{/*module_id=*/module_id,
+                           /*symbol_id=*/die_id,
+                           /*.lookup_name=*/name}
+      .toString();
 }

 TypeSP DWARFASTParserClang::ParseTypeFromClangModule(const SymbolContext &sc,
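When building the label, the module ID is taken from the debug-map symbol file if one exists, and from the DWARF symbol file itself otherwise. The sketch below reproduces that preference order with plain structs. `ModuleSketch`, `ObjectFileSketch`, `SymbolFileSketch`, `GetModuleID`, `PickModuleID` and `kInvalidUID` are hypothetical stand-ins for the LLDB types and for `LLDB_INVALID_UID`.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for LLDB_INVALID_UID.
constexpr std::uint64_t kInvalidUID = UINT64_MAX;

struct ModuleSketch {
  std::uint64_t id;
};
struct ObjectFileSketch {
  const ModuleSketch *module; // may be null
};
struct SymbolFileSketch {
  const ObjectFileSketch *main_object_file; // may be null
};

// Walks symbol file -> main object file -> module, bailing out with
// kInvalidUID as soon as any link in the chain is missing.
std::uint64_t GetModuleID(const SymbolFileSketch *sym) {
  if (!sym || !sym->main_object_file || !sym->main_object_file->module)
    return kInvalidUID;
  return sym->main_object_file->module->id;
}

// Prefers the debug-map symbol file; falls back to the DWARF symbol file.
std::uint64_t PickModuleID(const SymbolFileSketch *debug_map,
                           const SymbolFileSketch *dwarf) {
  std::uint64_t id = GetModuleID(debug_map);
  if (id == kInvalidUID)
    id = GetModuleID(dwarf);
  return id;
}
```

If both lookups fail, the patch returns an empty label rather than encoding an invalid ID, so no `AsmLabel` is produced for that function.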
