Skip to content

[lldb] Zero extend APInt when piece size is bigger than the bitwidth #150149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 4, 2025

Conversation

satyajanga
Copy link
Contributor

@satyajanga satyajanga commented Jul 23, 2025

Summary

We have internally seen cases like this
DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28
where we have longer op pieces than what Scalar supports (32, 64 or 128 bits). In these cases LLDB is currently hitting the assertion assert(ap_int.getBitWidth() >= bit_size);

We are extending the generated APInt to the piece size by filling zeros.

Test plan

Added a unit to cover this case.

Reviewers

@clayborg , @jeffreytan81 , @Jlalond

@satyajanga satyajanga force-pushed the dw_op_piece_zest branch 4 times, most recently from 0431e9a to 3cec63f Compare July 23, 2025 05:30
@satyajanga satyajanga marked this pull request as ready for review July 23, 2025 05:54
@satyajanga satyajanga requested a review from JDevlieghere as a code owner July 23, 2025 05:54
@llvmbot llvmbot added the lldb label Jul 23, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 23, 2025

@llvm/pr-subscribers-lldb

Author: satyanarayana reddy janga (satyajanga)

Changes

Summary

We have internally seen cases like this
DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28
where we have longer op pieces than what Scalar supports (32, 64 or 128 bits). In these cases LLDB is currently hitting the assertion assert(ap_int.getBitWidth() >= bit_size);

We are extending the generated APInt to the piece size by filling zeros.

Test plan

Added a unit to cover this case.


Full diff: https://github.com/llvm/llvm-project/pull/150149.diff

2 Files Affected:

  • (modified) lldb/source/Expression/DWARFExpression.cpp (+6-1)
  • (modified) lldb/unittests/Expression/DWARFExpressionTest.cpp (+21)
diff --git a/lldb/source/Expression/DWARFExpression.cpp b/lldb/source/Expression/DWARFExpression.cpp
index 52891fcefd68b..c00795b97467b 100644
--- a/lldb/source/Expression/DWARFExpression.cpp
+++ b/lldb/source/Expression/DWARFExpression.cpp
@@ -1978,7 +1978,12 @@ llvm::Expected<Value> DWARFExpression::Evaluate(
             // grows to the nearest host integer type.
             llvm::APInt fail_value(1, 0, false);
             llvm::APInt ap_int = scalar.UInt128(fail_value);
-            assert(ap_int.getBitWidth() >= bit_size);
+            // We have seen a case where we have expression like:
+            //      DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28
+            // here we are assuming the compiler was trying to zero
+            // extend the value that we should append to the buffer.
+            if (ap_int.getBitWidth() < bit_size)
+              ap_int = ap_int.zext(bit_size);
             llvm::ArrayRef<uint64_t> buf{ap_int.getRawData(),
                                          ap_int.getNumWords()};
             curr_piece.GetScalar() = Scalar(llvm::APInt(bit_size, buf));
diff --git a/lldb/unittests/Expression/DWARFExpressionTest.cpp b/lldb/unittests/Expression/DWARFExpressionTest.cpp
index fdc9bfae1876c..86c3b56e320fd 100644
--- a/lldb/unittests/Expression/DWARFExpressionTest.cpp
+++ b/lldb/unittests/Expression/DWARFExpressionTest.cpp
@@ -358,6 +358,27 @@ TEST(DWARFExpression, DW_OP_piece) {
       llvm::HasValue(GetScalar(16, 0xff00, true)));
 }
 
+TEST(DWARFExpression, DW_OP_piece_host_address) {
+  static const uint8_t expr_data[] = {DW_OP_lit2, DW_OP_stack_value,
+                                      DW_OP_piece, 40};
+  llvm::ArrayRef<uint8_t> expr(expr_data, sizeof(expr_data));
+  DataExtractor extractor(expr.data(), expr.size(), lldb::eByteOrderLittle, 4);
+
+  // This tests if ap_int is extended to the right width.
+  // expect 40*8 = 320 bits size.
+  llvm::Expected<Value> result =
+      DWARFExpression::Evaluate(nullptr, nullptr, nullptr, extractor, nullptr,
+                                lldb::eRegisterKindDWARF, nullptr, nullptr);
+  ASSERT_THAT_EXPECTED(result, llvm::Succeeded());
+  ASSERT_EQ(result->GetValueType(), Value::ValueType::HostAddress);
+  ASSERT_EQ(result->GetBuffer().GetByteSize(), 40ul);
+  const uint8_t *data = result->GetBuffer().GetBytes();
+  ASSERT_EQ(data[0], 2);
+  for (int i = 1; i < 40; i++) {
+    ASSERT_EQ(data[i], 0);
+  }
+}
+
 TEST(DWARFExpression, DW_OP_implicit_value) {
   unsigned char bytes = 4;
 

@nikic nikic changed the title Zero extend APInt when piece size is bigger than the bitwidth [lldb] Zero extend APInt when piece size is bigger than the bitwidth Jul 23, 2025
// DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28
// here we are assuming the compiler was trying to zero
// extend the value that we should append to the buffer.
if (ap_int.getBitWidth() < bit_size)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, isn't this effectively just max(bit_Size, ap_int.getBitWidth())

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, APInt and APSInt classes can handle any sized integers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of expression seems to want to zero fill in the value.

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am good with this. We should wait for anyone else to chime in for a day or two.

// DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28
// here we are assuming the compiler was trying to zero
// extend the value that we should append to the buffer.
if (ap_int.getBitWidth() < bit_size)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, APInt and APSInt classes can handle any sized integers.

// DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28
// here we are assuming the compiler was trying to zero
// extend the value that we should append to the buffer.
if (ap_int.getBitWidth() < bit_size)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of expression seems to want to zero fill in the value.

@clayborg clayborg requested review from labath and dwblaikie July 24, 2025 17:21
@clayborg
Copy link
Collaborator

I added Pavel Labath and David Blaikie in case they have any info on this.

The story if we have some RiscV code that is creating ___location values for variables that use DW_OP_piece and they are using:

DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28

to try and zero fill in the value. Not sure if this is common or if there is a better way to do the zero fill. But clang is producing this. as a series of DW_OP_constu XXX, DW_OP_stack_value, DW_OP_piece 0x8 followed by the above expression.

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that what clang is doing is completely compliant. According to the standard, DWARF expression values """can represent a value of any supported base type of the target machine. Instead of a base type, elements can have a generic type, which is an integral type that has the size of an address on the target machine and unspecified signedness""", so using a single value to represent 40 bytes is seems a bit dodgy. It's kind of obvious what you mean if the value is zero, but it gets a bit fuzzy for other values (where do you place that value, do you sign-extend it, etc.)

Nevertheless, looking at this from the consumer side, I think what you've done is a reasonable interpretation of this, and is better than crashing. I'm just wondering if there isn't a simpler way to express this. See the inline comment about a possible simplification.

Comment on lines 1979 to 1989
llvm::APInt fail_value(1, 0, false);
llvm::APInt ap_int = scalar.UInt128(fail_value);
assert(ap_int.getBitWidth() >= bit_size);
// We have seen a case where we have expression like:
// DW_OP_lit0, DW_OP_stack_value, DW_OP_piece 0x28
// here we are assuming the compiler was trying to zero
// extend the value that we should append to the buffer.
if (ap_int.getBitWidth() < bit_size)
ap_int = ap_int.zext(bit_size);
llvm::ArrayRef<uint64_t> buf{ap_int.getRawData(),
ap_int.getNumWords()};
curr_piece.GetScalar() = Scalar(llvm::APInt(bit_size, buf));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't all of this equivalent to curr_piece.GetScalar() = scalar.TruncOrExtendTo(bit_size, /*sign=*/false)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't all of this equivalent to curr_piece.GetScalar() = scalar.TruncOrExtendTo(bit_size, /*sign=*/false)?

Yes. it seems so.
if the scalar has float value set then it wont work, but not sure if that case ever happens.
@labath do you know if the scalar contains a float value in the dwarf expression followed by DW_OP_Piece?

Copy link
Collaborator

@labath labath Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, but I'm not particularly worried by that given that the only other call of this function is also inside DWARFExpression (in DW_OP_convert). If that becomes an issue, we can address both cases together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, but I'm not particularly worried by that given that the only other call of this function is also inside DWARFExpression (in DW_OP_convert). If that becomes an issue, we can address both cases together.

scalar.UInt128(fail_value) seems to robust to handle all the edge cases. So I am leaving the way it is. let me know if you think otherwise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these two cases should be unified I don't see any reason why extension (or truncation) in DW_OP_piece should behave any differently from DW_OP_convert. And I don't like how this violates "campground rules" (leave the place in a better state than you found it in): instead of cleaning things up, it piles onto the existing dodgy implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. updated based on the recommendation.

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: https://phabricator.intern.facebook.com/D78791142
Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@Jlalond
Copy link
Contributor

Jlalond commented Aug 4, 2025

Merging for @satyajanga

@Jlalond Jlalond merged commit a0db29d into llvm:main Aug 4, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants