-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[libc++][hardening] Add a greppable prefix to assertion messages. #150560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libc++][hardening] Add a greppable prefix to assertion messages. #150560
Conversation
@llvm/pr-subscribers-libcxx Author: Konstantin Varlamov (var-const) ChangesThe current assertion failure messages produced by Hardening are not Full diff: https://github.com/llvm/llvm-project/pull/150560.diff 2 Files Affected:
diff --git a/libcxx/include/__assert b/libcxx/include/__assert
index 90eaa6023587b..a9451daf47f2f 100644
--- a/libcxx/include/__assert
+++ b/libcxx/include/__assert
@@ -20,8 +20,8 @@
#define _LIBCPP_ASSERT(expression, message) \
(__builtin_expect(static_cast<bool>(expression), 1) \
? (void)0 \
- : _LIBCPP_ASSERTION_HANDLER(__FILE__ ":" _LIBCPP_TOSTRING(__LINE__) ": assertion " _LIBCPP_TOSTRING( \
- expression) " failed: " message "\n"))
+ : _LIBCPP_ASSERTION_HANDLER(__FILE__ ":" _LIBCPP_TOSTRING( \
+ __LINE__) ": libc++ Hardening assertion " _LIBCPP_TOSTRING(expression) " failed: " message "\n"))
// WARNING: __builtin_assume can currently inhibit optimizations. Only add assumptions with a clear
// optimization intent. See https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609 for a
diff --git a/libcxx/test/support/check_assertion.h b/libcxx/test/support/check_assertion.h
index a279400d651b4..ebd52848a6459 100644
--- a/libcxx/test/support/check_assertion.h
+++ b/libcxx/test/support/check_assertion.h
@@ -47,7 +47,7 @@ using Matcher = std::function<MatchResult(const std::string& /*text*/)>;
MatchResult MatchAssertionMessage(const std::string& text, std::string_view expected_message) {
// Extract information from the error message. This has to stay synchronized with how we format assertions in the
// library.
- std::regex assertion_format(".*###\\n(.*):(\\d+): assertion (.*) failed: (.*)\\n###");
+ std::regex assertion_format(".*###\\n(.*):(\\d+): libc\\+\\+ Hardening assertion (.*) failed: (.*)\\n###");
std::smatch match_result;
bool has_match = std::regex_match(text, match_result, assertion_format);
|
: _LIBCPP_ASSERTION_HANDLER(__FILE__ ":" _LIBCPP_TOSTRING(__LINE__) ": assertion " _LIBCPP_TOSTRING( \ | ||
expression) " failed: " message "\n")) | ||
: _LIBCPP_ASSERTION_HANDLER(__FILE__ ":" _LIBCPP_TOSTRING( \ | ||
__LINE__) ": libc++ Hardening assertion " _LIBCPP_TOSTRING(expression) " failed: " message "\n")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm very open to suggestions about how exactly to word the prefix. The ++
part in libc++
is slightly annoying since it needs escaping in the regex (no big deal, just a slight annoyance) -- personally, I'd lean towards using libcxx
for that reason, but happy with anything, really.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess libc++
will be more familiar to end-users than libcxx
. Would it also make sense to use h
instead of H
in Hardening
? I can imagine most people will just type hardening assertion
or hardening
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback!
To be clear about the use case, the main intent is to support people turning on the upcoming observe
mode and deliberately "grepping" (whichever actual tool they use) their logs for libc++ Hardening assertions. I presume we will document the prefix and they will be searching for a specific string -- for me at least, the intent is not to make our Hardening stand out in general. Of course, we should still strive to make the prefix readable and reasonable.
I guess
libc++
will be more familiar to end-users thanlibcxx
.
This is a good point.
Would it also make sense to use
h
instead ofH
inHardening
?
I don't feel super strongly about it, but recently I've been deliberately trying to find a way to make it clearer that we use the word "hardening" in a very specific sense, as the name of our specific mechanism, and not just in the general sense of making things more robust and secure. Capitalization is supposed to stress that this refers to a specific feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we intend for _LIBCPP_ASSERT
to only ever be used for hardening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we intend for
_LIBCPP_ASSERT
to only ever be used for hardening?
I think it depends how you think about it. Many existing assertions are technically not really "hardening" in the sense of "security hardening". But since we've decided to classify all of the assertions and simply discriminate them based on the mode selected by the user, we consider all of them to be part of "Hardening" (as in, the libc++ extension).
So because of that, yes I think that _LIBCPP_ASSERT
will always be used for "libc++ Hardening" related checks (since by definition any check we might add would fall into it), but not all of them are true "security hardening" in the general sense of it. True "security hardening" checks would be closer to the ones included in the fast (and maybe extensive) modes, probably without any check that is purely pedantic in nature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@var-const Maybe use LIBCPP as a prefix, which would be more consistent with libc++ naming conventions instead of your original "libcxx" alternative prefix? Just as the hardening macros' prefix themselves. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a consistent naming convention, unfortunately. Internal macros use the LIBCPP
prefix like you said, but CMake variables (which are public-facing) use LIBCXX
, and also the name of the directory within the LLVM repo is libcxx
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re. libc++
vs. some other spelling: I'd prefer libc++
. You can simply grep for Hardening assertion
if the pluses are too much for you. I doubt there is much of a clash with other libraries, and if there is, you probably care about these other library diagnostics as well. If you don't then oh well, you might have to escape them. It's not the end of the world.
I think I'm happy with the current message if we rename _LIBCPP_ASSERT
to make it clear that it's an implementation detail of hardening assertions (in a separate patch). That wasn't clear to me until now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, what I would do is something like this:
// This is the old _LIBCPP_ASSERT macro, we probably don't want to rename
// it but I'm using this new name here to clarify the intent
#define _LIBCPP_CONTRACT_VIOLATION(expression, message) \
(__builtin_expect(static_cast<bool>(expression), 1) \
? (void)0 \
: _LIBCPP_ASSERTION_HANDLER(message))
#define _LIBCPP_HARDENING_ASSERT(expression, message) \
_LIBCPP_CONTRACT_VIOLATION(expression, \
__FILE__ ":" _LIBCPP_TOSTRING(__LINE__) \
": assertion " _LIBCPP_TOSTRING(expression) " failed: " message "\n")
So basically, we have:
- A mechanism to set the "contract violation handler" (which we call the "assertion handler"). That is not tied to libc++ Hardening specifically
- A macro to trigger a "contract violation" (that is currently called
_LIBCPP_ASSERT
) - A macro that triggers a contract violation originating from a libc++ Hardening assertion (that's what we'd now call
_LIBCPP_HARDENING_ASSERT
)
That way, we clarify what is a Hardening assertion but we still retain a general purpose mechanism for setting (and talking about) contract violations which should be forward compatible if we add non-Hardening related assertions.
Connecting the current vendor-controlled way of defining a custom "contract violation handler aka assertion handler" is going to be a bit challenging but I think it's doable, hopefully while staying backwards compatible. Otherwise it's possible that vendors will need to change their stuff a little bit but it should be pretty self-contained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I really like this way of looking at it -- an assertion implements a contract violation, and Hardening happens to be the only part of the library that is currently using this. I'll do a follow-up to implement this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think that this patch makes sense and makes it a bit easier to distinguish where an assertion comes from, but let's finish the discussion in the review.
: _LIBCPP_ASSERTION_HANDLER(__FILE__ ":" _LIBCPP_TOSTRING(__LINE__) ": assertion " _LIBCPP_TOSTRING( \ | ||
expression) " failed: " message "\n")) | ||
: _LIBCPP_ASSERTION_HANDLER(__FILE__ ":" _LIBCPP_TOSTRING( \ | ||
__LINE__) ": libc++ Hardening assertion " _LIBCPP_TOSTRING(expression) " failed: " message "\n")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we intend for
_LIBCPP_ASSERT
to only ever be used for hardening?
I think it depends how you think about it. Many existing assertions are technically not really "hardening" in the sense of "security hardening". But since we've decided to classify all of the assertions and simply discriminate them based on the mode selected by the user, we consider all of them to be part of "Hardening" (as in, the libc++ extension).
So because of that, yes I think that _LIBCPP_ASSERT
will always be used for "libc++ Hardening" related checks (since by definition any check we might add would fall into it), but not all of them are true "security hardening" in the general sense of it. True "security hardening" checks would be closer to the ones included in the fast (and maybe extensive) modes, probably without any check that is purely pedantic in nature.
: _LIBCPP_ASSERTION_HANDLER(__FILE__ ":" _LIBCPP_TOSTRING(__LINE__) ": assertion " _LIBCPP_TOSTRING( \ | ||
expression) " failed: " message "\n")) | ||
: _LIBCPP_ASSERTION_HANDLER(__FILE__ ":" _LIBCPP_TOSTRING( \ | ||
__LINE__) ": libc++ Hardening assertion " _LIBCPP_TOSTRING(expression) " failed: " message "\n")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re. libc++
vs. some other spelling: I'd prefer libc++
. You can simply grep for Hardening assertion
if the pluses are too much for you. I doubt there is much of a clash with other libraries, and if there is, you probably care about these other library diagnostics as well. If you don't then oh well, you might have to escape them. It's not the end of the world.
I think I'm happy with the current message if we rename _LIBCPP_ASSERT
to make it clear that it's an implementation detail of hardening assertions (in a separate patch). That wasn't clear to me until now.
The current assertion failure messages produced by Hardening are not very grep-friendly (the common part is rarther generic and requires wildcards to match). While it's possible to use `__FILE__` for grepping, it's easier and more straighforward to simply add a libc++-specific prefix; this is especially important for the planned `observe` mode that might produce many assertion failure messages over the course of the program's execution that later need to be filtered and examined.
b0c645f
to
cecee23
Compare
/cherry-pick 4ef9246 |
Error: Command failed due to missing milestone. |
/cherry-pick 4ef9246 |
Failed to cherry-pick: 4ef9246 https://github.com/llvm/llvm-project/actions/runs/16655407332 Please manually backport the fix and push it to your github fork. Once this is done, please create a pull request |
/cherry-pick 4ef9246 |
/pull-request #151921 |
…vm#150560) The current assertion failure messages produced by Hardening are not very grep-friendly (the common part is rarther generic and requires wildcards to match). While it's possible to use `__FILE__` for grepping, it's easier and more straighforward to simply add a libc++-specific prefix; this is especially important for the planned `observe` mode that might produce many assertion failure messages over the course of the program's execution that later need to be filtered and examined. (cherry picked from commit 4ef9246)
The current assertion failure messages produced by Hardening are not
very grep-friendly (the common part is rarther generic and requires
wildcards to match). While it's possible to use
__FILE__
for grepping,it's easier and more straighforward to simply add a libc++-specific
prefix; this is especially important for the planned
observe
mode thatmight produce many assertion failure messages over the course of the
program's execution that later need to be filtered and examined.