-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[AArch64] Dont inline streaming fn into non-streaming caller #150595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Sander de Smalen (sdesmalen-arm) ChangesWithout this change, the following test would fail to compile
Full diff: https://github.com/llvm/llvm-project/pull/150595.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 40f49dade6131..38711206a9ea1 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -280,6 +280,15 @@ bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
if (CallAttrs.callee().isNewZA() || CallAttrs.callee().isNewZT0())
return false;
+ const TargetMachine &TM = getTLI()->getTargetMachine();
+ const FeatureBitset &CallerBits =
+ TM.getSubtargetImpl(*Caller)->getFeatureBits();
+
+ // Cannot inline a streaming function into a non-streaming function,
+ // if the caller has no SVE.
+ if (CallAttrs.requiresSMChange() && !CallerBits.test(AArch64::FeatureSVE))
+ return false;
+
if (CallAttrs.requiresLazySave() || CallAttrs.requiresSMChange() ||
CallAttrs.requiresPreservingZT0() ||
CallAttrs.requiresPreservingAllZAState()) {
@@ -287,9 +296,6 @@ bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
return false;
}
- const TargetMachine &TM = getTLI()->getTargetMachine();
- const FeatureBitset &CallerBits =
- TM.getSubtargetImpl(*Caller)->getFeatureBits();
const FeatureBitset &CalleeBits =
TM.getSubtargetImpl(*Callee)->getFeatureBits();
// Adjust the feature bitsets by inverting some of the bits. This is needed
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index 6cb16928ae6ca..ce323701cbe48 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -676,4 +676,26 @@ define void @streaming_caller_multiple_streaming_compatible_callees_inline() #0
ret void
}
+define void @nosve_streaming_function(ptr %ptr) "target-features"="+sme" "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define void @nosve_streaming_function
+; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR2]] {
+; CHECK-NEXT: store <vscale x 4 x i32> zeroinitializer, ptr [[PTR]], align 16
+; CHECK-NEXT: ret void
+;
+ store <vscale x 4 x i32> zeroinitializer, ptr %ptr
+ ret void
+}
+
+; Don't allow inlining a streaming function into a non-streaming function
+; if the non-streaming function has no SVE.
+define void @nosve_non_streaming_caller_streaming_callee_dont_inline(ptr %ptr) "target-features"="+sme" {
+; CHECK-LABEL: define void @nosve_non_streaming_caller_streaming_callee_dont_inline
+; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR1]] {
+; CHECK-NEXT: call void @nosve_streaming_function(ptr [[PTR]])
+; CHECK-NEXT: ret void
+;
+ call void @nosve_streaming_function(ptr %ptr)
+ ret void
+}
+
attributes #0 = { "target-features"="+sve,+sme" }
|
@llvm/pr-subscribers-backend-aarch64 Author: Sander de Smalen (sdesmalen-arm) ChangesWithout this change, the following test would fail to compile
Full diff: https://github.com/llvm/llvm-project/pull/150595.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 40f49dade6131..38711206a9ea1 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -280,6 +280,15 @@ bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
if (CallAttrs.callee().isNewZA() || CallAttrs.callee().isNewZT0())
return false;
+ const TargetMachine &TM = getTLI()->getTargetMachine();
+ const FeatureBitset &CallerBits =
+ TM.getSubtargetImpl(*Caller)->getFeatureBits();
+
+ // Cannot inline a streaming function into a non-streaming function,
+ // if the caller has no SVE.
+ if (CallAttrs.requiresSMChange() && !CallerBits.test(AArch64::FeatureSVE))
+ return false;
+
if (CallAttrs.requiresLazySave() || CallAttrs.requiresSMChange() ||
CallAttrs.requiresPreservingZT0() ||
CallAttrs.requiresPreservingAllZAState()) {
@@ -287,9 +296,6 @@ bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
return false;
}
- const TargetMachine &TM = getTLI()->getTargetMachine();
- const FeatureBitset &CallerBits =
- TM.getSubtargetImpl(*Caller)->getFeatureBits();
const FeatureBitset &CalleeBits =
TM.getSubtargetImpl(*Callee)->getFeatureBits();
// Adjust the feature bitsets by inverting some of the bits. This is needed
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index 6cb16928ae6ca..ce323701cbe48 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -676,4 +676,26 @@ define void @streaming_caller_multiple_streaming_compatible_callees_inline() #0
ret void
}
+define void @nosve_streaming_function(ptr %ptr) "target-features"="+sme" "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define void @nosve_streaming_function
+; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR2]] {
+; CHECK-NEXT: store <vscale x 4 x i32> zeroinitializer, ptr [[PTR]], align 16
+; CHECK-NEXT: ret void
+;
+ store <vscale x 4 x i32> zeroinitializer, ptr %ptr
+ ret void
+}
+
+; Don't allow inlining a streaming function into a non-streaming function
+; if the non-streaming function has no SVE.
+define void @nosve_non_streaming_caller_streaming_callee_dont_inline(ptr %ptr) "target-features"="+sme" {
+; CHECK-LABEL: define void @nosve_non_streaming_caller_streaming_callee_dont_inline
+; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR1]] {
+; CHECK-NEXT: call void @nosve_streaming_function(ptr [[PTR]])
+; CHECK-NEXT: ret void
+;
+ call void @nosve_streaming_function(ptr %ptr)
+ ret void
+}
+
attributes #0 = { "target-features"="+sve,+sme" }
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (with a minor comment, but I think this is okay as-is)
// Cannot inline a streaming function into a non-streaming function, | ||
// if the caller has no SVE. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think checking here is okay, but this feels like something hasPossibleIncompatibleOps
should notice (given in your example, the use of scalable types makes the operations incompatible).
define void @nosve_streaming_function(ptr %ptr) "target-features"="+sme" "aarch64_pstate_sm_enabled" { | ||
; CHECK-LABEL: define void @nosve_streaming_function | ||
; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR2]] { | ||
; CHECK-NEXT: store <vscale x 4 x i32> zeroinitializer, ptr [[PTR]], align 16 | ||
; CHECK-NEXT: ret void | ||
; | ||
store <vscale x 4 x i32> zeroinitializer, ptr %ptr | ||
ret void | ||
} | ||
|
||
; Don't allow inlining a streaming function into a non-streaming function | ||
; if the non-streaming function has no SVE. | ||
define void @nosve_non_streaming_caller_streaming_callee_dont_inline(ptr %ptr) "target-features"="+sme" { | ||
; CHECK-LABEL: define void @nosve_non_streaming_caller_streaming_callee_dont_inline | ||
; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR1]] { | ||
; CHECK-NEXT: call void @nosve_streaming_function(ptr [[PTR]]) | ||
; CHECK-NEXT: ret void | ||
; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the ATTR2
and ATTR1
checks are failing CI:
2025-07-25T09:28:26.9143177Z /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll:681:15: error: CHECK-SAME: expected string not found in input
2025-07-25T09:28:26.9144429Z ; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR2]] {
2025-07-25T09:28:26.9144838Z ^
2025-07-25T09:28:26.9145147Z <stdin>:278:38: note: scanning from here
2025-07-25T09:28:26.9145614Z define void @nosve_streaming_function(ptr %ptr) #9 {
2025-07-25T09:28:26.9146067Z ^
2025-07-25T09:28:26.9146497Z <stdin>:278:38: note: with "ATTR2" equal to "2"
2025-07-25T09:28:26.9146989Z define void @nosve_streaming_function(ptr %ptr) #9 {
2025-07-25T09:28:26.9147446Z ^
2025-07-25T09:28:26.9148713Z /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll:693:15: error: CHECK-SAME: expected string not found in input
2025-07-25T09:28:26.9149942Z ; CHECK-SAME: (ptr [[PTR:%.*]]) #[[ATTR1]] {
2025-07-25T09:28:26.9150351Z ^
2025-07-25T09:28:26.9150656Z <stdin>:283:69: note: scanning from here
2025-07-25T09:28:26.9151289Z define void @nosve_non_streaming_caller_streaming_callee_dont_inline(ptr %ptr) #10 {
2025-07-25T09:28:26.9151948Z ^
2025-07-25T09:28:26.9152585Z <stdin>:283:69: note: with "ATTR1" equal to "1"
2025-07-25T09:28:26.9153230Z define void @nosve_non_streaming_caller_streaming_callee_dont_inline(ptr %ptr) #10 {
I think this is due to the addition of target-features
8e6e7ee
to
932c27b
Compare
; CHECK-LABEL: define void @streaming_caller_multiple_streaming_callees_dont_inline | ||
; CHECK-SAME: () #[[ATTR6]] { | ||
; CHECK-NEXT: call void @streaming_caller_multiple_streaming_callees() | ||
define void @nonstreaming_caller_multiple_nonstreaming_callees_dont_inline() #0 "aarch64_pstate_sm_enabled" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is aarch64_pstate_sm_enabled
intentional here? The function is nonstreaming_caller
, so I'd guess not? Generally, I'd prefer that the precommit does not change any existing tests other than adding new cases (since it makes it harder to tell what's changed). If tests need tweaking, I'd be more obvious to see why as a commit after the LLVM change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is intentional, although I agree the name of the function makes no sense, but the name never made much sense before either. The name seems to relate to the mode of the callee and its callee's.
The reason for making this function aarch64_pstate_sm_enabled
is described in the commit message of the precommit:
I've had to modify some of the other tests to test inlining with streaming mode changes the
other way around (streaming <- non-streaming instead of non-streaming <- streaming)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I'd prefer that the precommit does not change any existing tests other than adding new cases (since it makes it harder to tell what's changed)
I've rejigged the test changes a bit. Let me know if this makes it easier to review.
Without this change, the following test would fail to compile with `-march=armv8-a+sme`: void func1(const svuint32_t *in, svuint32_t *out) { [&]() __arm_streaming { *out = *in; }(); } But in general, it's probably better never to inline streaming functions into non-streaming functions, because they will have been marked as 'streaming' for a reason by the user.
932c27b
to
c7b4c03
Compare
c7b4c03
to
443e282
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks! Just a few nits on comments in the tests:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Without this change, the following test would fail to compile
with
-march=armv8-a+sme
:But in general, it's probably better never to inline
streaming functions into non-streaming functions, because
they will have been marked as 'streaming' for a reason
by the user.