Skip to content

[AArch64] Remove wrong processor feature #151289

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tomershafir
Copy link
Contributor

@tomershafir tomershafir commented Jul 30, 2025

Previously introduced by: #144152

@tomershafir tomershafir force-pushed the aarch64-remove-wrong-processor-feature branch from f76a5c5 to 30f404e Compare July 30, 2025 07:38
@tomershafir tomershafir marked this pull request as ready for review July 30, 2025 08:43
@llvmbot
Copy link
Member

llvmbot commented Jul 30, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Tomer Shafir (tomershafir)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/151289.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64Processors.td (-10)
  • (removed) llvm/test/CodeGen/AArch64/arm64-zero-cycle-regmov-fpr.ll (-103)
diff --git a/llvm/lib/Target/AArch64/AArch64Processors.td b/llvm/lib/Target/AArch64/AArch64Processors.td
index adc984ad795af..fa037df531028 100644
--- a/llvm/lib/Target/AArch64/AArch64Processors.td
+++ b/llvm/lib/Target/AArch64/AArch64Processors.td
@@ -318,7 +318,6 @@ def TuneAppleA7  : SubtargetFeature<"apple-a7", "ARMProcFamily", "AppleA7",
                                     FeatureFuseAES, FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing,
                                     FeatureZCZeroingFPWorkaround]>;
 
@@ -332,7 +331,6 @@ def TuneAppleA10 : SubtargetFeature<"apple-a10", "ARMProcFamily", "AppleA10",
                                     FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
 def TuneAppleA11 : SubtargetFeature<"apple-a11", "ARMProcFamily", "AppleA11",
@@ -345,7 +343,6 @@ def TuneAppleA11 : SubtargetFeature<"apple-a11", "ARMProcFamily", "AppleA11",
                                     FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
 def TuneAppleA12 : SubtargetFeature<"apple-a12", "ARMProcFamily", "AppleA12",
@@ -358,7 +355,6 @@ def TuneAppleA12 : SubtargetFeature<"apple-a12", "ARMProcFamily", "AppleA12",
                                     FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
 def TuneAppleA13 : SubtargetFeature<"apple-a13", "ARMProcFamily", "AppleA13",
@@ -371,7 +367,6 @@ def TuneAppleA13 : SubtargetFeature<"apple-a13", "ARMProcFamily", "AppleA13",
                                     FeatureFuseCryptoEOR,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
 def TuneAppleA14 : SubtargetFeature<"apple-a14", "ARMProcFamily", "AppleA14",
@@ -389,7 +384,6 @@ def TuneAppleA14 : SubtargetFeature<"apple-a14", "ARMProcFamily", "AppleA14",
                                     FeatureFuseLiterals,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
 def TuneAppleA15 : SubtargetFeature<"apple-a15", "ARMProcFamily", "AppleA15",
@@ -407,7 +401,6 @@ def TuneAppleA15 : SubtargetFeature<"apple-a15", "ARMProcFamily", "AppleA15",
                                     FeatureFuseLiterals,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
 def TuneAppleA16 : SubtargetFeature<"apple-a16", "ARMProcFamily", "AppleA16",
@@ -425,7 +418,6 @@ def TuneAppleA16 : SubtargetFeature<"apple-a16", "ARMProcFamily", "AppleA16",
                                     FeatureFuseLiterals,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
 def TuneAppleA17 : SubtargetFeature<"apple-a17", "ARMProcFamily", "AppleA17",
@@ -443,7 +435,6 @@ def TuneAppleA17 : SubtargetFeature<"apple-a17", "ARMProcFamily", "AppleA17",
                                     FeatureFuseLiterals,
                                     FeatureStorePairSuppress,
                                     FeatureZCRegMoveGPR64,
-                                    FeatureZCRegMoveFPR64,
                                     FeatureZCZeroing]>;
 
 def TuneAppleM4 : SubtargetFeature<"apple-m4", "ARMProcFamily", "AppleM4",
@@ -460,7 +451,6 @@ def TuneAppleM4 : SubtargetFeature<"apple-m4", "ARMProcFamily", "AppleM4",
                                      FeatureFuseCryptoEOR,
                                      FeatureFuseLiterals,
                                      FeatureZCRegMoveGPR64,
-                                     FeatureZCRegMoveFPR64,
                                      FeatureZCZeroing
                                      ]>;
 
diff --git a/llvm/test/CodeGen/AArch64/arm64-zero-cycle-regmov-fpr.ll b/llvm/test/CodeGen/AArch64/arm64-zero-cycle-regmov-fpr.ll
deleted file mode 100644
index f422f96f33495..0000000000000
--- a/llvm/test/CodeGen/AArch64/arm64-zero-cycle-regmov-fpr.ll
+++ /dev/null
@@ -1,103 +0,0 @@
-; RUN: llc < %s -mtriple=arm64-linux-gnu | FileCheck %s -check-prefixes=NOTCPU-LINUX --match-full-lines
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=generic | FileCheck %s -check-prefixes=NOTCPU-APPLE --match-full-lines
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=apple-m1 | FileCheck %s -check-prefixes=CPU --match-full-lines
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=apple-m1 -mattr=-zcm-fpr64 | FileCheck %s -check-prefixes=NOTATTR --match-full-lines
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mattr=+zcm-fpr64 | FileCheck %s -check-prefixes=ATTR --match-full-lines
-
-define void @zero_cycle_regmov_FPR32(float %a, float %b, float %c, float %d) {
-entry:
-; CHECK-LABEL: t:
-; NOTCPU-LINUX: fmov s0, s2
-; NOTCPU-LINUX: fmov s1, s3
-; NOTCPU-LINUX: fmov [[REG2:s[0-9]+]], s3
-; NOTCPU-LINUX: fmov [[REG1:s[0-9]+]], s2
-; NOTCPU-LINUX-NEXT: bl {{_?foo_float}}
-; NOTCPU-LINUX: fmov s0, [[REG1]]
-; NOTCPU-LINUX: fmov s1, [[REG2]]
-
-; NOTCPU-APPLE: fmov s0, s2
-; NOTCPU-APPLE: fmov s1, s3
-; NOTCPU-APPLE: fmov [[REG2:s[0-9]+]], s3
-; NOTCPU-APPLE: fmov [[REG1:s[0-9]+]], s2
-; NOTCPU-APPLE-NEXT: bl {{_?foo_float}}
-; NOTCPU-APPLE: fmov s0, [[REG1]]
-; NOTCPU-APPLE: fmov s1, [[REG2]]
-
-; CPU: fmov [[REG2:d[0-9]+]], d3
-; CPU: fmov [[REG1:d[0-9]+]], d2
-; CPU: fmov d0, d2
-; CPU: fmov d1, d3
-; CPU-NEXT: bl {{_?foo_float}}
-; CPU: fmov d0, [[REG1]]
-; CPU: fmov d1, [[REG2]]
-
-; NOTATTR: fmov [[REG2:s[0-9]+]], s3
-; NOTATTR: fmov [[REG1:s[0-9]+]], s2
-; NOTATTR: fmov s0, s2
-; NOTATTR: fmov s1, s3
-; NOTATTR-NEXT: bl {{_?foo_float}}
-; NOTATTR: fmov s0, [[REG1]]
-; NOTATTR: fmov s1, [[REG2]]
-
-; ATTR: fmov d0, d2
-; ATTR: fmov d1, d3
-; ATTR: fmov [[REG2:d[0-9]+]], d3
-; ATTR: fmov [[REG1:d[0-9]+]], d2
-; ATTR-NEXT: bl {{_?foo_float}}
-; ATTR: fmov d0, [[REG1]]
-; ATTR: fmov d1, [[REG2]]
-  %call = call float @foo_float(float %c, float %d)
-  %call1 = call float @foo_float(float %c, float %d)
-  unreachable
-}
-
-declare float @foo_float(float, float)
-
-define void @zero_cycle_regmov_FPR16(half %a, half %b, half %c, half %d) {
-entry:
-; CHECK-LABEL: t:
-; NOTCPU-LINUX: fmov s0, s2
-; NOTCPU-LINUX: fmov s1, s3
-; NOTCPU-LINUX: fmov [[REG2:s[0-9]+]], s3
-; NOTCPU-LINUX: fmov [[REG1:s[0-9]+]], s2
-; NOTCPU-LINUX-NEXT: bl {{_?foo_half}}
-; NOTCPU-LINUX: fmov s0, [[REG1]]
-; NOTCPU-LINUX: fmov s1, [[REG2]]
-
-; NOTCPU-APPLE: fmov s0, s2
-; NOTCPU-APPLE: fmov s1, s3
-; NOTCPU-APPLE: fmov [[REG2:s[0-9]+]], s3
-; NOTCPU-APPLE: fmov [[REG1:s[0-9]+]], s2
-; NOTCPU-APPLE-NEXT: bl {{_?foo_half}}
-; NOTCPU-APPLE: fmov s0, [[REG1]]
-; NOTCPU-APPLE: fmov s1, [[REG2]]
-
-; CPU: fmov [[REG2:d[0-9]+]], d3
-; CPU: fmov [[REG1:d[0-9]+]], d2
-; CPU: fmov d0, d2
-; CPU: fmov d1, d3
-; CPU-NEXT: bl {{_?foo_half}}
-; CPU: fmov d0, [[REG1]]
-; CPU: fmov d1, [[REG2]]
-
-; NOTATTR: fmov [[REG2:s[0-9]+]], s3
-; NOTATTR: fmov [[REG1:s[0-9]+]], s2
-; NOTATTR: fmov s0, s2
-; NOTATTR: fmov s1, s3
-; NOTATTR-NEXT: bl {{_?foo_half}}
-; NOTATTR: fmov s0, [[REG1]]
-; NOTATTR: fmov s1, [[REG2]]
-
-; ATTR: fmov d0, d2
-; ATTR: fmov d1, d3
-; ATTR: fmov [[REG2:d[0-9]+]], d3
-; ATTR: fmov [[REG1:d[0-9]+]], d2
-; ATTR-NEXT: bl {{_?foo_half}}
-; ATTR: fmov d0, [[REG1]]
-; ATTR: fmov d1, [[REG2]]
-  %call = call half @foo_half(half %c, half %d)
-  %call1 = call half @foo_half(half %c, half %d)
-  unreachable
-}
-
-declare half @foo_half(half, half)

@tomershafir tomershafir requested a review from jroelofs July 30, 2025 08:46
@DavidSpickett
Copy link
Collaborator

No opinion on the change itself but your PR description does need more detail. A link to the change that added it would be good too.

@tomershafir tomershafir force-pushed the aarch64-remove-wrong-processor-feature branch from 30f404e to 31d6c6f Compare July 30, 2025 10:12
@tomershafir
Copy link
Contributor Author

@DavidSpickett Done

@DavidSpickett
Copy link
Collaborator

Ok, better, but why is the feature wrong?

If I do a downstream merge and see that this change made my benchmarks go one way or the other, that's what I'd be looking for in the commit message. Perhaps the intent was good you just want to implement it differently, perhaps you found it made zero difference and doesn't need to be there, whatever the reason is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants