Skip to content

[AMDGPU] Ensure non-reserved CSR spilled regs are live-in #146427

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 1, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -983,6 +983,7 @@ void SIFrameLowering::emitCSRSpillStores(
const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
const SIInstrInfo *TII = ST.getInstrInfo();
const SIRegisterInfo &TRI = TII->getRegisterInfo();
MachineRegisterInfo &MRI = MF.getRegInfo();

// Spill Whole-Wave Mode VGPRs. Save only the inactive lanes of the scratch
// registers. However, save all lanes of callee-saved VGPRs. Due to this, we
Expand All @@ -1005,6 +1006,12 @@ void SIFrameLowering::emitCSRSpillStores(
}
};

for (const Register Reg : make_first_range(WWMScratchRegs)) {
if (!MRI.isReserved(Reg)) {
MRI.addLiveIn(Reg);
MBB.addLiveIn(Reg);
}
Comment on lines +1010 to +1013
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought all WWM registers should be reserved at this point, this is papering over not adding them to reserved?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We initially had the code that adds the wwm-regs conservatively to all Blocks LiveIn sets. We removed them earlier with one of the wwm regalloc/spill patches (I can't recollect the exact reason). I had this impression that we should at least add them to the entry block LiveIn set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reg is $agpr13.
Where would you expect it to get reserved? SIMachineFunctionInfo::allocateWWMSpill()?

}
StoreWWMRegisters(WWMScratchRegs);

auto EnableAllLanes = [&]() {
Expand Down
97 changes: 97 additions & 0 deletions llvm/test/CodeGen/AMDGPU/bug-undef-spilled-agpr.mir
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -run-pass=si-lower-sgpr-spills,greedy,si-lower-wwm-copies,virtregrewriter,prologepilog -verify-machineinstrs -o - %s | FileCheck -check-prefix=GCN %s

---
name: widget
tracksRegLiveness: true
frameInfo:
adjustsStack: true
stack:
- { id: 0, type: spill-slot, size: 4, alignment: 4, stack-id: sgpr-spill }
- { id: 1, type: spill-slot, size: 4, alignment: 4, stack-id: sgpr-spill }
machineFunctionInfo:
hasSpilledSGPRs: true
scratchRSrcReg: '$sgpr0_sgpr1_sgpr2_sgpr3'
stackPtrOffsetReg: '$sgpr32'
body: |
; GCN-LABEL: name: widget
; GCN: bb.0:
; GCN-NEXT: successors: %bb.1(0x80000000)
; GCN-NEXT: liveins: $sgpr12, $sgpr13, $sgpr14, $sgpr15, $agpr0
; GCN-NEXT: {{ $}}
; GCN-NEXT: $sgpr4_sgpr5 = S_XOR_SAVEEXEC_B64 -1, implicit-def $exec, implicit-def dead $scc, implicit $exec
; GCN-NEXT: $vgpr63 = V_ACCVGPR_READ_B32_e64 $agpr0, implicit $exec
; GCN-NEXT: BUFFER_STORE_DWORD_OFFSET $vgpr63, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec :: (store (s32) into %stack.2, addrspace 5)
; GCN-NEXT: $exec = S_MOV_B64 -1
; GCN-NEXT: BUFFER_STORE_DWORD_OFFSET killed $vgpr62, $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (store (s32) into %stack.3, addrspace 5)
; GCN-NEXT: $exec = S_MOV_B64 killed $sgpr4_sgpr5
; GCN-NEXT: renamable $vgpr62 = IMPLICIT_DEF
; GCN-NEXT: $vgpr62 = SI_SPILL_S32_TO_VGPR $sgpr15, 0, killed $vgpr62
; GCN-NEXT: $noreg = S_OR_SAVEEXEC_B64 -1, implicit-def $exec, implicit-def dead $scc, implicit $exec
; GCN-NEXT: renamable $agpr0 = COPY killed renamable $vgpr62
; GCN-NEXT: $exec = S_MOV_B64 killed $noreg
; GCN-NEXT: renamable $vgpr62 = IMPLICIT_DEF
; GCN-NEXT: dead renamable $vgpr62 = V_AND_B32_e32 1, killed $vgpr62, implicit $exec
; GCN-NEXT: {{ $}}
; GCN-NEXT: bb.1:
; GCN-NEXT: successors: %bb.3(0x40000000), %bb.2(0x40000000)
; GCN-NEXT: liveins: $agpr0
; GCN-NEXT: {{ $}}
; GCN-NEXT: S_CBRANCH_EXECZ %bb.2, implicit $exec
; GCN-NEXT: S_BRANCH %bb.3
; GCN-NEXT: {{ $}}
; GCN-NEXT: bb.2:
; GCN-NEXT: successors: %bb.4(0x04000000), %bb.1(0x7c000000)
; GCN-NEXT: liveins: $agpr0, $sgpr86, $sgpr87, $sgpr66_sgpr67, $sgpr68_sgpr69, $sgpr70_sgpr71, $sgpr80_sgpr81, $sgpr82_sgpr83, $sgpr84_sgpr85, $sgpr96_sgpr97, $sgpr98_sgpr99
; GCN-NEXT: {{ $}}
; GCN-NEXT: S_CBRANCH_EXECNZ %bb.1, implicit $exec
; GCN-NEXT: S_BRANCH %bb.4
; GCN-NEXT: {{ $}}
; GCN-NEXT: bb.3:
; GCN-NEXT: successors: %bb.2(0x80000000)
; GCN-NEXT: liveins: $agpr0
; GCN-NEXT: {{ $}}
; GCN-NEXT: $noreg = S_OR_SAVEEXEC_B64 -1, implicit-def $exec, implicit-def dead $scc, implicit $exec
; GCN-NEXT: renamable $vgpr62 = COPY renamable $agpr0
; GCN-NEXT: $exec = S_MOV_B64 killed $noreg
; GCN-NEXT: $sgpr14 = SI_RESTORE_S32_FROM_VGPR killed $vgpr62, 1
; GCN-NEXT: S_BRANCH %bb.2
; GCN-NEXT: {{ $}}
; GCN-NEXT: bb.4:
; GCN-NEXT: $sgpr4_sgpr5 = S_XOR_SAVEEXEC_B64 -1, implicit-def $exec, implicit-def dead $scc, implicit $exec
; GCN-NEXT: $vgpr63 = BUFFER_LOAD_DWORD_OFFSET $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 0, 0, 0, implicit $exec :: (load (s32) from %stack.2, addrspace 5)
; GCN-NEXT: $agpr0 = V_ACCVGPR_WRITE_B32_e64 killed $vgpr63, implicit $exec
; GCN-NEXT: $exec = S_MOV_B64 -1
; GCN-NEXT: $vgpr62 = BUFFER_LOAD_DWORD_OFFSET $sgpr0_sgpr1_sgpr2_sgpr3, $sgpr32, 4, 0, 0, implicit $exec :: (load (s32) from %stack.3, addrspace 5)
; GCN-NEXT: $exec = S_MOV_B64 killed $sgpr4_sgpr5
; GCN-NEXT: SI_RETURN
bb.0:
liveins: $sgpr12, $sgpr13, $sgpr14, $sgpr15
%45:vgpr_32 = IMPLICIT_DEF
SI_SPILL_S32_SAVE $sgpr15, %stack.0, implicit $exec, implicit $sgpr32 :: (store (s32) into %stack.0, addrspace 5)
%16:vgpr_32 = V_AND_B32_e32 1, %45, implicit $exec
bb.1:
successors: %bb.3, %bb.2
S_CBRANCH_EXECZ %bb.2, implicit $exec
S_BRANCH %bb.3
bb.2:
successors: %bb.4(0x04000000), %bb.1(0x7c000000)
liveins: $sgpr86, $sgpr87, $sgpr66_sgpr67, $sgpr68_sgpr69, $sgpr70_sgpr71, $sgpr80_sgpr81, $sgpr82_sgpr83, $sgpr84_sgpr85, $sgpr96_sgpr97, $sgpr98_sgpr99
S_CBRANCH_EXECNZ %bb.1, implicit $exec
S_BRANCH %bb.4
bb.3:
ADJCALLSTACKUP 0, 0, implicit-def dead $scc, implicit-def $sgpr32, implicit $sgpr32
$sgpr14 = SI_SPILL_S32_RESTORE %stack.1, implicit $exec, implicit $sgpr32 :: (load (s32) from %stack.1, addrspace 5)
ADJCALLSTACKDOWN 0, 28, implicit-def dead $scc, implicit-def $sgpr32, implicit $sgpr32
S_BRANCH %bb.2
bb.4:
SI_RETURN
...