[X86] Set .llvmbc and .llvmcmd to exclude sections #151910

Closed
wants to merge 2 commits into from

HaohaiWen
Contributor

Once linked, the .llvmbc and .llvmcmd sections serve no purpose in the final
binary. They can occupy significant space and should simply be discarded.
This patch gives them the exclude section kind for ELF and COFF. Wasm does
not support the exclude kind, so there the sections remain metadata and are
dropped explicitly by the linker.

@HaohaiWen HaohaiWen requested review from MaskRay and ZequanWu August 4, 2025 07:20
@llvmbot llvmbot added backend:X86 llvm:codegen LTO Link time optimization (regular/full LTO or ThinLTO) labels Aug 4, 2025
@llvmbot
Member

llvmbot commented Aug 4, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-lto

Author: Haohai Wen (HaohaiWen)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/151910.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp (+9-4)
  • (modified) llvm/test/CodeGen/X86/embed-bitcode.ll (+4-2)
  • (modified) llvm/test/LTO/X86/embed-bitcode.ll (+1-1)
diff --git a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
index d19ef923ef740..1220bed503fc1 100644
--- a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -466,10 +466,12 @@ static SectionKind getELFKindForNamedSection(StringRef Name, SectionKind K) {
       Name == getInstrProfSectionName(IPSK_covdata, Triple::ELF,
                                       /*AddSegmentInfo=*/false) ||
       Name == getInstrProfSectionName(IPSK_covname, Triple::ELF,
-                                      /*AddSegmentInfo=*/false) ||
-      Name == ".llvmbc" || Name == ".llvmcmd")
+                                      /*AddSegmentInfo=*/false))
     return SectionKind::getMetadata();
 
+  if (Name == ".llvmbc" || Name == ".llvmcmd")
+    return SectionKind::getExclude();
+
   if (!Name.starts_with(".")) return K;
 
   // Default implementation based on some magic section names.
@@ -1735,9 +1737,12 @@ MCSection *TargetLoweringObjectFileCOFF::getExplicitSectionGlobal(
       Name == getInstrProfSectionName(IPSK_covdata, Triple::COFF,
                                       /*AddSegmentInfo=*/false) ||
       Name == getInstrProfSectionName(IPSK_covname, Triple::COFF,
-                                      /*AddSegmentInfo=*/false) ||
-      Name == ".llvmbc" || Name == ".llvmcmd")
+                                      /*AddSegmentInfo=*/false))
     Kind = SectionKind::getMetadata();
+
+  if (Name == ".llvmbc" || Name == ".llvmcmd")
+    Kind = SectionKind::getExclude();
+
   int Selection = 0;
   unsigned Characteristics = getCOFFSectionFlags(Kind, TM);
   StringRef COMDATSymName = "";
diff --git a/llvm/test/CodeGen/X86/embed-bitcode.ll b/llvm/test/CodeGen/X86/embed-bitcode.ll
index d4af9544bc1be..7b08c69926d8a 100644
--- a/llvm/test/CodeGen/X86/embed-bitcode.ll
+++ b/llvm/test/CodeGen/X86/embed-bitcode.ll
@@ -4,17 +4,19 @@
 ; RUN: llvm-readobj -S %t | FileCheck %s --check-prefix=COFF
 
 ; CHECK:      .text    PROGBITS 0000000000000000 [[#%x,OFF:]] 000000 00 AX 0
-; CHECK-NEXT: .llvmbc  PROGBITS 0000000000000000 [[#%x,OFF:]] 000004 00    0
-; CHECK-NEXT: .llvmcmd PROGBITS 0000000000000000 [[#%x,OFF:]] 000005 00    0
+; CHECK-NEXT: .llvmbc  PROGBITS 0000000000000000 [[#%x,OFF:]] 000004 00 E  0
+; CHECK-NEXT: .llvmcmd PROGBITS 0000000000000000 [[#%x,OFF:]] 000005 00 E  0
 
 ; COFF:      Name: .llvmbc (2E 6C 6C 76 6D 62 63 00)
 ; COFF:      Characteristics [
 ; COFF-NEXT:   IMAGE_SCN_ALIGN_1BYTES
+; COFF-NEXT:   IMAGE_SCN_LNK_REMOVE
 ; COFF-NEXT:   IMAGE_SCN_MEM_DISCARDABLE
 ; COFF-NEXT: ]
 ; COFF:      Name: .llvmcmd (2E 6C 6C 76 6D 63 6D 64)
 ; COFF:      Characteristics [
 ; COFF-NEXT:   IMAGE_SCN_ALIGN_1BYTES
+; COFF-NEXT:   IMAGE_SCN_LNK_REMOVE
 ; COFF-NEXT:   IMAGE_SCN_MEM_DISCARDABLE
 ; COFF-NEXT: ]
 
diff --git a/llvm/test/LTO/X86/embed-bitcode.ll b/llvm/test/LTO/X86/embed-bitcode.ll
index bdddd079d2265..d8ebbdf85bc32 100644
--- a/llvm/test/LTO/X86/embed-bitcode.ll
+++ b/llvm/test/LTO/X86/embed-bitcode.ll
@@ -19,7 +19,7 @@
 ; RUN: llvm-dis %t-embedded.bc -o - | FileCheck %s --check-prefixes=CHECK-LL,CHECK-NOOPT
 
 ; CHECK-ELF:      .text   PROGBITS 0000000000000000 [[#%x,OFF:]] [[#%x,SIZE:]] 00 AX 0
-; CHECK-ELF-NEXT: .llvmbc PROGBITS 0000000000000000 [[#%x,OFF:]] [[#%x,SIZE:]] 00    0
+; CHECK-ELF-NEXT: .llvmbc PROGBITS 0000000000000000 [[#%x,OFF:]] [[#%x,SIZE:]] 00 E  0
 
 ; CHECK-LL: @_start
 ; CHECK-LL: @foo
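
For readers decoding the test expectations in the diff above, here is a minimal, self-contained sketch of what the change amounts to at the flag level. It is not LLVM code: `Kind`, `kindForName`, `elfFlags`, `coffCharacteristics` and `demoFlags` are hypothetical names for illustration, and the mapping is inferred only from the updated CHECK lines (the ELF flags column going from empty to `E`, and COFF gaining `IMAGE_SCN_LNK_REMOVE` next to `IMAGE_SCN_MEM_DISCARDABLE`). The numeric values are the ones defined by the ELF and PE/COFF formats; the alignment bit is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for the section kinds the patch distinguishes.
enum class Kind { Metadata, Exclude, Other };

// Flag bits as defined by the ELF and PE/COFF formats.
constexpr std::uint32_t kShfExclude = 0x80000000u;             // ELF SHF_EXCLUDE, shown as "E"
constexpr std::uint32_t kImageScnLnkRemove = 0x00000800u;      // COFF IMAGE_SCN_LNK_REMOVE
constexpr std::uint32_t kImageScnMemDiscardable = 0x02000000u; // COFF IMAGE_SCN_MEM_DISCARDABLE

// Name-based classification. With patched == false the two sections are
// treated as metadata (the old behavior); with patched == true they are
// treated as exclude (the behavior this PR proposes).
constexpr Kind kindForName(std::string_view name, bool patched) {
  if (name == ".llvmbc" || name == ".llvmcmd")
    return patched ? Kind::Exclude : Kind::Metadata;
  return Kind::Other;
}

// ELF: per the test, only the exclude kind adds a flag for these sections.
constexpr std::uint32_t elfFlags(Kind k) {
  return k == Kind::Exclude ? kShfExclude : 0u;
}

// COFF: per the test, metadata is discardable and exclude additionally
// asks the linker to remove the section.
constexpr std::uint32_t coffCharacteristics(Kind k) {
  switch (k) {
  case Kind::Exclude:
    return kImageScnLnkRemove | kImageScnMemDiscardable;
  case Kind::Metadata:
    return kImageScnMemDiscardable;
  case Kind::Other:
    return 0u;
  }
  return 0u;
}

// Small self-check mirroring the before/after test expectations.
inline void demoFlags() {
  assert(elfFlags(kindForName(".llvmbc", false)) == 0u);
  assert(elfFlags(kindForName(".llvmbc", true)) == kShfExclude);
  assert(coffCharacteristics(kindForName(".llvmcmd", true)) ==
         (kImageScnLnkRemove | kImageScnMemDiscardable));
}
```

Both bits tell the linker to leave the section out of the linked output, which is why the discussion below turns on whether anyone consumes these sections after linking.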

@HaohaiWen HaohaiWen requested a review from sbc100 August 4, 2025 07:21
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Aug 4, 2025
@HaohaiWen HaohaiWen requested review from w2yehia and smithp35 August 4, 2025 08:25
@MaskRay
Member

MaskRay commented Aug 4, 2025

This is incorrect. Some use cases require the sections to be combined during linking and adding ELF SHF_EXCLUDE would break the usage. https://reviews.llvm.org/D86374

If you want these sections to be present in relocatable files only, specify /DISCARD/ in a linker script to discard them, or discard them after linking with llvm-objcopy.

@HaohaiWen
Contributor Author

> This is incorrect. Some use cases require the sections to be combined during linking and adding ELF SHF_EXCLUDE would break the usage. https://reviews.llvm.org/D86374

llvm-lto2 in llvm/test/LTO/X86/embed-bitcode.ll combines LLVM bitcode files, and its output is an object file. I don't see any test in D86374 that combines .llvmbc/.llvmcmd from object files into an executable or shared library and then extracts them for parsing.

The .llvmcmd section in each object file is the compiler flags joined with the separator '\0'. There is no begin/end marker, so simply concatenating these sections makes it nearly impossible to split them apart and map each piece back to its object file or bitcode.
I haven't looked at the LLVM bitcode format. It may have the same concatenation problem as .llvmcmd.
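
To make the .llvmcmd point concrete, here is a small sketch. It assumes a payload layout in which every flag is followed by a single '\0', which is one reading of the description above and not a statement about the exact bytes clang emits. `Flags`, `encodeCmd`, `splitCmd` and `demoAmbiguity` are hypothetical helper names, not LLVM APIs.

```cpp
#include <cassert>
#include <string>
#include <vector>

using Flags = std::vector<std::string>;

// Assumed payload layout: each flag followed by one '\0' byte.
std::string encodeCmd(const Flags &flags) {
  std::string payload;
  for (const std::string &f : flags) {
    payload += f;
    payload.push_back('\0');
  }
  return payload;
}

// Split a payload back into flags at every '\0'.
Flags splitCmd(const std::string &payload) {
  Flags flags;
  std::string current;
  for (char c : payload) {
    if (c == '\0') {
      flags.push_back(current);
      current.clear();
    } else {
      current.push_back(c);
    }
  }
  if (!current.empty())
    flags.push_back(current); // tolerate a missing final terminator
  return flags;
}

// Two objects merged by the linker are byte-identical to one object that
// was compiled with all six flags, so the object boundary is lost.
inline void demoAmbiguity() {
  const Flags a = {"-O2", "-c", "a.c"};
  const Flags b = {"-O0", "-c", "b.c"};
  // What plain section concatenation at link time would produce.
  const std::string merged = encodeCmd(a) + encodeCmd(b);
  // A single hypothetical command line containing every flag.
  const Flags all = {"-O2", "-c", "a.c", "-O0", "-c", "b.c"};
  assert(merged == encodeCmd(all));
  assert(splitCmd(merged) == all);
}
```

Splitting the merged payload returns a flat list of six flags with nothing to show where one object's command line ends and the next begins. A consumer would need a heuristic, or per-object data, to recover the grouping.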

If a user wants to recompile the bitcode with its command line, I think parsing these sections in the .o files is the better choice.

@MaskRay
Member

MaskRay commented Aug 5, 2025

I think not using SHF_EXCLUDE is an intentional design choice. There is some discussion at
https://discourse.llvm.org/t/end-to-end-fembed-bitcode-llvmbc-and-llvmcmd/56265

The concatenated bitcode might allow analysis like https://github.com/travitch/whole-program-llvm (though it doesn't seem to use embed-bitcode).

@HaohaiWen
Contributor Author

HaohaiWen commented Aug 5, 2025

> I think not using SHF_EXCLUDE is an intentional design choice. There is some discussion at https://discourse.llvm.org/t/end-to-end-fembed-bitcode-llvmbc-and-llvmcmd/56265
>
> The concatenated bitcode might allow analysis like https://github.com/travitch/whole-program-llvm (though it doesn't seem to use embed-bitcode).

Thanks for the explanation.
Someone has already removed these sections in the wasm linker.
I removed them from the PE image because of PE size limits (< 2GB for an executable and < 4GB for a valid PE32+ file): #150897.

Rather than trying to parse the combined .llvmbc and .llvmcmd, I think using the object/lib files is a better way to reproduce a compilation.

@HaohaiWen HaohaiWen closed this Aug 5, 2025
@HaohaiWen HaohaiWen deleted the cmd branch August 5, 2025 12:52
@HaohaiWen HaohaiWen restored the cmd branch August 5, 2025 12:54
@HaohaiWen HaohaiWen deleted the cmd branch August 5, 2025 12:55
@HaohaiWen
Contributor Author

Let's close this for now and revive the PR later if we confirm that there are no post-link users.
