From ccd6021fc9bf523e3dafbf7dff80027c47410204 Mon Sep 17 00:00:00 2001
From: Aaron Ballman
Date: Fri, 1 Aug 2025 10:08:35 -0400
Subject: [PATCH 1/7] [Docs] Some updates to the Clang user's manual

* Fills out the terminology section
* Removes the basic usage section (we should bring it back someday though!)
* Updates the list of supported language versions
* Adds information about what versions of Clang are officially supported
* Moves some extensions into the intentionally unsupported extensions
  section.

There are likely far more updates that could be done, but this seemed
worth posting just to get things moving.
---
 clang/docs/UsersManual.rst | 105 +++++++++++++++++++++----------------
 1 file changed, 59 insertions(+), 46 deletions(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index af0a8746d45e7..06a867ef38fa4 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -36,7 +36,7 @@ language-specific information, please see the corresponding language
 specific section:

 - :ref:`C Language `: K&R C, ANSI C89, ISO C90, ISO C94 (C89+AMD1), ISO
-  C99 (+TC1, TC2, TC3).
+  C99 (+TC1, TC2, TC3), C11, C17, C23, and C2y.
 - :ref:`Objective-C Language `: ObjC 1, ObjC 2, ObjC 2.1, plus
   variants depending on base language.
 - :ref:`C++ Language `
@@ -60,29 +60,46 @@ features that depend on what CPU architecture or operating system is
 being compiled for. Please see the :ref:`Target-Specific Features and
 Limitations ` section for more details.

-The rest of the introduction introduces some basic :ref:`compiler
-terminology ` that is used throughout this manual and
-contains a basic :ref:`introduction to using Clang ` as a
-command line compiler.
-
 .. _terminology:

 Terminology
 -----------

+* Lexer -- the part of the compiler responsible for converting source code into
+  abstract representations called tokens.
+* Preprocessor -- the part of the compiler responsible for in-place textual
+  replacement of source constructs. When the lexer is required to produce a
+  token, it will run the preprocessor while determining which token to produce.
+  In other words, when the lexer encounters something like `#include` or a macro
+  name, the preprocessor will be used to perform the inclusion or expand the
+  macro name into its replacement list, and return the resulting non-preprocessor
+  token.
+* Parser -- the part of the compiler responsible for determining syntactic
+  correctness of the source code. The parser will request tokens from the lexer
+  and after performing semantic analysis of the production, generates an
+  abstract representation of the source called an AST.
+* Diagnostic -- a message to the user about properties of the source code. For
+  example, errors or warnings and their associated notes.
+* Undefined behavior -- behavior for which the standard imposes no requirements
+  on how the code behaves. Generally speaking, undefined behavior is a bug in
+  the user's code. However, it can also be a place for the compiler to define
+  the behavior, called an extension.
+* Optimizer -- the part of the compiler responsible for transforming user code
+  into faster user code, without changing the semantics of how the code behaves.
+  Note, the optimizer assumes the code has no undefined behavior, so if the code
+  does contain undefined behavior, it will often behave differently depending on
+  which optimization level is enabled.
+* Front end -- the Lexer, Preprocessor, Parser, semantic analysis, and LLVM IR
+  code generation parts of the compiler.
+* Backend -- the parts of the compiler which run after LLVM IR code generation,
+  such as the optimizer.
+
+Support
+-------
+Clang releases happen roughly `every six months `_.
+Only the current public release is officially supported. Bug-fix releases for
+the current release will be produced on an as-needed basis, but bug fixes are
+not backported to releases older than the current one.
-Front end, parser, backend, preprocessor, undefined behavior,
-diagnostic, optimizer
-
-.. _basicusage:
-
-Basic Usage
------------
-
-Intro to how to use a C compiler for newbies.
-
-compile + link compile then link debug info enabling optimizations
-picking a language to use, defaults to C17 by default. Autosenses based
-on extension. using a makefile

 Command Line Options
 ====================
@@ -3797,8 +3814,8 @@ This environment variable does not affect the options added by the config files.
 C Language Features
 ===================

-The support for standard C in clang is feature-complete except for the
-C99 floating-point pragmas.
+The support for standard C in Clang is mostly feature-complete, see the `C
+status page `_ for more details.

 Extensions supported by clang
 -----------------------------
@@ -3883,23 +3900,10 @@ GCC extensions not implemented yet
 ----------------------------------

 clang tries to be compatible with gcc as much as possible, but some gcc
-extensions are not implemented yet:
+extensions are not implemented:

 - clang does not support decimal floating point types (``_Decimal32`` and
   friends) yet.
-- clang does not support nested functions; this is a complex feature
-  which is infrequently used, so it is unlikely to be implemented
-  anytime soon. In C++11 it can be emulated by assigning lambda
-  functions to local variables, e.g:
-
-  .. code-block:: cpp
-
-    auto const local_function = [&](int parameter) {
-      // Do something
-    };
-    ...
-    local_function(1);
-
 - clang only supports global register variables when the register specified
   is non-allocatable (e.g. the stack pointer). Support for general global
   register variables is unlikely to be implemented soon because it requires
@@ -3914,18 +3918,13 @@ extensions are not implemented yet:
   that because clang pretends to be like GCC 4.2, and this extension
   was introduced in 4.3, the glibc headers will not try to use this
   extension with clang at the moment.
-- clang does not support the gcc extension for forward-declaring
-  function parameters; this has not shown up in any real-world code
-  yet, though, so it might never be implemented.

 This is not a complete list; if you find an unsupported extension
-missing from this list, please send an e-mail to cfe-dev. This list
-currently excludes C++; see :ref:`C++ Language Features `. Also, this
-list does not include bugs in mostly-implemented features; please see
-the `bug
-tracker `_
-for known existing bugs (FIXME: Is there a section for bug-reporting
-guidelines somewhere?).
+missing from this list, please file a `feature request `_.
+This list currently excludes C++; see :ref:`C++ Language Features `. Also,
+this list does not include bugs in mostly-implemented features; please see the
+`issues list `_ for known existing
+bugs.

 Intentionally unsupported GCC extensions
 ----------------------------------------
@@ -3944,6 +3943,20 @@ Intentionally unsupported GCC extensions
   variable) will likely never be accepted by Clang.
 - clang does not support ``__builtin_apply`` and friends; this extension is
   extremely obscure and difficult to implement reliably.
+- clang does not support the gcc extension for forward-declaring
+  function parameters.
+- clang does not support nested functions; this is a complex feature which is
+  infrequently used, so it is unlikely to be implemented. In C++11 it can be
+  emulated by assigning lambda functions to local variables, e.g:
+
+  .. code-block:: cpp
+
+    auto const local_function = [&](int parameter) {
+      // Do something
+    };
+    ...
+    local_function(1);
+

 .. _c_ms:

@@ -3983,7 +3996,7 @@ C++ Language Features

 clang fully implements all of standard C++98 except for exported
 templates (which were removed in C++11), all of standard C++11,
-C++14, and C++17, and most of C++20.
+C++14, and C++17, and most of C++20 and C++23.

 See the `C++ support in Clang `_ page
 for detailed information on C++ feature support across Clang versions.

From 4da6e09b3555daab4fcfc92667f52d6c1c241e4e Mon Sep 17 00:00:00 2001
From: Aaron Ballman
Date: Fri, 1 Aug 2025 10:27:01 -0400
Subject: [PATCH 2/7] Update based on review feedback

---
 clang/docs/UsersManual.rst | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 06a867ef38fa4..5abd11abef1c9 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -77,6 +77,9 @@ Terminology
   correctness of the source code. The parser will request tokens from the lexer
   and after performing semantic analysis of the production, generates an
   abstract representation of the source called an AST.
+* Sema -- the part of the compiler responsible for determining semantic
+  correctness of the source code. It is closely related to the parser and is
+  where many diagnostics are produced.
 * Diagnostic -- a message to the user about properties of the source code. For
   example, errors or warnings and their associated notes.
 * Undefined behavior -- behavior for which the standard imposes no requirements
@@ -88,11 +91,14 @@ Terminology
   Note, the optimizer assumes the code has no undefined behavior, so if the code
   does contain undefined behavior, it will often behave differently depending on
   which optimization level is enabled.
-* Front end -- the Lexer, Preprocessor, Parser, semantic analysis, and LLVM IR
-  code generation parts of the compiler.
+* Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
+* Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
 * Backend -- the parts of the compiler which run after LLVM IR code generation,
   such as the optimizer.

+See the :doc:`InternalsManual` for more details about the internal construction
+of the compiler.
+
 Support
 -------
 Clang releases happen roughly `every six months `_.

From b206c19f9ebf63bd38ca5b496261cbbf3a6e7d7c Mon Sep 17 00:00:00 2001
From: Aaron Ballman
Date: Fri, 1 Aug 2025 10:34:47 -0400
Subject: [PATCH 3/7] Update the optimizer terminology based on review feedback

---
 clang/docs/UsersManual.rst | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 5abd11abef1c9..ecce39b8c0a30 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -86,11 +86,11 @@ Terminology
   on how the code behaves. Generally speaking, undefined behavior is a bug in
   the user's code. However, it can also be a place for the compiler to define
   the behavior, called an extension.
-* Optimizer -- the part of the compiler responsible for transforming user code
-  into faster user code, without changing the semantics of how the code behaves.
-  Note, the optimizer assumes the code has no undefined behavior, so if the code
-  does contain undefined behavior, it will often behave differently depending on
-  which optimization level is enabled.
+* Optimizer -- the part of the compiler responsible for transforming code to
+  have better performance characteristics without changing the semantics of how
+  the code behaves. Note, the optimizer assumes the code has no undefined
+  behavior, so if the code does contain undefined behavior, it will often behave
+  differently depending on which optimization level is enabled.
 * Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
 * Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
 * Backend -- the parts of the compiler which run after LLVM IR code generation,

From 03feb55dc759c425f53c1f6981f262df80cfc896 Mon Sep 17 00:00:00 2001
From: Aaron Ballman
Date: Fri, 1 Aug 2025 10:35:49 -0400
Subject: [PATCH 4/7] Add C++ standards

---
 clang/docs/UsersManual.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index ecce39b8c0a30..cfdacaf229d56 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -39,7 +39,8 @@ specific section:
   C99 (+TC1, TC2, TC3), C11, C17, C23, and C2y.
 - :ref:`Objective-C Language `: ObjC 1, ObjC 2, ObjC 2.1, plus
   variants depending on base language.
-- :ref:`C++ Language `
+- :ref:`C++ Language `: C++98, C++03, C++11, C++14, C++17, C++20, C++23,
+  and C++2c.
 - :ref:`Objective C++ Language `
 - :ref:`OpenCL Kernel Language `: OpenCL C 1.0, 1.1, 1.2, 2.0, 3.0,
   and C++ for OpenCL 1.0 and 2021.

From 3792dde486f0ecccf9feb48f09805951fb115cb4 Mon Sep 17 00:00:00 2001
From: Aaron Ballman
Date: Fri, 1 Aug 2025 10:37:11 -0400
Subject: [PATCH 5/7] Remove some repeated mentions of ISO

---
 clang/docs/UsersManual.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index cfdacaf229d56..ed51246555285 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -35,8 +35,8 @@ which includes :ref:`C `, :ref:`Objective-C `, :ref:`C++ `, and
 language-specific information, please see the corresponding language
 specific section:

-- :ref:`C Language `: K&R C, ANSI C89, ISO C90, ISO C94 (C89+AMD1), ISO
-  C99 (+TC1, TC2, TC3), C11, C17, C23, and C2y.
+- :ref:`C Language `: K&R C, ANSI C89, ISO C90, C94 (C89+AMD1), C99 (+TC1,
+  TC2, TC3), C11, C17, C23, and C2y.
 - :ref:`Objective-C Language `: ObjC 1, ObjC 2, ObjC 2.1, plus
   variants depending on base language.
 - :ref:`C++ Language `: C++98, C++03, C++11, C++14, C++17, C++20, C++23,

From af7feb144393aca7934a0ca038fa95b2ca037f51 Mon Sep 17 00:00:00 2001
From: Aaron Ballman
Date: Fri, 1 Aug 2025 11:05:59 -0400
Subject: [PATCH 6/7] Update based on review feedback

---
 clang/docs/UsersManual.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index ed51246555285..d51d6fc39e002 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -40,7 +40,7 @@ specific section:
 - :ref:`Objective-C Language `: ObjC 1, ObjC 2, ObjC 2.1, plus
   variants depending on base language.
 - :ref:`C++ Language `: C++98, C++03, C++11, C++14, C++17, C++20, C++23,
-  and C++2c.
+  and C++26.
 - :ref:`Objective C++ Language `
 - :ref:`OpenCL Kernel Language `: OpenCL C 1.0, 1.1, 1.2, 2.0, 3.0,
   and C++ for OpenCL 1.0 and 2021.
@@ -95,7 +95,7 @@ Terminology
 * Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
 * Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
 * Backend -- the parts of the compiler which run after LLVM IR code generation,
-  such as the optimizer.
+  such as the optimizer and generation of assembly code.

 See the :doc:`InternalsManual` for more details about the internal construction
 of the compiler.

From b654a02731d737466dfdf2de9a2e7f56e4bb1aa0 Mon Sep 17 00:00:00 2001
From: Aaron Ballman
Date: Fri, 1 Aug 2025 11:07:02 -0400
Subject: [PATCH 7/7] More updates based on review feedback

---
 clang/docs/UsersManual.rst | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index d51d6fc39e002..c7039290ec6d5 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -92,8 +92,11 @@ Terminology
   the code behaves. Note, the optimizer assumes the code has no undefined
   behavior, so if the code does contain undefined behavior, it will often behave
   differently depending on which optimization level is enabled.
-* Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
-* Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
+* Frontend -- the Lexer, Preprocessor, Parser, Sema, and LLVM IR code generation
+  parts of the compiler.
+* Middle-end -- a term used for the subset of the backend that does
+  (typically not target-specific) optimizations prior to assembly code
+  generation.
 * Backend -- the parts of the compiler which run after LLVM IR code generation,
   such as the optimizer and generation of assembly code.