Skip to content

Commit a208623

Browse files
laddBigcheese
authored andcommitted
Improve module.pcm lock file performance on machines with high core counts
Summary: When building a large Xcode project with multiple module dependencies, and mixed Objective-C & Swift, I observed a large number of clang processes stalling at zero CPU for 30+ seconds throughout the build. This was especially prevalent on my 18-core iMac Pro. After some sampling, the major cause appears to be the lock file implementation for precompiled modules in the module cache. When the lock is heavily contended by multiple clang processes, the exponential backoff runs in lockstep, with some of the processes sleeping for 30+ seconds in order to acquire the file lock. In the attached patch, I implemented a more aggressive polling mechanism that limits the sleep interval to a max of 500ms, and randomizes the wait time. I preserved a limited form of exponential backoff. I also updated the code to use cross-platform timing, thread sleep, and random number capabilities available in C++11. On iMac Pro (2.3 GHz Intel Xeon W, 18 core): Xcode 11.1 bundled clang: 502.2 seconds (average of 5 runs) Custom clang build with LockFileManager patch applied: 276.6 seconds (average of 5 runs) This is a 1.82x speedup for this use case. On MacBook Pro (4 core 3.1GHz Intel i7): Xcode 11.1 bundled clang: 539.4 seconds (average of 2 runs) Custom clang build with LockFileManager patch applied: 509.5 seconds (average of 2 runs) As expected, machines with fewer cores benefit less from this change. ``` Call graph: 2992 Thread_393602 DispatchQueue_1: com.apple.main-thread (serial) 2992 start (in libdyld.dylib) + 1 [0x7fff6a1683d5] 2992 main (in clang) + 297 [0x1097a1059] 2992 driver_main(int, char const**) (in clang) + 2803 [0x1097a5513] 2992 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (in clang) + 1608 [0x1097a7cc8] 2992 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (in clang) + 3299 [0x1097dace3] 2992 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (in clang) + 509 [0x1097dcc1d] 2992 clang::FrontendAction::Execute() (in clang) + 42 [0x109818b3a] 2992 clang::ParseAST(clang::Sema&, bool, bool) (in clang) + 185 [0x10981b369] 2992 clang::Parser::ParseFirstTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&) (in clang) + 37 [0x10983e9b5] 2992 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&) (in clang) + 141 [0x10983ecfd] 2992 clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) (in clang) + 695 [0x10983f3b7] 2992 clang::Parser::ParseObjCAtDirectives(clang::Parser::ParsedAttributesWithRange&) (in clang) + 637 [0x10a9be9bd] 2992 clang::Parser::ParseModuleImport(clang::SourceLocation) (in clang) + 170 [0x10c4841ba] 2992 clang::Parser::ParseModuleName(clang::SourceLocation, llvm::SmallVectorImpl<std::__1::pair<clang::IdentifierInfo*, clang::SourceLocation> >&, bool) (in clang) + 503 [0x10c485267] 2992 clang::Preprocessor::Lex(clang::Token&) (in clang) + 316 [0x1098285cc] 2992 clang::Preprocessor::LexAfterModuleImport(clang::Token&) (in clang) + 690 [0x10cc7af62] 2992 clang::CompilerInstance::loadModule(clang::SourceLocation, llvm::ArrayRef<std::__1::pair<clang::IdentifierInfo*, clang::SourceLocation> >, clang::Module::NameVisibilityKind, bool) (in clang) + 7989 [0x10bba6535] 2992 compileAndLoadModule(clang::CompilerInstance&, clang::SourceLocation, clang::SourceLocation, clang::Module*, llvm::StringRef) (in clang) + 296 [0x10bba8318] 2992 llvm::LockFileManager::waitForUnlock() (in clang) + 91 [0x10b6953ab] 2992 nanosleep (in libsystem_c.dylib) + 199 [0x7fff6a22c914] 2992 __semwait_signal (in libsystem_kernel.dylib) + 10 [0x7fff6a2a0f32] ``` Differential Revision: https://reviews.llvm.org/D69575
1 parent b6ae893 commit a208623

File tree

2 files changed

+35
-33
lines changed

2 files changed

+35
-33
lines changed

llvm/include/llvm/Support/LockFileManager.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -78,8 +78,8 @@ class LockFileManager {
7878

7979
/// For a shared lock, wait until the owner releases the lock.
8080
/// Total timeout for the file to appear is ~1.5 minutes.
81-
/// \param MaxSeconds the maximum wait time per iteration in seconds.
82-
WaitForUnlockResult waitForUnlock(const unsigned MaxSeconds = 40);
81+
/// \param MaxSeconds the maximum total wait time in seconds.
82+
WaitForUnlockResult waitForUnlock(const unsigned MaxSeconds = 90);
8383

8484
/// Remove the lock file. This may delete a different lock file than
8585
/// the one previously read if there is a race.

llvm/lib/Support/LockFileManager.cpp

Lines changed: 33 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -17,12 +17,16 @@
1717
#include "llvm/Support/Signals.h"
1818
#include "llvm/Support/raw_ostream.h"
1919
#include <cerrno>
20+
#include <chrono>
2021
#include <ctime>
2122
#include <memory>
23+
#include <random>
2224
#include <sys/stat.h>
2325
#include <sys/types.h>
2426
#include <system_error>
27+
#include <thread>
2528
#include <tuple>
29+
2630
#ifdef _WIN32
2731
#include <windows.h>
2832
#endif
@@ -295,23 +299,29 @@ LockFileManager::waitForUnlock(const unsigned MaxSeconds) {
295299
if (getState() != LFS_Shared)
296300
return Res_Success;
297301

298-
#ifdef _WIN32
299-
unsigned long Interval = 1;
300-
#else
301-
struct timespec Interval;
302-
Interval.tv_sec = 0;
303-
Interval.tv_nsec = 1000000;
304-
#endif
302+
// Since we don't yet have an event-based method to wait for the lock file,
303+
// implement randomized exponential backoff, similar to Ethernet collision
304+
// algorithm. This improves performance on machines with high core counts
305+
// when the file lock is heavily contended by multiple clang processes
306+
const unsigned long MinWaitDurationMS = 10;
307+
const unsigned long MaxWaitMultiplier = 50; // 500ms max wait
308+
unsigned long WaitMultiplier = 1;
309+
unsigned long ElapsedTimeSeconds = 0;
310+
311+
std::random_device Device;
312+
std::default_random_engine Engine(Device());
313+
314+
auto StartTime = std::chrono::steady_clock::now();
315+
305316
do {
317+
// FIXME: implement event-based waiting
318+
306319
// Sleep for the designated interval, to allow the owning process time to
307320
// finish up and remove the lock file.
308-
// FIXME: Should we hook in to system APIs to get a notification when the
309-
// lock file is deleted?
310-
#ifdef _WIN32
311-
Sleep(Interval);
312-
#else
313-
nanosleep(&Interval, nullptr);
314-
#endif
321+
std::uniform_int_distribution<unsigned long> Distribution(1,
322+
WaitMultiplier);
323+
unsigned long WaitDurationMS = MinWaitDurationMS * Distribution(Engine);
324+
std::this_thread::sleep_for(std::chrono::milliseconds(WaitDurationMS));
315325

316326
if (sys::fs::access(LockFileName.c_str(), sys::fs::AccessMode::Exist) ==
317327
errc::no_such_file_or_directory) {
@@ -325,24 +335,16 @@ LockFileManager::waitForUnlock(const unsigned MaxSeconds) {
325335
if (!processStillExecuting((*Owner).first, (*Owner).second))
326336
return Res_OwnerDied;
327337

328-
// Exponentially increase the time we wait for the lock to be removed.
329-
#ifdef _WIN32
330-
Interval *= 2;
331-
#else
332-
Interval.tv_sec *= 2;
333-
Interval.tv_nsec *= 2;
334-
if (Interval.tv_nsec >= 1000000000) {
335-
++Interval.tv_sec;
336-
Interval.tv_nsec -= 1000000000;
338+
WaitMultiplier *= 2;
339+
if (WaitMultiplier > MaxWaitMultiplier) {
340+
WaitMultiplier = MaxWaitMultiplier;
337341
}
338-
#endif
339-
} while (
340-
#ifdef _WIN32
341-
Interval < MaxSeconds * 1000
342-
#else
343-
Interval.tv_sec < (time_t)MaxSeconds
344-
#endif
345-
);
342+
343+
ElapsedTimeSeconds = std::chrono::duration_cast<std::chrono::seconds>(
344+
std::chrono::steady_clock::now() - StartTime)
345+
.count();
346+
347+
} while (ElapsedTimeSeconds < MaxSeconds);
346348

347349
// Give up.
348350
return Res_Timeout;

0 commit comments

Comments
 (0)