What is the reason for double NULL check of pointer for mutex lock
I recently read a book about system software. There is an example in it that I don't understand.

volatile T* pInst = 0;
T* GetInstance()
{
    if (pInst == NULL)
    {
        lock();
        if (pInst == NULL)
            pInst = new T;
        unlock();
    }
    return pInst;
}

Why does the author check (pInst == NULL) twice?

c++ if-statement locking
en.wikipedia.org/wiki/Double-checked_locking
– interjay, Jun 4 at 9:25

@interjay wow it has a name... maybe I should change the title
– BigDongle, Jun 4 at 9:33

This code is UB. pInst is not atomic so there is a race condition here. This book must be very very old.
– Oliv, Jun 4 at 11:35

This is called "double-checked locking". Many newcomers to multi-threaded code reinvent it. It doesn't work.
– Pete Becker, Jun 4 at 18:50

In Java this works, because volatile also means atomic. In C++ that's not the case. And, no, guessing about whether values get torn is not sufficient. One problem you can run into is that the value of the pointer may get stored and flushed, so that other threads see the updated value, but the actual data that it points to may not have been flushed, so other threads see garbage values. Don't mess with synchronization shortcuts. Your intuition (and mine) is almost certainly wrong.
– Pete Becker, Jun 4 at 19:34
2 Answers
When two threads call GetInstance() for the first time at the same time, both will see pInst == NULL at the first check. One thread will get the lock first, which allows it to modify pInst.

The second thread will wait for the lock to become available. When the first thread releases the lock, the second will acquire it, and by then the value of pInst has already been set by the first thread, so the second one doesn't need to create a new instance.

Only the second check, between lock() and unlock(), is safe. The code would work without the first check, but it would be slower because every call to GetInstance() would call lock() and unlock(). The first check avoids unnecessary lock() calls.

volatile T* pInst = 0;
T* GetInstance()
{
    if (pInst == NULL)      // unsafe check to avoid an unnecessary and possibly slow lock()
    {
        lock();             // after this, only one thread can access pInst
        if (pInst == NULL)  // check again: another thread may have modified it between
                            // the first check and the return from lock()
            pInst = new T;
        unlock();
    }
    return pInst;
}

See also https://en.wikipedia.org/wiki/Double-checked_locking (copied from interjay's comment).

Note: This implementation requires that both read and write accesses to volatile T* pInst are atomic. Otherwise the second thread may read a partially written value that is just being written by the first thread. On many modern processors, accessing a pointer value (not the data being pointed to) is an atomic operation, but this is not guaranteed for all architectures.

If access to pInst were not atomic, the second thread could read a partially written non-NULL value when checking pInst before getting the lock and then execute return pInst before the first thread has finished its operation, which would result in returning a wrong pointer value.
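
For reference, the same pattern can be written with C++11 std::atomic so that both checks are well-defined without relying on platform-specific pointer atomicity. This is only a minimal sketch: the empty struct T stands in for the real type, and a std::mutex replaces the book's lock()/unlock().

#include <atomic>
#include <mutex>

struct T { };                                        // placeholder for the real singleton type
std::atomic<T*> pInst{nullptr};
std::mutex instMutex;                                // replaces the book's lock()/unlock()

T* GetInstance()
{
    T* p = pInst.load(std::memory_order_acquire);       // safe first check, no lock needed
    if (p == nullptr)
    {
        std::lock_guard<std::mutex> guard(instMutex);    // only one thread past this point
        p = pInst.load(std::memory_order_acquire);       // re-check under the lock
        if (p == nullptr)
        {
            p = new T;
            pInst.store(p, std::memory_order_release);   // publish the fully constructed object
        }
    }
    return p;
}

The atomic access, acquire load, and release store correspond to the minimum requirements Stefan describes in a comment further down.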
The initial check introduces a data race, so the behavior is undefined. In practice, this means that the code will work just fine in house, but when you give a demo to your most valuable customer it will crash.
– Pete Becker, Jun 4 at 19:08

Note that "probably slow lock()" is generally not the case. Locking lightly contested mutexes is generally pretty fast these days.
– Pete Becker, Jun 4 at 19:08

@PeteBecker Only if by "pretty fast" you actually mean "still quite slow". At the very least, a lock will involve a memory barrier and an atomic operation, leading to suboptimal instruction scheduling, pipeline stalls, and scaling issues with many cores. If you have 100 threads each locking its own mutex, there's theoretically no contention at all, but all the inter-CPU synchronization will become a bottleneck anyway.
– TooTea, Jun 4 at 22:05

@PeteBecker what if you replaced volatile T* with std::atomic<T*>?
– hanshenrik, Jun 4 at 22:18

@PeteBecker That book would be from before C++ had a memory model; in that paleoc++ic age, volatile ? was used in many dialects for atomic<?>, and the compilers complied with it.
– Yakk - Adam Nevraumont, Jun 5 at 15:00
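
Following up on the std::atomic<T*> question: since C++11 the usual way to avoid hand-written double-checked locking entirely is a function-local static or std::call_once, both of which guarantee thread-safe one-time initialization. A minimal sketch, assuming T is default-constructible (again, the struct is a placeholder):

#include <mutex>

struct T { };                          // placeholder for the real singleton type

// Option 1: function-local static; since C++11 its initialization is
// thread-safe, and compilers typically implement it with an internal
// double-checked pattern.
T* GetInstance()
{
    static T instance;
    return &instance;
}

// Option 2: std::call_once, if the object must live behind a pointer.
T* GetInstancePtr()
{
    static std::once_flag flag;
    static T* pInst = nullptr;
    std::call_once(flag, [] { pInst = new T; });
    return pInst;
}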
I assume lock() is a costly operation. I also assume that reads of T* pointers are done atomically on this platform, so you don't need a lock for the simple comparison pInst == NULL, as loading the value of pInst will be, e.g., a single assembly instruction on this platform.

Assuming that: if lock() is a costly operation, it's best not to execute it if we don't have to. So first we check whether pInst == NULL. This is a single assembly instruction, so we don't need to lock() around it. If pInst == NULL, we need to modify its value and allocate a new object with pInst = new ....

But imagine a situation where 2 (or more) threads are right at the point between the first pInst == NULL check and the lock(). Both threads intend to do pInst = new: they already evaluated the first pInst == NULL and for both of them it was true.

The first (either) thread starts executing and does lock(); pInst = new T; unlock(). Then the second thread, waiting on lock(), starts executing. When it starts, pInst != NULL, because the other thread already allocated it. So we need to check pInst == NULL again inside the lock(), so that memory is not leaked and pInst is not overwritten.
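
To make that overwrite concrete, here is a sketch of the same function with the inner check removed; the struct, mutex, and lock()/unlock() wrappers are stand-ins so the snippet compiles on its own, and the comments trace what happens when two threads both pass the outer check:

#include <cstddef>
#include <mutex>

struct T { };                          // stands in for the real type
T* pInst = NULL;
std::mutex m;
void lock()   { m.lock(); }
void unlock() { m.unlock(); }

T* GetInstanceBroken()                 // single-check variant, for illustration only
{
    if (pInst == NULL)                 // both T1 and T2 may see NULL here
    {
        lock();                        // T1 enters first; T2 blocks on the mutex
        pInst = new T;                 // T1 allocates and stores the pointer
        unlock();                      // T2 now acquires the lock and, with no
                                       // second check, allocates again: T1's
                                       // object is overwritten and leaked
    }
    return pInst;
}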
"I also assume that read and write on T* pointers are done atomically": Atomicity is not necessary, assuming the lock() is implemented using pthread_mutex_lock() or similar, as pthread_mutex_lock() and other POSIX synchronization functions are all full memory barriers. The same for Windows synchronization functions.
– Andrew Henle, Jun 4 at 10:20

@AndrewHenle I'm not sure on the details of atomicity and memory barriers. But wouldn't, if the load wasn't atomic, the following be possible (T1, T2 are threads): T1 calls GetInstance(), acquires lock, initialises pInst. T2 calls GetInstance(), sees a partially written pInst (e.g. lower half is set). The NULL check fails, thus no lock()/unlock() in the code path, thus no barrier(?). The (partial) pInst value can be cached in a register and is returned instead of re-read from the global memory location.
– Jonas Schäfer, Jun 4 at 18:31

The minimum number of modifications required to make this code work is an acquire operation on the read, a release operation on the write, and atomic access. You need atomic access because otherwise, a thread that did not take the lock might observe a torn write to pInst. You need a release operation to prevent the thread that creates the object from setting pInst while the object is only half instantiated. You need an acquire operation to ensure that when a thread observes pInst != NULL, it can also observe the rest of the writes made by the thread that created the object.
– Stefan, Jun 4 at 20:40