What is the reason for double NULL check of pointer for mutex lock


14















I recently read a book about system software.
There is an example in it that I don't understand.



volatile T* pInst = 0;
T* GetInstance()
{
    if (pInst == NULL)
    {
        lock();
        if (pInst == NULL)
            pInst = new T;
        unlock();
    }
    return pInst;
}



Why does the author check (pInst == NULL) twice?










Tags: c++, if-statement, locking






asked Jun 4 at 9:21 by BigDongle, edited Jun 4 at 13:44 by 1201ProgramAlarm







  • en.wikipedia.org/wiki/Double-checked_locking – interjay, Jun 4 at 9:25

  • @interjay wow it has a name... maybe I should change the title – BigDongle, Jun 4 at 9:33

  • This code is UB. pInst is not atomic so there is a race condition here. This book must be very very old. – Oliv, Jun 4 at 11:35

  • This is called "double-checked locking". Many newcomers to multi-threaded code reinvent it. It doesn't work. – Pete Becker, Jun 4 at 18:50

  • In Java this works, because volatile also means atomic. In C++ that's not the case. And, no, guessing about whether values get torn is not sufficient. One problem you can run into is that the value of the pointer may get stored and flushed, so that other threads see the updated value, but the actual data that it points to may not have been flushed, so other threads see garbage values. Don't mess with synchronization shortcuts. Your intuition (and mine) is almost certainly wrong. – Pete Becker, Jun 4 at 19:34












2 Answers
18














When two threads try to call GetInstance() for the first time at the same time, both will see pInst == NULL at the first check. One thread will get the lock first, which allows it to modify pInst.



The second thread will wait for the lock to become available. When the first thread releases the lock, the second will get it, and by then the value of pInst has already been modified by the first thread, so the second one doesn't need to create a new instance.



Only the second check between lock() and unlock() is safe. It would work without the first check, but it would be slower because every call to GetInstance() would call lock() and unlock(). The first check avoids unnecessary lock() calls.



volatile T* pInst = 0;
T* GetInstance()
{
    if (pInst == NULL) // unsafe check to avoid unnecessary and maybe slow lock()
    {
        lock(); // after this, only one thread can access pInst
        if (pInst == NULL) // check again because other thread may have modified it between first check and returning from lock()
            pInst = new T;
        unlock();
    }
    return pInst;
}



See also https://en.wikipedia.org/wiki/Double-checked_locking (copied from interjay's comment).



Note: This implementation requires that both read and write accesses to volatile T* pInst are atomic. Otherwise the second thread may read a partially written value that is just being written by the first thread. On modern processors, accessing a pointer value (not the data it points to) is an atomic operation, although this is not guaranteed for all architectures.



If access to pInst were not atomic, the second thread could read a partially written non-NULL value when checking pInst before taking the lock, and could then execute return pInst before the first thread has finished its write, which would result in returning a wrong pointer value.
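As several of the comments point out, the C++ standard itself does not guarantee the atomicity this pattern relies on. For completeness, here is a minimal sketch of the standard-conforming alternatives available since C++11; struct T and the function names are placeholders, not taken from the answer:

#include <mutex>

struct T { /* placeholder type */ };

// Option 1: function-local static ("Meyers singleton"). Since C++11 the
// initialization of a local static is guaranteed to run exactly once, even
// when several threads call the function concurrently.
T* GetInstanceLocalStatic()
{
    static T instance;
    return &instance;
}

// Option 2: std::call_once, for cases where the object must be heap-allocated
// or needs non-trivial setup before publication.
T* GetInstanceCallOnce()
{
    static T* pInst = nullptr;   // static storage, so the lambda can refer to it without capturing
    static std::once_flag flag;
    std::call_once(flag, [] { pInst = new T; }); // the lambda runs exactly once
    return pInst;
}

Both variants give the same lazy, one-time initialization as the double-checked pattern without any hand-written check.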






answered Jun 4 at 9:30 by Bodo, edited Jun 5 at 16:45
  • The initial check introduces a data race, so the behavior is undefined. In practice, this means that the code will work just fine in house, but when you give a demo to your most valuable customer it will crash. – Pete Becker, Jun 4 at 19:08

  • Note that "probably slow lock()" is generally not the case. Locking lightly contested mutexes is generally pretty fast these days. – Pete Becker, Jun 4 at 19:08

  • @PeteBecker Only if by "pretty fast" you actually mean "still quite slow". At the very least, a lock will involve a memory barrier and an atomic operation, leading to suboptimal instruction scheduling, pipeline stalls, and scaling issues with many cores. If you have 100 threads each locking its own mutex, there's theoretically no contention at all, but all the inter-CPU synchronization will become a bottleneck anyway. – TooTea, Jun 4 at 22:05

  • @PeteBecker what if you replaced volatile T* with std::atomic<T*>? – hanshenrik, Jun 4 at 22:18

  • @PeteBecker That book would be from before C++ had a memory model; in that paleoc++ic age, volatile ? was used in many dialects for atomic<?>, and the compilers complied with it. – Yakk - Adam Nevraumont, Jun 5 at 15:00


















2














I assume lock() is a costly operation. I also assume that reads of T* pointers are done atomically on this platform, so you don't need to lock the simple comparison pInst == NULL, because loading the value of pInst will be, for example, a single assembly instruction on this platform.

Assuming that: if lock() is a costly operation, it's best not to execute it if we don't have to. So first we check whether pInst == NULL. This is a single assembly instruction, so we don't need to lock() around it. If pInst == NULL, we need to modify its value and allocate the object with pInst = new ....

But imagine a situation where 2 (or more) threads are right at the point between the first pInst == NULL check and the lock(). Both threads will want to do pInst = new, because they have already passed the first pInst == NULL check and for both of them it was true.

The first (either) thread continues execution and does lock(); pInst = new T; unlock(). Then the second thread, which was waiting on lock(), resumes. When it does, pInst != NULL, because the other thread has already allocated it. So we need to check pInst == NULL again inside the lock(), so that memory is not leaked and pInst is not overwritten.
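To make the need for the inner check concrete, here is a minimal sketch of the same function with only the outer check, using std::mutex to stand in for the book's lock()/unlock(); the function name and struct T are illustrative, not from the question:

#include <mutex>

struct T { /* placeholder type */ };

T* pInst = nullptr;        // plain pointer, as in the question
std::mutex instanceMutex;  // stands in for lock()/unlock()

// Deliberately broken: the second check under the lock has been removed.
// If two threads both observe pInst == nullptr before either takes the lock,
// each one allocates a T; the later assignment overwrites the earlier
// pointer, which is then leaked.
T* GetInstanceWithoutRecheck()
{
    if (pInst == nullptr)
    {
        std::lock_guard<std::mutex> guard(instanceMutex);
        pInst = new T;     // no re-check: may run in both threads
    }
    return pInst;
}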






answered Jun 4 at 9:36 by Kamil Cuk, edited Jun 4 at 21:23
  • "I also assume that read and write on T* pointers are done atomically": Atomicity is not necessary, assuming the lock() is implemented using pthread_mutex_lock() or similar, as pthread_mutex_lock() and other POSIX synchronization functions are all full memory barriers. The same goes for Windows synchronization functions. – Andrew Henle, Jun 4 at 10:20

  • @AndrewHenle I'm not sure about the details of atomicity and memory barriers. But wouldn't, if the load weren't atomic, the following be possible (T1, T2 are threads): T1 calls GetInstance(), acquires the lock, initialises pInst. T2 calls GetInstance(), sees a partially written pInst (e.g. only the lower half is set). The NULL check fails, thus no lock()/unlock() in the code path, thus no barrier(?). The (partial) pInst value can be cached in a register and is returned instead of being re-read from the global memory location. – Jonas Schäfer, Jun 4 at 18:31

  • The minimum set of modifications required to make this code work is an acquire operation on the read, a release operation on the write, and atomic access. You need atomic access because otherwise a thread that did not take the lock might observe a torn write to pInst. You need a release operation to prevent the thread that creates the object from setting pInst while the object is only half instantiated. You need an acquire operation to ensure that when a thread observes pInst != NULL, it can also observe the rest of the writes made by the thread that created the object. – Stefan, Jun 4 at 20:40
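The last comment maps directly onto C++11 atomics. A minimal sketch of a standard-conforming double-checked locking implementation along those lines, assuming std::atomic and std::mutex are available (struct T is a placeholder, not from the question):

#include <atomic>
#include <mutex>

struct T { /* placeholder type */ };

std::atomic<T*> pInst{nullptr};
std::mutex instanceMutex;

T* GetInstance()
{
    // Acquire load: if a non-null pointer is observed, all writes made before
    // the matching release store (i.e. the construction of T) are visible too.
    T* p = pInst.load(std::memory_order_acquire);
    if (p == nullptr)
    {
        std::lock_guard<std::mutex> guard(instanceMutex);
        p = pInst.load(std::memory_order_relaxed); // re-check under the lock
        if (p == nullptr)
        {
            p = new T;
            // Release store: publishes the fully constructed object.
            pInst.store(p, std::memory_order_release);
        }
    }
    return p;
}

The atomic loads and stores prevent torn reads of the pointer, and the acquire/release pairing prevents a thread from seeing a non-null pInst that points to a not-yet-constructed object.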











