What is the reason for double NULL check of pointer for mutex lock


I recently read a book about system software.
There is an example in it that I don't understand.



volatile T* pInst = 0;
T* GetInstance()
{
    if (pInst == NULL)
    {
        lock();
        if (pInst == NULL)
            pInst = new T;
        unlock();
    }
    return pInst;
}


Why does the author check (pInst == NULL) twice?










c++ if-statement locking






asked Jun 4 at 9:21 – BigDongle
edited Jun 4 at 13:44 – 1201ProgramAlarm







  • en.wikipedia.org/wiki/Double-checked_locking
    – interjay, Jun 4 at 9:25

  • @interjay wow it has a name... maybe I should change the title
    – BigDongle, Jun 4 at 9:33

  • This code is UB. pInst is not atomic so there is a race condition here. This book must be very very old.
    – Oliv, Jun 4 at 11:35

  • This is called "double-checked locking". Many newcomers to multi-threaded code reinvent it. It doesn't work.
    – Pete Becker, Jun 4 at 18:50

  • In Java this works, because volatile also means atomic. In C++ that's not the case. And, no, guessing about whether values get torn is not sufficient. One problem you can run into is that the value of the pointer may get stored and flushed, so that other threads see the updated value, but the actual data that it points to may not have been flushed, so other threads see garbage values. Don't mess with synchronization shortcuts. Your intuition (and mine) is almost certainly wrong.
    – Pete Becker, Jun 4 at 19:34












2 Answers
When two threads call GetInstance() for the first time at the same time, both will see pInst == NULL at the first check. One thread will get the lock first, which allows it to modify pInst.

The second thread will wait for the lock to become available. When the first thread releases the lock, the second will acquire it, and by then the value of pInst has already been set by the first thread, so the second one doesn't need to create a new instance.

Only the second check, between lock() and unlock(), is safe. The code would work without the first check, but it would be slower because every call to GetInstance() would then call lock() and unlock(). The first check avoids those unnecessary lock() calls.

volatile T* pInst = 0;
T* GetInstance()
{
    if (pInst == NULL)     // unsafe check to avoid an unnecessary and possibly slow lock()
    {
        lock();            // after this, only one thread can access pInst
        if (pInst == NULL) // check again, because another thread may have modified it
                           // between the first check and returning from lock()
            pInst = new T;
        unlock();
    }
    return pInst;
}

See also https://en.wikipedia.org/wiki/Double-checked_locking (copied from interjay's comment).

Note: This implementation requires that both read and write accesses to volatile T* pInst are atomic. Otherwise the second thread might read a partially written value that the first thread is still in the middle of writing. On most modern processors, accessing an aligned pointer value (not the data it points to) is an atomic operation, although this is not guaranteed for all architectures.

If access to pInst were not atomic, the second thread could read a partially written non-NULL value when checking pInst before taking the lock, and then execute return pInst before the first thread has finished its write, which would result in returning a wrong pointer value.
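
For reference (this is not part of the original answer): assuming a C++11 compiler, the same pattern can be expressed without the data race by making the pointer a std::atomic<T*> and pairing a release store with an acquire load. A minimal sketch, with illustrative names (instance, instance_mutex) in place of the book's lock()/unlock():

#include <atomic>
#include <mutex>

struct T { /* ... */ };

std::atomic<T*> instance{nullptr};
std::mutex instance_mutex;

T* GetInstance()
{
    // Acquire load: if a non-null pointer is seen here, the fully constructed
    // object published by the release store below is visible as well.
    T* p = instance.load(std::memory_order_acquire);
    if (p == nullptr)
    {
        std::lock_guard<std::mutex> guard(instance_mutex);
        p = instance.load(std::memory_order_relaxed); // re-check under the lock
        if (p == nullptr)
        {
            p = new T;
            instance.store(p, std::memory_order_release); // publish only after construction
        }
    }
    return p;
}

The relaxed re-load inside the critical section is enough because the mutex itself synchronizes the two threads; the acquire/release pair is what makes the lock-free fast path safe.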






answered Jun 4 at 9:30, edited Jun 5 at 16:45 – Bodo




















  • The initial check introduces a data race, so the behavior is undefined. In practice, this means that the code will work just fine in house, but when you give a demo to your most valuable customer it will crash.
    – Pete Becker, Jun 4 at 19:08

  • Note that "probably slow lock()" is generally not the case. Locking lightly contested mutexes is generally pretty fast these days.
    – Pete Becker, Jun 4 at 19:08

  • @PeteBecker Only if by "pretty fast" you actually mean "still quite slow". At the very least, a lock will involve a memory barrier and an atomic operation, leading to suboptimal instruction scheduling, pipeline stalls, and scaling issues with many cores. If you have 100 threads each locking its own mutex, there's theoretically no contention at all, but all the inter-CPU synchronization will become a bottleneck anyway.
    – TooTea, Jun 4 at 22:05

  • @PeteBecker what if you replaced volatile T* with std::atomic<T*>?
    – hanshenrik, Jun 4 at 22:18

  • @PeteBecker That book would be from before C++ had a memory model; in that paleoc++ic age, volatile ? was used in many dialects for atomic<?>, and the compilers complied with it.
    – Yakk - Adam Nevraumont, Jun 5 at 15:00
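
As an aside (not from the thread): since C++11 the language itself guarantees that a function-local static is initialized exactly once even under concurrent calls, so in new code the hand-rolled pattern above is usually replaced by a Meyers singleton or by std::call_once. A minimal sketch of the former, with T as a placeholder type:

struct T { /* ... */ };

T& GetInstance()
{
    static T instance;   // C++11 guarantees thread-safe, one-time initialization
    return instance;
}

Typical implementations compile this to a guard variable with a fast-path check, essentially the same double-checked pattern done correctly by the compiler.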


















I assume lock() is a costly operation. I also assume that a read of a T* pointer is atomic on this platform, so the simple comparison pInst == NULL doesn't need to be protected by the lock: loading the value of pInst will be, for example, a single assembly instruction on this platform.

Given those assumptions: if lock() is costly, it's best not to execute it when we don't have to. So first we check whether pInst == NULL. This is a single assembly instruction, so we don't need to lock() around it. If pInst == NULL, we need to modify its value and allocate the object with pInst = new ....

But imagine a situation where two (or more) threads are right at the point between the first pInst == NULL check and the lock(). Both threads intend to do pInst = new: they have already passed the first pInst == NULL check, and for both of them it was true.

One of the threads gets in first and does lock(); pInst = new T; unlock(). Then the second thread, which was waiting on lock(), resumes. When it does, pInst != NULL, because the other thread has already allocated the object. So we need to check pInst == NULL again inside the lock, so that memory is not leaked and pInst is not overwritten.
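
To make that interleaving concrete, here is a hypothetical sketch (not from the answer) of the same function with the inner check removed, using a std::mutex in place of the undefined lock()/unlock(). The unsynchronized outer read still has the data race discussed in the comments; the sketch only illustrates why the inner check matters:

#include <mutex>

struct T { /* ... */ };

T* pInst = nullptr;
std::mutex m;

T* GetInstance()
{
    if (pInst == nullptr)                      // thread 1 and thread 2 both see nullptr
    {
        std::lock_guard<std::mutex> guard(m);  // thread 1 wins the lock, thread 2 blocks
        pInst = new T;                         // thread 1 constructs the object and returns;
                                               // when thread 2 finally acquires the lock it
                                               // constructs a SECOND T, overwriting pInst and
                                               // leaking the first object
    }
    return pInst;
}

With the inner if (pInst == nullptr) restored, thread 2 sees the pointer already set once it acquires the lock and skips the second new, which is exactly what the question's second check is for.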






answered Jun 4 at 9:36, edited Jun 4 at 21:23 – Kamil Cuk

























  • "I also assume that read and write on T* pointers are done atomically" Atomicity is not necessary, assuming the lock() is implemented using pthread_mutex_lock() or similar, as pthread_mutex_lock() and other POSIX synchronization functions are all full memory barriers. The same for Windows synchronization functions.
    – Andrew Henle, Jun 4 at 10:20

  • @AndrewHenle I'm not sure on the details of atomicity and memory barriers. But wouldn't, if the load wasn't atomic, the following be possible (T1, T2 are threads): T1 calls GetInstance(), acquires lock, initialises pInst. T2 calls GetInstance(), sees a partially written pInst (e.g. lower half is set). The NULL check fails, thus no lock()/unlock() in the code path, thus no barrier(?). The (partial) pInst value can be cached in a register and is returned instead of re-read from the global memory location.
    – Jonas Schäfer, Jun 4 at 18:31

  • The minimum number of modifications required to make this code work is an acquire operation on the read, a release operation on the write, and atomic access. You need atomic access because otherwise, a thread that did not take the lock might observe a torn write to pInst. You need a release operation to prevent the thread that creates the object from setting pInst while the object is only half instantiated. You need an acquire operation to ensure that when a thread observes pInst != NULL, it can also observe the rest of the writes made by the thread that created the object.
    – Stefan, Jun 4 at 20:40











