Simple function that simulates survey results based on sample size and probability
What is this:
This is a simple function, part of a basic Monte Carlo simulation. It takes sample size and probability as parameters. It returns the simulation result (positive answers) plus the input parameters in a tuple.
What I'm asking:
I'm trying to avoid using temporary variables, and I have two questions:
- Do I really save memory by avoiding storing interim results?
- How could I improve readability without adding variables?
    import random as r  # assumption: `r` in the original post is an alias for the random module

    def simulate_survey(sample_size, percent_subscribes):
        return (
            sample_size,
            percent_subscribes,
            round(
                (
                    sum([
                        r.random() < percent_subscribes
                        for _ in range(sample_size)
                    ]) / sample_size
                ),
                2
            )
        )
python functional-programming random simulation numerical-methods
edited May 25 at 16:33 by 200_success
asked May 25 at 13:26 by Lorinc Nyitrai
2 Answers
As I discovered recently, summing a lot of booleans, where the chance that the value is `False` is not negligible, can be surprisingly slow.

So I would change your survey result calculation to:

    sum([1 for _ in range(sample_size) if r.random() < percent_subscribes])

This allows `sum` to use its faster integer implementation, and you do not sum a bunch of zeros.

Alternatively, you could look at this problem as an application of the binomial distribution. You have some chance that a certain result is obtained, and you want to know how often that result occurred in some population. For this you can use `numpy.random.binomial`:

    import numpy as np

    def simulate_survey(sample_size, percent_subscribes):
        subscribers = np.random.binomial(sample_size, percent_subscribes)
        return sample_size, percent_subscribes, round(subscribers / sample_size, 2)

Using `numpy` here may also speed up your process in other places. If you need to run this function many times, you probably want to use the third argument (`size`) to generate multiple values at once.

IMO, the readability is also greatly increased by using one temporary variable here, instead of your many levels of parentheses.
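The suggestion to use the third argument of `numpy.random.binomial` could be sketched roughly as follows. The function name `simulate_surveys` and the parameter `n_surveys` are hypothetical names chosen for this illustration; they are not from the original post.

```python
import numpy as np


def simulate_surveys(sample_size, percent_subscribes, n_surveys):
    """Simulate n_surveys independent surveys and return the observed rates."""
    # A single call draws n_surveys binomial counts via the `size` argument,
    # instead of calling the function once per survey in a Python loop.
    subscribers = np.random.binomial(sample_size, percent_subscribes, size=n_surveys)
    return np.round(subscribers / sample_size, 2)


# Illustrative values only.
rates = simulate_surveys(sample_size=1000, percent_subscribes=0.3, n_surveys=5)
print(rates)
```

The result is a NumPy array with one rounded rate per simulated survey, so further statistics (mean, spread) can be computed on it directly.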
I am not a fan of your function returning its inputs. The values of those should already be available in the scope calling this function, so this seems unnecessary. One exception would be that you have other, similar, functions which actually return different/modified values there.
You should add a `docstring` describing what your function does.
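The claim at the start of this answer, that summing booleans can be slower than summing a filtered list of ones, can be checked with a small `timeit` comparison. This is only a sketch: the constants and function names below are made up for the illustration, and the measured difference depends on the Python version and the machine, so it may be small or absent on your setup.

```python
import random
import timeit

SAMPLE_SIZE = 10_000  # hypothetical values chosen for illustration
PROBABILITY = 0.5


def count_with_booleans():
    # Sums one True/False value per simulated respondent.
    return sum([random.random() < PROBABILITY for _ in range(SAMPLE_SIZE)])


def count_with_ones():
    # Keeps only the hits and sums plain integers.
    return sum([1 for _ in range(SAMPLE_SIZE) if random.random() < PROBABILITY])


for func in (count_with_booleans, count_with_ones):
    seconds = timeit.timeit(func, number=20)
    print(f"{func.__name__}: {seconds:.3f} s for 20 runs")
```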
answered May 25 at 16:18 by Graipher
I think avoiding temporary variables, when we have no strict memory limit, is a bad idea. There is no way to write readable code without using variables.
So let's create a version of your code with temp variables:
    def simulate_survey(sample_size, percent_subscribes):
        sum_result = sum([x for x in [True] * sample_size if r.random() < percent_subscribes])
        third_value = round(sum_result / sample_size, 2)
        return (
            sample_size,
            percent_subscribes,
            third_value
        )
It's not the most readable version of your code, but it's clearly more readable. (I changed the way you created the sum value. I've been programming in Python for years, but that syntax looks strange to me. I hope my code does what your code did.)
So is there a huge gap in memory usage between those programs? We know that Python does not remove temporary variables as part of its optimization process (you can read more about it here). So my program should use more memory than yours. But how much?
I used the `resource` module to compare them. You can use it too if you are working on a UNIX-based OS.
Here is the code that I tried in both programs for measuring memory usage:
    import resource  # standard library, Unix only

    print(simulate_survey(64, 0.5))
    print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
Your variable-less program shows values around 11860 KB, but my program with temporary variables used almost 12008 KB. That is a difference of roughly 150 KB, but don't forget that my code is not exactly the same as yours: I changed how it computes the third value.
So let's compute the third value the way you do:
    def simulate_survey(sample_size, percent_subscribes):
        sum_result = sum([
            r.random() < percent_subscribes
            for _ in range(sample_size)
        ])
        third_value = round(sum_result / sample_size, 2)
        return (
            sample_size,
            percent_subscribes,
            third_value
        )
So what happens if we test the memory usage of this code, which has exactly the same logic as the first version? The result is around 11896 KB, only between 10 and 30 KB more than the first version. (The values differ from run to run, because not exactly the same things happen each time a process is created.)
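As a complementary sketch, and not the measurement used above, the standard-library `tracemalloc` module can compare the peak Python-level allocations of the two styles inside one process, which avoids some of the run-to-run noise of `ru_maxrss`. The function names `nested`, `with_temporaries` and `peak_bytes` are hypothetical, and `r` is assumed to be an alias for the `random` module.

```python
import random as r  # assumption: same alias as in the question
import tracemalloc


def nested(sample_size, percent_subscribes):
    # Everything computed inside the return expression, no temporaries.
    return (
        sample_size,
        percent_subscribes,
        round(
            sum([r.random() < percent_subscribes for _ in range(sample_size)])
            / sample_size,
            2,
        ),
    )


def with_temporaries(sample_size, percent_subscribes):
    # Same logic, with the interim results bound to names.
    sum_result = sum([r.random() < percent_subscribes for _ in range(sample_size)])
    third_value = round(sum_result / sample_size, 2)
    return sample_size, percent_subscribes, third_value


def peak_bytes(func, *args):
    """Return the peak traced allocation, in bytes, while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


for func in (nested, with_temporaries):
    print(func.__name__, peak_bytes(func, 100_000, 0.5), "bytes")
```

In both versions the dominant allocation is the intermediate list built by the list comprehension, so the two peaks would be expected to be close to each other.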
So, in conclusion, unless you are working on a machine with very little memory (something like embedded programming, where Python is not common), I really recommend always using things like temporary variables to make your code readable.
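Putting the advice from this thread together, one possible readable version might look like the sketch below. It is not code from either answer: the variable names `positive_answers` and `observed_rate` are illustrative, and it swaps the list comprehension for a generator expression, which lets `sum` consume one value at a time instead of building a full list first.

```python
import random


def simulate_survey(sample_size, percent_subscribes):
    """Simulate a yes/no survey.

    Returns (sample_size, percent_subscribes, observed_rate), where
    observed_rate is the share of positive answers rounded to two decimals.
    """
    # A generator expression feeds sum() one value at a time, so no
    # intermediate list of sample_size elements is created.
    positive_answers = sum(
        random.random() < percent_subscribes for _ in range(sample_size)
    )
    observed_rate = round(positive_answers / sample_size, 2)
    return sample_size, percent_subscribes, observed_rate


print(simulate_survey(64, 0.5))
```

The two named temporaries cost essentially nothing in memory compared with the intermediate list they help to avoid.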
answered May 25 at 15:06 by Mr Alihoseiny