Simple function that simulates survey results based on sample size and probability

What is this:

This is a simple function, part of a basic Monte Carlo simulation. It takes sample size and probability as parameters. It returns the simulation result (positive answers) plus the input parameters in a tuple.

What I'm asking:

I'm trying to avoid using temporary variables, and I have two questions.

Do I really save memory by avoiding storing interim results?

How could I improve readability without adding variables?


import random as r  # assumed alias: the snippet calls r.random()


def simulate_survey(sample_size, percent_subscribes):
    return (
        sample_size,
        percent_subscribes,
        round(
            (
                sum([
                    r.random() < percent_subscribes
                    for _ in range(sample_size)
                ]) / sample_size
            ),
            2
        )
    )

edited May 25 at 16:33

200_success

133k20163433

asked May 25 at 13:26

Lorinc Nyitrai

1836

python functional-programming random simulation numerical-methods


2 Answers

As I discovered recently, summing a lot of booleans, where the chance that the value is False is not negligible, can be surprisingly slow.

So I would change your survey result calculation to:
```
sum([1 for _ in range(sample_size) if r.random() < percent_subscribes])
```
This allows sum to use its faster integer implementation and you do not sum a bunch of zeros.
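A rough way to check this on your own machine is sketched below. The helper names `count_bools` and `count_ones` are made up for illustration, and the standard `random` module is assumed to be what the question imports as `r`. Timings depend on the Python version and the hardware, so treat the comparison as a sanity check rather than a benchmark.

```python
import random
import timeit


def count_bools(sample_size, p, rng):
    """Sum a list of booleans, as in the question."""
    return sum([rng.random() < p for _ in range(sample_size)])


def count_ones(sample_size, p, rng):
    """Sum a list of 1s, keeping only the successful draws."""
    return sum([1 for _ in range(sample_size) if rng.random() < p])


# Seeding two generators identically shows both variants count the same draws.
same_count = (
    count_bools(10_000, 0.3, random.Random(42))
    == count_ones(10_000, 0.3, random.Random(42))
)

if __name__ == "__main__":
    print("same count:", same_count)
    for func in (count_bools, count_ones):
        seconds = timeit.timeit(
            lambda: func(20_000, 0.3, random.Random(0)), number=10
        )
        print(func.__name__, round(seconds, 3), "s")
```

Both variants draw exactly one random number per respondent, so the result is the same and only the summation differs.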

Alternatively, you could look at this problem as an application of the binomial distribution. You have some chance that a certain result is obtained and you want to know how often that chance was true for some population. For this you can use numpy.random.binomial:
```
import numpy as np

def simulate_survey(sample_size, percent_subscribes):
    subscribers = np.random.binomial(sample_size, percent_subscribes)
    return sample_size, percent_subscribes, round(subscribers / sample_size, 2)
```
Using numpy here may also speed up your process in other places. If you need to run this function many times, you probably want to use the third argument (`size`) to generate multiple values at once.
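As a sketch of that batched variant: `size` is the name of the third parameter of `numpy.random.binomial`, while the function name `simulate_surveys` and the `n_surveys` parameter are invented here for illustration.

```python
import numpy as np


def simulate_surveys(sample_size, percent_subscribes, n_surveys):
    """Simulate n_surveys surveys at once; return the rounded share of
    positive answers for each survey as a NumPy array."""
    # One binomial draw per survey: each value is a count of positive answers.
    subscribers = np.random.binomial(sample_size, percent_subscribes, size=n_surveys)
    return np.round(subscribers / sample_size, 2)


shares = simulate_surveys(sample_size=1000, percent_subscribes=0.25, n_surveys=5)
print(shares)  # five values near 0.25; the exact numbers vary between runs
```

This returns only the simulated shares, since the caller already knows the inputs.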

IMO, readability is also greatly increased by using one temporary variable here, instead of your many levels of parentheses.

I am not a fan of your function returning its inputs. Their values should already be available in the scope calling this function, so this seems unnecessary. An exception would be if you have other, similar functions that actually return different or modified values there.

You should add a docstring describing what your function does.
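Putting the last two suggestions together, one possible shape is sketched below. It assumes the `r` in the question comes from `import random as r`; the docstring wording and the name `subscribers` are only suggestions.

```python
import random as r  # assumed to match the alias used in the question


def simulate_survey(sample_size, percent_subscribes):
    """Simulate one survey and return the share of positive answers.

    sample_size: number of simulated respondents (a positive integer).
    percent_subscribes: probability, between 0 and 1, that a single
        respondent answers positively.

    Returns the simulated share of positive answers, rounded to two
    decimal places. The inputs are not echoed back to the caller.
    """
    subscribers = sum([1 for _ in range(sample_size) if r.random() < percent_subscribes])
    return round(subscribers / sample_size, 2)


print(simulate_survey(1000, 0.25))  # a value near 0.25; varies between runs
```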

answered May 25 at 16:18

Graipher

28.5k546101


I think avoiding temporary variables when there is no strict memory limit is a bad idea. There is no way to write readable code without using variables.
So let's create a version of your code with temporary variables:

def simulate_survey(sample_size, percent_subscribes):
    sum_result = sum([x for x in [True] * sample_size if r.random() < percent_subscribes])
    third_value = round(sum_result / sample_size, 2)
    return (
        sample_size,
        percent_subscribes,
        third_value
    )

It's not the most readable version of your code, but it's clearly more readable. (I changed the way you create the sum value. I have been programming in Python for years, but that syntax looks strange to me. I hope my code does what your code did.)

So is there a huge gap in memory usage between those programs? We know that Python does not remove temporary variables as part of its optimization process (you can read more about it here). So obviously, my program should use more memory than yours. But how much?
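One way to see this on CPython is to inspect the compiled function: a temporary variable shows up as an extra local variable name. The two function names below are invented for the comparison, `r` is assumed to be `import random as r`, and the exact output differs between Python versions.

```python
import random as r  # assumed to match the alias used in the question


def without_temp(sample_size, percent_subscribes):
    return round(
        sum([r.random() < percent_subscribes for _ in range(sample_size)]) / sample_size,
        2,
    )


def with_temp(sample_size, percent_subscribes):
    sum_result = sum([r.random() < percent_subscribes for _ in range(sample_size)])
    return round(sum_result / sample_size, 2)


# co_varnames lists the names of the function's local variables (arguments first).
print(without_temp.__code__.co_varnames)
print(with_temp.__code__.co_varnames)
```

The name `sum_result` appears only in the second tuple. On CPython a local variable is a reference to an object that has to exist anyway, so the extra cost should be on the order of one pointer per call, not a copy of the data.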

I used the resource module to compare them. You can use it too if you are working on a UNIX-based OS.

Here is the code that I ran in both programs to measure memory usage:

import resource

print(simulate_survey(64, 0.5))
print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)

Your variable-less program shows values around 11860 KB, but my program with temporary variables used almost 12008 KB. That is a difference of roughly 150 KB, but don't forget that my code is not exactly the same as yours: I changed how it computes the third value.

So let's compute the third value the way you do:

def simulate_survey(sample_size, percent_subscribes):
    sum_result = sum([
        r.random() < percent_subscribes
        for _ in range(sample_size)
    ])
    third_value = round(sum_result / sample_size, 2)
    return (
        sample_size,
        percent_subscribes,
        third_value
    )

So what happens if we test the memory usage of this code, which has exactly the same logic as the first version? The result is around 11896 KB. That is only 10 to 30 KB more than the first version (the values differ from run to run, because a new process does not do exactly the same things each time it starts).
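Because `ru_maxrss` reports the peak for the whole process, small differences are hard to separate from noise. A finer-grained sketch using the standard-library `tracemalloc` module is shown below. The helper `peak_bytes` and the two counting functions are made up for this illustration, and the generator expression is a variation that is not part of the original code.

```python
import random
import tracemalloc


def count_with_list(sample_size, p):
    # Builds the full list of booleans before summing it.
    return sum([random.random() < p for _ in range(sample_size)])


def count_with_generator(sample_size, p):
    # Feeds the booleans to sum() one at a time; no list is built.
    return sum(random.random() < p for _ in range(sample_size))


def peak_bytes(func, *args):
    """Return the peak number of bytes traced while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


list_peak = peak_bytes(count_with_list, 100_000, 0.5)
generator_peak = peak_bytes(count_with_generator, 100_000, 0.5)
print("list comprehension:", list_peak, "bytes")
print("generator expression:", generator_peak, "bytes")
```

In a comparison like this, it is the intermediate list that drives peak memory, not the name it is bound to.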

So, in conclusion, unless you are working on a machine with very little memory (something like embedded programming, where Python is not common), I really recommend that you always use things like temporary variables to make your code readable.

answered May 25 at 15:06

Mr Alihoseiny

3097

