Why does logistic function use e rather than 2?

The sigmoid function can be used as an activation function in machine learning.

$$\displaystyle S(x)=\frac{1}{1+e^{-x}}=\frac{e^x}{e^x+1}.$$

If I substitute $e$ with 2,

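    import numpy as np
    import matplotlib.pyplot as plt

    def sigmoid2(z):
        # Logistic-shaped curve with base 2 in place of e
        return 1 / (1 + 2 ** (-z))

    x = np.arange(-9, 9, dtype=float)
    y = sigmoid2(x)
    plt.scatter(x, y)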

the plot looks similar.

Why does the logistic function use $e$ rather than 2?

edited Jun 6 at 14:48 by Ethan

asked Jun 6 at 7:55 by baojieqh

machine-learning deep-learning

3 Answers

Since you will later optimize the log-likelihood, there is no big difference between $\log 2^x = x \log 2$ and $\log e^x = x$. The two differ only by the constant factor $\log 2$.
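A quick numerical check of the constant-factor point, as a minimal sketch: because $2^{-x} = e^{-x \ln 2}$, the base-2 curve is the base-$e$ curve with its input rescaled by $\ln 2$. The names `sigmoid`, `sigmoid2`, and `max_gap` are illustrative, and NumPy is assumed to be available.

```python
import numpy as np

def sigmoid(z):
    """Standard logistic function with base e."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid2(z):
    """Base-2 variant from the question."""
    return 1.0 / (1.0 + 2.0 ** (-z))

x = np.linspace(-9.0, 9.0, 181)

# 2**(-x) == exp(-x * ln 2), so the base-2 curve is the base-e curve
# evaluated at x * ln 2.
rescaled = sigmoid(x * np.log(2.0))
max_gap = float(np.max(np.abs(sigmoid2(x) - rescaled)))
print(max_gap)  # expected to be at floating-point rounding level
```

In a model where the input is a learned linear combination such as $w \cdot x + b$, this rescaling can be absorbed into the weights, so either base can represent the same set of curves.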

Nevertheless, one could argue for using $2^x$ instead of $e^x$, and also $\log_2$ instead of $\log$, in the optimization step. In fact it is possible to use $2^x$ and many other functions, as long as they show some desired properties, which are:

$\lim\limits_{x \rightarrow \infty} f(x) = 1$

$\lim\limits_{x \rightarrow -\infty} f(x) = 0$

$f(x) = -f(-x) + 1$ (point-symmetric about $(0, 0.5)$)

Here is an example of suitable functions from Wikipedia.
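The three properties listed above can be checked numerically for a few candidate functions. This is only a sketch: the candidate list, the helper `has_properties`, the stand-in for infinity `big`, and the tolerance are all illustrative choices, not part of the answer.

```python
import numpy as np

# Candidate S-shaped functions mapped into (0, 1); the selection is illustrative.
candidates = {
    "logistic, base e": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "logistic, base 2": lambda x: 1.0 / (1.0 + 2.0 ** (-x)),
    "rescaled tanh": lambda x: 0.5 * (1.0 + np.tanh(x)),
    "rescaled arctan": lambda x: 0.5 + np.arctan(x) / np.pi,
}

x = np.linspace(-5.0, 5.0, 101)
big = 500.0  # finite stand-in for "x -> infinity"

def has_properties(f, tol=1e-2):
    """Check the two limits and the symmetry f(x) = -f(-x) + 1.

    The tolerance is loose because arctan approaches its limits slowly.
    """
    upper = abs(f(big) - 1.0) < tol
    lower = abs(f(-big)) < tol
    symmetric = np.allclose(f(x), 1.0 - f(-x))
    return bool(upper and lower and symmetric)

results = {name: has_properties(f) for name, f in candidates.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```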

answered Jun 6 at 8:18 by Andreas Look

I think it's also worth pointing out that one nice reason to use $e$ as the base is that the derivative of $\sigma(x)=\frac{1}{1+e^{-x}}$ is $\sigma'(x)=\sigma(x)(1-\sigma(x))$. Without doing the actual computation, I think if the base was different the formula would only differ by a constant again, but it's a nice property that is specific to $e$.
– Calvin Godfrey Jun 6 at 16:55

Same goes for $2^x$ when using $\log_2$.
– Andreas Look Jun 6 at 20:44

@AndreasLook I'm not sure what you mean. If you use $2^{-x}$ then there's an extra factor of $\ln(2)$ in the derivative (like Calvin Godfrey said).
– sfmiller940 Jun 12 at 20:08

No, check out binary logarithm. $\log_2 (2^x)=x$.
– Andreas Look Jun 13 at 19:32
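The derivative identity discussed in these comments can be checked with a finite-difference sketch. The helpers `logistic` and `numeric_derivative` are illustrative names, and the step size `h` is an arbitrary choice.

```python
import numpy as np

def logistic(x, base=np.e):
    """Logistic-shaped curve 1 / (1 + base**(-x))."""
    return 1.0 / (1.0 + base ** (-x))

def numeric_derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x = np.linspace(-4.0, 4.0, 81)

# Base e: the derivative is s * (1 - s) with no extra factor.
s_e = logistic(x)
err_e = float(np.max(np.abs(numeric_derivative(logistic, x) - s_e * (1 - s_e))))

# Base 2: the same expression, multiplied by ln(2).
def logistic2(t):
    return logistic(t, base=2.0)

s_2 = logistic2(x)
err_2 = float(
    np.max(np.abs(numeric_derivative(logistic2, x) - np.log(2.0) * s_2 * (1 - s_2)))
)
print(err_e, err_2)
```

Both errors should be tiny, which is consistent with the base-2 derivative carrying an extra factor of $\ln 2$ and the base-$e$ derivative carrying none.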

So there are many functions that look sigmoid, including the two you mentioned, but there are reasons why $e$ is special. The main reason is that the logistic function was originally used to model population growth, and populations, much like interest, can compound over time, which makes $e$ a very natural choice. In addition, for theoretical reasons concerning the canonical link function of a GLM, the logistic function is one of the theoretically simplest objects to work with, which makes it easy to prove things about.
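To illustrate the population-growth connection, here is a minimal sketch that integrates the logistic growth equation $dP/dt = P(1 - P)$ and compares the result with $1/(1+e^{-t})$. The growth rate and carrying capacity are both set to 1 and the starting value to $P(0) = 0.5$; these values, the forward Euler scheme, and the function name are assumptions made for the example.

```python
import numpy as np

def simulate_logistic_growth(t_end=6.0, steps=60000, p0=0.5):
    """Integrate dP/dt = P * (1 - P) with the forward Euler method."""
    dt = t_end / steps
    p = p0
    for _ in range(steps):
        p += dt * p * (1.0 - p)
    return p

t_end = 6.0
numeric = simulate_logistic_growth(t_end)
# Closed-form solution of the same equation with P(0) = 0.5.
closed_form = 1.0 / (1.0 + np.exp(-t_end))
print(numeric, closed_form)
```

The base $e$ appears in the closed form because it comes from solving a differential equation with continuous growth, in the same way it appears in continuously compounded interest.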

edited Jun 6 at 15:00 by Ethan

answered Jun 6 at 8:12 by Anonymous Emu

thanks for your answer. what does "canonical link function of a glm" mean?
– baojieqh Jun 6 at 9:22

@baojieqh For all generalized linear models, one needs to specify a member of the exponential family of distributions. These distributions all share a property where they can be written in such a way so that a function of the scale parameter of the distribution sits "by itself" in an exponent (and the function is only a function of the scale parameter). This function is what people refer to as the canonical link function. For the Bernoulli/binomial distribution, where the scale parameter is $p$, it turns out that this function is $\ln(p/(1-p))$, which is the logit link function.
– aranglol Jun 6 at 23:55

Hence, the canonical link function for the logistic regression, which assumes a Bernoulli distribution for each row, is the logit link. There are other more theoretical properties as well that make the canonical link function desirable. But it is technically not necessary to use it; you could use the probit, for example.
– aranglol Jun 6 at 23:58

@aranglol thanks for your comments, would you please take a look at this link math.stackexchange.com/q/3253634/656371
– baojieqh Jun 7 at 0:37

This seems to be just a hand-waving appeal to the claim that "$e$ is special", without giving any justification about why $e$ is special. Really, the only specialness is the convenience that $\tfrac{d}{dx}a^x=a^x\ln a$, which means that $\tfrac{d}{dx}e^x=e^x$.
– David Richerby Jun 7 at 9:21
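To make the canonical-link comments above concrete, here is a short derivation sketch for a single Bernoulli observation $y \in \{0, 1\}$ with success probability $p$. The symbol $\eta$ for the exponent term is standard notation rather than something taken from the thread.

$$P(Y=y) = p^{y}(1-p)^{1-y} = \exp\left( y \ln\frac{p}{1-p} + \ln(1-p) \right).$$

The quantity multiplying $y$ in the exponent is the logit, $\eta = \ln\frac{p}{1-p}$. Solving for $p$ gives

$$p = \frac{e^{\eta}}{1+e^{\eta}} = \frac{1}{1+e^{-\eta}},$$

so the base $e$ enters because the exponential family is conventionally written with $\exp$. Writing the same probability with base 2 would only rescale $\eta$ by a constant factor.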

It comes from the basic assumption of the model that there exists a continuous/latent/unobservable $Y^*$ that relates somehow to the observed values of $Y$. The model further assumes that $Y=1$ if the signal of $Y^*$ is above some threshold, and otherwise $Y=0$. The third and last assumption is that the underlying distribution of $Y^*$ is the logistic distribution. Once you have these assumptions, it is only a matter of algebra to derive the model.

You can read more details at my blog.
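The latent-variable assumptions above can be illustrated with a small simulation. This is a sketch under stated assumptions: the linear predictor value `eta`, the threshold of 0, the sample size, and the random seed are hypothetical choices for the example, and the noise is drawn from the standard logistic distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    """Standard logistic function, the CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + np.exp(-z))

n = 200_000
eta = 0.8  # hypothetical value of the linear predictor

# Latent variable: linear predictor plus standard logistic noise.
y_star = eta + rng.logistic(loc=0.0, scale=1.0, size=n)

# Observed label: 1 when the latent variable exceeds the threshold 0.
y = (y_star > 0).astype(int)

empirical = float(y.mean())
predicted = float(sigmoid(eta))
print(empirical, predicted)
```

The share of simulated observations with $Y=1$ should be close to `sigmoid(eta)`, because $P(Y^* > 0)$ equals the logistic CDF evaluated at the linear predictor.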

edited Jun 16 at 13:03 by Stephen Rauch♦

answered Jun 16 at 11:50 by Yossi Levy (new contributor)

