Why is there a change in the number of degrees of freedom when the following modification is made?




In the notes that I'm working through it says the following:
"Let $X_1,\ldots,X_n$ be a random sample from $N(\mu,\sigma)$. Then
$$\sum^n_{i=1}\Bigg[\frac{(X_i-\mu)}{\sigma}\Bigg]^2$$
has a $\chi^2$ distribution with $n$ degrees of freedom.

Now if we modify this by replacing $\mu$ with $\overline{X}$, the distribution changes and we obtain:
$$\sum^n_{i=1}\Bigg[\frac{(X_i-\overline{X})}{\sigma}\Bigg]^2$$
has a $\chi^2$ distribution with $n-1$ degrees of freedom."



My question is: why does the number of degrees of freedom change?



My understanding of a degree of freedom is that the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary. So surely, since there are $n$ values $X_i$, even when we introduce $\overline{X}$ the number of values that are free to vary in the calculation of the statistic stays the same?

















  • BruceET (Apr 30 at 1:45): Try it with $n = 2.$ Once you know $\bar X$ and $X_1,$ then you know $X_2.$ // The glib, supposedly intuitive, 'explanation' is that you "lose one degree of freedom estimating the mean." // More rigorously, $\sum_i (X_i - \mu)^2$ can be decomposed into $\sum_i (X_i - \bar X)^2$ plus the square of one other normal random variable. // Some people are 'convinced' by a simulation of each and fitting the relevant chi-squared random variables with $n$ and $n-1$ DF, which I will attempt.







  • Glen_b (Apr 30 at 1:58): Consider that $\bar X$ is both closer to the data than $\mu$ is, and dependent on it.










  • whuber (Apr 30 at 12:38): The concept of degrees of freedom is thoroughly discussed in our thread at stats.stackexchange.com/questions/16921.
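The decomposition alluded to in the comments can be written out explicitly (a standard identity, added here for completeness, not part of the original page):

```latex
\sum_{i=1}^n (X_i-\mu)^2
  = \sum_{i=1}^n (X_i-\bar X)^2 \;+\; n(\bar X-\mu)^2 .
```

Dividing through by $\sigma^2$, the left side is $\chi^2_n$; the last term is the square of $\sqrt{n}(\bar X-\mu)/\sigma \sim N(0,1)$, hence $\chi^2_1$; and since, for normal samples, $\bar X$ is independent of the deviations $X_i - \bar X$, the middle term must be $\chi^2_{n-1}$.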


















Tags: distributions, normal-distribution, chi-squared, degrees-of-freedom







asked Apr 29 at 22:34









– stochasticmrfox



3 Answers



















Suppose we have a random sample from $\mathsf{Norm}(\mu, \sigma)$.



Let $V_1 = S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2$ be the estimate of the population variance $\sigma^2$ when $\mu$ is unknown and estimated by $\bar X.$ Then $Q_1 = \frac{(n-1)V_1}{\sigma^2} \sim \mathsf{Chisq}(n-1).$



Let $V_2 = \frac{1}{n}\sum_{i=1}^n (X_i - \mu)^2$ be the estimate of the population variance $\sigma^2$ when $\mu$ is known. Then $Q_2 = \frac{nV_2}{\sigma^2} \sim \mathsf{Chisq}(n).$



In the R code below, we choose $m = 10^6$ samples of size $n = 5$ from $\mathsf{Norm}(\mu = 50, \sigma = 7)$. Then we make histograms of $Q_1$ and $Q_2$. Chi-squared densities with $4$ and $5$ degrees of freedom, respectively, fit the histograms. The density curve for $\mathsf{Chisq}(5)$ does not fit the histogram for $Q_1$.



set.seed(2019)
m = 10^6; n = 5; mu = 50; sg = 7
x = rnorm(m*n, mu, sg)
MAT = matrix(x, nrow=m) # each row a sample of n
v1 = apply(MAT, 1, var) # uses sample mean
v2 = rowSums((MAT - mu)^2)/n # uses population mean
q1 = (n-1)*v1 /sg^2
q2 = n*v2 / sg^2

mean(q1); var(q1)
[1] 3.997226 # aprx E(Q1) = 4
[1] 8.00637 # aprx Var(Q1) = 8
mean(q2); var(q2)
[1] 4.997005 # aprx E(Q2) = 5
[1] 9.98925 # aprx Var(Q2) = 10

par(mfrow=c(1,2)) # enables 2 panels per plot
hist(q1, prob=T, br=50, col="skyblue2", ylim=c(0,.2))
curve(dchisq(x, n-1), add=T, lwd=2)
curve(dchisq(x, n), add=T, col="red", lwd=2, lty="dotted")
hist(q2, prob=T, br=50, col="skyblue2")
curve(dchisq(x, n), add=T, lwd=2)
par(mfrow=c(1,1))


[Figure: histograms of $Q_1$ and $Q_2$ with fitted $\mathsf{Chisq}(4)$ and $\mathsf{Chisq}(5)$ density curves]
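For readers without R, the same Monte Carlo check can be sketched in Python (my own translation, assuming NumPy; a smaller $m$ is used here for speed):

```python
import numpy as np

rng = np.random.default_rng(2019)
m, n, mu, sg = 100_000, 5, 50.0, 7.0

x = rng.normal(mu, sg, size=(m, n))          # m samples, each of size n
xbar = x.mean(axis=1, keepdims=True)

q1 = ((x - xbar) ** 2).sum(axis=1) / sg**2   # uses the sample mean
q2 = ((x - mu) ** 2).sum(axis=1) / sg**2     # uses the population mean

# A Chisq(k) variable has mean k and variance 2k, so q1 should look
# like Chisq(4) and q2 like Chisq(5).
print(q1.mean(), q1.var())   # close to 4 and 8
print(q2.mean(), q2.var())   # close to 5 and 10
```

The printed moments match the $n-1$ and $n$ degree-of-freedom claims, mirroring the R output above.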






– BruceET (answered Apr 30 at 2:32, edited Apr 30 at 2:51)





















Your understanding of degrees of freedom is correct for one of the two senses of the term. The vector $\left(X_1-\overline X,\ldots,X_n-\overline X\right)$ has $n-1$ degrees of freedom because it is subject to the constraint that the sum of the components must be $0,$ so if you know $n-1$ of them plus that constraint, then you know all of them.



The other sense of the term degrees of freedom is used when one speaks of a chi-square distribution with a specified number of degrees of freedom. Consider any orthonormal basis of the $(n-1)$-dimensional space of $n$-tuples in which the sum of the components is $0.$ Let $U_1,\ldots,U_{n-1}$ be the scalar components of $\left( X_1-\overline X, \ldots, X_n-\overline X \right)$ with respect to that basis. Then
\begin{align}
& \left( X_1-\overline X\right)^2 + \cdots + \left( X_n - \overline X\right)^2 = U_1^2 + \cdots + U_{n-1}^2 \\[5pt]
& \text{and } U_1,\ldots,U_{n-1} \sim \text{i.i.d. } N(0, \sigma^2).
\end{align}
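This change of basis can be checked numerically. The sketch below (my own illustration, not from the answer) builds a Helmert-type orthonormal basis of the sum-zero subspace and verifies that the sum of squared deviations equals the sum of squared basis components:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
x = rng.normal(50.0, 7.0, size=n)
d = x - x.mean()                 # centered vector; its components sum to 0

# Rows k = 1..n-1 of the Helmert matrix form an orthonormal basis of the
# (n-1)-dimensional subspace {v in R^n : sum(v) = 0}.
B = np.zeros((n - 1, n))
for k in range(1, n):
    B[k - 1, :k] = 1.0           # k leading ones
    B[k - 1, k] = -k             # followed by -k, so each row sums to 0
    B[k - 1] /= np.sqrt(k * (k + 1))   # normalize to unit length

u = B @ d                        # scalar components U_1, ..., U_{n-1}

# Sum of squares is preserved: ||d||^2 == U_1^2 + ... + U_{n-1}^2
print(np.allclose((d ** 2).sum(), (u ** 2).sum()))   # True
```

Because $d$ lies in the span of the $n-1$ basis rows, the quadratic form on the left of the display above is exactly the sum of $n-1$ squares, which is where the $n-1$ in the chi-squared distribution comes from.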






– Michael Hardy (answered Apr 30 at 5:21)





















Your understanding of degrees of freedom is correct. The difference essentially boils down to a subtle difference between $\mu$ and $\bar X$.



The sample mean $\bar X$ is determined by the values of the observed samples. As a result, there's some redundancy between $\bar X$ and the individual values of $x_i$. Suppose we knew $\bar X$ and the first $N-1$ values. That's enough, because we can write out the equation for $\bar X$ as
$$\bar{X}=\frac{1}{N}x_1 +\frac{1}{N}x_2 + \frac{1}{N}x_3+ \ldots + \frac{1}{N}x_{N-1} + \frac{1}{N}x_N$$



A little bit of rearrangement lets us find that missing value: $$x_N = N\bar{X} - (x_1+x_2+\ldots +x_{N-1})$$



This isn't the case when the population mean ($\mu$) is used, since it is not dependent on the observed data; it's known (or assumed to be known) ahead of time. You therefore need all $N$ values to calculate your test statistic.
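As a concrete sketch of that redundancy (hypothetical numbers, my own addition), knowing $\bar X$ and all but one observation pins down the remaining one:

```python
import numpy as np

x = np.array([3.0, 5.0, 8.0, 2.0, 7.0])   # hypothetical sample
N = len(x)
xbar = x.mean()

# Recover the last value from the mean and the first N-1 values:
# x_N = N * xbar - (x_1 + ... + x_{N-1})
x_last = N * xbar - x[:-1].sum()
print(x_last)   # 7.0, matching x[-1]
```

Only $N-1$ of the deviations $x_i - \bar X$ can vary freely, which is exactly the "lost" degree of freedom.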






      share|cite|improve this answer









      $endgroup$













        Your Answer








        StackExchange.ready(function()
        var channelOptions =
        tags: "".split(" "),
        id: "65"
        ;
        initTagRenderer("".split(" "), "".split(" "), channelOptions);

        StackExchange.using("externalEditor", function()
        // Have to fire editor after snippets, if snippets enabled
        if (StackExchange.settings.snippets.snippetsEnabled)
        StackExchange.using("snippets", function()
        createEditor();
        );

        else
        createEditor();

        );

        function createEditor()
        StackExchange.prepareEditor(
        heartbeatType: 'answer',
        autoActivateHeartbeat: false,
        convertImagesToLinks: false,
        noModals: true,
        showLowRepImageUploadWarning: true,
        reputationToPostImages: null,
        bindNavPrevention: true,
        postfix: "",
        imageUploader:
        brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
        contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
        allowUrls: true
        ,
        onDemand: true,
        discardSelector: ".discard-answer"
        ,immediatelyShowMarkdownHelp:true
        );



        );













        draft saved

        draft discarded


















        StackExchange.ready(
        function ()
        StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f405756%2fwhy-is-there-a-change-in-the-number-of-degrees-of-freedom-when-the-following-mod%23new-answer', 'question_page');

        );

        Post as a guest















        Required, but never shown

























        3 Answers
        3






        active

        oldest

        votes








        3 Answers
        3






        active

        oldest

        votes









        active

        oldest

        votes






        active

        oldest

        votes









        3












        $begingroup$

        Suppose we have a random sample from $mathsfNorm(mu, sigma).$



        Let the $V_1 = S^2 = frac1n-1sum_i=1^n (X_i - bar X)^2$ be the estimate of the population variance $sigma^2$ when $mu$ is unknown and estimated by $bar X.$ Then $Q_1 = frac(n-1)V_1sigma^2 sim mathsfChisq(n-1).$



        Let the $V_2 = frac1nsum_i=1^n (X_i - mu)^2$ be the estimate of the population variance $sigma^2$ when $mu$ is known. Then $Q_2 = fracnV_2sigma^2 sim mathsfChisq(n).$



        In the R code below, we choose $m = 10^6$ samples of size $n = 5$ from $mathsfNorm(mu = 50, sigma = 7).$
        Then we make histograms of $Q_1$ and $Q_2.$ Chi-squared densities with degrees of freedom $4$ and $5,$ respectively, fit the histograms. The density curve
        for $mathsfChisq(5)$ does not fit the histogram
        for $Q_1.$



        set.seed(2019)
        m = 10^6; n = 5; mu = 50; sg = 7
        x = rnorm(m*n, mu, sg)
        MAT = matrix(x, nrow=m) # each row a sample of n
        v1 = apply(MAT, 1, var) # uses sample mean
        v2 = rowSums((MAT - mu)^2)/n # uses population mean
        q1 = (n-1)*v1 /sg^2
        q2 = n*v2 / sg^2

        mean(q1); var(q1)
        [1] 3.997226 # aprx E(Q1) = 4
        [1] 8.00637 # aprx Var(Q1) = 8
        mean(q2); var(q2)
        [1] 4.997005 # aprx E(Q2) = 5
        [1] 9.98925 # aprx Var(Q2) = 10

        par(mfrow=c(1,2)) # enables 2 panels per plot
        hist(q1, prob=T, br=50, col="skyblue2", ylim=c(0,.2))
        curve(dchisq(x, n-1), add=T, lwd=2)
        curve(dchisq(x, n), add=T, col="red", lwd=2, lty="dotted")
        hist(q2, prob=T, br=50, col="skyblue2")
        curve(dchisq(x, n), add=T, lwd=2)
        par(mfrow=c(1,1))


        enter image description here






        share|cite|improve this answer











        $endgroup$

















          3












          $begingroup$

          Suppose we have a random sample from $mathsfNorm(mu, sigma).$



          Let the $V_1 = S^2 = frac1n-1sum_i=1^n (X_i - bar X)^2$ be the estimate of the population variance $sigma^2$ when $mu$ is unknown and estimated by $bar X.$ Then $Q_1 = frac(n-1)V_1sigma^2 sim mathsfChisq(n-1).$



          Let the $V_2 = frac1nsum_i=1^n (X_i - mu)^2$ be the estimate of the population variance $sigma^2$ when $mu$ is known. Then $Q_2 = fracnV_2sigma^2 sim mathsfChisq(n).$



          In the R code below, we choose $m = 10^6$ samples of size $n = 5$ from $mathsfNorm(mu = 50, sigma = 7).$
          Then we make histograms of $Q_1$ and $Q_2.$ Chi-squared densities with degrees of freedom $4$ and $5,$ respectively, fit the histograms. The density curve
          for $mathsfChisq(5)$ does not fit the histogram
          for $Q_1.$



          set.seed(2019)
          m = 10^6; n = 5; mu = 50; sg = 7
          x = rnorm(m*n, mu, sg)
          MAT = matrix(x, nrow=m) # each row a sample of n
          v1 = apply(MAT, 1, var) # uses sample mean
          v2 = rowSums((MAT - mu)^2)/n # uses population mean
          q1 = (n-1)*v1 /sg^2
          q2 = n*v2 / sg^2

          mean(q1); var(q1)
          [1] 3.997226 # aprx E(Q1) = 4
          [1] 8.00637 # aprx Var(Q1) = 8
          mean(q2); var(q2)
          [1] 4.997005 # aprx E(Q2) = 5
          [1] 9.98925 # aprx Var(Q2) = 10

          par(mfrow=c(1,2)) # enables 2 panels per plot
          hist(q1, prob=T, br=50, col="skyblue2", ylim=c(0,.2))
          curve(dchisq(x, n-1), add=T, lwd=2)
          curve(dchisq(x, n), add=T, col="red", lwd=2, lty="dotted")
          hist(q2, prob=T, br=50, col="skyblue2")
          curve(dchisq(x, n), add=T, lwd=2)
          par(mfrow=c(1,1))


          enter image description here






          share|cite|improve this answer











          $endgroup$















            3












            3








            3





            $begingroup$

            Suppose we have a random sample from $mathsfNorm(mu, sigma).$



            Let the $V_1 = S^2 = frac1n-1sum_i=1^n (X_i - bar X)^2$ be the estimate of the population variance $sigma^2$ when $mu$ is unknown and estimated by $bar X.$ Then $Q_1 = frac(n-1)V_1sigma^2 sim mathsfChisq(n-1).$



            Let the $V_2 = frac1nsum_i=1^n (X_i - mu)^2$ be the estimate of the population variance $sigma^2$ when $mu$ is known. Then $Q_2 = fracnV_2sigma^2 sim mathsfChisq(n).$



            In the R code below, we choose $m = 10^6$ samples of size $n = 5$ from $mathsfNorm(mu = 50, sigma = 7).$
            Then we make histograms of $Q_1$ and $Q_2.$ Chi-squared densities with degrees of freedom $4$ and $5,$ respectively, fit the histograms. The density curve
            for $mathsfChisq(5)$ does not fit the histogram
            for $Q_1.$



            set.seed(2019)
            m = 10^6; n = 5; mu = 50; sg = 7
            x = rnorm(m*n, mu, sg)
            MAT = matrix(x, nrow=m) # each row a sample of n
            v1 = apply(MAT, 1, var) # uses sample mean
            v2 = rowSums((MAT - mu)^2)/n # uses population mean
            q1 = (n-1)*v1 /sg^2
            q2 = n*v2 / sg^2

            mean(q1); var(q1)
            [1] 3.997226 # aprx E(Q1) = 4
            [1] 8.00637 # aprx Var(Q1) = 8
            mean(q2); var(q2)
            [1] 4.997005 # aprx E(Q2) = 5
            [1] 9.98925 # aprx Var(Q2) = 10

            par(mfrow=c(1,2)) # enables 2 panels per plot
            hist(q1, prob=T, br=50, col="skyblue2", ylim=c(0,.2))
            curve(dchisq(x, n-1), add=T, lwd=2)
            curve(dchisq(x, n), add=T, col="red", lwd=2, lty="dotted")
            hist(q2, prob=T, br=50, col="skyblue2")
            curve(dchisq(x, n), add=T, lwd=2)
            par(mfrow=c(1,1))


            enter image description here






            share|cite|improve this answer











            $endgroup$



            Suppose we have a random sample from $mathsfNorm(mu, sigma).$



            Let the $V_1 = S^2 = frac1n-1sum_i=1^n (X_i - bar X)^2$ be the estimate of the population variance $sigma^2$ when $mu$ is unknown and estimated by $bar X.$ Then $Q_1 = frac(n-1)V_1sigma^2 sim mathsfChisq(n-1).$



            Let the $V_2 = frac1nsum_i=1^n (X_i - mu)^2$ be the estimate of the population variance $sigma^2$ when $mu$ is known. Then $Q_2 = fracnV_2sigma^2 sim mathsfChisq(n).$



            In the R code below, we choose $m = 10^6$ samples of size $n = 5$ from $mathsfNorm(mu = 50, sigma = 7).$
            Then we make histograms of $Q_1$ and $Q_2.$ Chi-squared densities with degrees of freedom $4$ and $5,$ respectively, fit the histograms. The density curve
            for $mathsfChisq(5)$ does not fit the histogram
            for $Q_1.$



            set.seed(2019)
            m = 10^6; n = 5; mu = 50; sg = 7
            x = rnorm(m*n, mu, sg)
            MAT = matrix(x, nrow=m) # each row a sample of n
            v1 = apply(MAT, 1, var) # uses sample mean
            v2 = rowSums((MAT - mu)^2)/n # uses population mean
            q1 = (n-1)*v1 /sg^2
            q2 = n*v2 / sg^2

            mean(q1); var(q1)
            [1] 3.997226 # aprx E(Q1) = 4
            [1] 8.00637 # aprx Var(Q1) = 8
            mean(q2); var(q2)
            [1] 4.997005 # aprx E(Q2) = 5
            [1] 9.98925 # aprx Var(Q2) = 10

            par(mfrow=c(1,2)) # enables 2 panels per plot
            hist(q1, prob=T, br=50, col="skyblue2", ylim=c(0,.2))
            curve(dchisq(x, n-1), add=T, lwd=2)
            curve(dchisq(x, n), add=T, col="red", lwd=2, lty="dotted")
            hist(q2, prob=T, br=50, col="skyblue2")
            curve(dchisq(x, n), add=T, lwd=2)
            par(mfrow=c(1,1))


            enter image description here







            share|cite|improve this answer














            share|cite|improve this answer



            share|cite|improve this answer








            edited Apr 30 at 2:51

























            answered Apr 30 at 2:32









            BruceETBruceET

            7,6811721




            7,6811721























                2












                $begingroup$

                Your understanding of degrees of freedom is correct for one of the two senses of the term. The vector $left(X_1-overline X,ldots,X_n-overline Xright)$ has $n-1$ degrees of freedom because it is subject to the constraint that the sum of the components must be $0,$ so if you know $n-1$ of them plus that constraint, then you know all of them.



                The other sense of the term degrees of freedom is used when one speaks of a chi-square distribution with a specified number of degrees of freedom. Consider any orthonormal basis of the $(n-1)$-dimensional space of $n$-tuples in which the sum of the components is $0.$ Let $U_1,ldots,U_n-1$ be the scalar components of $left( X_1-overline X, ldots, X_n-overline X right)$ with respect to that basis. Then
                beginalign
                & left( X_1-overline Xright)^2 + cdots + left( X_n - overline Xright)^2 = U_1^2 + cdots + U_n-1^2 \[5pt]
                & textand U_1,ldots,U_n-1 sim texti.i.d. N(0, sigma^2).
                endalign






                share|cite|improve this answer









                $endgroup$

















                  2












                  $begingroup$

                  Your understanding of degrees of freedom is correct for one of the two senses of the term. The vector $left(X_1-overline X,ldots,X_n-overline Xright)$ has $n-1$ degrees of freedom because it is subject to the constraint that the sum of the components must be $0,$ so if you know $n-1$ of them plus that constraint, then you know all of them.



                  The other sense of the term degrees of freedom is used when one speaks of a chi-square distribution with a specified number of degrees of freedom. Consider any orthonormal basis of the $(n-1)$-dimensional space of $n$-tuples in which the sum of the components is $0.$ Let $U_1,ldots,U_n-1$ be the scalar components of $left( X_1-overline X, ldots, X_n-overline X right)$ with respect to that basis. Then
                  beginalign
                  & left( X_1-overline Xright)^2 + cdots + left( X_n - overline Xright)^2 = U_1^2 + cdots + U_n-1^2 \[5pt]
                  & textand U_1,ldots,U_n-1 sim texti.i.d. N(0, sigma^2).
                  endalign






                  share|cite|improve this answer









                  $endgroup$















                    2












                    2








                    2





                    $begingroup$

                    Your understanding of degrees of freedom is correct for one of the two senses of the term. The vector $left(X_1-overline X,ldots,X_n-overline Xright)$ has $n-1$ degrees of freedom because it is subject to the constraint that the sum of the components must be $0,$ so if you know $n-1$ of them plus that constraint, then you know all of them.



                    The other sense of the term degrees of freedom is used when one speaks of a chi-square distribution with a specified number of degrees of freedom. Consider any orthonormal basis of the $(n-1)$-dimensional space of $n$-tuples in which the sum of the components is $0.$ Let $U_1,ldots,U_n-1$ be the scalar components of $left( X_1-overline X, ldots, X_n-overline X right)$ with respect to that basis. Then
                    beginalign
                    & left( X_1-overline Xright)^2 + cdots + left( X_n - overline Xright)^2 = U_1^2 + cdots + U_n-1^2 \[5pt]
                    & textand U_1,ldots,U_n-1 sim texti.i.d. N(0, sigma^2).
                    endalign






                    share|cite|improve this answer









                    $endgroup$



                    Your understanding of degrees of freedom is correct for one of the two senses of the term. The vector $left(X_1-overline X,ldots,X_n-overline Xright)$ has $n-1$ degrees of freedom because it is subject to the constraint that the sum of the components must be $0,$ so if you know $n-1$ of them plus that constraint, then you know all of them.



                    The other sense of the term degrees of freedom is used when one speaks of a chi-square distribution with a specified number of degrees of freedom. Consider any orthonormal basis of the $(n-1)$-dimensional space of $n$-tuples in which the sum of the components is $0.$ Let $U_1,ldots,U_n-1$ be the scalar components of $left( X_1-overline X, ldots, X_n-overline X right)$ with respect to that basis. Then
                    beginalign
                    & left( X_1-overline Xright)^2 + cdots + left( X_n - overline Xright)^2 = U_1^2 + cdots + U_n-1^2 \[5pt]
                    & textand U_1,ldots,U_n-1 sim texti.i.d. N(0, sigma^2).
                    endalign







                    share|cite|improve this answer












                    share|cite|improve this answer



                    share|cite|improve this answer










                    answered Apr 30 at 5:21









                    Michael HardyMichael Hardy

                    4,2351430




                    4,2351430





















                        1












                        $begingroup$

Your understanding of degrees of freedom is correct. The difference essentially boils down to a subtle difference between $\mu$ and $\bar X$.

The sample mean $\bar X$ is determined by the values of the observed samples. As a result, there's some redundancy between $\bar X$ and the individual values of $x_i$. Suppose we knew $\bar X$ and the first $N-1$ values. That's enough, because we can write out the equation for $\bar X$ as
$$\bar X=\frac{1}{N}x_1 +\frac{1}{N}x_2 + \frac{1}{N}x_3+ \ldots + \frac{1}{N}x_{N-1} + \frac{1}{N}x_N$$

A little bit of rearrangement lets us find that missing value: $$x_N = N\bar X - (x_1+x_2+\ldots +x_{N-1}) $$

This isn't the case when the population mean ($\mu$) is used, since it is not dependent on the observed data; it's known (or assumed to be known) ahead of time. You therefore need all $N$ values to calculate your test statistic.
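The rearrangement above is easy to verify numerically. A quick sketch in Python with numpy (an assumption; the sample values are arbitrary):

```python
import numpy as np

x = np.array([4.0, 7.0, 1.0, 9.0, 3.0])  # arbitrary sample of N = 5 values
N = len(x)
xbar = x.mean()

# Knowing xbar and the first N-1 values pins down the last one:
x_last = N * xbar - x[:-1].sum()
assert np.isclose(x_last, x[-1])
```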






                            answered Apr 30 at 3:26









Matt Krause


























