

n vs. n-1
Message posted by Rich on April 27, 2000 at 12:00 AM (ET)

In calculating standard deviations, and in particular RSDs of data sets, when is it proper to use n rather than n-1? I seem to remember being taught that "n" was used for small populations; I believe the number given was 20 sample points. At the time, this was explained as 'degrees of freedom', which I now believe may have been explained to me incorrectly. For reference's sake, the RSD being measured is of a known and finite set of measurements (area counts from chromatographic injections). Any help here or via private email will be appreciated. Thanks, Rich


READERS RESPOND:
(In chronological order. Most recent at the bottom.)

Re: n vs. n-1
Message posted by Phil Rosenkrantz on April 30, 2000 at 12:00 AM (ET)

In addition to what the previous responder said, the reason behind the n vs. n-1 choice relates to the idea of an "unbiased estimator". This is of importance to mathematical statisticians and probably only a few others. [An "unbiased estimator" is one whose "expected value" equals the parameter of interest.]

You will notice that as "n" increases, the difference between using n or n-1 becomes almost negligible. Many people teach that after about n=25 or n=30, you can use n instead of n-1 without a problem.
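That shrinking difference can be sketched numerically. The `stdev` helper below is an illustration written for this thread, not code from any of the posters; note that the ratio between the two versions is exactly sqrt(n / (n - 1)), whatever the data:

```python
import math

def stdev(data, ddof=1):
    """Standard deviation, dividing the sum of squares by (n - ddof).
    ddof=1 gives the usual sample SD; ddof=0 divides by n instead."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)
    return math.sqrt(ss / (n - ddof))

# The n-1 version exceeds the n version by a factor of sqrt(n / (n - 1)),
# which heads to 1 as n grows -- hence "negligible" past n = 25 or 30:
for n in (5, 25, 100, 1000):
    print(n, round(math.sqrt(n / (n - 1)), 4))
# 5 1.118
# 25 1.0206
# 100 1.005
# 1000 1.0005
```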

For the t-statistic, n-1 is called the "degrees-of-freedom."

Unless you're a statistics major, don't worry about why.


Re: n vs. n-1
Message posted by Jack Tomsky on May 1, 2000 at 12:00 AM (ET)

Theoretically, dividing the sum of squares about the sample mean by n-1 gives you an unbiased estimate of the population variance (sigma^2). Dividing by n gives you the maximum-likelihood estimate (for normal data). Dividing by n+1 gives you the minimum mean-squared-error estimate (under a quadratic loss function).

If the mean happens to be known, then dividing the sum of squares about this known mean by n is an unbiased estimate of sigma^2. Thus, you have several choices, each one having some desirable property.
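A quick simulation can illustrate the bias Jack describes. The setup here (normal data with sigma = 2, so sigma^2 = 4, and samples of size 10) is an assumed example, not from the thread; averaged over many samples, only the n-1 divisor should land on 4:

```python
import random

random.seed(0)
TRUE_VAR = 4.0            # population variance: sigma = 2 (assumed example)
n, trials = 10, 200_000

sums = {"n-1": 0.0, "n": 0.0, "n+1": 0.0}
for _ in range(trials):
    sample = [random.gauss(0.0, 2.0) for _ in range(n)]
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)   # sum of squares about the sample mean
    sums["n-1"] += ss / (n - 1)
    sums["n"] += ss / n
    sums["n+1"] += ss / (n + 1)

# Average estimate under each divisor: n-1 comes out near 4.0 (unbiased),
# while n and n+1 run low (biased, but with other desirable properties).
for divisor, total in sums.items():
    print(divisor, round(total / trials, 3))
```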




RobertNiles.com™, the site, content and services © Copyright 1996-2002, Robert Niles. All rights reserved.