Estimating Parameters Web Page


©Copyright 1997, 2000 Tom Malloy

This is the text of the in-class lecture which accompanied the Authorware visual graphics on this topic. You may print this text out and use it as a textbook. Or you may read it online. In either case it is coordinated with the online Authorware graphics.





PRINT: You may print this web page on your local printer if you wish. Then you can read the hard copy of the lecture text as you look at the Authorware graphics.


Beginning of Text explaining how we Estimate Population Parameters

How can we use sample data to make guesses about population parameters?


The next topic is estimating parameters. Let's continue with the example we used when we studied the Sampling Distribution of the Mean (SDM). As can be seen on the graphic, we have a population of SAQ scores. It is normally distributed with mu = 150 and sigma = 30; that is, it is N(150, 30). You take a random sample from this population.

Why we have to estimate population parameters. Now, what if you actually don't know the population mu? In fact, in the usual research setting you never know population parameters like mu and sigma. You have to make guesses about them from your sample data.

In the reality of science, you don't know things like the true mean (mu) of the population of SAQ scores. You only have your data (which is a group of SAQ scores). Or, in another example, you don't know the true water absorption rate in an Amazonian rain forest. So you have to go measure water absorption rates on different plots of land in the rain forest. From this sample data you can make a guess about the true water absorption rate.

When we study probability distributions such as the binomial or normal, we simply assume we know the true parameter values like mu or sigma. But in actual research you don't know. And so we're getting to a more scientifically realistic situation now, where you only have data, but do not know the truth. From your data you're going to want to make guesses about the population which generated the data.

In terms of homework and tests, instead of the word problem giving you mu and sigma, it will give you a data set. And you will have to calculate statistics which are good guesses about population parameters. How do we estimate or guess what the value of mu is?

Estimating the mu of the population. This graphic shows an overview of all the relationships. In step 1 (in the upper left-hand corner of the graphic) you can see that the dependent variable, SAQ, has been modeled as a normal distribution. In step 2 we do a research project on spatial ability; this is equivalent to taking a sample of certain size, n, from this population of SAQ scores. So we've got a sample. In step 3 we calculate a statistic on the sample data. In this case we calculate the mean.

Estimated mu = M. The estimate of the population mean, mu, is the sample mean. That's as simple as it can be. Unfortunately, it's going to be messier when we get to estimating population variance.

The sample mean is our best guess as to what the population mean is. On the graphic, we've used a blue line to connect the sample mean with the population mean.
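
To make the three steps concrete, here is a minimal Python sketch. It pretends, as the lecture does, that we know the population is N(150, 30). The sample size of 25 and the random seed are arbitrary choices made for illustration; they are not taken from the graphic.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable (an arbitrary choice)

MU, SIGMA = 150, 30   # population parameters from the running SAQ example
n = 25                # hypothetical sample size, chosen for illustration

# Step 2: take a random sample of n SAQ scores from N(150, 30).
sample = [random.gauss(MU, SIGMA) for _ in range(n)]

# Step 3: calculate the sample mean M, which is our estimate of mu.
M = statistics.mean(sample)

print(f"Sample mean M = {M:.2f} (our guess about mu, which is really {MU})")
```

The printed M will usually be close to 150 but almost never exactly 150. It is a guess based on one sample.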


Estimating sigma. The second population parameter we want to estimate is the standard deviation, sigma. The current screen gives the formula for calculating an estimate of population standard deviation (sigma) from sample data.


Little s versus Big S: As a start, let's establish some symbols.

Little s. In the text you are now reading, we will use little s as a symbol for the statistical formula that estimates the population sigma.

On the graphic screens we're going to use a little script s as a symbol for the same thing.

So any little s, either typed or script, will stand for the formula for guessing the population sigma.

Big S. In contrast, we will continue to use big S as a symbol for the sample standard deviation. When we studied descriptive statistics, we worked a lot with big S, so you should be familiar with its formula.

Definitional formula for little s. On the graphic screen (repeated immediately above) you can see that big S and little s are closely related. At the top of the graphic you can see that one way to find little s is to multiply big S by the square root of [n divided by n-1]. Below you can see that another way to calculate little s is to find the square root of [the sum of the squared deviations divided by n-1]. Examine these two formulas carefully and write them in your workbook.

Little s is our estimated sigma. We won't give a worked example of little s here because the computations are so similar to the computations for the sample standard deviation (big S).

Recall, as a review, big S is the square root of [the sum of the squared deviations divided by n].

The only computational difference between big S and little s is whether you divide the sum of the squared deviations by n or by n-1.
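
The following Python sketch computes big S and little s both ways described above. The eight scores are a made-up data set chosen so the arithmetic comes out cleanly; they are not from the lecture. The last two lines use Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n-1.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, not from the lecture
n = len(scores)
M = sum(scores) / n

# Sum of the squared deviations around the sample mean.
ss = sum((x - M) ** 2 for x in scores)

big_S = math.sqrt(ss / n)            # descriptive: divide by n
little_s = math.sqrt(ss / (n - 1))   # inferential: divide by n - 1

# The other route to little s: big S times the square root of [n / (n - 1)].
little_s_alt = big_S * math.sqrt(n / (n - 1))

print(f"big S    = {big_S:.3f}")
print(f"little s = {little_s:.3f} (or {little_s_alt:.3f} by the other formula)")

# Cross-check against the standard library.
print(f"pstdev   = {statistics.pstdev(scores):.3f}")  # same as big S
print(f"stdev    = {statistics.stdev(scores):.3f}")   # same as little s
```

For this data set the mean is 5 and the sum of squared deviations is 32, so big S is the square root of 32/8, which is exactly 2, while little s is the square root of 32/7, which is a bit larger.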

Inferential versus Descriptive Statistics. While the formulas for big S and little s are very similar, the two formulas represent two very different concepts. Little s is an inferential statistic. We use little s to infer from the sample data what the population standard deviation might be. In contrast, big S is a descriptive statistic. We use the formula for big S to describe the standard deviation of a sample.

Many books don't make the distinction between big S and little s. This is because, when n is large, there is essentially no practical difference between big S and little s. Whether you divide something by n = 50 or by n-1 = 49 may barely show up in the answer.
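
To see how quickly the difference shrinks, recall that little s = big S times the square root of [n divided by n-1]. The short Python sketch below evaluates that multiplier for a few sample sizes. The function name `s_over_S` and the sample sizes are just illustrative choices.

```python
import math

def s_over_S(n):
    """Ratio of little s to big S for a sample of size n: sqrt(n / (n - 1))."""
    return math.sqrt(n / (n - 1))

for n in (5, 10, 50, 500):
    extra = 100 * (s_over_S(n) - 1)
    print(f"n = {n:3d}: little s is {extra:.2f}% larger than big S")
```

With n = 50 the multiplier is about 1.01, so little s is only about 1 percent larger than big S. With n = 5 it is nearly 12 percent larger, which is why the distinction matters for small samples.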

But not making the distinction can lead to a great deal of confusion for students who see formulas that require dividing by n in one place and require dividing by n-1 in another place. Moreover, there is a real and important distinction between inferential and descriptive statistics.

A description of sample variability or dispersion is accurately given by big S.

An inference about population variability is accurately given by little s.

On the graphic screens we will use this little script s for our estimate of population sigma. In the online text you are reading we will use a little s to indicate the same thing.

Computational formula for little s. The next screen graphic shows the computational formula for little s. The computational formula is pretty easy to use on a hand calculator. Some people call it the sum of squares formula.
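
The graphic is not reproduced in this text. As a sketch, the Python function below uses the standard textbook form of the sum of squares formula, which we assume matches the one on the screen: sum of squares = [sum of X squared] minus [(sum of X) squared, divided by n], and little s is the square root of [the sum of squares divided by n-1]. The function name and the data are made up for illustration.

```python
import math

def little_s_computational(scores):
    """Estimate sigma with the sum of squares (computational) formula."""
    n = len(scores)
    sum_x = sum(scores)                    # sum of the scores
    sum_x2 = sum(x * x for x in scores)    # sum of the squared scores
    ss = sum_x2 - sum_x ** 2 / n           # sum of squares
    return math.sqrt(ss / (n - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, not from the lecture
print(f"little s = {little_s_computational(scores):.3f}")
```

For these scores the sum of X is 40 and the sum of X squared is 232, so the sum of squares is 232 - 1600/8 = 32. You only need two running totals, which is what makes this form convenient on a hand calculator.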

Overview. The next graphic shows the overall relationships among population, sample and little s. Theoretically, there is a population from which you take some sample. The sample data can be used to make a guess (estimate) about the population sigma. Little s is our best guess, or estimate, as to what the population sigma is.

In terms of the specific example shown on the graphic, we pretend that we know the population is normal and that it has mu of 150 and sigma of 30. We take a sample of a certain size n from the population. From that sample we calculate a statistic, little s, which is the best sample estimate of sigma. Now the population sigma is 30, but little s is very unlikely to be exactly 30. It is just a guess. It should, however, be near 30. Depending on all the random factors in sampling, the data differ from sample to sample. So little s will differ from sample to sample.
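
You can watch little s change from sample to sample with a small simulation. This Python sketch draws five samples from N(150, 30) and computes little s for each one. The sample size of 25, the number of samples, and the seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable (an arbitrary choice)

MU, SIGMA = 150, 30   # population parameters from the running SAQ example
n = 25                # hypothetical sample size

estimates = []
for _ in range(5):
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    # statistics.stdev divides by n - 1, so it is little s.
    estimates.append(statistics.stdev(sample))

print("little s for five different samples:",
      [round(s, 1) for s in estimates])
```

Each value is a different guess about the same sigma of 30. None of them is likely to be exactly 30, but they should scatter around it.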

Summary. One important point to get is that there are two formulas, little s and big S. Big S is the standard deviation of the sample. It describes the amount of spread of the sample data around their mean. Little s is a guess or estimate about the value of the population sigma.


Estimated SEM. Next we will consider how to make a guess about the value of the standard error of the mean (SEM). Recall that the SEM is the standard deviation (sigma) of the sampling distribution of the mean (SDM). We will begin with a review of the formulas for big S and little s and the conceptual difference between them.


Big S. Big S is a descriptive statistic; it's the sample standard deviation. As a formula it is the square root of [the sum of the squared deviations around the mean, divided by n].

Little s. In contrast, little s is an inferential statistic because we're not describing the data now; we're making an inference about a population parameter, sigma.

Take a sample. To begin the estimation process, we take a sample of human beings from a population. Using our running example we take a sample of humans from the SAQ population. Now that we have the sample data we want to define a statistic that will be a good estimate of the SEM (standard error of the mean).

How to estimate SEM. We have two formulas; they both work equally well. As you can see on the graphic, you can estimate the standard error of the mean by dividing little s by the square root of n. Or you can divide big S by the square root of n-1. Both formulas will give you the same result (within rounding error). The blue line on the graphic shows that these two formulas estimate the standard deviation of the sampling distribution of the mean.

It's useful to have both these formulas because at certain times you may have big S available and at other times you may have little s available.
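
Why do the two formulas agree? Because little s = big S times the square root of [n divided by n-1], dividing little s by the square root of n leaves big S divided by the square root of n-1. The Python sketch below checks this on a small made-up data set (the scores are hypothetical, not from the lecture).

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, not from the lecture
n = len(scores)

big_S = statistics.pstdev(scores)    # divides by n
little_s = statistics.stdev(scores)  # divides by n - 1

sem_from_little_s = little_s / math.sqrt(n)
sem_from_big_S = big_S / math.sqrt(n - 1)

print(f"estimated SEM from little s: {sem_from_little_s:.4f}")
print(f"estimated SEM from big S:    {sem_from_big_S:.4f}")
```

The two printed values match. Any difference you see in hand calculations comes from rounding big S or little s before dividing.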

Example. In the SAQ example we are using, the population is normal with a sigma of 30. Suppose we took a sample of n = 25 people from the SAQ population. As a review, remember that the Sampling Distribution of the Mean (SDM) is also normal with a standard deviation equal to the population sigma divided by the square root of n. The standard deviation of the SDM is called the standard error of the mean (SEM). In our example, the SEM is equal to 30 over the square root of 25. So the true SEM is equal to 6.

Using big S. Suppose it turns out that our sample standard deviation (big S) is equal to 27.8. Then our estimated SEM = 27.8 divided by the square root of 24. This gives us an estimated SEM of approximately 5.675.

Using little s. Now let's use our sample estimate of the SAQ population sigma. If big S = 27.8, then little s would be equal to 28.37 (see FYI, below). So our estimated SEM = 28.37 divided by the square root of 25. Estimated SEM = 5.674, which matches the big S result within rounding error (28.37 is itself a rounded value).

FYI: If you want to work on the numbers in the example above, recall that little s = big S times the square root of [n divided by n-1].
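
If you would rather check the numbers in the example by machine than by hand calculator, here is a short Python sketch. It uses only the values given in the example (sigma = 30, n = 25, big S = 27.8); the variable names are our own.

```python
import math

n = 25
sigma = 30     # population sigma (known here only because we are pretending)
big_S = 27.8   # sample standard deviation from the example

# True SEM: population sigma over the square root of n.
true_sem = sigma / math.sqrt(n)

# Little s from big S: multiply by the square root of [n / (n - 1)].
little_s = big_S * math.sqrt(n / (n - 1))

# The two ways of estimating the SEM.
est_sem_from_big_S = big_S / math.sqrt(n - 1)
est_sem_from_little_s = little_s / math.sqrt(n)

print(f"true SEM                  = {true_sem:.3f}")
print(f"little s                  = {little_s:.2f}")
print(f"estimated SEM (big S)     = {est_sem_from_big_S:.4f}")
print(f"estimated SEM (little s)  = {est_sem_from_little_s:.4f}")
```

Carried to full precision, both estimates are identical. Small differences in the third decimal place appear only when little s is rounded to 28.37 before dividing.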


We are finished with our discussion of how to estimate population parameters. We have developed estimates of the population mu and sigma as well as an estimate of the standard error of the mean.
