To make estimates about a population, statisticians use a random sample that is representative of that population. For example, if you weigh 50 randomly chosen American women, you could estimate the average weight of all American women from the sample average. Sampling error occurs when your sample result deviates from the true population value. That is, if your 50 women yielded an average weight of 135 pounds when the true average was 150 pounds, then your sampling error is -15 pounds (observed minus actual), meaning you underestimated the true value by 15 pounds. Because the true value is seldom known, statisticians rely on measures such as the standard error and confidence intervals to gauge how large the sampling error is likely to be.
Calculate the percentage you are measuring. For instance, if you would like to know what percentage of students at a given school smoke cigarettes, take a random sample (let's say n, our sample size, equals 30), have the students fill out an anonymous survey and calculate the percentage who say they smoke. For the sake of illustration, let's say six students said they smoke. Then the percentage who smoke = (# who smoke) / (total # of students measured) x 100% = 6 / 30 x 100% = 20%.
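The percentage calculation above can be sketched in a few lines of Python. The function name sample_percentage is hypothetical and chosen only for illustration; the counts are the ones from the example.

```python
def sample_percentage(count: int, n: int) -> float:
    """Return the percentage of a sample of size n that has the trait.

    Hypothetical helper for illustration; not part of any library.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= count <= n:
        raise ValueError("count must be between 0 and the sample size")
    return 100 * count / n


# Values from the example: 6 smokers in a sample of 30 students.
smokers = 6
sample_size = 30
print(f"{sample_percentage(smokers, sample_size):.1f}%")  # 20.0%
```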
Calculate the standard error. Because we do not know the actual percentage of students who smoke, we can only approximate the sampling error by calculating the standard error. In statistics, calculations use a proportion, p, rather than a percentage, so let's convert 20% to a proportion. Dividing 20% by 100%, you get p = 0.20. Standard Error (SE) for large sample sizes = sqrt[ p x (1 – p) / n ], where sqrt[x] means to take the square root of x. In this example, we get SE = sqrt[ 0.2 x 0.8 / 30 ] = sqrt[ 0.00533… ] ≈ 0.073.
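As a rough check of the arithmetic, the standard error formula can be written as a small Python function. The name standard_error is an assumption made for this sketch; only math.sqrt from the standard library is used.

```python
import math


def standard_error(p: float, n: int) -> float:
    """Large-sample standard error of a proportion: sqrt(p * (1 - p) / n).

    Hypothetical helper for illustration; p is a proportion between 0 and 1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("sample size must be positive")
    return math.sqrt(p * (1 - p) / n)


# Values from the example: p = 0.20, n = 30.
se = standard_error(0.20, 30)
print(round(se, 3))  # 0.073
```

Because n sits in the denominator, a larger sample gives a smaller standard error for the same proportion.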
Create a confidence interval. For a 95% confidence level, multiply the SE by 1.96, the standard normal multiplier for 95% confidence. Lower bound: estimated proportion – 1.96 x SE = 0.2 – 1.96 x 0.073 = 0.0569. Upper bound: estimated proportion + 1.96 x SE = 0.2 + 1.96 x 0.073 = 0.343. So we would say we are 95% confident the true proportion of smokers is between 0.0569 and 0.343, or as a percentage, that between 5.69% and 34.3% of students smoke. This wide spread indicates the possibility of a rather large sampling error.
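The interval calculation can be sketched in Python as follows. The function name confidence_interval and its default z = 1.96 are assumptions made for this illustration. The sketch uses the unrounded standard error, so the upper bound prints as 0.3431 rather than 0.343.

```python
import math


def confidence_interval(p: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Return (lower, upper) bounds of p +/- z * SE for a sample proportion.

    Hypothetical helper for illustration. z = 1.96 corresponds to a 95%
    confidence level under the large-sample normal approximation.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    se = math.sqrt(p * (1 - p) / n)
    margin = z * se
    return p - margin, p + margin


# Values from the example: p = 0.20, n = 30.
lower, upper = confidence_interval(0.20, 30)
print(round(lower, 4), round(upper, 4))  # 0.0569 0.3431
```

With a small sample the lower bound of this simple formula can fall below 0 (or the upper bound above 1); this sketch does not clip the bounds.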
Measure everyone to compute the exact sampling error. Have all students in the school complete the anonymous survey and compute the percentage of students who said they smoke. Let's say 120 out of 800 students said they smoked; then the actual percentage is 120 / 800 x 100% = 15%. Therefore, our sampling error = (estimated) – (actual) = 20% – 15% = 5 percentage points. The closer to zero, the better our estimate and the smaller our sampling error is said to be. In a real-world situation, however, you are not likely to know the actual value and will have to rely on the SE and confidence interval for interpretation.
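When the true value is known, the exact sampling error is simply the estimate minus the actual value. A minimal Python sketch, using a hypothetical function name sampling_error and the figures from this article:

```python
def sampling_error(estimated: float, actual: float) -> float:
    """Return estimated minus actual, in the same units as the inputs.

    Hypothetical helper for illustration. A positive result means the
    sample overestimated the true value; a negative one, underestimated.
    """
    return estimated - actual


# Smoking example: sample says 20%, full survey says 120 of 800 = 15%.
estimated_pct = 100 * 6 / 30
actual_pct = 100 * 120 / 800
print(sampling_error(estimated_pct, actual_pct))  # 5.0 (percentage points)

# Weight example: sample mean 135 lb, true mean 150 lb.
print(sampling_error(135, 150))  # -15 (pounds)
```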