Wednesday, April 24, 2013


Primary data is data collected directly by a researcher to address a specific problem that they would like to solve.  One popular type of primary data collection is sampling.  Sampling occurs when the researcher gathers information from a subset of a population in order to make inferences about the whole population.  When sampling, it is important to randomize whom you gather data from; otherwise you risk collecting biased, misleading results.
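As a rough illustration of what "randomizing" can look like in practice, here is a minimal Python sketch of a simple random sample, where every member of the population has the same chance of being picked.  The population of 1,000 ID numbers, the sample size of 50, and the function name `simple_random_sample` are all made up for the example.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)


# Hypothetical population: ID numbers for 1,000 people.
population = list(range(1000))

# Pick 50 people to survey, without replacement.
sample = simple_random_sample(population, 50, seed=42)
print(sorted(sample))
```

Because the selection is left to chance rather than to the researcher's convenience, no particular group is systematically favored.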

Perhaps the most famous case of misleading results gathered from sampling was the Literary Digest poll in the election of 1936.  The Literary Digest sent out a poll to about 10 million people on its mailing lists.  They received approximately 2.3 million responses back.  These responses predicted that the Republican candidate, Alf Landon, would win by a landslide, when in reality Franklin Roosevelt ended up taking home the win.  So why was the Literary Digest incorrect?  They gathered information from a similar, select group of people rather than from a cross-section of voters.  This group was mostly upper class, and so it was more likely than the electorate as a whole to vote for the Republican candidate.  Had the Literary Digest sent their poll to a randomized group of people, they probably would have gotten a more accurate result and avoided the embarrassment that this poll caused them.  You can read more about this famous incident here -
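To see how polling only one kind of voter can distort a result, here is a small Python simulation.  The numbers are invented for illustration and are not the actual 1936 figures: it assumes 30,000 upper-income voters of whom 70% favor candidate A, and 70,000 other voters of whom 35% favor candidate A.

```python
import random

rng = random.Random(1936)  # fixed seed so the run is repeatable

# Hypothetical electorate (made-up numbers, not 1936 data).
# Each voter is a pair: (income group, favors candidate A?).
upper = [("upper", i < 21000) for i in range(30000)]  # 70% favor A
other = [("other", i < 24500) for i in range(70000)]  # 35% favor A
electorate = upper + other


def share_for_a(voters):
    """Fraction of the given voters who favor candidate A."""
    return sum(favors_a for _, favors_a in voters) / len(voters)


true_share = share_for_a(electorate)

# Biased poll: 2,000 voters drawn only from the upper-income group.
biased_poll = rng.sample(upper, 2000)

# Random poll: 2,000 voters drawn from the whole electorate.
random_poll = rng.sample(electorate, 2000)

print("True share for A:  ", true_share)
print("Biased poll says:  ", share_for_a(biased_poll))
print("Random poll says:  ", share_for_a(random_poll))
```

Under these assumptions the biased poll points to a comfortable win for candidate A, even though fewer than half of all voters actually favor A, while the random poll lands close to the true share.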

When sampling, two types of error can occur: sampling error and non-sampling error.  Sampling error is the difference between the true value for the target population and the estimate obtained from the sample; it arises simply because only part of the population is observed.  Non-sampling error, by contrast, comes from how the data is gathered and handled.  A common example is measurement error, which covers mistakes made by the researcher such as recording the information incorrectly or gathering it incorrectly.  For more information about sampling error visit this article -
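The difference between the two kinds of error can be sketched in a few lines of Python.  Everything here is hypothetical: a population of 10,000 heights in centimeters, a sample of 100 people, and a measuring tool that reads 2 cm too high (the constant `RECORDING_BIAS`).

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 heights in cm.
population = [rng.gauss(170, 10) for _ in range(10000)]
true_mean = statistics.mean(population)

# Sampling error: the sample estimate minus the population truth.
sample = rng.sample(population, 100)
sampling_error = statistics.mean(sample) - true_mean

# Non-sampling error: assume every height is recorded 2 cm too high.
# Even measuring the entire population does not remove this error.
RECORDING_BIAS = 2.0
recorded_census = [height + RECORDING_BIAS for height in population]
non_sampling_error = statistics.mean(recorded_census) - true_mean

print("Sampling error:    ", round(sampling_error, 2))
print("Non-sampling error:", round(non_sampling_error, 2))
```

In this sketch a larger sample would tend to shrink the sampling error, but the recording mistake stays at about 2 cm no matter how many people are measured.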

What are some other famous incidents with sampling and non-sampling errors in history?
