Inferential Statistics
Basics of Inferential Statistics
The fundamental idea of inferential statistics is that the researcher is interested in some "population." In marketing research, the population is often the firm's actual and potential customers. The researcher wants information about the population -- say, for example, the age of the customers in the population.
The most accurate way of getting information about the age of everyone in the population would be to ask all of them how old they are. Obviously, if the population is large, that's a difficult undertaking. Fortunately, there is no need to spend the time and money it would take to contact everyone. Inferential statistics to the rescue!
The key move is that the researcher can take a "sample" that represents the population. The sample can be relatively small, say 250 people or so. By getting the information from the sample (in our case, asking people in the sample how old they are), the researcher can make an inference about the population.
Obviously, if we don't ask everyone in the population, we might be wrong. Said differently, not measuring the entire population leads to uncertainty. After all, the researcher isn't really certain about the truth unless he or she has asked ALL the customers. Sampling cannot provide 100% certainty.
How much uncertainty are we forced to live with when we use a sample instead of the whole population? Happily, the field of statistics has provided a way to quantify it. The gap between what a sample tells us and the truth about the population is called "sampling error," and statistics lets us estimate how large that gap is likely to be.
When done right, sampling yields some pretty darn good inferences. It's surprising how few people it takes in a representative sample to yield a reasonably certain inference about a population. For example, let's say the population is 100,000 people and a researcher obtains a representative sample of 250 people. Further, let's say the researcher examines the ages that were reported by the people in the sample and calculates the following: mean = 43.3 years and variance = 134.4 years². With that information, it can be calculated that we are 95% confident the true value of the mean age in the population is between 41.9 years and 44.7 years. We're not 100% certain because we didn't have the resources to talk to everyone in the population, but we got a very good estimate by only talking to 250 people. That is inferential statistics.
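For readers who want to see where the interval in the example comes from, here is a minimal sketch of the arithmetic in Python. It assumes the usual normal-approximation interval for a mean (the sample mean plus or minus 1.96 standard errors); the function name mean_confidence_interval and the multiplier z = 1.96 are choices made for this illustration, not something the course prescribes.

```python
import math


def mean_confidence_interval(mean, variance, n, z=1.96):
    """Return (lower, upper) bounds of a confidence interval for a mean.

    Assumes a normal approximation; z = 1.96 corresponds to roughly
    95% confidence. The name and signature are illustrative only.
    """
    standard_error = math.sqrt(variance / n)  # spread of the sample mean
    margin = z * standard_error               # half-width of the interval
    return mean - margin, mean + margin


# Figures from the example: 250 respondents, mean 43.3, variance 134.4
lower, upper = mean_confidence_interval(mean=43.3, variance=134.4, n=250)
print(f"95% CI: {lower:.1f} to {upper:.1f} years")  # 95% CI: 41.9 to 44.7 years
```

The margin works out to about 1.4 years on either side of the sample mean, which matches the 41.9 to 44.7 range quoted above.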
Another happy reality is that the size of the population is no hindrance to this process. The precision of the estimate depends mainly on the size of the sample, not on the size of the population. As long as the sample is representative, a population of 1 million people or 100 million transactions is no problem for inferential statistics.
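One way to see why population size matters so little is to apply the standard "finite population correction" to the margin of error. The sketch below is only an illustration: it reuses the sample size (250) and variance (134.4) from the age example as assumed inputs for every population size, and the function name margin_of_error is hypothetical.

```python
import math


def margin_of_error(variance, n, population_size, z=1.96):
    """Margin of error for a mean, with the finite population correction.

    Illustrative helper: assumes a simple random sample drawn without
    replacement and a normal approximation (z = 1.96 for about 95%).
    """
    standard_error = math.sqrt(variance / n)
    # The correction approaches 1 when the sample is a small share of the population
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return z * standard_error * correction


# Same sample, very different population sizes
margins = {
    size: margin_of_error(variance=134.4, n=250, population_size=size)
    for size in (100_000, 1_000_000, 100_000_000)
}
for size, margin in margins.items():
    print(f"population {size:>11,}: margin of error = {margin:.3f} years")
```

Under these assumptions the margin stays at roughly 1.4 years whether the population is 100,000 or 100 million, because a sample of 250 is a tiny fraction of the population in every case.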
In this course, we will use a variety of techniques that rely on the fundamental idea of inferential statistics: we can make inferences about a population based on a representative sample. However, we won't go into the math behind those estimates. Instead, we'll use the estimates and ask managerial questions like, "How large a sample do we need in this managerial situation?" and "Is age really the thing we should be measuring?"