Which probability sampling method divides a population into clusters?

In probability sampling, it is possible to determine both which sampling units belong to which sample and the probability that each sample will be selected. The following sampling methods are examples of probability sampling:

  1. Simple Random Sampling (SRS)
  2. Stratified Sampling
  3. Cluster Sampling
  4. Systematic Sampling
  5. Multistage Sampling (in which some of the methods above are combined in stages)

Of the five methods listed above, students have the most trouble distinguishing between stratified sampling and cluster sampling.

Stratified Sampling is possible when it makes sense to partition the population into groups based on a factor that may influence the variable that is being measured. These groups are then called strata. An individual group is called a stratum. With stratified sampling one should:

  • partition the population into groups (strata)
  • obtain a simple random sample from each group (stratum)
  • collect data on each sampling unit that was randomly sampled from each group (stratum)

Stratified sampling works best when a heterogeneous population is split into fairly homogeneous groups. Under these conditions, stratification generally produces more precise estimates of the population percents than estimates that would be found from a simple random sample. Table 2.2 shows some examples of ways to obtain a stratified sample.

Table 2.2. Examples of Stratified Samples

| | Example 1 | Example 2 | Example 3 |
|---|---|---|---|
| Population | All people in the U.S. | All PSU intercollegiate athletes | All elementary students in the local school district |
| Groups (Strata) | 4 Time Zones in the U.S. (Eastern, Central, Mountain, Pacific) | 26 PSU intercollegiate teams | 11 different elementary schools in the local school district |
| Obtain a Simple Random Sample | 500 people from each of the 4 time zones | 5 athletes from each of the 26 PSU teams | 20 students from each of the 11 elementary schools |
| Sample | 4 × 500 = 2000 selected people | 26 × 5 = 130 selected athletes | 11 × 20 = 220 selected students |
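The stratified procedure can be sketched in a few lines of Python. This is only an illustration of Example 2 under stated assumptions: the function name `stratified_sample`, the roster labels, and the team size of 20 athletes are all made up for the sketch and do not come from any real PSU data.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Draw a simple random sample of `per_stratum` units from EVERY stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        # Simple random sample (without replacement) inside this stratum
        sample[name] = rng.sample(units, per_stratum)
    return sample

# Hypothetical roster: 26 teams, each with 20 athletes (sizes are invented)
teams = {
    f"team_{t:02d}": [f"team_{t:02d}_athlete_{a:02d}" for a in range(20)]
    for t in range(1, 27)
}

chosen = stratified_sample(teams, per_stratum=5, seed=1)
total = sum(len(athletes) for athletes in chosen.values())
print(total)  # 26 teams x 5 athletes each = 130
```

Note that every team appears in the sample; only the athletes within each team are chosen at random.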

Cluster Sampling is very different from Stratified Sampling. With cluster sampling, one should:

  • divide the population into groups (clusters).
  • obtain a simple random sample of so many clusters from all possible clusters.
  • obtain data on every sampling unit in each of the randomly selected clusters.

It is important to note that, unlike with the strata in stratified sampling, the clusters should be microcosms, rather than subsections, of the population. Each cluster should be heterogeneous. Additionally, the statistical analysis used with cluster sampling is not only different but also more complicated than that used with stratified sampling.

Table 2.3. Examples of Cluster Samples

| | Example 1 | Example 2 | Example 3 |
|---|---|---|---|
| Population | All people in the U.S. | All PSU intercollegiate athletes | All elementary students in a local school district |
| Groups (Clusters) | 4 Time Zones in the U.S. (Eastern, Central, Mountain, Pacific) | 26 PSU intercollegiate teams | 11 different elementary schools in the local school district |
| Obtain a Simple Random Sample | 2 time zones from the 4 possible time zones | 8 teams from the 26 possible teams | 4 elementary schools from the 11 possible elementary schools |
| Sample | every person in the 2 selected time zones | every athlete on the 8 selected teams | every student in the 4 selected elementary schools |
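For contrast with stratified sampling, here is a minimal Python sketch of the cluster procedure using Example 3. The function name `cluster_sample`, the school labels, and the enrollment of 50 students per school are hypothetical values chosen only for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep EVERY unit in each picked cluster."""
    rng = random.Random(seed)
    # Simple random sample of cluster names (not of individual units)
    selected = rng.sample(sorted(clusters), n_clusters)
    # No sampling within a selected cluster: all of its units are included
    return {name: list(clusters[name]) for name in selected}

# Hypothetical district: 11 schools, each with 50 students (sizes are invented)
schools = {
    f"school_{s:02d}": [f"school_{s:02d}_student_{i:03d}" for i in range(50)]
    for s in range(1, 12)
}

picked = cluster_sample(schools, n_clusters=4, seed=1)
print(len(picked))  # 4 schools selected out of 11
```

Here the randomness is in which schools are chosen; the 7 schools that are not selected contribute no students at all.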

Each of the three examples that are found in Tables 2.2 and 2.3 was used to illustrate how both stratified and cluster sampling could be accomplished. However, there are obviously times when one sampling method is preferred over the other. The following explanations add some clarification about when to use which method.

  • With Example 1: Stratified sampling would be preferred over cluster sampling, particularly if the questions of interest are affected by time zone. For example, the percentage of people watching a live sporting event on television might be highly affected by the time zone they are in. Cluster sampling really works best when there are a reasonable number of clusters relative to the entire population. In this case, selecting 2 clusters from 4 possible clusters really does not provide many advantages over simple random sampling.
  • With Example 2: Either stratified sampling or cluster sampling could be used. It would depend on what questions are being asked. For instance, consider the question "Do you agree or disagree that you receive adequate attention from the team of doctors at the Sports Medicine Clinic when injured?" The answer to this question would probably not be team dependent, so cluster sampling would be fine. In contrast, consider the question "Do you agree or disagree that weather affects your performance during an athletic event?" The answer to this question would probably be influenced by whether the sport is played outside or inside. Consequently, stratified sampling would be preferred.
  • With Example 3: Cluster sampling would probably be better than stratified sampling if each individual elementary school appropriately represents the entire population, as in a school district where students from throughout the district can attend any school. Stratified sampling could be used if the elementary schools had very different locations and served only their local neighborhoods (e.g., one elementary school is located in a rural setting while another is located in an urban setting). Again, the questions of interest would affect which sampling method should be used.

The most common method of carrying out a poll today is Random Digit Dialing, in which a machine randomly dials phone numbers. Some polls go even further and have a machine conduct the interview itself rather than just dialing the number! Such "robocall polls" can be very biased because they have extremely low response rates (most people don't like speaking to a machine) and because federal law prevents such calls to cell phones. Since the people who have landline phone service tend to be older than people who have cell phone service only, another potential source of bias is introduced. National polling organizations that use random digit dialing in conducting interviewer-based polls are very careful to match the number of landline versus cell phones to the population they are trying to survey.

Non-probability Sampling

The following sampling methods that are listed in your text are types of non-probability sampling that should be avoided:

  1. volunteer samples
  2. haphazard (convenience) samples

Since such non-probability sampling methods are based on human choice rather than random selection, statistical theory cannot explain how they might behave, and potential sources of bias are rampant. In your textbook, the two types of non-probability samples listed above are called "sampling disasters."

Read the article: "How Polls are Conducted" by the Gallup organization available in Canvas.

The article provides great insight into how major polls are conducted. When you are finished reading this article you may want to go to the Gallup Poll Website and see the results from recent Gallup polls. Another excellent source of public opinion polls on a wide variety of topics using solid sampling methodology is the Pew Research Center Website. When you read one of the summary reports on the Pew site, there is a link (in the upper right corner) to the complete report giving more detailed results and a full description of their methodology as well as a link to the actual questionnaire used in the survey so you can judge whether there might be bias in the wording of their survey.

It is important to be mindful of the margin of error as discussed in this article. We all need to remember that public opinion on a given topic cannot be appropriately measured with one question that is only asked on one poll. Such results only provide a snapshot at that moment under certain conditions. Repeating procedures over different conditions and times leads to more valuable and durable results. Within this section of the Gallup article, there is also an error: "in 95 out of those 100 polls, his rating would be between 46% and 54%." This should instead say that in an expected 95 out of those 100 polls, the true population percent would be within the confidence interval calculated. In the remaining 5 or so of those surveys, the confidence interval would not contain the population percent.
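To make the margin of error concrete, the sketch below uses the standard normal-approximation formula for a sample proportion from a simple random sample. The sample size of 600 is an assumption chosen so the numbers match the 46% to 54% range quoted above; it is not taken from the Gallup article, and real polls typically adjust for weighting and other design features that this simple formula ignores.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion (SRS assumed)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def confidence_interval(p_hat, n, z=1.96):
    """Interval: sample proportion plus or minus the margin of error."""
    moe = margin_of_error(p_hat, n, z)
    return p_hat - moe, p_hat + moe

# Hypothetical poll: 50% approval among n = 600 respondents (n is assumed)
low, high = confidence_interval(0.50, 600)
print(f"{low:.2f} to {high:.2f}")  # 0.46 to 0.54
```

The interval is built around each poll's own sample percent, which is why the statement about "95 out of 100 polls" refers to how often such intervals capture the true population percent, not to where the sample percents themselves fall.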

Which sampling method is an example of cluster sampling?

An example of single-stage cluster sampling – An NGO wants to create a sample of girls across five neighboring towns to provide education. Using single-stage sampling, the NGO randomly selects towns (clusters) to form a sample and extend help to the girls deprived of education in those towns.

What is cluster sampling also known as?

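Cluster sampling is sometimes loosely called multi-stage sampling, but the two are not identical. In single-stage cluster sampling, every unit in each selected cluster is included; in multi-stage sampling, clusters are selected at the first stage and then further elements are sampled from within the selected clusters.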

Which probability sampling method divides the main segment into clusters, usually using geographic and demographic segmentation parameters?

Cluster sampling: Cluster sampling is a probability sampling method in which the main segment is divided into clusters, usually using geographic and demographic segmentation parameters.

What are the 4 types of probability sampling?

There are four commonly used types of probability sampling designs:

  1. Simple random sampling
  2. Stratified sampling
  3. Systematic sampling
  4. Cluster sampling