

One area of concern in inferential statistics is the estimation of the population parameter from the sample statistic. It is important to realize the order here. The sample statistic is calculated from the sample data and the population parameter is inferred (or estimated) from this sample statistic. Let me say that again: Statistics are calculated, parameters are estimated.

We talked about problems of obtaining the value of the parameter earlier in the course when we talked about sampling techniques.

Another area of inferential statistics is sample size determination. That is, how large a sample should be taken to make an accurate estimate? In these cases, the sample statistics can't be used, since the sample hasn't been taken yet.

Point Estimates

There are two types of estimates we will find: point estimates and interval estimates. A point estimate is a single value used as the best estimate of the parameter.

A good estimator must satisfy three conditions:

  • Unbiased: The expected value of the estimator must be equal to the parameter being estimated
  • Consistent: The value of the estimator approaches the value of the parameter as the sample size increases
  • Relatively Efficient: The estimator has the smallest variance of all estimators which could be used

Confidence Intervals

The point estimate is going to differ from the population parameter because of sampling error, and there is no way to know how close it is to the actual parameter. For this reason, statisticians like to give an interval estimate, which is a range of values used to estimate the parameter.

A confidence interval is an interval estimate with a specific level of confidence. The level of confidence is the probability that the interval estimate will contain the parameter. The level of confidence is 1 - alpha, so an area of 1 - alpha lies within the confidence interval.

Maximum Error of the Estimate

The maximum error of the estimate is denoted by E and is one-half the width of the confidence interval. The basic confidence interval for a symmetric distribution is set up to be the point estimate minus the maximum error of the estimate is less than the true population parameter which is less than the point estimate plus the maximum error of the estimate. This formula will work for means and proportions because they will use the Z or T distributions which are symmetric. Later, we will talk about variances, which don't use a symmetric distribution, and the formula will be different.
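In symbols, for a symmetric case this is the familiar sandwich form below (a generic sketch; θ stands for the parameter being estimated and θ̂ for its point estimate, symbols introduced only for this illustration):

\hat{\theta} - E \;<\; \theta \;<\; \hat{\theta} + E, \qquad \text{with, for a mean and known } \sigma, \quad E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}.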

Area in Tails

Since the level of confidence is 1-alpha, the amount of area in the tails is alpha. There is a notation in statistics, Z(alpha), which means the z-score that has the specified area alpha in the right tail.

Examples:

  • Z(0.05) = 1.645 (the Z-score which has 0.05 to the right, and 0.4500 between 0 and it)
  • Z(0.10) = 1.282 (the Z-score which has 0.10 to the right, and 0.4000 between 0 and it).

As a shorthand notation, the () are usually dropped, and the probability is written as a subscript. The Greek letter alpha is used to represent the area in both tails for a confidence interval, and so alpha/2 will be the area in one tail.

Here are some common values:

Confidence Level   Area between 0 and z-score   Area in one tail (alpha/2)   z-score
50%                0.2500                       0.2500                       0.674
80%                0.4000                       0.1000                       1.282
90%                0.4500                       0.0500                       1.645
95%                0.4750                       0.0250                       1.960
98%                0.4900                       0.0100                       2.326
99%                0.4950                       0.0050                       2.576

Notice in the above table that the area between 0 and the z-score is simply one-half of the confidence level. So, if you need a confidence level that isn't given above, divide the confidence level by two, look up that area in the body of the Z-table, and read the z-score from the margins.
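If software is available instead of a table, the same z-scores come straight from the standard normal quantile function. A minimal sketch (assuming Python with SciPy; the confidence levels listed are just the ones from the table above):

from scipy.stats import norm

def z_critical(confidence_level):
    """z-score with area alpha/2 in the right tail, i.e. z(alpha/2)."""
    alpha = 1 - confidence_level
    return norm.ppf(1 - alpha / 2)   # inverse CDF of the standard normal

for level in [0.50, 0.80, 0.90, 0.95, 0.98, 0.99]:
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
# prints 0.674, 1.282, 1.645, 1.960, 2.326, 2.576 in turn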

Also notice - if you look at the Student's t table, the top row is a level of confidence, and the bottom row (the df = infinity row) gives the corresponding z-score. In fact, this is where I got the extra digit of accuracy from.



Random Sampling

Andrew F. Siegel, Michael R. Wagner, in Practical Business Statistics (Eighth Edition), 2022

A Sample Statistic and a Population Parameter

A sample statistic (or just statistic) is defined as any number computed from your sample data. Examples include the sample average, median, sample standard deviation, and percentiles. A statistic is a random variable because it is based on data obtained by random sampling, which is a random experiment. Therefore, a statistic is known and random.

A population parameter (or just parameter) is defined as any number computed for the entire population. Examples include the population mean and population standard deviation. A parameter is a fixed number because no randomness is involved. However, you will not usually have data available for the entire population. Therefore, a parameter is unknown and fixed.

There is often a natural correspondence between statistics and parameters. For each population parameter (a number you would like to know but cannot know exactly), there is a sample statistic computed from data that represents your best information about the unknown parameter. The description of such a sample statistic is called an estimator of the population parameter, and the actual number computed from the data is called an estimate of the population parameter. For example, the “sample average” is an estimator of the population mean, and in a particular case, the estimate might be “18.3.” The error of estimation is defined as the estimator (or estimate) minus the population parameter and is usually unknown.

An unbiased estimator is neither systematically too high nor too low compared with the corresponding population parameter. This is a desirable property for an estimator. Technically, an estimator is unbiased if its mean value (the mean of its sampling distribution) is equal to the population parameter.

Many commonly used statistical estimators are unbiased or approximately unbiased. For example, the sample average X¯ is an unbiased estimator of the population mean μ. Of course, for any given set of data, X¯ will (usually) be high or low relative to the population mean, μ. If you were to repeat the sampling process many times, computing a new X¯ for each sample, the results would average out close to μ and thus would not be systematically too high or too low.

The sample standard deviation S is (perhaps surprisingly) a biased estimator of the population standard deviation σ, although it is approximately unbiased. Its square, the sample variance S2, is an unbiased estimator of the population variance σ2. For a binomial situation, the sample proportion p is an unbiased estimator of the population proportion π.
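As a rough illustration of these bias statements (my own sketch, not from the chapter; the population values and sample size below are arbitrary), a short simulation compares the long-run averages of the sample average, S^2, and S with μ, σ^2, and σ:

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 5, 200_000       # arbitrary illustrative values
samples = rng.normal(mu, sigma, size=(reps, n))  # many independent samples of size n

xbar = samples.mean(axis=1)        # sample averages
s2 = samples.var(axis=1, ddof=1)   # sample variances S^2
s = np.sqrt(s2)                    # sample standard deviations S

print(xbar.mean())  # close to mu = 10: the sample average is unbiased
print(s2.mean())    # close to sigma^2 = 4: S^2 is unbiased
print(s.mean())     # noticeably below sigma = 2: S is biased (low) for small n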


URL: https://www.sciencedirect.com/science/article/pii/B9780128200254000087

The idea of interval estimates

Stephen C. Loftus, in Basic Statistics with R, 2022

12.2 Point and interval estimates

In statistics, we calculate sample statistics in order to estimate our population parameters. What we have seen so far are point estimates, or a single numeric value used to estimate the corresponding population parameter. The sample average is the point estimate for the population average μ. The sample variance s2 is the point estimate for the population variance σ2. The sample proportion is the point estimate for the population proportion p. All of these estimates represent our “best guess” at the value of the population parameter.

Our point estimates or sample statistics give us a fair bit of information, but as we know, sample statistics will vary from sample to sample. If we collected several samples, we would have a bunch of plausible values for our population parameter. It seems reasonable that under certain circumstances, we would want a range of plausible values as an estimate, rather than a single estimate. This type of estimate is exactly what interval estimates are able to give us. An interval estimate is an interval of possible values that estimates our unknown population parameter. Interval estimates are probably most familiar to us through their use in political polling, as approval ratings and election leads are often stated as an interval. In these settings, an interval estimate is commonly given in the form of a point estimate plus-or-minus a margin of error. The margin of error gives us an idea of how precise the point estimate is, expressing the amount of variability that may exist in our estimate due to uncontrollable error.

For example, prior to the 2017 Virginia gubernatorial election, Quinnipiac University conducted a poll and found that the Democratic candidate Ralph Northam led the Republican, Ed Gillespie, by 9 points plus-or-minus 3.7 points [35]. In this case, the point estimate for Northam's lead is 9 points and the margin of error is 3.7 points. This implied that in reality it would be plausible that Northam could lead the race by any number between 9 − 3.7 = 5.3 and 9 + 3.7 = 12.7 points.

Under ideal circumstances, our margin of error is able to give us an idea of how confident we can be in the estimate of our population parameter. Say we have two interval estimates where our point estimates are the same but our margins of error differ: 6±2 and 6±4. Our first interval estimate implies that the plausible values of our parameter are from 4 to 8, while the second implies a range of 2 to 10. Because the range of plausible values is smaller for the first interval estimate, we would be inclined to trust the results more. However, there are many things that go into our margin of error: the amount of variability inherent in our data, the sample size, and most importantly the probability that our interval is “right.”


URL: https://www.sciencedirect.com/science/article/pii/B9780128207888000249

Linear Regression Models

Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Second Edition), 2015

8.2.4 Properties of the Least-Squares Estimators for the Model Y = β0 + β1x + ɛ

We discussed in Chapter 4 the concept of the sampling distribution of sample statistics such as that of the sample mean. Similarly, knowledge of the distributional properties of the least-squares estimators βˆ0 and βˆ1 is necessary to allow any statistical inferences to be made about them. The following result gives the sampling distribution of the least-squares estimators.

Theorem 8.2.1

Let Y = β0 + β1x + ε be a simple linear regression model with ε ~ N(0, σ2), and let the errors εi associated with different observations yi (i = 1, …, N) be independent. Then

(a)

βˆ0 and βˆ1 have normal distributions.

(b)

The mean and variance are given by

E(\hat{\beta}_0) = \beta_0, \qquad \operatorname{Var}(\hat{\beta}_0) = \left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)\sigma^2,

and

E(\hat{\beta}_1) = \beta_1, \qquad \operatorname{Var}(\hat{\beta}_1) = \frac{\sigma^2}{S_{xx}},

where S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2. In particular, the least-squares estimators βˆ0 and βˆ1 are unbiased estimators of β0 and β1, respectively.

Proof. We know that

\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y}) = \frac{1}{S_{xx}} \left[ \sum_{i=1}^{n} (x_i - \bar{x}) Y_i - \bar{Y} \sum_{i=1}^{n} (x_i - \bar{x}) \right] = \frac{1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x}) Y_i,

where the last equality follows from the fact that \sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - n\bar{x} = 0. Because Y_i is normally distributed, the sum \frac{1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x}) Y_i is also normal. Furthermore,

E[\hat{\beta}_1] = \frac{1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x}) E[Y_i] = \frac{1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x})(\beta_0 + \beta_1 x_i) = \frac{\beta_0}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x}) + \frac{\beta_1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x}) x_i = \beta_1 \frac{1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x}) x_i = \beta_1 \frac{1}{S_{xx}} \left[ \sum_{i=1}^{n} x_i^2 - \bar{x} \sum_{i=1}^{n} x_i \right] = \beta_1 \frac{1}{S_{xx}} \left[ \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} \right] = \beta_1 \frac{1}{S_{xx}} S_{xx} = \beta_1.

For the variance we have,

\operatorname{Var}(\hat{\beta}_1) = \operatorname{Var}\left[\frac{1}{S_{xx}} \sum_{i=1}^{n} (x_i - \bar{x}) Y_i\right] = \frac{1}{S_{xx}^2} \sum_{i=1}^{n} (x_i - \bar{x})^2 \operatorname{Var}(Y_i) \;\text{(since the } Y_i\text{'s are independent)} = \sigma^2 \frac{1}{S_{xx}^2} \sum_{i=1}^{n} (x_i - \bar{x})^2 \;\text{(since } \operatorname{Var}(Y_i) = \operatorname{Var}(\beta_0 + \beta_1 x_i + \varepsilon_i) = \operatorname{Var}(\varepsilon_i) = \sigma^2\text{)} = \frac{\sigma^2}{S_{xx}}.

Note that both Y¯ and βˆ1 are normal random variables. It can be shown that they are also independent (see Exercise 8.3.3). Because βˆ0 = Y¯ − βˆ1x¯ is a linear combination of Y¯ and βˆ1, it is also normal. Now,

E[\hat{\beta}_0] = E[\bar{Y} - \hat{\beta}_1 \bar{x}] = E[\bar{Y}] - \bar{x} E[\hat{\beta}_1] = E\left[\frac{1}{n} \sum_{i=1}^{n} Y_i\right] - \bar{x}\beta_1 = \frac{1}{n} \sum_{i=1}^{n} (\beta_0 + \beta_1 x_i) - \bar{x}\beta_1 = \beta_0 + \bar{x}\beta_1 - \bar{x}\beta_1 = \beta_0.

The variance of βˆ0 is given by

\operatorname{Var}(\hat{\beta}_0) = \operatorname{Var}(\bar{Y} - \hat{\beta}_1 \bar{x}) = \operatorname{Var}(\bar{Y}) + \bar{x}^2 \operatorname{Var}(\hat{\beta}_1) \;\text{(since } \bar{Y} \text{ and } \hat{\beta}_1 \text{ are independent)} = \frac{\sigma^2}{n} + \bar{x}^2 \frac{\sigma^2}{S_{xx}} = \left(\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}\right)\sigma^2. \;\blacksquare
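As a numerical companion to Theorem 8.2.1 (my own sketch with made-up data, not part of the text), the least-squares estimates and the variance formulas above can be computed directly, with σ² replaced by the usual mean squared error:

import numpy as np

# made-up illustrative data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 3.8, 5.2, 5.9, 7.1])
n = len(x)

Sxx = np.sum(x**2) - np.sum(x)**2 / n
Sxy = np.sum(x * y) - np.sum(x) * np.sum(y) / n

b1 = Sxy / Sxx                 # beta1-hat, the least-squares slope
b0 = y.mean() - b1 * x.mean()  # beta0-hat, the least-squares intercept

resid = y - (b0 + b1 * x)
sigma2_hat = np.sum(resid**2) / (n - 2)  # MSE, an estimate of sigma^2

se_b1 = np.sqrt(sigma2_hat / Sxx)                          # sqrt of Var(beta1-hat)
se_b0 = np.sqrt((1 / n + x.mean()**2 / Sxx) * sigma2_hat)  # sqrt of Var(beta0-hat)
print(b0, b1, se_b0, se_b1)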

If an estimator θˆ is a linear combination of the sample observations and has a variance that is less than or equal to that of any other estimator that is also a linear combination of the sample observations, then θˆ is said to be a best linear unbiased estimator (BLUE) for θ. The following result states that among all unbiased estimators for β0 and β1 which are linear in Yi, the least-square estimators have the smallest variance.

Gauss-Markov Theorem

Theorem 8.2.2

Let Y = β0 + β1x + ε be the simple regression model such that for each xi fixed, each Yi is an observable random variable and each ε = εi, i = 1, 2, …, n is an unobservable random variable. Also, let the random variable εi be such that E[εi] = 0, Var(εi) = σ2 and Cov(εi, εj) = 0, if i ≠ j. Then the least-squares estimators for β0 and β1 are best linear unbiased estimators.

It is important to note that even when the error variances are not constant, there still can exist unbiased least-square estimators, but the least-squares estimators do not have minimum variance.


URL: https://www.sciencedirect.com/science/article/pii/B9780124171138000084


Hypothesis Testing and Confidence Intervals

T.R. Konold, X. Fan, in International Encyclopedia of Education (Third Edition), 2010

Test Statistics and Probability Estimates

In hypothesis testing, to assess the probability of observing values more extreme than a given sample statistic under H0, we translate the observed sample statistic value into a test statistic. For example, the test statistic for evaluating a single sample mean (X¯) when the population standard deviation (σ) is known is:

z_{\text{obs}} = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}

This zobs test statistic represents the distance between the sample mean and the hypothesized population mean in standard deviation units, where the standard deviation is the standard deviation of the sampling distribution of the mean (i.e., the standard error of the mean: σX¯ = σ/√n).

A larger absolute value of zobs is further away from the hypothesized population mean. The exact probability of observing a value of zobs or larger can be obtained from a normal distribution table.

Application of the z test statistic requires that the population standard deviation σ be known. When σ is unknown and must be estimated with the sample standard deviation (s), the estimated standard error of the mean is sX¯ = s/√n, and the test statistic becomes:

t_{\text{obs}} = \frac{\bar{X} - \mu}{s_{\bar{X}}}

In contrast to the single z distribution, there exists a family of t distributions, as defined by degrees of freedom (df) that are based on sample size (df = n–1). Figure 2 presents three t distributions with different df (df = 5, 15, or ∞). Because the t distributions have different shapes, probabilities beyond a given t value depend not only on the value of t, but also on the df on which that statistic is based. As sample size n increases, t distributions converge on the z distribution. Beyond df = 120, the difference between z and t distributions is negligible, and a t-test practically becomes a z-test.
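This convergence is easy to verify numerically (an illustration, not from the source, assuming SciPy is available): compare the two-tailed α = 0.05 critical values of t for increasing df with the corresponding z value of 1.96.

from scipy.stats import norm, t

for df in [5, 15, 30, 120, 1000]:
    # two-tailed critical value for alpha = 0.05 at this df
    print(df, round(t.ppf(0.975, df), 3))
print("z", round(norm.ppf(0.975), 3))   # 1.96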


Figure 2. Three t-distributions with df = 5, df = 15, and df = ∞.

Test statistics (e.g., F statistic, χ2 statistic) other than z and t may be needed as required by a statistical analysis. Regardless of the specific test statistic used in a research situation, the logic and procedure described here for hypothesis testing remain the same.

Decision Rules Based on Test Statistic

The discussion above focused on testing H0 by comparing the probability of obtaining sample statistic values more extreme than a given sample statistic, assuming H0 is true, against a predetermined threshold probability level (i.e., α = 0.05). In practice, this is done through a direct comparison of the observed sample test statistic (e.g., zobs, tobs, Fobs) against the critical value of the statistic (e.g., zcv, tcv, or Fcv) as determined by the predetermined threshold probability level α. For example, the critical value of t (i.e., tcv) associated with a given value of α and a given df is readily available from a table of t distributions. By comparing tobs against tcv, we can establish the following decision rule for testing H0:

If |tobs| ≥ |tcv| → reject H0
If |tobs| < |tcv| → fail to reject H0

This decision rule is a direct translation of our previous probability-based decision rule for testing H0:

If p(θˆ | H0) ≤ α → reject H0
If p(θˆ | H0) > α → fail to reject H0

For example, our educator researcher previously hypothesized a mean student achievement test score of 72 for a population (H0: μ = 72), and she would like to test this H0 against the nondirectional alternative hypothesis Ha: μ ≠ 72 at α = 0.05. From a random sample of 100 participants, the researcher obtained X¯ = 68 and s = 8 (i.e., sX¯ = s/√n = 0.8). The test statistic is:

t_{\text{obs}} = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{68 - 72}{0.8} = -5

As |tobs| = 5 exceeds tcv [t(0.975, df = 99) ≈ 1.98 for α = 0.05], H0: μ = 72 is rejected, as it would be very unlikely (p < 0.05) to observe a sample mean as far from the hypothesized value of 72 as X¯ = 68 if H0 were true.
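The same numbers can be reproduced in a few lines (a sketch; the critical-value and p-value calls are my additions and assume SciPy):

from scipy.stats import t

xbar, mu0, s, n = 68, 72, 8, 100
se = s / n**0.5                        # estimated standard error of the mean = 0.8
t_obs = (xbar - mu0) / se              # -5.0
t_cv = t.ppf(0.975, df=n - 1)          # two-tailed critical value at alpha = 0.05
p_value = 2 * t.cdf(-abs(t_obs), df=n - 1)
print(t_obs, t_cv, p_value)            # |t_obs| > t_cv, so H0 is rejected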

Either a nondirectional or directional alternative hypothesis may be evaluated. When a nondirectional alternative hypothesis is evaluated (i.e., Ha: μ ≠ K), the overall α (e.g., α = 0.05) is evenly divided between the two tails of the sampling distribution under H0, to accommodate the possibility that the direction of the difference between X¯ and μ can be either positive or negative, as shown in Figure 3 (two solid lines at the tails).


Figure 3. The α at distribution tails for nondirectional (two-tailed) and directional (upper tail) tests.

When a directional alternative hypothesis is used (e.g., Ha: μ > K), we only consider the critical value at one tail that corresponds to the hypothesized direction in Ha, and the overall α is located at this tail, as shown in Figure 3 (dashed line at the right tail for Ha: μ > K). Nondirectional and directional alternative hypotheses are often referred to as two-tailed and one-tailed tests, respectively. For a directional (one-tailed) test, because all α is located at one tail, the critical value of the test statistic will be smaller in absolute value than that in a nondirectional test. For example, for a z-test with α = 0.05, the critical values are ±1.96 for a two-tailed test (nondirectional test), and the critical value for a one-tailed (directional) test (Ha: μ > K) is only 1.65. This makes it easier for a one-tailed test to reject H0, as long as the test statistic is in the hypothesized direction. Because of this, if the test statistic is in the hypothesized direction, the one-tailed test has more statistical power (see description of statistical power below) than the two-tailed test. When the test statistic is not in the hypothesized direction, however, the test automatically fails to reject the H0.


URL: https://www.sciencedirect.com/science/article/pii/B9780080448947013373

Inferences on a Single Population

Donna L. Mohr, ... Rudolf J. Freund, in Statistical Methods (Fourth Edition), 2022

Solution to Example 4.1 Perceptions of Area

We can now solve the problem in Example 4.1 by providing a confidence interval for the mean exponent. We first calculate the sample statistics: y¯=0.9225 and s=0.1652. The t statistic is based on 24−1=23 degrees of freedom, and since we want a 95% confidence interval we use t0.05∕2=2.069 (rounded). The 0.95 confidence interval on μ is given by

0.9225 \pm \frac{(2.069)(0.165)}{\sqrt{24}} \quad\text{or}\quad 0.9225 \pm 0.070, \text{ or from } 0.8527 \text{ to } 0.9923.

Thus we are 95% confident that the true mean exponent is between 0.85 and 0.99, rounded to two decimal places. This seems to imply that, on the average, people tend to underestimate the relative areas.
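The interval is easy to reproduce from the quoted summary statistics (a sketch assuming SciPy; it uses the exact t quantile rather than the rounded 2.069):

from scipy.stats import t

n, ybar, s = 24, 0.9225, 0.1652
t_crit = t.ppf(0.975, df=n - 1)   # about 2.069
E = t_crit * s / n**0.5           # maximum error of the estimate, about 0.070
print(ybar - E, ybar + E)         # roughly 0.8527 to 0.9923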


URL: https://www.sciencedirect.com/science/article/pii/B9780128230435000047

Inference for two quantitative variables

Stephen C. Loftus, in Basic Statistics with R, 2022

18.2.4 Calculate the test statistic

We begin the process of calculating our test statistic for the test for correlations with the general formula for the test statistic.

t = \frac{\text{Sample Statistic} - \text{Null Hypothesis Value}}{\text{Standard Error}}.

Now in our previous tests we were able to fill in the various portions of this formula through our knowledge of our sample or the central limit theorem. However, for the sample correlation the central limit theorem does not hold. Does this mean that we have no test statistic that we can use? Not at all. It turns out that there exists a test statistic that directly relates our sample correlation and sample size to a specific—and familiar—probability distribution. Specifically, the test statistic for our test for correlations is

t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}.

In our speed and stopping distance example, our test statistic would be

t = \frac{0.8069}{\sqrt{\dfrac{1 - 0.8069^2}{50 - 2}}} = 9.464.
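As a quick arithmetic check (an illustration; r and n are just the values quoted above):

r, n = 0.8069, 50
t_stat = r / ((1 - r**2) / (n - 2))**0.5
print(round(t_stat, 3))   # about 9.464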


URL: https://www.sciencedirect.com/science/article/pii/B9780128207888000316

Hypothesis tests for two parameters

Stephen C. Loftus, in Basic Statistics with R, 2022

15.3.4 Calculate the test statistic

As with each of our hypothesis tests, we begin the process of calculating our test statistic with our general formula for the test statistic.

t = \frac{\text{Sample Statistic} - \text{Null Hypothesis Value}}{\text{Standard Error}}.

We will need to fill in each of these components from our data, our hypothesis, and knowledge about our sample statistics. First, we need to find the sample statistic that estimates the parameters—or combination of parameters—in our null hypothesis. Remember, our null hypothesis for the two-sample t-test for means is

H0:μ1−μ2=0.

So we need a combination of sample statistics that estimates μ1−μ2. We know that x¯1 estimates μ1 and x¯2 estimates μ2, so it should not be surprising that x¯1−x¯2 estimates μ1−μ2. Additionally, we know from our null hypothesis that our null value is 0—as it will be for all our two-sample tests—so our test statistic to the point is

t = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\text{Standard Error}}.

All that is missing is the standard error of our sample statistics. Similar to our two-sample test for proportions, we will turn to the central limit theorem to find our standard error. By the central limit theorem, we know that for either large enough sample sizes—n⩾30—or for Normally distributed data our sample mean will follow a Normal distribution, namely

\bar{x} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right).

With this result in mind, we can apply this result individually to both of our samples, finding that each sample mean will follow its own Normal distribution.

\bar{x}_1 \sim N\!\left(\mu_1, \frac{\sigma_1^2}{n_1}\right), \qquad \bar{x}_2 \sim N\!\left(\mu_2, \frac{\sigma_2^2}{n_2}\right).

However, we need the standard error of x¯1−x¯2. Based on what we know about combining Normal distributions, we find that x¯1−x¯2 also follows a Normal distribution, specifically

\bar{x}_1 - \bar{x}_2 \sim N\!\left(\mu_1 - \mu_2, \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right),

which would make the standard error of x¯1 − x¯2 equal to

\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.

How do we estimate this quantity, specifically σ1²/n1 + σ2²/n2? There are a couple of ways to do this, but it comes down to one simple question: Does σ1² = σ2²? The answer to this question will affect not only our standard error, and thus our test statistic, but also the distribution that the test statistic follows.

If the variances are equal, then we can combine their information into a single estimate for a common pooled population variance, or σp2. This is similar to how we pooled our information to get our proportion estimate in our two-sample test for proportions. If this is the case, we can estimate our pooled population variance σp2 using the pooled sample variance sp2, calculated using

s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}

where s12 and s22 are our sample variances for sample one and sample two, respectively, and n1 and n2 are our sample sizes. This pooled sample variance can be substituted for our pooled population variance in the standard error of x¯1−x¯2, with a little factoring making the standard error now

\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}.

We then substitute this standard error into the formula for our test statistic, making our test statistic for equal variances

t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}.

Our other option is if our variances are not equal, or σ1² ≠ σ2². If this is the case, we need two statistics that estimate σ1² and σ2² in order to use these statistics in our standard error. It seems reasonable that our two sample variances should estimate these population variances, so we can substitute these sample variances for our population variances. This makes our standard error

\text{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.

So, in this case, our test statistic for unequal variances is

t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}.

This of course brings up a question: how do we know whether the variances are equal? They are parameters, so of course unknown. There are formal hypothesis tests that examine whether variances are equal, but in practice we will use a standard rule of thumb. If one of the sample variances is more than four times larger than the other (that is, s1²/s2² > 4 or s1²/s2² < 1/4), we will assume that the variances are not equal. Otherwise, we will assume equivalent variances.

For our conformity example, we have all the various parts of our test statistic. We first look at the ratio of the variances to see whether we will conclude that the variances are equal or not. Since 19.087/27.855 = 0.6852 is between 1/4 and 4, we will say that the variances are equal and use that formula. Thus, we need to calculate our pooled variance sp²:

s_p^2 = \frac{(23 - 1) \times 19.087 + (22 - 1) \times 27.855}{23 + 22 - 2} = 23.37.

Now, we can plug this into our test statistic to get

t = \frac{14.217 - 9.955}{\sqrt{23.37\left(\frac{1}{23} + \frac{1}{22}\right)}} = 2.956.
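The same calculation in code, using only the summary numbers above (a sketch; with the raw conformity data, scipy.stats.ttest_ind with its default equal-variance pooling computes the same statistic):

xbar1, xbar2 = 14.217, 9.955   # sample means
s2_1, s2_2 = 19.087, 27.855    # sample variances
n1, n2 = 23, 22                # sample sizes

# pooled sample variance, about 23.37
sp2 = ((n1 - 1) * s2_1 + (n2 - 1) * s2_2) / (n1 + n2 - 2)
se = (sp2 * (1 / n1 + 1 / n2))**0.5
t_stat = (xbar1 - xbar2) / se
print(round(sp2, 2), round(t_stat, 3))   # about 23.37 and 2.956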


URL: https://www.sciencedirect.com/science/article/pii/B9780128207888000286

Simple linear regression

Stephen C. Loftus, in Basic Statistics with R, 2022

19.10.4 Calculate our test statistic

As with each of our previous hypothesis tests, we begin the process of calculating our test statistic with our general formula for the test statistic.

t = \frac{\text{Sample Statistic} - \text{Null Hypothesis Value}}{\text{Standard Error}}.

We will need to fill in each of these components from our data, our hypothesis, and knowledge about our sample statistics. First, we need to find the sample statistic that estimates the parameter in our null hypothesis. In the regression case, our null hypothesis was

H0:β1=0.

Thus we need a sample statistic calculated from our data that estimates β1. It seems fairly clear that the slope from our simple linear regression βˆ1 is a reasonable choice for our sample statistic. Additionally, looking at our null hypothesis we can see that our null hypothesis value is equal to 0. Thus, our test statistic at this stage is

t = \frac{\hat{\beta}_1 - 0}{\text{Standard Error}}.

All that remains is the standard error of our sample statistic, or s.e.(βˆ 1). It turns out that under certain conditions—we will discuss them shortly—the regression slope estimate βˆ1 follows a Normal distribution, specifically

\hat{\beta}_1 \sim N\!\left(\beta_1, \frac{\hat{\sigma}^2}{(n - 1)s_x^2}\right).

This implies that our standard error for βˆ1 will be s.e.(βˆ1) = √(σˆ2 / ((n − 1)sx²)), where σˆ2 is our mean squared error from our regression, n is our sample size, and sx² is the sample variance of our predictor. Thus our test statistic for this hypothesis test will be

t = \frac{\hat{\beta}_1}{\sqrt{\dfrac{\hat{\sigma}^2}{(n - 1)s_x^2}}}.

For our Old Faithful regression, plugging in the various components of our test statistic gives us

t = \frac{10.7296}{\sqrt{\dfrac{34.9755}{271 \times 1.3027}}} = 34.09.
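Plugging the quoted summary values into code (a sketch; the sample size n = 272 is inferred from the 271 appearing in the denominator above):

b1_hat = 10.7296   # estimated slope
mse = 34.9755      # mean squared error, i.e., sigma-hat squared
n = 272            # so that n - 1 = 271, matching the denominator above
sx2 = 1.3027       # sample variance of the predictor

se_b1 = (mse / ((n - 1) * sx2))**0.5
print(round(b1_hat / se_b1, 2))   # about 34.09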


URL: https://www.sciencedirect.com/science/article/pii/B9780128207888000328

Is sampling error the difference between a sample statistic and a population parameter?

Sampling error is the difference between a population parameter and a sample statistic used to estimate it. For example, the difference between a population mean and a sample mean is sampling error.

What is the difference between a sample statistic and a population parameter?

A parameter is a measure that describes the whole population. A statistic is a measure that describes the sample. You can use estimation or hypothesis testing to estimate how likely it is that a sample statistic differs from the population parameter.


What is the difference between a sample statistic and the corresponding population parameter called?

Sampling error is the difference between a sample statistic and the corresponding population parameter.