# Need discussion for BUS308 Statistics for Managers details & lecture below.

Read Lecture 2. React to the material in the lecture. Is there anything you found to be unclear? How could you use these ideas within your degree (business administration) area?

BUS308 Week 2 Lecture 2

Statistical Testing for Differences – Part 1

After reading this lecture, the student should know:

1. How statistical distributions are used in hypothesis testing.

2. How to interpret the F test (both options) produced by Excel

3. How to interpret the T-test produced by Excel

Overview

Lecture 1 introduced the logic of statistical testing using the hypothesis testing procedure.

It also mentioned that we will be looking at two different tests this week. The t-test is used to

determine if means differ, from either a standard or claim or from another group. The F-test is

used to examine variance differences between groups.

This lecture starts by looking at statistical distributions – they underlie the entire

statistical testing approach. They are kind of like the detective’s base belief that crimes are

committed for only a couple of reasons – money, vengeance, or love. The statistical distribution

that underlies each test assumes that statistical measures (such as the F value when comparing

variances and the t value when looking at means) follow a particular pattern, and this can be used

to make decisions.

While the underlying distributions differ for the different tests we will be looking at

throughout the course, they all have some basic similarities that allow us to examine the t

distribution and extrapolate from it to interpreting results based on other distributions.

Distributions. The basic logic for all statistical tests: If the null hypothesis claim is

correct, then the distribution of the statistical outcome will be distributed around a central value,

and larger and smaller values will be increasingly rare. At some point (and we define this as our

alpha value), we can say that the likelihood of getting a difference this large is extremely

unlikely and we will say that our results do not seem to come from a population that matches the

claims of the null hypothesis.

Note that this logic has several key elements:

1. The test is based on an assumption that the null hypothesis is correct. This gives us a

starting point, even if later proven wrong.

2. All sample results are turned into a statistic that matches the test selected (for

example, the F statistic when using the F-test, or the t-statistic when using the T-test.)

3. The calculated statistic is compared to a related statistical distribution to see how

likely an outcome we have.

4. The larger the test statistic, the more unlikely it is that the result matches or comes

from the population described by the null hypothesis claim.

We will demonstrate these ideas by looking at the questions being asked in this week’s

homework. We will show results of the related Excel tests, and discuss how to interpret the

output.

We need to remember that seeing different values (mean, variance, etc.) from different

samples does not tell us that the population parameters we are estimating are, in fact, different.

The one thing we know about sampling is that each sample will be a bit different. They

generally provide a “close enough” estimate to the population values of concern for decision and

action purposes. But, they are not an exact match. This difference is examined by the use of the

statistical tests, which tell us how much importance we should attach to observed differences.

Lecture Examples

The lectures for each week will also look at our class question of whether or not males

and females are paid equally for equal work. These additional analyses provide some different

clues on what the data is trying to tell us about company pay practices.

While your analysis will focus directly on the salary that males and females are being

paid, the lecture examples will use an alternate method of examining pay practices. Many

compensation professionals often use a relative pay measure called the “comparison-ratio,” or

compa-ratio, to examine pay patterns within the company.

Some background on this measure. Many companies use grades to group jobs of equal

value to the company into groups that have a similar pay range – the values that a company is

willing to pay employees for the job. (As strong a performer as a mail room clerk may be, they will

rarely be paid the same as the CEO.) Many companies will set the middle of this range, the

midpoint, as the average salary that the market pays to hire someone into the job. This is how

companies remain competitive in their hiring.

Now, compensation professionals will generally want to analyze how the company is

paying employees relative to these market rates (as summarized by the midpoint). One approach

is to divide each employee’s salary by its related midpoint. The outcome is the compa-ratio

which is considered an alternate measure of pay that eliminates the impact of different grades.

The compa-ratio reports pay as a ratio of the actual salary divided by the salary grade’s midpoint.

The compa-ratio shows if an employee is being paid more than the midpoint (measure’s

value > 1.0) or less than the midpoint (< 1.0). This measure allows us to look at salary
dispersion within a company without focusing on the exact dollar values. It allows a comparison
between what the company is paying and what the outside market is paying (which most
companies target as the midpoint of a salary range) for the jobs.
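The compa-ratio arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name `compa_ratio` and the salary and midpoint figures are hypothetical and do not come from the class data set.

```python
def compa_ratio(salary: float, midpoint: float) -> float:
    """Return the salary divided by its grade midpoint."""
    if midpoint <= 0:
        raise ValueError("midpoint must be positive")
    return salary / midpoint


# Hypothetical (salary, grade midpoint) pairs, in the same currency units.
employees = [(52_000, 48_000), (45_000, 48_000), (67_000, 67_000)]

for salary, midpoint in employees:
    ratio = compa_ratio(salary, midpoint)
    if ratio > 1.0:
        position = "above the midpoint"
    elif ratio < 1.0:
        position = "below the midpoint"
    else:
        position = "at the midpoint"
    print(f"salary={salary} midpoint={midpoint} compa-ratio={ratio:.3f} ({position})")
```

Because the ratio has no currency unit, employees in different grades can be compared on the same scale.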

The compa-ratio shows if employees are paid above or below the grade midpoint and it

can be used to see the dispersion pattern of pay. With equal pay, we would expect to see similar

distributions, variances and means, between males and females in this measure.

The lecture examples will cover the same statistical tests as the homework assignments

but will focus on the compa-ratio pay measure rather than salary. As such, the results presented

each week should be included and/or factored into your weekly conclusions on what the data has

told us about the answer to our question.

The first step in looking at whether males and females are paid equally would be to look

at the average pay of each. Given our sample is a random sample of the population of employees

(and, therefore considered to be representative of the population), the average salaries or average

compa-ratios (they measure related but not identical measures of pay) will give us an indication

of whether things are the same for each gender or not.

One issue in looking at averages is the variation within the groups. If both groups have

the same or very similar variation across the salaries then we test the averages for a difference

using one approach. If the group variances are significantly different, we use a slightly different

approach. So, the first step is to examine group variances. This is done with the F-test.

F-test

As noted, the F-test is used to compare variances to determine if the differences noted

could be from simple sampling error (also known as pure chance alone) or if the differences are

large enough to be considered statistically different. The F statistic is simply one group’s

variance divided by the other group’s variance. (When done by hand, it is traditional to have the

largest variance in the numerator, but this is not critical when Excel performs the test for us.) So,

if the variances are equal, then the result of one variance divided by the other would equal 1.0 –

this is the center of the F distribution. How about a situation where one variance equaled 4 and

the other equaled 5 (randomly picked numbers for this example)? If we divided the larger by the

smaller, we would get 5/4 = 1.25 while if we divided the smaller by the larger, we would get 0.8.

Note that these values are on each side of the center value of 1.00. This is what is meant by “two

tails” with the F-test – one tail of the distribution has values less than 1.0 while the other has

values greater than 1.0. Our value of F depends first on the variances (of course) and then on

how we do the division. The likelihood of these two variances coming from populations that

have the same variance does not depend upon which tail the result is in, but rather how likely it is

to see a difference from 1.00. This is given to us by the F-test p-value (probability value of

seeing a difference as large or larger than what we have if the null hypothesis is true).
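As a rough sketch of the variance-ratio idea, the short Python example below reproduces the 5/4 and 4/5 calculation and then computes an F statistic from two small samples. The helper name `f_statistic` and the two sample lists are made up for illustration; Excel is not involved.

```python
from statistics import variance  # sample variance (divides by n - 1)


def f_statistic(var_a: float, var_b: float) -> float:
    """F statistic: one group's variance divided by the other's."""
    return var_a / var_b


# The lecture's example variances of 4 and 5, divided both ways.
larger_over_smaller = f_statistic(5, 4)  # falls above the center value of 1.0
smaller_over_larger = f_statistic(4, 5)  # falls below the center value of 1.0
print(larger_over_smaller, smaller_over_larger)

# Hypothetical samples, used only to show the calculation from raw data.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
f_from_data = f_statistic(variance(group_a), variance(group_b))
print(f_from_data)
```

The two ratios are reciprocals of each other, which is why the same pair of variances can land in either tail depending on the order of division.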

One new concept introduced with the F-test is the idea of degrees of freedom (df). While

the technical explanation is somewhat tedious, we can understand the concept with a simple

example. If we have 5 numbers, for example: 1, 2, 3, 4, and 5; we also have a sum of them; in

this case 15. Now, assume we can change any of the numbers in the data set with the only

requirement being that the total must remain the same. How many of the numbers are we free to

change; or what is our degree of freedom in making changes? In this case, we can change any 4

of the values, as soon as we do so we automatically get the fifth value (whatever is needed to

make the sum equal to 15). Thus, to generalize, our df is the count we have minus 1 (equaling

n - 1). The formula n - 1 gives the degrees of freedom for each variable in the F-test. We will see this

idea in other statistical tests, each of which has its own formula to calculate it. The nice thing is

that Excel will give us this outcome without our needing to worry about it, and we rarely have to

actually use it in any of our work – but, it is technically part of most statistical tests.
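The "free to change" idea can be checked with a small Python sketch. The replacement values and the helper name `forced_last_value` are arbitrary choices for illustration.

```python
values = [1, 2, 3, 4, 5]
required_total = sum(values)  # 15, the total that must stay fixed


def forced_last_value(free_values, total):
    """Once the other values are chosen, the last one is whatever keeps the total."""
    return total - sum(free_values)


# Pick any four numbers we like (arbitrary example values)...
new_free_values = [10, 0, 7, -3]
# ...and the fifth is no longer ours to choose.
last_value = forced_last_value(new_free_values, required_total)

degrees_of_freedom = len(values) - 1  # n - 1
print(last_value, degrees_of_freedom)
```

Whatever four values are chosen, the fifth is determined, so only n - 1 of the n values are free.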

There are two versions of the F-test available for use. One is located in the Data ribbon

under the Analysis block in the Data Analysis link and is called F-Test Two-Sample for

Variances. The other is located in the Fx Statistical list (which is duplicated in the Formulas

ribbon under the More Functions option and the Statistical list) and is called simply F.test.

While both test variances, there is an important difference. The F-Test Two Sample for

Variances option provides some additional summary statistics (mean, variance, count) for each

sample, but only provides a one-tail test outcome. One-tail results, whether with the F-Test or

the T-test, are used to test a directional difference, such as when we want to know if one

variance is larger (or smaller) than the other. Since, in general, we are interested in the simpler

question of whether the variances are equal or not (without regard to which is larger), when

using this form to test for equality, we need to double the one-tail p-value to find the two-tail

p-value we need for our decision on rejecting the null hypothesis.

On the other hand, the F.test found in Fx or Formulas returns only the two-tail p-value;

enough for a decision on rejecting or failing to reject the null hypothesis of no difference but

nothing else. Technically, this is the version we should use when conducting our two-tail

questions in the homework, but (as noted) either can be used if we remember to double the

p-value for the one-tail outcome.

Example: Testing for Variance Equality

As mentioned above, it is often beneficial to start with looking at variance equality when

comparing groups. We need to start our analysis of equal pay for equal work by seeing if there is

even an issue to be concerned with. So, we have selected our random sample of 25 males and 25

females from our corporate population. (A couple of assumptions; the company exists in only

one location, and all our employees in the sample are exempt professionals or managers with at

least a bachelor’s college degree.)

Our initial question is: Are the male and female compa-ratio variances equal? (Note, if

they are, this would mean that the standard deviations of both groups are the same.) As with all

statistical tests, we will be using our samples to make judgements or inferences about the

population values. While the sample result values will differ, this difference may not be large

enough to show that the population values are not the same.

Question 1. One of the first things of interest to detectives is if the behavior of the

suspects differs from what they normally do. That is, whose behavior varies from the norm?

Relating this to our compa-ratio measures has us asking if the compa-ratio variances for males

and females are equal within the population. (In the homework, the question asks about salary

variances. The logic and approach for answering the salary-based question is the same as shown

below.)

Variance equality is tested using the F-Test. There are two versions of this test available

to us that we could use, and both will be shown below. Note that equal standard deviations do

not automatically mean that the means are close, it just tells us if the dispersion patterns are

similar. If similar, the means of each group can be considered equally reflective of the data. The

following focuses on just setting up the data for and performing the statistical test.

The following shows only the output for the six hypothesis testing steps. How the Excel

F-tests are set up is covered in Lecture 3 for this week.

Step 1: The question asked is whether the variances for males and females are equal. The

hypothesis statements for an equality test are shown below.

Ho: Male compa-ratio variance = Female compa-ratio variance

Ha: Male compa-ratio variance =/= Female compa-ratio variance

(Since the question asks about equality and not a directional difference, this is a two-tail

test. The Null must contain the names of the two variables involved (Male and Female),

the statistic being tested (variance), and the relationship sign (=). The alternate provides

the opposite view so that between them all possible outcomes are covered. We are only

concerned if the variances are equal, not whether one or the other is larger (or smaller).)

Step 2: We state our decision-making criteria here. It is: Alpha = 0.05 (This will be the

same for all statistical tests we perform in the class, and therefore the same in all

hypothesis set-ups.)

Step 3: The test, test statistic, and the reason for selecting the test are stated here. For this

example, we are using: F statistic and F-test for Variance. We use these as they are

designed to test variance equality.

Step 4: Our decision-making rule is presented here: Reject the null hypothesis if the p-

value is < alpha = 0.05.

(This step is also the same for every statistical test we will perform; it says we will reject

the null hypothesis if the probability of getting a result as large as what we see is less than

5% or a probability of 0.05.)

Note that these steps are set up before we even look at the data. While we may have set

up the data columns, we should not have done any analysis yet. These steps tell us how

we will make a decision from the results we get.

Step 5: Perform the analysis. This is the step where Excel performs the analysis and

produces output tables. The setting up of each Excel test is covered in Lecture 3; we are

primarily interested in how to interpret the results in this lecture.

Here is a screenshot of the results for both versions of the F-test. (Only one is needed for

the question.)

Step 6: Conclusions and Interpretation. This is where we interpret what the data is trying

to tell us.

Before moving on to interpreting these results, let’s look at what we have. The F-Test

Two-Sample for Variances output clearly has more information than the F.TEST. We

have the labels identifying each group as well as the mean, variance, and count

(Observations) for each group. The df, equaling (sample size - 1), is shown, as well as

the calculated F statistic, which equals the left group’s (Male, in this case) variance

divided by the right group’s (Female) variance. Note that Excel divides the variances in the

order that they were entered into the data entry box; for this example, the Males were

entered first.

The next two rows are critical for our decision making; but they are incomplete. They

show the one-tail values used in decision making. The P(F<=f) one-tail is the
p-value, or the probability of getting an F-value as large or larger than we have if the null
hypothesis is true. However, it is only a one-tail outcome, while we want a two-tail
outcome, since we only care about the variances being equal or not, not which one is

larger. So, the result as presented cannot be used directly. What we need to do will be

covered after we look at the F.TEST result.

The F.TEST gives less information, but it provides us with exactly what we need; the

two-tail p-value. For our data set, we have a 40% (rounded) chance of getting an F value

this large or larger purely by chance alone when we are looking at a two-tail outcome.

Note that this value, 0.39766 is twice the one-tail value of 0.19883 from the F-Test table.

This will always be true, the two-tail probabilities will be twice the one-tail values.

So, if we want to use the F-TEST Two-Sample for Variance tool, we need to double the

p-value before making our step 6 decisions.
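The doubling step can be written out as a tiny Python sketch using the two p-values quoted above. The function name is hypothetical, and the cap at 1.0 is an added safeguard of ours (a probability cannot exceed 1), not something Excel's output requires.

```python
def two_tail_from_one_tail(p_one_tail: float) -> float:
    """Double a one-tail p-value to get the two-tail p-value (capped at 1.0)."""
    return min(1.0, 2.0 * p_one_tail)


# One-tail value from the F-Test Two-Sample for Variances table in this lecture.
p_one_tail = 0.19883
p_two_tail = two_tail_from_one_tail(p_one_tail)
print(round(p_two_tail, 5))  # matches the F.TEST value of 0.39766
```

This is the same arithmetic as entering =2*(cell holding the one-tail p-value) in Excel.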

We are now ready to move on to what step 6 asks for. This step has several parts.

• What is the p-value: 0.3977 (our compa-ratio example result). This value equals

EITHER the F.TEST outcome or 2 times the F-Test one-tail result. (If the F-Test p-value is

in cell K15, you could enter =2*K15 to get the value desired.)

• What is your decision: REJ or Not reject the null? (If our p-value is < (less than)
0.05, we say REJ, if the p-value is > (larger than) 0.05, we say NOT reject. This is

what our decision rule says to do.) Our answer is for compa-ratio variances:

NOT Rej. This means we do not reject the claim made by the null hypothesis

and accept it as the most likely description of the variances within the population.

• Why? This line asks us to explain why we made the decision we did. Our compa-

ratio response is: The p-value is > (greater than) 0.05, and the decision rule is

reject if the p-value was < 0.05. (The answer here is simply why, based on the
reasoning shown above, you chose your REJ or NOT Rej choice.)

• What is your conclusion about the variances in the population for the male and

female salaries? This part asks us to translate the statistical decision into a clear

answer to the initial question (Are male and female compa-ratio variances equal?). Our

response: We do not have enough evidence to say that the variances differ in

the population. The variances are equal in the population. Had we rejected

the null hypothesis, we would have said the population variances differed.
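The step 4 decision rule applied in step 6 is simple enough to express as a short Python sketch. The function name `decide` is hypothetical; the alpha of 0.05 and the p-value of 0.3977 are the ones used in this example.

```python
ALPHA = 0.05  # the decision criterion used for every test in this class


def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Apply the decision rule: reject the null only if the p-value is below alpha."""
    return "REJ" if p_value < alpha else "NOT Rej"


# Two-tail p-value for the compa-ratio variance example.
variance_decision = decide(0.3977)
print(variance_decision)
```

Writing the rule down before looking at the data, as steps 2 and 4 require, means the same function would give "REJ" for any p-value under 0.05 without further judgment calls.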

Note that this question did not tell us anything about actual pay differences between the

genders. It did tell us that both groups are dispersed in a similar manner, and thus supported

some of the conclusions we drew from looking at the data last week.

Examples: Testing for Mean Equality

While we test for variance equality with an F test, we use the T-Test to test for mean

equality. The t-test also uses the degree of freedom (df) value in providing us with our

probability result; but again, Excel does the work for us. The t distribution is a bell-shaped curve

that is flatter and a bit more spread out than the normal curve we discussed last week. The center

is located at 0 (zero) and the tails (the negative and positive values) are symmetrical.

The t statistic for the testing of two means is basically: (Mean1 – Mean2)/standard error

estimate. (The standard error formula varies according to which type of t-test we are

performing.) Note that the t will be either positive or negative depending upon which mean is

larger. So, if we are interested in simply equal or not equal, it does not matter if we have a

positive or negative t value, only the size of the difference matters. As with the F-test, half of

our alpha goes in the positive tail and half goes in the negative tail when making our equality

decision.

We have two questions about means this week.

Question 2. The second question for this week asks about salary mean equality between males

and females. Again, the set up for this question is covered in Lecture 3, we are concerned here

with the interpretation of the results. (Note, the comments about each step made above apply to

this example as well, but they will not be repeated except for specific information related to the t-

test outcome.) Specific differences from the variance example of question 1 will be highlighted

with italics. Again, the results discussed with each step are shown in Lecture 2.

Since the question asks if male and female compa-ratios (salaries in the homework) are equal, we

have an equal versus non-equal hypothesis pair.

Step 1: Ho: Male compa-ratio mean = Female compa-ratio mean

Ha: Male compa-ratio mean =/= Female compa-ratio mean

Step 2: The decision criterion is constant: Alpha = 0.05

Step 3: t statistic and t-test, assuming equal variances. We use these as they are

designed to test mean equality, and we are assuming (and according to the F-test

have) equal variances.

Step 4: Again, our decision rule is the same: Reject the null hypothesis if the p-value is

< alpha = 0.05.

Step 5: Perform the analysis. Here is the screen shot for the results, using the same data

as with Question 1.

As with the F-test output, the t-test starts with the test name, the group names, and some

descriptive statistics. Line 4 starts with a new result, the Pooled Variance; this is a

weighted average of the sample variances since we are assuming that the related

population variances are equal. The next line, hypothesized Mean Difference, shows up

only if a value was entered in the data input box setting up the test (discussed in Lecture

3).

Next comes our friend degrees of freedom, which equal the sum of both sample sizes

minus 2 (or N1-1 + N2 -1). The calculated T value (similar to the calculated F value in

the F test output) comes next. Note that since we have a negative t Stat, it falls in the left

tail of the t distribution. This is important in one tail tests but not in two tail tests.

Following the calculated T value come the one and two tail decision points. The one tail

p-value is found in the P(T<=t) one-tail row followed by the T critical one-tail value.
The two tail results follow.

Step 6: Conclusions and Interpretation. This step has several parts.

What is the p-value? 0.571 (our rounded compa-ratio example result).

(Since we again have a two-tail test, we use the P(T<=t) two-tail result.)

What is your decision: REJ or Not reject the null? NOT Rej (our compa-ratio result)

Why? The p-value (0.571) is > (greater than) 0.05. (The compa-ratio result)

What is your conclusion about the means in the population for the male and female

salaries? We do not have enough evidence to say that the means differ in the

population. So, our conclusion is that the means are equal in the population. (Our

compa-ratio result.)

Question 3

The third question for this week asks about salary differences based on educational level

rather than gender. Since education is a legitimate reason to pay someone more, it will be

helpful to see if a graduate degree results in a higher average pay. Note that this question has a

directional focus (do employees with an advanced degree (degree code = 1) have higher average

salaries?). This means we must develop a directional set of hypothesis statements. We will use

the terms UnderG (for undergraduate degree code 0) and Grad (for graduate degree code 1) in

these statements. Again, the results discussed with each step are shown in Lecture 2.

Step 1: Ho: UnderG mean compa-ratio >= Grad mean compa-ratio

Ha: UnderG mean compa-ratio < Grad mean compa-ratio

(Note the way the inequalities are set up; since the question is whether degree 1 salary

means are larger (>), the question becomes the alternate hypothesis, as it does not contain an =

claim. These can be written with the Grad mean listed first but the arrow heads

must point to Grad showing an expectation that grad means are larger.)

Step 2: Alpha = 0.05 (Our constant decision criterion)

Step 3: t statistic and t-test assuming equal variances. We use these as they are

designed to test mean equality. (The variance equality assumption is part of the

question set-up.)

Step 4: Reject the null hypothesis if the p-value is < alpha = 0.05. (Our constant decision rule.)

Step 5: Perform the analysis. Here is a screen print for a T-test on the question of

whether the graduate and undergraduate degree compa-ratios means in the population are

equal or not. Since we are assuming equal variances, we are using the T-Test Two-Sample

Assuming Equal Variances form.

t-Test: Two-Sample Assuming Equal Variances

| | UnderG | Grad |
|---|---|---|
| Mean | 1.05172 | 1.07324 |
| Variance | 0.00581 | 0.005999 |
| Observations | 25 | 25 |
| Pooled Variance | 0.005904 | |
| Hypothesized Mean Difference | 0 | |
| df | 48 | |
| t Stat | -0.99016 | |
| P(T<=t) one-tail | 0.163529 | |
| t Critical one-tail | 1.677224 | |
| P(T<=t) two-tail | 0.327059 | |
| t Critical two-tail | 2.010635 | |

The table output is read exactly the same way as with the question 2 table with the

exception that we are interested in the one-tail outcome, so we use the (highlighted) one-

tail p-value row in our decision making.
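To see where the table's numbers come from, the following Python sketch recomputes the df, pooled variance, t Stat, and p-values from the summary rows of the table. It is an approximation under stated assumptions: the inputs are the rounded means and variances shown above, the function names are our own, and the tail area is found by numerically integrating the t density with Simpson's rule rather than by whatever method Excel uses internally.

```python
import math


def pooled_t_test(mean1, var1, n1, mean2, var2, n2):
    """Equal-variance two-sample t-test from summary statistics."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_error
    return df, pooled_var, t_stat


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def one_tail_p(t_stat, df, steps=2000):
    """Area in one tail beyond |t|, using Simpson's rule on [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary values copied from the UnderG and Grad columns of the table.
df, pooled_var, t_stat = pooled_t_test(1.05172, 0.00581, 25,
                                       1.07324, 0.005999, 25)
p_one = one_tail_p(t_stat, df)
p_two = 2 * p_one
print(df, round(pooled_var, 6), round(t_stat, 4), round(p_one, 3), round(p_two, 3))
```

Because the inputs are rounded, the results should agree with the Excel output only to about three or four decimal places.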

Step 6: Conclusions and Interpretation. This step has several parts.

• What is the p-value? 0.164 (our rounded compa-ratio example result).

(Since we now have a one-tail test, we use the P(T<=t) one-tail result.)

• Is the t value in the t-distribution tail indicated by the arrow in the Ha claim? Yes.

The t-value is negative, and the Ha arrow points to the left (or negative) tail

of the t distribution. (Since we only care about a difference in one direction, the

result must be consistent with the desired direction. Only large negative values

are of interest in this case/set-up, since our difference is calculated by (UnderG –

Grad); large negative values show a larger Grad salary. If we had said Grad <=
UnderG in the Null, the alternate arrow would have pointed to the right or positive
tail, and a positive t would have been needed.)

• What is your decision: REJ or Not reject the null? NOT Rej (our compa-ratio

result)

• Why? The p-value is > (greater than) 0.05. So, the sign does not matter in this

case, but it is in the correct or negative tail. (The compa-ratio result)

• What is your conclusion about the impact of education on average salaries? We

do not have enough information to suggest that graduate degree holders have a

higher average salary than undergraduate degree holders. (Our compa-ratio

result.)
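The one-tail decision has two parts, the tail check and the p-value check, and a short Python sketch can make both explicit. The function name and the `expected_sign` parameter are hypothetical; the t Stat and one-tail p-value are taken from the table above.

```python
ALPHA = 0.05


def one_tail_decision(t_stat, p_one_tail, expected_sign, alpha=ALPHA):
    """Check the tail named by Ha, then apply the usual p-value rule.

    expected_sign is -1 when Ha points to the negative tail and +1 when it
    points to the positive tail.
    """
    in_expected_tail = (t_stat < 0) if expected_sign < 0 else (t_stat > 0)
    reject = in_expected_tail and p_one_tail < alpha
    return in_expected_tail, ("REJ" if reject else "NOT Rej")


# Ha: UnderG mean < Grad mean, so (UnderG - Grad) should be negative.
in_tail, decision = one_tail_decision(-0.99016, 0.163529, expected_sign=-1)
print(in_tail, decision)
```

Here the t value is in the expected tail, but the p-value is above 0.05, so the null hypothesis is not rejected. Had the t value been in the wrong tail, the null would not be rejected no matter how small the one-tail p-value was.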

Question 4

While the week 1 salary results suggest that males and females are not paid the same, this

week’s compa-ratio tests still do not suggest any inequality. Gender Compa-ratio variances and

means are not significantly different. A somewhat surprising result was that graduate degree

holders did not have higher compa-ratios.

We still cannot answer our equal pay for equal work question, however, as we have not yet

developed a measure of pay for equal work. Compa-ratios do remove the impact of grades, but

too many other work-related variables still need to be examined.

Summary

The F and t tests are used to determine if, based upon random sample results, the

population parameters can reasonably be said to differ. The F-test looks for differences in

population variances, while the t-test examines population mean differences. Both tests are

performed as part of the hypothesis testing procedure and are always done in step 5.

Differences in sample results can be transformed into statistical distributions that allow us

to determine the probability or likelihood of getting a difference as large or larger than we found.

It is this transformation that allows us to make our decisions about the differences we see in the

results.

When either test is set-up using the Data | Analysis toolpak function, these tests will

provide summary sample descriptive statistics for the mean, variance, and count as well as the

calculated statistic, the critical value of the statistic, and the p-value. When set-up using Excel’s

Fx or Formula functions, only the p-value is returned.

For both tests, if the appropriate p-value is less than the specified alpha (always 0.05 in

this class), we reject the null hypothesis and say the alternate is the more likely description of the

population.

We can test for a simple difference (called a two-tail test) where it does not matter which

group has the larger value or we can use a directional test (called a one-tail test) where we are

concerned about which variable is larger (or smaller). The null and alternate hypotheses define

which difference we are looking for.

The t-test has three versions: equal variances, unequal variances, and paired. The paired

test is used when we have two measures on each subject (such as the salary and midpoint for

each employee). The F-test is used to help us decide if we need to use the equal or unequal

variance form of the t-test.

The Analysis toolpak F test reports only a one-tail outcome, so we need to double its p-value

when testing for simple variance differences. The Fx (or Formulas) F.TEST function returns only

the two-tail p-value.

Please ask your instructor if you have any questions about this material.

When you have finished with this lecture, please respond to Discussion Thread 2 for this

week with your initial response and responses to others over a couple of days.
