Need discussion for BUS308 Statistics for Managers details & lecture below.

Read Lecture 2. React to the material in the lecture. Is there anything you found to be unclear? How could you use these ideas within your degree (business administration) area?

BUS308 Week 2 Lecture 2

Statistical Testing for Differences – Part 1

After reading this lecture, the student should know:

1. How statistical distributions are used in hypothesis testing.
2. How to interpret the F-test (both options) produced by Excel.
3. How to interpret the t-test produced by Excel.

Overview

Lecture 1 introduced the logic of statistical testing using the hypothesis testing procedure.

It also mentioned that we will be looking at two different tests this week. The t-test is used to
determine if means differ, from either a standard or claim or from another group. The F-test is
used to examine variance differences between groups.

This lecture starts by looking at statistical distributions; they underlie the entire
statistical testing approach. They are kind of like the detective’s base belief that crimes are
committed for only a couple of reasons – money, vengeance, or love. The statistical distribution
that underlies each test assumes that statistical measures (such as the F value when comparing
variances and the t value when looking at means) follow a particular pattern, and this can be used
to make decisions.

While the underlying distributions differ for the different tests we will be looking at
throughout the course, they all have some basic similarities that allow us to examine the t
distribution and extrapolate from it to interpreting results based on other distributions.

Distributions. The basic logic for all statistical tests: If the null hypothesis claim is
correct, then the distribution of the statistical outcome will be distributed around a central value,
and larger and smaller values will be increasingly rare. At some point (and we define this as our
alpha value), we can say that getting a difference this large is extremely unlikely if the null
hypothesis is true, and we will say that our results do not seem to come from a population that
matches the claims of the null hypothesis.
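To make this logic concrete, here is a minimal simulation sketch. It assumes a made-up population (mean 1.0, standard deviation 0.08) and draws two samples of 25 from that same population many times, so the null hypothesis is true by construction. The population values, the seed, and the function names are illustrative assumptions, not course data.

```python
import random
import statistics

random.seed(308)  # fixed seed so the run is repeatable

def mean_difference_under_null(n=25):
    """Draw two samples from the SAME population; return the difference in their means."""
    a = [random.gauss(1.0, 0.08) for _ in range(n)]
    b = [random.gauss(1.0, 0.08) for _ in range(n)]
    return statistics.mean(a) - statistics.mean(b)

# Repeat the sampling many times to see how the differences are distributed.
diffs = [mean_difference_under_null() for _ in range(5000)]

def share_at_least(threshold):
    """Fraction of simulated differences at least this far from zero."""
    return sum(abs(d) >= threshold for d in diffs) / len(diffs)

for threshold in (0.01, 0.03, 0.05):
    print(f"|difference| >= {threshold}: {share_at_least(threshold):.3f}")
```

The differences cluster around zero, and the share of results beyond each threshold shrinks as the threshold grows. That shrinking share is what a p-value measures.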

Note that this logic has several key elements:

1. The test is based on an assumption that the null hypothesis is correct. This gives us a
starting point, even if later proven wrong.

2. All sample results are turned into a statistic that matches the test selected (for
example, the F statistic when using the F-test, or the t statistic when using the t-test).

3. The calculated statistic is compared to a related statistical distribution to see how
likely our outcome is.

4. The farther the test statistic falls from the center of its distribution, the more unlikely
it is that the result matches or comes from the population described by the null hypothesis claim.

We will demonstrate these ideas by looking at the questions being asked in this week’s
homework. We will show results of the related Excel tests, and discuss how to interpret the
output.

We need to remember that seeing different values (mean, variance, etc.) from different
samples does not tell us that the population parameters we are estimating are, in fact, different.
The one thing we know about sampling is that each sample will be a bit different. They
generally provide a “close enough” estimate to the population values of concern for decision and
action purposes. But, they are not an exact match. This difference is examined by the use of the
statistical tests, which tell us how much importance we should attach to observed differences.

Lecture Examples

The lectures for each week will also look at our class question of whether or not males
and females are paid equally for equal work. These additional analyses provide some different
clues on what the data is trying to tell us about company pay practices.

While your analysis will focus directly on the salary that males and females are being
paid, the lecture examples will use an alternate method of examining pay practices. Many
compensation professionals use a relative pay measure called the “comparison-ratio,” or
compa-ratio, to examine pay patterns within the company.

Some background on this measure. Many companies use grades to group jobs of equal
value to the company into groups that have a similar pay range – the values that a company is
willing to pay employees for the job. (However strong a performer a mail room clerk is, they will
rarely be paid the same as the CEO.) Many companies will set the middle of this range, the
midpoint, as the average salary that the market pays to hire someone into the job. This is how
companies remain competitive in their hiring.

Now, compensation professionals will generally want to analyze how the company is
paying employees relative to these market rates (as summarized by the midpoint). One approach
is to divide each employee’s salary by its related midpoint. The outcome is the compa-ratio
which is considered an alternate measure of pay that eliminates the impact of different grades.
The compa-ratio reports pay as a ratio of the actual salary divided by the salary grade’s midpoint.

The compa-ratio shows if an employee is being paid more than the midpoint (measure’s
value > 1.0) or less than the midpoint (< 1.0). This measure allows us to look at salary
dispersion within a company without focusing on the exact dollar values. It allows a
comparison between what the company is paying and what the outside market is paying (which
most companies target as the midpoint of a salary range) for the jobs.

The compa-ratio shows if employees are paid above or below the grade midpoint, and it
can be used to see the dispersion pattern of pay. With equal pay, we would expect to see similar
distributions, variances, and means for males and females in this measure.
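The compa-ratio calculation described above can be sketched in a few lines. The three employees, their salaries, and their midpoints are hypothetical values chosen only to show one result above, one below, and one at the midpoint.

```python
# Hypothetical employees: (label, annual salary, grade midpoint). Not course data.
employees = [
    ("A", 60000, 50000),
    ("B", 45000, 50000),
    ("C", 80000, 80000),
]

def compa_ratio(salary, midpoint):
    """Actual salary divided by the salary grade's midpoint."""
    return salary / midpoint

for label, salary, midpoint in employees:
    ratio = compa_ratio(salary, midpoint)
    if ratio > 1.0:
        position = "above midpoint"
    elif ratio < 1.0:
        position = "below midpoint"
    else:
        position = "at midpoint"
    print(label, round(ratio, 2), position)
```

Employees A and C earn very different dollar amounts, yet the ratio lets us compare them on the same scale regardless of grade.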

The lecture examples will cover the same statistical tests as the homework assignments
but will focus on the compa-ratio pay measure rather than salary. As such, the results presented
each week should be included and/or factored into your weekly conclusions on what the data has

The first step in looking at whether males and females are paid equally would be to look
at the average pay of each. Given our sample is a random sample of the population of employees
(and, therefore considered to be representative of the population), the average salaries or average
compa-ratios (they measure related but not identical measures of pay) will give us an indication
of whether things are the same for each gender or not.

One issue in looking at averages is the variation within the groups. If both groups have
the same or very similar variation across the salaries then we test the averages for a difference
using one approach. If the group variances are significantly different, we use a slightly
different approach. So, the first step is to examine group variances. This is done with the F-test.

F-test

As noted, the F-test is used to compare variances to determine if the differences noted
could be from simple sampling error (also known as pure chance alone) or if the differences are
large enough to be considered statistically different. The F statistic is simply one group’s
variance divided by the other group’s variance. (When done by hand, it is traditional to have the
largest variance in the numerator, but this is not critical when Excel performs the test for us.) So,
if the variances are equal, then the result of one variance divided by the other would equal 1.0 –
this is the center of the F distribution. How about a situation where one variance equaled 4 and
the other equaled 5 (randomly picked numbers for this example)? If we divided the larger by the
smaller, we would get 5/4 = 1.25 while if we divided the smaller by the larger, we would get 0.8.
Note that these values are on each side of the center value of 1.00. This is what is meant by “two
tails” with the F-test – one tail of the distribution has values less than 1.0 while the other has
values greater than 1.0. Our value of F depends first on the variances (of course) and then on
how we do the division. The likelihood of these two variances coming from populations that
have the same variance does not depend upon which tail the result is in, but rather how likely it is
to see a difference from 1.00. This is given to us by the F-test p-value (probability value of
seeing a difference as large or larger than what we have if the null hypothesis is true).
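The two ways of forming the F statistic can be checked directly. The sketch below uses the lecture's variances of 4 and 5, then computes the same kind of ratio from two small samples that are made up purely for illustration.

```python
import statistics

# The lecture's illustrative variances.
var_small, var_large = 4, 5

f_large_over_small = var_large / var_small   # lands in the right tail (> 1.0)
f_small_over_large = var_small / var_large   # lands in the left tail (< 1.0)
print(f_large_over_small, f_small_over_large)  # 1.25 0.8

# The same ratio from raw data; these two samples are made up for illustration.
group_a = [1.02, 0.95, 1.10, 1.07, 0.99]
group_b = [1.00, 1.04, 0.97, 1.12, 0.93]
f_stat = statistics.variance(group_a) / statistics.variance(group_b)
print(round(f_stat, 3))
```

Note that 1.25 and 0.8 are reciprocals of each other. They describe the same pair of variances, just viewed from opposite tails.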

One new concept introduced with the F-test is the idea of degrees of freedom (df). While
the technical explanation is somewhat tedious, we can understand the concept with a simple
example. If we have 5 numbers, for example 1, 2, 3, 4, and 5, we also have a sum of them, in
this case 15. Now, assume we can change any of the numbers in the data set with the only
requirement being that the total must remain the same. How many of the numbers are we free to
change; or what is our degree of freedom in making changes? In this case, we can change any 4
of the values; as soon as we do so, we automatically get the fifth value (whatever is needed to
make the sum equal to 15). Thus, to generalize, our df is the count we have minus 1 (equaling
n - 1). The formula n - 1 gives the degrees of freedom for each variable in the F-test. We will see this
idea in other statistical tests, each of which has its own formula to calculate it. The nice thing is
that Excel will give us this outcome without our needing to worry about it, and we rarely have to
actually use it in any of our work – but, it is technically part of most statistical tests.
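The degrees of freedom example can be traced step by step. In this small sketch the four replacement values are arbitrary choices; any four numbers would work the same way.

```python
values = [1, 2, 3, 4, 5]
required_total = sum(values)          # 15, the total that must be preserved

# Choose any four values freely (these replacements are arbitrary)...
free_choices = [10, -3, 7, 0]

# ...and the fifth is forced: it must be whatever restores the total.
forced_value = required_total - sum(free_choices)
print(forced_value)                   # 1

degrees_of_freedom = len(values) - 1  # n - 1
print(degrees_of_freedom)             # 4
```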

There are two versions of the F-test available for use. One is located in the Data ribbon
under the Analysis block in the Data Analysis link and is called F-Test Two-Sample for
Variances. The other is located in the Fx Statistical list (which is duplicated in the Formulas
ribbon under the More Functions option and the Statistical list) and is called simply F.TEST.

While both test variances, there is an important difference. The F-Test Two-Sample for
Variances option provides some additional summary statistics (mean, variance, count) for each
sample, but only provides a one-tail test outcome. One-tail results, whether with the F-test or
the t-test, are used to test a directional difference, such as whether one variance is larger
(or smaller) than the other. Since, in general, we are interested in the simpler
question of whether the variances are equal or not (without regard to which is larger), when
using this form to test for equality, we need to double the p-value to find the two-tail
p-value we need for our decision on rejecting the null hypothesis.

On the other hand, the F.TEST function found in Fx or Formulas returns only the two-tail p-value,
enough for a decision on rejecting or failing to reject the null hypothesis of no difference but
nothing else. Technically, this is the version we should use when conducting our two-tail
questions in the homework, but (as noted) either can be used if we remember to double the
p-value for the one-tail outcome.

Example: Testing for Variance Equality

As mentioned above, it is often beneficial to start with looking at variance equality when
comparing groups. We need to start our analysis of equal pay for equal work by seeing if there is
even an issue to be concerned with. So, we have selected our random sample of 25 males and 25
females from our corporate population. (A couple of assumptions: the company exists in only
one location, and all our employees in the sample are exempt professionals or managers with at
least a bachelor’s degree.)

Our initial question is: Are the male and female compa-ratio variances equal? (Note, if
they are, this would mean that the standard deviations of both groups are the same.) As with all
statistical tests, we will be using our samples to make judgements or inferences about the
population values. While the sample result values will differ, this difference may not be large
enough to show that the population values are not the same.

Question 1. One of the first things of interest to detectives is if the behavior of the
suspects differs from what they normally do. That is, whose behavior varies from the norm?
Relating this to our compa-ratio measures has us asking if the compa-ratio variances for males
and females are equal within the population. (In the homework, the question asks about salary
variances. The logic and approach for answering the salary-based question is the same as shown
below.)

Variance equality is tested using the F-test. There are two versions of this test available
to us that we could use, and both will be shown below. Note that equal standard deviations do
not automatically mean that the means are close; they just tell us that the dispersion patterns are
similar. If similar, the means of each group can be considered equally reflective of the data. The
following focuses on just setting up and performing the statistical test.

The following shows only the output for the six hypothesis testing steps. How the Excel
F-tests are set up is covered in Lecture 3 for this week.

Step 1: The question asked is whether the variances for males and females are equal. The
hypothesis statements for an equality test are shown below.

Ho: Male compa-ratio variance = Female compa-ratio variance

Ha: Male compa-ratio variance =/= Female compa-ratio variance

(Since the question asks about equality and not a directional difference, this is a two-tail
test. The Null must contain the names of the two variables involved (Male and Female),
the statistic being tested (variance), and the relationship sign (=). The alternate provides
the opposite view so that between them all possible outcomes are covered. We are only
concerned if the variances are equal, not whether one or the other is larger (or smaller).)

Step 2: We state our decision-making criteria here. It is: Alpha = 0.05 (This will be the
same for all statistical tests we perform in the class, and therefore the same in all
hypothesis set-ups.)

Step 3: The test, test statistic, and the reason for selecting the test are stated here. For this
example, we are using: F statistic and F-test for Variance. We use these as they are
designed to test variance equality.

Step 4: Our decision-making rule is presented here: Reject the null hypothesis if the p-
value is < alpha = 0.05.

(This step is also the same for every statistical test we will perform; it says we will reject
the null hypothesis if the probability of getting a result as large as what we see is less than
5% or a probability of 0.05.)

Note that these steps are set up before we even look at the data. While we may have set
up the data columns, we should not have done any analysis yet. These steps tell us how
we will make a decision from the results we get.
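Because the decision rule is fixed before any data is examined, it can be written down once and reused for every test. Here is a minimal sketch; the function name and the two sample p-values are illustrative assumptions.

```python
ALPHA = 0.05  # the decision criterion fixed in step 2

def decide(p_value, alpha=ALPHA):
    """Apply the step 4 rule: reject the null hypothesis only if p-value < alpha."""
    return "REJ" if p_value < alpha else "NOT Rej"

# Illustrative p-values only; no data has been analyzed at this point.
print(decide(0.03))  # REJ
print(decide(0.40))  # NOT Rej
```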

Step 5: Perform the analysis. This is the step where Excel performs the analysis and
produces output tables. The setting up of each Excel test is covered in Lecture 3; we are
primarily interested in how to interpret the results in this lecture.

Here is a screenshot of the results for both versions of the F-test. (Only one is needed for
the question.)

Step 6: Conclusions and Interpretation. This is where we interpret what the data is trying
to tell us.

Before moving on to interpreting these results, let’s look at what we have. The F-Test
Two-Sample for Variances output clearly has more information than the F.TEST. We
have the labels identifying each group as well as the mean, variance, and count
(Observations) for each group. The df, equaling the sample size minus 1, is shown as well as
the calculated F statistic, which equals the left group’s (Male in this case) variance
divided by the right group’s (Female) variance. Note that Excel divides the variances in the
order that they were entered into the data entry box; for this example the Males were
entered first.

The next two rows are critical for our decision making, but they are incomplete. They
show the one-tail p-value and the one-tail critical value used in decision making. The P(F<=f)
one-tail is the p-value, or the probability of getting an F value as large or larger than we have
if the null hypothesis is true. However, it is only a one-tail outcome, while we want a two-tail
outcome, since we only care about the variances being equal or not, not which one is
larger. So, the result as presented cannot be used directly. What we need to do will be
covered after we look at the F.TEST result.

The F.TEST gives less information, but it provides us with exactly what we need; the
two-tail p-value. For our data set, we have a 40% (rounded) chance of getting an F value
this large or larger purely by chance alone when we are looking at a two-tail outcome.
Note that this value, 0.39766, is twice the one-tail value of 0.19883 from the F-Test table.
This will always be true: the two-tail probabilities will be twice the one-tail values.

So, if we want to use the F-Test Two-Sample for Variances tool, we need to double the
p-value before making our step 6 decisions.
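The doubling step is simple arithmetic, shown here with the two p-values quoted above. The variable names are our own.

```python
one_tail_p = 0.19883           # P(F<=f) one-tail from the F-Test table in the lecture
two_tail_p = 2 * one_tail_p    # the value F.TEST reports directly

print(round(two_tail_p, 5))    # 0.39766
print(two_tail_p < 0.05)       # False, so the null hypothesis is not rejected
```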

We are now ready to move on to what step 6 asks for. This step has several parts.

• What is the p-value: 0.3977 (our compa-ratio example result). This value equals
EITHER the F.TEST outcome or 2 times the F-test result. (If the F-test p-value is
in cell K15, you could enter =2*K15 to get the value desired.)

• What is your decision: REJ or Not reject the null? (If our p-value is < (less than) 0.05,
we say REJ; if the p-value is > (larger than) 0.05, we say NOT reject. This is
what our decision rule says to do.) Our answer for compa-ratio variances is:
NOT Rej. This means we do not reject the claim made by the null hypothesis
and accept it as the most likely description of the variances within the population.

• Why? This line asks us to explain why we made the decision we did. Our compa-ratio
response is: The p-value is > (greater than) 0.05, and the decision rule is to
reject if the p-value is < 0.05. (The answer here is simply why, based on the
reasoning shown above, you chose your REJ or NOT Rej choice.)

• What is your conclusion about the variances in the population for the male and
female salaries? This part asks us to translate the statistical decision into a clear
answer to the initial question (Are male and female compa-ratio variances equal?) Our
response: We do not have enough evidence to say that the variances differ in
the population, so we treat the variances as equal in the population. Had we rejected
the null hypothesis, we would have said the population variances differed.

Note that this question did not tell us anything about actual pay differences between the
genders. It did tell us that both groups are dispersed in a similar manner, and thus supported
some of the conclusions we drew from looking at the data last week.

Examples: Testing for Mean Equality

While we test for variance equality with an F-test, we use the t-test to test for mean
equality. The t-test also uses the degrees of freedom (df) value in providing us with our
probability result; but again, Excel does the work for us. The t distribution is a bell-shaped curve
that is flatter and a bit more spread out than the normal curve we discussed last week. The center
is located at 0 (zero) and the tails (the negative and positive values) are symmetrical.

The t statistic for the testing of two means is basically: (Mean1 – Mean2)/standard error
estimate. (The standard error formula varies according to which type of t-test we are
performing.) Note that the t will be either positive or negative depending upon which mean is
larger. So, if we are interested in simply equal or not equal, it does not matter if we have a
positive or negative t value, only the size of the difference matters. As with the F-test, half of
our alpha goes in the positive tail and half goes in the negative tail when making our equality
decision.
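The sign behavior of the t statistic can be seen with a short sketch. The means and the standard error below are hypothetical values chosen only to show that swapping the order of the groups flips the sign but not the size.

```python
def t_statistic(mean_1, mean_2, standard_error):
    """(Mean1 - Mean2) / standard error estimate."""
    return (mean_1 - mean_2) / standard_error

# Hypothetical means and standard error, chosen only to show the sign behavior.
t_forward = t_statistic(1.05, 1.07, 0.02)
t_reverse = t_statistic(1.07, 1.05, 0.02)

print(round(t_forward, 2), round(t_reverse, 2))  # -1.0 1.0
```

For an equal versus not equal question, both results lead to the same decision because only the distance from zero matters.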

We have two questions about means this week.

Question 2. The second question for this week asks about salary mean equality between males
and females. Again, the set-up for this question is covered in Lecture 3; we are concerned here
with interpreting the results. (The step explanations from question 1 apply to this example as
well, but they will not be repeated except for specific information related to the t-test
outcome.) Specific differences from the variance example of question 1 will be highlighted
with italics. Again, the results discussed with each step are shown in Lecture 2.

Since the question asks if male and female compa-ratios (salaries in the homework) are equal, we
have an equal versus non-equal hypothesis pair.

Step 1: Ho: Male compa-ratio mean = Female compa-ratio mean

Ha: Male compa-ratio mean =/= Female compa-ratio mean

Step 2: The decision criterion is constant: Alpha = 0.05

Step 3: t statistic and t-test, assuming equal variances. We use these as they are
designed to test mean equality, and we are assuming (and according to the F-test
have) equal variances.

Step 4: Again, our decision rule is the same: Reject the null hypothesis if the p-value is
< alpha = 0.05.

Step 5: Perform the analysis. Here is the screen shot for the results, using the same data
as with Question 1.

As with the F-test output, the t-test starts with the test name, the group names, and some
descriptive statistics. Line 4 starts with a new result, the Pooled Variance; this is a
weighted average of the sample variances since we are assuming that the related
population variances are equal. The next line, Hypothesized Mean Difference, shows up
only if a value was entered in the data input box setting up the test (discussed in Lecture
3).

Next comes our friend degrees of freedom, which equal the sum of both sample sizes
minus 2 (or (N1 - 1) + (N2 - 1)). The calculated t value (similar to the calculated F value in
the F-test output) comes next. Note that since we have a negative t Stat, it falls in the left
tail of the t distribution. This is important in one-tail tests but not in two-tail tests.
Following the calculated t value come the one- and two-tail decision points. The one-tail
p-value is found in the P(T<=t) one-tail row, followed by the t Critical one-tail value. The
two-tail results follow.

Step 6: Conclusions and Interpretation. This step has several parts.

What is the p-value? 0.571 (our rounded compa-ratio example result).

(Since we again have a two-tail test, we use the P(T<=t) two-tail result.)

What is your decision: REJ or Not reject the null? NOT Rej (our compa-ratio result)

Why? The p-value (0.571) is > (greater than) 0.05. (The compa-ratio result)

What is your conclusion about the means in the population for the male and female
salaries? We do not have enough evidence to say that the means differ in the
population. So, our conclusion is that the means are equal in the population. (Our
compa-ratio result.)

Question 3

The third question for this week asks about salary differences based on educational level
rather than gender. Since education is a legitimate reason to pay someone more, it will be
helpful to see if a graduate degree results in a higher average pay. Note that this question has a
directional focus (do employees with an advanced degree (degree code = 1) have higher average
salaries?). This means we must develop a directional set of hypothesis statements. We will use
these statements. Again, the results discussed with each step are shown in Lecture 2.

Step 1: Ho: UnderG mean compa-ratio >= Grad mean compa-ratio

Ha: UnderG mean compa-ratio < Grad mean compa-ratio

(Note the way the inequalities are set up; since the question asks whether degree 1 means
are larger (>), the question becomes the alternate hypothesis, as it does not contain an =
claim. These can be written with the Grad mean listed first, but the open side of the
inequality sign must face Grad, showing an expectation that grad means are larger.)

Step 2: Alpha = 0.05 (Our constant decision criterion)

Step 3: t statistic and t-test assuming equal variances. We use these as they are
designed to test mean equality. (The variance equality assumption is part of the
question set-up.)

Step 4: Reject the null hypothesis if the p-value is < alpha = 0.05. (Our constant decision rule.)

Step 5: Perform the analysis. Here is a screen print for a t-test on the question of
whether the graduate and undergraduate degree compa-ratio means in the population are
equal or not. Since we are assuming equal variances, we are using the t-Test: Two-Sample
Assuming Equal Variances form.

t-Test: Two-Sample Assuming Equal Variances

Mean                            1.05172     1.07324
Variance                        0.00581     0.005999
Observations                    25          25
Pooled Variance                 0.005904
Hypothesized Mean Difference    0
df                              48
t Stat                          -0.99016
P(T<=t) one-tail                0.163529
t Critical one-tail             1.677224
P(T<=t) two-tail                0.327059
t Critical two-tail             2.010635

The table output is read exactly the same way as with the question 2 table with the
exception that we are interested in the one-tail outcome, so we use the (highlighted) one-
tail p-value row in our decision making.
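As a check on how the rows of this output relate to each other, the pooled variance, df, and t Stat can be recomputed from the means, variances, and counts shown in the table. This is a sketch of the standard equal-variance formulas, not a description of Excel's internal method, and the variable names are our own.

```python
import math

# Summary statistics copied from the Excel output above.
mean_1, mean_2 = 1.05172, 1.07324
var_1, var_2 = 0.00581, 0.005999
n_1, n_2 = 25, 25

# Degrees of freedom: (N1 - 1) + (N2 - 1).
df = (n_1 - 1) + (n_2 - 1)

# Pooled variance: average of the sample variances, weighted by each group's df.
pooled_variance = ((n_1 - 1) * var_1 + (n_2 - 1) * var_2) / df

# Standard error of the difference between the two means.
standard_error = math.sqrt(pooled_variance * (1 / n_1 + 1 / n_2))

t_stat = (mean_1 - mean_2) / standard_error

print(df, pooled_variance, t_stat)
```

The recomputed values agree with the table's df of 48, Pooled Variance of about 0.0059, and t Stat of about -0.990, up to rounding in the displayed inputs.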

Step 6: Conclusions and Interpretation. This step has several parts.

• What is the p-value? 0.164 (our rounded compa-ratio example result).

(Since we have a one-tail test, we use the P(T<=t) one-tail result.)

• Is the t value in the t-distribution tail indicated by the arrow in the Ha claim? Yes.
The t-value is negative, and the Ha arrow points to the left (or negative) tail
of the t distribution. (Since we only care about a difference in one direction, the
result must be consistent with the desired direction. Only large negative values
are of interest in this case/set-up, since our difference is calculated by (UnderG –
Grad); large negative values show a larger Grad salary. If we had said Grad <=
UnderG in the Null, the alternate arrow would have pointed to the right or positive
tail, and a positive t would have been needed.)

• What is your decision: REJ or Not reject the null? NOT Rej (our compa-ratio
result)

• Why? The p-value is > (greater than) 0.05. With a p-value this large, the sign of t
does not change the decision, although t does fall in the correct (negative) tail.
(The compa-ratio result)

• What is your conclusion about the impact of education on average salaries? We
do not have enough information to suggest that graduate degree holders have a
higher average salary than undergraduate degree holders. (Our compa-ratio
result.)
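The one-tail decision in the bullets above has two conditions: the p-value must be below alpha, and the t value must fall in the tail named by the alternate hypothesis. Here is a minimal sketch of that combined check; the function name and the `expected_sign` parameter are hypothetical conveniences.

```python
ALPHA = 0.05

def one_tail_decision(t_stat, one_tail_p, expected_sign, alpha=ALPHA):
    """Reject only if the p-value is below alpha AND t falls in the tail named by Ha.

    expected_sign is -1 when Ha points to the left (negative) tail, +1 for the right tail.
    """
    in_expected_tail = (t_stat < 0) if expected_sign < 0 else (t_stat > 0)
    if one_tail_p < alpha and in_expected_tail:
        return "REJ"
    return "NOT Rej"

# Values from the Excel output for UnderG vs. Grad; Ha: UnderG mean < Grad mean.
decision = one_tail_decision(t_stat=-0.99016, one_tail_p=0.163529, expected_sign=-1)
print(decision)  # NOT Rej
```

Had the p-value been small but the t value positive, the function would still return NOT Rej, since the result would point in the opposite direction from the one Ha claims.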

Question 4

While the week 1 salary results suggest that males and females are not paid the same, this
week’s compa-ratio tests still do not suggest any inequality. Gender compa-ratio variances and
means are not significantly different. A somewhat surprising result was that graduate degree
holders did not have higher compa-ratios.

We still cannot answer our equal pay for equal work question, however, as we have not yet
developed a measure of pay for equal work. Compa-ratios do remove the impact of grades, but
too many other work-related variables still need to be examined.

Summary

The F and t tests are used to determine if, based upon random sample results, the
population parameters can reasonably be said to differ. The F-test looks for differences in
population variances, while the t-test examines population mean differences. Both tests are
performed as part of the hypothesis testing procedure and are always done in step 5.

Differences in sample results can be transformed into statistical distributions that allow us
to determine the probability or likelihood of getting a difference as large or larger than we found.
It is this transformation that allows us to make our decisions about the differences we see in the
results.

When either test is set up using the Data | Analysis ToolPak function, these tests will
provide summary sample descriptive statistics for the mean, variance, and count as well as the
calculated statistic, the critical value of the statistic, and the p-value. When set up using Excel’s
Fx or Formula functions, only the p-value is returned.

For both tests, if the appropriate p-value is less than the specified alpha (always 0.05 in
this class), we reject the null hypothesis and say the alternate is the more likely description of the
population.

We can test for a simple difference (called a two-tail test) where it does not matter which
group has the larger value or we can use a directional test (called a one-tail test) where we are
concerned about which variable is larger (or smaller). The null and alternate hypotheses define
which difference we are looking for.

The t-test has three versions: equal variances, unequal variances, and paired. The paired
test is used when we have two measures on each subject (such as the salary and midpoint for
each employee). The F-test is used to help us decide if we need to use the equal or unequal
variance form of the t-test.
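The choice among the three t-test versions can be summarized as a small decision helper. This is a sketch of the selection logic as the lecture describes it; the function and its parameters are hypothetical, and the p-value used is the two-tail F-test result from our compa-ratio example.

```python
ALPHA = 0.05

def choose_t_test(paired, f_test_two_tail_p=None, alpha=ALPHA):
    """Pick a t-test form following the lecture's description (a hypothetical helper)."""
    if paired:
        # Two measures on each subject, such as salary and midpoint.
        return "paired"
    # A small F-test p-value means the variances are judged significantly different.
    if f_test_two_tail_p < alpha:
        return "unequal variances"
    return "equal variances"

print(choose_t_test(paired=False, f_test_two_tail_p=0.39766))  # equal variances
print(choose_t_test(paired=True))                               # paired
```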

The Analysis ToolPak F-test defaults to a one-tail test, so we need to double its p-value
when testing for simple variance differences. The Fx (or Formula) F.TEST returns only the
two-tail p-value.

When you have finished with this lecture, please respond to Discussion Thread 2 for this
week with your initial response and responses to others over a couple of days.
