SECTION 1
Any study requires data, which can be qualitative or quantitative in nature. Qualitative data is the type of data that describes characteristics of the factor of interest. These factors are known as variables, and their values change from time to time depending on the situation. Similarly, quantitative data is the type of data that contains numerical values of a variable. A variable that contains qualitative data is known as a qualitative variable, and one that contains quantitative data is known as a quantitative variable. An example of a qualitative variable is the educational level of people, and an example of a quantitative variable is the age of people.
Quantitative variables are of two types: discrete and continuous. Variables whose values are countable are discrete variables, and variables that can take any value within an interval are continuous variables. An example of a discrete variable is the age of people recorded in completed years, and an example of a continuous variable is the weight of people.
A dataset is a collection of information or data on some variables of interest. Datasets can be summarized in different ways depending on the types of variables involved. Two qualitative variables can be compared by evaluating the proportions of the qualities in the different groups and comparing them. A qualitative and a quantitative variable can be summarized and compared by evaluating the means of the quantitative variable in the different groups of qualities and comparing them. Two quantitative variables can be compared with a scatterplot, which shows the nature of the relationship between them.
These comparisons are usually done with statistical software, which makes the process faster and requires less effort.
SECTION 2
a)
When distance increases, selling price decreases. The relationship between distance and selling price is
y = -0.2004x + 20018, where x is the distance travelled (km) and y is the selling price ($).
b) The estimated selling price of a car that has travelled 30,000 km = -0.2004 * 30000 + 20018 = $14,006.
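The calculation in part (b) can be checked with a few lines of Python. This is only a sketch: the function name `predict_price` and the constant names are illustrative choices, while the slope and intercept are taken from the fitted line in part (a).

```python
# Fitted line from part (a): price = SLOPE * distance + INTERCEPT
SLOPE = -0.2004      # change in selling price ($) per km travelled
INTERCEPT = 20018.0  # estimated selling price ($) at 0 km


def predict_price(distance_km: float) -> float:
    """Return the estimated selling price ($) for a given distance (km)."""
    return SLOPE * distance_km + INTERCEPT


print(round(predict_price(30_000)))  # 14006
```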
c) For all the 10,000 estimates,
Average = 14000
Standard Deviation = 392
Required z-score = (14006 – 14000) / 392 = 0.02
d) P (Z < 0.02) = 0.50798
e) For sample 231,
Expected Rank = P (Z < z-score) * 10000 = 0.50798 * 10000 = 5080
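The steps in parts (c) to (e) (z-score, cumulative probability, expected rank) follow one pattern, which can be sketched as below. The helper name `expected_rank` is hypothetical, and rounding the z-score to two decimal places is an assumption made to mirror the table lookup used in the text.

```python
from statistics import NormalDist


def expected_rank(estimate, mean, sd, n_estimates):
    """Return (z-score, P(Z < z), expected rank) for one estimate."""
    # Round z to 2 decimal places, as when reading a standard normal table.
    z = round((estimate - mean) / sd, 2)
    p = NormalDist().cdf(z)  # cumulative probability under N(0, 1)
    return z, p, round(p * n_estimates)


z, p, rank = expected_rank(14006, 14000, 392, 10_000)
print(z, round(p, 5), rank)
```

With the Section 2 figures this gives a rank of 5080. The same helper can be applied wherever an estimate is compared with the average and standard deviation of all estimates.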
SECTION 3
a)
Sample: 231
Count of "Which version? (A or B)"

Row Labels  | n  | y   | Grand Total
A           | 3  | 90  | 93
B           | 24 | 94  | 118
Grand Total | 27 | 184 | 211

Sample: 231
Count of "Which version? (A or B)", as a percentage of each row total

Row Labels  | n      | y      | Grand Total
A           | 3.23%  | 96.77% | 100.00%
B           | 20.34% | 79.66% | 100.00%
Grand Total | 12.80% | 87.20% | 100.00%
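The percentages in the second table of part (a) are the counts in the first table divided by their row totals. A minimal sketch of that step, with the counts for sample 231 typed in by hand and a hypothetical helper name `row_percentages`:

```python
# Counts for sample 231, keyed by version and then by response (n or y).
counts = {
    "A": {"n": 3, "y": 90},
    "B": {"n": 24, "y": 94},
}


def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {label: 100 * c / total for label, c in cells.items()}
    return result


pct = row_percentages(counts)
for version, cells in pct.items():
    print(version, {label: f"{value:.2f}%" for label, value in cells.items()})
```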
b)
c) Version A is preferred over version B: 96.77% of the version A group responded y, compared with 79.66% of the version B group.
d) For the selected sample 231, the estimated difference in the proportions of preference = (0.9677 – 0.7966) = 0.171
- Total number of samples = 1000
Selected sample number = 231
Average = 0.1
Standard deviation = 0.0505
Z-score = (0.171 – 0.1) / 0.0505 = 1.41
- P (Z < 1.41) = 0.9207
- Expected rank for sample 231 = P (Z < z-score) * 1000 = 0.9207 * 1000 = 921
e)
- Let p1 be the proportion of people who prefer version A and p2 be the proportion of people who prefer version B. Therefore,
H0: p1 – p2 = 0
H1: p1 – p2 ≠ 0
- The required p-value is 0.0002.
- Since the p-value is below the conventional 0.05 significance level, H0 is rejected.
- The difference between the two proportions is statistically significant, and thus the proportions are not equal to each other.
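The text does not say which procedure produced the p-value in part (e). One common choice is a pooled two-proportion z-test, sketched below under that assumption. The function name `two_proportion_z_test` is illustrative, and the counts are those of sample 231.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for H0: p1 - p2 = 0.

    x1, x2 are the numbers of "y" responses; n1, n2 are the group sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_proportion_z_test(90, 93, 94, 118)
print(round(z, 2), round(p_value, 4))
```

Under this assumption the p-value rounds to 0.0002, which agrees with the value reported above.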
SECTION 4
a)
Sample: 231

Row Labels  | Count of which machine? (A or B) | Average of $ Casino profit from bet | StdDev of $ Casino profit from bet
A           | 103                              | 0.184466019                         | 4.519560045
B           | 97                               | 0.164948454                         | 1.351545507
Grand Total | 200                              | 0.175                               | 3.369143905
b) The average casino profit per bet from machine A is $0.18 and from machine B is $0.16, but the variation in profit from machine A is much higher (standard deviation $4.52) than that from machine B ($1.35). Thus, machine B is more reliable than machine A in terms of profit, as its profit is more predictable, even though the average profit from machine A is higher.
c)
- For the selected sample 231, the estimated difference in sample means = (0.18 – 0.16) = 0.02
- Total number of samples = 1000
Selected sample number = 231
Average = 0.4
Standard deviation = 0.46
Z-score = (0.02 – 0.4) / 0.46 = -0.83
- P (Z < -0.83) = 0.2033
- Expected rank for sample 231 = P (Z < z-score) * 1000 = 0.2033 * 1000 = 203
d)
- Let µ1 be the mean profit from machine A and µ2 be the mean profit from machine B. Therefore,
H0: µ1 – µ2 = 0
H1: µ1 – µ2 ≠ 0
- The required p-value is 0.97.
- Since the p-value is large, H0 is not rejected.
- There is no statistically significant evidence that the average profits from machine A and machine B differ.
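The p-value in part (d) can be approximated from the summary statistics in part (a). The sketch below assumes an unpooled two-sample test with a normal approximation, which is reasonable for groups of about 100 observations. The text does not state which test was actually used, and the function name `two_mean_z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z_test(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided large-sample test for H0: mu1 - mu2 = 0 (unpooled)."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)  # standard error of the difference
    z = (mean1 - mean2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p_value


# Summary statistics for groups A and B in sample 231.
se, z, p_value = two_mean_z_test(
    0.184466019, 4.519560045, 103,
    0.164948454, 1.351545507, 97,
)
print(round(se, 3), round(z, 3), round(p_value, 2))
```

Under these assumptions the p-value rounds to 0.97, in line with the value reported above.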
SECTION 5
The back-to-back histogram shows the number of students and the number of administrators that fall into each of the distinct categories in the university. Thus, a back-to-back histogram is used to compare the distribution of the same variable between two groups.
In a business setting, a back-to-back histogram can be used to compare the performance measures of two groups of employees.
SECTION 6
a)
Sample: 231

Column Labels                            | no | yes | Grand Total
Count of do you support proposed change? | 86 | 121 | 207

Sample: 231 (counts as proportions of the grand total)

Column Labels                            | no          | yes         | Grand Total
Count of do you support proposed change? | 0.415458937 | 0.584541063 | 1
b)
Sample Number = 231
Sample Size = 207
Number of people supporting the change = 121
Required proportion = (121/207) = 0.58
c)
- Total number of samples = 1000
Average = 0.6
Standard Deviation = 0.0357
z-score = (0.58 – 0.6) / 0.0357 = -0.56
- P (Z < -0.56) = 0.2877
- Expected rank for sample 231 = P (Z < z-score) * 1000 = 0.2877 * 1000 = 288
d)
The required 95% confidence interval for the proportion = (0.5128, 0.6472).
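The interval in part (d) is consistent with the usual normal-approximation (Wald) interval for a proportion, computed from the rounded sample proportion 0.58 and the sample size 207. The sketch below assumes that method. The function name `proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    # Critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_ci(0.58, 207)
print(round(low, 4), round(high, 4))
```

Using the unrounded proportion 121/207 instead of 0.58 would shift both limits slightly upward.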