FinalExaminationmakeupsummer2019 x
Final Examination
Database Modeling, Summer 2019 Name_____________________________________
Read each question carefully. Good luck!
<
Question #1 (10 points)
ACME Direct is a direct marketer of books, music, videos and magazines. The Marketing Director of ACME Direct tested a new book title slightly over one year ago and has decided, based on the results of the test, to promote this title to selected names from the database. Last month ACME Direct purchased, for the first time, new list enhancement data (age, income, marital status, home value, etc.) not previously on the customer database.
Using the saved sample from the original test promotion one year ago, the analyst is preparing to develop a regression model which will aid in predicting the type of customer most likely to order this particular book title. The Marketing Director has asked the analyst to append the new enhancement data to the sample in order to see if any of this “new enhancement data” will come into the regression equation.
Do you have any concerns regarding the marketing director’s request? If so, explain your concerns to the Marketing Director in 75 words or less
.
Question #2 (5 points each part, 10 points total)
ACME Direct is a direct marketer of books, music, videos and magazines. Below are two customers selected at random from the ACME database.
a) Based on this information alone, which customer do you believe is most likely to order an upcoming book promotion and why?
Customer |
Total Promotions |
Total Book Promotions |
Total Book Orders |
Smith |
79 |
4 7 |
4 |
Johnson |
61 |
3 3 |
3 |
b) If Smith’s last known order date is 4/15/00 and Johnson’s last known order date is 2/6/99, would you change your mind regarding who you selected to promote in part (a)? Fully explain your answer.
Question #3 (5 points each part, 20 points total)
ACME Direct is a direct marketer of books, music, videos and magazines. They have a database of size 10,000,000. For each of the pairs of data elements residing on the ACME Direct database, describe how you think they would be correlated (positive, negative or zero), the degree of the correlation (strong, moderate or slight) , and defend your answer.
1. “Total number of promotions since last order” associated with each customer and “total number of orders ever.”
1. “Customer age” and “total number of cookbook purchased ever.”
1. “Total number of cookbooks purchased ever” and “total number of videos of any kind purchased.”
1. “Total products returned for full refund” and “total products ordered.”
Last year a sample of names from the books product line primary customer segment was test promoted for a new cookbook offer. The ACME analyst has built a regression model for this sample of names. The resulting cumulative and incremental gains charts developed on the validation sample are shown below. Fill in the 3 missing numbers on this chart.
Question #4 (5 points each calculation, 15 points total)
Question #5 (5 points each part, 10 points total)
a) fill in the 3 missing numbers – showing exactly how they are calculated.
b) how do you calculate the gain of 30 for bucket 7
c) how do you calculate the cumulative RR of 6.5 for bucket 3
Using the gains charts shown in Question #4 the Senior Product Manager at ACME Direct will determine who to promote for his upcoming cookbook promotion. To ensure 5% profit-after-overhead for this campaign, the Senior Product Manager has determined he should not promote any group of customers with a response rate below 4.00%.
If the primary customer segment for the books product line (the universe the regression model was built on) represents 3,450,000 names,
1. How many names should the senior product manager promote?
1. What will be his expected number of orders in roll-out?
Question #6 (5 points each part, 15 points total)
Answer the following questions regarding regression analysis:
1. In 75 words or less please explain what causes multicollinearity?
1. In 75 words or less please describe two ways one can identify that multicollinearity is present in a model.
One:
Two:
1. In 75 words or less please describe two ways that one can rid a model of multicollinearity.
One:
Two:
Question #7 (10 points)
ACME Direct, a direct marketer of book, music, videos and magazines, has built a multiple regression model predicting who is likely to order a World War II video set based on a saved sample of 9,994 names promoted for this product last fall. Below is a copy of the EXCEL output from the regression run. The dependent variable was the typical binary indicator (1=order, 0=silent).
Score customer Jones and Matthews on the above equation and indicate which one is most likely to order this video product. Assume today’s date is 5/1/00.
Customer
Last Order Date
Total Promotions Sent (All Product Lines)
Total Paid Orders
(All Product Lines)
Total Video Orders
Gender
Jones
1/1/00
35
3
1
male
Matthews
10/1/99
66
7
3
unknown
Question #8 (5 points each part, 10 points total)
Below is a cross-tabulation of two variables produced on a sample of 10,000 customers test promoted for the book “The Secret Lives of Our U.S. Presidents” denoted as SECRETS. The two variables used in this cross tabulation are “Time in Months Since Last Book Order” and “Number of Books Ordered Ever.” Answer the following questions about this cross tabulation.
Product Mailed: SECRETS
Segment: Primary Book Customer Segment
Sample Size: 10,000
Number of Orders: 1,000
Response Rate: 10.00%
Number of Books
Time in Months Since Last Book Order
Ordered Ever:
1-10
11-20
21-40
41-60
61+
Total
1
46/525 = 8.76%
30/445 = 6.74%
16/321 = 4.98%
3/135 = 2.22%
0/29 = 0.00%
95/1,455 = 6.53%
2
42/387 =
10.85
77/978 = 7.87%
44/645 = 6.82%
10/289 = 3.46%
1/55 = 1.82%
174/2,354 =
7.39%
3-5
44/349 = 12.61%
95/789 = 12.04%
149/1,256 = 11.86%
39/443 = 8.80%
12/250 = 4.80%
339/3,087 = 10.98%
6-10
41/298 = 13.76%
10/306 =
13.07%
69/534 = 12.92%
67/567 = 11.82%
24/282 = 8.51%
241/1,987 = 12.13%
11+
0/0 = N/A
5/27 = 18.52%
34/232 = 14.66%
60/444 = 13.51%
52/414 =
12.56
151/1,117 = 13.52%
Total
173/1,559 = 11.10%
247/2,545 = 9.71%
312/2,988 = 10.44%
179/1,878 = 9.53%
89/1,030 = 8.64%
1,000/10,000 =
10.00%
173 represents the number of customers that fell into this cell who ordered SECRETS.
1,559 represents all customers (orders & non-orders of SECRETS) falling into this cell.
11.10% represents the percent of those that fell into this cell that ordered SECRETS.
1. What is the SECRETS order rate of the names in this sample that have placed only 2 book orders and whose last book order was 11-20 months ago?
1. What is the SECRETS order rate of the names in this sample that have placed less than 6 book orders ever?
Regression Statistics
R Square
0.0948
Observations
9,994
Coefficients
P-value
Intercept
0.1385
0.0395
TSLO
-0.0159
0.0459
TSLO = Elapsed time, in months, since last order of any kind
NM_VIDORDS
0.0785
0.0001
NM_VIDORDS = Total number of video orders ever made
RATIO_PDPR
0.6385
0.0043
RATIO_PDPR = Ratio of total products paid across all product lines
GNDR_MALE
0.1748
0.0675
to total promotions sent across all product lines
GNDR_MALE = 1 if male, 0 otherwise
Gains Chart
Cumulative Cumulative
CUMULATIVE GAINS CHART Quantity Orders on INCREMENTAL GAINS GHART
Bucket Percent Resp. Rate Gain on 10,000 10,000 Percent Qty on 10,000 Qty on 10,000 Resp. Rate
1 10.00% 9.75% 205 1,000 98 10.00% 1,000 98 9.75%
2 20.50% 8.00% 150 2,050 164 10.50% 1,050 67 6.33%
3 30.00% 6.50% 103 3,000 195 9.50% 950 31 3.26%
4 40.00% 5.60% 75 4,000 224 10.00% 1,000 29 2.90%
5 50.00% 5.00% 56 5,000 250 10.00% 1,000 26 2.60%
6 60.10% 4.55% 42 6,010 273 10.10% 1,010 23 2.32%
7 70.00% 4.16% 30 7,000 291 9.90% 990 18 1.79%
8 80.00% 3.80% 19 8,000 304 10.00% 1,000 13 1.28%
9 90.00% 3.50% 9 9,000 315 10.00% 1,000 11 1.10%
10 100.00% 3.20% 0 10,000 320 10.00% 1,000 5 0.50%
TOTAL — — — 100.00% 10,000 320 3.20%
&CModule 7 – Assignment #2
Page &P
Multiple Regression Output
Regression Statistics
R Square 0.0948
Observations 9,994
Coefficients P-value
Intercept 0.1385 0.0395
TSLO -0.0159 0.0459 TSLO = Elapsed time, in months, since last order of any kind
NM_VIDORDS 0.0785 0.0001 NM_VIDORDS = Total number of video orders ever made
RATIO_PDPR 0.6385 0.0043 RATIO_PDPR = Ratio of total products paid across all product lines
GNDR_MALE 0.1748 0.0675 to total promotions sent across all product lines
GNDR_MALE = 1 if male, 0 otherwise
&A
Page &P
CUMUL GAINS CHART
INCREM GAINS CHART
Bucket
Percent
Resp. Rate
Gain
Percent
Resp. Rate
1
10.00%
9.75%
205
10.00%
9.75%
2
20.50%
8.00%
150
10.50%
6.33%
3
30.00%
6.50%
103
9.50%
3.26%
4
40.00%
5.60%
75
10.00%
2.90%
5
50.00%
5.00%
56
10.00%
2.60%
6
60.10%
4.55%
42
10.10%
2.32%
7
70.00%
4.16%
30
9.90%
1.79%
8
80.00%
3.80%
19
10.00%
1.28%
9
90.00%
3.50%
9
10.00%
1.10%
10
100.00%
3.20%
0
10.00%
0.50%
TOTAL
—
—
—
100.00%
3.20%
Gains Chart
Cumulative Cumulative
CUMUL GAINS CHART Quantity Orders on INCREM GAINS CHART
Bucket Percent Resp. Rate Gain on 10,000 10,000 Percent Qty on 10,000 Qty on 10,000 Resp. Rate
1 10.00% 9.75% 205 1,000 98 10.00% 1,000 98 9.75%
2 20.50% 8.00% 150 2,050 164 10.50% 1,050 67 6.33%
3 30.00% 6.50% 103 3,000 195 9.50% 950 31 3.26%
4 40.00% 5.60% 75 4,000 224 10.00% 1,000 29 2.90%
5 50.00% 5.00% 56 5,000 250 10.00% 1,000 26 2.60%
6 60.10% 4.55% 42 6,010 273 10.10% 1,010 23 2.32%
7 70.00% 4.16% 30 7,000 291 9.90% 990 18 1.79%
8 80.00% 3.80% 19 8,000 304 10.00% 1,000 13 1.28%
9 90.00% 3.50% 9 9,000 315 10.00% 1,000 11 1.10%
10 100.00% 3.20% 0 10,000 320 10.00% 1,000 5 0.50%
TOTAL — — — 100.00% 10,000 320 3.20%
&CModule 7 – Assignment #2
Page &P
Gains Chart
Cumulative Cumulative
CUMUL GAINS CHART Quantity Orders on INCREM GAINS CHART
Bucket Percent Resp. Rate Gain on 10,000 10,000 Percent Qty on 10,000 Qty on 10,000 Resp. Rate
1 10.00% 9.75% 205 1,000 98 10.00% 1,000 98 9.75%
2 20.50% 8.00% 150 2,050 164 10.50% 1,050 67 6.33%
3 30.00% 6.50% 103 3,000 195 9.50% 950 31 3.26%
4 40.00% 5.60% 75 4,000 224 10.00% 1,000 29 2.90%
5 50.00% 5.00% 56 5,000 250 10.00% 1,000 26 2.60%
6 60.10% 4.55% 42 6,010 273 10.10% 1,010 23 2.32%
7 70.00% 4.16% 30 7,000 291 9.90% 990 18 1.79%
8 80.00% 3.80% 19 8,000 304 10.00% 1,000 13 1.28%
9 90.00% 3.50% 9 9,000 315 10.00% 1,000 11 1.10%
10 100.00% 3.20% 0 10,000 320 10.00% 1,000 5 0.50%
TOTAL — — — 100.00% 10,000 320 3.20%
&CModule 7 – Assignment #2
Page &P
Regression Statistics
R Square
0.0948
Observations
9,994
Coefficients
P-value
Intercept
0.1385
0.0395
TSLO
-0.0159
0.0459
TSLO = Elapsed time, in months, since last order of any kind
NM_VIDORDS
0.0785
0.0001
NM_VIDORDS = Total number of video orders ever made
RATIO_PDPR
0.6385
0.0043
RATIO_PDPR = Ratio of total products paid across all product lines
GNDR_MALE
0.1748
0.0675
to total promotions sent across all product lines
GNDR_MALE = 1 if male, 0 otherwise
Gains Chart
Cumulative Cumulative
CUMULATIVE GAINS CHART Quantity Orders on INCREMENTAL GAINS GHART
Bucket Percent Resp. Rate Gain on 10,000 10,000 Percent Qty on 10,000 Qty on 10,000 Resp. Rate
1 10.00% 9.75% 205 1,000 98 10.00% 1,000 98 9.75%
2 20.50% 8.00% 150 2,050 164 10.50% 1,050 67 6.33%
3 30.00% 6.50% 103 3,000 195 9.50% 950 31 3.26%
4 40.00% 5.60% 75 4,000 224 10.00% 1,000 29 2.90%
5 50.00% 5.00% 56 5,000 250 10.00% 1,000 26 2.60%
6 60.10% 4.55% 42 6,010 273 10.10% 1,010 23 2.32%
7 70.00% 4.16% 30 7,000 291 9.90% 990 18 1.79%
8 80.00% 3.80% 19 8,000 304 10.00% 1,000 13 1.28%
9 90.00% 3.50% 9 9,000 315 10.00% 1,000 11 1.10%
10 100.00% 3.20% 0 10,000 320 10.00% 1,000 5 0.50%
TOTAL — — — 100.00% 10,000 320 3.20%
&CModule 7 – Assignment #2
Page &P
Multiple Regression Output
Regression Statistics
R Square 0.0948
Observations 9,994
Coefficients P-value
Intercept 0.1385 0.0395
TSLO -0.0159 0.0459 TSLO = Elapsed time, in months, since last order of any kind
NM_VIDORDS 0.0785 0.0001 NM_VIDORDS = Total number of video orders ever made
RATIO_PDPR 0.6385 0.0043 RATIO_PDPR = Ratio of total products paid across all product lines
GNDR_MALE 0.1748 0.0675 to total promotions sent across all product lines
GNDR_MALE = 1 if male, 0 otherwise
&A
Page &P