# Case Study on Data Mining

Abstract <>

Questions

1. The following attributes are measured for members of a herd of Asian elephants: *weight, height, tusk length, trunk length,* and *ear area*. Based on these measurements, what sort of similarity measure from Section 2.4 (Measures of Similarity and Dissimilarity) would you use to compare or group these elephants? Justify your answer and explain any special circumstances. (Chapter 2)
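One line of reasoning that may help with Question 1: all five attributes are numeric but are measured in very different units, so an unscaled distance would likely be dominated by whichever attribute has the largest range. The sketch below standardizes each attribute (z-scores) and then applies Euclidean distance. This is only an illustration of one possible approach, not the required answer, and the measurements in `herd` are invented for the example rather than taken from the textbook.

```python
import math
import statistics


def standardize(rows):
    """Z-score each column so that no attribute dominates because of its units."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stdevs)]
        for row in rows
    ]


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical measurements (assumed units):
# weight (kg), height (cm), tusk length (cm), trunk length (cm), ear area (cm^2)
herd = [
    [4000.0, 270.0, 150.0, 180.0, 5000.0],
    [2700.0, 240.0, 0.0, 160.0, 4200.0],
    [5400.0, 300.0, 170.0, 200.0, 5600.0],
]

z = standardize(herd)
d01 = euclidean(z[0], z[1])  # distance between elephants 0 and 1 after scaling
d02 = euclidean(z[0], z[2])  # distance between elephants 0 and 2 after scaling
print(f"d(0,1) = {d01:.3f}, d(0,2) = {d02:.3f}")
```

If the attributes are strongly correlated (larger elephants tend to be larger on every measure), a distance that accounts for correlation, such as Mahalanobis distance, may be worth discussing as a special circumstance.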

2. Consider the training examples shown in Table 3.5 (page 185) for a binary classification problem. (Chapter 3)

(a) Compute the Gini index for the overall collection of training examples.

(b) Compute the Gini index for the Customer ID attribute.

(c) Compute the Gini index for the Gender attribute.

(d) Compute the Gini index for the Car Type attribute using multiway split.
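As a reminder of the arithmetic behind Question 2, the Gini index of a node is $1 - \sum_i p_i^2$, where $p_i$ is the fraction of records in class $i$, and the Gini index of a split is the average of the child-node values weighted by child size. The following is a minimal sketch of both calculations. The class counts are hypothetical and are not the contents of Table 3.5; the function names `gini` and `gini_split` are chosen only for this illustration.

```python
def gini(counts):
    """Gini index of one node, given the number of records in each class."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_split(partitions):
    """Weighted Gini index of a split; each partition is a list of class counts."""
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * gini(p) for p in partitions)


# Hypothetical parent node: 6 records of class C0 and 4 of class C1.
parent = gini([6, 4])                        # 1 - 0.6^2 - 0.4^2 = 0.48

# Hypothetical binary attribute that splits the node into two children.
binary = gini_split([[4, 1], [2, 3]])        # 0.5 * 0.32 + 0.5 * 0.48 = 0.40

# An ID-like attribute puts every record in its own pure partition.
id_like = gini_split([[1, 0]] * 6 + [[0, 1]] * 4)   # every child is pure, so 0.0

print(parent, binary, id_like)
```

The ID-like case shows why a very low Gini value does not by itself make an attribute useful: a unique identifier produces pure partitions but has no predictive value for new records.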

3. Consider the data set shown in Table 4.9 (page 348). (Chapter 4)

(a) Estimate the conditional probabilities for P(A|+), P(B|+), P(C|+), P(A|-), P(B|-), and P(C|-).

(b) Use the estimates of the conditional probabilities from part (a) to predict the class label for a test sample (A = 0, B = 1, C = 0) using the naïve Bayes approach.

(c) Estimate the conditional probabilities using the m-estimate approach, with p = 1/2 and m = 4.
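For Question 3, the plain estimate of a conditional probability is the relative frequency $n_c / n$, while the m-estimate is $(n_c + m p) / (n + m)$, where $n$ is the number of training records in the class, $n_c$ is the number of those records with the attribute value, $p$ is a prior estimate, and $m$ is the equivalent sample size. The sketch below applies both to a small invented data set, not to Table 4.9. The record layout and the function names `conditional` and `predict` are assumptions made for illustration.

```python
from collections import Counter


def conditional(records, attr, value, label, m=0, p=0.0):
    """Estimate P(attr = value | label); m = 0 gives the plain relative frequency."""
    in_class = [r for r in records if r["class"] == label]
    n = len(in_class)
    n_c = sum(1 for r in in_class if r[attr] == value)
    return (n_c + m * p) / (n + m)


def predict(records, sample, m=0, p=0.0):
    """Naive Bayes: score each class by prior times the product of conditionals."""
    priors = Counter(r["class"] for r in records)
    total = len(records)
    scores = {}
    for label, count in priors.items():
        score = count / total
        for attr, value in sample.items():
            score *= conditional(records, attr, value, label, m, p)
        scores[label] = score
    return max(scores, key=scores.get), scores


# Hypothetical training records with two binary attributes.
records = [
    {"A": 0, "B": 1, "class": "+"},
    {"A": 0, "B": 0, "class": "+"},
    {"A": 1, "B": 1, "class": "+"},
    {"A": 1, "B": 0, "class": "-"},
    {"A": 1, "B": 1, "class": "-"},
]
sample = {"A": 0, "B": 1}

plain_label, plain_scores = predict(records, sample)
smooth_label, smooth_scores = predict(records, sample, m=4, p=0.5)
print(plain_label, plain_scores)
print(smooth_label, smooth_scores)
```

In this invented data, the plain estimate of P(A = 0 | -) is zero, which forces the whole score for class "-" to zero. With m = 4 and p = 1/2 the same estimate becomes 1/3, so a single unseen combination no longer eliminates a class outright.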

Conclusion <>

Grading Rubric for Assignment #2:

- Delivery: Assignment is delivered on time and in the correct format: 25 percent
- Completion: Provides a thoroughly developed document that includes descriptions of all questions: 25 percent
- Understanding: Demonstrates a clear understanding of the purpose and presents a central idea supported by mostly relevant facts, details, and/or explanations: 25 percent
- Organization: Paper is well organized in APA format, makes good use of transition statements, and in most instances follows a logical progression, including good use of symbols and spacing in output: 25 percent

Information Technology: Data Mining

Assignment #2 (100 points)

Students are required to submit Assignment 2 to their instructor for grading. The assignment covers the assigned materials and textbook topics associated with the course modules. Please read the instructions and questions in this assignment and complete it on schedule.

