Data Analysis and Visualization

Suppose your goal is to build a model to predict which of your customers don't have health insurance; perhaps you want to market inexpensive health insurance packages to them. You've collected a dataset of customers whose health insurance status you know. You've also identified some customer properties that you believe help predict the probability of insurance coverage: age, employment status, income, information about residence and vehicles, and so on.

In this assignment we'll address issues that you can discover during the data exploration/visualization phase. First you'll treat missing values. Then you will apply some common data transformations and see when they're appropriate: converting continuous variables to discrete; normalization and rescaling; and logarithmic transformations.

Customer data can be downloaded from: custdata.RDS

1. Load the data into a data frame named custData using the readRDS() function. If you saved the file custdata.RDS in the folder C:/tmp, just load the data as
    custData <- readRDS("C:/tmp/custdata.RDS")

2. Print the number of rows and columns in the file. Use the dim() function.

3. Print the column names.

4. Print the number of NAs in each column. Hint: One way to find NAs is to use sum() together with another function, by passing the column to ...

5. Adding New Columns to a Data Frame
   The variable gas_usage mixes numeric and symbolic data: values greater than 3 are monthly gas bills, but values from 1 to 3 are special codes. In addition, gas_usage has some missing values.
   The value 1 means "Gas bill included in rent or condo fee".
   The value 2 means "Gas bill included in electricity payment".
   The value 3 means "No charge or gas not used".
   One way to treat gas_usage is to convert all the special codes (1, 2, 3) to NA, and to add three new indicator variables, one for each code.
   For example, the indicator variable gas_with_electricity will have the value 1 whenever the original gas_usage variable had the value 2, and the value 0 otherwise.
   A) Create the three new indicator variables gas_with_rent, gas_with_electricity, and no_gas_bill. Add these indicators to the data frame custData. Hint: Use the ifelse() function. Check textbook pages 66-67 for examples.
   B) Print the column names of custData to check that these new columns have been added.

6. Convert Invalid Values to NA
   The variable age has the problematic value 0, which probably means that the age is unknown. In addition, there are a few customers with age greater than 100, which may also be an error. However, for this project you decide to treat only the value 0 as invalid, and to assume that ages greater than one hundred years are valid. The variable income has negative values. We'll assume for this project that those values are invalid.
   A) Convert invalid age and income values to NA, as if they were "missing values."
   B) Convert all values of gas_usage that are less than 4 to NA. (We do this because we already created three new indicators for the codes 1, 2, and 3 in the gas_usage column, so these entries should be marked as missing values: they don't represent a gas bill amount.) Hint: Use the ifelse() function. Check textbook pages 66-67 for examples.

7. Barcharts, Histograms, Scatter Plots
   A) Plot barcharts of the predictors num_vehicles, recent_move, health_ins, marital_status, is_employed, and housing_type.
      [Figure: bar chart of housing_type]
   B) Plot histograms of age and income. Comment on the distribution and skewness of the data for these predictors.
   C) Plot the scatter plot of age versus income.
      [Figure: scatter plot of age versus income]

8. Density Plot and Transformation to Eliminate Skew
   A) Plot the density plots of income and age.
   B) Is the data right or left skewed?
   C) If the data is skewed, apply a transformation to remove the skewness as much as possible. Hint: Check textbook pages 74-75.
      [Figure: density plot of income]
      [Figure: density plot of income after a log10() transformation]

9. Convert Continuous Variable to Discrete
   We would like to create the following ranges for the age predictor: [0,25], (25,65], (65,130]
   A) Use the cut() function to cut the age predictor data into the ranges given above. Add the result as a column to the data frame custData as a new predictor named ageRange. Hint: Listing 4.6 in the textbook, page 71.
   B) Plot the bar chart of ageRange.
      [Figure: bar chart of ageRange]

10. Imputed Value for the age Predictor
   You might believe that the data is missing because the data collection failed at random, independent of the situation and of the other values. In this case, you can replace the missing values with "a reasonable estimate," or imputed value. Statistically, one commonly used estimate is the expected value, or mean. For the age predictor, replace all NAs with the mean of the age values that are not NA.
   Caution: The R mean() function returns a numeric value, not an integer. Make sure that you convert it to an integer using the as.integer() function.
   A) Print the mean value you found.
   B) After replacing the NAs with the mean value, repeat the same process as in part 9 B) above to plot the bar chart.
      [Figure: bar chart of ageRange after imputation]

In part 5 of the week 4 assignment, one of the questions is about adding indicator variables (new columns) to the data frame. The following statement describes the indicator variable gas_with_electricity:
   "For example, the indicator variable gas_with_electricity will have the value 1 whenever the original gas_usage variable had the value 2, and the value 0 otherwise."
Assume that the name of the data frame is custData.
To add the indicator variable gas_with_electricity to the data frame custData, simply use ifelse() as shown below:
    custData$gas_with_electricity <- ifelse(custData$gas_usage == 2, 1, 0)
The statement above adds a new column named gas_with_electricity with values 1 or 0, based on the values of the gas_usage column of the custData data frame. If the value of gas_usage is 2, it assigns 1 to gas_with_electricity; otherwise it assigns 0 (rows where gas_usage is NA yield NA rather than 0, because the comparison itself is NA). The other two columns are added similarly.
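
The same ifelse() pattern can be tried on a small stand-in before touching the real data. The sketch below is only an illustration: the data frame toy and its values are made up and are not taken from custdata.RDS. It builds one indicator per special code and shows how a missing gas_usage propagates as NA.

```r
# Hypothetical stand-in for custData; these values are invented for illustration.
toy <- data.frame(gas_usage = c(1, 2, 3, 40, 120, NA))

# One indicator per special code: 1 when the code matches, 0 otherwise.
toy$gas_with_rent        <- ifelse(toy$gas_usage == 1, 1, 0)
toy$gas_with_electricity <- ifelse(toy$gas_usage == 2, 1, 0)
toy$no_gas_bill          <- ifelse(toy$gas_usage == 3, 1, 0)

# For the last row, NA == 2 evaluates to NA, so each indicator is NA there, not 0.
print(toy)

# Only after the indicators exist is it safe to blank out the special codes,
# leaving genuine bill amounts (values of 4 or more) untouched.
toy$gas_usage <- ifelse(toy$gas_usage < 4, NA, toy$gas_usage)
print(toy$gas_usage)
```

The order matters: if the codes were converted to NA first, the comparisons against 1, 2, and 3 would have nothing left to match.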