Business Statistics: EDA & Insurance claims

Objective – Explore the dataset and extract insights from the data. Using statistical evidence to


Table of Contents

Context

Leveraging customer information is of paramount importance for most businesses. For an insurance company, customer attributes such as age, sex, BMI, smoking status, and number of children can be crucial in making business decisions.

Data Dictionary

Question to be answered

Libraries

Read and Understand Data

Types of variables

Observations

Observations

Exploratory Data Analysis

Univariate Analysis

Observations

Observations

Bivariate & Multivariate Analysis

Observation

Observation

Observation

Observations

Observation

Observations

Observations

Observations

Observation

Conclusion

Statistical Analysis

1. Prove (or disprove) that the medical claims made by people who smoke are greater than those made by people who don't.

Let $\mu_1$ and $\mu_2$ be the respective population mean charges of smokers and nonsmokers.
Step 1: Define null and alternative hypothesis
$H_0 : \mu_1 \le \mu_2$ The average charges of smokers are less than or equal to those of nonsmokers.
$H_a : \mu_1 > \mu_2$ The average charges of smokers are greater than those of nonsmokers.
Step 2: Decide the significance level. α = 0.05; if the p-value is less than α, reject the null hypothesis.
Step 3: Identify the test. The population standard deviation is not known, so we perform a two-sample t-test. The > sign in the alternative hypothesis indicates that the test is right-tailed: all t values that would cause us to reject the null hypothesis lie in the right tail of the sampling distribution.
Step 4: Calculate the test statistic and p-value
Step 5: Decide whether to reject or fail to reject the null hypothesis
We reject the null hypothesis and conclude that people who smoke have, on average, larger medical claims than people who don't smoke. A similar result can also be seen in Fig. 1, Smokers vs Charges.
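
The five steps above can be sketched in Python. This is a minimal illustration rather than the notebook's original code. It assumes a DataFrame with a `smoker` column coded "yes"/"no" and a `charges` column, it uses Welch's unequal-variance t-test (an assumption, since the text does not say which t-test variant was run), and it runs on synthetic stand-in data, not the insurance dataset. The helper name `right_tailed_ttest` is hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import stats


def right_tailed_ttest(a, b, equal_var=False):
    """Two-sample t-test of H0: mean(a) <= mean(b) vs Ha: mean(a) > mean(b)."""
    t_stat, p_two_sided = stats.ttest_ind(a, b, equal_var=equal_var)
    # Convert the two-sided p-value into a right-tailed one.
    if t_stat > 0:
        p_right = p_two_sided / 2
    else:
        p_right = 1 - p_two_sided / 2
    return float(t_stat), float(p_right)


# Synthetic stand-in data (NOT the insurance dataset). The column names and
# the "yes"/"no" coding are assumptions made for this sketch.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "smoker": ["yes"] * 50 + ["no"] * 150,
    "charges": np.concatenate([
        rng.normal(30000, 10000, 50),   # synthetic smoker charges
        rng.normal(9000, 6000, 150),    # synthetic nonsmoker charges
    ]),
})

alpha = 0.05
smoker_charges = df.loc[df["smoker"] == "yes", "charges"]
nonsmoker_charges = df.loc[df["smoker"] == "no", "charges"]

t_stat, p_value = right_tailed_ttest(smoker_charges, nonsmoker_charges)
reject_h0 = p_value < alpha
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.3g}, reject H0: {reject_h0}")
```

Halving the two-sided p-value is valid only when the statistic has the sign predicted by the alternative hypothesis, which is why the helper checks the sign of `t_stat` first.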

2. Prove (or disprove) with statistical evidence that the BMI of females is different from that of males.

Let $\mu_1$ and $\mu_2$ be the respective population means for BMI of males and BMI of females.
Step 1: Define null and alternative hypothesis
$H_0 : \mu_1 - \mu_2 = 0$ There is no difference between the mean BMI of males and the mean BMI of females.
$H_a : \mu_1 - \mu_2 \neq 0$ There is a difference between the mean BMI of males and the mean BMI of females.
Step 2: Decide the significance level α = 0.05
Step 3: Identify the test
The population standard deviation is not known, so we perform a two-sample t-test. The "not equal to" sign in the alternative hypothesis indicates that it is a two-tailed test.
Step 4: Calculate the test statistic and p-value
Step 5: Decide whether to reject or fail to reject the null hypothesis
We fail to reject the null hypothesis: at the 5% significance level there is no statistically significant evidence that the mean BMI of females differs from that of males.
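
A two-tailed version of the test might look like the sketch below. It is illustrative only: the `sex` coding ("male"/"female"), the `bmi` column name, and the choice of Welch's t-test are assumptions, and the data are synthetic draws from a single distribution, not the insurance dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in data (NOT the insurance dataset); both groups are drawn
# from the same distribution, so no true difference exists by construction.
rng = np.random.default_rng(1)
bmi_df = pd.DataFrame({
    "sex": ["male"] * 100 + ["female"] * 100,
    "bmi": rng.normal(30, 6, 200),
})

male_bmi = bmi_df.loc[bmi_df["sex"] == "male", "bmi"]
female_bmi = bmi_df.loc[bmi_df["sex"] == "female", "bmi"]

# ttest_ind is two-sided by default, matching Ha: mu_1 - mu_2 != 0.
t_bmi, p_bmi = stats.ttest_ind(male_bmi, female_bmi, equal_var=False)
t_bmi, p_bmi = float(t_bmi), float(p_bmi)

alpha = 0.05
reject_bmi = p_bmi < alpha
print(f"t = {t_bmi:.3f}, two-sided p = {p_bmi:.4f}, reject H0: {reject_bmi}")
```

Because the test is two-tailed, swapping the two groups flips the sign of the statistic but leaves the p-value unchanged.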

3. Is the proportion of smokers significantly different across different regions?

Step 1: Define null and alternative hypotheses
* H0: The proportion of smokers is not significantly different across regions
* Ha: The proportion of smokers is different across regions
Step 2: Decide the significance level α = 0.05
Step 3: Identify the test
Here we are comparing two categorical variables, smoker and region, so we perform a chi-square test of independence.
Step 4: Calculate the test statistic and p-value
Step 5: Decide whether to reject or fail to reject the null hypothesis
We fail to reject the null hypothesis and conclude that the proportion of smokers is not significantly different across regions.
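
The chi-square test needs a contingency table of region against smoking status. The sketch below shows one way to build it with `pandas.crosstab` and test it with `scipy.stats.chi2_contingency`. The counts are made up for illustration, the region labels "A" to "D" are placeholders, and the "yes"/"no" coding is an assumption; none of this is the insurance data.

```python
import pandas as pd
from scipy import stats

# Made-up counts (NOT the insurance data): region -> (smokers, nonsmokers).
# Region labels and the "yes"/"no" coding are placeholders for this sketch.
counts = {"A": (12, 38), "B": (10, 40), "C": (14, 36), "D": (11, 39)}
rows = []
for region, (n_yes, n_no) in counts.items():
    rows += [(region, "yes")] * n_yes + [(region, "no")] * n_no
region_df = pd.DataFrame(rows, columns=["region", "smoker"])

# Contingency table: one row per region, one column per smoker status.
contingency = pd.crosstab(region_df["region"], region_df["smoker"])

# Returns the statistic, p-value, degrees of freedom, and expected counts.
chi2_stat, p_chi2, dof, expected = stats.chi2_contingency(contingency)

alpha = 0.05
reject_chi2 = bool(p_chi2 < alpha)
print(contingency)
print(f"chi2 = {chi2_stat:.3f}, dof = {dof}, p = {p_chi2:.4f}, reject H0: {reject_chi2}")
```

With four regions and two smoker categories, the degrees of freedom are (4 - 1) × (2 - 1) = 3.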

4. Is the mean BMI of women with no children, one child, and two children the same? Explain your answer with statistical evidence.

Step 1: Define null and alternative hypotheses
* H0: μ1 = μ2 = μ3 The mean BMI of women with no children, one child, and two children is the same
* Ha: At least one of the mean BMIs is different
Step 2: Decide the significance level α = 0.05
Step 3: Identify the test
One-way ANOVA, which tests the equality of population means by comparing the variance between samples with the variance within samples.
Step 4: Calculate the test statistic and p-value
Step 5: Decide whether to reject or fail to reject the null hypothesis
The p-value is 0.715858, which is greater than alpha (0.05). We fail to reject the null hypothesis and conclude that the mean BMI of women with no children, one child, and two children is not significantly different.
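
A one-way ANOVA along these lines could be run as sketched below. The sketch assumes `sex`, `children`, and `bmi` columns with "female"/"male" coding, and it uses synthetic data, so its output will not reproduce the p-value reported above. The helper name `bmi_anova_by_children` is hypothetical.

```python
import numpy as np
import pandas as pd
from scipy import stats


def bmi_anova_by_children(df, levels=(0, 1, 2)):
    """One-way ANOVA on BMI of women, grouped by number of children."""
    women = df[(df["sex"] == "female") & (df["children"].isin(levels))]
    # One array of BMI values per number-of-children group.
    groups = [g["bmi"].to_numpy() for _, g in women.groupby("children")]
    f_stat, p_value = stats.f_oneway(*groups)
    sizes = women.groupby("children").size().to_dict()
    return float(f_stat), float(p_value), sizes


# Synthetic stand-in data (NOT the insurance dataset). Rows alternate between
# female and male in blocks of five, with children cycling through 0..4.
rng = np.random.default_rng(2)
idx = np.arange(300)
anova_df = pd.DataFrame({
    "sex": np.where((idx // 5) % 2 == 0, "female", "male"),
    "children": idx % 5,
    "bmi": rng.normal(30, 6, 300),
})

alpha = 0.05
f_anova, p_anova, group_sizes = bmi_anova_by_children(anova_df)
reject_anova = p_anova < alpha
print(f"group sizes = {group_sizes}")
print(f"F = {f_anova:.3f}, p = {p_anova:.4f}, reject H0: {reject_anova}")
```

Men and women with three or more children are filtered out before the test, so only the three groups named in the hypothesis contribute to the F statistic.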

Recommendation