Name:

Statistical Concepts:

- Data Simulation
- Confidence Intervals
- Normal Probabilities

Short Answer Writing Assignment

All answers should be complete sentences.

We need to find the confidence interval for the SLEEP variable. To do this, we first find the sample mean and then the maximum error (also called the margin of error). Then we can use a calculator to find the interval, (x – E, x + E), where x is the sample mean and E is the maximum error.

First, find the mean. Below the SLEEP data, in cell E37, type **=AVERAGE(E2:E36)**. Below that, in cell E38, type **=STDEV(E2:E36)** to find the standard deviation. Now we can find the maximum error of the confidence interval. To find the maximum error, we use the “confidence” formula. In cell E39, type **=CONFIDENCE.NORM(0.05,E38,35)**. The 0.05 is the alpha for a 95% confidence level (1 – 0.95), E38 is the standard deviation, and 35 is the sample size. For a 99% confidence level, use 0.01 in place of 0.05. You then need to calculate the confidence interval by using a calculator to subtract the maximum error from the mean (x – E) and add it to the mean (x + E).
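For readers who want to check the spreadsheet result outside Excel, the steps above can be sketched in Python. This is only an illustration: the function name `confidence_norm` and the sleep values are made up for the example, and the sketch assumes that CONFIDENCE.NORM computes the normal critical value times the standard deviation divided by the square root of the sample size.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_norm(alpha, sd, n):
    """Maximum error E = z * sd / sqrt(n), mirroring =CONFIDENCE.NORM(alpha, sd, n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    return z * sd / sqrt(n)


# Hypothetical hours of sleep, NOT the course dataset.
sleep = [7, 6.5, 8, 5, 7.5, 6, 9, 7, 6.5, 8]

x_bar = mean(sleep)   # same role as =AVERAGE(...)
s = stdev(sleep)      # sample standard deviation, same role as =STDEV(...)

for alpha in (0.05, 0.01):
    E = confidence_norm(alpha, s, len(sleep))
    level = round((1 - alpha) * 100)
    print(f"{level}% interval: ({x_bar - E:.2f}, {x_bar + E:.2f})")
```

Running the loop with both alpha values shows how the maximum error changes when the confidence level is raised from 95% to 99%.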

1. Give and interpret the 95% confidence interval for the hours of sleep a student gets. (5 points)

2. Give and interpret the 99% confidence interval for the hours of sleep a student gets. (5 points)

3. Compare the 95% and 99% confidence intervals for the hours of sleep a student gets. Explain the difference between these intervals and why this difference occurs. (5 points)

In the week 2 lab, you found the mean and the standard deviation for the HEIGHT variable for both males and females. Use those values, or follow these directions to calculate the numbers again.

(From week 2 lab: Calculate descriptive statistics for the variable Height by Gender. Click on **Insert** and then **Pivot Table**. Click in the top box and select all the data (including labels) from **Height** through **Gender**. Also click on “new worksheet” and then **OK**. On the right of the new sheet, click on **Height** and **Gender**, making sure that **Gender** is in the **Rows** box and **Height** is in the **Values** box. Click on the down arrow next to **Height** in the **Values** box and select **Value Field Settings**. In the pop-up box, click **Average** and then **OK**. Write these values down. Then click on the down arrow next to **Height** in the **Values** box again and select **Value Field Settings**. In the pop-up box, click **StdDev** and then **OK**. Write these values down.)

You will also need the number of males and the number of females in the dataset. You can either use the same pivot table created above by selecting **Count** in the **Value Field Settings**, or you can count them directly in the dataset.

Then in Excel (somewhere on the data file or in a blank worksheet), calculate the maximum error for the females and the maximum error for the males. To find the maximum error for the females, type **=CONFIDENCE.T(0.05,stdev,#)**, using the females’ height standard deviation for “stdev” in the formula and the number of females in your sample for the “#”. Then you can use a calculator to subtract this maximum error from, and add it to, the average female height to get the 95% confidence interval. Do this again with 0.01 as the alpha at the beginning of the formula to find the 99% confidence interval.

Find these same two intervals for the male data by using the same formula, but using the males’ standard deviation for “stdev” and the number of males in your sample for the “#”.
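As a cross-check on the spreadsheet, the CONFIDENCE.T calculation can be sketched in Python using only the standard library. This is a rough illustration, not how Excel works internally: it assumes CONFIDENCE.T returns the two-tailed t critical value (with n – 1 degrees of freedom) times the standard deviation divided by the square root of n, and it approximates the critical value numerically. The helper names and the female height summary values are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, t]; steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Two-tailed critical value: the t with P(T <= t) = 1 - alpha/2."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(50):             # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def confidence_t(alpha, sd, n):
    """Maximum error E = t * sd / sqrt(n), mirroring =CONFIDENCE.T(alpha, sd, n)."""
    return t_critical(alpha, n - 1) * sd / sqrt(n)


# Hypothetical summary values for female heights, NOT the course dataset.
mean_f, sd_f, n_f = 64.5, 2.8, 20

for alpha in (0.05, 0.01):
    E = confidence_t(alpha, sd_f, n_f)
    level = round((1 - alpha) * 100)
    print(f"{level}% interval: ({mean_f - E:.2f}, {mean_f + E:.2f})")
```

Substituting the males' standard deviation and count gives the male intervals in the same way. Because the maximum error depends on both the standard deviation and the sample size, either one can make one group's interval wider than the other's.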

4. Give and interpret the 95% confidence intervals for males and females on the HEIGHT variable. Which is wider and why? (7 points)

5. Give and interpret the 99% confidence intervals for males and females on the HEIGHT variable. Which is wider and why? (7 points)

6. Find the mean and standard deviation of the DRIVE variable by using =AVERAGE(A2:A36) and =STDEV(A2:A36). Assuming that this variable is normally distributed, what percentage of the data would you predict to be less than 40 miles? Base this prediction on the probability calculated with the formula =NORM.DIST(40, mean, stdev, TRUE). Next, determine the actual percentage of data points in the dataset that are less than 40 miles: sort the DRIVE variable, count how many of the 35 data points are less than 40, and divide that count by 35. How does the actual percentage compare with your prediction? (10 points)

7. What percentage of the data would you predict to be between 40 and 70 miles, and what percentage would you predict to be more than 70 miles? For the “between” probability, subtract the result of =NORM.DIST(40, mean, stdev, TRUE) from the result of =NORM.DIST(70, mean, stdev, TRUE). For the “more than” probability, subtract the result of =NORM.DIST(70, mean, stdev, TRUE) from 1. Now determine the percentage of data points in the dataset that fall within each of these ranges, using the same strategy as above for counting data points in the dataset. How does each of these compare with your prediction, and why is there a difference? (11 points)
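The predicted and actual percentages in questions 6 and 7 can also be sketched in Python as a check on the Excel work. This is only an illustration: the function names and the drive distances are invented for the example, and counting values exactly equal to 40 or 70 as "between" is an assumption.

```python
from statistics import NormalDist, mean, stdev


def predicted_percentages(mu, sd, low, high):
    """Normal-model probabilities below low, between low and high, and above high."""
    dist = NormalDist(mu, sd)
    below = dist.cdf(low)                    # like =NORM.DIST(low, mean, stdev, TRUE)
    between = dist.cdf(high) - dist.cdf(low)
    above = 1 - dist.cdf(high)
    return below, between, above


def observed_percentages(data, low, high):
    """Fractions of data points below low, between low and high, and above high."""
    n = len(data)
    below = sum(x < low for x in data) / n
    between = sum(low <= x <= high for x in data) / n  # endpoints counted here (assumption)
    above = sum(x > high for x in data) / n
    return below, between, above


# Hypothetical drive distances in miles, NOT the course dataset.
drive = [12, 25, 33, 38, 41, 47, 55, 62, 71, 90]

predicted = predicted_percentages(mean(drive), stdev(drive), 40, 70)
observed = observed_percentages(drive, 40, 70)

for label, p, o in zip(("less than 40", "40 to 70", "more than 70"), predicted, observed):
    print(f"{label}: predicted {p:.1%}, observed {o:.1%}")
```

With a small sample, the observed fractions can only change in steps of 1/n, which is one reason they rarely match the smooth normal-model predictions exactly.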