Purpose:
The first project is meant not only to test your ability to implement the statistical tools that you have learned so far, but also to let you demonstrate the communication skills you will need within a team environment in your future career.
What to Turn In:
• A single PDF document containing your answers to the study questions.
  o Your document should be typed.
  o The font, color, size, etc. are your choice, but your document should not be more than 3 pages.
• An Excel file containing the data and your results, clearly but briefly commented.
  o Additional figures/tables may be part of the analysis that appears in the Excel file.
Please make sure that the PDF and the Excel files you submit include your group name. For instance, if your group name is Lazy Grizzlies, your submissions should be named LazyGrizzlies.pdf and LazyGrizzlies.xlsx. Please email the two files to m.kazemi@wsu.edu.
To Do:
1. Identify the variables and specify their type and level of measurement.
2. Summarize the data. Depending on the type of the variable, you have various tools to choose from (please see lecture note #2, page 2).
3. Visualize the data. Choose an appropriate graph to visualize numerical and categorical variables (please see lecture note #2, page 2).
4. Describe the reason behind selecting any tools that you have used for your analysis.
Grading
The first project is worth 50 points. 15 points will be based on your Excel file and on whether you have done the analysis correctly. Any tables and figures must be relevant to your answers. Additional figures may be part of the analysis, but should only appear in the Excel file. Points will be deducted for failing to follow instructions.
The Data:
Please use the data on the Tesla Model SK. This is the “Tesla” file in the “Datasets” folder on Blackboard. The data you are provided with are a subset of the full data set of 20,000 consumers who decided to go green and purchase an electric car. There are five columns in the data set, which are:
• Vehicle Identification Number (VIN) – A unique code used by the automotive company to identify the motor vehicles.
• Battery pack (kWh) – The battery energy capacity.
• Range per charge (Miles) – The driving range of the vehicle once it has been fully charged at a Supercharger station.
• Class – The vehicle class based on its body style and performance.
• Layout – The layout describes where on the vehicle the engine and drive wheels are found.
Questions
1. Identify the variables and classify them as categorical or numerical. If the variable is numerical, determine whether the variable is discrete or continuous. In addition, determine the measurement scale for each of these variables. (5 pts)
2. Construct a summary table for each categorical variable and comment on the results. (4 pts)
3. Create contingency tables to examine whether there is any pattern between the categorical variables. Decide which type of contingency table is more informative (in terms of frequencies, relative frequencies, or percentages of row, column, or total) and comment on the results. (8 pts)
4. Construct a histogram for each type of battery pack and comment on the shape of the data for each battery pack. The histograms need to be constructed using bar charts. Therefore, you need to go through the following steps:
• Find the sample size, min, and max.
• Compute the number of classes.
• Find the class width.
• Define the class boundaries and midpoints.
• Find the frequency count of each class. Since you’re working with a large data set, it would be more efficient to use an Excel function to find the frequencies. For instance, the following function will return the number of values stored in the range R that fall into the interval [a,b):
  =COUNTIFS(R, ">=" & a, R, "<" & b)
• Draw a bar chart based on midpoints and frequencies/relative frequencies and put the finishing touches to create the histogram. (10 pts)
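As a cross-check on the Excel work, the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values are made up (they are not the Tesla data), the number of classes uses Sturges' rule (the lecture notes may prescribe a different rule), and the class width is rounded up to the next whole number so that the maximum falls inside the last half-open class. The function name build_histogram is hypothetical.

```python
import math

def build_histogram(values, num_classes=None):
    """Return one (lower, upper, midpoint, frequency) row per class."""
    n = len(values)                      # sample size
    lo, hi = min(values), max(values)    # min and max

    # Assumption: Sturges' rule, k = ceil(1 + log2(n)).
    if num_classes is None:
        num_classes = math.ceil(1 + math.log2(n))

    # Round the width up to the next whole number so that
    # lo + k * width is strictly greater than the maximum.
    width = math.floor((hi - lo) / num_classes) + 1

    rows = []
    for i in range(num_classes):
        a = lo + i * width               # lower class boundary
        b = a + width                    # upper class boundary
        midpoint = (a + b) / 2
        # Same count as =COUNTIFS(R, ">=" & a, R, "<" & b)
        freq = sum(1 for v in values if a <= v < b)
        rows.append((a, b, midpoint, freq))
    return rows

# Hypothetical ranges per charge in miles, for illustration only.
sample_ranges = [208, 215, 221, 230, 234, 240, 245, 249, 253, 265]
rows = build_histogram(sample_ranges)

for a, b, mid, freq in rows:
    rel = freq / len(sample_ranges)      # relative frequency
    print(f"[{a}, {b})  midpoint={mid}  freq={freq}  rel={rel:.2f}")
```

The frequencies should add up to the sample size; if they do not, a value (usually the maximum) has fallen outside the last class boundary.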
5. Compute the summary statistics for each type of battery pack and comment on the following numerical measures:
• mean
• median
• standard deviation
• range
Feel free to use the Data Analysis ToolPak to obtain the summary statistics. (4 pts)
6. For each battery pack, do the results of part 5 confirm or contradict your answer to part 4? (4 pts)
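For parts 5 and 6, the following Python sketch shows one way to compute the four measures per battery pack and compare the mean with the median as a rough indicator of skewness. It is a minimal illustration, not a required method: the records are invented (the battery pack values 60 and 85 and the ranges are hypothetical, not taken from the Tesla file), and the helper names summarize and skew_hint are assumptions made for this example.

```python
import statistics
from collections import defaultdict

# Hypothetical (battery pack in kWh, range in miles) pairs, not the course data.
records = [
    (60, 208), (60, 215), (60, 210), (60, 219),
    (85, 250), (85, 265), (85, 258), (85, 291),
]

def summarize(values):
    """Mean, median, sample standard deviation, and range of a list."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),   # sample version, n - 1 denominator
        "range": max(values) - min(values),
    }

def skew_hint(stats):
    """Rough shape hint from comparing the mean with the median."""
    if stats["mean"] > stats["median"]:
        return "possibly right-skewed"
    if stats["mean"] < stats["median"]:
        return "possibly left-skewed"
    return "roughly symmetric"

# Group the ranges by battery pack.
groups = defaultdict(list)
for pack, miles in records:
    groups[pack].append(miles)

summary = {pack: summarize(values) for pack, values in groups.items()}

for pack in sorted(summary):
    s = summary[pack]
    print(pack, s["mean"], s["median"], round(s["stdev"], 2), s["range"], skew_hint(s))
```

A mean noticeably above the median suggests a longer right tail, which should match a histogram from part 4 that trails off to the right; a disagreement between the two is worth commenting on in your answer.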