Scoring guide (Rubric) – Project 5 Rubric
Answer point by point according to the criteria below.

Criteria | Points
1.1 EDA – Basic data summary, univariate and bivariate analysis, graphs, check for outliers and missing values, and check the summary of the dataset | 7
1.2 EDA – Illustrate the insights based on EDA | 5
1.3 EDA – Check for Multicollinearity – Plot the graph based on Multicollinearity & treat it | 3
2. Data Preparation (SMOTE) | 10
3.1 Applying Logistic Regression & Interpret results | 3
3.2 Applying KNN Model & Interpret results | 3
3.3 Applying Naïve Bayes Model & Interpret results (is it applicable here? comment and if it is not applicable, how can you build an NB model in this case?) | 3
3.4 Confusion matrix interpretation | 3
3.5 Remarks on Model validation exercise | 3
3.6 Bagging | 7.5
3.7 Boosting | 7.5
4. Actionable Insights and Recommendations | 5
Total points | 60

Description

This project requires you to understand which mode of transport employees prefer for commuting to their office. The attached data ‘Cars.csv’ includes information about each employee's mode of transport as well as personal and professional details such as age, salary, and work experience. You need to predict whether or not an employee will use Car as a mode of transport, and to identify which variables are significant predictors of this decision.
The following is expected of the candidate in this assessment.
EDA (15 Marks)

Perform an EDA on the data – (7 marks)
Illustrate the insights based on EDA (5 marks)
Check for Multicollinearity – Plot the graph based on Multicollinearity & treat it. (3 marks)
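
One possible way to approach the multicollinearity check is to compute pairwise Pearson correlations between the numeric predictors and flag pairs above a cut-off. The sketch below is illustrative only: it is written in Python for brevity (the submission itself must be in R), the column values are made up and not taken from Cars.csv, and the 0.8 threshold and the helper names pearson and correlated_pairs are assumptions rather than part of the brief.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every column pair with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], r))
    return flagged

# Made-up illustrative values, NOT taken from Cars.csv.
toy = {
    "age": [23, 25, 28, 31, 35, 40],
    "work_exp": [1, 3, 5, 8, 12, 18],
    "salary": [12.0, 5.5, 9.1, 14.2, 7.3, 10.8],
}

flagged = correlated_pairs(toy, threshold=0.8)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

A flagged pair is a candidate for treatment, for example by dropping one of the two variables or combining them; which treatment is appropriate depends on the actual data.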

Data Preparation (10 marks)

Prepare the data for analysis (SMOTE)
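
For orientation, the core idea of SMOTE is to create synthetic minority-class rows by interpolating between an existing minority row and one of its nearest minority neighbours. The sketch below is a minimal illustration of that idea, not a full implementation: it is written in Python for brevity (the submission itself must be in R), the function name smote, its parameters, and the toy rows are assumptions, and it handles numeric features only.

```python
import random

def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points, each interpolated between a randomly chosen
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Made-up minority-class rows (age, work experience); NOT taken from Cars.csv.
minority = [(38.0, 15.0), (40.0, 18.0), (36.0, 14.0), (42.0, 20.0)]

new_rows = smote(minority, n_synthetic=6, k=2, seed=42)
for row in new_rows:
    print(tuple(round(v, 2) for v in row))
```

Because every synthetic row is a convex combination of two real minority rows, it always lies between them. Oversampling of this kind is normally applied to the training split only, so that the test data still reflects the original class balance.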

Modeling (30 Marks)

Create multiple models and explore how each model performs using appropriate model performance metrics (15 marks)

KNN 
Naive Bayes (is it applicable here? comment and if it is not applicable, how can you build an NB model in this case?)
Logistic Regression
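
As a reminder of what the performance metrics measure, the sketch below derives accuracy, sensitivity, specificity, and precision from a confusion matrix. It is illustrative only: it is written in Python for brevity (the submission itself must be in R), the labels "Car" and "Other" are an assumed encoding of the target, and the actual and predicted values are made up rather than produced by any model.

```python
def confusion_counts(actual, predicted, positive="Car"):
    """Count true/false positives and negatives for a binary target."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a == positive:
            fn += 1  # a Car user the model missed
        elif p == positive:
            fp += 1  # a non-Car user predicted as Car
        else:
            tn += 1
    return tp, fn, tn, fp

def metrics(tp, fn, tn, fp):
    """Derive common metrics; a ratio with an empty denominator is reported as 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0
    return {
        "accuracy": ratio(tp + tn, tp + fn + tn + fp),
        "sensitivity": ratio(tp, tp + fn),  # share of actual Car users found
        "specificity": ratio(tn, tn + fp),  # share of non-Car users found
        "precision": ratio(tp, tp + fp),    # share of Car predictions that are right
    }

# Made-up labels for illustration; NOT model output on Cars.csv.
actual = ["Car", "Car", "Other", "Other", "Other", "Car", "Other", "Other"]
predicted = ["Car", "Other", "Other", "Other", "Car", "Car", "Other", "Other"]

counts = confusion_counts(actual, predicted)
scores = metrics(*counts)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

When the positive class is rare, as is common before oversampling, accuracy alone can look high even if few Car users are identified, which is why sensitivity and precision are worth reporting alongside it.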

Apply both bagging and boosting modeling procedures to create 2 models and compare their accuracy with that of the best model from the above step. (15 marks)

Actionable Insights & Recommendations (5 Marks)

Summarize your findings from the exercise in a concise yet actionable note.

Please note the following:

Your submission should have two files: 1) a business report in PDF format with a word limit of 3000 words, and 2) an R code file. Appendices are not counted in the word limit.
You must give the sources of data presented. Do not refer to blogs, Wikipedia, etc.
Any assignment found to be copied or plagiarized from others will not be graded and will be marked as zero.
Please ensure timely submission, as assignments submitted after the deadline will not be accepted.