All the data came from the Kaggle competition [1].
The work includes:
1 - Data Visualization:
- Describe the general mathematical formulation of a generalized linear model and its parameter-fitting algorithm. (theoretical)
- Describe the regularization problem and the effects of ill-conditioning on the results.
- Compute basic statistics for each variable.
- Analyze the distribution of each variable.
- Analyze outliers.
- Check the correlations.
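
The exploratory steps listed above (basic statistics, distributions, outliers, correlations) could be sketched roughly as follows. This is a minimal illustration on synthetic data, not the analysis actually performed: the variable names `feature_a` and `feature_b` are hypothetical and do not come from the competition files, and the 1.5 x IQR rule is only one common choice for flagging outliers.

```python
import numpy as np

# Synthetic stand-in data; the names below are hypothetical, not the competition schema.
rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(50.0, 10.0, size=n)
x2 = 0.8 * x1 + rng.normal(0.0, 3.0, size=n)  # correlated with x1 by construction
x2[:5] = 500.0                                 # inject a few extreme values
data = {"feature_a": x1, "feature_b": x2}

def summarize(values):
    """Return basic descriptive statistics for one variable."""
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    return {
        "mean": float(np.mean(values)),
        "std": float(np.std(values, ddof=1)),  # sample standard deviation
        "min": float(np.min(values)),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "max": float(np.max(values)),
    }

def iqr_outlier_mask(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

for name, values in data.items():
    counts, edges = np.histogram(values, bins=10)  # coarse view of the distribution
    n_outliers = int(iqr_outlier_mask(values).sum())
    print(name, summarize(values))
    print("  histogram counts:", counts.tolist(), "| flagged outliers:", n_outliers)

# Pearson correlation matrix; rows of the stacked array are variables.
corr = np.corrcoef(np.vstack([data["feature_a"], data["feature_b"]]))
print(corr)
```

Comparing the correlation before and after removing the flagged points is one simple way to see how much a handful of extreme values can distort it.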
2 - Prediction and Analysis in Linear Models:
- Describe the mathematical formulation of the model used and its parameter-fitting algorithm.
- Explore different model structures and parameters for linear regression models. Evaluate, if possible, the effects of outliers and overfitting.
- Evaluate the results using k-fold cross-validation (k=10).
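
As a rough illustration of the items above, the sketch below fits a linear model by solving the regularized normal equations and scores it with 10-fold cross-validation. It assumes an L2 (ridge) penalty and RMSE as the error measure, neither of which is specified here; the helper names `fit_ridge`, `predict`, and `kfold_rmse` are hypothetical, and the data are synthetic, with two nearly collinear features to show the ill-conditioning effect.

```python
import numpy as np

def fit_ridge(X, y, lam=0.0):
    """Solve (Xb^T Xb + lam*I) w = Xb^T y; lam=0 reduces to ordinary least squares."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend an intercept column
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0                             # leave the intercept unpenalized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def predict(w, X):
    return w[0] + X @ w[1:]

def kfold_rmse(X, y, lam, k=10, seed=0):
    """Return the RMSE on each of k held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = fit_ridge(X[train], y[train], lam)
        err = predict(w, X[test]) - y[test]
        scores.append(float(np.sqrt(np.mean(err ** 2))))
    return np.array(scores)

# Synthetic data with two nearly collinear features (an ill-conditioned problem).
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = a + 1e-3 * rng.normal(size=n)
X = np.column_stack([a, b])
y = 3.0 * a + rng.normal(scale=0.5, size=n)

gram = X.T @ X
print("condition number, no penalty:", np.linalg.cond(gram))
print("condition number, lam=1.0:  ", np.linalg.cond(gram + 1.0 * np.eye(2)))

for lam in (0.0, 1.0):
    scores = kfold_rmse(X, y, lam)
    w = fit_ridge(X, y, lam)
    print(f"lam={lam}: mean RMSE={scores.mean():.3f}, weights={np.round(w, 2)}")
```

With near-collinear features the unpenalized weights can be large and unstable even when the held-out error looks fine, which is why the condition number is printed alongside the cross-validation scores.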
3 - Prediction and Comparison with Non-Linear Models:
- Evaluate the results using k-fold cross-validation (k=10).
- Compare the results obtained with neural networks, SVMs, or another non-linear algorithm.
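
One possible way to set up the comparison above is sketched below with scikit-learn, running the same 10-fold split over a linear baseline, an RBF-kernel SVR, and a small multilayer perceptron. The data are synthetic, the hyperparameters are arbitrary placeholders rather than tuned values, and mean absolute error is used as the score purely as an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic non-linear target; a stand-in for the real features and label.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(2.0 * X[:, 0]) + rng.normal(scale=0.1, size=200)

# Placeholder hyperparameters, chosen only to keep the example fast.
models = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "svr_rbf": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0)),
    "mlp": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    ),
}

# Reuse one shuffled 10-fold split so every model sees the same folds.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    neg_mae = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_absolute_error")
    results[name] = -neg_mae  # flip the sign so that lower is better
    print(f"{name}: MAE = {results[name].mean():.3f} +/- {results[name].std():.3f}")
```

Putting the scaler inside each pipeline means it is refit on the training folds only, so no information from the held-out fold leaks into preprocessing.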
[1] - https://www.kaggle.com/c/pubg-finish-placement-prediction