About:
This course is designed for students and professionals who aspire to build a career in data science. It covers in depth the fundamentals of mathematics and statistics required to learn machine learning, along with practical applications using the R programming language. It is scheduled to start on 18 May 2019.
Course Duration:
This is a three-month course conducted mostly on weekends (Saturdays and Sundays), making it suitable for working professionals. Each session runs for 3 hours a day and combines theoretical background with hands-on programming in R. The maximum intake for the batch is 30 students.
Eligibility:
Students and professionals who are graduates or final-year undergraduates, have basic exposure to mathematics, and are interested in pursuing data science are encouraged to apply. No programming background is required to join the course; however, exposure to C++ will be an advantage.
Topics to be covered
Mathematics and statistics are integral to data science, and a strong background in both subjects benefits students aspiring to a career in the field. With this aim, the course is divided into four components:
• Introduction to R, a Statistical Programming Language.
• Basic Statistics, Probability and Inference.
• Applied Linear Algebra and Matrix Computations.
• Basic Algorithms in Machine Learning.
In addition, the design of flex boards using Shiny in R will be discussed for data presentation.
Basic Algorithms in Machine Learning
The following topics will be covered in this component, along with case studies:
• Simple and Multiple Linear Regression.
• Advanced Linear Regression which includes Ridge Regression, LASSO and Elastic Net.
• Logistic Regression.
• Naive Bayes Classifier.
• Clustering using K-Means.
• ARIMA Modelling Technique (Time Series).
• Market Basket Analysis.
Statistical Machine Learning using R - Session-wise Details
Session No | General Topic | Specific Topic of Discussion |
1 | Programming with R | Introduction to R, R Data Structures, Control Structures, Loops and Functions. |
2 | Programming with R | Introduction to Databases, Data Imports and Exports in R. |
3 | Programming with R | Data Manipulations using R. |
4 | Programming with R | Graphics using R and Introduction to ggplot2. |
5 | Statistics, Probability and Inference | Basic Statistics, Correlation and Probability. |
6 | Statistics, Probability and Inference | Probability Distributions, Concepts of Sampling. |
7 | Statistics, Probability and Inference | Testing of Hypothesis using R. |
8 | Statistics, Probability and Inference | Association between Variables and ANOVA using R. |
9 | Statistics, Probability and Inference | Statistical Inference using R. |
11 | Applied Linear Algebra | Introduction to the Concept of Distance, Euclidean Distance, Manhattan Distance and other Terminologies, Matrix Algebra, Some Special Matrices, Systems of Linear Equations, LU, LDU and Cholesky Decompositions. |
12 | Applied Linear Algebra | Introduction to Euclidean Spaces, Rank of Matrices, Determinants and Complementary Subspaces, Rank-Nullity Theorem, Rank Normal Form, Rank and Determinants of Partitioned Matrices. |
13 | Applied Linear Algebra | Inner Products, Norms, Orthogonality, Gram-Schmidt Process, Orthocomplementary Subspaces, Advanced Topics in Orthogonality including QR Decomposition, Orthogonal Projectors, Orthogonal Triangularization and Orthogonal Similarity Reduction. |
14 | Applied Linear Algebra | Eigenvalues and Eigenvectors and their Real-life Applications, Spectral Decomposition of Real Symmetric Matrices, in particular Variance-Covariance Matrices. |
15 | Applied Linear Algebra | Singular Value Decomposition (SVD), SVD and Linear Systems, SVD, Data Compression and Principal Components, Computing the SVD. |
16 | Applied Linear Algebra | Introduction to Quadratic Forms, Positive Definite Matrices, Inequalities related to Quadratic Forms. |
17 | Applied Linear Algebra | Linear Iterative Systems and Convergence of Matrix Powers, Spectral Radius, Introduction to Web Page Ranking and Markov Chains, Jacobi, Gauss-Seidel and Conjugate Gradient Methods. |
18 | Machine Learning | Introduction to Linear Regression. (Simple and Multivariate Regression with a Case Study) |
19 | Machine Learning | Introduction to Advanced Regression Topics. (Ridge Regression, LASSO and Elastic Net) |
19 | Machine Learning | Introduction to Logistic Regression with a Case Study. |
20 | Machine Learning | ARIMA Modelling Technique (Time Series) with a Case Study. |
21 | Machine Learning | Clustering using K-Means with a Case Study. |
22 | Machine Learning | Naive Bayes Classifier with a Case Study. |
23 | Machine Learning | Market Basket Analysis with a Case Study. |
24 | Machine Learning | Evaluation Metrics, Bias-Variance Tradeoff. |
25 | Programming with R | Design of Flexboards in R using Shiny to present the Data. |