Machine Learning with Python (6 ECTS)

Course unit code: TT00EV96

General information

Credits: 6 ECTS

Teaching language: English

Objective

This course covers practical machine learning in Python, an approachable and widely used programming language. The student gains hands-on experience with popular Python libraries for machine learning: NumPy, Matplotlib, Pandas, Seaborn, and Scikit-learn. By the end of the course, the student will be able to implement their own supervised and unsupervised machine learning models from scratch, get them to work, and evaluate their performance. Common practices and techniques used by data scientists and machine learning practitioners are described throughout the course to prepare the student for future job opportunities.

Content

1. An Introduction to Machine Learning:
Machine Learning – Applications of Machine Learning – Why Python?

2. Getting Started with Python:
Jupyter Notebook – Setting up Jupyter Notebook – Getting Started with Jupyter Notebook – Python Basics: Syntax and Variables – Python Basics: Operators – Python Basics: Data Types – Python Basics: Decision Making – Python Basics: Loops – Python Basics: Defining Functions – Python Libraries for Machine Learning

3. NumPy:
Introduction – Arrays – Array Math – Array Indexing

4. Matplotlib:
Introduction – Plot – Subplot and Scatter Plot – Object-Oriented Interface (OOI) and Easy Subplotting

5. Pandas:
Introduction – Loading Data – Accessing DataFrame Elements – Basic Statistics and Missing Values – Querying and GroupBy

6. Regression:
Scikit-learn – Linear Regression: Introduction – Linear Regression: Implementation – K-Nearest Neighbors Regression: Introduction – K-Nearest Neighbors Regression: Implementation

7. Classification:
Logistic Regression: Introduction – Logistic Regression: Implementation – Support Vector Machines: Introduction – Support Vector Machines: Implementation – K-Nearest Neighbors Classifier: Introduction – K-Nearest Neighbors Classifier: Implementation – Decision Tree: Introduction – Decision Tree: Implementation

8. Unsupervised Learning:
Principal Component Analysis: Introduction – Principal Component Analysis: Implementation – k-Means Clustering: Introduction – k-Means Clustering: Implementation

9. Final Tasks:
Project – Self-study Essay

Assessment criteria, satisfactory (1)

- The student knows the basic concepts of machine learning.
- The student knows the general framework of machine learning algorithms and their primary types.
- The student is familiar with real-life applications of machine learning.
- The student is familiar with the history of Python and why it is important for machine learning.
- The student knows how to set up and get started with Jupyter Notebook for Python.
- The student is familiar with writing code in the Python programming language.
- The student is familiar with important Python libraries for machine learning.

Assessment criteria, good (3)

- The student knows how to work with NumPy arrays and how to use different NumPy functions to operate on them.
- The student knows how to use Matplotlib to produce basic plots of data and results.
- The student knows how to use the Pandas library to work with and manipulate tabular data.
- The student is familiar with the Scikit-learn library and its importance for building machine learning models.
- The student knows how to train and evaluate linear regression models using Scikit-learn.
- The student knows how to implement simple non-linear regression models using Scikit-learn.
- The student is familiar with data preprocessing and how to perform it using Pandas and Scikit-learn libraries.

Assessment criteria, excellent (5)

- The student knows how to train and evaluate logistic regression, support vector machines, K-nearest neighbors, and decision tree classifiers using Scikit-learn.
- The student knows how to tune selected parameters of learning algorithms to achieve better performance.
- The student knows how to apply the PCA method using Scikit-learn for dimensionality reduction.
- The student knows how to apply the k-means method using Scikit-learn to perform clustering on unlabeled data.
- The student knows how to efficiently visualize data and model performance.
- The student knows how to use different metrics to facilitate model evaluation and selection.

Further information

This course is only for students of the Diploma in Machine & Deep Learning.