Introduction to Python for Data Science (4 credits)

Implementation code: TT00EV94-3001

Basic information about the implementation


Timing

01.08.2021 - 31.07.2022

Credits

4 credits

Virtual portion

4 credits

Mode of delivery

Distance learning

Unit

ICT and Industrial Management

Campus

Karaportti 2

Teaching languages

  • English

Seats

0 - 100

Degree programme

  • Degree Programme in Information and Communication Technology

Teacher

  • Virve Prami

Teacher in charge

Janne Salonen

Groups

  • DiplomaDA
    Diploma in Data Analytics
  • DiplomaMD
    Diploma in Machine and Deep Learning

Objectives

This course introduces students to the basics of the Python programming environment, including the fundamental Python programming techniques used in data science. Students learn data visualization, manipulation, and cleaning techniques with popular Python data science libraries by exploring different types of data, and gain hands-on experience with NumPy, Pandas, and Matplotlib. By the end of the course, the student will understand the data science workflow and the basics of Python programming, and will know how to take tabular data, clean it, manipulate it, visualize it, and run basic analyses.
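As a rough illustration of the workflow named in the objectives (take tabular data, clean it, manipulate it, and run a basic analysis), here is a minimal sketch using Pandas. The data, the column names, and the cleaning choices are invented for this example and are not taken from the course material.

```python
import pandas as pd

# Hypothetical tabular data with one duplicated row and one missing value.
raw = pd.DataFrame(
    {
        "city": ["Espoo", "Espoo", "Helsinki", "Helsinki", "Helsinki"],
        "temp_c": [4.0, 4.0, None, 6.0, 8.0],
    }
)

# Clean: drop exact duplicate rows, then rows with a missing temperature.
clean = raw.drop_duplicates().dropna(subset=["temp_c"]).reset_index(drop=True)

# Manipulate: derive a new column from an existing one.
clean["temp_f"] = clean["temp_c"] * 9 / 5 + 32

# Analyse: mean temperature per city.
mean_by_city = clean.groupby("city")["temp_c"].mean()
print(mean_by_city)
```

Whether to drop or fill missing values depends on the data set; dropping is used here only because it keeps the sketch short.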

Content

1. Introduction
What is Data Science? – Who is a Data Scientist? – Demands for Data Scientists – Data Science Workflow – Data Science Challenges – Programming in Data Science – Python for Data Science

2. Getting Started with Python
Jupyter Notebook – Anaconda – Anaconda Installation – Getting Started with Jupyter Notebook – Python Syntax – Introduction to Python Syntax – Important Characteristics of Python Syntax – Short Examples Without Detailed Explanation – Input and Output – Variables – Data Types – Python Operators – Arithmetic Operations – Comparison Operations – Logical Operations – String Operations

3. Python Data Structures
Introduction – Lists – List Indexing – List Slicing – List Manipulation: Add New Elements – List Manipulation: Change and Remove Elements – Tuple – Accessing Tuple Elements – Working with Tuples – Set – Set Manipulation – Dictionary – Accessing Dictionary Elements – Dictionary Manipulation

4. Python Programming Fundamentals
Conditions: Introduction – Conditions: if – Conditions: else – Conditions: elif – Loops: Introduction – Loops: for – Loops: for in data structures – Loops: while – Loops: break, continue – Functions: Introduction – Functions: user-defined functions I – Functions: user-defined functions II – Comprehensions

5. Introduction to NumPy
Introduction to NumPy – Array – Arrays Primary Functions – Intrinsic NumPy Array Creation – Creating Random Arrays – Standard Mathematics Operations – Broadcasting in NumPy – Vector and Matrix Mathematics – Statistics in NumPy – Common Mathematics Functions – Filtering – Copy and View

6. Data Manipulation with Pandas
Introduction to Pandas – Series and DataFrame – Input and Output – Summary and Statistics – Column Selection – Creating New Column – Removing Column – Location Selection – Filtering – Group By – Useful Functions – Handling Missing Data – Apply Function – Concatenation – Merging

7. Data Visualization with Matplotlib
Introduction to Matplotlib – Plot – Bar Plot – Histogram – Pie Chart – Scatter Plot – Plot Attributes – Subplot

8. Final Tasks
Project – Self-study Essay
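Three of the NumPy topics listed in section 5 (broadcasting, filtering, and copy versus view) can be sketched in a few lines. The arrays and variable names below are made up for illustration and do not come from the course material.

```python
import numpy as np

# Broadcasting: a (3, 1) column combines with a (4,) row into a (3, 4) grid.
col = np.array([[0], [10], [20]])
row = np.array([1, 2, 3, 4])
grid = col + row

# Filtering: a boolean mask selects the elements that satisfy a condition.
large = grid[grid > 12]

# View vs. copy: basic slicing returns a view that shares memory with the
# original array, while .copy() returns independent data.
a = np.arange(5)
view = a[1:4]
independent = a[1:4].copy()
view[0] = 99          # also changes a[1]
independent[1] = -1   # leaves a untouched

print(grid.shape)
print(large)
print(a)
```

Because a slice is a view, writing through it modifies the original array; calling `.copy()` is the usual way to avoid that when the original must stay unchanged.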

Time and location

The course is held online in the TechClass portal and is a 100% self-study course.

Learning materials

Lecture slides, tutorial videos, quizzes, exercises

Teaching methods

- Tutorial Videos
- Exercises
- Quizzes
- Project
- Self-study

Student workload

Lectures = 50h
Exercises = 20h
Self-study = 20h
Quizzes = 10h
Project = 20h
Total = 120 hours

Assessment scale

Pass/Fail

Assessment criteria, satisfactory (1)

- The student is familiar with the concept of Data Science and its challenges.
- The student is familiar with the basic concepts of Python.
- The student knows how to set up and get started with Jupyter Notebook for Python.
- The student is familiar with writing code in the Python programming language.
- The student knows the Python data structures.
- The student is familiar with the history of Python and why it is important for Data Science.
- The student is familiar with important Python libraries for Data Science.

Assessment criteria, good (3)

- The student knows about the Data Science Workflow.
- The student knows about functional programming in Python.
- The student is familiar with NumPy arrays and why they are important for vector and matrix operations.
- The student is familiar with Pandas and knows about DataFrame and Series.
- The student is familiar with manipulating tabular data.
- The student knows how to use Matplotlib to produce basic plots of data and results.

Assessment criteria, excellent (5)

- The student knows how to use different NumPy functions to operate on arrays.
- The student knows about advanced topics in NumPy arrays, such as copies and views.
- The student knows how to take tabular data, clean it, and manipulate it.
- The student knows how to plot multiple charts in one figure.
- The student knows how to plot charts with custom configuration and annotations.
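The last two criteria (multiple charts in one figure, custom configuration and annotations) might look roughly like the following Matplotlib sketch. The data, labels, styling, and output file name are arbitrary choices for illustration, not requirements from the course.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)

# Two charts side by side in a single figure.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Left: line plot with custom colour, line style, labels, and legend.
ax_line.plot(x, y, color="tab:blue", linestyle="--", label="sin(x)")
ax_line.set_title("Line plot")
ax_line.set_xlabel("x")
ax_line.set_ylabel("sin(x)")
ax_line.legend()

# Annotation: an arrow pointing at the highest sampled value.
peak = np.argmax(y)
ax_line.annotate(
    "maximum",
    xy=(x[peak], y[peak]),
    xytext=(x[peak] + 1, 0.8),
    arrowprops={"arrowstyle": "->"},
)

# Right: histogram of the same values.
ax_hist.hist(y, bins=10, color="tab:orange")
ax_hist.set_title("Histogram")

fig.tight_layout()
fig.savefig("subplots_demo.png")  # hypothetical output file name
```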

Assessment methods and criteria

Exercises 65%
Quizzes 5%
Project 20%
Essay 10%

Further information

The course is only for Diploma in Machine & Deep Learning students.