
Statistical Data Analysis with Python (10 ECTS)

Code: TT00FA97-3001

General information


Timing: 01.08.2022 - 31.12.2023 (implementation has ended)
Number of ECTS credits allocated: 10 ECTS
Virtual portion: 10 ECTS
Mode of delivery: Online
Campus: Karaportti 2
Teaching languages: English
Seats: 0 - 500
Degree programmes: Information and Communication Technology
Teachers: Virve Prami
Groups: OPEN_UAS_TIVI_AI_ML_DS_75_ECTS (Open UAS: Artificial intelligence, Machine Learning and Data Science (NonStop Module) 75 ECTS)
Course: TT00FA97

Learning outcomes

Data analysis and statistical analysis are core skills for many in-demand data analytics roles. They are used hand in hand to solve business problems: data analysis is a general approach to making data understandable for decision-makers, while statistical analysis brings formal statistical rigor to that work. A rich data analysis skill set helps you understand data better and extract knowledge and insights from it. This course gives you the resources to build the Python skills you need to succeed as a data analyst. By the end of the course, you will understand how to use Python’s scientific computing libraries to import, clean, manipulate, and visualize data, and how to apply a wide range of statistical techniques to extract meaningful insights.

This course is 100% virtual, built on comprehensive tutorial videos and other content prepared for it.

Students pass the course by submitting the required quizzes, assignments, and final project.

Content

1. Introduction to Data Analysis:
What is Data Analysis? – Different Types of Data Analysis – What is Statistical Analysis? – Descriptive vs. Inferential Statistics – Methods of Sampling – Steps Involved in Data Analysis – Quiz

2. Data Ingestion:
Introduction – Importing Flat Files – Parsing Date and Time – Importing Excel Spreadsheets – Connecting to a Database – Retrieving Tables from MySQL Databases – Retrieving Tables from PostgreSQL Database – Retrieving Data from Azure Blob Storage – Retrieving Data from AWS S3 Buckets – Importing JSON Files – Combining Multiple Datasets – Quiz

3. Descriptive Statistics:
Introduction – Histogram and Bar Chart – Central Tendency Measures – Data Variability Measures – Extracting Descriptive Statistics – Skewness – Kurtosis

4. Data Cleaning:
Introduction – Handling Incorrect Values – Handling Incorrect Data Types – Removing Missing Values – Handling Missing Values: Simple Imputation – Handling Missing Values: K-NN Imputation – Handling Missing Values: MICE – Binning – Outlier Detection: IQR Method – Outlier Detection: Isolation Forest – Data Sanitization – Quiz

5. Probability:
Introduction – Probabilistic Experiment – Probability of an Event – Random Variable – Discrete and Continuous Random Variables – Probability Mass Function – Probability Density Function – Cumulative Distribution Function – Empirical Cumulative Distribution Function – Expected Values

6. Statistical Data Modeling:
Introduction – Normal Distribution – Other Types of Distribution Functions – Kernel Density Estimation – Fitting Data to the Probability Distribution – Conditional Probabilistic Analysis

7. Relationship Analysis:
Introduction – Correlation vs. Causation – Covariance Matrix – Pearson Correlation – Kendall Rank Correlation – Spearman Rank Correlation – Heatmap of Correlation Matrix – Quiz

8. Hypothesis Testing:
Introduction – Essential Concepts – Chi-square Test of Independence – Chi-square Test of Independence: Implementation – Two-Sample t-Test – Paired t-Test – One-Way ANOVA – Post-Hoc Test – Non-Parametric Tests

9. A/B Testing:
Introduction – Designing the Experiment – Collecting and Preparing the Data – Visualizing the Results – Testing the Hypothesis – Drawing Conclusions

10. Final Tasks:
Project – Self-study Essay

Prerequisites

Introduction to Python for Data Science

Teaching methods

This course is 100% virtual, built on comprehensive tutorial videos and other content prepared for it.

Students pass the course by submitting the required quizzes, assignments, and final project.

Location and time

The course can be completed at the student's own pace in the TechClass portal.

Learning materials and recommended literature

Lecture slides, tutorial videos, quizzes, exercises, and the project can be found in the TechClass portal.

Alternative completion methods of implementation

N/A

Internship and working life connections

N/A

Exam dates and retake possibilities

Online.

International connections

N/A

Student workload

- Tutorial Videos
- Exercises
- Quiz
- Project
- Self-study

Content scheduling

Up to the student.

Assessment methods and criteria

Exercises 50%
Quizzes 25%
Project 25%

Evaluation scale

Pass/Fail

Assessment criteria, approved/failed

Exercises 50%
Quizzes 25%
Project 25%
