Statistical Data Analysis with Python (10 ECTS)
Code: TT00FA97-3001
General information
- Timing
- 01.08.2022 - 31.12.2023
- Implementation has ended.
- Number of ECTS credits allocated
- 10 ECTS
- Virtual portion
- 10 ECTS
- Mode of delivery
- Online
- Campus
- Karaportti 2
- Teaching languages
- English
- Seats
- 0 - 500
- Degree programmes
- Information and Communication Technology
- Teachers
- Virve Prami
- Groups
- OPEN_UAS_TIVI_AI_ML_DS_75_ECTS: Open UAS: Artificial intelligence, Machine Learning and Data Science (NonStop Module) 75 ECTS
- Course
- TT00FA97
Learning outcomes
Data analysis and statistical analysis are core skills in many in-demand data analytics roles. They are used hand in hand to solve business problems: data analysis is a general approach to making data understandable for decision-makers, while statistical analysis applies formal statistical methods to that data. A rich data analysis skill set helps you understand data better and extract knowledge and insights from it. This course gives you the resources to build the Python skills you need to succeed as a Data Analyst. By the end of the course, you will understand how to use Python’s scientific computing libraries to import, clean, manipulate, and visualize data, and how to apply a wide range of statistical techniques to extract meaningful insights.
This course is 100% virtual thanks to the comprehensive tutorial videos and content prepared for this course.
The student passes the course by submitting the required quizzes, assignments, and the final project.
Content
1. Introduction to Data Analysis:
What is Data Analysis? – Different Types of Data Analysis – What is Statistical Analysis? – Descriptive vs. Inferential Statistics – Methods of Sampling – Steps Involved in Data Analysis – Quiz
2. Data Ingestion:
Introduction – Importing Flat Files – Parsing Date and Time – Importing Excel Spreadsheets – Connecting to a Database – Retrieving Tables from MySQL Databases – Retrieving Tables from PostgreSQL Databases – Retrieving Data from Azure Blob Storage – Retrieving Data from AWS S3 Buckets – Importing JSON Files – Combining Multiple Datasets – Quiz
3. Descriptive Statistics:
Introduction – Histogram and Bar Chart – Central Tendency Measures – Data Variability Measures – Extracting Descriptive Statistics – Skewness – Kurtosis
4. Data Cleaning:
Introduction – Handling Incorrect Values – Handling Incorrect Data Types – Removing Missing Values – Handling Missing Values: Simple Imputation – Handling Missing Values: K-NN Imputation – Handling Missing Values: MICE – Binning – Outlier Detection: IQR Method – Outlier Detection: Isolation Forest – Data Sanitization – Quiz
5. Probability:
Introduction – Probabilistic Experiment – Probability of an Event – Random Variable – Discrete and Continuous Random Variables – Probability Mass Function – Probability Density Function – Cumulative Distribution Function – Empirical Cumulative Distribution Function – Expected Values
6. Statistical Data Modeling:
Introduction – Normal Distribution – Other Types of Distribution Functions – Kernel Density Estimation – Fitting Data to the Probability Distribution – Conditional Probabilistic Analysis
7. Relationship Analysis:
Introduction – Correlation vs. Causation – Covariance Matrix – Pearson Correlation – Kendall Rank Correlation – Spearman Rank Correlation – Heatmap of Correlation Matrix – Quiz
8. Hypothesis Testing:
Introduction – Essential Concepts – Chi-square Test of Independence – Chi-square Test of Independence: Implementation – Two-Sample t-Test – Paired t-Test – One-Way ANOVA – Post-Hoc Test – Non-Parametric Tests
9. A/B Testing:
Introduction – Designing the Experiment – Collecting and Preparing the Data – Visualizing the Results – Testing the Hypothesis – Drawing Conclusions
10. Final Tasks:
Project – Self-study Essay
Prerequisites
Introduction to Python for Data Science
Teaching methods
This course is 100% virtual thanks to the comprehensive tutorial videos and content prepared for this course.
The student passes the course by submitting the required quizzes, assignments, and the final project.
Location and time
The course can be completed at your own pace in the TechClass portal.
Learning materials and recommended literature
Lecture slides, tutorial videos, quizzes, exercises, and the project can be found in the TechClass portal.
Alternative completion methods of implementation
N/A
Internship and working life connections
N/A
Exam dates and retake possibilities
Online.
International connections
N/A
Student workload
- Tutorial Videos
- Exercises
- Quiz
- Project
- Self-study
Content scheduling
Up to the student; the course is self-paced.
Assessment methods and criteria
Exercises 50%
Quizzes 25%
Project 25%
Evaluation scale
Approved/Failed
Assessment criteria, approved/failed
Exercises 50%
Quizzes 25%
Project 25%