Statistical Data Analysis with Python (10 ECTS)
Implementation code: TT00FA97-3001
Basic information about the implementation
- Timing
- 01.08.2022 - 31.12.2023
- The implementation has ended.
- Number of ECTS credits
- 10 ECTS
- Virtual portion
- 10 ECTS
- Mode of delivery
- Distance learning
- Campus
- Karaportti 2
- Teaching languages
- English
- Seats
- 0 - 500
- Degree programme
- Degree Programme in Information and Communication Technology
- Teachers
- Virve Prami
- Groups
- OPEN_UAS_TIVI_AI_ML_DS_75_ECTS, Open UAS: Artificial intelligence, Machine Learning and Data Science (NonStop Module) 75 ECTS
- Course
- TT00FA97
Objectives
Data analysis and statistical analysis are core skills for many in-demand data analytics roles. They are used hand in hand to solve business problems: data analysis is a general approach to making data understandable for decision-makers, while statistical analysis brings a rigorous statistical perspective to that work. A rich data analysis skill set helps you understand data better and extract knowledge and insights from it. This course is designed to give you the resources to build the Python skills you need to succeed as a Data Analyst. By the end of this course, you will understand how to use Python's scientific computing libraries to import, clean, manipulate, and visualize data, and how to apply a wide range of statistical techniques to extract meaningful insights.
This course is 100% virtual thanks to the comprehensive tutorial videos and content prepared for this course.
The student passes the course after submitting the required quizzes, assignments, and the final project.
Content
1. Introduction to Data Analysis:
What is Data Analysis? – Different Types of Data Analysis – What is Statistical Analysis? – Descriptive vs. Inferential Statistics – Methods of Sampling – Steps Involved in Data Analysis – Quiz
2. Data Ingestion:
Introduction – Importing Flat Files – Parsing Date and Time – Importing Excel Spreadsheets – Connecting to a Database – Retrieving Tables from MySQL Databases – Retrieving Tables from PostgreSQL Databases – Retrieving Data from Azure Blob Storage – Retrieving Data from AWS S3 Buckets – Importing JSON Files – Combining Multiple Datasets – Quiz
3. Descriptive Statistics:
Introduction – Histogram and Bar Chart – Central Tendency Measures – Data Variability Measures – Extracting Descriptive Statistics – Skewness – Kurtosis
4. Data Cleaning:
Introduction – Handling Incorrect Values – Handling Incorrect Data Types – Removing Missing Values – Handling Missing Values: Simple Imputation – Handling Missing Values: K-NN Imputation – Handling Missing Values: MICE – Binning – Outlier Detection: IQR Method – Outlier Detection: Isolation Forest – Data Sanitization – Quiz
5. Probability:
Introduction – Probabilistic Experiment – Probability of an Event – Random Variable – Discrete and Continuous Random Variables – Probability Mass Function – Probability Density Function – Cumulative Distribution Function – Empirical Cumulative Distribution Function – Expected Values
6. Statistical Data Modeling:
Introduction – Normal Distribution – Other Types of Distribution Functions – Kernel Density Estimation – Fitting Data to the Probability Distribution – Conditional Probabilistic Analysis
7. Relationship Analysis:
Introduction – Correlation vs. Causation – Covariance Matrix – Pearson Correlation – Kendall Rank Correlation – Spearman Rank Correlation – Heatmap of Correlation Matrix – Quiz
8. Hypothesis Testing:
Introduction – Essential Concepts – Chi-square Test of Independence – Chi-square Test of Independence: Implementation – Two-Sample t-Test – Paired t-Test – One-Way ANOVA – Post-Hoc Test – Non-Parametric Tests
9. A/B Testing:
Introduction – Designing the Experiment – Collecting and Preparing the Data – Visualizing the Results – Testing the Hypothesis – Drawing Conclusions
10. Final Tasks:
Project – Self-study Essay
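As a rough illustration of the kind of task units 3, 4 and 7 above point to (descriptive statistics, IQR-based outlier detection, Pearson correlation), the sketch below uses only the Python standard library. It is not taken from the course materials: the data values, the variable names, and the helper functions `iqr_bounds` and `pearson_r` are all assumptions made up for this example, and the course itself may use other libraries.

```python
import math
import statistics

# Hypothetical paired observations, invented for this illustration only.
hours_studied = [2.0, 3.5, 1.0, 4.0, 5.5, 3.0, 4.5, 2.5, 6.0, 30.0]
exam_scores = [52.0, 61.0, 45.0, 66.0, 78.0, 58.0, 70.0, 55.0, 83.0, 60.0]


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences of the IQR outlier rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Unit 3: central tendency and variability of one variable.
mean_hours = statistics.fmean(hours_studied)
median_hours = statistics.median(hours_studied)
stdev_hours = statistics.stdev(hours_studied)  # sample standard deviation

# Unit 4: flag values outside the IQR fences and drop those pairs.
lower, upper = iqr_bounds(hours_studied)
outliers = [h for h in hours_studied if h < lower or h > upper]
cleaned = [
    (h, s) for h, s in zip(hours_studied, exam_scores) if lower <= h <= upper
]

# Unit 7: linear association between the two variables after cleaning.
clean_hours = [h for h, _ in cleaned]
clean_scores = [s for _, s in cleaned]
r = pearson_r(clean_hours, clean_scores)

print(f"mean={mean_hours:.2f} median={median_hours:.2f} stdev={stdev_hours:.2f}")
print(f"IQR fences=({lower:.2f}, {upper:.2f}) outliers={outliers}")
print(f"Pearson r after removing outliers: {r:.3f}")
```

The mean is pulled well above the median by the single extreme value, which is why the sketch removes outliers before measuring correlation. Whether to drop, cap, or keep such values is a judgment call that depends on the data.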
Prerequisites
Introduction to Python for Data Science
Teaching methods
This course is 100% virtual thanks to the comprehensive tutorial videos and content prepared for this course.
The student passes the course after submitting the required quizzes, assignments, and the final project.
Time and location
The course can be completed at your own pace in the TechClass portal.
Learning materials and recommended literature
Lecture slides, tutorial videos, quizzes, exercises, and the project can be found in the TechClass portal.
Alternative completion methods
N/A
Internship and working life cooperation
N/A
Exam dates and retake opportunities
Online.
International connections
N/A
Student time use and workload
- Tutorial Videos
- Exercises
- Quiz
- Project
- Self-study
Content scheduling
Up to the student.
Assessment methods and criteria
Exercises 50%
Quizzes 25%
Project 25%
Assessment scale
Pass/Fail
Assessment criteria for a passing grade
Exercises 50%
Quizzes 25%
Project 25%