Introduction to Data Mining Tools (3 credits)

Code: TX00FB57

Scope

3 credits

Learning outcomes

Knowledge and understanding

Students will gain knowledge of data mining concepts and of how data mining algorithms are applied. The course focuses on the practicalities of undertaking a data mining project using the CRISP-DM and SEMMA++ frameworks. Topics include translating a business problem into a data mining problem, exploratory data analysis, preparing and enhancing data for modelling, building predictive models, tuning models, and assessing model performance. The course also offers insight into common problems and pitfalls and how to avoid them. Students are encouraged to bring their own datasets for exploration.

Students can choose which data mining tool to explore: the commercial package SAS Enterprise Miner or the open-source Weka platform.

Skills

Students will understand how to build projects using data mining tools without needing to program or script in high-level programming languages.

The focus will be on data mining, not on programming skills.

Content

- A walkthrough of the SAS and Weka interfaces
- Navigating the SAS and Weka documentation
- Defining a data mining project
- Defining scope, collecting data, and assessing the limitations and suitability of the data
- Application of common data mining frameworks
- Introduction to supervised learning methods
- Introduction to unsupervised learning methods
- Common pitfalls in data mining projects
- Constructing an Ethical Impact Statement for your data mining project

Prerequisites

Basic mathematics, for example linear algebra, and an understanding of summary statistics and distributions such as the normal distribution.

No programming knowledge of SAS or Java is necessary; however, students with these skills will be able to take their data mining projects to the next level, going beyond the out-of-the-box functionality provided.

Enrolment period

02.05.2023 - 31.07.2023

Timing

01.08.2023 - 04.08.2023

Credits

3 credits

Mode of delivery

Contact teaching

Unit

ICT and Industrial Management

Campus

Leiritie 1

Teaching languages
  • English
Seats

0 - 40

Degree programmes
  • Degree Programme in Information Technology
Teacher
  • Anthony Williams
Groups
  • ICTSUMMER
    ICT Summer School

Learning materials

Getting Started with SAS® Enterprise Miner™ [version 14.1 or later]
Data Mining: Practical Machine Learning Tools and Techniques, Ian H. Witten [any edition]

Further information for students

Students need to bring their own laptops.

Assessment scale

0-5

Assessment methods and criteria

Daily exercises assigned during the course are worth 60% of the grade; a report and products on a business problem together account for the remaining 40%.

Enrolment period

02.05.2022 - 14.08.2022

Timing

15.08.2022 - 19.08.2022

Credits

3 credits

Mode of delivery

Contact teaching

Unit

ICT and Industrial Management

Campus

Leiritie 1

Teaching languages
  • English
Seats

0 - 20

Degree programmes
  • Degree Programme in Information Technology
  • Degree Programme in Information and Communication Technology
Teacher
  • Anthony Williams
Groups
  • ICTSUMMER
    ICT Summer School

Learning materials

Getting Started with SAS® Enterprise Miner™ [version 14.1 or later]
Data Mining: Practical Machine Learning Tools and Techniques, Ian H. Witten [any edition]

Assessment methods and criteria

Daily exercises assigned during the course are worth 60% of the grade; a report and products on a business problem together account for the remaining 40%.
