University of Tsukuba | Graduate School of SIE | Department of CS | Lectures

Data Engineering I

Office hour
Hiroyuki Kitagawa
Laboratory of Advanced Research B, Room# 903, Monday 12:00-13:30

Office hour
Yoshiharu Ishikawa
Laboratory of Advanced Research B, Room# 904, Monday 12:00-13:30

Course number
Software Systems
Lecture (including exercises)

Monday 4, 5

Outline   This lecture gives an overview of advanced data engineering techniques, especially those used in data mining. First, a survey of basic database and information retrieval techniques is given. Then, various state-of-the-art data mining techniques are discussed.

Prerequisite Basic knowledge of databases and information retrieval is desirable.

Class plan
1. Review of database technology
Review of basic database technology

2. Review of information retrieval technology
Review of basic information retrieval technology

3. Introduction to data mining
Background issues, objectives, and basic concepts in data mining

4. Association rule
Association rules, Apriori algorithm, FP-growth method, evaluation of association rules, etc.

5. Classification
Classifier, decision tree, Bayesian classifier, nearest neighbor method, SVM, etc.

6. Clustering
Partition-based methods, hierarchical methods, density-based methods, clustering methods for categorical attributes, etc.

7. Web mining
Web graph structure, classification of Web mining approaches, link analysis, Web community mining, etc.

Text Handouts are distributed in class.

References - J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann
- S. Chakrabarti, Mining the Web, Morgan Kaufmann
- T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, Springer
- C. Faloutsos, Searching Multimedia Databases by Content, Kluwer Academic Publishers

Evaluation Assessment is based on reports, quizzes, and exercises.
