This second course in data mining focuses on understanding the research methods used in the field. The course targets students who want to gain in-depth knowledge about a particular data mining topic (e.g., PhD students who plan to use a data mining component in their research).
Prerequisites: Successful completion of an introductory data mining course such as EMIS/CSE 7331 (EMIS 7332 until Fall 2016). It is assumed that every student is familiar with the basic data mining topics (clustering, classification, and association rules) and has some experience with programming and with one or more data mining tools (R, RapidMiner, SAS, SPSS, Weka, XLMiner, etc.). Enrolling in this course requires department/instructor consent.
Date | Presenter | Tutorial | Abstract | Additional material (code, software, etc.) |
---|---|---|---|---|
2/9 | Michael Hahsler | Data Stream Mining | | R package stream, MOA, Apache Spark Streaming, Apache Storm, Apache Samza, Apache SAMOA, IBM Streams, MS Azure Stream Analytics |
2/16 | Xinxiang Zhang | Deep Learning | Abstract | |
2/23 | Anyu Zhang | Lung Cancer Detection | Abstract | |
3/2 | Farzad Kamalzadeh | Markov Models in Health Care | Abstract | R Code, R Package seqHMM |
3/9 | Michael Prappas | Natural Language Processing | Abstract | Code |
3/23 | Scott Eisenhart | Recommender Systems | Abstract | recommenderlab |
3/30 | Andrew Cranmer | Image Mining - Local Binary Patterns (video) | Abstract | |
4/6 | Tutorial moved to 4/25 | | | |
4/13 | Revant Reddy Katanguri | Big Data Technologies | Abstract | |
4/20 | Ben Brock | Mining Applications for the Internet of Things (video) | Abstract | |
4/25 | Harold Mitchell | Introduction to Apache Spark | Abstract | Anaconda, Pyspark_First_Program.ipynb, Spark-HelloWorld.ipynb |