Man-Tak Shing Mathias Kolsch - Data Sciences
Dr. Man-Tak Shing
Associate Professor
Department of Computer Science
shing@nps.edu
Dr. Mathias Kolsch
Associate Professor
Department of Computer Science
kolsch@nps.edu
Overview
A description of the class CS4921: Mining of Large Datasets, along with the background knowledge you should have before taking the class.
CS4921 Mining of Large Datasets (3-1)
Modern data-mining applications, often called "big-data" analysis, require us to manage immense amounts of data quickly. Big-data mining focuses on extracting information from very large amounts of data, that is, data so large that it does not fit in a single computer's memory or on its disk. Because of the emphasis on size, many of the examples covered in the course are about the Web or data derived from the Web. Rather than using data to "train" machine-learning engines, this course takes an algorithmic point of view, focusing on applying algorithms to data, with hands-on work in Hadoop. Topics covered in the course include:
- Distributed file systems and map-reduce algorithms
- Data mining techniques - finding similar items, clustering
- Technologies for search engines - link analysis, page ranking, link-spam detection
Prerequisites:
- Ability to program in Python, C++, or Java
- Basic knowledge of data structures and algorithms
- Basic UNIX command line usage
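For a flavor of the map-reduce style of computation named in the course topics, the following is a minimal single-process sketch in Python. It is not course material and does not use Hadoop: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are hypothetical, chosen only to illustrate the idea of emitting key-value pairs, grouping them by key, and reducing each group.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all values by key, the job a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on an in-memory list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input, small enough to check by hand.
counts = word_count(["big data is big", "data mining"])
```

In a real distributed setting, the map and reduce functions would run in parallel on many machines and the grouping step would move data across the network; here all three steps run in one process so the data flow is easy to follow.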