Course description

Mining Massive Datasets

The course is based on the text Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, and Jeff Ullman, who by coincidence are also the instructors for the course.

The book is published by Cambridge University Press; by arrangement with the publisher, a free copy is also available for download. The material in this online course closely matches the content of the Stanford course CS246.

The major topics covered include MapReduce systems and algorithms, locality-sensitive hashing, algorithms for data streams, PageRank and Web-link analysis, frequent itemset analysis, clustering, computational advertising, recommendation systems, social-network graphs, dimensionality reduction, and machine-learning algorithms.

Upcoming start dates

Start anytime

Self-Paced Online
English

Suitability - Who should attend?

Prerequisites

The course is intended for graduate students and advanced undergraduates in computer science. At a minimum, you should have taken courses in data structures, algorithms, database systems, linear algebra, multivariable calculus, and statistics.

Outcome / Qualification etc.

What you'll learn

MapReduce systems and algorithms
Locality-sensitive hashing
Algorithms for data streams
PageRank and Web-link analysis
Frequent itemset analysis
Clustering
Computational advertising
Recommendation systems
Social-network graphs
Dimensionality reduction
Machine-learning algorithms

Course delivery details

This course is offered through Stanford University, a partner institution of edX.

5-10 hours per week

Expenses

Verified Track - $149
Audit Track - Free
