Course catalog: Big Data Fundamentals Training

Course outline:

Big Data Fundamentals Training

Section 1: The basics of working with big data

Understand the four V’s of Big Data (Volume, Velocity, Variety, and Veracity); Build models for data; Understand the occurrence of rare events in random data.

Section 2: Web and social networks

Understand characteristics of the web and social networks;

Model social networks; Apply algorithms for community detection in networks.

Section 3: Clustering big data

Cluster social networks; Apply hierarchical clustering; Apply k-means clustering.

Section 4: Google web search

Understand the concept of PageRank; Implement the basic PageRank algorithm for strongly connected graphs;

Implement PageRank with taxation for graphs that are not strongly connected.
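As an illustration of the taxation objective, here is a minimal sketch in Python, not the course's own implementation. The function name `pagerank`, the damping value `beta=0.85`, and the choice to spread a dead end's rank evenly over all nodes are assumptions made for this example.

```python
def pagerank(links, beta=0.85, iterations=50):
    """Power iteration with taxation (hypothetical helper for illustration).

    links maps each node to the list of nodes it links to.
    """
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Taxation: every node receives an equal share of (1 - beta).
        new = {u: (1.0 - beta) / n for u in nodes}
        for u in nodes:
            outs = links.get(u, [])
            if outs:
                share = beta * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
            else:
                # Assumption: a dead end distributes its rank to all nodes.
                share = beta * rank[u] / n
                for v in nodes:
                    new[v] += share
        rank = new
    return rank


graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
```

Node D has no incoming links, so without taxation its rank would drop to zero; with taxation it keeps the baseline share `(1 - beta) / n`.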

Section 5: Parallel and distributed computing using MapReduce

Understand the architecture for massive distributed and parallel computing;

Apply MapReduce using Hadoop; Compute PageRank using MapReduce.

Section 6: Computing similar documents in big data

Measure importance of words in a collection of documents;

Measure similarity of sets and documents; Apply locality-sensitive hashing to compute similar documents.
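To make the set-similarity objective concrete, the sketch below computes Jaccard similarity on word shingles and a MinHash signature, which is the usual input to locality-sensitive hashing. All function names, the shingle length `k=2`, and the use of seeded MD5 digests as hash functions are assumptions chosen for this example.

```python
import hashlib


def shingles(text, k=2):
    """Return the set of k-word shingles of a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def minhash_signature(items, num_hashes=64):
    """One minimum hash value per seeded hash function (items must be non-empty)."""
    return [
        min(int(hashlib.md5(f"{seed}:{item}".encode()).hexdigest(), 16)
            for item in items)
        for seed in range(num_hashes)
    ]


def estimated_similarity(sig_a, sig_b):
    """Fraction of positions where two signatures agree; estimates Jaccard."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


doc_a = shingles("the quick brown fox jumps")
doc_b = shingles("the quick brown dog jumps")
exact = jaccard(doc_a, doc_b)  # 2 shared shingles out of 6 distinct ones
estimate = estimated_similarity(minhash_signature(doc_a), minhash_signature(doc_b))
```

The estimate approaches the exact Jaccard value as `num_hashes` grows; a full LSH scheme would additionally split each signature into bands and hash the bands into buckets.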

Section 7: Products frequently bought together in stores

Understand the importance of frequent item sets; Design association rules; Implement the A-priori algorithm.
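The A-priori objective can be sketched as follows. This is a small in-memory illustration under stated assumptions: the function name `apriori`, the absolute support threshold `min_support`, and the sample baskets are all invented for the example, and items are assumed to be sortable (here, strings).

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return a dict mapping each frequent itemset (frozenset) to its count."""
    baskets = [frozenset(b) for b in baskets]
    # Pass 1: count single items.
    counts = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        items = sorted({i for s in frequent for i in s})
        # Keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = [
            frozenset(c) for c in combinations(items, k)
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        ]
        counts = {c: 0 for c in candidates}
        for basket in baskets:
            for c in candidates:
                if c <= basket:
                    counts[c] += 1
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


baskets = [
    {"bread", "milk"},
    {"bread", "beer", "eggs"},
    {"milk", "beer", "cola"},
    {"bread", "milk", "beer"},
    {"bread", "milk", "cola"},
]
frequent_sets = apriori(baskets, min_support=3)
```

With a support threshold of 3, only bread, milk, beer, and the pair {bread, milk} are frequent; pairs containing cola or eggs are never counted because those single items were pruned in the first pass.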

Section 8: Movie and music recommendations

Understand the differences between recommendation systems; Design content-based recommendation systems;

Design collaborative filtering recommendation systems.

Section 9: Google's AdWords™ System

Understand the AdWords System; Analyse online algorithms in terms of competitive ratio; Use online matching to solve the AdWords problem.

Section 10: Mining rapidly arriving data streams

Understand types of queries for data streams; Analyse sampling methods for data streams;

Count distinct elements in data streams; Filter data streams.
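One common way to filter a data stream is a Bloom filter; the outline does not name a specific technique, so treat the sketch below as one possible illustration. The class name `BloomFilter`, the sizes `num_bits=1024` and `num_hashes=3`, and the seeded SHA-256 hash functions are assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Approximate set membership: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple (not memory-optimal).
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


allowed = BloomFilter()
for address in ["alice@example.com", "bob@example.com"]:
    allowed.add(address)

# Stream elements that fail the check are certainly not in the allowed set.
stream = ["alice@example.com", "bob@example.com"]
passed = [x for x in stream if allowed.might_contain(x)]
```

An element that was added always passes the check; an element that was never added usually fails it, with a false-positive rate that depends on the number of bits, hash functions, and stored items.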