BIG DATA AND HADOOP

GVP COLLEGE OF ENGINEERING (A)

2016

BIG DATA AND HADOOP (Elective-1)
Course Code: 15CS2108

L: 3    P: 0    C: 3

Prerequisites: Database Management Systems, Object Oriented Programming through Java.

Course Outcomes: After completing the course, the student will be able to
CO1: Understand the fundamentals of big cloud and data architectures.
CO2: Understand the HDFS file structure and the MapReduce framework, and use them to solve complex problems that require massive computation power.
CO3: Use relational data in a Hadoop environment, using the Hive and HBase tools of the Hadoop ecosystem.
CO4: Use the MapReduce classes and run a job in the Hadoop framework.
CO5: Understand metaheuristic concepts.

UNIT-I (10-Lectures)
Introduction to Big Data. What is Big Data. Why Big Data is important. Meet Hadoop. Data. Data storage and analysis. Comparison with other systems. Grid computing. A brief history of Hadoop. Apache Hadoop and the Hadoop ecosystem. Linux refresher; VMware installation of Hadoop.

UNIT-II (10-Lectures)
The design of HDFS. HDFS concepts. Command line interface to HDFS. Hadoop file systems. Interfaces. Java interface to Hadoop. Anatomy of a file read. Anatomy of a file write. Replica placement and coherency model. Parallel copying with distcp, keeping an HDFS cluster balanced.

M.TECH-COMPUTER SCIENCE ENGINEERING


UNIT-III (10-Lectures)
Introduction. Analyzing data with Unix tools. Analyzing data with Hadoop. Java MapReduce classes (new API). Data flow, combiner functions, running a distributed MapReduce job. Configuration API. Setting up the development environment. Managing configuration. Writing a unit test with MRUnit. Running a job in the local job runner. Running on a cluster. Launching a job. The MapReduce web UI.

UNIT-IV (10-Lectures)
Analyzing data with tools. Analyzing data with Hadoop. Java MapReduce classes (new API). Data flow, combiner functions, running a distributed MapReduce job. Configuration API. Setting up the development environment. Managing configuration. Writing a unit test with MRUnit. Running a job in the local job runner. Running on a cluster. Launching a job. The MapReduce web UI.

UNIT-V (10-Lectures)
The Hive shell. Hive services. Hive clients. The metastore. Comparison with traditional databases. HiveQL. HBasics. Concepts. Implementation. Java and MapReduce clients. Loading data, web queries.

TEXT BOOKS:
1. Tom White, "Hadoop: The Definitive Guide", 3rd Edition, O'Reilly Publications, 2012.
2. Dirk deRoos, Chris Eaton, George Lapis, Paul Zikopoulos, Tom Deutsch, "Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data", 1st Edition, TMH, 2012.

REFERENCES:
1. Frank J. Ohlhorst, "Big Data Analytics: Turning Big Data into Big Money", 2nd Edition, TMH, 2012.
