Big Data Training in Chennai by Sasken 9840014739

Friday, 14 March 2014


Big Data/Hadoop Course Content

This course is aimed at architects, administrators, and developers.

Attend once and prepare yourself for whichever of these roles you choose!

Module 1: Getting Started with Big Data
- What is Big Data?
- What is Apache Hadoop?
- History of Hadoop
- Understanding distributed file systems and Hadoop
- Hadoop ecosystem components
- Hadoop use cases
- Ubuntu installation
- JDK installation

Module 2: Hadoop Distributed File System
- Eclipse installation
- Overview of HDFS
- Communication protocols
- Rack awareness
- Hadoop cluster topology
- Setting up SSH for a Hadoop cluster
- Running Hadoop in pseudo-distributed mode
- Basic Linux commands
- HDFS file commands
- Reading and writing to HDFS programmatically
- Hands-on lab exercises

Module 3: MapReduce Framework
- Java basics
- Anatomy of a MapReduce program
- Writables
- InputFormat
- OutputFormat
- Streaming API
- Inherent failure handling
- Reading and writing
- Hands-on lab exercises

Module 4: Advanced MapReduce Programming
- Input splits, RecordReader, Mapper, Partition and Shuffle, Reduce, OutputFormat
- Writing a MapReduce program
- Streaming in Hadoop
- Counters
- Performance tuning
- Joins
- Sorting
- Determining the optimal number of reducers and partitions
- Hadoop cluster performance tuning
- Hands-on lab exercises

Module 5: Apache Hadoop Administration
- Best practices for Hadoop setup and infrastructure
- Hadoop cluster installation preparation
  - Cluster network design
  - Installation of the Linux operating system
  - Configuring SSH
  - Walkthrough of rack topology and setup
- Managing a Hadoop cluster
  - HDFS cluster management
  - Secondary NameNode configuration
  - TaskTracker management
  - Configuring the HDFS quota
  - Configuring the Fair Scheduler
  - Upgrading Hadoop
  - Deploying and managing Hadoop clusters with Ambari
- Monitoring a Hadoop cluster
  - Monitoring a Hadoop cluster with Ganglia
  - Monitoring a Hadoop cluster with Ambari
  - Monitoring a Hadoop cluster with Nagios
- Hadoop cluster performance tuning
  - Benchmarking and profiling
  - Using compression for input and output
  - Configuring optimal map and reduce slots for the TaskTracker
  - Fine-tuning the JobTracker configuration
  - Fine-tuning the TaskTracker configuration
  - Tuning shuffle, merge, and sort parameters
- Security implementation
- Kerberos security implementation
- Workflow scheduler
- Capacity Scheduler
- Fair Scheduler
- dfsadmin and mradmin commands
- Administration of HCatalog and Hive
- Backup and recovery
- Scenario-based exercises
  - DataNode failure and recovery
  - NameNode failure and recovery
  - JobTracker and TaskTracker failure and recovery
  - Removing DataNodes
  - Adding DataNodes
  - Commissioning and decommissioning of nodes

Module 6: Pig and Pig Latin
- Installation and configuration
- Running Pig Latin through Grunt
- Writing programs: Filter, Load, and Store functions
- Writing user-defined functions
- Working with scripts
- Lab exercises

Module 7: HBase and ZooKeeper
- NoSQL vs. SQL
- CAP theorem
- Architecture
- Installation
- Configuration
- Java API
- MapReduce integration
- Performance tuning
- Lab exercises

Module 8: Hive
- Features of Hive
- Architecture
- Installation and configuration
- HiveQL
- Lab exercises

Module 9: Other Hadoop Ecosystem Components
- Overview of Ambari, Oozie, and Mahout
- Installing and configuring Sqoop and mysql-server
- Installing and configuring Flume
- Lab exercises

Module 10: Hadoop on Cloud
- Hosting Hadoop on Amazon EC2
- EMR hands-on

http://big-data-training-in-chennai.blogspot.in/
