Hadoop and HDFS architecture

  • Hadoop architecture and ecosystem
  • Understanding distributed systems
  • Understanding parallel computing
  • HDFS Architecture
  • HDFS user commands
  • HDFS admin commands
  • Data replication
  • Rebalancer
  • Real-time use case for Hadoop integration with an existing data warehouse (DWH) platform

Working with MapReduce

  • Understanding the MapReduce framework and architecture
  • Map slots and reduce slots in a cluster
  • Hadoop and MapReduce daemons
  • Writing a MapReduce program
  • Understanding the MapReduce APIs
  • MapReduce configuration parameters

Working with Pig

  • Data loading in Pig
  • Data extraction in Pig
  • Data transformation in Pig
  • Hands-on exercise with Pig

Working with Hive

  • Hive Query Language (HiveQL)
  • Alter and delete operations in Hive
  • Partitioning in Hive
  • Indexing
  • Joins in Hive
  • Unions in Hive
  • Industry-specific configuration of Hive parameters
  • Authentication and Authorization
  • Statistics with Hive
  • Archiving in Hive
  • User-defined functions (UDFs)
  • Working with Avro files

Working with Sqoop

  • Introduction
  • Importing data
  • Exporting data
  • Sqoop syntax
  • Database connection

Working with Flume

  • Introduction
  • Configuration and setup
  • Flume source with example
  • Channel
  • Flume sink with example
  • Complex Flume architecture

Introduction to Apache Cassandra

  • Install the software
  • Start the Cassandra server
  • Create a keyspace
  • Create a column family
  • Insert, update, delete, and read data

Getting started with Cassandra and DataStax

  • Installing DataStax Community binaries on Windows
  • Starting DataStax OpsCenter
  • Basic CLI commands

Cassandra Architecture

  • Internode communication and seed nodes
  • Failure detection and recovery
  • Data partitioning and partitioner types
  • Random partitioner and ordered partitioner
  • Replica placement strategy
  • Network topology strategy
  • Snitches
  • Client requests in Cassandra
  • Write requests (including multi-data center write requests)
  • Read requests

Managing and Accessing Data in Cassandra

  • Writes in Cassandra
  • Compaction
  • Transactions and concurrency control
  • Reads in Cassandra
  • Data consistency in Cassandra
  • Consistency levels for multi-data center clusters
  • Cassandra's built-in consistency repair features

Cassandra CLI

  • Creating a keyspace
  • Creating a column family
  • Creating a counter column family
  • Inserting rows and columns
  • Reading rows and columns
  • Setting an expiring column
  • Indexing a column
  • Deleting rows and columns
  • Dropping column families and keyspaces

Getting started with CQL (Cassandra Query Language)

  • CQL command-line program
  • Running CQL commands with cqlsh
  • Creating a keyspace
  • Creating a column family
  • Inserting and retrieving columns
  • Adding columns with ALTER COLUMNFAMILY
  • Altering column metadata
  • Specifying column expiration with TTL
  • Dropping column metadata
  • Indexing a column
  • Deleting columns and rows