Apache Kafka is a fast, scalable and fault-tolerant distributed messaging system widely used in Big Data architectures. Kafka works in combination with real-time processing and storage tools such as Apache Storm, Apache Spark and Apache HBase for the analysis and rendering of streaming data. Most Big Data projects need a queuing solution. Traditional options exist, such as RabbitMQ and other brokers built on the JMS API or the AMQP protocol, but Kafka is designed for high volume and throughput, scalability, fault tolerance, durability and low latency, which makes it a common choice in modern data architecture.
By the end of this training you will be able to:
Understand the concepts of the latest version of Kafka
Develop real-time applications
Understand the architecture of Kafka
Set up and configure Kafka on a cluster
Master the various components: consumers, producers and brokers
Play with topics and perform different operations
Integrate Kafka with various consumers
Play with partitions and distribute data between them
Master the concepts of the high-level and low-level APIs
Monitor Kafka
Course Contents
Day 1
What is Kafka – An Introduction
Why Kafka
What is Kafka – An Introduction
Kafka Components and use cases
Implementing Kafka on a single Node
Day 2
Multi Broker Kafka Implementation
Single Node Kafka with Independent ZooKeeper
Kafka Terminology
Replication
Partitions & Brokers
Consumers
Writes Terminology
Different Scenarios of Failure Handling
Day 3
Multi Node Cluster Setup
Multi Node Cluster Setup
Administration Commands
Graceful Shutdown
Balancing Leadership
Rebalancing Tools
Expanding Your Cluster and Using the Partition Reassignment Tool
Custom Partition Assignment
Decommissioning Brokers
Increasing Replication Factor
Day 4
Integrate Flume with Kafka
What is Kafka Integration and Its Need
What is Apache Flume
How to integrate Flume with Kafka (as a Source)