Kafka in Action
By Dylan Scott, Viktor Gamov, and Dave Klein
About this ebook
In Kafka in Action you will learn:
Understanding Apache Kafka concepts
Setting up and executing basic ETL tasks using Kafka Connect
Using Kafka as part of a large data project team
Performing administrative tasks
Producing and consuming event streams
Working with Kafka from Java applications
Implementing Kafka as a message queue
Kafka in Action is a fast-paced introduction to every aspect of working with Apache Kafka. Starting with an overview of Kafka's core concepts, you'll immediately learn how to set up and execute basic data movement tasks and how to produce and consume streams of events. Advancing quickly, you’ll soon be ready to use Kafka in your day-to-day workflow, and start digging into even more advanced Kafka topics.
About the technology
Think of Apache Kafka as a high performance software bus that facilitates event streaming, logging, analytics, and other data pipeline tasks. With Kafka, you can easily build features like operational data monitoring and large-scale event processing into both large and small-scale applications.
About the book
Kafka in Action introduces the core features of Kafka, along with relevant examples of how to use it in real applications. In it, you’ll explore the most common use cases such as logging and managing streaming data. When you’re done, you’ll be ready to handle both basic developer- and admin-based tasks in a Kafka-focused team.
What's inside
Kafka as an event streaming platform
Kafka producers and consumers from Java applications
Kafka as part of a large data project
About the reader
For intermediate Java developers or data engineers. No prior knowledge of Kafka required.
About the author
Dylan Scott is a software developer in the insurance industry. Viktor Gamov is a Kafka-focused developer advocate. At Confluent, Dave Klein helps developers, teams, and enterprises harness the power of event streaming with Apache Kafka.
Table of Contents
PART 1 GETTING STARTED
1 Introduction to Kafka
2 Getting to know Kafka
PART 2 APPLYING KAFKA
3 Designing a Kafka project
4 Producers: Sourcing data
5 Consumers: Unlocking data
6 Brokers
7 Topics and partitions
8 Kafka storage
9 Management: Tools and logging
PART 3 GOING FURTHER
10 Protecting Kafka
11 Schema registry
12 Stream processing with Kafka Streams and ksqlDB
Dylan Scott
Dylan Scott is a software developer with over ten years of experience in Java and Perl. His experience includes implementing Kafka as a messaging system for a large data migration, and he uses Kafka in his work in the insurance industry.
Kafka in Action
Dylan Scott, Viktor Gamov, and Dave Klein
Foreword by Jun Rao
To comment go to liveBook
Manning
Shelter Island
For more information on this and other Manning titles go to
www.manning.com
Copyright
For online information and ordering of these and other Manning books, please visit www.manning.com. The publisher offers discounts on these books when ordered in quantity.
For more information, please contact
Special Sales Department
Manning Publications Co.
20 Baldwin Road
PO Box 761
Shelter Island, NY 11964
Email: orders@manning.com
©2022 by Manning Publications Co. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.
♾ Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.
ISBN: 9781617295232
Dedication
Dylan: I dedicate this work to Harper, who makes me so proud every day, and to Noelle, who brings even more joy to our family every day. I would also like to dedicate this book to my parents, sister, and wife, who are always my biggest supporters.
Viktor: I dedicate this work to my wife, Maria, for her support during the process of writing this book. It’s a time-consuming task, time that I needed to carve out here and there. Without your encouragement, nothing would have ever happened. I love you. Also, I would like to dedicate this book to (and thank) my children, Andrew and Michael, for being so naïve and straightforward. When people asked where daddy is working, they would say, Daddy is working in Kafka.
Dave: I dedicate this work to my wife, Debbie, and our children, Zachary, Abigail, Benjamin, Sarah, Solomon, Hannah, Joanna, Rebekah, Susanna, Noah, Samuel, Gideon, Joshua, and Daniel. Ultimately, everything I do, I do for the honor of my Creator and Savior, Jesus Christ.
Brief contents
Part 1. Getting started
1 Introduction to Kafka
2 Getting to know Kafka
Part 2. Applying Kafka
3 Designing a Kafka project
4 Producers: Sourcing data
5 Consumers: Unlocking data
6 Brokers
7 Topics and partitions
8 Kafka storage
9 Management: Tools and logging
Part 3. Going further
10 Protecting Kafka
11 Schema registry
12 Stream processing with Kafka Streams and ksqlDB
Appendix A. Installation
Appendix B. Client example
contents
Front matter
foreword
preface
acknowledgments
about this book
about the authors
about the cover illustration
Part 1. Getting started
1 Introduction to Kafka
1.1 What is Kafka?
1.2 Kafka usage
Kafka for the developer
Explaining Kafka to your manager
1.3 Kafka myths
Kafka only works with Hadoop®
Kafka is the same as other message brokers
1.4 Kafka in the real world
Early examples
Later examples
When Kafka might not be the right fit
1.5 Online resources to get started
References
2 Getting to know Kafka
2.1 Producing and consuming a message
2.2 What are brokers?
2.3 Tour of Kafka
Producers and consumers
Topics overview
ZooKeeper usage
Kafka’s high-level architecture
The commit log
2.4 Various source code packages and what they do
Kafka Streams
Kafka Connect
AdminClient package
ksqlDB
2.5 Confluent clients
2.6 Stream processing and terminology
Stream processing
What exactly-once means
References
Part 2. Applying Kafka
3 Designing a Kafka project
3.1 Designing a Kafka project
Taking over an existing data architecture
A first change
Built-in features
Data for our invoices
3.2 Sensor event design
Existing issues
Why Kafka is the right fit
Thought starters on our design
User data requirements
High-level plan for applying our questions
Reviewing our blueprint
3.3 Format of your data
Plan for data
Dependency setup
References
4 Producers: Sourcing data
4.1 An example
Producer notes
4.2 Producer options
Configuring the broker list
How to go fast (or go safer)
Timestamps
4.3 Generating code for our requirements
Client and broker versions
References
5 Consumers: Unlocking data
5.1 An example
Consumer options
Understanding our coordinates
5.2 How consumers interact
5.3 Tracking
Group coordinator
Partition assignment strategy
5.4 Marking our place
5.5 Reading from a compacted topic
5.6 Retrieving code for our factory requirements
Reading options
Requirements
References
6 Brokers
6.1 Introducing the broker
6.2 Role of ZooKeeper
6.3 Options at the broker level
Kafka’s other logs: Application logs
Server log
Managing state
6.4 Partition replica leaders and their role
Losing data
6.5 Peeking into Kafka
Cluster maintenance
Adding a broker
Upgrading your cluster
Upgrading your clients
Backups
6.6 A note on stateful systems
6.7 Exercise
References
7 Topics and partitions
7.1 Topics
Topic-creation options
Replication factors
7.2 Partitions
Partition location
Viewing our logs
7.3 Testing with EmbeddedKafkaCluster
Using Kafka Testcontainers
7.4 Topic compaction
References
8 Kafka storage
8.1 How long to store data
8.2 Data movement
Keeping the original event
Moving away from a batch mindset
8.3 Tools
Apache Flume
Red Hat® Debezium™
Secor
Example use case for data storage
8.4 Bringing data back into Kafka
Tiered storage
8.5 Architectures with Kafka
Lambda architecture
Kappa architecture
8.6 Multiple cluster setups
Scaling by adding clusters
8.7 Cloud- and container-based storage options
Kubernetes clusters
References
9 Management: Tools and logging
9.1 Administration clients
Administration in code with AdminClient
kcat
Confluent REST Proxy API
9.2 Running Kafka as a systemd service
9.3 Logging
Kafka application logs
ZooKeeper logs
9.4 Firewalls
Advertised listeners
9.5 Metrics
JMX console
9.6 Tracing option
Producer logic
Consumer logic
Overriding clients
9.7 General monitoring tools
References
Part 3. Going further
10 Protecting Kafka
10.1 Security basics
Encryption with SSL
SSL between brokers and clients
SSL between brokers
10.2 Kerberos and the Simple Authentication and Security Layer (SASL)
10.3 Authorization in Kafka
Access control lists (ACLs)
Role-based access control (RBAC)
10.4 ZooKeeper
Kerberos setup
10.5 Quotas
Network bandwidth quota
Request rate quotas
10.6 Data at rest
Managed options
References
11 Schema registry
11.1 A proposed Kafka maturity model
Level 0
Level 1
Level 2
Level 3
11.2 The Schema Registry
Installing the Confluent Schema Registry
Registry configuration
11.3 Schema features
REST API
Client library
11.4 Compatibility rules
Validating schema modifications
11.5 Alternative to a schema registry
References
12 Stream processing with Kafka Streams and ksqlDB
12.1 Kafka Streams
KStreams API DSL
KTable API
GlobalKTable API
Processor API
Kafka Streams setup
12.2 ksqlDB: An event-streaming database
Queries
Local development
ksqlDB architecture
12.3 Going further
Kafka Improvement Proposals (KIPs)
Kafka projects you can explore
Community Slack channel
References
Appendix A. Installation
Appendix B. Client example
index
Front matter
foreword
Beginning with its first release in 2011, Apache Kafka® has helped create a new category of data-in-motion systems, and it’s now the foundation of countless modern event-driven applications. This book, Kafka in Action, written by Dylan Scott, Viktor Gamov, and Dave Klein, equips you with the skills to design and implement event-based applications built on Apache Kafka. The authors have had many years of real-world experience using Kafka, and this book’s on-the-ground feel really sets it apart.
Let’s take a moment to ask the question, Why do we need Kafka in the first place?
Historically, most applications were built on data-at-rest systems. When some interesting events happened in the world, they were stored in these systems immediately, but the utilization of those events happened later, either when the user explicitly asked for the information, or from some batch-processing jobs that would eventually kick in.
With data-in-motion systems, applications are built by predefining what they want to do when new events occur. When new events happen, they are reflected in the application automatically in near-real time. Such event-driven applications are appealing because they allow enterprises to derive new insights from their data much quicker. Switching to event-driven applications requires a change of mindset, however, which may not always be easy. This book offers a comprehensive resource for understanding event-driven thinking, along with realistic hands-on examples for you to try out.
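The shift the foreword describes can be made concrete in a few lines of code. The following sketch (a toy illustration, not taken from the book or from Kafka's API) contrasts the two mindsets with a hypothetical in-memory event bus: a data-in-motion application registers its reaction *before* events arrive, so each new event is processed the moment it is published rather than sitting at rest until a later query or batch job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// A minimal, hypothetical event bus: handlers are registered up front,
// and every published event is delivered to them immediately.
class EventBus {
    private final List<Consumer<String>> handlers = new ArrayList<>();

    void subscribe(Consumer<String> handler) {
        handlers.add(handler);
    }

    void publish(String event) {
        // Events are reflected in the application as they occur,
        // not stored and queried later.
        handlers.forEach(h -> h.accept(event));
    }
}

public class DataInMotionSketch {
    public static void main(String[] args) {
        EventBus bus = new EventBus();
        List<String> seen = new ArrayList<>();

        // Data-in-motion: declare what to do with new events in advance.
        bus.subscribe(e -> seen.add("processed:" + e));

        bus.publish("order-created");
        bus.publish("order-shipped");

        System.out.println(seen);
    }
}
```

Kafka plays the role of the bus here, of course, but at cluster scale and with durable, replayable storage; the point is only the inversion of control, where the reaction is defined before the event happens.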
Kafka in Action explains how Kafka works, with a focus on how a developer can build end-to-end event-driven applications with Kafka. You’ll learn the components needed to build a basic Kafka application and also how to create more advanced applications using libraries such as Kafka Streams and ksqlDB. And once your application is built, this book also covers how to run it in production, including key topics such as monitoring and security.
I hope that you enjoy this book as much as I have. Happy event streaming!
—
Jun Rao, Confluent Cofounder
preface
One of the questions we often get when talking about working on a technical book is, why the written format? For Dylan, at least, reading has always been part of his preferred learning style. Another factor is the nostalgia in remembering the first practical programming book he ever really read, Elements of Programming with Perl by Andrew L. Johnson (Manning, 2000). The content was something that registered with him, and it was a joy to work through each page with the other authors. We hope to capture some of that practical content regarding working with and reading about Apache Kafka.
The excitement of learning something new touched each of us when we started to work with Kafka for the first time. In our opinion, Kafka was unlike any other message broker or enterprise service bus (ESB) that we had used before. The speed of getting started developing producers and consumers, the ability to reprocess data, and the ability of independent consumers to move quickly without removing the data from other consumer applications were options that solved pain points we had seen in past development and impressed us most as we started looking at Kafka.
We see Kafka as changing the standard for data platforms; it can help move batch and ETL workflows toward near-real-time data feeds. Because this foundation is likely a shift from past data architectures that many enterprise users are familiar with, we wanted to take a user with no prior knowledge of Kafka and develop their ability to work with Kafka producers and consumers, and to perform basic Kafka developer and administrative tasks. By the end of this book, we hope you will feel comfortable digging into more advanced Kafka topics such as cluster monitoring, metrics, and multi-site data replication with your new core Kafka knowledge.
Always remember, this book captures a moment in time of how Kafka looks today. It will likely change and, hopefully, get even better by the time you read this work. We hope this book sets you up for an enjoyable path of learning about the foundations of Apache Kafka.
acknowledgments
Dylan
: I would like to acknowledge first, my family: thank you. The support and love shown every day is something that I can never be thankful enough for—I love you all. Dan and Debbie, I appreciate that you have always been my biggest supporters and number one fans. Sarah, Harper, and Noelle, I can’t do justice in these few words to the amount of love and pride I have for you all and the support you have given me. To the DG family, thanks for always being there for me. Thank you, as well, JC.
Also, a special thanks to Viktor Gamov and Dave Klein for being coauthors of this work! I also had a team of work colleagues and technical friends that I need to mention that helped motivate me to move this project forward: Team Serenity (Becky Campbell, Adam Doman, Jason Fehr, and Dan Russell), Robert Abeyta, and Jeremy Castle. And thank you, Jabulani Simplisio Chibaya, for not only reviewing, but for your kind words.
Viktor
: I would like to acknowledge my wife and thank her for all her support. Thanks also go to the Developer Relations and Community Team at Confluent: Ale Murray, Yeva Byzek, Robin Moffatt, and Tim Berglund. You are all doing incredible work for the greater Apache Kafka community!
Dave
: I would like to acknowledge and thank Dylan and Viktor for allowing me to tag along on this exciting journey.
The group would like to acknowledge our editor at Manning, Toni Arritola, whose experience and coaching helped make this book a reality. Thanks also go to Kristen Watterson, who was the first editor before Toni took over, and to our technical editors, Raphael Villela, Nickie Buckner, Felipe Esteban Vildoso Castillo, Mayur Patil, Valentin Crettaz, and William Rudenmalm. We also express our gratitude to Chuck Larson for the immense help with the graphics, and to Sumant Tambe for the technical proofread of the code.
The Manning team helped in so many ways, from production to promotion—a helpful team. With all the edits, revisions, and deadlines involved, typos and issues can still make their way into the content and source code (at least we haven’t ever seen a book without errata!), but this team certainly helped to minimize those errors.
Thanks go also to Nathan Marz, Michael Noll, Janakiram MSV, Bill Bejeck, Gunnar Morling, Robin Moffatt, Henry Cai, Martin Fowler, Alexander Dean, Valentin Crettaz and Anyi Li. This group was so helpful in allowing us to talk about their work, and providing such great suggestions and feedback.
Jun Rao, we are honored that you were willing to take the time to write the foreword to this book. Thank you so much!
We owe a big thank you to the entire Apache Kafka community (including, of course, Jay Kreps, Neha Narkhede, and Jun Rao) and the team at Confluent that pushes Kafka forward and allowed permission for the material that helped inform this book. At the very least, we can only hope that this work encourages developers to take a look at Kafka.
Finally, to all the reviewers: Bryce Darling, Christopher Bailey, Cicero Zandona, Conor Redmond, Dan Russell, David Krief, Felipe Esteban Vildoso Castillo, Finn Newick, Florin-Gabriel Barbuceanu, Gregor Rayman, Jason Fehr, Javier Collado Cabeza, Jon Moore, Jorge Esteban Quilcate Otoya, Joshua Horwitz, Madhanmohan Savadamuthu, Michele Mauro, Peter Perlepes, Roman Levchenko, Sanket Naik, Shobha Iyer, Sumant Tambe, Viton Vitanis, and William Rudenmalm—your suggestions helped make this a better book.
It is likely we are leaving some names out and, if so, we can only ask you to forgive us for our error. We do appreciate you.
about this book
We wrote Kafka in Action to be a practical guide to getting started with Apache Kafka. This material walks readers through small examples that explain some knobs and configurations that you can use to alter Kafka's behavior to fulfill your specific use cases. That foundation is the core of Kafka, and it is what other products like Kafka Streams and ksqlDB are built upon. Our hope is to show you how to use Kafka to fulfill various business requirements, to be comfortable with it by the end of this book, and to know where to begin tackling your own requirements.
Who should read this book?
Kafka in Action is for any developer wanting to learn about stream processing. While no prior knowledge of Kafka is required, basic command line/terminal knowledge is helpful. Kafka has some powerful command line tools that we will use, and the user should be able to at least navigate at the command line prompt.
It might be helpful to also have some Java language skills or the ability to recognize programming concepts in any language for the reader to get the most out of this book. This will help in understanding the code examples presented, which are mainly in a Java 11 (as well as Java 8) style of coding. Also, although not required, a general knowledge of a distributed application architecture would be helpful. The more a user knows about replications and failure, the easier the on-ramp for learning about how Kafka uses replicas, for example.
How this book is organized: A roadmap
This book has three parts spread over twelve chapters. Part 1 introduces a mental model of Kafka and a discussion of why you would use Kafka in the real world:
Chapter 1 provides an introduction to Kafka, dispels some myths, and provides some real-world use cases.
Chapter 2 examines the high-level architecture of Kafka, as well as important terminology.
Part 2 moves to the core pieces of Kafka. This includes the clients as well as the cluster itself:
Chapter 3 looks at when Kafka might be a good fit for your project and how to approach designing a new project. We also discuss the need for schemas as something that should be looked at when starting a Kafka project instead of later.
Chapter 4 looks at the details of creating a producer client and the options you can use to impact the way your data enters the Kafka cluster.
Chapter 5 flips the focus from chapter 4 and looks at how to get data from Kafka with a consumer client. We introduce the idea of offsets and reprocessing data because we can utilize the storage aspect of retained messages.
Chapter 6 looks at the brokers’ role for your cluster and how they interact with your clients. Various components are explored, such as a controller and a replica.
Chapter 7 explores the concepts of topics and partitions. This includes how topics can be compacted and how partitions are stored.
Chapter 8 discusses tools and architectures that are options for handling data that you need to retain or reprocess. The need to retain data for months or years might cause you to evaluate storage options outside your cluster.
Chapter 9 finishes part 2 by reviewing the necessary logs, metrics, and administrative duties to help keep your cluster healthy.
Part 3 moves us past looking at the core pieces of Kafka and on to options for improving a running cluster:
Chapter 10 introduces options for strengthening a Kafka cluster by using SSL, ACLs, and features like quotas.
Chapter 11 digs into the Schema Registry and how it is used to help data evolve, preserving compatibility with previous and future versions of datasets. Although this is seen as a feature most used with enterprise-level applications, it can be helpful with any data that evolves over time.
Chapter 12, the final chapter, looks at introducing Kafka Streams and ksqlDB. These products are at higher levels of abstraction, built on the core you studied in part 2. Kafka Streams and ksqlDB are large enough topics that our introduction only provides enough detail to help you get started on learning more about these Kafka options on your own.
About the code
This book contains many examples of source code both in numbered listings and in line with normal text. In both cases, the source code is formatted in a fixed-width font like this to separate it from ordinary text. In many cases, the original source code has been reformatted; we’ve added line breaks and reworked indentation to accommodate the available page width in the book. In some cases, even this was not enough, and listings include line-continuation markers (➥). Code annotations accompany many of the listings, highlighting important concepts.
Finally, it’s important to note that many of the code examples aren’t meant to stand on their own; they’re excerpts containing only the most relevant parts of what is currently under discussion. You’ll find all the examples from the book and the accompanying source code in their complete form in GitHub at https://github.com/Kafka-In-Action-Book/Kafka-In-Action-Source-Code and the publisher’s website at https://www.manning.com/books/kafka-in-action. You can also get executable snippets of code from the liveBook (online) version of this book at https://livebook.manning.com/book/kafka-in-action.
liveBook discussion forum
Purchase of Kafka in Action includes free access to liveBook, Manning’s online reading platform. Using liveBook’s exclusive discussion features, you can attach comments to the book globally or to specific sections or paragraphs. To access the forum, go to https://livebook.manning.com/#!/book/kafka-in-action/discussion. You can also learn more about Manning’s forums and the rules of conduct at https://livebook.manning.com/#!/discussion.
Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the author can take place. It is not a commitment to any specific amount of participation on the part of the authors, whose contribution to the forum remains voluntary (and unpaid). We suggest you try asking them some challenging questions lest their interest stray! The forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.
Other online resources
The following online resources will evolve as Kafka changes over time. These sites can also be used for past version documentation in most cases:
Apache Kafka documentation—http://kafka.apache.org/documentation.html
Confluent documentation—https://docs.confluent.io/current
Confluent Developer portal—https://developer.confluent.io
about the authors
Dylan Scott
is a software developer with over ten years of experience in Java and Perl. After starting to use Kafka as a messaging system for a large data migration, Dylan started to dig further into the world of Kafka and stream processing. He has used various messaging technologies and queues, including Mule, RabbitMQ, MQSeries, and Kafka.
Dylan has various certificates that show experience in the industry: PMP, ITIL, CSM, Sun Java SE 1.6, Oracle Web EE 6, Neo4j, and Jenkins Engineer.
Viktor Gamov
is a Developer Advocate at Confluent, the company that makes an event-streaming platform based on Apache Kafka. Throughout his career, Viktor developed comprehensive expertise in building enterprise application architectures using open source technologies. He enjoys helping architects and developers design and develop low-latency, scalable, and highly available distributed systems.
Viktor is a professional conference speaker on distributed systems, streaming data, JVM, and DevOps topics, and is a regular at events including JavaOne, Devoxx, OSCON, QCon, and others. He is the coauthor of Enterprise Web Development (O’Reilly Media, Inc.).
Follow Viktor on Twitter @gamussa, where he posts about gym life, food, open source, and, of course, Kafka!
Dave Klein
spent 28 years as a developer, architect, project manager (recovered), author, trainer, conference organizer, and homeschooling dad, until he recently landed his dream job as a Developer Advocate at Confluent. Dave marvels at, and is eager to help others explore, the amazing world of event streaming with Apache Kafka.
about the cover illustration
The figure on the cover of Kafka in Action is captioned Femme du Madagascar, or Madagascar Woman.
The illustration is taken from a nineteenth-century edition of Sylvain Maréchal’s four-volume compendium of regional dress customs, published in France. Each illustration is finely drawn and colored by hand. The rich variety of Maréchal’s collection reminds us vividly of how culturally apart the world’s towns and regions were just 200 years ago. Isolated from each other, people spoke different dialects and languages. Whether on city streets, in small towns, or in the countryside, it was easy to identify where they lived and what their trade or station in life was just by their dress.
Dress codes have changed since then, and the diversity by region and class, so rich at the time, has faded away. It is now hard to tell apart the inhabitants of different continents, let alone different towns or regions. Perhaps we have traded cultural diversity for a more varied personal life—certainly for a more varied and fast-paced technological life.
At a time when it is hard to tell one computer book from another, Manning celebrates the inventiveness and initiative of the computer business with book covers based on the rich diversity of regional life of two centuries ago, brought back to life by Maréchal’s pictures.
Part 1. Getting started
In part 1 of this book, we’ll introduce you to Apache Kafka and start to look at real use cases where Kafka might be a good fit to try out:
In chapter 1, we give a detailed description of why you would want to use Kafka, and we dispel some myths you might have heard about Kafka in relation to Hadoop.
In chapter 2, we focus on learning about the high-level architecture of Kafka as well as the various other parts that make up the Kafka ecosystem: Kafka Streams, Connect, and ksqlDB.
When you’re finished with this part, you’ll be ready to get started reading and writing messages to and from Kafka. Hopefully, you’ll have picked up some key terminology as well.
1 Introduction to Kafka
This chapter covers
Why you might want to use Kafka
Common myths of big data and message systems
Real-world use cases to help power messaging, streaming, and IoT data processing
As many developers are facing a world full of data produced from every angle, they are often presented with the fact that legacy systems might not be the best option moving forward. One of the foundational pieces of new data infrastructures that has taken over the IT landscape is Apache Kafka®.¹ Kafka is changing the standards for data platforms. It is leading the way to move from extract, transform, load (ETL) and batch workflows (in which work was often held and processed in bulk at one predefined time) to near-real-time data feeds [1]. Batch processing, which was once the standard workhorse of enterprise data processing, might not be something to turn back to after seeing the powerful feature set that Kafka provides. In fact, you might not be able to