High Performance Data Processing with Python, Kafka and Elasticsearch

Harshit Prasad

E-Commerce, Elastic Search, Engineering, EuroPython, Messaging and Job Queues (RabbitMQ/Redis/...)

See in schedule: Fri, Jul 30, 13:45-14:15 CEST (30 min)

In the current technology era, all kinds of applications work on data. Data is used to represent a set of information. Healthcare apps, e-commerce apps and many others work on data. Sometimes this data needs to be updated to reflect new changes across the platform. This can be done manually, but what if the platform data is updated in real time or, say, every hour? Such a problem can be solved by implementing a service based on the Producer Consumer model.

In this talk, I will cover how the Producer Consumer model works and how this design pattern can be implemented in Python. I will explain the whole implementation process using Kafka as the data streamer and Elasticsearch as the data store.
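To make the Producer Consumer model concrete, here is a minimal sketch using only the Python standard library. It is not the implementation presented in the talk: the function names, the record layout (`id`, `price`) and the dictionary used as a data store are all assumptions for illustration. A bounded in-process queue stands in for the data streamer.

```python
import queue
import threading

# Unique marker the producer sends to tell the consumer the stream has ended.
SENTINEL = object()


def producer(q, items):
    """Put every item on the queue, then signal completion."""
    for item in items:
        q.put(item)  # blocks while the queue is full
    q.put(SENTINEL)


def consumer(q, store):
    """Apply items to the store in arrival order until the sentinel shows up."""
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        # Later updates for the same id overwrite earlier ones.
        store[item["id"]] = item


def run_pipeline(items):
    """Run one producer and one consumer thread and return the final store."""
    q = queue.Queue(maxsize=100)  # bounded, so a slow consumer slows the producer
    store = {}
    t_prod = threading.Thread(target=producer, args=(q, items))
    t_cons = threading.Thread(target=consumer, args=(q, store))
    t_prod.start()
    t_cons.start()
    t_prod.join()
    t_cons.join()
    return store


# Hypothetical product updates; id 1 is updated twice.
updates = [
    {"id": 1, "price": 10},
    {"id": 2, "price": 25},
    {"id": 1, "price": 12},
]
result = run_pipeline(updates)
```

With a single producer and a single consumer, the queue preserves ordering, so the store ends up holding the latest update for each `id`. The producer and consumer never call each other directly, which is what lets a separate streaming system take the place of the in-process queue.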

Talk Outline:

1. Problem Statement (2 mins)
   1. Introduction to the problem statement

2. Introduction to Producer Consumer Model (3 mins)
   1. Basics of Producer Consumer Model
   2. Applications

3. Deep-dive explanation of Producer Consumer model using example (5 mins)
   1. Elasticsearch
   2. Kafka

4. Explaining parts of our Producer Consumer model (5 mins)
   1. What kind of data are we updating in our data store?
   2. Why is it a high-performance solution?
   3. Implementation in Python as an end-to-end framework

5. Code walkthrough (5 mins)
   1. Produce data
   2. Stream data
   3. Consume data

6. Conclusion and Learnings (5 mins)
   1. Learnings
   2. Performance Pros and Cons

7. Q/A Session (5 mins)
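The produce, stream and consume steps in the outline could look roughly like the sketch below. It is an illustration under stated assumptions, not the code shown in the talk. `InMemoryTopic` is a hypothetical stand-in for a Kafka topic so that the example runs without a broker. The record fields and the index name `products` are invented. The consumer only builds action dictionaries in the shape accepted by the `elasticsearch.helpers.bulk` helper; it does not contact a cluster.

```python
import json
from collections import deque


class InMemoryTopic:
    """Hypothetical, simplified stand-in for a Kafka topic (a FIFO of bytes)."""

    def __init__(self):
        self._messages = deque()

    def send(self, value):
        self._messages.append(value)

    def poll(self, max_records):
        """Return up to max_records messages, oldest first."""
        batch = []
        while self._messages and len(batch) < max_records:
            batch.append(self._messages.popleft())
        return batch


def produce(topic, records):
    """Produce: serialize each record to JSON bytes and send it to the topic."""
    for record in records:
        topic.send(json.dumps(record).encode("utf-8"))


def to_bulk_actions(raw_messages, index_name):
    """Decode messages into bulk index actions keyed by document id."""
    actions = []
    for raw in raw_messages:
        doc = json.loads(raw.decode("utf-8"))
        actions.append(
            {"_op_type": "index", "_index": index_name, "_id": doc["id"], "_source": doc}
        )
    return actions


def consume(topic, index_name, batch_size=2):
    """Consume: drain the topic in batches, one list of bulk actions per batch."""
    batches = []
    while True:
        raw = topic.poll(batch_size)
        if not raw:
            break
        batches.append(to_bulk_actions(raw, index_name))
    return batches


topic = InMemoryTopic()
produce(topic, [
    {"id": 1, "name": "milk", "price": 40},
    {"id": 2, "name": "bread", "price": 30},
    {"id": 3, "name": "eggs", "price": 60},
])
batches = consume(topic, "products", batch_size=2)
```

Grouping messages into batches means the data store receives one bulk request per batch rather than one request per document, which is one common reason such a pipeline can sustain a high update rate. In a real deployment each batch would be handed to an Elasticsearch client instead of being collected in a list.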

Target Audience - Beginner / Intermediate

Proposal Section - Web based Systems

Prerequisites - Python & System Design

Type: Talk (30 mins); Python level: Beginner; Domain level: Intermediate


Harshit Prasad

Grofers

Harshit Prasad is a Software Engineer at Grofers, India’s largest online grocery shopping platform. He is an avid programmer who is passionate about code, design, and technology. Harshit is an open-source contributor and has worked with organizations such as HackerRank and CERN. He was a Google Summer of Code student twice, in 2017 and 2018. When away from work, he likes to play badminton, write blog posts, and help people on Stack Overflow. He loves traveling and photography.