Matching the Scale at Tinder with Kafka


Oct 16, 2018

Matching the Scale at Tinder with Kafka
Krunal Vora
Software Engineer, Observability
Monitoring

Logging
Configuration Management
Infrastructure

2
Preface

- Journey on Tinder
- Use-cases asserting the contribution of Kafka at Tinder

4
Neil, 25
Barcelona, Spain

Photographer, Travel Enthusiast

5
Amanda, 26
Los Angeles, CA, United States

Founder at Creative Productions

6
Amanda signs up for Tinder!

7
A Quick Introduction
Double Opt-In

9
The need to schedule notifications that onboard the new user

10
Kafka @ Tinder

Kafka Sprinkler

11
Delay Scheduling

Components: Client, ETL Process, Scheduling Service, Notification Service
Topics: user-profile, photo-upload-reminders
Push notification: Upload photos

Message:
{
payload byte[],
scheduling_policy,
output_topic,
etc.
}
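
The message on this slide carries a payload, a scheduling policy, and an output topic. A minimal in-memory sketch of that idea follows, assuming the scheduling policy is just a fixed delay in seconds. The `DelayScheduler` class and its methods are hypothetical names for illustration, not Tinder's service or a Kafka API.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class ScheduledMessage:
    # Only due_at takes part in ordering, so the heap pops the earliest message first.
    due_at: float
    payload: bytes = field(compare=False)
    output_topic: str = field(compare=False)


class DelayScheduler:
    """In-memory stand-in for a scheduling service (hypothetical)."""

    def __init__(self):
        self._heap = []

    def schedule(self, payload, delay_seconds, output_topic, now):
        # Assumption: scheduling_policy is reduced to a fixed delay in seconds.
        message = ScheduledMessage(now + delay_seconds, payload, output_topic)
        heapq.heappush(self._heap, message)

    def release_due(self, now):
        """Return (output_topic, payload) pairs whose delay has elapsed."""
        released = []
        while self._heap and self._heap[0].due_at <= now:
            message = heapq.heappop(self._heap)
            released.append((message.output_topic, message.payload))
        return released


scheduler = DelayScheduler()
scheduler.schedule(b"Upload photos", delay_seconds=3600,
                   output_topic="photo-upload-reminders", now=0)
early = scheduler.release_due(now=1800)   # nothing is due after 30 minutes
due = scheduler.release_due(now=3600)     # the reminder is released after an hour
```

In a real deployment the released pair would be produced to `output_topic` rather than returned to the caller.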

12
Amanda uploads some pictures!

13
Necessity for content moderation!

14
Content Moderation

Publish-Subscribe

Workers:
- Trust / Anti-Spam worker
- Content Moderation ML worker
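
Publish-subscribe here means each worker sees every published event independently. The toy log below sketches that behaviour with one read offset per consumer group, which is roughly how Kafka consumer groups behave. The `Topic` class, the event fields, and the group names are assumptions for illustration.

```python
from collections import defaultdict


class Topic:
    """Toy append-only log with one read offset per consumer group."""

    def __init__(self):
        self._log = []
        self._offsets = defaultdict(int)

    def publish(self, event):
        self._log.append(event)

    def poll(self, group):
        # Each group reads from its own offset, so groups do not take events from each other.
        start = self._offsets[group]
        events = self._log[start:]
        self._offsets[group] = len(self._log)
        return events


photo_uploads = Topic()
photo_uploads.publish({"user": "amanda", "photo_id": 1})
photo_uploads.publish({"user": "amanda", "photo_id": 2})

spam_seen = photo_uploads.poll("trust-anti-spam-worker")
ml_seen = photo_uploads.poll("content-moderation-ml-worker")
spam_again = photo_uploads.poll("trust-anti-spam-worker")  # already consumed by this group
```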

15
Amanda is all set to start exploring Tinder!

16
Next step: Recommendations!

17
Recommendations

Components: User Documents, ElasticSearch, Recommendations Engine

18
Meanwhile, Neil has been inactive on
Tinder for a while

19
This calls for User Reactivation

20
Determine the Inactive Users

TTL property used to identify inactivity

21
User Reactivation

TTL property used to identify inactivity

Components: Client, ETL Process, Notification Service
Topics: app-open, superlikeable, feed-updates
Workers: SuperLikeable Worker, Activity Feed Worker
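
The slides do not say where the TTL is applied, so the following is only a sketch of the general idea: remember the last app-open per user and treat any entry older than the TTL as an inactive user. `ActivityTracker` and the seven-day TTL are assumptions.

```python
DAY = 24 * 60 * 60


class ActivityTracker:
    """Tracks the last app-open per user and expires entries older than a TTL."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._last_open = {}

    def record_app_open(self, user_id, now):
        # Every app-open event resets the user's TTL clock.
        self._last_open[user_id] = now

    def expire(self, now):
        """Remove and return users whose last app-open is at least one TTL old."""
        inactive = sorted(
            user for user, seen in self._last_open.items()
            if now - seen >= self.ttl_seconds
        )
        for user in inactive:
            del self._last_open[user]
        return inactive


tracker = ActivityTracker(ttl_seconds=7 * DAY)   # hypothetical TTL
tracker.record_app_open("neil", now=0)
tracker.record_app_open("amanda", now=6 * DAY)
inactive_users = tracker.expire(now=7 * DAY)     # only Neil has been away for a full TTL
```

The returned users are the candidates for reactivation notifications.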

22
User Reactivation works best when
the user is awake. Mostly.

23
Batch User TimeZone

Batch approach

Components: User Events, Daily Batch Job, Latitude - Longitude Enrichment, Feature Store, Enrichment processes, Machine Learning processes

Works, but doesn't provide the edge of fresh, updated data critical for user experience

24
Need for Updated User TimeZone

- Users' preferred times for Tinder
- People who fly for work
- Bicoastal users
- Frequent travelers

25
Updated User TimeZone

Components: Client Events, Kafka Streams, Latitude - Longitude Enrichment, Feature Store, Enrichment processes, Machine Learning processes

Multiple topics for different workflows
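
The difference from the daily batch job is that each location event updates the feature store as it arrives. The sketch below shows that per-event update in plain Python rather than Kafka Streams. The nearest-reference-point lookup is a toy assumption standing in for a real latitude-longitude to time zone lookup, and the event fields are hypothetical.

```python
# Toy reference points: (latitude, longitude, time zone name).
REFERENCE_POINTS = [
    (41.39, 2.17, "Europe/Madrid"),            # Barcelona
    (34.05, -118.24, "America/Los_Angeles"),   # Los Angeles
]


def nearest_time_zone(lat, lon):
    # Assumption: pick the closest reference point by squared degree distance.
    # A production lookup would use a proper time zone boundary database.
    closest = min(REFERENCE_POINTS,
                  key=lambda p: (p[0] - lat) ** 2 + (p[1] - lon) ** 2)
    return closest[2]


def enrich(events, feature_store):
    """Apply each location event as it arrives, keeping the latest zone per user."""
    for event in events:
        zone = nearest_time_zone(event["lat"], event["lon"])
        feature_store[event["user_id"]] = zone
    return feature_store


store = {}
enrich([{"user_id": "neil", "lat": 41.40, "lon": 2.15}], store)
zone_at_home = store["neil"]
enrich([{"user_id": "neil", "lat": 33.94, "lon": -118.41}], store)
zone_after_flight = store["neil"]   # updated as soon as the new event is processed
```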

26
Neil uses the opportunity to get back
on the scene!

27
Neil notices a new feature released by
Tinder - Places!

28
Tinder launches a new feature: Places

Finding common ground

29
Places

Publish-Subscribe

Components: Places backend service, Places Worker, Push notifications, Recs, ...

30
Places

Leveraging the “exactly once” semantic provided by Kafka 1.1.0
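
The slides do not show how Places is configured. As background, Kafka's idempotent producer (enabled with `enable.idempotence=true`) deduplicates retried sends using a producer id and per-partition sequence numbers. The toy partition below is a simplified model of that one idea, not the broker's implementation, and the producer name and values are made up.

```python
class PartitionLog:
    """Toy partition that drops retried sends, keyed by producer id and sequence number."""

    def __init__(self):
        self.records = []
        self._last_sequence = {}

    def append(self, producer_id, sequence, value):
        # A retry reuses the sequence number of the original send, so it is ignored.
        if sequence <= self._last_sequence.get(producer_id, -1):
            return False
        self._last_sequence[producer_id] = sequence
        self.records.append(value)
        return True


log = PartitionLog()
first = log.append("places-backend", 0, "visit:cafe")
retry = log.append("places-backend", 0, "visit:cafe")   # network retry of the same send
second = log.append("places-backend", 1, "visit:park")
```

Each place visit ends up in the log once even though the first send was retried.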

31
Newly launched features need that extra care!
How do we keep an eye?

32
Geo Performance Monitoring

Components: Client, ETL Process, Client Performance Event, Consumer

- Aggregates by country
- Aggregates by a set of rules / slices over the data
- Exports metrics using the Prometheus Java API
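
The consumer on this slide aggregates by country and exports metrics. The sketch below does the same aggregation in plain Python and renders it in the Prometheus text exposition format, as a stand-in for the Prometheus Java client the slide mentions. The metric names and event fields are assumptions.

```python
from collections import defaultdict


def aggregate_by_country(events):
    """Sum latency and count events per country."""
    totals = defaultdict(lambda: {"count": 0, "latency_ms_sum": 0.0})
    for event in events:
        bucket = totals[event["country"]]
        bucket["count"] += 1
        bucket["latency_ms_sum"] += event["latency_ms"]
    return dict(totals)


def to_prometheus_text(totals):
    """Render the aggregates as Prometheus text exposition lines."""
    lines = []
    for country in sorted(totals):
        bucket = totals[country]
        latency_sum = bucket["latency_ms_sum"]
        count = bucket["count"]
        lines.append(f'client_latency_ms_sum{{country="{country}"}} {latency_sum}')
        lines.append(f'client_latency_ms_count{{country="{country}"}} {count}')
    return "\n".join(lines)


events = [
    {"country": "ES", "latency_ms": 120.0},
    {"country": "ES", "latency_ms": 80.0},
    {"country": "US", "latency_ms": 200.0},
]
totals = aggregate_by_country(events)
exposition = to_prometheus_text(totals)
```

Slicing by other rules would only change the key used for `totals`.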

33
Failures are inevitable!
How can we analyze the root cause with minimum delay?

34
Logging Pipeline

Components: Filebeat, Logstash Forwarder, Redis, Logstash Indexer, ElasticSearch, Kibana
35
Logging Pipeline

Components: Filebeat, Kafka, Logstash, ElasticSearch, Kibana

36
Neil decides to travel to LA for
potential job opportunities

37
The Passport feature

38
Time to dive deep into GeoSharded
Recommendations

39
Recommendations

Components: User Documents, ElasticSearch, Recommendations Engine

40
Passport to GeoShards

Shard A Shard B

41
GeoSharded Recommendations V1

Components: User Documents, Location Service, ES Feeder Service, SQS Queue, ES Feeder Worker, Tinder Recommendation Engine
Shards: Shard A, Shard B, Shard C, Shard D
42
GeoSharded Recommendations

Source: Tinder Tech Blog - https://tech.gotinder.com/geosharded-recommendations-part-3-consistency/

44


GeoSharded Recommendations V2

Components: User Documents, Location Service, ES Feeder Service, Guaranteed Ordering, ES Feeder Worker, Tinder Recommendation Engine
Shards: Shard A, Shard B, Shard C, Shard D
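
Kafka preserves order within a partition, and records with the same key go to the same partition under the default partitioner. The sketch below shows why that keeps one user's updates in order. Keying by user id is an assumption about V2, and CRC32 is a stand-in hash, since Kafka's default partitioner uses murmur2.

```python
import zlib

NUM_PARTITIONS = 4


def partition_for(key):
    # Stand-in hash for illustration; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def route(events):
    """Append each (user_id, update) pair to the partition chosen by its key."""
    partitions = {p: [] for p in range(NUM_PARTITIONS)}
    for user_id, update in events:
        partitions[partition_for(user_id)].append((user_id, update))
    return partitions


events = [
    ("neil", "move:shard-A"),
    ("amanda", "edit-bio"),
    ("neil", "move:shard-B"),
]
partitions = route(events)

# All of Neil's updates sit in one partition, in the order they were produced.
neil_partition = partitions[partition_for("neil")]
neil_updates = [update for user_id, update in neil_partition if user_id == "neil"]
```

A worker reading that partition applies the shard moves in the order they happened, so an older location update cannot overwrite a newer one.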

45
Neil swipes right!

46
47
Impact of Kafka @ Tinder

Events: Client Events, Server Events, Third Party Events
Uses: Data Processing, Push Notifications, Delayed Events, Feature Store

48
Impact of Kafka @ Tinder

- ~86B Events/Day
- ~1M Events/Second
- >40TB Data/Day
- ~90% Cost Effectiveness

Kafka delivers the performance and throughput needed to sustain this scale of data processing

Using Kafka over SQS / Kinesis saves us approximately 90% on costs
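
A quick arithmetic check shows the daily and per-second figures agree. It assumes traffic is averaged evenly over the day and that ">40TB" means at least 40 decimal terabytes.

```python
EVENTS_PER_DAY = 86e9            # "~86B Events/Day" from the slide
BYTES_PER_DAY = 40e12            # ">40TB Data/Day", assuming decimal terabytes
SECONDS_PER_DAY = 24 * 60 * 60

# Average rate over a full day; peak traffic would be higher.
avg_events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
# Lower bound on average event size, because the slide says "more than" 40TB.
avg_bytes_per_event = BYTES_PER_DAY / EVENTS_PER_DAY

print(f"{avg_events_per_second:,.0f} events/second on average")
print(f"at least {avg_bytes_per_event:.0f} bytes/event on average")
```

The first figure comes out just under one million, matching the "~1M Events/Second" on the slide.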

49
Roadmap: Unified Event Bus

Producer: Event Publisher, Stream Producer
Consumer: Event Subscriber, Stream Worker, Custom Consumer
Other labels: Events, Destination Interface, Resource

50
And lastly,
A shout-out to all the Tinder team members who helped put this information together

51
Thank you!

52
