October 2016 HUG: Pulsar, a highly scalable, low latency pub-sub messaging system
Distributed pub/sub platform
github.com/yahoo/pulsar
Matteo Merli, [email protected]
Bay Area Hadoop Meetup, 10/19/2016
What is Pulsar?
▪ Hosted multi-tenant pub/sub messaging platform
▪ Simple messaging model
▪ Horizontally scalable: topics, message throughput
▪ Ordering, durability and delivery guarantees
▪ Geo-replication
▪ Easy to operate (add capacity, replace machines)
▪ A few numbers from production usage:
  › 1.5 years, 1.4 M topics, 100 B msg/day, zero data loss
  › Average publish latency < 5 ms, 99th percentile 15 ms
  › 80+ applications onboarded, self-serve provisioning
  › Presence in 8 data centers
Common use cases
▪ Application integration
  › Server-to-server control, status propagation, notifications
▪ Persistent queue
  › Stream processing, buffering, feed ingestion, task dispatching
▪ Message bus for large-scale data stores
  › Durable log
  › Replication within and across geo-locations
Main features
▪ REST / Java / command-line administrative APIs
  › Provision users / grant permissions
  › User self-administration
  › Metrics for topic / broker usage
▪ Multi-tenancy
  › Authentication / authorization
  › Storage quota management
  › Tenant isolation policies
  › Message TTL
  › Backlog and subscription management tools
▪ Message retention and replay
  › Rollback to redeliver already acknowledged messages
Why build a new system?
▪ No existing solution satisfied the requirements
  › Multi-tenant, 1M topics, low latency, durability, geo-replication
▪ Kafka doesn’t scale well with many topics:
  › Storage model based on an individual directory per topic partition
  › Enabling durability kills the performance
▪ Ability to manage large backlogs
▪ Operations are not very convenient
  › e.g. replacing a server requires manual commands to copy the data and involves clients
  › Client access to ZK clusters is not desirable
▪ No scalable support for keeping consumer position
Messaging Model
[Diagram: Producer-X and Producer-Y publish to Topic-T. Subscription-A (Exclusive) delivers to Consumer-A1. Subscription-B (Shared) delivers to Consumer-B1, Consumer-B2 and Consumer-B3.]
Consumer-A1 receives all messages published on T; B1, B2, B3 receive one third each
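The Exclusive and Shared subscription behavior on this slide can be illustrated with a small in-memory model. This is only a sketch and not Pulsar code: the class `SubscriptionModelSketch`, its methods, and the round-robin choice for the shared subscription are assumptions made for illustration. The slide states only that each shared consumer receives one third of the messages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of one topic with two subscriptions.
// Not Pulsar code: all names here are made up for illustration.
class SubscriptionModelSketch {

    enum Mode { EXCLUSIVE, SHARED }

    static final class Subscription {
        final Mode mode;
        // Messages delivered to each consumer, keyed by consumer name.
        final Map<String, List<String>> received = new LinkedHashMap<>();
        private final List<String> consumers = new ArrayList<>();
        private int next = 0; // rotation cursor, only matters in SHARED mode

        Subscription(Mode mode, String... consumerNames) {
            if (mode == Mode.EXCLUSIVE && consumerNames.length != 1) {
                throw new IllegalArgumentException(
                        "an exclusive subscription takes exactly one consumer");
            }
            this.mode = mode;
            for (String name : consumerNames) {
                consumers.add(name);
                received.put(name, new ArrayList<>());
            }
        }

        void dispatch(String message) {
            // EXCLUSIVE has a single consumer, so it always gets the message.
            // SHARED rotates across consumers (round-robin is an assumption).
            String target = consumers.get(next % consumers.size());
            next++;
            received.get(target).add(message);
        }
    }

    // Every subscription sees every message published on the topic.
    static void publish(List<Subscription> subscriptions, String message) {
        for (Subscription s : subscriptions) {
            s.dispatch(message);
        }
    }

    public static void main(String[] args) {
        Subscription a = new Subscription(Mode.EXCLUSIVE, "Consumer-A1");
        Subscription b = new Subscription(Mode.SHARED,
                "Consumer-B1", "Consumer-B2", "Consumer-B3");
        List<Subscription> topicT = Arrays.asList(a, b);
        for (int i = 0; i < 9; i++) {
            publish(topicT, "msg-" + i);
        }
        System.out.println("Subscription-A: " + a.received);
        System.out.println("Subscription-B: " + b.received);
    }
}
```

Each subscription keeps its own delivery state, which is why adding Subscription-B does not change what Consumer-A1 receives.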
Client API
Producer
PulsarClient client = PulsarClient.create(
        "http://broker.usw.example.com:8080");

Producer producer = client.createProducer(
        "persistent://my-prop/us-west/my-ns/my-topic");

// Handles retries in case of failure
producer.send("my-message".getBytes());

// Async version:
producer.sendAsync("my-message".getBytes())
        .thenAccept(msgId -> {
            // Message was persisted
        });
Consumer
PulsarClient client = PulsarClient.create(
        "http://broker.usw.example.com:8080");

Consumer consumer = client.subscribe(
        "persistent://my-prop/us-west/my-ns/my-topic",
        "my-subscription-name");

while (true) {
    // Wait for a message
    Message msg = consumer.receive();

    // Process message …

    // Acknowledge the message so that
    // it can be deleted by broker
    consumer.acknowledge(msg);
}
Main client library features
▪ Sync / async operations
▪ Partitioned topics
▪ Transparent batching of messages
▪ Compression
▪ End-to-end checksum
▪ TLS encryption
▪ Individual and cumulative acknowledgment
▪ Client-side stats
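To make the difference between individual and cumulative acknowledgment concrete, here is a minimal sketch of how a subscription might track both. The slide does not describe the actual bookkeeping, so `AckTrackerSketch`, its method names, and the modelling of message ids as consecutive longs starting at 0 are all assumptions for illustration.

```java
import java.util.TreeSet;

// Hypothetical acknowledgment tracker; not the Pulsar client or broker API.
// Message ids are modelled as consecutive longs starting at 0 (an assumption).
class AckTrackerSketch {
    // Every id <= markDelete is acknowledged.
    private long markDelete = -1;
    // Ids above markDelete that were acknowledged one by one.
    private final TreeSet<Long> individuallyAcked = new TreeSet<>();

    // Cumulative ack: acknowledges this id and everything before it.
    void acknowledgeCumulative(long id) {
        if (id > markDelete) {
            markDelete = id;
            individuallyAcked.headSet(id, true).clear();
            advance();
        }
    }

    // Individual ack: acknowledges only this id; earlier ids may stay pending.
    void acknowledge(long id) {
        if (id > markDelete) {
            individuallyAcked.add(id);
            advance();
        }
    }

    // Fold contiguous individual acks into the mark-delete position.
    private void advance() {
        while (individuallyAcked.remove(markDelete + 1)) {
            markDelete++;
        }
    }

    boolean isAcknowledged(long id) {
        return id <= markDelete || individuallyAcked.contains(id);
    }

    long markDeletePosition() {
        return markDelete;
    }
}
```

In this model, individual acknowledgment leaves earlier unacknowledged messages pending, while a cumulative acknowledgment covers everything up to the given message.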
Architecture
Separate layers between brokers and storage (bookies)
‣ Brokers and bookies can be added independently
‣ Traffic can be shifted very quickly across brokers
‣ New bookies will ramp up on traffic quickly
[Diagram: Pulsar cluster with a producer and a consumer, Brokers 1-3, Bookies 1-5, and ZK.]
Architecture
[Diagram: Pulsar cluster. Labeled components: producer app and consumer app (each with the Pulsar lib), service discovery, broker (load balancer, dispatcher, cache, managed ledger, BK client, global replicators), bookie, ZK, global ZK, replication.]
Broker
‣ End-to-end async message processing
‣ Messages are relayed across producers, bookies and consumers with no copies
‣ Pooled ref-counted buffers
‣ Cache recent messages
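The idea of pooled ref-counted buffers can be sketched in plain Java as follows. This is not the broker's buffer implementation: `PooledBufferSketch`, `Pool` and `Buffer` are hypothetical names, and the sketch is single-threaded for brevity. The point is that several holders share one backing array without copying, and the array returns to the pool when the last holder releases it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of pooled, reference-counted buffers.
// Not thread-safe and not the broker's actual buffer code.
class PooledBufferSketch {

    static final class Pool {
        private final Deque<byte[]> free = new ArrayDeque<>();
        private final int bufferSize;
        int allocations = 0; // how many backing arrays were ever created

        Pool(int bufferSize) {
            this.bufferSize = bufferSize;
        }

        // Reuse a free backing array if one exists, otherwise allocate.
        Buffer acquire() {
            byte[] data = free.poll();
            if (data == null) {
                data = new byte[bufferSize];
                allocations++;
            }
            return new Buffer(this, data);
        }
    }

    static final class Buffer {
        private final Pool pool;
        final byte[] data;
        private int refCount = 1; // the acquirer holds the first reference

        Buffer(Pool pool, byte[] data) {
            this.pool = pool;
            this.data = data;
        }

        // A new holder shares the same bytes instead of copying them.
        Buffer retain() {
            if (refCount == 0) {
                throw new IllegalStateException("buffer already released");
            }
            refCount++;
            return this;
        }

        // The last release hands the backing array back to the pool.
        void release() {
            if (refCount == 0) {
                throw new IllegalStateException("buffer already released");
            }
            refCount--;
            if (refCount == 0) {
                pool.free.push(data);
            }
        }

        int refCount() {
            return refCount;
        }
    }
}
```

For example, a message held both by a cache and by a dispatch path would be retained once by each holder and returned to the pool only after both release it.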
BookKeeper
▪ Replicated log service
▪ Offers consistency and durability
▪ Why is it a good choice for Pulsar?
  › Very efficient storage for sequential data
  › For each topic, multiple ledgers are created over time
  › Very good distribution of IO across all bookies
  › Isolation of writes and reads
  › Flexible model for quorum writes with different tradeoffs
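The quorum-write tradeoff can be illustrated with a small placement sketch. The slide does not give the parameters, so the names `ensembleSize`, `writeQuorum` and `ackQuorum`, and the round-robin striping of entries across bookies, are assumptions for illustration. The class below is not the BookKeeper client API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of quorum placement; not the BookKeeper client API.
class QuorumWriteSketch {

    // Indexes (within the ensemble) of the bookies that store a given entry,
    // assuming entries are striped round-robin across the ensemble.
    static List<Integer> writeSet(long entryId, int ensembleSize, int writeQuorum) {
        if (writeQuorum < 1 || writeQuorum > ensembleSize) {
            throw new IllegalArgumentException(
                    "writeQuorum must be between 1 and ensembleSize");
        }
        List<Integer> bookies = new ArrayList<>();
        for (int i = 0; i < writeQuorum; i++) {
            bookies.add((int) ((entryId + i) % ensembleSize));
        }
        return bookies;
    }

    // An add is considered confirmed once ackQuorum bookies have acknowledged it.
    static boolean isConfirmed(int acksReceived, int ackQuorum) {
        return acksReceived >= ackQuorum;
    }

    public static void main(String[] args) {
        // 5 bookies, each entry written to 3 of them.
        for (long entryId = 0; entryId < 5; entryId++) {
            System.out.println("entry " + entryId + " -> bookies "
                    + writeSet(entryId, 5, 3));
        }
    }
}
```

Under these assumptions, a larger write quorum stores more copies of each entry, and a smaller ack quorum confirms a write without waiting for the slowest bookie.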
BookKeeper - Storage
▪ A single bookie can serve and store thousands of ledgers
▪ Writes go to the journal, reads come from the ledger device:
  › Prevents read activity from impacting write latency
  › Writes are added to an in-memory write cache and committed to the journal
  › The write cache is flushed in the background to a separate ledger device
▪ Entries are sorted to allow for mostly sequential reads
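The write path described above can be sketched as a toy model: entries land in an in-memory write cache and in an append-only journal, and a later flush moves the cached entries to ledger storage in sorted order. The class `BookieWritePathSketch`, its fields, and the choice to sort by (ledger id, entry id) are assumptions for illustration. This is not the bookie's actual code, and the flush is called explicitly here rather than running in the background.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of a bookie write path; not BookKeeper code.
class BookieWritePathSketch {
    // Append-only journal: entries in arrival order, interleaved across ledgers.
    final List<String> journal = new ArrayList<>();
    // Stand-in for the separate ledger device: entries in flushed order.
    final List<String> ledgerStorage = new ArrayList<>();
    // Write cache keyed by {ledgerId, entryId}, kept sorted (assumed ordering).
    private final TreeMap<long[], byte[]> writeCache = new TreeMap<>(
            Comparator.<long[]>comparingLong(k -> k[0]).thenComparingLong(k -> k[1]));

    // Add to the write cache and commit to the journal.
    void addEntry(long ledgerId, long entryId, byte[] data) {
        writeCache.put(new long[] {ledgerId, entryId}, data);
        journal.add(label(ledgerId, entryId));
    }

    // Flush the cache to ledger storage; entries of one ledger end up adjacent,
    // so reading a ledger back is mostly sequential.
    void flush() {
        for (long[] key : writeCache.keySet()) {
            ledgerStorage.add(label(key[0], key[1]));
        }
        writeCache.clear();
    }

    static String label(long ledgerId, long entryId) {
        return "L" + ledgerId + "/E" + entryId;
    }

    public static void main(String[] args) {
        BookieWritePathSketch bookie = new BookieWritePathSketch();
        bookie.addEntry(2, 0, new byte[0]);
        bookie.addEntry(1, 0, new byte[0]);
        bookie.addEntry(2, 1, new byte[0]);
        bookie.addEntry(1, 1, new byte[0]);
        bookie.flush();
        System.out.println("journal:        " + bookie.journal);
        System.out.println("ledger storage: " + bookie.ledgerStorage);
    }
}
```

The journal keeps whatever interleaving the writers produced, while the flushed storage groups each ledger's entries together.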
Performance — Single topic throughput and latency
[Chart: Throughput and 99pct publish latency, 1 topic, 1 producer. X axis: throughput (msg/s), ticks at 1,000, 10,000, 100,000, 1,000,000 and 10,000,000. Y axis: latency (ms), 0 to 6. Series: 10-byte, 100-byte and 1 KB messages. One point is annotated 1,800,000.]
Final Remarks
• Check out the code and docs at github.com/yahoo/pulsar
• Give feedback or ask for more details on the mailing lists:
  • Pulsar-Users
  • Pulsar-Dev