Apache Kafka is a distributed messaging system developed at LinkedIn. It functions much like a message queue, but is specialized for serving real-time and batch log-processing consumers at the same time. In particular, Kafka emphasizes throughput, scalability, and consumer flexibility over some of the more traditional message queue features, e.g. those required by the Java Message Service (JMS) standard. The paper describes the architecture of Kafka and how it is used at LinkedIn. We'll take a look at a handful of design and implementation decisions in Kafka that make it very efficient and flexible.
At a high level, Kafka follows a messaging model similar to that of other message queues. Producers send messages to "topics," which are essentially named categories for messages to live in, and consumers subscribe to the topics they wish to receive messages from. The first set of implementation details that enables Kafka to be efficient comes from within a single server (referred to as a "broker"). Each broker stores some set of partitions for a topic, so that a topic's messages can be distributed among multiple brokers (we'll see how partitions play into the consumer side later). Each partition can be thought of as an ordered sequence of messages, appended one after another and broken up into files of reasonable size (the paper uses 1GB as the example). As in many other systems, an insert in Kafka is implemented as an append to the end of a file, which is efficient even on commodity hardware. The authors, however, choose not to do any in-memory caching of messages from disk. The reasons are twofold: first, the standard OS page cache is quite effective for the access patterns being used (i.e. write-through and read-ahead policies), and second, they leverage the Linux sendfile API to send data directly from the file to the socket, avoiding the overhead of copying the data into user space. An interesting effect of this is that the Kafka process itself does not manage much memory, so it is less prone to garbage collection problems and restarts, since most of the memory is handled by the OS. These optimizations give Kafka excellent performance on a single broker.
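To make the zero-copy path concrete, here is a minimal sketch of the idea in Java (the language of Kafka's ecosystem), using FileChannel.transferTo, which the JVM implements with sendfile(2) on Linux. This is not Kafka's actual code; the segment path and destination address are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySend {
    // Sends a log segment file straight to a socket. On Linux, the JVM
    // implements transferTo with sendfile(2), so bytes move from the
    // page cache to the NIC without being copied into user space.
    static void sendSegment(Path segment, SocketChannel socket) throws IOException {
        try (FileChannel file = FileChannel.open(segment, StandardOpenOption.READ)) {
            long position = 0;
            long remaining = file.size();
            while (remaining > 0) {
                // transferTo may send fewer bytes than requested, so loop.
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical endpoint and segment file, for illustration only.
        try (SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9092))) {
            sendSegment(Path.of("/tmp/page-views-0/00000000000000000000.log"), socket);
        }
    }
}
```

The point is that the message bytes travel from the page cache to the socket entirely inside the kernel; the Java heap never holds them, which is exactly why the broker can leave caching to the OS.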
Kafka does interesting things on the consumer side as well. In a traditional message queue, the server is typically in charge of keeping track of which messages have been sent to which consumers, based on the acknowledgments it has received. Kafka inverts this model: the broker keeps no such state, and consumers request messages from whatever offset they wish. This raises a few issues. Brokers can no longer delete messages they know have been consumed, because a consumer can always request older ones. Kafka solves this with a simple retention policy: messages are kept for a fixed period of time, e.g. 7 days, then deleted; and because of how partitions are implemented within a single broker, performance does not degrade as partitions grow (unlike in many message queue implementations). This also enables the different patterns of consumption that Kafka was built to handle: real-time services consume messages as soon as they appear, while data warehousing and analytics services can, for example, batch-consume once per day. The other effect of brokers not controlling message delivery is that consumers who wish to divide the messages of a topic among themselves (so that each message is consumed once) must coordinate in some way. Kafka achieves this by relying on the existing highly-available coordination service ZooKeeper. Both consumers and brokers talk to ZooKeeper to coordinate which consumers are responsible for which partitions of a topic and what offset each consumer has reached in each partition. The logic for tracking consumption progress thus lives outside the implementation of a single partition, which allows the partition itself to stay very simple and efficient, as described above.
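To see what the inverted model looks like from a client's perspective, here is a minimal sketch using the modern Kafka Java consumer. This API postdates the paper (today's clients coordinate through the brokers rather than talking to ZooKeeper directly), but the pull model is the same: the consumer, not the broker, decides which offset to read from. The topic name, partition, and address are illustrative.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Assign a partition explicitly and seek to an old offset: the
            // broker keeps no per-consumer delivery state, so the consumer
            // is free to re-read anything still within the retention window.
            TopicPartition partition = new TopicPartition("page-views", 0);
            consumer.assign(List.of(partition));
            consumer.seek(partition, 0L); // replay from the earliest retained message

            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
        }
    }
}
```

Because the broker tracks nothing per consumer, re-running this with a different offset replays history at no extra cost to the server, as long as the data is still within the retention period.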
Messaging systems often form the core of a distributed service, so reliability, scalability, and throughput are of the utmost importance. Having had some less-than-ideal experiences with existing message queues, I find it refreshing to see a new approach in Kafka. While it does not satisfy the most stringent durability requirements, Kafka makes up for this with excellent performance and the ability to distribute load effectively. It is also an example of how we often no longer need to build our own locking/consensus services, because ZooKeeper provides a flexible API and strong guarantees on both consistency and availability. I anticipate that future messaging systems will make many of the same decisions Kafka made in order to keep up with growing data volumes and the fundamentally distributed nature of modern systems.