10 Matching Annotations
  1. Jun 2017
    1. A better alternative is at least once message delivery. For at least once delivery, the consumer reads data from a partition, processes the message, and then commits the offset of the message it has processed. In this case, the consumer could crash between processing the message and committing the offset and when the consumer restarts it will process the message again. This leads to duplicate messages in downstream systems but no data loss.

      This is what SECOR does.

    1. On every received heartbeat, the coordinator starts (or resets) a timer. If no heartbeat is received when the timer expires, the coordinator marks the member dead and signals the rest of the group that they should rejoin so that partitions can be reassigned. The duration of the timer is known as the session timeout and is configured on the client with the setting session.timeout.ms. 

      Time to live for the consumers. If the heartbeat doesn't reach the co-ordindator in this duration then the co-ordinator redistributes the partitions to the remaining consumers in the consumer group.

    2. The high watermark is the offset of the last message that was successfully copied to all of the log’s replicas.

      High Watermark: messages copied over to log replicas

    3. Kafka new Client which uses a different protocol for consumption in a distributed environment.

    4. Kafka scales topic consumption by distributing partitions among a consumer group, which is a set of consumers sharing a common group identifier.

      Topic consumption is distributed among a list of consumer group.

    1. Consumers in this group are designed to be dead-simple, performant, and highly resilient. Since the data copied verbatim, no code upgrades are required to support new message types.

      exactly what we want

  2. May 2017
    1. The way consumption is implemented in Kafka is by dividing up the partitions in the log over the consumer instances so that each instance is the exclusive consumer of a "fair share" of partitions at any point in time. This process of maintaining membership in the group is handled by the Kafka protocol dynamically. If new instances join the group they will take over some partitions from other members of the group; if an instance dies, its partitions will be distributed to the remaining instances.

      The coolest feature: this way all you need to do is add new consumers in a consumer group to auto scale per topic

    2. Consumers label themselves with a consumer group name

      maintain separate consumer group per tenant basis. Helps to scale out when we have more load per tenant.

    1. Consumers label themselves with a consumer group name, and each record published to a topic is delivered to one consumer instance within each subscribing consumer group. Consumer instances can be in separate processes or on separate machines.

      kafka takes care of the consumer groups. Just create one Consumer Group for each topic.