Kafka MirrorMaker
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330
reduce coupling between services
But in return it demands that each service be aware of the initial request's business logic.
It acknowledges all incoming requests and delegates operations to the respective services.
E.g. an aggregator
Claim-Check pattern
In a way, the Valet Key pattern is a claim check: you don't send the data, you send a token that can be used to get/process the data. The same consideration applies to event streaming - e.g. a stream of all actions that happen to a blob in storage can also be kind-of-sort-of considered a claim check.
message size exceeds the data limit of the message bus
Though it will make the code a bit more complex
client with a key or token that the data store can validate
E.g. pre-signed URLs for S3
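A minimal sketch of that with boto3 (bucket and key names are made up): the producer uploads the payload once and hands out only a short-lived URL, which is both the valet key and the claim check.

```python
import boto3

s3 = boto3.client("s3")

# Generate a pre-signed URL for a hypothetical object. Whoever holds the
# URL can fetch the data directly from S3 until it expires.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "payloads/large-message.bin"},
    ExpiresIn=3600,  # token is only valid for an hour
)

# Only `url` (the claim check) goes onto the message bus, not the payload,
# so the message stays under the bus's size limit.
```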
autoincremented fields can't be coordinated across shards, possibly resulting in items in different shards having the same shard key
It's a completely separate problem, not related to hashing. Any key used for hashing can potentially be duplicated. Even UUIDs are not 100% unique - the likelihood of a collision is just insanely low, not zero. Furthermore, uniqueness seems to oppose the purpose of logical partitioning: the more unique the key, the less logical meaning it carries. E.g. in the triplet merchandise-id-as-code / merchandise-id-as-int / merchandise-id-as-UUID, the code is the most meaningful but the least unique; the UUID is the opposite - unique, but carrying no significant meaning.
similar volume of I/O
This. Not data volume, but I/O
because the partition keys are hashes of the shard keys or data identifiers.
Consistent hashing?
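A minimal consistent-hashing sketch (my assumption of what's behind this - the source only says the partition keys are hashes of the shard keys). Virtual nodes smooth the key distribution across physical shards:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, shards, vnodes=100):
        points = []
        for shard in shards:
            for i in range(vnodes):
                points.append((self._hash(f"{shard}#{i}"), shard))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._shards = [s for _, s in points]

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def shard_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash; wrap at the end.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._shards[idx]

ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
print(ring.shard_for("merchandise-42"))  # stable while the ring is stable
```

Adding or removing a shard only remaps the keys adjacent to its ring points, instead of rehashing everything.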
The next figure illustrates storing sequential sets (ranges) of data in shards.
So a lookup is still required
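A sketch of what that lookup can look like (ranges and shard names are hypothetical): a shard map of upper bounds that every request has to consult before it can be routed.

```python
import bisect

# Each entry is (upper bound of the key range, shard holding that range).
SHARD_RANGES = [(10_000, "shard-a"), (20_000, "shard-b"), (30_000, "shard-c")]

def shard_for(key: int) -> str:
    bounds = [upper for upper, _ in SHARD_RANGES]
    idx = bisect.bisect_left(bounds, key)
    if idx == len(SHARD_RANGES):
        raise KeyError(f"no shard covers key {key}")
    return SHARD_RANGES[idx][1]

print(shard_for(12_345))  # -> shard-b
```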
When defining a materialized view, maximize its value by adding data items or columns to it based on computation or transformation of existing data items, on values passed in the query, or on combinations of these values when appropriate.
Denormalization?
A materialized view is never updated directly by an application, and so it's a specialized cache.
Would be nice to have some examples of systems that perform the data store -> materialized view sync
To support efficient querying, a common solution is to generate, in advance, a view that materializes the data in a format suited to the required results set.
'Write-heavy vs read-heavy' balance
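A tiny sketch of that sync (names and event shape are hypothetical): a consumer tails the change feed of the write store and folds each change into views precomputed the way reads need them, including a computed value as the quote suggests.

```python
from collections import defaultdict

# The materialized views: data keyed and pre-aggregated for the read side.
orders_by_customer: dict[str, list[str]] = defaultdict(list)
total_by_customer: dict[str, float] = defaultdict(float)

def apply_change(event: dict) -> None:
    # Invoked for every change event from the write store. The application
    # never writes these views directly - they are rebuilt purely from changes.
    if event["type"] == "OrderPlaced":
        orders_by_customer[event["customer_id"]].append(event["order_id"])
        total_by_customer[event["customer_id"]] += event["amount"]

apply_change({"type": "OrderPlaced", "customer_id": "c1",
              "order_id": "o1", "amount": 9.99})
print(total_by_customer["c1"])  # reads hit the view, never the write model
```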
Use this pattern to improve query performance when an application frequently needs to retrieve data by using a key other than the primary (or shard) key.
... or consider an indexing engine like Elasticsearch or Solr
The first operation searches the index table to retrieve the primary key, and the second uses the primary key to fetch the data.
... especially if data storage doesn't support any form of join
The overhead of maintaining secondary indexes can be significant
This
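A minimal sketch of the two-step lookup and of where that overhead lives (all names hypothetical):

```python
# Primary store, keyed by primary key.
data_by_pk = {
    "o1": {"customer": "c1", "city": "Oslo"},
    "o2": {"customer": "c2", "city": "Oslo"},
}
# Index table, keyed by the secondary key; must be maintained on every write.
index_city_to_pk = {"Oslo": ["o1", "o2"]}

def find_by_city(city: str) -> list[dict]:
    pks = index_city_to_pk.get(city, [])   # operation 1: index lookup
    return [data_by_pk[pk] for pk in pks]  # operation 2: fetch by primary key

print(find_by_city("Oslo"))
# The write overhead: every insert/update/delete of a row must also touch
# index_city_to_pk (and any other index tables), ideally atomically.
```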
Adding a timestamp to every event
Only if clocks are monotonic and guaranteed to be in sync between event source service instances. And even then a clash is possible - depends on timestamp precision.
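One common mitigation (my assumption, not from the source): pair the timestamp with a per-instance sequence number, so two events in the same clock tick still get a total order. This only orders events from a single instance; cross-instance ordering still needs synchronized clocks or a logical clock.

```python
import itertools
import time

_seq = itertools.count()

def event_version() -> tuple[int, int]:
    # (timestamp, sequence) tuples compare lexicographically, so the
    # sequence breaks ties when timestamp precision is too coarse.
    return (time.time_ns(), next(_seq))

v1 = event_version()
v2 = event_version()
print(v1 < v2)  # True unless the wall clock stepped backwards between calls
```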
by avoiding the need to synchronize the data model
How does it avoid the data model (schema) change sync problems?
data update conflicts are more likely because the update operations take place on a single item of data
But in a distributed system the conflict problem still exists? Unless there's partitioning that guarantees each item is always processed by the same event-sourcing service instance. And even then, multiple clients can concurrently apply different changes to the same data?
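One common answer to this (my assumption - the source doesn't prescribe it) is optimistic concurrency on append: the writer states which stream version its change was based on, and the store rejects the append if the stream has moved on. A minimal sketch with an in-memory store:

```python
class ConcurrencyError(Exception):
    pass

streams: dict[str, list[dict]] = {}  # hypothetical event store, per stream

def append(stream_id: str, expected_version: int, event: dict) -> None:
    events = streams.setdefault(stream_id, [])
    if len(events) != expected_version:
        # A concurrent writer appended after we read: re-read, rebase, retry.
        raise ConcurrencyError(
            f"{stream_id}: expected v{expected_version}, at v{len(events)}"
        )
    events.append(event)

append("order-1", 0, {"type": "OrderPlaced"})
append("order-1", 1, {"type": "OrderPaid"})
# A second client that had also read version 1 would now get ConcurrencyError.
```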
directly against a data store, which can slow down performance
How is it related? Besides, wouldn't the part responsible for maintaining the materialized view of the data have similar performance issues?