
"Exactly Once"

Over a window of time that varies with the volume of ingested events.

Basically, they read from a Kafka stream and have a deduplication layer in RocksDB that produces to another Kafka stream. They process about 2.2 billion events through it per day.
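
For anyone curious, the shape of that pipeline is roughly the following. This is a minimal Python sketch using confluent-kafka, with an in-memory dict plus TTL standing in for the RocksDB layer; the topic names, window size, and broker address are placeholders of mine, not theirs:

    import time
    from confluent_kafka import Consumer, Producer

    WINDOW_SECONDS = 4 * 3600  # dedupe window; placeholder, the real window varies with load

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "dedupe",
        "enable.auto.commit": False,
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["events-in"])
    producer = Producer({"bootstrap.servers": "localhost:9092"})

    seen = {}  # message_id -> first-seen timestamp (stand-in for RocksDB)

    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        msg_id = msg.key()
        now = time.time()
        # Naive eviction: drop entries older than the window so the store stays bounded.
        for k, t in list(seen.items()):
            if now - t > WINDOW_SECONDS:
                del seen[k]
        if msg_id not in seen:
            seen[msg_id] = now
            producer.produce("events-out", key=msg_id, value=msg.value())
        # Commit only after the produce is flushed, so a crash causes a re-read
        # rather than a drop; the re-read is then caught by the dedupe check above.
        producer.flush()
        consumer.commit(message=msg)
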

While this will reduce duplicates and get closer to exactly once (helping mitigate the two-generals problem on incoming requests and, potentially, for work inside their data center), they still face the same problem again when they push data out to their partners. With some packet loss, they will be sending duplicates to the partner.

Not to downplay what they have done; we are doing a similar thing near our exit nodes to do our best to prevent duplicate events from making it out of our system.
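
On the egress side, about the best you can do is tag each outbound event with an idempotency key so the partner can dedupe on their end when a retry fires after a lost response. A rough sketch of that, with a hypothetical partner endpoint and helper name:

    import requests

    def send_to_partner(event: dict, endpoint: str = "https://partner.example.com/events"):
        # The key comes from the event itself, so every retry of the same
        # event carries the same value and the partner can drop repeats.
        headers = {"Idempotency-Key": event["message_id"]}
        for attempt in range(3):
            try:
                resp = requests.post(endpoint, json=event, headers=headers, timeout=5)
                if resp.ok:
                    return resp
            except requests.RequestException:
                # Timeout or dropped response: we cannot tell whether the
                # partner received it, so we retry and rely on their dedupe.
                pass
        raise RuntimeError("delivery failed after retries")
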


