 | perfectspiral on Dec 6, 2020 | next [–] It looks like Azure has now integrated Citus into its Postgres offering, but I'm curious why AWS hasn't integrated either Vitess or Citus at all? It seems like a well-executed combination of Vitess/Citus + read replicas pointing to decoupled page storage would result in a fully featured, non-proprietary database that scales all the way down to your laptop and all the way up to the heaviest OLTP use cases. If that existed, only truly masochistic people would continue using things like DynamoDB. |
|
 | carlineng on Dec 6, 2020 | parent | next [–] Citus was acquired by MSFT in 2019, hence the integration. Vitess is primarily a MySQL-based solution, and I imagine that AWS would rather have its customers on Postgres-flavored DBs, given that MySQL is currently owned by Oracle. |
|
 | perfectspiral on Dec 7, 2020 | root | parent | next [–] The Citus extension is still open-source though, right? Although I think it was a solid strategic move by MSFT to acqui-hire the Citus devs (probably all of them were given substantial golden handcuffs) to block AWS from hiring experts to integrate this extension into their PG offering. |
|
 | marcinzm on Dec 6, 2020 | parent | prev | next [–] > If that existed, only truly masochistic people would continue using things like DynamoDB. Which is probably why AWS doesn't support it: they make less money and have less lock-in on their clients. |
|
 | Scarbutt on Dec 6, 2020 | parent | prev | next [–] They want you to use Aurora instead. |
|
 | wiredfool on Dec 6, 2020 | prev | next [–] Sharding by customer is an attractive way to start sharding, potentially with minimal disruption to ops or the data model, but it truly sucks when your biggest customers don't fit on one shard anymore. |
|
 | sougou on Dec 6, 2020 | parent | next [–] The way to reason about this is: If a customer had a huge business, what key would they shard by? Then choose that as the sharding key for all customers. |
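A minimal sketch of that idea in Python (hypothetical names, and not Vitess's actual vindex code): if the key is chosen at the granularity your biggest imaginable customer would need, say channel rather than team, a whale customer spreads across shards just like everyone else.

    import hashlib

    NUM_SHARDS = 64

    def shard_for(key: str) -> int:
        """Map a sharding key to a shard number via a stable hash."""
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % NUM_SHARDS

    # Sharding by team pins the whole "huge customer" to one shard:
    print(shard_for("team:T12345"))

    # Sharding by channel spreads that same customer across many shards:
    for channel in ("C1", "C2", "C3"):
        print(shard_for(f"channel:{channel}"))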
|
 | philsnow on Dec 7, 2020 | root | parent | next [–] sougou is one of the original vitess devs, he would know this very well. Hi Sugu! |
|
 | atombender on Dec 6, 2020 | parent | prev | next [–] The solution is to have different shard sizes. Your largest customer needs to fit on the largest shard size (extra large VMs or whatever). |
|
 | wiredfool on Dec 6, 2020 | root | parent | next [–] Fine, but there comes a time when your largest customer doesn't fit on your largest machine. I wouldn't say that killed my first startup, but it certainly didn't help. |
|
 | continuations on Dec 6, 2020 | root | parent | next [–] How big is your largest customer though? You can easily buy a machine with several TB of RAM, several hundred TB of SSDs in RAID giving you millions of IOPS, and quad-socket 256 cores. How likely is it that a single machine cannot handle a single customer? |
|
 | jjeaff on Dec 7, 2020 | root | parent | next [–] Ya, this is what I always think about when I see people spending lots of effort on "scalability". You either need an enormous client base or are doing something very specialized that requires tons of memory or CPU per customer. I would think 99% of all internet companies could likely handle everything on a single massive machine, or two for redundancy, assuming they were somewhat optimized. |
|
 | derekperkins on Dec 7, 2020 | root | parent | next [–] That introduces different problems. Applying MySQL replication logs is single-threaded by default and generally doesn't scale as well as the master, so it becomes increasingly difficult to keep replicas from lagging. As your database gets into tens or hundreds of TBs, backups and restores take hours or days to complete, and any DR scenario becomes an existential threat. For large tables, schema changes can take weeks. This often results in developers rolling their own custom table-sharding strategies, even though they might still be on a single machine. I'm not saying that you can't architect around these problems, but operating a db at massive scale on a single machine isn't the obvious win that it seems to be. |
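To put the backup/restore point in rough numbers (the size and throughput here are assumptions for illustration, not measurements from any real system):

    db_size_tb = 50               # hypothetical database size in TB
    throughput_gb_per_s = 0.5     # assumed sustained copy rate in GB/s

    seconds = db_size_tb * 1024 / throughput_gb_per_s
    print(f"~{seconds / 3600:.1f} hours per full copy")   # ~28.4 hours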
|
 | atombender on Dec 6, 2020 | root | parent | prev | next [–] I guess I was assuming a shard was more than one machine. If a customer can't be horizontally scaled across nodes, then of course there's an upper limit on how large a shard can be. Databases like CockroachDB and TiDB help here. |
|
 | chevman on Dec 6, 2020 | parent | prev | next [–] Yeah, seems like the key point here is to be careful about architecting your infrastructure around a somewhat arbitrary construct like 'customer'. The relevant entity to scale around for Slack is likely 'active users' or even 'messages'. Which may map loosely to 'customers', but that's almost incidental. |
|
 | juancampa on Dec 6, 2020 | parent | prev | next [–] The article addresses this particular point > Each keyspace is a logical collection of data that roughly scales by the same factor — number of users, teams, and channels. Say goodbye to only sharding by team, and to team hot-spots! |
|
 | kamikaz1k on Dec 6, 2020 | parent | prev | next [–] Has that ever happened to you? Would love to read about that story and hopefully the solution... |
|
 | yashap on Dec 6, 2020 | prev | next [–] > Today, we serve 2.3 million QPS at peak. 2M of those queries are reads and 300K are writes. Damn, even for a service as popular as Slack, that's significantly more than I expected. Slack has ~12-13 mil DAUs, right? I assume at peak time of day, maybe 2-3 million actively using the product at the same time? If that's a fair assumption, that's roughly 1 MySQL query per second per active user - seems like a fair bit? I wonder if they do polling (instead of websockets)? At my work we have roughly 1 order of magnitude fewer DAUs, but roughly 2 orders of magnitude fewer QPS. And we also have chat areas of the product. |
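The rough arithmetic behind that estimate, using the article's QPS figure and my own guess at peak concurrency:

    peak_qps = 2_300_000          # from the article: 2.3M QPS at peak
    concurrent_users = 2_500_000  # assumption: ~2-3M users active at once

    print(f"{peak_qps / concurrent_users:.2f} queries/sec per active user")  # ~0.92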
|
 | jhgg on Dec 6, 2020 | parent | next [–] It is a pretty high QPS. I work on a chat service that is quite popular (even more popular than Slack) and looked at our QPS to our persistent data stores. It's less than a tenth of Slack's when you normalize for DAU. Really wonder what they're doing over there. |
|
 | derekperkins on Dec 7, 2020 | root | parent | next [–] Whatever the service, it isn't the platform that Slack is. I would guess that a huge percentage of their db traffic is from apps and custom integrations. |
|
 | karlding on Dec 6, 2020 | parent | prev | next [–] > Slack has ~12-13 mil DAUs, right? As of September 2019, Slack reported 12 million DAUs [0]. Of course, that is pre-pandemic, and the only figures Slack has provided regarding demand in 2020 are these tweets [1] from Slack CEO Stewart Butterfield back in March, which mention that Slack was serving 12.5M simultaneously connected users. [0] https://slack.com/blog/news/work-is-fueled-by-true-engagemen... [1] https://twitter.com/stewart/status/1243000487365861376 |
|
 | ahachete on Dec 6, 2020 | parent | prev | next [–] GitLab.com is serving around 300 QPS at peak (see https://about.gitlab.com/blog/2020/09/11/gitlab-pg-upgrade/). While that's like 1/7th of Slack's traffic, it is served by a single PostgreSQL cluster. That's the comment. |
|
 | yashap on Dec 7, 2020 | root | parent | next [–] May want to edit your post - I believe you meant 300K QPS, not 300 QPS :) But yeah, that's very impressive to handle that much load without sharding! Though it is a cluster of 12 instances, each with 96 CPUs and 614 GB RAM - those are some seriously beefy machines. Slack are getting 300K WRITES per second, though, which really leaves no choice but to shard - read replicas are nice for scaling reads, but do nothing for writes, and that's A LOT of writes. |
|
 | ahachete on Dec 7, 2020 | root | parent | next [–] Cannot edit any longer, but yes, that's a typo, it's 300K QPS. Right now the cluster is only 8 instances, and the write-only traffic on the master spikes to 70K QPS. So I agree with you - pretty beefy hardware, and it'd probably require sharding before reaching Slack's scale, but still quite impressive. |
|
 | conradfr on Dec 6, 2020 | parent | prev | next [–] Does that account for all those apps/bots (monitoring/CI/VCS/etc.) that "spam" channels? |
|
 | mosselman on Dec 6, 2020 | prev [–] Is there something comparable for PostgreSQL that makes it easy to set up clusters with? Maybe with automatic failover and HA? |
|
 | shlomi-noach on Dec 6, 2020 | parent | next [–] Deepthi, tech lead for Vitess, just published this RFC a few days ago: https://github.com/vitessio/vitess/issues/7084 Vitess plans to support PostgreSQL, but there is no work on that at this time. |
|
 | merb on Dec 7, 2020 | parent | prev | next [–] On k8s there is:
- crunchydata
- the thing Crunchy is based on (well, actually they are based more on the underlying things like Patroni): the Zalando postgres operator https://github.com/zalando/postgres-operator - really, really solid stuff |
|
 | nkcmr on Dec 6, 2020 | parent | prev | next [–] Not a DBA, but this seems comparable: > stolon - PostgreSQL cloud native High Availability https://github.com/sorintlab/stolon |
|
 | manquer on Dec 6, 2020 | parent | prev | next [–] Citus does sharding as an extension for Postgres. If all you want is HA and failover, there are simpler solutions like pgpool, pgbouncer, etc. |
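For a sense of what the Citus route looks like, a minimal sketch driven from Python via psycopg2 (the connection string and table are made up; create_distributed_table is the Citus function that shards a table by a column):

    import psycopg2

    conn = psycopg2.connect("postgresql://postgres@localhost:5432/app")
    conn.autocommit = True

    with conn.cursor() as cur:
        cur.execute("CREATE EXTENSION IF NOT EXISTS citus;")
        cur.execute("""
            CREATE TABLE IF NOT EXISTS events (
                customer_id bigint NOT NULL,
                event_id    bigint NOT NULL,
                payload     jsonb,
                PRIMARY KEY (customer_id, event_id)
            );
        """)
        # Citus distributes (shards) the table by customer_id across workers.
        cur.execute("SELECT create_distributed_table('events', 'customer_id');")

    conn.close()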
|
 | dilyevsky on Dec 6, 2020 | parent | prev | next [–] CockroachDB uses pg protocol but doesn’t support all the features (close enough though). |
|
 | Scarbutt on Dec 6, 2020 | parent | prev [–] Not a library, but you can achieve the same with less work using Amazon Aurora Postgres. |
|
 | perfectspiral on Dec 6, 2020 | root | parent | next [–] It's not the same; you can't scale out beyond one writer on Aurora. |
|
 | isatty on Dec 6, 2020 | root | parent | prev [–] And a boatload of money. |
|