What is the industry/production grade solutions or if you have already any experience please share it. Thanks

  • maynarkh@feddit.nl
    link
    fedilink
    arrow-up
    16
    ·
    edit-2
    11 months ago

    Yeah, Postgres might be better, like the people in the other answers say.

    For the people though, maybe OP as well, who haven’t got free rein on what DB system they use because they might be doing this for work, or they might already have a lot built on MySQL, or they might be hosting software that only works with MySQL:

    https://dev.mysql.com/doc/refman/8.0/en/replication.html

    Create read replicas, route read traffic to them. You can scale reads out on MySQL or any other relational DB really by using read replicas, you usually can’t scale writes horizontally though. The one thing you should check in addition to your traffic profile vis-à-vis reads vs. writes is whether read replicas still provide consistency, ergo if a write on the master immediately appears on the replicas.

    Databases usually have two choices there; they either have replicas lag behind the master, meaning that something you wrote to the master will not always appear immediately on the read replicas, or they lock the whole system up on each write to guarantee you are reading the latest info. I guess MySQL would tend to the saner former option.

    If you have any reads that are really important to be consistent with writes done just before them, just route them to the master if you still have capacity there.

    • incompetentboob@lemmy.world
      link
      fedilink
      English
      arrow-up
      5
      arrow-down
      1
      ·
      11 months ago

      Thank you for answering the question like a helpful person instead of just instantly posting “you’re wrong and you should use this instead”

      It’s so frustrating when someone doesn’t understand that there are constraints that OP hasn’t included because it’s not relevant to the question.

    • A Phlaming Phoenix@lemm.ee
      link
      fedilink
      arrow-up
      3
      ·
      11 months ago

      To build on this (and I also use Postgres, so I’m assuming MySQL/MariaDB are similar), there is almost certainly a metric emitted by the DBs that can tell you how long that lag is between initial write and replica updates. That would be the thing to monitor to detect the specific problem where replication lag creates application lag.

      Also worth mentioning that horizontal scaling can solve some problems, but there are a few major configuration items to check that will improve performance across all the replicas. Off the cuff:

      • Properly index your tables
      • Build on hardware big enough to keep indexed data in memory
      • Don’t use the MyISAM engine with MySQL since it has a bunch of performance and locking problems. Upgrade to InnoDB.
      • Optimize your queries. Horizontal scaling won’t give you much improvement if you’re doing full table scans or something like that. ORMs can produce some pretty ugly SQL sometimes. Consider writing your own queries that are better optimized to make use of your indices.
    • xmunk
      link
      fedilink
      arrow-up
      2
      ·
      11 months ago

      It’s also important to consider that horizontal scaling has limits, when you enable replication it does put additional load on the write instance and a naively configured replication setup will add additional load for every replica.

    • xmunk
      link
      fedilink
      arrow-up
      4
      arrow-down
      1
      ·
      edit-2
      11 months ago

      I’m glad someone else said it… It may sound glib but the performance limits between Postgres and MySQL are night and day.

      More seriously though… Read replication is how you horizontally scale any RDBMS. Read replication unlocks a serious amount of read bandwidth but it comes at a (usually steep) technical cost to migrate logic to be handle aware - especially when it comes to complex transactions.

        • xmunk
          link
          fedilink
          arrow-up
          1
          arrow-down
          1
          ·
          11 months ago

          I provided a “real” answer, but I think it’s important to realize that MySQL is far behind the game in terms of performance… The Oracle acquisition really fucked that software and it hasn’t recovered.

          Anyways, horizontal scaling means replicas… But the specifics of that (and resources to aide with it) are highly dependent on their infrastructure… Self hosted? AWS RDS? Aurora? All those have really different starting points when considering a migration.

  • fruitycoder
    link
    fedilink
    arrow-up
    2
    ·
    11 months ago

    I’ve been eyeing vitess for this personally, haven’t needed it yet, so I can’t tell you how good it is in my opinion.

    • hypnotic_nerd@programming.devOP
      link
      fedilink
      English
      arrow-up
      2
      ·
      11 months ago

      Can you help me out here, I tried to setup vitess with kubernetes and vitess-operator but pods were always crashing and there is no proper community support and docs for it. How did you implement it? And how is the support?

      • fruitycoder
        link
        fedilink
        arrow-up
        2
        ·
        edit-2
        11 months ago

        Are you following this guide perhaps?

        Also make sure you have a storage class setup (for example Longhorn or Rook/Ceph) the operator docs doesn’t explicitly mention that requirement.

  • Gallardo994
    link
    fedilink
    arrow-up
    2
    ·
    11 months ago

    I may suggest Galera Cluster. Easy to set up and pretty reliable