Federated like (vote) counts currently differ wildly from server to server. Unless something is badly off on my particular server, PostgreSQL is reporting a mean (average) time of 0.4 seconds per single comment vote INSERT, and post vote INSERTs are similar. (NOTE: my server uses classic hard drives, benchmarked at 100 MB/sec, not an SSD.)

Discussion of the SQL statement for a single comment vote insert: https://lemmy.ml/post/1446775

Every single vote is both an HTTP transaction from the remote server and a SQL transaction. I am looking into whether PostgreSQL can process batches of inserts without checking all the index constraints at each single insert: https://www.postgresql.org/docs/current/sql-set-constraints.html

Can the Rust code for inserts from federation reasonably be modified to open a transaction (BEGIN), accumulate comment_like INSERTs, and COMMIT them all at once on every 10th one? Possibly with a timer as well: if, say, 15 seconds pass with no new like entries from remote servers, do a COMMIT to flush based on the timeout.
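
To make the question concrete, here is a minimal sketch of the buffering logic only, not Lemmy's actual code. The `Vote` struct, the `VoteBatcher` type, and the `commit` callback are all hypothetical names; `commit` stands in for whatever would run BEGIN, the batched INSERTs, and COMMIT against PostgreSQL. The batch size of 10 and the 15 second idle timeout are the numbers floated above, passed in as parameters.

```rust
use std::time::{Duration, Instant};

/// Hypothetical vote row; field names are illustrative, not Lemmy's schema.
#[derive(Debug, Clone, PartialEq)]
pub struct Vote {
    pub person_id: i64,
    pub comment_id: i64,
    pub score: i16,
}

/// Buffers votes and hands them to `commit` one batch at a time.
/// `commit` is an assumed stand-in for "BEGIN; INSERT ...; COMMIT".
pub struct VoteBatcher<F: FnMut(&[Vote])> {
    pending: Vec<Vote>,
    max_batch: usize,
    idle_timeout: Duration,
    last_arrival: Option<Instant>, // when the newest buffered vote arrived
    commit: F,
}

impl<F: FnMut(&[Vote])> VoteBatcher<F> {
    pub fn new(max_batch: usize, idle_timeout: Duration, commit: F) -> Self {
        Self {
            pending: Vec::with_capacity(max_batch),
            max_batch,
            idle_timeout,
            last_arrival: None,
            commit,
        }
    }

    /// Queue one vote and flush as soon as the batch is full.
    pub fn push(&mut self, vote: Vote, now: Instant) {
        self.pending.push(vote);
        self.last_arrival = Some(now);
        if self.pending.len() >= self.max_batch {
            self.flush();
        }
    }

    /// Call periodically from a timer: flushes a partial batch once
    /// no new vote has arrived for `idle_timeout`.
    pub fn tick(&mut self, now: Instant) {
        if let Some(last) = self.last_arrival {
            if now.duration_since(last) >= self.idle_timeout {
                self.flush();
            }
        }
    }

    /// Hand everything buffered so far to `commit` and reset the buffer.
    pub fn flush(&mut self) {
        if !self.pending.is_empty() {
            (self.commit)(&self.pending);
            self.pending.clear();
        }
        self.last_arrival = None;
    }

    /// Number of votes waiting for the next commit.
    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```

The clock is passed in as `now` so the timeout path can be exercised without sleeping. One trade-off to keep in mind: votes sitting in the buffer are lost if the process dies before the next flush, so the remote server would have to resend them.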

Storage I/O writing for votes alone is pretty large…

  • King@vlemmy.net
    1 year ago

    Batching the inserts up only kicks the can down the road a few weeks. We need a 500x improvement in insertion time.

    • RoundSparrow@lemmy.ml (OP, M)
      1 year ago

      I have raised the proposal to move all federation out of lemmy_server into a separate service that has its own queue. I think that would make it easier for people to work on and update the code. The email systems I have worked with that use a database storage backend ran their own MTA service rather than running it in the main app’s core. I also think Reddit accepts data before it reaches PostgreSQL, as I’ve seen comments get backed up when one of their servers or services goes offline.
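
The queue-in-front-of-the-database idea from the last reply could be sketched roughly as below. This is only an illustration under stated assumptions, not the proposed service: `Activity`, `spawn_writer`, and the `store` callback are hypothetical names, an in-process bounded channel from the Rust standard library stands in for a real queue between services, and `store` stands in for the PostgreSQL write.

```rust
use std::sync::mpsc::{sync_channel, RecvTimeoutError, SyncSender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical inbound federation activity (illustrative fields only).
#[derive(Debug, Clone, PartialEq)]
pub struct Activity {
    pub actor: String,
    pub object_id: i64,
    pub score: i16,
}

/// Starts a writer thread that drains a bounded queue in batches.
/// Returns the sender an HTTP handler would use, plus a handle that
/// yields the number of batches written once all senders are dropped.
pub fn spawn_writer<F>(
    capacity: usize,
    max_batch: usize,
    idle: Duration,
    mut store: F, // assumed stand-in for the database write
) -> (SyncSender<Activity>, JoinHandle<usize>)
where
    F: FnMut(Vec<Activity>) + Send + 'static,
{
    let (tx, rx) = sync_channel::<Activity>(capacity);
    let handle = thread::spawn(move || {
        let mut batches = 0;
        let mut buf = Vec::with_capacity(max_batch);
        loop {
            match rx.recv_timeout(idle) {
                Ok(activity) => {
                    buf.push(activity);
                    if buf.len() < max_batch {
                        continue; // keep filling the current batch
                    }
                }
                // Nothing arrived for `idle`: write out the partial batch.
                Err(RecvTimeoutError::Timeout) => {}
                // All senders are gone: write what is left and stop.
                Err(RecvTimeoutError::Disconnected) => {
                    if !buf.is_empty() {
                        store(std::mem::take(&mut buf));
                        batches += 1;
                    }
                    return batches;
                }
            }
            if !buf.is_empty() {
                store(std::mem::take(&mut buf));
                batches += 1;
            }
        }
    });
    (tx, handle)
}
```

Because the channel is bounded, `send` blocks once `capacity` activities are waiting, so a slow database pushes back on the acceptor instead of letting the queue grow without limit. A real separate service would presumably need a durable queue rather than process memory.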