php mysql aggregating/ combining queries to reduce db connections



  • I have a use case where a large number of clients are reporting their operational states every minute using an HTTP connection to a php script.

    For each client, I have a row in a mysql table with unique clientID.

    PHP script first checks if the client is valid, and update the timestamp so I know when was the last time they connected.(A heartbeat of sense)

    INSERT into clientStatus (cID,..,timeStamp) VALUES (123,....,time())
    ON DUPLICATE KEY UPDATE ......,timeStamp=time();
    

    I use INSERT/ UPDATE to handle any new clients joining, but I can handle it separately if it helps to have a better flow on the larger scale. Each HTTP request creates a write operation (as above) to the DB.

    Is there a way to aggregate all such queries (15-20 thousand requests per minute), and save it somewhere, combine them and run it as a single INSERT/ UPDATE query? Or is this line of thinking not the right approach?

    At this point, the AWS RDS IOPS is being a bottleneck, and that's what caused me to explore this angle.



  • There are multiple places where there might be a bottleneck.

    • HTTP
    • The client process that then talks to MySQL
    • Connecting to MySQL (for a single action)
    • Competing with other actions to do the Update (or IODKU)

    Plan A: Scrape the web server's log to see that someone made a request. (The web service may not let you do this.) This would let you batch the data, and even de-duplicate much of the data. Periodically in the scraping, do a batch IODKU into the table. The timestamp updating would be delayed, but that may be OK.

    Plan B: If the table where the update is occurring is somewhat busy with other stuff, it may be better to split this column out into a separate table. (I often recommend this for a "Likes" and "Visits" counter.)

    Plan 😄 Like Plan B, but stage the updates in a very simple table that is periodically flushed with batch inserts into the appropriate locations.

    Until you know which part of the processing is the bottleneck, you can't be sure that any of my suggestions (or others) would be beneficial.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2