Analyzing queries with high disk IO



  • RDS Aurora PostgreSQL 10.14 instance db.r5.4xlarge.

    I'm trying to track down some high RDS IO costs on my instance. I'm looking at pg_stat_statements and asking whether the following query makes sense:

    SELECT rolname::regrole,
           calls,
           round((total_time / 1000 / 60)::numeric, 3)                             as total_minutes,
           round(((total_time / 1000) / calls)::numeric, 3)                        as average_time_seconds,
           rows,
           userid,
           regexp_replace(query, '[ \t\n]+', ' ', 'g')                             AS query_text,
           100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent,
           pg_size_pretty((shared_blks_hit + shared_blks_read) * 8192)             AS total_memory_read
    FROM pg_stat_statements
             JOIN pg_roles r
                  ON r.oid = userid
    WHERE calls > 1
      AND rolname NOT LIKE '%backup'
      AND rolname <> 'rdsadmin'
      AND rolname <> 'rdsproxyadmin'
    ORDER BY 8 asc nulls last
    LIMIT 5;
    

    According to the documentation, hit_percent indicates how much data was fetched from the cache (shared_buffers or OS kernel) vs. the total data - the higher the number, the better...

    As my own addition, I have total_memory_read, which is the total amount of memory read from both disk and cache. Here is an output I receive:

    |rolname   |calls|total_minutes|average_time_seconds|rows|userid|query_text            |hit_percent       |total_memory_read|
    +----------+-----+-------------+--------------------+----+------+----------------------+------------------+-----------------+
    |XXX       |8    |4.278        |32.085              |256 |20550 |SELECT some_query ... |44.915182913169814|420 GB           |
    +----------+-----+-------------+--------------------+----+------+----------------------+------------------+-----------------+
    

    My questions:

    1. Is total_memory_read really the amount of memory these 8 calls consumed? 420 GB seems quite huge.
    2. If I multiply (1 - hit_percent) by total_memory_read, do I get the number of GB fetched from disk (and so a disk IO of ~231 GB)?
    3. Are there any other suggestions on how to track high IO hogs?


  • cache (shared_buffers or OS kernel) vs. the total data

    In community PostgreSQL, this is wrong. Data coming from the kernel cache counts as read, not hit. PostgreSQL only knows it requested the data, not what the kernel had to do to deliver it. So some (possibly large) percentage of the reads might really be hits for the kernel. For this reason it might be a good idea to look at the actual time spent waiting for data, by turning on track_io_timing and then looking at blk_read_time, rather than or in addition to looking at shared_blks_read.
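    A minimal sketch of what that could look like, assuming the PG 10-12 column names (total_time was renamed total_exec_time in PG 13). blk_read_time stays at zero unless track_io_timing is on, and on RDS/Aurora that parameter is changed through the DB parameter group rather than ALTER SYSTEM:

    -- Time actually spent waiting on block reads, per statement.
    SELECT regexp_replace(query, '[ \t\n]+', ' ', 'g') AS query_text,
           round(blk_read_time::numeric, 1)            AS ms_waiting_on_reads,
           round(total_time::numeric, 1)               AS total_ms,
           round((100 * blk_read_time / nullif(total_time, 0))::numeric, 1)
                                                       AS pct_of_time_reading
    FROM pg_stat_statements
    ORDER BY blk_read_time DESC
    LIMIT 10;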

    But rumor has it that one of the things Aurora does is implement direct IO, or something like it, in PostgreSQL. If that is really the case, then it wouldn't matter where kernel cache hits show up in the equation, because there would be zero of them.

    Is total_memory_read really the amount of memory these 8 calls consumed? 420 GB seems quite huge.

    That seems correct. And yes, that is large, but 420 GB over 8 calls is about 52 GB per call, and it is easy to write a query that examines a full table of data on every call, either because it needs it all, or because there is no useful index. Do you have a 50 GB table?
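    One quick way to check, assuming you can connect to the database in question, is to list the largest relations:

    -- Largest tables, including their indexes and TOAST data.
    SELECT c.oid::regclass                               AS table_name,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
    FROM pg_class c
    WHERE c.relkind IN ('r', 'm')   -- plain tables and materialized views
    ORDER BY pg_total_relation_size(c.oid) DESC
    LIMIT 10;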

    If I multiply (1 - hit_percent) by total_memory_read, do I get the number of GB fetched from disk (and so a disk IO of ~231 GB)?

    I guess. But it seems pretty silly to derive one number, then derive another number, then derive a third number from one of the other derived ones, and then multiply some of them to cancel out your derivations and get back to one of the original numbers. Why not just look at shared_blks_read directly?
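    For example, something like this reads it off directly (8192 bytes is the default block size):

    -- Bytes that had to be requested from below shared_buffers, per statement.
    SELECT rolname,
           calls,
           pg_size_pretty(shared_blks_read * 8192) AS read_below_shared_buffers,
           pg_size_pretty(shared_blks_hit * 8192)  AS hit_in_shared_buffers
    FROM pg_stat_statements
             JOIN pg_roles r ON r.oid = userid
    ORDER BY shared_blks_read DESC
    LIMIT 5;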

    Are there any other suggestions on how to track high IO hogs?

    If all you know is "I use too much IO", then look at the pg_stat_statements view ordered by shared_blks_read desc or blk_read_time desc. There is no point in even calculating the hit ratio, much less sorting by it. I wouldn't impose a limit of 5 either, that is pretty low. You might want to sum the column over the entire view so you know what fraction of total reads is covered by the top however many rows. You could do this with a window function, or you could run a sum once and mentally apply it after that.
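    The window-function version might look something like this:

    -- Each statement's share of all block reads, plus a running total.
    SELECT regexp_replace(query, '[ \t\n]+', ' ', 'g') AS query_text,
           shared_blks_read,
           round(100.0 * shared_blks_read
                 / nullif(sum(shared_blks_read) OVER (), 0), 2) AS pct_of_all_reads,
           round(100.0 * sum(shared_blks_read) OVER (ORDER BY shared_blks_read DESC)
                 / nullif(sum(shared_blks_read) OVER (), 0), 2) AS running_pct
    FROM pg_stat_statements
    ORDER BY shared_blks_read DESC;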

    I'd also remove calls > 1. If a giant query which has been run only once uses up most of your IO, you would want to know that, not hide from it. Also remove the role filters. Maybe there is nothing you can do about those roles' queries, but if they consume a huge amount of IO you should at least be informed.



