Slow postgres query



  • I have a query on a table with almost 40 million rows, and it's getting slower and slower. Even a simple

    SELECT COUNT(id) as count_air FROM "airplay_airplay";
    takes almost 12 s to output 37428412.
    

    Any ideas where to look for improvements? Thanks in advance.

    Finalize Aggregate  (cost=621262.92..621262.93 rows=1 width=8) (actual time=12207.175..12207.359 rows=1 loops=1)
       Buffers: shared hit=480 read=424843
       ->  Gather  (cost=621262.71..621262.92 rows=2 width=8) (actual time=12203.426..12207.339 rows=3 loops=1)
             Workers Planned: 2
             Workers Launched: 2
             Buffers: shared hit=480 read=424843
             ->  Partial Aggregate  (cost=620262.71..620262.72 rows=1 width=8) (actual time=12172.089..12172.090 rows=1 loops=3)
                   Buffers: shared hit=480 read=424843
                   ->  Parallel Seq Scan on airplay_airplay  (cost=0.00..581274.77 rows=15595177 width=8) (actual time=0.487..11179.857 rows=12476137 loops=3)
                         Buffers: shared hit=480 read=424843
     Planning Time: 0.082 ms
     JIT:
       Functions: 11
       Options: Inlining true, Optimization true, Expressions true, Deforming true
       Timing: Generation 1.109 ms, Inlining 164.849 ms, Optimization 73.269 ms, Emission 39.649 ms, Total 278.877 ms
     Execution Time: 12207.961 ms
    (16 rows)
    

    The table has 8 columns, mostly integer/bigint fields plus a few timestamp fields.

       Column   |           Type           | Collation | Nullable |                   Default
    ------------+--------------------------+-----------+----------+---------------------------------------------
     id         | bigint                   |           | not null | nextval('airplay_airplay_id_seq'::regclass)
     created_at | timestamp with time zone |           | not null |
     updated_at | timestamp with time zone |           | not null |
     airedAt    | timestamp with time zone |           | not null |
     duration   | integer                  |           | not null |
     radio_id   | bigint                   |           | not null |
     song_id    | bigint                   |           | not null |
     playcount  | integer                  |           | not null |
    Indexes:
        "airplay_airplay_pkey" PRIMARY KEY, btree (id)
        "airplay_airplay_radio_id_f0670674" btree (radio_id)
        "airplay_airplay_song_id_d1d128db" btree (song_id)
    Foreign-key constraints:
        "airplay_airplay_radio_id_f0670674_fk_airplay_radio_id" FOREIGN KEY (radio_id) REFERENCES airplay_radio(id) DEFERRABLE INITIALLY DEFERRED
        "airplay_airplay_song_id_d1d128db_fk_core_song_id" FOREIGN KEY (song_id) REFERENCES core_song(id) DEFERRABLE INITIALLY DEFERRED
    


  • The query execution time is just the time it takes to sequentially read the table, so there is nothing substantial you can speed up here; see https://www.cybertec-postgresql.com/en/postgresql-count-made-fast/ for the options.
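
    For scale: assuming the default 8 kB block size, the shared hit=480 read=424843 buffers in the plan amount to roughly 425,000 × 8 kB ≈ 3.2 GiB, so the 12 s runtime works out to about 270 MiB/s of read throughput, which is plausible for a sequential scan served mostly from disk.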

    Some ideas:

    • Use count(*) rather than count(id). Since id is defined NOT NULL, the result is the same, but count(*) is slightly faster:
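
      SELECT count(*) AS count_air FROM "airplay_airplay";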

    • Perhaps refreshing the visibility map can result in a faster index-only scan:

      VACUUM airplay_airplay;
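
      Once most pages are marked all-visible, count(*) no longer has to touch the heap and can use an index-only scan. One way to check, using the same count query as above:

      EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM "airplay_airplay";

      With a fresh visibility map, the plan should show an Index Only Scan on one of the existing indexes (e.g. airplay_airplay_pkey) with few "Heap Fetches".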
      
    • The table seems somewhat bloated. If you can take the downtime to run

      VACUUM (FULL) airplay_airplay;
      VACUUM airplay_airplay;
      

      that might speed it up a bit.
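
      To see how much space this actually reclaims, you can compare the table size before and after with the standard size functions:

      SELECT pg_size_pretty(pg_relation_size('airplay_airplay'));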

    • If an approximate result is good enough, run

      SELECT reltuples FROM pg_class WHERE relname = 'airplay_airplay';
      

      That will be lightning fast.
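
      reltuples is only an estimate that gets refreshed by VACUUM, ANALYZE and autovacuum, so its accuracy depends on how recently those ran. A slightly more robust variant casts it to a whole number and matches by regclass rather than by name (this resolves via the current search_path):

      SELECT reltuples::bigint AS approx_rows
      FROM pg_class
      WHERE oid = 'airplay_airplay'::regclass;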

