Why does an index on the join predicate does not improve the performance of a hash join?



  • Various references, such as https://www.cybertec-postgresql.com/en/join-strategies-and-performance-in-postgresql/ , say that an index on the join predicate does not improve the performance of a hash join.

    However, couldn't a B+ tree index reduce the I/O cost of the table which is probed against the pre-calculated hash table by just loading the leaves of the index into main memory and do a sequential scan on them (instead of the entire table) given that the index contains all attributes of the indexed table which are specified in the SELECT clause?



  • I think what it means is that the index itself is never directly used as the mechanism for identifying the rows to which the other data set joins.

    They might be used as a covering index, or for limiting the rows to be joined in other ways, though. So indexing might still be beneficial in some ways.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2