Allocation scans in heaps and indexes


  • QA Engineer

    I'm trying to comprehend the difference between an allocation scan in a heap and in a B-tree, performance-wise.

    Let's say, that we have an Orders table, with a clustered index defined on the OrderDate column.

    If we query it like so: SELECT orderid, custid, empid, shipperid, orderdate, filler FROM dbo.Orders;

    The execution plan will consist of a Clustered index scan with "Ordered" property valued "False", which (if I understand correctly) leaves the storage engine with two options on how to perform it:

    1. With an index order scan, starting with the first leaf and proceeding in the order of the index
    2. With an allocation scan, starting with the first IAM and proceeding in the file order

    The question is: Will there be a difference between the second option and the situation when the Orders table is a heap (hence the storage engine can only use an allocation scan)? And if not, is there any advantage in creating indexes on tables, that would mostly be queried without ordering?

    Or I just got the whole thing wrong and it doesn't work that way.



  • ...is there any advantage in creating indexes on tables, that would mostly be queried without ordering?

    There is if you ever plan to filter or join by predicates that utilize the fields the index covers. For example, if you wanted to only look at Orders in the last year for a table that has 10 years of data, and you indexed on OrderDate, then the optimizer would potentially be able to seek on the index to read and return the data that's only in the relevant branches of the B-Tree to serve your query, as opposed to reading the entire table, which a Heap would've resulted in.

    But if your intentions are always to read the entirety of the data in the table always, and never UPDATE or DELETE from it based on any predicates either, nor JOIN to it, then there'll be negligible difference in whether you create a clustered index on it or not. Though it's not usual someone has this use case for a table other than a staging table.

    While it's true that the index orders the data in the B-Tree data structure by the fields specified in the index definition, that's not the only purpose and benefit of it. Because of the structure of a B-Tree (a nonlinear tree with branches as opposed to a linear structure like a Heap) you get O(log(n)) search time as opposed to n search time. So regardless of if your query uses an ORDER BY clause, if it needs to search on a predicate for specific values (JOIN, WHERE, HAVING clauses), then most likely a B-Tree will be the faster data structure for the optimizer to search.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2