Looking for a visualization to understand how an allocation-order scan results in missing/duplicate rows?



  • Under the read uncommitted isolation level, the SQL Server engine may decide to use either an index-order scan or an allocation-order scan.

    An index-order scan can result in missing/duplicate rows for the reasons mentioned in this answer: https://dba.stackexchange.com/a/307757/162483 . The linked answer contains a diagram that visualizes this.

    An allocation-order scan can also result in missing/duplicate rows, and as mentioned in the answer above, this is because of page splits. I am looking for a visualization of this.



  • In read uncommitted isolation mode, SQL server engine may decide to use the index-order scan or index order allocation scan.

    There is no such thing as an "index order allocation scan". There are index-order scans and allocation-order scans.

    I am looking for a visualization of this.

    I'm not going to draw a picture, but imagine an index with pages like this:

    page   keys      next page
    500    000-099   300
    300    100-199   700
    700    200-299

    The page numbers are their physical position in the file. So page 500 is the 500th 8KB page in the physical file.

    Notice physical page IDs do not match index key order.

    For example, physical page 500 contains keys (000-099), which are lower than those on page 300 (100-199). This is perfectly normal and allowed.

    In this (very) simplified example, a single IAM page covers this allocation range. The IAM page tells us that pages 300, 500, and 700 belong to this index (and that's all it tells us).
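
    To make this concrete, here is a minimal sketch in Python (an illustration, not SQL Server internals; the pages dictionary, the iam set, and both scan functions are names made up for this example). It models the three pages above, their next-page pointers, and an IAM that records only page membership, and it shows that with no concurrent changes both scan strategies return the same 300 keys:

    pages = {
        500: {"keys": range(0, 100),   "next": 300},   # keys 000-099
        300: {"keys": range(100, 200), "next": 700},   # keys 100-199
        700: {"keys": range(200, 300), "next": None},  # keys 200-299, last page
    }
    iam = {300, 500, 700}  # the IAM only records which pages belong to the index

    def index_order_scan(pages, first_page):
        """Follow the next-page pointers, starting from the logically first page."""
        page = first_page
        while page is not None:
            yield from pages[page]["keys"]
            page = pages[page]["next"]

    def allocation_order_scan(pages, iam):
        """Visit pages in ascending physical page number, ignoring the pointers."""
        for page in sorted(iam):
            yield from pages[page]["keys"]

    # With no concurrent modifications, both strategies return all 300 keys;
    # only the order in which the keys come back differs.
    assert sorted(index_order_scan(pages, 500)) == list(range(300))
    assert sorted(allocation_order_scan(pages, iam)) == list(range(300))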

    Now an allocation-order scan of the index starts at page 300 (the lowest physical page number that belongs to this index according to the IAM).

    Meanwhile, another process causes page 500 to split:

    page   keys      next page
    500    000-049   100
    300    100-199   700
    700    200-299
    100    050-099   300

    Half the rows have moved from page 500 to (new) page 100. Since we are using an allocation-order scan, we do not follow the next-page pointers. Instead, we continue with the next higher-numbered page after page 300, which is 500, then 700.

    We never encounter the rows on page 100. These are 'missing' due to the page split and allocation-order scan.
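
    Here is the same toy model with the interleaving simulated (the split_page_500 helper and the exact timing of the split are assumptions for illustration): the scan reads page 300, the split then moves keys 050-099 to new page 100, and because the scan only ever moves to higher-numbered pages, those keys are never returned:

    pages = {
        500: {"keys": list(range(0, 100)),   "next": 300},
        300: {"keys": list(range(100, 200)), "next": 700},
        700: {"keys": list(range(200, 300)), "next": None},
    }
    iam = {300, 500, 700}

    def split_page_500():
        """Simulate the split: keys 050-099 move from page 500 to new page 100."""
        pages[100] = {"keys": list(range(50, 100)), "next": 300}
        pages[500] = {"keys": list(range(0, 50)),   "next": 100}
        iam.add(100)

    def next_allocated_page(iam, position):
        """Lowest page in the IAM above the scan's current physical position."""
        higher = [p for p in sorted(iam) if position is None or p > position]
        return higher[0] if higher else None

    rows_read = []
    position = None
    while True:
        page = next_allocated_page(iam, position)
        if page is None:
            break
        rows_read.extend(pages[page]["keys"])
        position = page
        if page == 300:          # another process splits page 500 at this point
            split_page_500()

    missing = sorted(set(range(300)) - set(rows_read))
    print(missing)               # keys 50..99: page 100 sits behind the scan position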


    If the split of page 500 had resulted in a new page numbered higher than 300, we would have encountered it during the allocation-order scan. No missing rows, and no duplicates.
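
    For completeness, the same toy scan with the split landing on a higher-numbered page (page 400 here, an arbitrary choice for illustration) reads every row exactly once:

    pages = {
        500: {"keys": list(range(0, 100)),   "next": 300},
        300: {"keys": list(range(100, 200)), "next": 700},
        700: {"keys": list(range(200, 300)), "next": None},
    }
    iam = {300, 500, 700}

    def split_page_500_to_400():
        """Keys 050-099 move from page 500 to new page 400, ahead of the scan."""
        pages[400] = {"keys": list(range(50, 100)), "next": 300}
        pages[500] = {"keys": list(range(0, 50)),   "next": 400}
        iam.add(400)

    def next_allocated_page(iam, position):
        higher = [p for p in sorted(iam) if position is None or p > position]
        return higher[0] if higher else None

    rows_read = []
    position = None
    while True:
        page = next_allocated_page(iam, position)
        if page is None:
            break
        rows_read.extend(pages[page]["keys"])
        position = page
        if page == 300:          # the split happens after page 300 has been read
            split_page_500_to_400()

    assert sorted(rows_read) == list(range(300))   # no missing rows, no duplicates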


    As a final example, imagine we have read page 300 and are now processing page 500. At this point, another process causes page 300 to split, creating new page 600. We will encounter the rows that moved to page 600 twice: once when we read page 300 before the split, and once more when we reach page 600 after the split.
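
    And here is a sketch of that duplicate case under the same assumptions (the split helper, its timing, and the choice of which keys move to page 600 are again made up for illustration): page 300 is read in full, then splits to new page 600 while the scan is processing page 500, so the moved keys are read a second time when the scan reaches page 600:

    pages = {
        500: {"keys": list(range(0, 100)),   "next": 300},
        300: {"keys": list(range(100, 200)), "next": 700},
        700: {"keys": list(range(200, 300)), "next": None},
    }
    iam = {300, 500, 700}

    def split_page_300():
        """Keys 150-199 move from page 300 to new physical page 600."""
        pages[600] = {"keys": list(range(150, 200)), "next": 700}
        pages[300] = {"keys": list(range(100, 150)), "next": 600}
        iam.add(600)

    def next_allocated_page(iam, position):
        higher = [p for p in sorted(iam) if position is None or p > position]
        return higher[0] if higher else None

    rows_read = []
    position = None
    while True:
        page = next_allocated_page(iam, position)
        if page is None:
            break
        rows_read.extend(pages[page]["keys"])
        position = page
        if page == 500:          # page 300 splits while we are processing page 500
            split_page_300()

    duplicates = sorted(k for k in set(rows_read) if rows_read.count(k) > 1)
    print(duplicates)            # keys 150..199 appear twice: from page 300, then from page 600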

