Index Optimization for tables that are truncated and charged everyday



  • I'm a recently recruited Data Engineer, I have made a bunch of ETL pipelines planned to run daily, the destination tables are truncated and charged again.

    When I arrived I found other ETLs that also run in a daily manner, but when I checked the fragmentation percentage on the indexes the rates were too high, so I'm thinking of creating an optimization task for all the indices.

    Most of the clustered indices (primary keys) are doing just fine, but the non-clustered indices are suffering from high fragmentation, How and when should I rebuild/reorganize indices? After or before loading the new data?



  • I would do this,

    1. Truncate table before re-load.
    2. Script out existing index definitions and drop the existing.
    3. Run the re-load which should be done much faster than before as there were no pre-existing index's on the table.
    4. Then, re-create the indexes using definitions extracted from step (2).

    Pros:

    • Fast load times
    • Controls the fragmentation that you are concerned about.

    Cons:

    • Every time you have to re-create your indexes.

    Note: If the time of completion is a factor then you probably would need to try both approaches(Re-create vs Re-build) and see what works best for you.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2