How to use Agile methodology to develop a data mart



  • I have a question about whether (and how) Agile should be used to build a data mart.

    I am building a data warehouse using Kimball theory (i.e., the data warehouse is several data marts connected via shared/conformed dimensions). I am building out the warehouse incrementally (i.e., one or a few data marts at a time, according to business priority). That in itself doesn't mean the warehouse is being built following agile: agile is iterative, and incremental is not the same as iterative. As I understand it, in Kimball theory:

    • the grain represents the "level of detail" of the business process being modeled (i.e., it is a representation of the business process)
    • the likelihood of the grain needing to change is probably very low (at least during the initial release of the data mart)

    So, as I see it, it is super important to get the grain "correct" (complete and accurate) in the initial release of a data mart. Otherwise, there will be a lot of re-work.
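
    To illustrate the rework concern, here is a minimal sketch in Python. The rows, column names, and the roll_up helper are all hypothetical and only assume a fact table stored at order-line grain. It shows that facts stored at a fine grain can be rolled up to any coarser grain on demand:

```python
from collections import defaultdict

# Hypothetical fact rows at order-line grain: one row per product per order.
order_line_facts = [
    {"order_id": 1, "date": "2024-01-01", "product": "A", "amount": 10.0},
    {"order_id": 1, "date": "2024-01-01", "product": "B", "amount": 5.0},
    {"order_id": 2, "date": "2024-01-01", "product": "A", "amount": 20.0},
    {"order_id": 3, "date": "2024-01-02", "product": "B", "amount": 7.5},
]


def roll_up(rows, keys):
    """Aggregate fact rows to a coarser grain defined by `keys`."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["amount"]
    return dict(totals)


# Coarser grains can always be derived from the order-line grain.
daily = roll_up(order_line_facts, ["date"])
by_product = roll_up(order_line_facts, ["date", "product"])
```

    The reverse is not possible: if the mart had been loaded at the daily grain, the per-product totals could not be recovered from it, and the fact table would have to be redesigned and reloaded from the source.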

    Also, as I understand it, the primary reason for following agile is change. So if change isn't expected, there isn't much reason to follow agile.

    But, I know that many data warehouse practitioners recommend agile. I don't understand this. Should agile (i.e., iterative development) be used to create each data mart (at least in the data mart's initial release)? And, if so, what does following agile look like for a data mart?



  • Also, as I understand it, the primary reason for following agile is change. If change isn't expected, then there isn't much reason to follow agile.

    This is a perfectly reasonable statement. However, consider that there are many forms of change:

    • Change as a result of feedback - "Now that we see it, we realise we need something a bit different"
    • Change as a result of technical challenges - "The performance is nowhere near as good as we expected, maybe we need to reconsider the architecture"
    • Change as a result of shifting strategy - "We could really do with this new report so that we can stay competitive"
    • People change - "The lead developer has gone off sick for 2 months"

    So, as I see it, it is super important to get the grain "correct" (complete and accurate) in the initial release of a data mart. Otherwise, there will be a lot of re-work.

    When we talk about agility, we mean the trade-off between getting things "correct" the first time (minimising rework) and being able to adapt to change as it happens. There is no right or wrong answer here; it is about finding a balance between up-front planning and agility that works well in your particular situation.

