Optimization : JOIN or CTE



  • Suppose that I have these tables,and example data (the real table I am working on has more columns)

    table `users`

    id name zip
    1 John 201

    table `region`

    zip city state country
    22 a b c

    table `purchase`

    id user_id price
    2 3 2

    Should I use subquery,cte or join when those tables have billions of records (so efficiency matters) and

    1. query which city,state,country does the user id=1 lives (Yield 1 row)
    2. query the name, city, price of a purchase order by price desc limit 100 (Get details on top 100 purchases).
    3. query the number of the purchases, sum of price of users whose name begins with "Dr"
    4. query the average spending(sum of price) of users in each city in country "c"


  • I would say always aim for join.

    If this is not possible consider subquery or CTE.

    Here is why

    Query order processing is: FROM, WHERE, GROUP BY, SELECT... Sooner you narrow result is better since next processed clause will have less to do.

    Using join like this: SELECT ... FROM users u JOIN purchase p ON u.id=p.id ... will narrow result in the first processed clause - FROM clause. It will return only records satisfying join condition. In this example only users with at least one purchase.

    In WHERE clause you narrow further. E.g. WHERE users.name like 'Dr%' Or WHERE region.country='c'

    Next there are GROUP BY, ORDER BY and FETCH. E.g. ... ORDER BY purchase.price FETCH NEXT 100 ROWS ONLY

    Optimizer is sophisticated piece of software. Sometimes it can rewrite a query from CTE or subquery into join. I do not know if MySQL's is smart enough though.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2