Storing column for data that doesn't exist for majority of rows



  • I'm using mariadb.

    I have a single table that keeps track of video_count and image_count for a few different sites. I.e.

     type        | enum('SITE1','SITE2','SITE3')                                
     video_count | int(11)                                                
     image_count | int(11) 
    

    video_count will exist for ALL sites but image_count only exists for SITE3. Also, SITE3 will be a minority of rows.

    So does it make sense to arrange the data this way on one table? I'd like to query ALL rows...but is it a problem if it's redundant and expensive knowing that 90% of my rows will have 0 for image_count?

    It'd be perfect if I could do something like SELECT * from TABLE BUT DON't GET video_count WHERE type != 'SITE3'"

    Is that even a possible query in SQL? Or does it now make sense to create separate tables for 'SITE1...SITEN'? Ty!



  • I recommend storing NULL in the image_count column on rows where it is inapplicable because the type doesn't need it. NULL is different from an integer 0.

    InnoDB doesn't store a value in the row for a NULL. So it's pretty low overhead to leave the column NULL on the rows that don't need it. There's a bit stored in the row header for each nullable column, but that will be needed whether you store a value or not.

    10k rows is small for an InnoDB table. I wouldn't worry about it unless you need to run this database on a Raspberry Pi or similarly compact platform.

    SQL does not have any syntax for SELECT * except-the-columns-I-don't-want. If you use SELECT *, that means all the columns, including those that are NULL. If you want a subset of columns, you have to name all of the ones you want.




Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2