What is the best way to insert data that is not yet in the table?
I have a data set that has grown bigger. Before I could afford to re-create the table anytime, but the writing operation is now taking too long.
I cannot do something like timestamp-based delta inserts because it could be that historical data is added to my dataset:
SELECT max(date) FROM my_table
The data has a very long format, few columns and a lot of rows.
I am not looking for a db specific solution but more general solutions and the concepts so I can learn about it and implement myself.
I could also find a common identifier and check the missing... Or try and insert every record and if I see it's duplicated drop the particular insert on that record (that would at least save the writing time but it would take time looking into if the record exists?)
So, how can I make sure I only insert missing data from my dataset into my table?
Look into the MERGE statement. Unfortunately each database has it's own syntax, especially Oracle and MSSQL differ very much. but it will do what you need.