Is there some good way to "diff" a table's data with a fresh copy of it, automatically?
emmalee last edited by
I have a table representing all countries on this planet, with their country codes and full titles/common names, etc.
I have not run my
update_country_tablefunction since 2020, so it's due for another run now.
If I run it as-is, it will fetch the new data, truncate the current table (and reset identity), then
INSERTall the new rows. In other words, I will have no idea how many new countries were added, or removed, or possibly had one of their columns edited. Well, I can count the rows before and after, but that will only tell me if it has changed in numbers -- not what changed. (Maybe two countries were removed and two others added, or many of the existing ones had typos fixed for their common names.)
I could of course make this function much more advanced and have it painstakingly check each existing record instead of truncating it and then inserting all the new data, but this is anything but simple. It would have to check if any of the columns have changed, etc. It would just be an annoying mess, especially since this is a common thing that I want to do in many other places as well.
Is there some way to do something like this? (This is made-up fantasy code from my dreams.)
MAKE TEMPORARY COPY OF countries AS countries2; TRUNCATE TABLE countries; INSERT ... ... SELECT (COMPARE TABLES countries, countries2);
I doubt this exists, but it sure would be useful for the many times when I have data of this kind which I regularly pull down. I don't like being "blind" to what it actually changes.
For my source code files, I use the neat WinMerge program to see exactly what has changed since last time. I sure wish this could be easily done for my database tables as well.
MAKE TEMPORARY COPY OF countries AS countries2;
You can use a CTAS (create table as select) operation. If you want PostgreSQL to drop it after disconnect, create a
create table countries2 as select * from countries;
I assume you are not asking about the following part.
TRUNCATE TABLE countries; INSERT ... ...
For the comparison of 2 tables, you can use the
-- Rows in countries but not in countries2 select * from countries except select * from countries2;
-- Rows in countries2 but not in countries
select * from countries2
select * from countries;
If your table has unique key (say
name), you can use a SQL like the following.
select c.*, c2.* from countries c full outer join countries2 c2 on c.name=c2.name where (c.full_name is distinct from c2.full_name or c.population is distinct from c2.population ... );