Table with smaller data types seems to take more space on disk?
-
I have these two identical tables:
Table "public.region" Column | Type | Collation | Nullable | Default -------------+----------+-----------+----------+--------- r_regionkey | integer | | not null | r_name | char(25) | | | r_comment | char(152) | | | Indexes: "region_pkey" PRIMARY KEY, btree (r_regionkey)
and
Table "public.region2" Column | Type | Collation | Nullable | Default -------------+----------+-----------+----------+--------- r_regionkey | smallint | | not null | r_name | text | | | r_comment | text | | | Indexes: "region_pkey" PRIMARY KEY, btree (r_regionkey)
I am using
smallint
andtext
in order to save space, but weirdly this is the result:select pg_size_pretty(pg_table_size('region'))
returns
8192 bytes
whileselect pg_size_pretty(pg_table_size('region2'))
returns
48 kB
.Why is
region2
taking more space, even though I am usingsmallint
instead ofinteger
andtext
instead ofchar(n)
?
-
After running
VACUUM FULL public.region;
andVACUUM FULL public.region2;
, test again with:SELECT pg_relation_size('public.region');
Three possible issues:
The obvious reason: table bloat from updates or deletes. Removed by
VACUUM FULL
.Schema-qualified names ('public.region' instead of just 'region') makes sure you don't measure the wrong table in a different schema by accident. Probably not the case.
pg_table_size()
includes auxiliary relation forks (files), which may be filled for one table, but not for the other. For your purpose, the more accurate test is withpg_relation_size()
.
https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE
pg_table_size
(regclass
) →bigint
Computes the disk space used by the specified table, excluding indexes (but including its TOAST table if any, free space map, and visibility map).
And:
pg_relation_size
(relation
regclass
[,fork
text
] ) →bigint
Computes the disk space used by one “fork” of the specified relation. (Note that for most purposes it is more convenient to use the higher-level functions
pg_total_relation_size
orpg_table_size
, which sum the sizes of all forks.) With one argument, this returns the size of the main data fork of the relation. The second argument can be provided to specify which fork to examine:main
returns the size of the main data fork of the relation.fsm
returns the size of the Free Space Map (see https://www.postgresql.org/docs/current/storage-vm.html ) associated with the relation.vm
returns the size of the Visibility Map (see https://www.postgresql.org/docs/current/storage-vm.html ) associated with the relation.init
returns the size of the initialization fork, if any, associated with the relation.
Since your example has no https://www.postgresql.org/docs/current/storage-toast.html , only https://www.postgresql.org/docs/current/storage-fsm.html and https://www.postgresql.org/docs/current/storage-vm.html make a difference. Those are typically negligible in size for bigger tables, but relevant for your minimal test. Both may go down to "0 bytes" after
VACUUM FULL
.Test with more rows (thousands).
Related:
- https://dba.stackexchange.com/questions/23879/measure-the-size-of-a-postgresql-table-row/23933#23933