Why does PostgreSQL allow certain type mismatches in Foreign Keys?



  • While teaching a database basics course, a student asked about Foreign Keys whose data type does not match the data type of the thing (e.g. Primary Key) they are referencing.

    For example, all numbers that can be stored in an INTEGER column can be expressed as TEXT, and so a TEXT column's data may be used to reference data in an INTEGER column, as long as the appropriate typecasts/conversions are applied.

    We use PostgreSQL in teaching (because of its excellent documentation, among other things), so we went and had a look. Lo and behold, the "simplified" chapter about https://www.postgresql.org/docs/13/ddl-constraints.html#DDL-CONSTRAINTS-FK told us:

    Of course, the number and type of the constrained columns need to match the number and type of the referenced columns.

    Further research in the "feature complete" section about https://www.postgresql.org/docs/13/sql-createtable.html did not explicitly mention data types, though. This part only talks about values.

    We tried various combinations of data types, some more convincing (like the INTEGER-TEXT variant from above) than others. The DBMS was not convinced and replied with 42804: incompatible types.

    So far, so good. Imagine our utter astonishment when we found out that PostgreSQLs various Integer-types in fact do work.

    They even take the sign correctly into account, which means they are not just matching up bits.

    Of course there is a direction this should work in: Having an INTEGER column that is referenced by a BIGINT column always works, since everything that fits into the referenced column also fits into the referencing column.

    Surprisingly, PostgreSQL allows the other direction (with INTEGER and SMALLINT in this example):

    CREATE TABLE this_should_not_work
    (
        this_should_not_work_id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        data                    TEXT
    );
    

    CREATE TABLE this_should_not_work_detail
    (
    detail_id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    detail_data TEXT,

    -- This Foreign Key references a column of a different type, which should not be possible
    fk_this_should_not_work_id SMALLINT REFERENCES this_should_not_work (this_should_not_work_id)
    );

    Here is an executable version of the above problem: https://www.db-fiddle.com/f/xsh37EkJZpGc6oSGaf9BL2/0

    The db-fiddle also has sequence options and INSERT-Statements that trigger a failure on the second insertion.

    Note that the Documentation and the db-fiddle linked above is for PostgreSQL 13, but the problem(?) can be reproduced on PostgreSQL 14 as well.

    I am aware that foreign key type mismatch is a database design problem. The question is, why does PostgreSQL point out the obvious cases (INTEGER-TEXT), but not the more subtle ones (INTEGER-SMALLINT)?

    PS: A lot of havok can be caused when this is abused in combination with ON UPDATE CASCADE, since the update on the referenced table fails because of the data not fitting into the referencing table - this makes for a rather "creative" error message.



  • Use the source, Luke!

    In ATAddForeignKeyConstraint in src/backend/commands/tablecmds.c, we find the truth about the requirements:

            /*
             * There had better be a primary equality operator for the index.
             * We'll use it for PK = PK comparisons.
             */
            ppeqop = get_opfamily_member(opfamily, opcintype, opcintype,
                                         eqstrategy);
    
        if (!OidIsValid(ppeqop))
            elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
                 eqstrategy, opcintype, opcintype, opfamily);
    

    So, the unique index on the target type has to support equality comparisons.

            /*
             * Are there equality operators that take exactly the FK type? Assume
             * we should look through any domain here.
             */
            fktyped = getBaseType(fktype);
    
        pfeqop = get_opfamily_member(opfamily, opcintype, fktyped,
                                     eqstrategy);
        if (OidIsValid(pfeqop))
        {
            pfeqop_right = fktyped;
            ffeqop = get_opfamily_member(opfamily, fktyped, fktyped,
                                         eqstrategy);
        }
        else
        {
            /* keep compiler quiet */
            pfeqop_right = InvalidOid;
            ffeqop = InvalidOid;
        }
    

    If there is an equality operator between the data types of the referencing column and the referenced column that is supported by the target index, we are good.

            if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop)))
            {
                /*
                 * Otherwise, look for an implicit cast from the FK type to the
                 * opcintype, and if found, use the primary equality operator.
                 * This is a bit tricky because opcintype might be a polymorphic
                 * type such as ANYARRAY or ANYENUM; so what we have to test is
                 * whether the two actual column types can be concurrently cast to
                 * that type.  (Otherwise, we'd fail to reject combinations such
                 * as int[] and point[].)
                 */
                Oid         input_typeids[2];
                Oid         target_typeids[2];
    
            input_typeids[0] = pktype;
            input_typeids[1] = fktype;
            target_typeids[0] = opcintype;
            target_typeids[1] = opcintype;
            if (can_coerce_type(2, input_typeids, target_typeids,
                                COERCION_IMPLICIT))
            {
                pfeqop = ffeqop = ppeqop;
                pfeqop_right = opcintype;
            }
        }
    

    Otherwise, there must be an implicit cast from the type of the referencing column to the referenced column.

            if (!(OidIsValid(pfeqop) && OidIsValid(ffeqop)))
                ereport(ERROR,
                        (errcode(ERRCODE_DATATYPE_MISMATCH),
                         errmsg("foreign key constraint \"%s\" cannot be implemented",
                                fkconstraint->conname),
                         errdetail("Key columns \"%s\" and \"%s\" "
                                   "are of incompatible types: %s and %s.",
                                   strVal(list_nth(fkconstraint->fk_attrs, i)),
                                   strVal(list_nth(fkconstraint->pk_attrs, i)),
                                   format_type_be(fktype),
                                   format_type_be(pktype))));
    

    If neither is true error out.

    So you can have foreign keys from integer to smallint because there exists an equality operator between these types that belongs to the index's operator family:

    \do =
                                                  List of operators
       Schema   │ Name │        Left arg type        │       Right arg type        │ Result type │  Description  
    ════════════╪══════╪═════════════════════════════╪═════════════════════════════╪═════════════╪═══════════════
     ...
     pg_catalog │ =    │ integer                     │ smallint                    │ boolean     │ equal
     ...
    (63 rows)
    

    However, there is no implicit cast between text and integer, so you cannot have foreign key references between these types.

    \dC
                                             List of casts
             Source type         │         Target type         │      Function      │   Implicit?   
    ═════════════════════════════╪═════════════════════════════╪════════════════════╪═══════════════
     ...
     integer                     │ bigint                      │ int8               │ yes
     integer                     │ bit                         │ bit                │ no
     integer                     │ boolean                     │ bool               │ no
     integer                     │ "char"                      │ char               │ no
     integer                     │ double precision            │ float8             │ yes
     integer                     │ money                       │ money              │ in assignment
     integer                     │ numeric                     │ numeric            │ yes
     integer                     │ oid                         │ (binary coercible) │ yes
     integer                     │ real                        │ float4             │ yes
     integer                     │ regclass                    │ (binary coercible) │ yes
     integer                     │ regcollation                │ (binary coercible) │ yes
     integer                     │ regconfig                   │ (binary coercible) │ yes
     integer                     │ regdictionary               │ (binary coercible) │ yes
     integer                     │ regnamespace                │ (binary coercible) │ yes
     integer                     │ regoper                     │ (binary coercible) │ yes
     integer                     │ regoperator                 │ (binary coercible) │ yes
     integer                     │ regproc                     │ (binary coercible) │ yes
     integer                     │ regprocedure                │ (binary coercible) │ yes
     integer                     │ regrole                     │ (binary coercible) │ yes
     integer                     │ regtype                     │ (binary coercible) │ yes
     integer                     │ smallint                    │ int2               │ in assignment
     ...
    



Suggested Topics

  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2