Question

I was reading this post, listagg — Rows to Delimited Strings, which discusses the ON OVERFLOW clause of SQL's LISTAGG(). It says:

The return type of listagg is either varchar or clob with an implementation defined length limit.

I've never seen the term clob before. PostgreSQL doesn't have a type or an alias by that name:

CREATE TABLE foo ( a clob );
ERROR:  type "clob" does not exist

Looking it up, it apparently stands for "character large object", a term that comes from the spec? What exactly is a "Character Large Object" and how does it relate to the PostgreSQL text type?


Solution

Colloquially

The clob referred to here is NOT the one defined in the spec, which is confusing. The term clob here comes from Oracle/DB2/Informix parlance.

So yes, in this case the PostgreSQL equivalent is text, which is confirmed by this post on the PostgreSQL mailing list.
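
As a minimal sketch of that equivalence, the failing statement from the question works once clob is replaced with text. The domain at the end is only an assumption about how one might keep Oracle-style DDL reusable; it is not something PostgreSQL ships with, and the table name bar is made up for illustration.

```sql
-- The column from the question, declared with PostgreSQL's text type.
CREATE TABLE foo ( a text );

-- Hypothetical convenience, not built in: a domain named clob over text,
-- so that DDL written with clob columns can be run unchanged.
CREATE DOMAIN clob AS text;
CREATE TABLE bar ( a clob );
```

A domain like this behaves as text everywhere; it only provides the name.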

"Character Large Object" - SQL Spec

In the SQL spec, however, the term means something different, which adds to the confusion. SQL:2011 Part 1: Framework (SQL/Framework) distinguishes two kinds of character string:

A "character string type" is either of fixed length, or of variable length up to some implementation-defined maximum.

  • A value of character large object type is a string of characters from some character repertoire and is always associated with exactly one character set.
  • A large object character string is of variable length, up to some implementation-defined maximum that is probably greater than that of other character strings.

So both maximum lengths are undefined by the spec and left to the implementation. The difference is that a "large object character string" is always "of variable length", whereas other character strings may be of fixed or variable length.

It goes on to say in Part 2,

A character large object type is a character string type where the name of the specific character string type is CHARACTER LARGE OBJECT. A value of a character large object type is a large object character string.

And has one final clarifying mention

The data types CHARACTER, CHARACTER VARYING, and CHARACTER LARGE OBJECT are collectively referred to as character string types and the values of character string types are known as character strings.

This seems a bit confusing, but the spec defines what a large object character string can do in 4.2.3.4 Operations involving large object character strings. They can be used in,

  • <null predicate>.
  • <like predicate>.
  • <similar predicate>.
  • <position expression>.
  • <comparison predicate> with an <equals operator> or <not equals operator>.
  • <quantified comparison predicate> with the <equals operator> or <not equals operator>.

But they cannot be used in,

  • predicates other than those listed above and the <exists predicate>
  • <general set function>.
  • <group by clause>.
  • <order by clause>.
  • <unique constraint definition>.
  • <referential constraint definition>.
  • <select list> of a <query specification> that has a <set quantifier> of DISTINCT.
  • UNION, INTERSECT, and EXCEPT.
  • columns used for matching when forming a <joined table>.
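
For contrast, PostgreSQL's text type is not subject to these restrictions. The following is a small sketch using a hypothetical table docs that is not part of the original post; each statement uses text in one of the positions the spec forbids for large object character strings.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE docs ( body text );
INSERT INTO docs VALUES ('b'), ('a'), ('a');

-- <select list> with a <set quantifier> of DISTINCT
SELECT DISTINCT body FROM docs;

-- <group by clause> and <order by clause>
SELECT body, count(*) FROM docs GROUP BY body ORDER BY body;

-- <general set function>
SELECT max(body) FROM docs;

-- UNION
SELECT body FROM docs UNION SELECT 'c';

-- column used for matching when forming a <joined table>
SELECT d1.body FROM docs d1 JOIN docs d2 USING (body);
```

All of these run on a text column, which fits the answer's point that text only corresponds to clob in the colloquial sense.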

In summary, perhaps PostgreSQL doesn't use clob as a term anywhere because the term is connected with implementation and capabilities, and is part of the LO infrastructure, which involves SQL Locators. In PostgreSQL the text type is an ordinary inline value type (large values are moved out of line transparently by TOAST, with no locator exposed to the user). That distinction likely doesn't matter much today, as I don't see any of the RDBMSs, except Oracle, documenting clob as being out-of-line.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange