What is the difference between a data dictionary and the data dump of SEDE?

https://dba.stackexchange.com/questions/260979

25-02-2021

Question

A meta post (https://meta.stackexchange.com/q/2677/714660) gives the "anatomy" of the data dump from SEDE, which includes the structure of the public data dump (tables, columns, data types, etc.) and an Entity Relationship Diagram.

The material above looks similar to a data dictionary.

So, what is the difference between a data dictionary and the data dump of SEDE?

Solution

Well, the data dump is the data itself, i.e. the questions, answers, comments, user information, etc. of the Stack Exchange sites. Note that on Stack Exchange, "data dump" usually refers to something other than SEDE: a quarterly dump of the data published in XML format. It contains more or less the same information as SEDE (which is why the schema Q&A is shared between them), but in a different format, and it isn't updated as often as SEDE is (SEDE is updated once a week, on Sunday morning).
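To illustrate what "the data itself, in XML format" means in practice, here is a minimal Python sketch that reads rows from a small inline XML sample. The sample content, the `<posts>`/`<row>` layout, the attribute names, and the helper `load_rows` are assumptions made for illustration; check the actual dump files before relying on any of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample imitating one file of an XML dump. The element and
# attribute names are assumptions for illustration, not a guaranteed layout.
SAMPLE_XML = """<posts>
  <row Id="1" PostTypeId="1" Score="5" Title="First question" />
  <row Id="2" PostTypeId="2" Score="3" ParentId="1" />
</posts>"""


def load_rows(xml_text):
    """Return the attributes of every <row> element as a plain dict."""
    root = ET.fromstring(xml_text)
    return [dict(row.attrib) for row in root.iter("row")]


rows = load_rows(SAMPLE_XML)
for r in rows:
    print(r)
```

Each dict here is one record of data (a question or an answer). Nothing in the output says what the columns mean or which types they have; that descriptive layer is what a data dictionary would provide.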

A data dictionary, by contrast, describes the data rather than containing it: which tables and columns exist, their data types, and what they mean. There is no formal data dictionary for the entirety of SEDE; most of the schema documentation can only be found in the meta post "Database schema documentation for the public data dump and SEDE" and isn't available as a database/repository/API itself. The schema is rather small, and schemas of this size don't benefit as much from a data dictionary as large-scale applications like ERP software do.
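The difference between the data and a description of the data can be sketched with a toy database. The example below uses an in-memory SQLite database as a stand-in; it does not query SEDE, and the `Posts` table with its three columns is a simplified, hypothetical schema.

```python
import sqlite3

# In-memory stand-in database; the table layout is a hypothetical,
# simplified schema, not the real SEDE schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Posts ("
    "Id INTEGER PRIMARY KEY, Title TEXT, Score INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO Posts (Id, Title, Score) VALUES (?, ?, ?)",
    [(1, "First question", 5), (2, "Second question", 3)],
)

# The "dump" side: the rows themselves.
data_rows = conn.execute(
    "SELECT Id, Title, Score FROM Posts ORDER BY Id"
).fetchall()

# The "dictionary" side: metadata about the table, taken from SQLite's
# catalog. PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
columns = [
    (name, col_type, bool(notnull))
    for _cid, name, col_type, notnull, _default, _pk in conn.execute(
        "PRAGMA table_info(Posts)"
    )
]

print("data:", data_rows)
print("columns (name, type, not null):", columns)
```

The first result set is what a dump contains; the second is the kind of information a data dictionary records, usually extended with human-written descriptions of each column.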

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange