How do you document your database structure? [closed]

https://stackoverflow.com/questions/186392

06-07-2019
|

Question

Many database systems don't allow comments or descriptions of tables and fields, so how do you go about documenting the purpose of a table/field apart from the obvious of having good naming conventions?

(Let's assume for now that "excellent" table and field names are not enough to document the full meaning of every table, field and relationship in the database.)

I know many people use UML diagrams to visualize the database, but I have rarely—if ever—seen a UML diagram including field comments. However, I have good experience with using comments inside .sql files. The downside to this approach is that it requires the .sql files to be manually kept up-to-date as the database structure changes over time—but if you do, you can also have it under version control.

Some other techniques I have seen are separate document describing database structure and relationships and manually maintained comments inside ORM code or other database-mapping code.

How have you solved this problem in the past? What methods exists and what are the various pros and cons associated with them? How you would you like this solved in "a perfect world"?

Update

As others have pointed out, most of the popular SQL engines do in fact allow comments, which is great. Oddly enough, people don't seem to be using these features much. At least not on the projects I have been involved with in the past.

Solution

MySQL allows comments on tables and rows. PostgreSQL does as well. From other answers, Oracle and MSSQL have comments too.

For me, a combination of UML diagram for a quick refresher on field names, types, and constraints, and an external document (TeX, but could be any format) with extended description of everything database-related - special values, field comments, access notes, whatever - works best.

OTHER TIPS

Late one but hopefully useful… Here is a process we used when developing relatively large database (around 100 tables and around 350 objects in total)

Developers were required to use extended properties to add details to all objects.
Admins rejected any DDL that didn’t have extended properties
Third party tool was used to automatically generate visual documentation via command line interface every day. We used ApexSQL Doc and it worked just fine but I also successfully used SQL Doc from Red Gate in other company.

This process ensured that we have all objects documented and document up to date.

The difficult thing though was getting developers to write good comments consistently ;)

SQL Server has extended properties that can take care of this.

This article describes how do set them up in SQL Sever http://www.developer.com/db/article.php/3677766

MSDN Reference

It can be used in conjunction with RedGate SQL Doc to create a nice Data dictionary.

I use comments attached to tables and column. SchemaSpy is a great tool for generating html documentation files out of your schema, including comments.

At one point I wrote a basic SQL parser that would parse CREATE TABLE statements and strip out specially formatted comments. These were then post-processed into LaTeX source and rendered to PDF. This was inspired by Javadoc and was used to create the documentation for This product. Subsequently a data dictionary feature was built into the warehouse manager and a modified version of the LaTeX generator was used to render the data dictionary from the warehouse manager.

On another project I used Visio - the version that comes with Visual Studio Enterprise Architect will forward engineer a database. The SQL so generated had the table and column comments rendered in comment strings that were fairly straightforward to parse. The tool I wrote generated MIF files that were be included into a spec document built with FrameMaker.

If you have a repository tool such as Powerdesigner you can maintain data models in it and get repository reports out that include the documentation you have entered. If you need deeper integration of your data dictionary with functional specifications (Quite useful for data warehouse systems where the ETL is complex and involves significant computation of derived values) you can still extract the metadata and write a utility to generate something that will integrate the data dictionary into a specification document. This also allows cross-referencing between data dictionary items and other specification documents and generation of indexes that cover the data dictionary definitions and related documentation such as a specification of how something is calculated with examples.

We've written a word document that lists the tables, the fields and what everything does. This is backed up by a diagram that shows how everything links/relates to each other. It's a pretty simple document really, just a load of tables with Field Name > Data Type > Purpose

I'm using Firebird which has description field for all system objects (tables, columns, views, procedures and parameters, triggers, etc.) It's nice because you can easily share it with others (docs go with database, not separately) and you never lose it.

Most admin. tools for Firebird allow you to edit these descriptions and there are some specialized tools (like IBDesc, for example) that create nice HTML or PDF reports you can print (for some or all tables) easily.

It's a really simplistic approach, but I use a pair of wiki pages: one with the mysqldump of the database, and one written in a slightly more English-like format.

For the projects I've worked on, that's been sufficient (through the dozens of tables level). I don't know how well it might scale to larger projects (say in the hundreds of tables), but it's been good so far.

I comment my databases as I comment my programs. By writing good (I hope) comments in the source code (the SQL file containing the DDL instructions).

Using SQL COMMENT is another possibility. The good thing with them is that they are always with your objects, are backed up with them, etc. The bad thing is that they are more limited (for instance in length).

Since we use Rational Software Architect, we use its data discovery features to document our databases and then annotate them from there.

In Oracle you can comment on tables and it stores it in the data dictionary.

However, I store all my table, column, index comments in a very old version of ERWin. It's the master source of truth and generates the DDL to create tables, etc. From there, I can extract it out into a word document or pdf.

I've recently turned to writing markdown documentation, which includes linking to individual table .sql files (where tables and fields are hopefully intuitively named with plenty of comments).

I keep the individual table schema's in version control, using the following command:

mysqldump --no-data --tab=./tables dbname

The schema for a single table allows you to see comments, indexes, unique keys etc. so is fairly self explanatory (well that is the idea at least).

The master markdown documentation has hyperlinks such as user table peppered throughout, so the reader can easily go to the different tables.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow