Question

I have a relational database (PostgreSQL 8.4) of around 3000 products offered by a company. The database is used to display the products on the company’s website (a Python 2.6 application). My final goal is to build a PDF file in order to print a paper version of the product catalog, and I would like to know what technology to use for that purpose. The operation will have to be repeated once a year for every new catalog, so I would like to automate the catalog generation but still leave some flexibility, because I won’t stay in the company forever and no technical person will replace me afterwards (small company, small budget).

Ideally, I would like to dynamically generate the structured content of the 3000 products in a text editor (OpenOffice, for example) for the following reasons:

  • content is generated dynamically so no need to retype everything
  • only the content structure is generated dynamically and not the styling so a non-technical user is able to customize styles.
  • the document being editable, it is easy for a non-technical user to add pages in the catalog like a welcome page, a note page, the terms and conditions. In other words, text editors are great, I don’t want to reinvent the wheel but I don’t want a person to retype all data for the 3000 products.

Solutions I looked at first:

  • I had a look at LaTeX, but it seems that the data and the styling are mixed together, unlike HTML and CSS, which clearly separate content and styling and which I find much easier to use.
  • I thought about using HTML and CSS directly, but it might be too technical.
  • I also looked at a library that generates PDF directly from Python, such as ReportLab (http://www.reportlab.com/software/documentation/tutorial/product-catalogue/). This, however, does not allow anything to be modified once the PDF is built, and even a small modification might require a technical person.

So if you have an idea for this kind of job then I would be very happy to get some tips about the right technologies. Thank you very much.

Solution 2

OpenDocument Format (ODF) approach

Why:

  • It is an open format, which makes it a safe long-term choice, and it costs nothing (small company, small budget).
  • It separates content and style.
  • Mature free and open-source software supports it: OpenOffice.org and LibreOffice. Both are easy to use for non-programmers.

How:

The ODF format is quite complex, but libraries already exist to help generate files. Some are available in Python (odfpy, lpod) and others in Java (JODReports, Apache ODF Toolkit). They seem to do the job!
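To give an idea of what these libraries write for you: an .odt file is a ZIP archive of XML parts. The following is a minimal sketch using only the Python standard library, not the API of any of the libraries named above. The function names and the sample product are hypothetical, and the style names Heading_20_2 and Text_20_body are assumed to map to the "Heading 2" and "Text body" paragraph styles in OpenOffice/LibreOffice; a real catalog would more likely start from a template with its own styles.

```python
import io
import zipfile
from xml.sax.saxutils import escape

MIMETYPE = "application/vnd.oasis.opendocument.text"

# Manifest listing the parts of the package.
MANIFEST = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<manifest:manifest'
    ' xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"'
    ' manifest:version="1.2">\n'
    '  <manifest:file-entry manifest:full-path="/" manifest:media-type="%s"/>\n'
    '  <manifest:file-entry manifest:full-path="content.xml"'
    ' manifest:media-type="text/xml"/>\n'
    '</manifest:manifest>\n'
) % MIMETYPE

# Document body; the single %s is filled with the generated product markup.
CONTENT_TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<office:document-content'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
    ' office:version="1.2">\n'
    '<office:body><office:text>\n%s\n</office:text></office:body>\n'
    '</office:document-content>\n'
)


def product_xml(name, description):
    """Return a heading and a paragraph that refer to styles by name only."""
    return (
        '<text:h text:style-name="Heading_20_2" text:outline-level="2">'
        + escape(name) + '</text:h>\n'
        '<text:p text:style-name="Text_20_body">'
        + escape(description) + '</text:p>'
    )


def build_odt(products):
    """Build an .odt package in memory from (name, description) pairs."""
    body = "\n".join(product_xml(name, desc) for name, desc in products)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as odt:
        # The mimetype entry has to come first and must not be compressed.
        odt.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        odt.writestr("META-INF/manifest.xml", MANIFEST)
        odt.writestr("content.xml", CONTENT_TEMPLATE % body)
    return buffer.getvalue()
```

Writing the result of build_odt(rows) to a file such as catalog.odt in binary mode should give a document that opens in the office suite. Because the generated markup only names styles, a non-technical user can restyle the whole catalog by editing those styles.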

Similar question but for Java

OTHER TIPS

Approach in LaTeX

General

LaTeX does allow you to separate content from styling, as it is a markup language (and it feels quite like HTML and CSS if you are coming from them).

http://en.wikipedia.org/wiki/Markup_language#TeX

http://en.wikibooks.org/wiki/LaTeX/Modular_Documents#Getting_LaTeX_to_process_multiple_files

This way you can put all the formatting options in your base file and then input or include the files which contain the actual content of your work. This means that the important part of your working process, i.e. writing, is kept largely separate from formatting choices (which is one of the main reasons why LaTeX is so good for serious writing!). You will thus be dealing solely with text and very basic commands such as \section, \emph etc. Your document will be uncluttered and much easier to work with.

The commands \input{filename} and \include{filename} insert text files (with or without LaTeX commands).

For more customization you would need your own macro(s) to read the content files and style them accordingly.

Some resources on defining macros (I can't post these as hyperlinks because of my reputation right now):

en.wikibooks.org/wiki/LaTeX/Macros

en.wikibooks.org/wiki/LaTeX/Creating_Packages
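As a rough illustration of how the content files could be produced, here is a minimal Python sketch that turns database rows into a file of macro calls. The macro name \product, its three arguments and the function names are hypothetical. The idea is that the hand-written base file defines the look once, for example with \newcommand{\product}[3]{...}, and pulls in the generated file with \input{products}.

```python
# Characters with a special meaning in LaTeX and their escaped forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape one character at a time so nothing gets escaped twice."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def product_line(name, description, price):
    """Return one call of the (hypothetical) \\product macro."""
    fields = (name, description, price)
    return "\\product" + "".join("{" + latex_escape(f) + "}" for f in fields)


def products_to_tex(products):
    """Return the text of a content-only file, one macro call per product."""
    return "\n".join(product_line(*row) for row in products) + "\n"
```

Saving the output of products_to_tex(rows) as products.tex keeps the generated file free of styling: changing the catalog layout only means editing the macro definition in the base file.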

One specific example

I've written software documentation where the actual source code was stored in separate files. The \lstinputlisting command from the listings package reads the source code and outputs it in a "styled way".

\lstinputlisting[caption=My caption]{sourcefile.lang}

What you are looking for is called database publishing. This can be done with a batch formatter (e.g. TeX or XSL-FO) or -- if you don't want 100% automation -- with add-ons for DTP programs like InDesign and Quark.

Database publishing

Yes, as Martin Schroeder points out, this is about database publishing. A recent, similar but more specific question is about using the pod tool to generate LibreOffice ODT files.

The pod approach uses Python: Python statements are placed in the ODF template file. You could use the same approach with any scripting language.

LibreOffice Writer also has a 'flat XML' file format. A database publishing batch process needs to replace certain placeholders with XML code generated from the database. This can be done by an interpreter that goes through your 'flat XML' file, looks for certain keywords or commands, and executes them.
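A minimal sketch of such an interpreter in Python follows. The {{name}} placeholder syntax, the "Product" style name and the function names are assumptions made for illustration, not part of LibreOffice or pod. It also assumes each placeholder sits alone in its own paragraph and has not been split into several spans by the editor.

```python
import re
from xml.sax.saxutils import escape

# A paragraph whose only content is a placeholder such as {{products}}.
PLACEHOLDER = re.compile(
    r"<text:p(?:\s[^>]*)?>\s*\{\{\s*(\w+)\s*\}\}\s*</text:p>"
)


def product_paragraphs(products):
    """Generate one styled paragraph per (name, description) pair."""
    return "".join(
        '<text:p text:style-name="Product">%s: %s</text:p>'
        % (escape(name), escape(description))
        for name, description in products
    )


def fill_template(flat_xml, generators):
    """Replace each known placeholder paragraph with generated XML."""
    def substitute(match):
        key = match.group(1)
        if key not in generators:
            return match.group(0)  # leave unknown placeholders untouched
        return generators[key]()
    return PLACEHOLDER.sub(substitute, flat_xml)
```

In use, the text of the flat XML template would be read from disk, passed through fill_template with a mapping such as {"products": lambda: product_paragraphs(rows)}, and written back out as a new flat XML file for the user to open and polish.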

Advantage

The advantage of this approach is that a general user can alter the report just by using LibreOffice. Insertion commands, which are interpreted by your batch program, can easily be placed at the right spot. These commands may take the form of a DSL.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow