Question

I have decided to develop an EDMS in Java as an end-of-year project for my final year of IT studies, and I'm currently researching database solutions for uploading and storing files in different formats along with their metadata. I would like to be able to query both file metadata and file content (e.g. return all documents created after June 2012 by the user John that contain the string "finance").

I understand that databases are for data and file systems are for files, as explained in this article, but some of my teachers have suggested that I look into XML databases, Apache Cocoon or Apache Jackrabbit, and I have to admit that I am at a loss as to which approach I should take. This article seems to suggest that MongoDB would be my best bet?

Thank you for your patience and help.

Sebastien

Solution

Without knowing which features you plan to implement, it is hard to say. Assuming that your system will:

  1. allow uploading of documents
  2. allow searching of documents based on various metadata
  3. allow downloading of documents

consider either:

  1. Keeping the files in a filesystem and the metadata in a database such as MySQL
  2. Keeping the files in a filesystem and the metadata in a search engine such as Elasticsearch

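Either solution would work; which one fits better depends on how you want to search. A relational database suits exact filters on structured metadata (author, creation date), while a search engine is the better fit if you also need full-text search. The flow of your application would be: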

Uploading Documents

  1. User uploads new document
  2. Assign document internal ID
  3. Store document in filesystem based on that ID
  4. Store metadata in database/elasticsearch using ID to reference the file
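The upload steps above could be sketched roughly as follows. This is only an illustration: `UploadSketch`, `Metadata` and their methods are hypothetical names, the internal ID is assumed to be a random UUID, and an in-memory map stands in for the database or Elasticsearch index.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of the upload flow; not taken from any real library.
class UploadSketch {

    // Metadata kept per document. A real system would persist this
    // in MySQL or Elasticsearch instead of in memory.
    static final class Metadata {
        final String id;
        final String originalName;
        final String author;
        final Instant createdAt;

        Metadata(String id, String originalName, String author, Instant createdAt) {
            this.id = id;
            this.originalName = originalName;
            this.author = author;
            this.createdAt = createdAt;
        }
    }

    private final Path storageRoot;
    // In-memory stand-in for the metadata store (assumption).
    private final Map<String, Metadata> metadataById = new HashMap<>();

    UploadSketch(Path storageRoot) {
        this.storageRoot = storageRoot;
    }

    // Step 1: the caller hands over the uploaded content and its metadata.
    String upload(InputStream content, String originalName, String author) {
        // Step 2: assign an internal ID (a random UUID here).
        String id = UUID.randomUUID().toString();
        try {
            // Step 3: store the file under a name derived from the ID,
            // never from the user-supplied file name.
            Files.createDirectories(storageRoot);
            Files.copy(content, storageRoot.resolve(id));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Step 4: store the metadata, keyed by the same ID.
        metadataById.put(id, new Metadata(id, originalName, author, Instant.now()));
        return id;
    }

    Metadata metadataFor(String id) {
        return metadataById.get(id);
    }

    Path pathFor(String id) {
        return storageRoot.resolve(id);
    }
}
```

Keeping the original file name only in the metadata means that two uploads with the same name cannot overwrite each other on disk.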

Retrieving Documents

  1. User enters search criteria
  2. You generate query for either database or elasticsearch
  3. Display results. The result will have a link with the internal ID you created
  4. User selects result. You use the ID to get the document
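
A minimal sketch of the retrieval steps, under the same kind of assumptions: `RetrievalSketch` and `Entry` are hypothetical names, and an in-memory list plays the role of the metadata store. With a real database the filter in `search` would become a SQL `WHERE` clause, and with Elasticsearch it would become a query sent to the index.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the retrieval flow; not taken from any real library.
class RetrievalSketch {

    // One metadata record per stored document.
    static final class Entry {
        final String id;
        final String author;
        final LocalDate created;

        Entry(String id, String author, LocalDate created) {
            this.id = id;
            this.author = author;
            this.created = created;
        }
    }

    private final Path storageRoot;
    // In-memory stand-in for the database or Elasticsearch index (assumption).
    private final List<Entry> entries = new ArrayList<>();

    RetrievalSketch(Path storageRoot) {
        this.storageRoot = storageRoot;
    }

    void register(String id, String author, LocalDate created) {
        entries.add(new Entry(id, author, created));
    }

    // Steps 1-3: turn the search criteria into a query and return the
    // internal IDs of the matching documents.
    List<String> search(String author, LocalDate createdAfter) {
        return entries.stream()
                .filter(e -> e.author.equals(author))
                .filter(e -> e.created.isAfter(createdAfter))
                .map(e -> e.id)
                .collect(Collectors.toList());
    }

    // Step 4: use the ID from the selected result to load the file.
    // Only IDs known to the metadata store are accepted, so a crafted
    // ID such as "../secret" cannot escape the storage directory.
    byte[] fetch(String id) {
        boolean known = entries.stream().anyMatch(e -> e.id.equals(id));
        if (!known) {
            throw new IllegalArgumentException("Unknown document ID: " + id);
        }
        try {
            return Files.readAllBytes(storageRoot.resolve(id));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the example in the question, `search("John", LocalDate.of(2012, 6, 30))` would return the IDs of John's documents created after June 2012; matching on the string "finance" inside the files is the part where a full-text engine such as Elasticsearch would take over.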
Licensed under: CC-BY-SA with attribution