Question

When a document is deleted, its metadata is preserved forever. For a hosted service like Cloudant, where storage is billed every month, I would instead like to purge deleted documents completely.

I read somewhere about a design pattern where you use dbcopy in a view to put the docs into a 'current' database and then periodically delete the expired databases. But I can't find the article, and I don't quite understand how database naming would work. How would the Cloudant clients always know the 'current' database name?


Solution

Cloudant does not expose the _purge endpoint (the loose consistency guarantees between the clustered nodes make purging tricky).

The most common solution to this problem is to create a second database and replicate into it with a validate_doc_update function in place, so that deleted documents with no existing entry in the target database are rejected. When replication is complete (or acceptably up to date, if using continuous replication), switch your application to the new database and delete the old one. There is currently no way to rename databases, but you could use a virtual host which points to the "current" database.
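
As a rough illustration of the validation function described above (CouchDB names the design-document field validate_doc_update), here is a minimal sketch. The names rejectNewTombstones and _design/skip_tombstones are made up for this example, and the exact behaviour on your account should be checked against the Cloudant documentation before relying on it.

```javascript
// Sketch of a validation function for the *target* database.
// CouchDB calls it with the incoming revision (newDoc) and the revision
// currently stored in this database (oldDoc, which is null if there is none).
const rejectNewTombstones = function (newDoc, oldDoc, userCtx, secObj) {
  // A deletion for a document the target has never seen is only a tombstone,
  // so refuse it. Deletions of documents that do exist here still go through.
  if (newDoc._deleted === true && !oldDoc) {
    throw { forbidden: "tombstone for a document this database never had" };
  }
};

// Hypothetical design document; the _id is an arbitrary choice.
// Design-document functions are stored as source strings.
const designDoc = {
  _id: "_design/skip_tombstones",
  validate_doc_update: rejectNewTombstones.toString()
};
```

The design document would need to be saved in the new database before replication starts, so that the rejected tombstones never land there.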

I'd caution that a workload which generates a high ratio of deleted:active documents is generally an anti-pattern in Cloudant. I would first consider whether you can change your document model to avoid it.

OTHER TIPS

Deleted documents are kept forever in CouchDB, even after compaction, though the remaining document is pretty small, as it contains only three fields:

{"_id": "234wer", "_rev": "123", "_deleted": true}

The reason for this is to make sure that all the replicated databases stay consistent. If a document that is replicated to several databases is deleted in one location and no record of the deletion is kept, there is no way to communicate that deletion to the other replicas.

There is _purge but as explained in the wiki it is only to be used in special cases.

Licensed under: CC-BY-SA with attribution