This sounds like the perfect place to apply phylogenetic tree-building methods! These have been famously used to recover a history of the Bible.
I would recommend starting with a method like Neighbour Joining, which is pretty fast (cubic in the number of files -- and yes, this is considered fast among this class of methods) and pretty accurate. All it needs is a distance matrix: for n files, this is an n*n table of numbers, each giving the distance between a pair of the files. Distances can be computed any way you want, but some make more sense than others. For example,

    diff file1 file2 | wc -c

would be one crude way to calculate a distance between two files.
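If you would rather build the whole matrix in one place than shell out to diff for every pair, the same idea can be sketched in Python. This is only a rough illustration: `diff_distance` and `distance_matrix` are names made up for this example, the measure (characters on added or removed lines, via the standard `difflib` module) is in the spirit of `diff | wc -c` rather than an exact equivalent, and the two directions are averaged on the assumption that you want an exactly symmetric matrix.

```python
import difflib
from itertools import combinations


def diff_distance(a: str, b: str) -> int:
    """Count the characters on lines that differ between two texts."""
    lines_a = a.splitlines(keepends=True)
    lines_b = b.splitlines(keepends=True)
    # ndiff marks removed lines with "- " and added lines with "+ ";
    # unchanged lines ("  ") and hint lines ("? ") are skipped.
    return sum(
        len(line) - 2
        for line in difflib.ndiff(lines_a, lines_b)
        if line.startswith(("- ", "+ "))
    )


def distance_matrix(texts):
    """Build a symmetric n*n table of pairwise distances."""
    n = len(texts)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        # Line matching is not guaranteed to be symmetric, so average
        # both directions (an assumption of this sketch).
        d = (diff_distance(texts[i], texts[j])
             + diff_distance(texts[j], texts[i])) / 2
        dist[i][j] = dist[j][i] = d
    return dist
```

To use it on real files, read each one into a string first (for example with `pathlib.Path(name).read_text()`) and pass the list of strings to `distance_matrix`.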
One caveat: many phylogenetic methods, including Neighbour Joining, build unrooted trees -- that is, they cannot infer an ancestor, but only the tree structure relating the different items (which are called taxa when they are biological species or individuals). Still, that should be enough to help you find the root. More sophisticated Maximum Likelihood models sometimes can infer rooted trees, but these tend to be geared towards the specifics of how DNA evolves over time.
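To make the "unrooted" point concrete, here is a minimal sketch of Neighbour Joining in plain Python. It is a textbook-style illustration, not a tuned implementation: the function name `neighbour_joining` and the `internal0`, `internal1`, ... labels for inferred nodes are invented for this example, and it assumes the file names are unique and do not collide with those labels. Note that the result is just a list of edges with lengths; nothing in it marks any node as the ancestor.

```python
from itertools import combinations


def neighbour_joining(names, dist):
    """Return the edges (node, node, length) of an unrooted NJ tree."""
    # d[a][b] is the current distance between active nodes a and b.
    d = {a: {b: float(dist[i][j]) for j, b in enumerate(names) if i != j}
         for i, a in enumerate(names)}
    edges = []
    next_id = 0
    while len(d) > 2:
        n = len(d)
        total = {a: sum(d[a].values()) for a in d}
        # Join the pair that minimises the Q criterion.
        a, b = min(
            combinations(d, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - total[p[0]] - total[p[1]],
        )
        u = f"internal{next_id}"  # hypothetical label for the new node
        next_id += 1
        # Branch lengths from the joined pair to the new internal node.
        len_a = d[a][b] / 2 + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(a, u, len_a), (b, u, len_b)]
        # Distances from the new node to every remaining node.
        new_row = {k: (d[a][k] + d[b][k] - d[a][b]) / 2
                   for k in d if k not in (a, b)}
        del d[a], d[b]
        for k in d:
            del d[k][a], d[k][b]
            d[k][u] = new_row[k]
        d[u] = new_row
    # Two nodes remain: connect them with a final edge.
    a, b = d
    edges.append((a, b, d[a][b]))
    return edges
```

For n files this yields 2n-3 edges. Each pass over the candidate pairs is quadratic and there are about n passes, which is where the cubic running time comes from.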