Diffing many texts against each other to derive template and data (finding common subsequences)

Question 1

I agree with the comments about the question being ill-defined. It seems likely that the format is much more specific than your general question indicates.

Having said that, a tool like RecordBreaker might help. You could also search for "wrapper induction" to see if that turns up some useful leads.

Question 2

Perform a global multiple sequence alignment, and then treat every column of the resulting alignment that holds the same character in every row as part of the template:

                   id:   937 name=alice  ;
                   id: 28    name=bob    ;
                   id:925931 name=charlie;
Inferred template: XXX      XXXXXX       X
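
Once the rows are aligned, the column test itself is only a few lines. Here is a minimal sketch in Python, assuming the rows have already been padded to equal length by the alignment step; the function name `infer_template` and the `?` placeholder for data columns are my own choices, not part of any tool:

```python
def infer_template(aligned_rows, wildcard="?"):
    """Keep characters that are identical in every row; mark the rest as data.

    Assumes the rows are already aligned (equal length). `wildcard` is an
    arbitrary placeholder and should not occur in the rows themselves.
    """
    if len({len(row) for row in aligned_rows}) != 1:
        raise ValueError("rows must be aligned to the same length")
    template = []
    for column in zip(*aligned_rows):
        # A column belongs to the template only if all rows agree on it.
        template.append(column[0] if len(set(column)) == 1 else wildcard)
    return "".join(template)


rows = [
    "id:   937 name=alice  ;",
    "id: 28    name=bob    ;",
    "id:925931 name=charlie;",
]
print(infer_template(rows))  # id:?????? name=???????;
```

The `?` runs are the data fields; slicing each row at those positions gives you the extracted values.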

Most multiple sequence alignment tools that I'm aware of require small alphabets (DNA or protein), but hopefully you can find one that works on the alphabet you're using, which presumably is at least all printable ASCII characters. In the worst case, you can of course implement the DP yourself. To align 2 sequences (strings) globally you use the Needleman-Wunsch algorithm. For more than two sequences there are several ways to score an alignment, the most common being sum-of-pairs scoring.

The exact algorithm for k > 2 sequences unfortunately takes time exponential in k, but the heuristics employed in bioinformatics tools such as MUSCLE are much faster, and produce alignments that are very nearly as good. If they can be persuaded to work with the alphabet you're using, they would be the natural choice.
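
For the two-string case, a bare-bones Needleman-Wunsch might look like the sketch below. The scores (match +1, mismatch -1, linear gap penalty -1) and the `-` gap character are assumptions for illustration; you would want to tune them for your data and pick a gap character that cannot occur in the input. It covers only the pairwise case, not the k > 2 alignment.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1, gap_char="-"):
    """Globally align strings a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(gap_char)
            i -= 1
        else:
            out_a.append(gap_char)
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("id: 937 name=alice;", "id: 28 name=bob;")
print(aligned_a)
print(aligned_b)
```

This takes O(nm) time and space for strings of length n and m. One crude way to extend it to many strings is to align each new string against the alignment built so far, which is roughly the idea behind the progressive heuristics, though the real tools are considerably more careful about it.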