Parsing nonuniform data

Question

Well that is an interesting problem for sure and there are a couple of things you could try.

Making the assumption that you don't have labels on your data, then the first thing I would try to do, is to check the connections between each instance using a clustering algorithm like k-means (http://en.wikipedia.org/wiki/K-means_clustering), keep in mind that this wouldn't solve your problem but would help you to explore your data and hopefully find a set of features to train a supervised learning classifier.

In the case that you do have labels on your data, or you could manually tag your set. Then you are in front a more manageable problem. At first glance, it would look a lot like a text or document classification problem (like classify emails as Spam/NoSpam), in which case a naive bayes classifier could be a good first attempt to attack the problem since is a easy algorithm to implement and can provide reasonable good results.

About Naives Bayes Classifier (https://www.bionicspirit.com/blog/2012/02/09/howto-build-naive-bayes-classifier.html)

I made some assumptions here and I might be wrong based on that. Maybe if you clarify some points (like if you are able to manually tag the data) we would be able to help you further.