Question

I'm about to start my degree thesis and I want to build a fault detection system using machine learning techniques. I need datasets for it, but I don't know where to get that data. I'm looking for historical operation/maintenance/fault datasets for any kind of machine in the oil & gas industry (drills, steam injectors, etc.) or in electrical companies (transformers, generators, etc.).


Solution

Maybe there's another way to go: generate the dataset that your algorithms will process. You define behaviour models for the events (both the ones you're looking for and the ones in which they are hidden), then generate the data, then analyse it.

The benefit of this approach is that you control exactly what is inside the processed data, so you can make sure your algorithm identifies exactly what it is supposed to identify, no more, no less.
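To make the idea concrete, here is a minimal sketch in plain Python. It is not how GEDIS Studio works, just an illustration: the sensor model (a reading around 50 with Gaussian noise), the fault model (a large positive or negative offset), the fixed-threshold detector, and all function names and parameter values are assumptions chosen for the example. Because the generator also returns the true labels, you can count exactly what the detector caught and what it missed.

```python
import random


def generate_readings(n, fault_rate=0.02, seed=0):
    """Generate n synthetic sensor readings with injected faults.

    Returns (values, labels), where labels[i] is True if reading i is a fault.
    The numbers below are illustrative assumptions, not real machine data.
    """
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    values, labels = [], []
    for _ in range(n):
        is_fault = rng.random() < fault_rate
        reading = rng.gauss(50.0, 2.0)  # assumed normal behaviour
        if is_fault:
            # Assumed fault behaviour: a large offset in either direction.
            reading += rng.choice([-1, 1]) * rng.uniform(15.0, 25.0)
        values.append(reading)
        labels.append(is_fault)
    return values, labels


def detect_faults(values, center=50.0, threshold=10.0):
    """Placeholder detector: flag readings far from the expected center."""
    return [abs(v - center) > threshold for v in values]


def score(predicted, labels):
    """Count true positives, false positives and false negatives."""
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum(l and not p for p, l in zip(predicted, labels))
    return tp, fp, fn


if __name__ == "__main__":
    values, labels = generate_readings(1000)
    tp, fp, fn = score(detect_faults(values), labels)
    print(f"faults injected: {sum(labels)}, found: {tp}, "
          f"false alarms: {fp}, missed: {fn}")
```

In a real thesis you would swap the threshold detector for your machine learning model and make the behaviour models richer (drift, seasonality, correlated sensors), but the evaluation loop stays the same.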

With GEDIS Studio we model event behaviours with activity profiles, and the generator produces those events. We have implemented generators for telecom CDRs, credit card usage, smart metering, etc.

They are freely available online through the evaluation account at http://www.data-generator.com

Check the detailed CDR use case at http://www.gedis-studio.com/online-call-detail-records-cdr-generator.html

Regards

OTHER TIPS

A huge list of open data sets is available here:

It includes sources such as Amazon, KDnuggets, Stanford, Twitter, Freebase, Google Public and more.

I've found an interesting project with tons of data available. It's a benchmark based on real data collected from an industrial valve. This is the website:

Industrial Actuator Real Data Benchmark Study.

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange