Unsupervised feature extraction of dishes by building a tree structure of ingredients with Natural Language Processing

StackOverflow https://stackoverflow.com/questions/14395006

Question

I am building a recommendation system for dishes. Suppose a user eats french fries and rates them a 5. I then want to give a good rating to everything the dish is made of or associated with. In the case of french fries, the linked words should be "fried", "potato", "junk food", "salty" and so on. From the word Tsatsiki I want to extract "Cucumbers", "Yoghurt" and "Garlic". From Yoghurt I want to extract "milk product", from Cucumbers "vegetable", and so on.

What is this problem called in Natural Language Processing and is there a way to address it?

I have no data at all, and I am thinking of building a web crawler that searches the web for information about each dish. I would like the approach to be as little ad hoc as possible and not necessarily limited to English. Is there a way, maybe within deep learning, to do this? I would like a dish to be linked not only to its ingredients but also to categories: junk food, vegetarian, Italian food and so on.

Solution

This type of problem is called ontology engineering or ontology building. For an example of a large ontology and how it is structured, you might check out something like YAGO. It sounds like you are going to be building a boutique ontology for food and then overlaying a rating system on top of it. I don't know of any existing ontology of exactly the form you're looking for, but there are relevant resources you should take a look at, for example, this OWL-based food ontology and this recipe ontology.
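To make the idea concrete, here is a minimal sketch of a tiny hand-written food ontology with a rating overlay, using the dishes from the question. The `LINKS` table, the function names, and the choice to average ratings per linked term are all assumptions made for illustration; a real ontology would be far larger and would normally be stored in a format such as OWL.

```python
from collections import defaultdict

# Hypothetical mini-ontology: each term points to the broader terms it is linked to.
LINKS = {
    "tsatsiki": ["cucumber", "yoghurt", "garlic"],
    "cucumber": ["vegetable"],
    "garlic": ["vegetable"],
    "yoghurt": ["milk product"],
    "french fries": ["potato", "fried", "junk food", "salty"],
    "potato": ["vegetable"],
}

def linked_terms(term):
    """Return every term reachable from `term` by following LINKS."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for linked in LINKS.get(current, []):
            if linked not in seen:
                seen.add(linked)
                stack.append(linked)
    return seen

def propagate(ratings):
    """Spread each dish rating to all linked terms and average per term.

    Averaging is an assumption; other aggregation rules are possible.
    """
    collected = defaultdict(list)
    for dish, rating in ratings.items():
        for term in linked_terms(dish):
            collected[term].append(rating)
    return {term: sum(values) / len(values) for term, values in collected.items()}

scores = propagate({"french fries": 5, "tsatsiki": 3})
print(scores["vegetable"])     # average of 5 (via potato) and 3 (via cucumber, garlic)
print(scores["milk product"])  # only tsatsiki contributes
```

Because the links form a graph rather than a flat list, a rating on one dish reaches both its direct ingredients and their broader categories.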

OTHER TIPS

Do you have recipes like this:

Ingredients:
*Cucumbers
*Garlic
*Yoghurt

or like this:

Grate a cucumber or chop it. Add garlic and yoghurt.

If the former, your features have already been extracted. The next step would be to convert each recipe to a vector and use those vectors to recommend other recipes. The simplest way would be to do (unsupervised) clustering of recipes.
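As a rough sketch of the vector step, each recipe can be turned into a binary "bag of ingredients" vector and compared with cosine similarity. The recipes, ingredient lists, and function names below are made up for illustration, and nearest-neighbour lookup is used here in place of a full clustering algorithm; something like k-means could be run on the same vectors.

```python
import math

# Hypothetical recipes with already-extracted ingredient lists.
recipes = {
    "tsatsiki": ["cucumber", "garlic", "yoghurt"],
    "raita": ["cucumber", "yoghurt", "cumin"],
    "french fries": ["potato", "oil", "salt"],
}

# One vector dimension per distinct ingredient.
vocabulary = sorted({ing for ings in recipes.values() for ing in ings})

def to_vector(ingredients):
    """Binary vector: 1 if the ingredient is present in the recipe, else 0."""
    present = set(ingredients)
    return [1 if ing in present else 0 for ing in vocabulary]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

vectors = {name: to_vector(ings) for name, ings in recipes.items()}

def most_similar(name):
    """Return the other recipe whose ingredient vector is closest to `name`."""
    scored = [
        (cosine(vectors[name], vec), other)
        for other, vec in vectors.items()
        if other != name
    ]
    return max(scored)[1]

print(most_similar("tsatsiki"))
```

A user who rated tsatsiki highly would then be shown the recipes nearest to it in this ingredient space.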

If the latter, I suspect you can get away with a simple rule of thumb. First, use a part-of-speech tagger to extract all the nouns in the recipe. This would extract all the ingredients and a bit more (e.g. kitchen appliances, cutlery, etc.). Then look up the nouns in a food ingredient database such as this one, and keep only those that match.
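The lookup half of that rule of thumb might look like the sketch below. To keep it dependency-free, a plain regular-expression tokenizer stands in for the part-of-speech tagger, so every word (not just every noun) is checked against the lexicon; a real tagger such as NLTK's `nltk.pos_tag` could replace `candidate_words`. The `KNOWN_INGREDIENTS` set is a hypothetical stand-in for a real ingredient database, and the plural handling is deliberately naive.

```python
import re

# Hypothetical ingredient lexicon; a real system would load a food database.
KNOWN_INGREDIENTS = {"cucumber", "garlic", "yoghurt", "potato", "salt"}

def candidate_words(text):
    """Stand-in for a POS tagger: returns all lowercase word tokens.

    A real tagger would return only the nouns.
    """
    return re.findall(r"[a-z]+", text.lower())

def singularize(word):
    """Naive plural stripping, accepted only if the result is a known ingredient."""
    if word.endswith("es") and word[:-2] in KNOWN_INGREDIENTS:
        return word[:-2]
    if word.endswith("s") and word[:-1] in KNOWN_INGREDIENTS:
        return word[:-1]
    return word

def extract_ingredients(text):
    """Return known ingredients mentioned in the text, in order of first mention."""
    found = []
    for word in candidate_words(text):
        base = singularize(word)
        if base in KNOWN_INGREDIENTS and base not in found:
            found.append(base)
    return found

recipe = "Grate a cucumber or chop it. Add garlic and yoghurt."
print(extract_ingredients(recipe))  # ['cucumber', 'garlic', 'yoghurt']
```

Words such as "grate" or "chop" are dropped simply because they are not in the lexicon, which is why the database lookup does most of the filtering work here.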

Licensed under: CC-BY-SA with attribution