Question

I am working on software in which the user may select multiple substrings from an arbitrary string. Sometimes, this will naturally result in patterns. E.g.:

"The quick brown fox jumps over the lazy dog"
 - substring selected: brown fox
"The quick purple fox jumps over the lazy dog"
 - substring selected: purple fox
"The quick orange fox jumps over the lazy dog"
 - substring selected: orange fox

So it would seem that the user is always selecting the characters "fox" and the word immediately preceding it.

It would be really neat if I could implement some subroutine that could offer "Predictions" for these substrings, which the user could either make use of, or discard as appropriate. E.g.:

"The quick yellow fox jumps over the lazy dog"
 - suggested substring: yellow fox (ACCEPTED)
"The quick red fox jumps over the lazy dog"
 - suggested substring: red fox (ACCEPTED)
"The English Foxhound is a scent hound, bred to hunt foxes by scent."
 - suggested substring: hunt fox (REJECTED)

Generally speaking, how would one identify patterns in user input programmatically, and use those patterns to make predictions about future input?


Solution

There has been research on this in the field of text editing. There, the idea is to have the user edit a semi-structured text and to replicate the changes to similar portions of the text (with the appropriate transformations).

The general idea is to generate candidate patterns and rank/dismiss them based on user input and heuristics.
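As a rough illustration of that generate-and-rank loop, here is a minimal Python sketch. It is not taken from the research mentioned above. The `Predictor` class, the `candidate_patterns` helper, and the "shortest regex first" ranking are all hypothetical choices made for this example. It assumes that candidates are regular expressions built by either keeping each selected word literally or generalizing it to `\w+`, with and without word boundaries, and that a candidate is dismissed as soon as it contradicts an accepted or rejected selection.

```python
import re
from itertools import product


def candidate_patterns(selection):
    """Build candidate regexes by keeping or generalizing each selected word."""
    words = selection.split()
    patterns = set()
    for keep_flags in product((True, False), repeat=len(words)):
        body = " ".join(
            re.escape(word) if keep else r"\w+"
            for word, keep in zip(words, keep_flags)
        )
        patterns.add(body)                      # loose variant
        patterns.add(r"\b" + body + r"\b")      # whole-word variant
    return patterns


def first_match(pattern, text):
    """Return the first substring of text matched by pattern, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


class Predictor:
    """Hypothetical helper: keeps only candidates consistent with feedback."""

    def __init__(self):
        self.positives = []   # (text, selection the user made or accepted)
        self.negatives = []   # (text, suggestion the user rejected)
        self.candidates = set()

    def _consistent(self, pattern):
        return (
            all(first_match(pattern, t) == s for t, s in self.positives)
            and all(first_match(pattern, t) != s for t, s in self.negatives)
        )

    def _prune(self):
        self.candidates = {p for p in self.candidates if self._consistent(p)}

    def accept(self, text, selection):
        self.positives.append((text, selection))
        self.candidates |= candidate_patterns(selection)
        self._prune()

    def reject(self, text, suggestion):
        self.negatives.append((text, suggestion))
        self._prune()

    def suggest(self, text):
        # Assumed heuristic: try the shortest (simplest) surviving regex first.
        for pattern in sorted(self.candidates, key=lambda p: (len(p), p)):
            suggestion = first_match(pattern, text)
            if suggestion is not None:
                return suggestion
        return None


predictor = Predictor()
for colour in ("brown", "purple", "orange"):
    predictor.accept(
        f"The quick {colour} fox jumps over the lazy dog", f"{colour} fox"
    )

hound = "The English Foxhound is a scent hound, bred to hunt foxes by scent."
print(predictor.suggest("The quick yellow fox jumps over the lazy dog"))  # yellow fox
print(predictor.suggest(hound))                                           # hunt fox
predictor.reject(hound, "hunt fox")
print(predictor.suggest(hound))                                           # None
```

With the three selections from the question, two candidates survive: `\w+ fox` and its whole-word variant. Rejecting "hunt fox" dismisses the loose one, so later suggestions only fire on the whole word "fox". A real system would need a richer candidate space (position, case, surrounding context) and a softer scoring scheme than this all-or-nothing pruning.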

See for example this paper for a nice overview.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow