Question

Today's challenge is building a search engine for my store's product database.

A lot of products are created by hand, by a lot of different hands!

So you're likely to find "i-phone 3gs", "iPhone4" and "i phone 5".

What I want is to search for "iPhone" and find all three example products above.

That reminded me of "fuzzy searches". I tried to use them out of the box, without success.

What do I have to index, and how should I search, to handle this kind of example (special characters or whitespace inside a document body) and retrieve these "synonym" results?

e.g.

iPhone => "i-phone"

"special 40" => "special40"

Was it helpful?

Solution

Using Lucene, there are a couple of options I would recommend.

One would be to index product ids with a KeywordAnalyzer, and then query as you suggested, with a fuzzy query.
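To see why a fuzzy query can work for this: Lucene's FuzzyQuery matches terms within a bounded Levenshtein edit distance (2 by default). Here is a quick, Lucene-free sketch of that distance computation; the class and method names are illustrative, not Lucene API:

```java
public class EditDistanceDemo {
    // Classic dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // deletion
                                            d[i][j - 1] + 1),  // insertion
                                   d[i - 1][j - 1] + cost);    // substitution
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // "iPhone" vs "i-phone": one inserted dash plus one case change = 2 edits,
        // right at FuzzyQuery's default limit.
        System.out.println(levenshtein("iPhone", "i-phone"));   // 2
        // "special40" vs "special 40": a single inserted space = 1 edit.
        System.out.println(levenshtein("special40", "special 40")); // 1
        // "iPhone" vs "i phone 5": 4 edits, too far for a default fuzzy match.
        System.out.println(levenshtein("iPhone", "i phone 5")); // 4
    }
}
```

So fuzzy matching a whole keyword-analyzed product id works for near variants like "i-phone", but longer strings such as "i phone 5" drift outside the edit-distance budget, which is why the analyzer-based approach below is usually the more robust option.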

Or, you could create a custom Analyzer that includes a WordDelimiterFilter, which creates tokens at case changes as well as at dashes and spaces (if any survive the tokenizer). An important note: if you are using a StandardAnalyzer, SimpleAnalyzer, or something similar, make sure the WordDelimiterFilter is applied BEFORE the LowerCaseFilter. Running the stream through the LowerCaseFilter first would, of course, prevent it from splitting terms on camel casing. One more caution: you'll probably want to customize your StopFilter, since "i" is a common English stopword.

In a custom analyzer, you mainly just need to override createComponents(). For example, if you wanted to add WordDelimiterFilter functionality into the StandardAnalyzer's set of filters:

@Override
protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    Tokenizer tokenizer = new StandardTokenizer(Version.LUCENE_40, reader);
    TokenStream filter = new StandardFilter(Version.LUCENE_40, tokenizer);
    // Take a look at the WordDelimiterFilter API for other flags controlling this filter's behavior
    filter = new WordDelimiterFilter(filter, WordDelimiterFilter.GENERATE_WORD_PARTS, null);
    filter = new LowerCaseFilter(Version.LUCENE_40, filter);
    // As mentioned, build a CharArraySet of your own stopwords, since the default set will likely cause problems for you
    filter = new StopFilter(Version.LUCENE_40, filter, myStopWords);
    return new TokenStreamComponents(tokenizer, filter);
}

Other tips

With Solr, please make sure to walk through the example tutorial and the corresponding schema.xml. You will see two field type definitions there (en_splitting and en_splitting_tight, I think) that cover very similar use cases.

Specifically, you are looking at WordDelimiterFilter combined with LowerCaseFilter and possibly SynonymFilter. You do have to be a bit careful with SynonymFilter, though, especially if you are mapping to/from multi-word equivalents.
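For reference, a minimal Solr fieldType along those lines might look like the following. This is a sketch, not a copy of the shipped schema: the name "text_splitting" is illustrative, and the exact flags you want depend on your data, so compare against the en_splitting definitions in the example schema.xml:

```xml
<fieldType name="text_splitting" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Split on case changes, hyphens and letter/digit boundaries,
         so "iPhone4" yields "iPhone" and "4"; catenate* also indexes
         the joined forms ("iphone4"). -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1"
            splitOnCaseChange="1"/>
    <!-- Lowercase AFTER word-delimiting, so case changes are still visible. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a field of this type, both "i-phone 3gs" and "iPhone4" index a lowercase "iphone" token, so a query for "iPhone" matches all the variants in the question.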

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow