Searching for strings in a 'dictionary' file with multiple wildcard values

https://stackoverflow.com/questions/23576826

19-07-2023

Question

I am trying to create a function that takes two parameters: a word containing wildcards, such as "*arn*val", and the name of a file containing a dictionary. It should return a list of all dictionary words that match the pattern, such as ["carnival"].

My code works fine for a pattern with only one "*" in it, but with any more than that I'm stumped as to how to do it.

Searching the file for the wildcard string itself returned nothing.

Here is my code:

def find_words(wildcard, dictionary_filename):  # placeholder name; the post shows only the body
    dictionary_file = open(dictionary_filename, 'r')
    dictionary = dictionary_file.read()
    dictionary_file.close()
    dictionary = dictionary.split()

    alphabet = ["a","b","c","d","e","f","g","h","i",
                "j","k","l","m","n","o","p","q","r",
                "s","t","u","v","w","x","y","z"]

    new_list = []

    for letter in alphabet:
        if wildcard.replace("*", letter) in dictionary:
            new_list += [wildcard.replace("*", letter)]

    return new_list

The parameters are: first, the wildcard string (wildcard), and second, the dictionary file name (dictionary_filename).

Most answers on this site were about Regex, which I have no knowledge of.

Solution

Your particular error is that .replace replaces all occurrences of "*" with the same letter, e.g., "*arn*val" -> "CarnCval" or "IarnIval" (capitals mark the substituted letter). You want a different letter at each position. You could fix it with a second nested loop over the alphabet (or use itertools.product() to generate all possible letter pairs), but a simpler way is to use regular expressions:

import re

# each `*` corresponds to exactly one ascii lowercase letter
pattern = re.escape(wildcard).replace("\\*", "[a-z]")
# known_words is the list of words read from the dictionary file
matches = list(filter(re.compile(pattern).fullmatch, known_words))

Note: it doesn't support escaping * in the wildcard.
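For comparison, the itertools.product() route mentioned earlier could look roughly like the sketch below. The function name expand_wildcard and the known_words parameter are illustrative and not from the original post. It assumes each "*" stands for exactly one lowercase ASCII letter.

```python
import itertools
import string


def expand_wildcard(wildcard, known_words):
    """Return words matching `wildcard`, where each `*` is one lowercase letter.

    Hypothetical helper: tries every combination of letters for the `*` slots.
    """
    words = set(known_words)      # set lookup is fast compared to a list
    parts = wildcard.split("*")   # literal pieces between the wildcards
    n_stars = len(parts) - 1
    matches = []
    for letters in itertools.product(string.ascii_lowercase, repeat=n_stars):
        # Interleave the literal pieces with the chosen letters.
        candidate = parts[0] + "".join(
            letter + part for letter, part in zip(letters, parts[1:])
        )
        if candidate in words:
            matches.append(candidate)
    return matches
```

This tries 26**n candidates for n wildcards, so it is only practical for a handful of "*" characters; the regex version scans the word list once regardless of how many wildcards there are.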

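If the input wildcards are shell-style file patterns, where * matches any sequence of characters (including none) rather than exactly one letter, you could use the fnmatch module to filter the words: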

import fnmatch

matches = fnmatch.filter(known_words, wildcard)
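To tie both approaches back to the two-parameter function from the question, here is a minimal sketch. The helper names load_words, match_one_letter_per_star and match_shell_style are made up for illustration, and the dictionary file is assumed to hold whitespace-separated lowercase words.

```python
import fnmatch
import re


def load_words(dictionary_filename):
    """Read a whitespace-separated word list from a file (assumed format)."""
    with open(dictionary_filename) as dictionary_file:
        return dictionary_file.read().split()


def match_one_letter_per_star(wildcard, known_words):
    """Regex approach: each `*` matches exactly one lowercase ASCII letter."""
    pattern = re.compile(re.escape(wildcard).replace("\\*", "[a-z]"))
    return [word for word in known_words if pattern.fullmatch(word)]


def match_shell_style(wildcard, known_words):
    """fnmatch approach: each `*` matches any run of characters, even none."""
    return fnmatch.filter(known_words, wildcard)
```

With the words ["carnival", "starnoval", "arnval"] and the pattern "*arn*val", the regex version keeps only "carnival", while the fnmatch version keeps all three, so the choice depends on what "*" is supposed to mean. A call such as match_one_letter_per_star(wildcard, load_words(dictionary_filename)) gives the function described in the question.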

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow