I am currently investigating the most appropriate dictionary to use in an application I am building.

The dictionaries bundled with Sublime Text 2 use the file format you would expect: an alphabetically ordered list of words. However, a lot of those words have additional information appended to them. Take this snippet as an example:

abaft
abbreviation/M
abdicate/DNGSn
Abelard/M
abider/M
Abidjan
ablaze
abloom
aboveground
abrader/M
Abram/M
abreaction/MS
abrogator/MS
abscond/DRSG
absinthe/MS
absoluteness/S
absorbency/SM
abstract/ShTVDPiGY
absurdness/S

A fruitless Google search has not shed any light on what the letters after the slash (/) mean.

Maybe they hint at the grammatical gender of the word, but that is only a guess, and I'd prefer to read a formal explanation of their meaning.

Has anybody come across these?


Solution

The letters following the slash are affix flags. Each flag names a prefix or suffix rule that may be applied to the root word.

See this blog post for a nice explanation and examples of what these affixes can be used for.

Another place to look is the aspell manual.

Other tips

TL;DR: each letter following the slash in the .dic file is the name of a rule in the .aff file.
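As a rough illustration of that idea, here is a minimal sketch that splits a .dic entry into its root word and its list of rule letters. The function name `split_entry` is made up for this example, and it assumes each flag is a single character, which matches the snippet in the question but is not guaranteed for every dictionary.

```python
def split_entry(line):
    """Split a .dic entry such as 'abdicate/DNGSn' into (word, flags).

    Assumption: every flag is exactly one character long.
    """
    word, _, flags = line.strip().partition("/")
    return word, list(flags)


for entry in ["abaft", "abbreviation/M", "abdicate/DNGSn"]:
    print(split_entry(entry))
# ('abaft', [])
# ('abbreviation', ['M'])
# ('abdicate', ['D', 'N', 'G', 'S', 'n'])
```

Each letter in the returned list would then be looked up in the .aff file.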

https://superuser.com/a/633869/367530

Each rule is defined in the .aff file for that language. The rules come in two flavors: SFX for suffixes and PFX for prefixes. Each line begins with PFX or SFX, followed by the rule letter identifier (one of the letters that follow the word in the dictionary file). A rule starts with a header line:

PFX [rule_letter_identifier] [combineable_flag] [number_of_rule_lines_that_follow]

You can normally ignore the combinable flag; it is Y or N depending on whether the rule can be combined with other rules. The header is followed by a number of lines (given by [number_of_rule_lines_that_follow]) that list how the rule applies in different situations. Each of these lines looks like this:

PFX [rule_letter_identifier] [number_of_letters_to_delete] [what_to_add] [when_to_add_it]

For example:

SFX B Y 3
SFX B 0 able [^aeiou]
SFX B 0 able ee
SFX B e able [^aeiou]e

If B is one of the letters following a word, i.e. someword/B, then this is one of the rules that can apply. There are three possibilities that can happen (because there are three lines). Only one will apply:

  • able is added to the end when the end of the word is not (indicated by ^) one of the letters in the set (indicated by [ ]) of letters a, e, i, o, and u. For example, question → questionable
  • able is added to the end when the end of the word is ee. For example, agree → agreeable.
  • able is added to the end when the word ends in a letter that is not a vowel ([^aeiou]) followed by an e. The trailing e is stripped first (the column before able). For example, excite → excitable.
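The three cases above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the helper names `parse_suffix_rules` and `apply_suffix` are invented here, the condition column is treated as a regular expression anchored at the end of the word, and the first matching line wins. A real spell checker handles more cases than this.

```python
import re

AFF_TEXT = """\
SFX B Y 3
SFX B 0 able [^aeiou]
SFX B 0 able ee
SFX B e able [^aeiou]e
"""


def parse_suffix_rules(aff_text):
    """Return {flag: [(strip, add, condition), ...]} for SFX rule lines."""
    rules = {}
    for line in aff_text.splitlines():
        parts = line.split()
        # Header lines have four fields; only five-field SFX lines are rules.
        if len(parts) != 5 or parts[0] != "SFX":
            continue
        _, flag, strip, add, condition = parts
        strip = "" if strip == "0" else strip  # 0 means "delete nothing"
        rules.setdefault(flag, []).append((strip, add, condition))
    return rules


def apply_suffix(word, flag, rules):
    """Apply the first rule line for `flag` whose condition matches the word ending."""
    for strip, add, condition in rules.get(flag, []):
        # Assumption: the condition can be used as a regex anchored at the end.
        if re.search(condition + "$", word) and word.endswith(strip):
            return word[: len(word) - len(strip)] + add
    return None  # no rule line applies to this word


rules = parse_suffix_rules(AFF_TEXT)
for word in ["question", "agree", "excite"]:
    print(word, "->", apply_suffix(word, "B", rules))
# question -> questionable
# agree -> agreeable
# excite -> excitable
```

A word that matches none of the three conditions, such as one ending in a vowel other than e, gets `None` back, meaning rule B produces no form for it.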

PFX rules work the same way, but they apply to the beginning of the word instead of the end.
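For comparison, a prefix rule could be applied along these lines. The rule line `PFX A 0 re .` used here is a hypothetical example, not one taken from the text above, and the sketch assumes that a condition of `.` means the rule applies to any word. The function name `apply_prefix` is likewise made up.

```python
import re


def apply_prefix(word, strip, add, condition):
    """Apply one PFX rule line; strip and condition are checked at the word start."""
    strip = "" if strip == "0" else strip  # 0 means "delete nothing"
    # Assumption: the condition can be used as a regex anchored at the start.
    if re.match(condition, word) and word.startswith(strip):
        return add + word[len(strip):]
    return None  # the rule line does not apply


# Hypothetical rule line: PFX A 0 re .
print(apply_prefix("build", "0", "re", "."))  # rebuild
```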

Licensed under: CC-BY-SA with attribution