There are a few reasons you probably want to look at a database (not necessarily MySQL) rather than the file system for this sort of thing:
More files in one directory slow things down
Although XFS is supposed to be very clever about allocating resources, most filesystems see performance degrade as the number of files in a single directory grows. It also becomes a headache to deal with them on the command line. The XFS datasheet (http://oss.sgi.com/projects/xfs/datasheet.pdf) includes a graph of lookup performance; it only goes up to 50,000 files per directory, and performance is already on the way down at that point.
Overhead
There is a certain amount of filesystem overhead per file. If you have many small files, you may find that the final store bloats as a result of this.
Key cleaning
Are all your words safe to put in a filename? Are you sure? A slash or two in there is really going to ruin your day.
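To see why this matters, here is a quick sketch (the malicious "word" is hypothetical) of how a key containing slashes can escape the target directory entirely if it is used as a filename without sanitising:

```python
import os

# Hypothetical malicious "word" used directly as a filename
word = "foo/../../etc/passwd"
path = os.path.join("my_directory", word)

# The resulting path escapes my_directory entirely
print(os.path.normpath(path))  # → etc/passwd
```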
NoSQL might be a good option
Something like MongoDB/Redis might be a good option for this. MongoDB can store single documents of up to 16 MB and isn't much harder to use than putting things on the file system. If you are storing 15 MB documents, you might be getting a bit too close for comfort on that limit, but there are other options.
The nice thing about this is that lookup performance is likely to be pretty good out of the box, and if you later find it isn't, you can scale it by creating a cluster, etc. Any system like this will also do a good job of managing the files on disk intelligently for good performance.
If you are going to use the disk
Consider taking an MD5 hash of the word you want to store and basing your filename on it. For example, the MD5 of azpdk is:
1c58fb66d5a4d6a1ebe5ec9e217fbbf9
You could use this to create a filename e.g.:
my_directory/1c5/8fb/66d5a4d6a1ebe5ec9e217fbbf9
This has a few nice features:
- The hash takes care of scary characters
- The directories spread out the data, so no directory has more than 4096 entries (three hex characters give 16^3 = 4096 possibilities per level)
- This means the lookup performance should be relatively decent
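A minimal sketch of this scheme, using Python's standard hashlib (the function name and the exact directory split are illustrative):

```python
import hashlib

def word_path(word, base="my_directory"):
    # Hash the word so awkward characters never reach the filesystem,
    # then use the first two 3-character chunks of the hex digest as
    # directory levels (16**3 = 4096 possible entries per level).
    digest = hashlib.md5(word.encode("utf-8")).hexdigest()
    return "/".join([base, digest[:3], digest[3:6], digest[6:]])

print(word_path("azpdk"))  # prints a path like the one shown above
```

The same function works for both writing and looking up a word, since the path is derived purely from the key.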
Hope that helps.