Question

The {a,i,my}spell dictionaries contain well-annotated vocabulary, giving the lexical class and so on.

I'm writing a program that needs a large vocabulary but with a reduced scope, so I don't need all the information in the *spell dictionaries. I therefore read the dictionary data, process it, and rewrite it in my own format. The dictionary I'm using is GPLv2, and I want to release my program under a BSD license.

As the data itself [the language] is public domain, but the representation of the data [the *spell .dic files] is not, maybe I must follow the GPL.

But as I'm not using the GPL'd code itself, only something derived from it through machine processing, maybe I don't have to.

Additional info:

I'm already making my own derived word list, mostly because the original word list has too much semantic information, like this: "ripostar/#vi/XYL/" or "ritualizar/#vt/XYPLnc/". But the resulting word list must be manually curated afterwards. This leads to a situation where I will distribute my own word list and the program will not require the original to work.
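To make the processing step concrete, here is a minimal sketch of how the annotations could be stripped. It assumes the word always precedes the first "/" (as in the two sample entries above); the function names are hypothetical and not part of any *spell tool.

```python
def strip_annotations(entry: str) -> str:
    """Return the bare word from an entry such as 'ripostar/#vi/XYL/'.

    Assumption: everything after the first '/' is annotation.
    """
    return entry.split("/", 1)[0].strip()


def build_word_list(entries):
    """Strip annotations, drop blanks and duplicates, and sort the result."""
    words = {strip_annotations(e) for e in entries}
    words.discard("")  # blank lines produce an empty word
    return sorted(words)


sample = ["ripostar/#vi/XYL/", "ritualizar/#vt/XYPLnc/", "ripostar/#vi/XYL/", ""]
print(build_word_list(sample))
```

The output of a step like this is what would then be curated by hand.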

Solution

It's fuzzy. You're going to need to talk to someone who is familiar with intellectual property law as it applies to copyright and word lists in your jurisdiction.

Lists of facts don't enjoy copyright protection in the United States. The court case for this is Feist v. Rural, which ruled that copyrightability is based on originality. In that case, a collection of phone numbers didn't meet the threshold of originality.

Note that in some other countries the governing principle is "sweat of the brow", under which the effort of compiling a collection can itself earn protection.

Word lists themselves are hazy. Scrabble has some history about a particular word list and who owns it. Is the selection of words minimally creative, or is it a rote compilation? I don't know, and you could argue it either way. If it isn't protectable by copyright, then the word list can't be licensed... though that gets back to the question of where it is protected by copyright.

For this, your best bet would be to contact an organization that can answer questions about the GPL. From http://www.fsf.org/about/contact/email:

For any questions about the GNU GPL, LGPL, AGPL and Free Documentation Licenses:
licensing@fsf.org

They would be able to give you about as authoritative an answer as you can get before you start paying money to a lawyer.

You may also wish to treat the file format as something that is open. Write a program that can handle any {a,i,my}spell dictionary format, but don't include the GPL files with it. Include a minimally useful dictionary with it. Other people can use other dictionaries if they so desire.
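As a rough sketch of that approach, the loader below reads whatever dictionary file the user points it at, so no GPL file has to ship with the program. It assumes a MySpell-style layout (an optional leading line holding an entry count, then one `word/FLAGS` entry per line); `load_words` and the default encoding are hypothetical choices, and real dictionaries may declare a different encoding in their affix file.

```python
from pathlib import Path


def load_words(dic_path, encoding="utf-8"):
    """Load bare words from a user-supplied *spell-style .dic file.

    Assumptions: one entry per line, flags follow the first '/',
    and the first line may be a plain entry count.
    """
    lines = Path(dic_path).read_text(encoding=encoding).splitlines()

    # Skip the leading entry count if the file has one.
    if lines and lines[0].strip().isdigit():
        lines = lines[1:]

    words = []
    for line in lines:
        word = line.split("/", 1)[0].strip()
        if word:  # ignore blank lines
            words.append(word)
    return words
```

A program built this way could ship with a small dictionary of its own and accept any other path at run time, for example `load_words("/path/to/user/dictionary.dic")`.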

You may find other word lists that have a BSD-compatible license.

OTHER TIPS

The following is from the "Copyright" file included in the Aspell distribution:

This English word list is comes directly from SCOWL 7.0 (up to level 60, using the speller/make-aspell-dict script, http://wordlist.sourceforge.net/) and is thus under the same copyright of SCOWL. The affix file (only included in the aspell6 package) is based on the Ispell one which is under the same copyright of Ispell. Part of SCOWL is also based on Ispell thus the Ispell copyright is included with the SCOWL copyright.

The collective work is Copyright 2000-2011 by Kevin Atkinson as well as any of the copyrights mentioned below:

This specifies the origins of the files. It then goes on to state in no uncertain terms that what you are trying to do is explicitly allowed:

Copyright 2000-2011 by Kevin Atkinson

Permission to use, copy, modify, distribute and sell these word
lists, the associated scripts, the output created from the scripts,
and its documentation for any purpose is hereby granted without fee,
provided that the above copyright notice appears in all copies and
that both that copyright notice and this permission notice appear in
supporting documentation. Kevin Atkinson makes no representations
about the suitability of this array for any purpose. It is provided
"as is" without express or implied warranty.

I admit that I had to dig for this info. It was not listed on the web site directly, but I figured it would be in the distribution files somewhere. That really is the authoritative place that has to contain copyright and license details.

So what’s your question? Would your work count as a derivative work from (work based on) a dictionary or not?

I am not a lawyer (and you should get one if you are concerned about possible legal issues), so you probably would not be interested in my opinion on that, but I really do not understand why it bothers you.

The copyleft of the GNU GPL protects free software against becoming bound by additional restrictions, not the other way around.

Since both modern¹ BSD licenses (2-clause and 3-clause) are compatible with the GNU GPL, i.e. their sets of restrictions are subsets of the GNU GPL's, there is nothing wrong with combining your code under one of them with GPL'ed material in a single product. For the sake of simplicity, the whole might and should be considered as covered by the GNU GPL, but that by no means affects the copyright conditions of your own part of the work alone: it remains under the terms of the lax permissive license you chose for it.


¹ I hope you are not going to use the 4-clause BSD License with its obnoxious advertising clause.

Licensed under: CC-BY-SA with attribution