Question

The broad question is: what licensing terms apply to a classifier that is trained on openly licensed annotated data (or manually annotated data)?

  1. I am trying to train a dependency parser for German text using annotated data that is licensed under a Creative Commons license (Attribution, CC BY). To train the classifier, I want to use a machine learning tool that is licensed under the Apache license.

    Is it legally permissible to license the resulting classifier (my code and the model file) under a commercial license?

  2. Suppose I scrape text from the web, or alternatively download a corpus collection that is licensed under CC BY (Attribution), annotate it with an open-source annotation tool licensed under the Apache license, and train a classifier with Apache-licensed machine learning software.

    Will it be legally permissible to license the resulting classifier under commercial terms?


Solution

Well, CC BY says you can:

Adapt — remix, transform, and build upon the material for any purpose, even commercially.

The Apache Licence is more complicated, but unless you are distributing your annotator, it's not a problem.

I would say that your trained model would constitute a transformation of the corpus collection. But as it's licensed under CC BY, you are fine to sell it, as long as you give the attribution the licence requires.

Licensed under: CC-BY-SA with attribution