Question

I am looking for a way to automatically collect the licences of all the third-party libraries that my project uses. Currently I collect the licences by hand from GitHub.

So far, I don't have a clear idea of how to get a third-party library's licence automatically. What is the most reliable way to get it?

A few ideas:

  • Most GitHub projects contain a licence text, e.g. https://github.com/square/dagger. But can you map a dependency such as 'com.squareup.dagger:dagger:1.2.2' to its GitHub URL?

  • Most JVM artifacts can be found on mvnrepository.com. I don't know whether mvnrepository.com lists the licence.

  • The .jar files may contain a licence text. How can it be extracted?

Related: What is the best practice for arranging third-party library licenses "paperwork"?


Solution

One possible way to automate part of this is the following algorithm (GAV stands for the Maven coordinates groupId:artifactId:version):

Add the project GAV to queue
For each GAV in queue 
  Add all dependencies from GAV to queue // optional after first run? 
  Download jar
  Extract/unzip jar and search root directory of jar for file containing "license" // see Java zip classes
  Parse root pom.xml for license information
  if neither works
     output that license information could NOT be found
  else
     save license information for GAV
// end for loop       
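
The two lookup steps of the loop (searching the jar for a license file and reading the license entries from the pom.xml) could look roughly like this in Python. This is only a minimal sketch: the function names are made up for illustration, and also checking META-INF/ (not just the jar root) and matching the British spelling "licence" are my own assumptions on top of the algorithm above. It also does not follow parent POMs, which may be where an artifact's licenses are actually declared.

```python
import xml.etree.ElementTree as ET
import zipfile


def license_files_in_jar(jar):
    """Return jar entries named like a license file.

    `jar` is a path or a binary file-like object. Only the jar root and
    META-INF/ are searched (META-INF/ is an assumption, not part of the
    algorithm above).
    """
    hits = []
    with zipfile.ZipFile(jar) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            parent, _, base = name.rpartition("/")
            looks_like_license = any(
                key in base.lower() for key in ("license", "licence")
            )
            if parent in ("", "META-INF") and looks_like_license:
                hits.append(name)
    return hits


def _local_name(tag):
    """Strip an XML namespace such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def licenses_from_pom(pom_xml):
    """Return (name, url) pairs from <project><licenses><license> in a POM."""
    root = ET.fromstring(pom_xml)
    found = []
    for child in root:
        if _local_name(child.tag) != "licenses":
            continue
        for entry in child:
            if _local_name(entry.tag) != "license":
                continue
            fields = {_local_name(f.tag): (f.text or "").strip() for f in entry}
            found.append((fields.get("name", ""), fields.get("url", "")))
    return found
```

If both functions return an empty list for a GAV, that is the "license information could NOT be found" case.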

You could create a Maven plugin that does this and writes the resulting file to the root directory of your project (instead of a build directory), so that you notice when the file changes. Otherwise, a Perl/Python script might be easier (but also more of a hack :) ).
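
If you go the script route, the "Download jar" step mostly comes down to turning a GAV into a repository URL. Here is a rough sketch, assuming the artifact is hosted on Maven Central and follows the standard repository layout; `artifact_url` and `download` are names I made up for this example, and artifacts from other repositories would need a different `base_url`.

```python
import urllib.request

# Assumption: the artifact lives on Maven Central.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def artifact_url(gav, extension="jar", base_url=MAVEN_CENTRAL):
    """Build the repository URL for a 'groupId:artifactId:version' string."""
    group_id, artifact_id, version = gav.split(":")
    group_path = group_id.replace(".", "/")  # dots become directories
    file_name = f"{artifact_id}-{version}.{extension}"
    return f"{base_url}/{group_path}/{artifact_id}/{version}/{file_name}"


def download(url):
    """Fetch a URL and return its raw bytes (no retries or caching)."""
    with urllib.request.urlopen(url) as response:
        return response.read()
```

For the dependency from the question, `artifact_url("com.squareup.dagger:dagger:1.2.2", "pom")` gives `https://repo1.maven.org/maven2/com/squareup/dagger/dagger/1.2.2/dagger-1.2.2.pom`.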

Given that it's easy to use transitive dependencies in your code without knowing it, you should also look at using the Ban Transitive Dependencies rule of the Maven Enforcer plugin.

If you do not do that, then I would definitely make sure to scan all transitive dependencies for their licenses (i.e. always run the third line of the algorithm, the one that adds each GAV's dependencies to the queue).
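
Scanning all transitive dependencies is essentially a breadth-first walk over the dependency graph. A minimal sketch is below; `direct_dependencies` is a hypothetical callable that you would have to supply yourself (for example by parsing each artifact's pom.xml), and the `seen` set is my addition so that a GAV reachable through several paths is only processed once.

```python
from collections import deque


def collect_transitive(root_gav, direct_dependencies):
    """Return root_gav and every GAV reachable from it, in visiting order.

    direct_dependencies: hypothetical callable mapping a GAV string to an
    iterable of the GAV strings it depends on directly.
    """
    seen = {root_gav}
    queue = deque([root_gav])
    order = []
    while queue:
        gav = queue.popleft()
        order.append(gav)
        for dep in direct_dependencies(gav):
            if dep not in seen:  # avoid re-queueing shared or cyclic deps
                seen.add(dep)
                queue.append(dep)
    return order
```

Each GAV in the returned list is then a candidate for the license lookup.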

Licensed under: CC-BY-SA with attribution