Question

I have code that is currently 'open source' but not easily accessible, since it is not available from a repository. The idea is to make it available via either SourceForge or GitHub, but I am willing to use whatever free site supports the requirements.

Project description

The project consists of code in many modules (Java packages under the org.pscode hierarchy), and some dependencies (e.g. 50 meg of cross-compilation plug-in, 5-10 meg of MP3..). The project contains many stand-alone applications as disparate as a cross-compilation compiler meant for developers, and a music juke box for end users. But there are also single classes (such as BigClip that can hold a large sound clip), that are useful in other applications that are not from the project itself. While some of the packages are effectively 'self contained' (e.g. JaNeLA), others are components used across a range of existing applications.

The entire project as it exists on my machine is already pushing 200 Meg.

As a developer who wanted to play a long clip, I would be hesitant to download 200+ megabytes of project just to get a class whose code is short enough to post on SO.

Code repository requirements

  • A code repository/sharing system that allows the user to download only the parts they require.
  • If the part they require depends on other separate parts of the main project, those parts are downloaded automatically.
  • Provide an automatic way to calculate (for display to the user) how much download is required for each 'sub' project. OK, that is not a requirement that will 'make or break' my choice, but it would be very handy.

I was looking at Git, since it seems to have many advantages over older systems such as CVS. I was reading Git For Eclipse Users and got to point 3 of Distributed Version Control Systems, which starts:

Given that there is no master repository, it becomes clear that the repository must live in its entirety on each of the nodes in the DVCS. ..

That worries me, but I am not sure I fully understand it, or whether there is another mechanism within Git to provide the behavior required.

Question(s)

Can Git fulfill the requirements stated above?

On another tack, is this a 'non-question' for most developers? If people will typically download 200 meg of project just to get 3 Kb of code to parse a file of Comma-separated values, then perhaps I am worrying over nothing.


Solution

You can use Git submodules. A submodule is a reference, stored in one Git repository, to another Git repository. Normally, cloning a Git repository copies everything in it. Submodules, however, let you split a large project into many separate repositories.

Note, however, that each submodule has to live in its own subdirectory, so if you want to split existing source code into submodules, you might need to move files around and perhaps adjust how the parts reference one another.

When you clone the main project, Git does not download the submodules automatically; you have to ask for them explicitly. It is possible, however, to fetch all of them in one step (for example with git submodule update --init --recursive). For more details, see:

Easy way pull latest of all submodules
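
The behaviour described above can be tried out locally before anything is published. The following is only a minimal sketch under stated assumptions: the repository names bigclip and pscode, the file BigClip.java, and the helper function g are made up for illustration, and everything happens in a temporary directory rather than on a hosting site.

```shell
#!/bin/sh
set -eu

# Scratch directory so the demo touches nothing else (all names are hypothetical).
work=$(mktemp -d)
cd "$work"

# Helper: run git with a throwaway identity and with local-path submodule
# transfers allowed (recent Git versions block them by default).
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# 1. A small stand-alone repository holding one reusable class.
g init -q bigclip
echo 'class BigClip {}' > bigclip/BigClip.java
g -C bigclip add BigClip.java
g -C bigclip commit -q -m 'Add BigClip'

# 2. The umbrella project, which references bigclip as a submodule.
g init -q pscode
echo 'umbrella project' > pscode/README
g -C pscode add README
g -C pscode commit -q -m 'Initial commit'
g -C pscode submodule --quiet add "$work/bigclip" bigclip
g -C pscode commit -q -m 'Add bigclip as a submodule'

# 3. A plain clone creates the submodule directory but leaves it empty.
g clone -q "$work/pscode" checkout
before=$(ls -A checkout/bigclip)
echo "after plain clone: [$before]"

# 4. The submodule content arrives only when it is requested explicitly.
g -C checkout submodule --quiet update --init bigclip
echo "after submodule update: [$(ls checkout/bigclip)]"
```

Someone who only wants the one class can clone the bigclip repository by itself and never touch the umbrella project. Someone who wants everything can combine steps 3 and 4 with git clone --recurse-submodules.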

OTHER TIPS

This is not really possible with Git itself. However, if you host your code on a Git hosting service such as GitHub, users can browse through your files and download just the ones they need. See Git vs Subversion:

Git requires you to clone the entire repository (including history) and create a working copy that mirrors at least a subset of the items under version control

Although Git's submodule feature can help with this problem, you must manage multiple repositories and ensure coordination between them. This can be somewhat messy if you are still mainly in development, and may require some refactoring.

Licensed under: CC-BY-SA with attribution