Question

How would I go about cleaning the node_modules folder when prepping my code for deployment?

I am making an app using node-webkit and would prefer to include as few files as possible when bundling the final version of the app, as the unzip process takes some time.

I've looked at npm dedupe, and I use npm install --production to get rid of duplicates and fetch only production files. However, I am still left with readme files, benchmarks, tests and build files that I don't need.

What I would like to end up with for each module in the node_modules folder is a LICENSE file if it exists, the package.json and anything else that I need for the module to run, but nothing more.

The question: How to automatically clean a node_modules directory for a SCM commit was heading somewhat in the same direction, but it is talking about committing so not really what I am looking for.

The question: NPM clean modules again was somewhat along the same lines as mine, but not quite fully there.

This answer helps, as it is a more efficient version of dedupe for bundling the final app.

Update
I tried the custom module linked from here but it didn't seem to work correctly, even after some fiddling about.

With all that said, I haven't quite found the right answer yet.


Here's an example of what I am looking for.

In my project I currently have two dependencies: socket.io and socket.io-client.

Together they make up 15 MB with 550 files in 110 folders.

By manually removing readme files, makefiles, VC++ build files such as .pdb and .obj, and other unnecessary files, I was able to shrink it down to 2.74 MB with 265 files in 73 folders.
This is with just two modules.

I would like to find out if there is a way to do this automatically, preferably with npm.
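
To check how much any cleanup actually saves, the figures quoted above (size, file count, folder count) can be measured before and after. This is only a minimal sketch, assuming a POSIX shell with find, du and wc available; the function name modules_footprint is made up for this example.

```shell
# modules_footprint [dir]: print size, file count and folder count of a tree.
# Hypothetical helper name; defaults to ./node_modules.
# The body runs in a subshell ( ... ) so its variables do not leak out.
modules_footprint() (
  dir="${1:-node_modules}"
  # Count regular files; $(( )) strips the padding some wc versions add.
  files=$(( $(find "$dir" -type f | wc -l) ))
  # Count sub-folders, excluding the top-level directory itself.
  folders=$(( $(find "$dir" -type d | wc -l) - 1 ))
  # Disk usage in kilobytes.
  size_kb=$(du -sk "$dir" | cut -f1)
  echo "${size_kb} KB, ${files} files, ${folders} folders"
)
```

Running modules_footprint node_modules once before and once after a cleanup gives numbers that can be compared directly.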


Solution 2

Cleaning node_modules for the deployment of a node-webkit application is somewhat difficult, because whether a module inside node_modules comes with test files and other miscellaneous files depends on its owner. If the owner has declared an .npmignore file listing directories or files such as tests or examples, those are excluded from the package when the owner publishes the module, although they still exist in the (git) repository as normal.

Exclude test code in npm package?

The above is left in the module owner's hands: if he or she "forgets" to create one, the package will contain pretty much everything.


Note that not using the development package of socket.io or socket.io-client doesn't mean you have to run npm install socket.io --save-dev; a simple npm install socket.io -V would install the production package as it was uploaded by its owner.

A possible workaround would be to write a grunt task that cleans your entire node_modules the way you would like it to be.

A couple of rules would be to remove:

  • test or tests directories, or *test*.js files
  • build directories (not sure about these: they sometimes contain binaries that are necessary)
  • history.md
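
Before wiring rules like these into a grunt task, it may help to list what they would match. The following is only a sketch using find rather than grunt; the function name list_cleanup_candidates is hypothetical, and build directories are deliberately left out because of the binary concern mentioned in the list.

```shell
# list_cleanup_candidates [dir]: print paths matching the rules listed above.
# Nothing is deleted. Hypothetical helper; defaults to ./node_modules.
list_cleanup_candidates() {
  # First branch: directories named test or tests. -prune stops find from
  # descending into them, so their contents are not listed separately.
  # Second branch: *test*.js files and history.md (any letter case).
  # Caution: *test*.js also matches names such as latest.js.
  find "${1:-node_modules}" \
    \( -type d \( -name "test" -o -name "tests" \) -prune -print \) -o \
    \( -type f \( -name "*test*.js" -o -iname "history.md" \) -print \)
}
```

Reviewing this list by hand first makes it easier to spot a rule that is too broad for a particular dependency.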

GruntJS


I hope this helps somehow. Also take a look at Tilemill and how they deploy their application.

OTHER TIPS

This module attempts to intelligently clean up the node_modules folder:

modclean

Install:

npm install modclean -g

or

npm install modclean --save-dev

Usage:

modclean

It uses a set of default patterns to remove unnecessary bloat from modules throughout the dependency tree.

Maybe you are interested in this little find command, which I've assembled over time. Please be aware that this is not a "one-size-fits-all" solution! You need to check it very carefully against your requirements. It is intended for node.js environments and will definitely destroy browser environments. I run it in a bash script as a postinstall script in npm.

DO NOT BLINDLY COPY 'N PASTE. You have been warned!

find node_modules \( \( -name "dist" -or -name "ts" -or -name "logos" -or -name "min" -or -name "test*" -or -name "doc*" -or -name "tst" -or -name "example*" -or -name "build" -or -name "man" -or -name "benchmark*" \) -and -type d \) -or \
   \( \( -iname "readme*" -or -iname "changelog*" -or -iname "notice*" -or -iname "test*.js" -or -iname "*.min.js" \) -and -type f \) -or \
   \( -path "*moment-timezone/data/unpacked*" -and -type d \)

To be safe, I have left out the final line | xargs rm -rf. You can safely execute the above command without deleting anything, and later add the pipe to xargs with rm to make it really happen.
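
The "preview first, delete later" workflow can be sketched as below. This is not the exact command from this answer: it uses a shortened, illustrative pattern set, the function names are made up, and it swaps in -print0 with xargs -0 (available in GNU and BSD tools) so that paths containing spaces survive the pipe. It also adds -prune so find does not descend into directories that are about to be removed.

```shell
# find_bloat [dir]: emit NUL-terminated paths that match a reduced,
# illustrative pattern set. Hypothetical helper; defaults to ./node_modules.
find_bloat() {
  find "${1:-node_modules}" \
    \( -type d \( -name "test*" -o -name "doc*" -o -name "example*" \
                  -o -name "benchmark*" \) -prune -print0 \) -o \
    \( -type f \( -iname "readme*" -o -iname "changelog*" \) -print0 \)
}

# Dry run: show the matches one per line, delete nothing.
preview_bloat() {
  find_bloat "$1" | tr '\0' '\n'
}

# Real run: pass the same matches to rm. -- guards against odd file names.
remove_bloat() {
  find_bloat "$1" | xargs -0 rm -rf --
}
```

A typical session would call preview_bloat node_modules, read the output carefully, and only then call remove_bloat node_modules.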

What does the find command do? I'll explain it pattern by pattern.

  1. \( \( -name "dist" -or -name "ts" -or -name "logos" -or -name "min" -or -name "test*" -or -name "doc*" -or -name "tst" -or -name "example*" -or -name "build" -or -name "man" -or -name "benchmark*" \) -and -type d \) => search for directories that match on the text in the quotes. The * is a wildcard.

  2. \( \( -iname "readme*" -or -iname "changelog*" -or -iname "notice*" -or -iname "test*.js" -or -iname "*.min.js" \) -and -type f \) => search for files in any folder where the text in the quotes matches regardless of the case. Especially the pattern "*.min.js" could be dangerous for some people.

  3. \( -path "*moment-timezone/data/unpacked*" -and -type d \) => match the unpacked data directory of moment-timezone. Removing it also saves plenty of space.

Feel free to improve this!

Modules installed with NPM should never contain development files (i.e. benchmarks, tests etc.). If they are included you should contact the module maintainer and ask to have them added to .npmignore.

Note: Development files in this case means the files that are needed for development of the actual module and not your application.

It's been suggested before, but having a deploy task in grunt is probably a good idea. Just make sure to test the application after the cleanup. grunt-contrib-clean is great for cleaning.

See .npmignore from connect for some ideas on which files/directories shouldn't be in a production package.

I was looking into this for deploying to AWS Elastic Beanstalk. When I ran the eb deploy command, it was magically finding which files to upload, and it wasn't picking up any of the bower_components or node_modules. I wondered how.

It turns out that eb deploy invokes git archive under the hood. git archive checks out the branch you specify and makes a zip of all the files.

Presumably you don't commit things like your node_modules or bower_components directories to git, so git archive might be a solution to your problem. If you want to avoid things like test cases and README files and so on, you might still need to do some tagging in git. But you're starting at a substantially smaller list and you're obviously excluding the bulk of the stuff you want to exclude.
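
As a rough illustration of the git archive idea: the sketch below assumes that "tagging" can be read as git's export-ignore attribute, which is an interpretation and not something stated above. The helper name archive_head and the example .gitattributes entries are hypothetical. Note that this only filters files tracked by git, so it does not trim an untracked node_modules folder.

```shell
# archive_head <repo-dir> <output-file>: archive the committed HEAD of a repo.
# Hypothetical helper. git infers the format (.zip, .tar, ...) from the
# output file name. Use an absolute output path, because git -C changes
# into the repository before resolving relative paths.
archive_head() {
  git -C "$1" archive --output="$2" HEAD
}

# Example .gitattributes entries (assumed names; the file must be committed
# for git archive to honour it):
#   /test      export-ignore
#   /README.md export-ignore
```

Calling archive_head /path/to/repo /tmp/app.zip would then produce a zip of the tracked files minus anything marked export-ignore.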

I think you're looking for npm prune

npm prune [<name> [<name> ...]]

This command removes "extraneous" packages. If a package name is provided, then only packages matching one of the supplied names are removed.

Extraneous packages are packages that are not listed on the parent package's dependencies list.

Documentation https://docs.npmjs.com/cli/prune

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow