Question

I'm working on a Python program that makes heavy use of eggs (Plone). That means there are 198 directories full of Python code I might want to search through while debugging. Is there a good way to search only the .py files in only those directories, avoiding unrelated code and large binary files?

Solution

find DIRECTORY -name "*.py" -print0 | xargs -0 grep PATTERN

By the way, since writing this, I have discovered ack, which is a much better solution.

(And since that edit, I have discovered ag).

OTHER TIPS

I would strongly recommend ack, a grep substitute "aimed at programmers with large trees of heterogeneous source code" (from the website).

grep -r -n "PATTERN" --include="*.py" DIRECTORY
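If you type that a lot, it may be worth wrapping it in a small shell function. This is only a sketch: the name pygrep is made up, and it assumes a grep that supports --include and -I (skip binary files), which GNU and BSD grep both do.

```shell
# pygrep PATTERN [DIRECTORY...]   (hypothetical helper, not a standard tool)
# Recursively search only *.py files, printing file names and line numbers.
pygrep() {
    if [ "$#" -lt 1 ]; then
        echo "usage: pygrep PATTERN [DIRECTORY...]" >&2
        return 2
    fi
    pygrep_pattern=$1
    shift
    # Search the current directory when no directory is given.
    if [ "$#" -eq 0 ]; then
        set -- .
    fi
    # -I ignores binary files; --include restricts the search to Python sources.
    grep -r -n -I --include='*.py' -e "$pygrep_pattern" "$@"
}
```

Usage would then look like: pygrep IPortletDataProvider ~/Plone/buildout-cache/eggs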

I also use ack a lot these days. I did tweak it a bit to find all the relevant file types:

# Add zcml to the xml type:
--type-add
xml=.zcml

# Add more file extensions to the plone type:
--type-add
plone=.dtml,.zpt,.kss,.vpy,.props

# buildout config files
--type-set
buildout=.cfg

# Add our page templates to the html type so we can limit our search:
--type-add
html=.pt,.zpt

# Create txt file type:
--type-set
txt=.txt,.rst

# Define i18n file types:
--type-set
i18n=.pot,.po

# More options
--follow
--ignore-case
--nogroup

Keep in mind that ack won't find a file if its extension isn't in ack's configuration. Run "ack --help-types" to see all the available types.

I also assume you are using omelette so you can grep/ack/find all the related files?

This problem was the motivation for the creation of collective.recipe.omelette. It is a buildout recipe which can symlink all the eggs from your working set into one directory structure, which you can point your favorite search utility at.
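Since the omelette is built out of symlinks, whatever you search it with has to follow them. A minimal sketch, assuming the buildout part is named omelette and the tree is in the default parts/omelette location (adjust the path if yours differs); the function name omgrep is made up:

```shell
# omgrep PATTERN [OMELETTE_DIR]   (hypothetical helper, not part of the recipe)
# Search the *.py files reachable through the omelette's symlinks.
omgrep() {
    if [ "$#" -lt 1 ]; then
        echo "usage: omgrep PATTERN [OMELETTE_DIR]" >&2
        return 2
    fi
    # -L tells find to follow symlinks; parts/omelette is an assumed default.
    # grep -H prints the file name even when only one file is passed,
    # -n adds line numbers, -I skips binary files.
    find -L "${2:-parts/omelette}" -type f -name '*.py' \
        -exec grep -H -n -I -e "$1" {} +
}
```

Run from the buildout directory, "omgrep IPortletDataProvider" would then search every egg in the working set in one go.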

find <directory> -name '*.py' -exec grep -H <pattern> {} +

There's also GNU idutils if you want to grep for identifiers in a large source tree very very quickly. It requires building a search database in advance, by running mkid (and tweaking its config file to not ignore .py files). z3c.recipe.tag takes care of that, if you use buildout.

Just in case you want a non-commandline OSS solution...

I use PyCharm. It has built-in support for buildout. You point it at a buildout-generated bin/instance and it sets the project's external dependencies to all the eggs used by the instance. Then all the IDE's introspection and code navigation work nicely: go to definition, go to instances, refactoring support and, of course, search.

I recommend grin for searching, omelette when working with Plone, and the PyDev 'Globals browser' feature (with Eclipse or Aptana Studio).

And simply because there are not enough answers...

If you're developing routinely, it's well worth the effort to install Eclipse with PyDev (or, even easier, Aptana Studio, which is a modified Eclipse), in which case the find tools are right there.

My grepping life is way more satisfying since discovering Emacs' rgrep command.

Say I want to find 'IPortletDataProvider' in Plone's source. I do:

  1. M-x rgrep
  2. Emacs prompts for the search string (IPortletDataProvider)
  3. ... then which files to search (*.py)
  4. ... then which directory (~/Plone/buildout-cache/eggs). If I'm already editing a file, this defaults to that file's directory, which is usually exactly what I want.

The results appear in a new buffer. At the top is the find | xargs grep command Emacs ran. All matches are highlighted. I can search the buffer using the standard text search commands. Best of all, I can hit Enter (or click) on a match to open that file.

It's a pretty nice way to work. I like that I don't have to remember find | xargs grep argument sequences, but that all that power is there if I need it.

OpenGrok is an excellent choice for source searching and navigation. Runs on Java, though.

I really wish there were something like http://opengrok.plone.org/

Licensed under: CC-BY-SA with attribution