Question

Is there a way to perform a full text search of a subversion repository, including all the history?

For example, I wrote a feature that I used somewhere, but then it wasn't needed, so I svn rm'd the files. Now I need to find it again to use it for something else. The svn log probably says something like "removed unused stuff", and there are loads of check-ins like that.

Edit 2016-04-15: Please note that what is asked here by the term "full text search", is to search the actual diffs of the commit history, and not filenames and/or commit messages. I'm pointing this out because the author's phrasing above does not reflect that very well - since in his example he might as well be only looking for a filename and/or commit message. Hence a lot of the svn log answers and comments.


Solution

git svn clone <svn url>
git log -G<some regex>
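As a rough sketch of how the two commands above fit together: git log -G lists commits whose diff adds or removes a line matching the regex, so it also finds text in files that were deleted later. The URL, the directory name repo-git and the helper name search_git_history are made up for illustration, and --stdlayout assumes the usual trunk/branches/tags layout.

```shell
# One-time conversion (hypothetical URL; --stdlayout assumes trunk/branches/tags):
#   git svn clone --stdlayout https://svn.example.com/repo repo-git

# search_git_history is a hypothetical helper name.
# $1 = path to the git-svn clone
# $2 = regex matched against lines added or removed by each commit
search_git_history() {
  git -C "$1" log --all --format='%h %s' -G"$2"
}
```

For example, search_git_history repo-git 'legacy_feature' prints one line per matching commit. Adding -p to the git log call would show the matching diffs as well.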

OTHER TIPS

svn log in Apache Subversion 1.8 supports a new --search option, so you can search the log messages in a Subversion repository's history without third-party tools or scripts.

svn log --search searches the author, date and log message text, and also the list of changed paths when --verbose is given. It does not search file contents.

See SVNBook | svn log command-line reference.
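A small sketch of how the option might be used, assuming Subversion 1.8 or later. The wrapper name search_log and the example patterns are made up; --search takes a glob pattern and --search-and narrows the previous pattern.

```shell
# search_log is a hypothetical wrapper around svn log --search.
# $1 = working copy path or repository URL
# $2 = glob pattern to look for
# $3 = optional second pattern that must match as well
search_log() {
  if [ -n "${3:-}" ]; then
    svn log --verbose --search "$2" --search-and "$3" "$1"
  else
    svn log --verbose --search "$2" "$1"
  fi
}
```

For example, search_log . 'remov*' alice would list revisions whose metadata matches both patterns. Quote the patterns so the shell does not expand them.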

If you are running Windows, have a look at SvnQuery. It maintains a full-text index of local or remote repositories. Every document ever committed to a repository gets indexed. You can do Google-like queries from a simple web interface.

I'm using a small shell script, but it only works for a single file. You can of course combine it with find to cover more files.

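#!/bin/bash
# Usage: <this script> <file or URL> <pattern>
# Prints every revision in which the file's contents match the pattern.
for REV in $(svn log -q "$1" | awk '/^r[0-9]+ /{print $1}'); do
  # svn accepts revision numbers with a leading "r", so $REV can be passed as-is
  if svn cat -r "$REV" "$1" | grep -q -- "$2"; then
    echo "$REV"
  fi
done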

If you really want to search everything, use the svnadmin dump command and grep through that.

The best way that I've found to do this is with less:

svn log --verbose | less

Once less displays the output, you can press / to search, as in Vim.

Edit:

According to the author, he wants to search more than just the messages and the file names. In that case you will have to hack it together with something like:

svn diff -r0:HEAD | less

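You can also substitute grep or another tool to do the searching for you. If you want to use this on a subdirectory of the repository, use svn log to find the first revision in which that directory existed, and use that revision instead of 0. Note that svn diff -r0:HEAD shows only the net difference between the two revisions, so text that was added and later removed again will not appear in it; svn log --diff prints the diff of every revision.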

I have been looking for something similar. The best I have come up with is OpenGrok. I have not tried to set it up yet, but it sounds promising.

svn log -v [repository] > somefile.log

To include the diffs as well, add the --diff option:

svn log -v --diff [repository] > somefile.log

then use vim or nano or whatever you like using, and do a search for what you're looking for. You'll find it pretty quickly.

It's not a fancy script or anything automated. But it works.
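If the saved log gets too long to search by hand, a filter along these lines could list only the revisions whose diff touches a matching line. This is a sketch, not a tested tool: the helper name revs_with_diff_match is made up, it assumes the usual svn log --diff layout (a "rNNN | author | date" header per revision followed by unified diffs), and a log message line that happens to start with + or - would also be matched.

```shell
# revs_with_diff_match is a hypothetical helper.
# Reads `svn log --diff` output on stdin; $1 = regex (awk syntax).
# Prints each revision whose diff adds or removes a matching line.
revs_with_diff_match() {
  awk -v pat="$1" '
    /^-+$/           { next }                       # separator line between log entries
    /^r[0-9]+ \| /   { rev = $1; seen = 0; next }   # header line starts a new revision
    /^(\+\+\+|---) / { next }                       # diff file headers, not content
    /^[+-]/ && !seen && $0 ~ pat { print rev; seen = 1 }
  '
}
```

Usage would be something like svn log --diff [repository] | revs_with_diff_match 'legacy_feature', or with the input redirected from somefile.log.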

While not free, you might take a look at Fisheye from Atlassian, the same folks that bring you JIRA. It does full text search against SVN with many other useful features.

http://www.atlassian.com/software/fisheye/

I was looking for the same thing and found this:

http://svn-search.sourceforge.net/

I just ran into this problem and

svnadmin dump <repo location> |grep -i <search term>

did the job for me. Returned the revision of the first occurrence and quoted the line I was looking for.
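Plain grep shows the matching line but not which revision or path it belongs to. A sketch of a filter that keeps track of both while reading the dump, assuming a non-delta dump (no --deltas) so that file contents appear as plain text. The helper name grep_dump is made up, and file contents that themselves contain "Revision-number:" or "Node-path:" lines would confuse it.

```shell
# grep_dump is a hypothetical helper.
# Reads an svnadmin dump stream on stdin; $1 = regex, matched case-insensitively.
# Prints "r<revision>: <node path>: <matching line>".
grep_dump() {
  LC_ALL=C awk -v pat="$1" '
    /^Revision-number: [0-9]+$/ { rev = $2; next }               # start of a revision record
    /^Node-path: /              { path = substr($0, 12); next }  # path of the current node
    tolower($0) ~ tolower(pat)  { print "r" rev ": " path ": " $0 }
  '
}
```

Usage would be something like svnadmin dump -q <repo location> | grep_dump '<search term>'.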

I don't have any experience with it, but SupoSE (open source, written in Java) is a tool designed to do exactly this.

svn log -l<commit limit> | grep -C<5 or more lines> <search message>

I wrote this as a cygwin bash script to solve this problem.

However, it requires that the search term currently exists in a file in the working copy. For every file that matches the filesystem grep, a grep of all the svn diffs for that file is then performed. Not perfect, but it should be good enough for most uses. Hope this helps.

/usr/local/bin/svngrep

#!/bin/bash
# Usage: svngrep $regex @grep_args

regex="$@"
pattern=`echo $regex | perl -p -e 's/--?\S+//g; s/^\\s+//;'` # strip --args
if [[ ! $regex ]]; then
    echo "Usage: svngrep \$regex @grep_args"
else 
    for file in `grep -irl --no-messages --exclude=\*.tmp --exclude=\.svn $regex ./`;     do 
        revs="`svnrevisions $file`";
        for rev in $revs; do
            diff=`svn diff $file -r$((rev-1)):$rev \
                 --diff-cmd /usr/bin/diff -x "-Ew -U5 --strip-trailing-cr" 2> /dev/null`
            context=`echo "$diff" \
                 | grep -i --color=none   -U5 "^\(+\|-\).*$pattern" \
                 | grep -i --color=always -U5             $pattern  \
                 | grep -v '^+++\|^---\|^===\|^Index: ' \
                 `
            if [[ $context ]]; then
                info=`echo "$diff" | grep '^+++\|^---'`
                log=`svn log $file -r$rev`
                #author=`svn info -r$rev | awk '/Last Changed Author:/ { print $4 }'`; 

                echo "========================================================================"
                echo "========================================================================"
                echo "$log"
                echo "$info"
                echo "$context"
                echo
            fi;
        done;
    done;
fi

/usr/local/bin/svnrevisions

#!/bin/sh
# Usage:  svnrevisions $file
# Output: list of fully numeric svn revisions (without the r), one per line

file="$@"
svn log "$file" 2> /dev/null | awk '/^r[[:digit:]]+ \|/ { sub(/^r/,"",$1); print  $1 }'

I usually do what Jack M says (use svn log --verbose) but I pipe to grep instead of less.

In case you are trying to determine which revision is responsible for a specific line of code, you are probably looking for:

svn blame

Credit: original answer

I came across this bash script, but I have not tried it.

Licensed under: CC-BY-SA with attribution