Question

We want to index our (fairly large collection of) git repositories using OpenGrok, and the one thing I haven't been able to figure out is how to index all the branches. From what I can see, it looks like I need to have checked-out copies of each branch that I want to index, so, if a repository has, say, a dozen branches, I need to have a dozen copies of it, one for each branch, e.g.,

git-repo-with-many-branches-master/
git-repo-with-many-branches-branch1/
git-repo-with-many-branches-branch2/
       :
git-repo-with-many-branches-branch12/

Is that really true? Or is there a way to tell OpenGrok to look at all the branches when creating its index?


Solution

The other layers in OpenGrok are designed to work with multiple SCM systems, many of which don't work the way git does, so unfortunately you have to check out each branch you want to index as a separate git repository :-(

You could always file an RFE for support for browsing multiple branches in a git repository.
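
As a rough illustration of the one-directory-per-branch layout described above, the sketch below clones a single branch into its own directory under the OpenGrok source root. The helper names (branch_dir_name, clone_branch) and the directory naming scheme are assumptions made for illustration, not something OpenGrok prescribes; --branch and --single-branch are standard git clone options.

```shell
# Sketch only: one clone per branch, each in its own directory.
# branch_dir_name and clone_branch are hypothetical helper names.

# Build a directory name such as "myrepo-branch1". Slashes in branch
# names (e.g. "feature/x") are replaced so the result is one directory.
branch_dir_name() {
  repo=$1
  branch=$2
  printf '%s-%s\n' "$repo" "$(printf '%s' "$branch" | tr '/' '_')"
}

# Clone only the given branch of $url into $src_root/<repo>-<branch>.
clone_branch() {
  url=$1
  repo=$2
  branch=$3
  src_root=$4
  dest="$src_root/$(branch_dir_name "$repo" "$branch")"
  if [ -d "$dest" ]; then
    echo "skipping $dest (already exists)"
    return 0
  fi
  git clone --branch "$branch" --single-branch "$url" "$dest"
}
```

For example, clone_branch ssh://yourgit.domain.com/Android Android branch1 /var/opengrok/src would create /var/opengrok/src/Android-branch1, assuming that URL and path match your setup.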

OTHER TIPS

I have written a script exactly for this purpose: daun. You will need to use the daun CLI instead of the git CLI in your cron job. Note that, at the time of writing, it does not support OpenGrok's git history, as we are only interested in OpenGrok's fast search capability. We point our OpenGrok users to tools like GitHub/Bitbucket for web-based git history.

Here is a bash script I wrote for doing exactly this. It will clone any new branches, update any existing branches, and delete branches that no longer exist. Here are instructions that "work"; you may choose to make things more secure, but this is good enough if your server is only accessible on your LAN. I have a cron job set up that runs it every 30 minutes on the server. To set up the cron job to run as root, run:

sudo crontab -e

Then paste in these contents:

*/30 * * * * /usr/local/bin/opengrok_index.sh

Then write and close:

:wq

You will need to install "expect", which the script uses to enter your SSH key's passphrase. One of these two commands will work, depending on which Linux distribution you are using:

sudo yum install expect
sudo apt-get install expect

Then create a file at /usr/local/bin/opengrok_index.sh:

sudo vi /usr/local/bin/opengrok_index.sh

Next, paste in the contents of the script (from the bottom of this post), and change the variables at the top according to your system. Next, change the permissions so only root can read it (it has passwords in it):

sudo chmod 700 /usr/local/bin/opengrok_index.sh

You probably want to test running the script manually and get it working, before expecting the cron job to work. This is a particular script that I wrote for my particular setup, so you may need to put in some echo statements and do some debugging to get it working correctly:

sudo /usr/local/bin/opengrok_index.sh
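
For the echo-statement debugging mentioned above, a small logging helper can make the output easier to follow. This is only a sketch; the function name log is a hypothetical choice and is not part of the script below.

```shell
# Sketch only: a logging helper that could be pasted near the top of
# opengrok_index.sh. The name "log" is a hypothetical choice.
log() {
  # Timestamped message on stderr so it does not mix with command output.
  printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >&2
}
```

You could then add calls such as log "updating branch $branch" at the points you want to inspect. To trace every command instead, bash's -x option prints each line as it runs: sudo bash -x /usr/local/bin/opengrok_index.sh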

Additional notes:

  • This script logs into Git over SSH (not HTTPS). As such, your GIT_USER must exist on the system and have an SSH key under /home/user/.ssh/id_rsa that has access to the Git repo. This is standard Git login stuff, so I won't go over it here. The script will enter the GIT_USER_SSH_PASSWORD when prompted.
  • The script checks out all files as GIT_USER, so you may need to "chown" your CHECKOUT_LOCATION to that user.
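
The two notes above can be checked before the first run with something like the following sketch. The helper names owned_by and check_prereqs are hypothetical, and the suggested chown command is printed rather than executed.

```shell
# Sketch only: pre-flight checks before the first run.
# owned_by and check_prereqs are hypothetical helper names.

# Succeeds if path $1 is owned by user $2 (user name or numeric ID).
owned_by() {
  [ -n "$(find "$1" -maxdepth 0 -user "$2" 2>/dev/null)" ]
}

# Report problems with the SSH key file ($1) and with the ownership
# of the checkout directory ($2) for the given user ($3).
check_prereqs() {
  key_file=$1
  checkout_dir=$2
  user=$3
  status=0
  if [ ! -f "$key_file" ]; then
    echo "missing SSH key: $key_file"
    status=1
  fi
  if ! owned_by "$checkout_dir" "$user"; then
    echo "$checkout_dir is not owned by $user; consider: sudo chown -R $user $checkout_dir"
    status=1
  fi
  return "$status"
}
```

For example: check_prereqs /home/user/.ssh/id_rsa /var/opengrok/src/ username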

Script:

#!/bin/bash

SUDO_PASSWORD="password"
CHECKOUT_LOCATION="/var/opengrok/src/"
GIT_PROJECT_NAME="Android"
GIT_USER="username"
GIT_USER_SSH_PASSWORD="password"
GIT_URL="yourgit.domain.com"
OPENGROK_BINARY_FILE="/usr/local/opengrok-0.12.1.6/bin/OpenGrok"

# Run command as GIT_USER which has Git access
function runGitCommand {
  git_command="$@"

  expect_command="
    spawn sudo -u $GIT_USER $git_command
    expect {
        \"*password for*\" {
            send \"$SUDO_PASSWORD\"
            send \"\r\"
            exp_continue
        }
        \"*Enter passphrase for key*\" {
            send \"$GIT_USER_SSH_PASSWORD\"
            send \"\r\"
            exp_continue
        }
    }"

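  # The failure check sits outside $(...): an "exit" inside the
  # command substitution would only leave the subshell.
  command_result=$(expect -c "$expect_command") || return 1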
}

# Checkout the specified branch over the network (slow)
function checkoutBranch {
  branch=$1

  # Check out branch if it does not exist
  if [ ! -d "$branch" ]; then
    runGitCommand git clone ssh://$GIT_URL/$GIT_PROJECT_NAME
    # Rename project to the branch name
    mv $GIT_PROJECT_NAME $branch || exit 1
  # Otherwise update the existing branch
  else
    cd $branch || exit 1
    runGitCommand git fetch
    runGitCommand git pull origin $branch || exit 1
    cd ..
  fi
}

# If the branch directory does not exist, copy the master
# branch directory then switch to the desired branch.
# This is faster than checking out over the network.
# Otherwise, update the existing branch directory
function updateBranch {
  branch=$1

  if [ ! -d "$branch" ]; then
    mkdir $branch || exit 1
    rsync -av master/ $branch || exit 1
    cd $branch || exit 1
    runGitCommand git checkout -b $branch origin/$branch
    cd ..
  else
    cd $branch || exit 1
    runGitCommand git pull origin $branch || exit 1
    cd ..
  fi
}

# Change to the OpenGrok indexing location to checkout code
cd $CHECKOUT_LOCATION || exit 1

# Check out master branch
checkoutBranch master

# Get a list of all remote branches
cd master || exit 1
old_ifs=$IFS
IFS=$'\n'
origin_branches=( $(git branch -r) )
IFS=$old_ifs
origin_branches_length=${#origin_branches[@]}
cd .. # Move out of "master" directory

# Loop through and check out all branches
branches=(master)
for origin_branch in "${origin_branches[@]}"
do
  # Strip the "origin/" prefix from the branch name
  branch=${origin_branch#*/}

  # Ignore the "HEAD" branch
  # Also skip master since it has already been updated
  if [[ $branch == HEAD* ]] || [[ $branch == master ]]; then
    continue
  fi

  branches+=("$branch")
  updateBranch $branch
done

# Get list of branches currently in OpenGrok
old_ifs=$IFS
IFS=$'\n'
local_branches=( $(ls -A1) )
size=${#local_branches[@]}
IFS=$old_ifs

# Get list of branches that are in OpenGrok, but do not exist
# remotely. These are branches that have been deleted
deleted_branches=()
for local_branch in "${local_branches[@]}"
do
  skip=0

  for branch in "${branches[@]}"
  do
    if [[ $local_branch == "$branch" ]]; then
      skip=1;
      break;
    fi
  done

  if [[ $skip == "0" ]]; then
    deleted_branches+=("$local_branch")
  fi
done

# Change to checkout directory again, in case some future code
# change brings us somewhere else. We are deleting recursively
# here and cannot make a mistake!
cd "$CHECKOUT_LOCATION" || exit 1
# Delete any branches that no longer exist remotely
for deleted_branch in "${deleted_branches[@]}"
do
  rm -rf "./$deleted_branch"
done

# Reindex OpenGrok
$OPENGROK_BINARY_FILE index
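
The branch-listing and deleted-branch steps in the script above can also be written as two small filters, which are easier to test in isolation. This is a sketch under assumptions: the helper names parse_remote_branches and stale_entries are hypothetical, and the input is expected to look like the output of git branch -r with a single remote.

```shell
# Sketch only: helper names are hypothetical.

# Read `git branch -r` style lines on stdin and print bare branch names,
# skipping the "origin/HEAD -> origin/master" pointer line.
parse_remote_branches() {
  while IFS= read -r line; do
    # Trim leading spaces.
    line=${line#"${line%%[! ]*}"}
    case "$line" in
      ''|*' -> '*) continue ;;
    esac
    # Drop the remote name ("origin/") but keep slashes inside branch names.
    printf '%s\n' "${line#*/}"
  done
}

# Print every name from the second list (one per line) that is absent
# from the first list, e.g. local directories with no matching remote branch.
stale_entries() {
  wanted=$1
  present=$2
  printf '%s\n' "$present" | while IFS= read -r name; do
    [ -n "$name" ] || continue
    printf '%s\n' "$wanted" | grep -Fxq -- "$name" || printf '%s\n' "$name"
  done
}
```

A possible use inside the master checkout would be remote=$(git branch -r | parse_remote_branches), followed by stale_entries "$remote" "$(ls -A1 "$CHECKOUT_LOCATION")" to list directories that are candidates for deletion. Review that list before passing it to rm -rf.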

I don't know anything about OpenGrok but of course you can change branches using Git:

git checkout master
# do the indexing here
git checkout branch1
# indexing
git checkout branch2
# and so on...
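
The same idea can be sketched as a loop. The helper name index_each_branch is hypothetical, and the indexing command is supplied by the caller, since this answer does not say how OpenGrok should be invoked.

```shell
# Sketch only: index_each_branch is a hypothetical helper name.
# Usage: index_each_branch <repo_dir> <index_command> <branch>...
index_each_branch() {
  repo_dir=$1
  index_cmd=$2
  shift 2
  for branch in "$@"; do
    # Switch the working tree; stop if the checkout fails.
    git -C "$repo_dir" checkout "$branch" || return 1
    # Run the caller-supplied indexing command with the branch name.
    "$index_cmd" "$branch" || return 1
  done
}
```

Whether the index built for an earlier branch survives the next checkout depends on how the indexing command is configured; the accepted answer above suggests OpenGrok expects a separate checkout per branch.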
Licensed under: CC-BY-SA with attribution