We're running MediaWiki 1.21 on Ubuntu 12.04.3 with the Lucene-search extension 2.1.3 (from its build.properties file).
I followed the instructions for a Single Host Setup (using ant to build the jar), and Setting Up Suggestions for the Search Box. Things seemed to be working just fine. However, new documents aren't being matched by the type-ahead search feature. Looking at the filesystem, I see that there are various items in the application's indexes directory:
$ cd /usr/local/search/lucene-search-2/indexes
$ ls -l
total 24
drwxr-xr-x 10 root root 4096 Aug 20 2013 import
drwxr-xr-x 7 root root 4096 Apr 14 06:42 index
drwxr-xr-x 2 root root 4096 Apr 14 06:41 search
drwxr-xr-x 9 root root 4096 Aug 20 2013 snapshot
drwxr-xr-x 2 root root 4096 Aug 20 2013 status
drwxr-xr-x 8 root root 4096 Aug 20 2013 update
We have a daily cron job that runs the Lucene-search build command, which dumps the wiki database as XML and then modifies files in the import and snapshot folders. I noticed that the job reads from the search folder, which contains symbolic links into the update folder:
$ ls -l search/
total 24
lrwxrwxrwx 1 root root 70 Feb 12 21:39 wikidb -> /usr/local/search/lucene-search-2/indexes/update/wikidb/20140212064727
lrwxrwxrwx 1 root root 73 Feb 12 21:39 wikidb.hl -> /usr/local/search/lucene-search-2/indexes/update/wikidb.hl/20140212064727
lrwxrwxrwx 1 root root 76 Apr 14 06:41 wikidb.links -> /usr/local/search/lucene-search-2/indexes/update/wikidb.links/20140414064150
lrwxrwxrwx 1 root root 77 Feb 12 21:39 wikidb.prefix -> /usr/local/search/lucene-search-2/indexes/update/wikidb.prefix/20140212064728
lrwxrwxrwx 1 root root 78 Feb 12 21:39 wikidb.related -> /usr/local/search/lucene-search-2/indexes/update/wikidb.related/20140212064713
lrwxrwxrwx 1 root root 76 Feb 12 21:39 wikidb.spell -> /usr/local/search/lucene-search-2/indexes/update/wikidb.spell/20140212064740
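The dates in this listing can also be checked with a script instead of by eye. A minimal sketch, assuming GNU find (as shipped with Ubuntu); the function name stale_links and the 7-day threshold are only illustrative:

```shell
#!/bin/sh
# stale_links DIR DAYS
# Print symlinks directly inside DIR whose own modification time is older
# than DAYS full days (find's usual -mtime +N rounding applies).
# Without -L, find tests the link itself, not the directory it points to.
stale_links() {
    find "$1" -maxdepth 1 -type l -mtime +"$2" -print
}

# Example (path taken from the listing above):
# stale_links /usr/local/search/lucene-search-2/indexes/search 7
```

Any link printed by this has not been re-pointed by a recent build.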
Only the wikidb.links entry is current. The others are a couple of months old, which makes me think I missed something in how our daily cron task is set up. Here's the job:
#!/bin/sh
log=/var/log/lucene-search-2-cron.log
(
echo "Building wiki lucene-search indexes ..."
cd /usr/local/search/lucene-search-2
./build
echo "Stopping the lsearchd service..."
service lsearchd stop
# ok, so stopping the service apparently doesn't mean that the processes are gone, whack them manually
# See tip on using the "[x]yz" character class option so you don't need the additional "grep -v xyz":
# http://stackoverflow.com/questions/3510673/find-and-kill-a-process-in-one-line-using-bash-and-regex
echo "Killing any lucene-search processes that didn't terminate..."
kill -9 $(ps -ef | grep '[l]search' | awk '{print $2}')
echo "Starting the lsearchd service..."
service lsearchd start
) > $log 2>&1
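A side note on the kill line in that job: when nothing matches, kill -9 is called with no PIDs and only prints a usage error into the log. That is harmless here, but a guarded variant keeps the log clean. A small sketch; the function name kill_pids is hypothetical and not part of Lucene-search:

```shell
#!/bin/sh
# kill_pids [PID...]
# Send SIGKILL to the given PIDs; when the list is empty, say so and return.
kill_pids() {
    if [ "$#" -eq 0 ]; then
        echo "No leftover processes to kill"
        return 0
    fi
    kill -9 "$@"
}

# Intended use in the cron job (the substitution is left unquoted on
# purpose so that each PID becomes its own argument):
# kill_pids $(ps -ef | grep '[l]search' | awk '{print $2}')
```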
And here's the service script /etc/init.d/lsearchd:
#!/bin/sh -e
### BEGIN INIT INFO
# Provides: lsearchd
# Required-Start: $syslog
# Required-Stop: $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 1
# Short-Description: Start the Lucene Search daemon
# Description: Provide a Lucene Search backend for MediaWiki. Copied by John Ericson from: http://ubuntuforums.org/showthread.php?t=1476445
### END INIT INFO
# Set to install directory of lucene-search. For example: /usr/local/lucene-search-2.1.3
LUCENE_SEARCH_DIR="/usr/local/search/lucene-search-2"
# Set username for daemon to run as. Can also use syntax "username:groupname" to also specify group for daemon to run as. For example: me:me
RUN_AS_USER="lsearchd"
OPTIONS="-configfile $LUCENE_SEARCH_DIR/lsearch.conf"
test -x $LUCENE_SEARCH_DIR/lsearchd || exit 0
test -n "$RUN_AS_USER" && CHUID_ARG="--chuid $RUN_AS_USER" || CHUID_ARG=""
if [ -f "/etc/default/lsearchd" ] ; then
. /etc/default/lsearchd
fi
. /lib/lsb/init-functions
case "$1" in
start)
cd $LUCENE_SEARCH_DIR
log_begin_msg "Starting Lucene Search Daemon..."
start-stop-daemon --start --quiet --oknodo --chdir $LUCENE_SEARCH_DIR --background $CHUID_ARG --exec $LUCENE_SEARCH_DIR/lsearchd -- $OPTIONS
log_end_msg $?
;;
stop)
log_begin_msg "Stopping Lucene Search Daemon..."
start-stop-daemon --stop --quiet --oknodo --retry 2 --chdir $LUCENE_SEARCH_DIR $CHUID_ARG --exec $LUCENE_SEARCH_DIR/lsearchd
log_end_msg $?
;;
restart)
$0 stop
sleep 1
$0 start
;;
reload|force-reload)
log_begin_msg "Reloading Lucene Search Daemon..."
start-stop-daemon --stop --signal 1 --chdir $LUCENE_SEARCH_DIR $CHUID_ARG --exec $LUCENE_SEARCH_DIR/lsearchd
log_end_msg $?
;;
status)
status_of_proc $LUCENE_SEARCH_DIR/lsearchd lsearchd && exit 0 || exit $?
;;
*)
log_success_msg "Usage: /etc/init.d/lsearchd {start|stop|restart|reload|force-reload|status}"
exit 1
esac
exit 0
Update #1:
I deleted the update directory and ran the build command manually from the console as root. As expected, it only generated the update/wikidb.links entry; none of the other folders exist. I reviewed my earlier setup notes and don't see anything different, so how did those folders get created, and how are they maintained?
Update #2:
I retraced my steps from the initial install and couldn't find anything I had missed. So, on a hunch, I stopped the service and ran lsearchd from the console, and it created the missing directories! I then terminated the process and tried again: I deleted the indexes folder and ran the cron script from the console as root. I confirmed that when run this way, lsearchd DID NOT create the missing directories. And now I remember that I had run lsearchd from the console during the initial setup, to verify that it was receiving client queries from the wiki's Search input field. Those are the indexes it has been using for lookups ever since, which explains why new documents are not included.
Here is what the command looks like when run as a service:
$ ps -ef | grep [l]search
lsearchd 10192 1 0 14:02 ? 00:00:00 /bin/bash /usr/local/search/lucene-search-2/lsearchd -configfile /usr/local/search/lucene-search-2/lsearch.conf
lsearchd 10198 10192 0 14:02 ? 00:00:01 java -Djava.rmi.server.codebase=file:///usr/local/search/lucene-search-2/LuceneSearch.jar -Djava.rmi.server.hostname=AMWikiBugz -jar /usr/local/search/lucene-search-2/LuceneSearch.jar -configfile /usr/local/search/lucene-search-2/lsearch.conf
So the remaining question is:
Why does lsearchd NOT create the directories when run as a service?
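One difference I can see in the output above: the directories under indexes are owned by root with mode drwxr-xr-x, while the service runs lsearchd as the lsearchd user. I don't know that this is the cause, but it is easy to check. A minimal sketch (the function name not_owned_by is mine) that lists directories a given user does not own:

```shell
#!/bin/sh
# not_owned_by USER DIR
# Print every directory at or below DIR that is not owned by USER.
# USER may be a login name or a numeric UID. Ownership is only a first
# hint: group and "other" permission bits also decide write access.
not_owned_by() {
    find "$2" -type d ! -user "$1" -print
}

# Example with the names from this post:
# not_owned_by lsearchd /usr/local/search/lucene-search-2/indexes
```

If this prints the index directories, the daemon account may simply be unable to create anything inside them, which would fit the behaviour I see when it runs as a service.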