Lucene Reindex

Overview

This page documents the use of admin tools to rebuild the Lucene index. Indexes are created based on the Hibernate Search annotations on model objects, such as @Indexed, @IndexedEmbedded, @Field, and @ContainedIn.

Prerequisites

  • MySQL database must already exist (e.g. "umlsdb").  
    • If no data exists in the database, this tool will clear the indexes.
    • Otherwise, it will index the available data.
  • MySQL database connection parameters must be defined in the properties file specified by "run.config.umls".
  • The file system directory specified by the "hibernate.search.default.indexBase" property in the properties file specified by "run.config.umls" must exist.
  • Tomcat server must be stopped (i.e., "tstop") before the reindex and restarted after it is complete ("tstart"), UNLESS the admin tool is being used to rebuild all indexes via a REST call to the server.
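
The index directory prerequisite can be checked before starting a reindex. The following is a minimal sketch and not part of the admin tools: get_prop and check_index_base are hypothetical helper names, and the sketch assumes the properties file uses plain key=value lines without line continuations.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key from a Java-style properties file.
# Assumes simple "key=value" lines (no line continuations or escapes).
get_prop() {
  # $1 = properties file, $2 = property name
  # Escape the dots in the property name so they match literally.
  key=$(printf '%s' "$2" | sed 's/\./\\./g')
  sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$1" \
    | sed 's/[[:space:]]*$//' \
    | tail -n 1
}

# Hypothetical helper: verify that the index base directory named in the
# properties file exists, and print it on success.
check_index_base() {
  # $1 = properties file (the value passed as -Drun.config.umls)
  [ -f "$1" ] || { echo "config file not found: $1" >&2; return 1; }
  dir=$(get_prop "$1" hibernate.search.default.indexBase)
  [ -n "$dir" ] || { echo "hibernate.search.default.indexBase is not set in $1" >&2; return 1; }
  [ -d "$dir" ] || { echo "index directory does not exist: $dir" >&2; return 1; }
  echo "$dir"
}
```

For example, after sourcing the functions, "check_index_base /home/ec2-tomcat/config/config.properties" prints the index directory or reports which prerequisite is missing.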

Details

This tool is used to rebuild Lucene indexes for objects indexed by the system.

Indexes are written into the directory specified by the hibernate.search.default.indexBase property in the properties file specified by "run.config.umls".

Following are some details about the implementation of this tool:

Mojo: LuceneReindexMojo.java (in admin/mojo/src/main/resources/java/com/wci/umls/server/mojo)
Project: admin/lucene

Configuration Parameters

  • run.config.umls - the standard configuration file specified as a -D parameter
  • indexed.objects - optional parameter that allows a comma-separated list of objects to reindex (e.g. ConceptJpa).  If omitted, all indexes are rebuilt.
  • server - true/false indicating whether to run the mojo through the server
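
As an illustration of how these parameters combine on the command line, here is a small sketch. The function name reindex_cmd is hypothetical; it only prints the Maven command so that it can be reviewed before being run, and it uses only the -D parameters and the Reindex profile that appear on this page.

```shell
#!/bin/sh
# Hypothetical wrapper: print the mvn command for a reindex instead of running it.
reindex_cmd() {
  # $1 = path to the properties file (required)
  # $2 = comma-separated indexed.objects list (optional, empty = all indexes)
  # $3 = server flag, true/false (optional)
  config=${1:-}
  objects=${2:-}
  server=${3:-}
  if [ -z "$config" ]; then
    echo "usage: reindex_cmd <config.properties> [indexed.objects] [server]" >&2
    return 1
  fi
  cmd="mvn install -PReindex -Drun.config.umls=$config"
  if [ -n "$objects" ]; then
    cmd="$cmd -Dindexed.objects=$objects"
  fi
  if [ -n "$server" ]; then
    cmd="$cmd -Dserver=$server"
  fi
  echo "$cmd"
}
```

Calling "reindex_cmd /home/ec2-tomcat/config/config.properties ConceptJpa" prints a command that reindexes only ConceptJpa; omitting the second argument leaves out indexed.objects, so all indexes are rebuilt.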

Sample Instructions

Sample command line call of the admin tool to rebuild all indexes. This is generally what we want to do after a DB refresh. Check the relevant README.txt for details.

 

# stop the server
% service tomcat7 stop

# remove the indexes
% /bin/rm -rf /local/content/MEME/MEME5/ncim/data/indexes*

# change the config.properties path to match what is on your server
% cd ~/code/admin/lucene
% mvn install -PReindex -Drun.config.umls=/meme_work_local/ncim/config/config.properties

# give it some reasonable permissions
% chmod -R 777 /local/content/MEME/MEME5/ncim/data/indexes

# start the server
% service tomcat7 start

 

Sample command line call of the admin tool to rebuild all indexes via a REST call to the server (here the properties file given by run.config.umls is used to specify the base.url parameter, which is where the server is running):

 

% cd ~/code/admin/lucene
% mvn install -PReindex -Drun.config.umls=/home/ec2-tomcat/config/config.properties -Dserver=true

 

 

Sample command line call of the admin tool to reindex only certain data types:

# stop the server
% service tomcat7 stop

# Re-indexes only concepts and tracking records
% cd ~/code/admin/lucene
% mvn install -PReindex -Drun.config.umls=/home/ec2-tomcat/config/config.properties -Dindexed.objects=ConceptJpa,TrackingRecordJpa

# start the server
% service tomcat7 start

 

 

Sample Eclipse run configuration to rebuild all indexes: (screenshot not included in this text version)

Troubleshooting

Occasionally it is necessary to completely reindex all objects from scratch. When this is needed, the recommended approach is to remove the index files from the index directory and then run this mojo. For example:

# stop the server
% service tomcat7 stop
 
# remove the index
% /bin/rm -rf /var/lib/tomcat7/indexes/*
 
# rebuild the index
% cd ~/code/admin/lucene
% mvn install -PReindex -Drun.config.umls=/home/ec2-tomcat/config/config.properties
 
# start the server
% service tomcat7 start
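
Because the "rm -rf" step above is destructive if the path is mistyped or empty, it can be wrapped in a small guard. This is a sketch only: clear_index_dir is a hypothetical helper, not part of the admin tools, and it relies on the -mindepth and -maxdepth options of find, which GNU and BSD find provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: delete the contents of an index directory while
# keeping the directory itself, refusing obviously dangerous arguments.
clear_index_dir() {
  dir=${1:-}
  case "$dir" in
    ""|"/")
      echo "refusing to clear '$dir'" >&2
      return 1
      ;;
  esac
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  # Remove every entry directly under the directory (including dotfiles),
  # recursing into subdirectories such as per-object index folders.
  find "$dir" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
}
```

With the server stopped, "clear_index_dir /var/lib/tomcat7/indexes" would take the place of the "/bin/rm -rf" line in the steps above.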