Lucene Reindex
Overview
This page documents the use of admin tools to rebuild the Lucene index. Indexes are created based on the hibernate-search annotations on model objects (for example @Indexed, @IndexedEmbedded, @Field, and @ContainedIn).
Prerequisites
- MySQL database must already exist (e.g. "umlsdb").
- If no data exists in the database, this tool will clear the indexes.
- Otherwise, it will index the available data.
- MySQL database connection parameters must be defined in the properties file specified by "run.config.umls".
- The file system directory specified by the "hibernate.search.default.indexBase" property in the properties file specified by "run.config.umls" must exist.
- Tomcat server must be stopped (i.e., "tstop") before the reindex and restarted after it is complete ("tstart"), UNLESS the admin tool is being used to rebuild all indexes via a REST call to the server.
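The file and directory prerequisites above can be checked before the tool is run. The following is a minimal sketch and not part of the project: the function names read_prop and check_index_base are hypothetical, and it assumes the properties file uses simple key=value lines with no line continuations or escapes.

```shell
# Hypothetical helper: print the value of a key from a simple key=value
# properties file (line continuations and escapes are not handled).
read_prop() (
  file=$1
  key=$2
  # Select the first line that begins with "key=" and print what follows.
  # index() compares literally, so dots in the key are not wildcards.
  awk -v k="$key" 'index($0, k "=") == 1 { print substr($0, length(k) + 2); exit }' "$file"
)

# Hypothetical preflight check: succeeds only if the properties file exists
# and the directory named by hibernate.search.default.indexBase exists.
check_index_base() (
  config=${1:-}
  [ -f "$config" ] || { echo "missing properties file: $config" >&2; exit 1; }
  base=$(read_prop "$config" hibernate.search.default.indexBase)
  [ -n "$base" ] || { echo "hibernate.search.default.indexBase is not set" >&2; exit 1; }
  [ -d "$base" ] || { echo "index directory does not exist: $base" >&2; exit 1; }
  # Print the resolved directory so callers can reuse it.
  echo "$base"
)
```

For example, "check_index_base /home/ec2-tomcat/config/config.properties" would print the index directory and return success only when both prerequisites hold. The database and Tomcat prerequisites are not covered by this sketch.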
Details
This tool is used to rebuild Lucene indexes for objects indexed by the system.
Indexes are written into the directory specified by the hibernate.search.default.indexBase property in the properties file specified by "run.config.umls".
Following are some details about the implementation of this tool:
Mojo: LuceneReindexMojo.java (in admin/mojo/src/main/resources/java/com/wci/umls/server/mojo)
Project: admin/lucene
Configuration Parameters
- run.config.umls - the standard configuration file, specified as a -D parameter
- indexed.objects - optional parameter that takes a comma-separated list of objects to reindex (e.g. ConceptJpa). If omitted, all indexes are rebuilt.
- server - true/false indicating whether to run the mojo through the server
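To show how these parameters combine on the command line, here is a small sketch that assembles the mvn call from them. The function name reindex_cmd is hypothetical and only prints the command; the -PReindex profile and the -D property names are the ones used in the sample instructions on this page.

```shell
# Hypothetical helper: print the mvn command for a reindex run.
#   $1 = path to the properties file (run.config.umls), required
#   $2 = comma-separated object list (indexed.objects), optional
#   $3 = "true" to run the mojo through the server (server), optional
reindex_cmd() (
  config=${1:-}
  objects=${2:-}
  server=${3:-}
  if [ -z "$config" ]; then
    echo "usage: reindex_cmd CONFIG [OBJECTS] [SERVER]" >&2
    exit 1
  fi
  cmd="mvn install -PReindex -Drun.config.umls=$config"
  # indexed.objects is optional; when it is omitted, all indexes are rebuilt.
  if [ -n "$objects" ]; then
    cmd="$cmd -Dindexed.objects=$objects"
  fi
  # server=true runs the reindex via a REST call to the running server.
  if [ "$server" = "true" ]; then
    cmd="$cmd -Dserver=true"
  fi
  echo "$cmd"
)
```

Calling "reindex_cmd /home/ec2-tomcat/config/config.properties ConceptJpa,TrackingRecordJpa" prints the command without running it, which makes it easy to review before executing it from the admin/lucene project directory.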
Sample Instructions
Sample command line call of the admin tool to rebuild all indexes. This is generally what we want to do after a DB refresh. Check the relevant README.txt for details.
# stop the server
% service tomcat7 stop
# remove the indexes
% /bin/rm -rf /local/content/MEME/MEME5/ncim/data/indexes*
# change the value of config.properties to match what is on your server
% cd ~/code/admin/lucene
% mvn install -PReindex -Drun.config.umls=/meme_work_local/ncim/config/config.properties
# give it some reasonable permissions
% chmod -R 777 /local/content/MEME/MEME5/ncim/data/indexes
# start the server
% service tomcat7 start
Sample command line call of the admin tool to rebuild all indexes via a REST call to the server (here the properties file given by run.config.umls supplies the base.url parameter, which identifies where the server is running):
% cd ~/code/admin/lucene
% mvn install -PReindex -Drun.config.umls=/home/ec2-tomcat/config/config.properties -Dserver=true
Sample command line call of the admin tool to reindex only certain data types:
# stop the server
% service tomcat7 stop
# Re-indexes only concepts and tracking records
% cd ~/code/admin/lucene
% mvn install -PReindex -Drun.config.umls=/home/ec2-tomcat/config/config.properties -Dindexed.objects=ConceptJpa,TrackingRecordJpa
# start the server
% service tomcat7 start
Sample Eclipse run configuration to rebuild all indexes:
Troubleshooting
Occasionally it is necessary to completely reindex all objects from scratch. When this is needed, the recommended approach is to remove the index files from the index directory and then run this mojo. For example:
# stop the server
% service tomcat7 stop
# remove the index
% /bin/rm -rf /var/lib/tomcat7/indexes/*
# rebuild the index
% cd ~/code/admin/lucene
% mvn install -PReindex -Drun.config.umls=/home/ec2-tomcat/config/config.properties
# start the server
% service tomcat7 start
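Because the removal step uses rm -rf on a path, a scripted version benefits from a guard against an empty or wrong directory. The following is a minimal sketch under stated assumptions: the function name clear_indexes is hypothetical, and it relies on the -mindepth and -delete options available in GNU and BSD find.

```shell
# Hypothetical helper: delete the contents of an index directory, but refuse
# to run when the argument is empty, is "/", or is not an existing directory.
clear_indexes() (
  dir=${1:-}
  case "$dir" in
    ""|"/") echo "refusing to clear '$dir'" >&2; exit 1 ;;
  esac
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; exit 1; }
  # -mindepth 1 keeps the directory itself and removes everything inside it,
  # including hidden files that a "$dir"/* glob would not match.
  find "$dir" -mindepth 1 -delete
)
```

With the server stopped, "clear_indexes /var/lib/tomcat7/indexes" would take the place of the rm step in the example above; the directory itself is kept, so the prerequisite that the index directory exists still holds when the mojo runs.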
References/Links