SPRINT 5 - Finish 9/30/2015

Overview

Sprint to extend some of the "read only" APIs, to finalize unit/integration testing for existing functionality, to extend the interface, to develop additional loaders, and to begin developing editing capabilities.

Priority things

For "semantic type" mechanism - if there are tree numbers, order by them and indent
DONE: owl deployment
JFW: Refresh server deployments
- UMLS, SNOMED, SNOMED VET, ICD (reload on machine)
DONE: JFW: QA queries
- referential integrity
- Verify for each type of loader and the sample data.
JFW: Jenkins - mojo tests
- REST integration tests
BAC: OWL - NCIt
BAC: consider adding top-level hierarchies as "semantic types" to concepts for loaders in general
- This could be a function of the treeposition or transitive closure computers.
- the corresponding metadata would have to be created as well
BAC: remove "void addXXX" and "void removeXXX" methods from model objects. Use getObjects().add/remove(...) only
DONE: Terminology sampler mojo - bring over from old term server project.
Model
- DONE: General class axioms
- Mapping, MapRecord, MapEntry (each with attributes) - similar to Subset, SubsetMember except with one more layer.
  - Then implement this for RRF loader
  - Then implement this for RF2 loader (snapshot)
  - Then implement this for RF2 loader (delta)
User Interface
- Enable glass pane while switching tabs - e.g. increment the glass pane count in the tab controller and decrement it at the end of the controller.
Integration Tests - Jpa/REST - get them tested again
DONE: support "mode" parameter on loaders to automatically recreate db and clear indexes.
Have search handler work for all searching, including relationships and trees. - separate methods to build query?? - probably
DONE: Improve search
- Create "SearchHandler" as an extension point (like graph resolution handler). Have a "default implementation"
- Search algorithm
  - First search on exact string (e.g. "literal" search)
    - for short strings containing special characters (e.g. "!" or "+" or "Ca+"), consider doing a literal search
      - alternative is to save untokenized forms of all strings for exact searching
    - e.g. for the query "+" use a lucene query like: terminology:SNOMEDCT AND version:latest AND atoms.nameSort:"\+" (not sure if you have to escape the + if it is in quotes).
  - Then search on matches - this is the normal search
  - The trick here is to combine the results from the first search and the second search into a single list, where the ordered results from the first search are at the head of the list followed by the ordered results of the second search - but with any duplicates removed.
    - could consider "(literal search) OR (normal search)" as a single query.
    - This is better because then lucene can do the paging
  - if no results, then do spelling correction and (then) acronym expansion and search again
    - Use Lucene SpellChecker class for this.
    - config/src/main/resources/data/spelling.txt
    - config/src/main/resources/data/acronym.txt
    - For obtaining words for spelling correction, use "FieldedStringTokenizer" with " \t" as delimiter
      - (consider this delimiter list later: " \t-({[)}]_!@#%&*\\:;\"',.?/~+=|<>$`^")
    - For obtaining words for acronym expansion, split only on " \t"
  - if no results, then try putting * after each term and search again
- should autocomplete algorithm include acronym expansion? NO
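
The merge and fallback steps of the search algorithm above can be sketched in plain Java, with no Lucene dependency. This is only an illustration under assumptions: the class and method names (SearchFallbackSketch, mergeResults, expandAcronyms, addWildcards) are hypothetical, results are represented as plain id strings, and the acronym map stands in for whatever acronym.txt contains. The spelling correction step is left out because it would rely on the Lucene SpellChecker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the actual SearchHandler implementation.
class SearchFallbackSketch {

  /** Literal hits first, then normal hits, with duplicates removed and order kept. */
  static List<String> mergeResults(List<String> literalHits, List<String> normalHits) {
    Set<String> merged = new LinkedHashSet<>(literalHits);
    merged.addAll(normalHits); // ids already present keep their earlier position
    return new ArrayList<>(merged);
  }

  /** Splits only on space and tab, replacing each word found in the acronym map. */
  static String expandAcronyms(String query, Map<String, String> acronyms) {
    List<String> out = new ArrayList<>();
    for (String word : query.split("[ \t]+")) {
      if (word.isEmpty()) {
        continue; // leading whitespace yields an empty first token
      }
      out.add(acronyms.getOrDefault(word, word));
    }
    return String.join(" ", out);
  }

  /** Last fallback: put * after each term. */
  static String addWildcards(String query) {
    List<String> out = new ArrayList<>();
    for (String word : query.split("[ \t]+")) {
      if (word.isEmpty()) {
        continue;
      }
      out.add(word + "*");
    }
    return String.join(" ", out);
  }

  public static void main(String[] args) {
    Map<String, String> acronyms = new HashMap<>();
    acronyms.put("MI", "myocardial infarction"); // made-up sample entry
    System.out.println(mergeResults(Arrays.asList("a", "b"), Arrays.asList("b", "c")));
    System.out.println(expandAcronyms("MI risk", acronyms));
    System.out.println(addWildcards("heart attack"));
  }
}
```

If the "(literal search) OR (normal search)" single-query option is chosen instead, the merge step is unnecessary and lucene handles ordering and paging; the two fallback helpers would still apply.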

Marketing/SEO

Create a video demo of the site (camtasia) and post as a link on the header (video glyphicon if there is one)
Promote "ICD10" browser on various lists
Ensure all entry pages for applications contain SEO text for browsers and are indexed by google
Verify all entry pages for applications are officially mobile friendly
Consider advertising SNOMED/UMLS/ICD browsers on google adwords.
Training Video for UMLS browser (need 10 min)
Training Video for SNOMED browser (simple 5 min)

User Interface Enhancements

Add features for "deep" relationships when browsing UMLS.
- Need a generalized way to know when to use this
- it is definitely only for "concept"
- It may be that if any "atoms" of the concept don't match the terminology, then we show it.
Reimplement the component report as a directive (with service callbacks for history and other features)
Implement routing for terminologyId/terminology/version/type so we can preserve URLs. Can even include ?query=... for the query
- Also clean up the way routes work first so we have
  - https://umls.terminology.tools/content
- instead of
  - https://umls.terminology.tools/#/content
- This is likely related to the starting URL redirected to https://umls.terminology.tools/#/
- See http://stackoverflow.com/questions/14319967/angularjs-routing-without-the-hash
- e.g. https://umls.terminology.tools/content/CUI/SNOMEDCT/2015_01_31/12738006
Support opening a concept in a new window (e.g. there's a pointing off arrow icon that opens a new window with a routing URL that shows exactly that concept - then drag/drop between windows can be editing mechanism).
- https://umls.terminology.tools/report/SNOMEDCT/2015_01_31/12738006
DONE: Websocket (for a WebsocketWorkflowListener)
- test it
- What to do with it??
Expression-based searching?
Consider adding "LABELFOR" all subsets and making the star pop up a picklist of the things to highlight (ordered by type with extensions first, subsets later)
RECURRING: Mobile-friendly and other style issues

Additional/Enhanced Loaders ****PRIORITY

RF2 delta/full loaders
- Implement mojo testing too
Owl loader (e.g. for NCIt) - will require use of "DL" features
- Examples: https://github.com/owlcs/owlapi/blob/version4/contract/src/test/java/org/semanticweb/owlapi/examples/Examples.java
- DONE: Step 1: create the infrastructure (mojo, rest call, client, algorithm, get parameters right).
- Step 2: get OwlAPI into the project
- Step 3: add complexity
- May need a "SNOMED" style and a general style to capture "distribution normal form" for SNOMED (e.g. rel groups, etc)
- Also have a corresponding Owl export feature (e.g. "release")
- Implement mojo testing too
- Reasoner
  - JFact
    - example: http://jfact.sourceforge.net/Example.java
      <dependency> <groupId>net.sourceforge.owlapi</groupId> <artifactId>jfact</artifactId> <version>1.0.0</version> </dependency>
  - HermiT
    - example: http://hermit-reasoner.com/java.html
      <dependency> <groupId>com.hermit-reasoner</groupId> <artifactId>org.semanticweb.hermit</artifactId> <version>1.3.8.4</version> </dependency>
  - ELK: **
    - example: https://code.google.com/p/elk-reasoner/wiki/ElkOwlApi
      <dependency> <groupId>org.semanticweb.elk</groupId> <artifactId>elk-owlapi</artifactId> <version>0.4.2</version> </dependency>
  - Snorocket
    - example: https://github.com/aehrc/snorocket
      <dependency> <groupId>au.csiro</groupId> <artifactId>snorocket-owlapi</artifactId> <version>2.7.2</version> </dependency>

Services

Action Service
- Implement classification.
- Need to go to/from Owl so do Owl loader FIRST.

Testing

Mojo integration tests
- RRF-umls load/unload
- RRF-single load/unload
- RF2-Snapshot load/unload
- Claml load/unload
- RF2-full load/unload
Implement additional unit tests for model objects (PrecedenceListJpa, LabelSet, etc)
RRF loader -> create label set for SNOMED (both "single" and "umls")
Handler002Test for normal use
Implement Handler003/008Test - for ID assignment algorithms. Borrow code from other project (though there may be differences). The uuidHash algorithm is implemented properly for UMLS and may be different than for SNOMED.

Editing Features

Basic metathesaurus editing
Project
- Figure out how to capture "project scope" for SNOMED and for UMLS in a generalized way. Update project objects to be able to properly capture (and compute) project scope. NOTE: the scope definition may involve concepts/terminologies/semantic types. In that event, the scope computer gets a little bit more complicated.
Test loading a DB with envers auditing disabled and then making changes in a DB while it is enabled. Does it properly create audit entries?
- for the old edition of the component?
- for the new edition?
Metathesaurus editing actions
- MetathesaurusContentEditingRest
  - methods for each "edit action"
  - Create a RestImpl
  - Create a client
  - Create integration tests to run against a "stock" dev database
- Add a semantic type component, Remove a semantic type component
  - Have a pencil icon by the STYs section
  - clicking gives you a list of the STYs available to assign, in tree order, with a filter box for typing characters of the STY you want.
    - See the metadata "semantic type" option
  - User may want to choose multiple ones (so have a "finished" button)
  - Don't allow user to choose STYs already assigned to the concept.
  - Final action is to call "contentService.addSemanticTypeComponent"
  - Consider what happens to workflow status
  - Consider how to show "NEEDS_REVIEW" content in the browser
  - Consider how to support "Undo". - perhaps an action log (atomic/molecular) is a good idea still for that
- Implement this completely including testing before moving on to other actions (each which requires a UI enhancement)
  - Approve a concept (e.g. set workflow status values).
  - Add an atom (e.g. MTH/PN - that needs to be generalized somehow)
  - Merge two concepts (consider the "workflow status" when this happens).
  - Move an atom (or atoms) from one concept to another
  - Split an atom (or atoms) out of a concept and specify a relationship type between the two concepts
Terminology Editing (first use case)
- Add a concept (as a child of an existing concept) with one or more atoms and a PAR/CHD relationship.
- Run the classifier
- Show classifier results (e.g. new inferred relationships, etc)
- NOTE: this only works with a description logic based source that tracks inferred relationships.
- PREREQ: SNOMEDCT RF2 loader.
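
The STY picklist behavior described under "Add a semantic type component" (tree order, filter box, no already-assigned STYs) could be sketched roughly as follows. The class and method names (StyPicklistSketch, availableStys) are hypothetical, STYs are represented as plain name strings, and the input list is assumed to already be in tree order; the real UI and contentService calls may look quite different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of the picklist filtering only; no service calls.
class StyPicklistSketch {

  /**
   * Returns the STYs that match the typed filter and are not yet assigned
   * to the concept, preserving the (tree) order of the input list.
   */
  static List<String> availableStys(List<String> allInTreeOrder,
      Set<String> assigned, String filter) {
    String needle = filter == null ? "" : filter.trim().toLowerCase(Locale.ROOT);
    List<String> result = new ArrayList<>();
    for (String sty : allInTreeOrder) {
      if (assigned.contains(sty)) {
        continue; // don't offer STYs the concept already has
      }
      if (sty.toLowerCase(Locale.ROOT).contains(needle)) {
        result.add(sty);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> all = Arrays.asList("Entity", "Physical Object", "Organism", "Plant");
    Set<String> assigned = new HashSet<>(Arrays.asList("Organism"));
    System.out.println(availableStys(all, assigned, "p"));
  }
}
```

Each STY chosen from this list before the user presses "finished" would then be passed to "contentService.addSemanticTypeComponent".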

Admin Tools

Test QA queries and flesh them out for 100% coverage.

Optimizations

TBD

Future Stuff

Test conditional envers auditing: http://stackoverflow.com/questions/14250612/conditional-envers-auditing
escg (expression grammar - research)
Use Lucene SynonymFilter with synonym table
Component-Component relationships (between any two components).
Value set definitions (and marking of subset.isValueSet()) and linking to definition? via attribute?
Owl loader, Owl export of DL terminologies (e.g. RF2-loaded SNOMED)
- http://owlapi.sourceforge.net/owled2011_tutorial.pdf
Rdf export (flat)
Classifier (owl interface)
Expression language (based on SNOMED expression constraint grammar)
Sub-branching
- branchResolutionHandler - figures out how to copy and mark branched objects and update indexes - for different branching strategies.
Handle mappings - may not be worth it
Implement an RF2 loader (use DL features)
Implement a ClaML loader
Support semantic network (e.g. sty rels, etc). - probably want to wait for a real ontology - and maybe even load it as such.