SPRINT 9 - Finish 20160531

Overview

Sprint to extend some of the "read only" APIs, to finalize unit/integration testing for existing functionality, to extend the interface, to develop additional loaders, and to begin developing editing capabilities.

Data Reloads/Code Deployments

  • Redeploy data (on backend machine), rebuild and redeploy frontend code
    • ...: UMLS
    • ...: VET
    • ...: SNOMED
    • ICD
    • OWL
  • Fix some bugs before rebuilding code (because they don't affect db).

 

Strategic Directions

  • Marketing/SEO - Deploying a demo site with description, screenshots, and video
  • Core application enhancement
  • Editing features

Marketing/SEO

  • PG: New front-page website with description, screenshots, video
  • "Login" link (with logout link on footer)
  • Have the "upload" and "source" tabs from transformer here
    • "Snomed starter kit"
  • Create a video demo of the site (camtasia) and post as a link on the header (video glyphicon if there is one)
    • also screenshots that are clickable, etc.
  • Verify all entry pages for applications are officially mobile friendly
  • Update campaigns for SNOMED/UMLS/ICD browsers on Google AdWords.
  • Training Video for UMLS browser (need 10 min)
  • Training Video for SNOMED browser (simple 5 min)
  • Find the SNOMED video that Anil sent, redo with RF2 - same content!
  • Other notes (from email)
    • Google adwords campaign
      • “free icd-10” browser, “free SNOMED CT US Extension browser”, “free UMLS browser”
    • Get webmaster setup for the icd/umls/snomed pages
      • Use config setup for that
    • Find east bay healthcare startups
      • Pursue like job search (to know they’re hiring, etc)
      • Tech
      • Device
      • Information/informatics 
    • Meetups

Starter Kit - PG

  • ValidationResult precheckLoadFromSourceData
  • Application hardening
    • Error handling during configuration (bad db, username, password)
    • directory already exists?
    • DONE failed load
      • sourceData marked as FAILED -> only support certain actions
    • DONE cancelled load -> allow reload of delta
    • source data missing, some terminology data loaded
  • spelling.txt, acronyms.txt
    • package these files into the war. (so they are in WEB-INF/classes)
    • getResourceAsStream("spelling.txt") -> source.data.dir/spelling.txt
  • Terminology Starter Kit - ...
    • Run the .war file via Jetty all packaged as an executable .jar
    • Single zip file download containing
      • termserver.jar
      • data/ (spelling.txt, acronyms.txt)
      • indexes/ (lucene indexes)
      • hsqldb/ (database files) 
    • config.properties
    • StarterKitApplication (e.g. the jersey application)
    • Maven process
      • ssk-rest (module)
        • Build .war file
        • .war file must contain a completely ready config.properties file (e.g. in WEB-INF/classes) - put in src/main/resources 
          • no filtering required
          • exact hsqldb jdbc url, user, password
          • indexBase=./indexes
          • spellingFile=./data/spelling.txt
          • security.guest.disabled=false
          • mail server??
          • landing/ login/ intro page configuration
      • ssk-app/ (module)
      • pom.xml
        • Load database  (Rrf2SnapshotSourceDataLoadMojo)
          • specify indexes directory        (${project.build.dir}/app/indexes)
          • specify hsqldb directory          (${project.build.dir}/app/db)
          • specify snomed (or input dir)  (-Dinput.dir)
        • Gather all resources into ${project.build.dir}/app
          • packages .war file (from ssk-rest) as an executable .jar that runs jetty, snomed-starter-kit.jar
          • package spelling.txt and acronyms.txt files in ${project.build.dir}/app/data
        • Use assembly plugin to zip everything from ${project.build.dir}/app/* into snomed-starter-kit.zip
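The config.properties checklist above could be verified at build or startup time. The following is a minimal sketch, not the project's code: the class name StarterKitConfigCheck and its methods are hypothetical, and only the three keys spelled out in the notes are checked (the hsqldb JDBC key names are not given above, so they are left out).

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper that checks a starter-kit config.properties for required keys. */
class StarterKitConfigCheck {

  /** Keys named in the notes above; a real deployment would likely require more. */
  static final String[] REQUIRED_KEYS = {
      "indexBase", "spellingFile", "security.guest.disabled"
  };

  /** Loads properties from any reader (a file reader, a classpath stream reader, ...). */
  static Properties load(Reader reader) throws IOException {
    Properties config = new Properties();
    config.load(reader);
    return config;
  }

  /** Returns the required keys that are absent or blank, in declaration order. */
  static List<String> missingKeys(Properties config) {
    List<String> missing = new ArrayList<>();
    for (String key : REQUIRED_KEYS) {
      String value = config.getProperty(key);
      if (value == null || value.trim().isEmpty()) {
        missing.add(key);
      }
    }
    return missing;
  }
}
```

A build step could fail fast when missingKeys(...) is non-empty, so a .war with an incomplete config never reaches the zip.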
  • DONE: Get initial starter kit running
  • spelling/acronyms data -> load, need source data loaders
    • copy "default" spelling/acronym files into the source data dir
    • then automatically load them 
    • support ability to remove/reload as well
    • configure service ? source data service?
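The "copy defaults, then load" idea for spelling.txt and acronyms.txt can be sketched as follows. This is an illustration under assumptions: DefaultFileInstaller is a hypothetical name, the one-entry-per-line format is assumed, and the defaults are passed in as a stream so the sketch does not depend on how the war is packaged.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: installs a default data file, then loads its entries. */
class DefaultFileInstaller {

  /**
   * Copies the default content to dataDir/fileName unless that file already
   * exists, so user-edited files are never overwritten. Returns the target path.
   */
  static Path installIfMissing(InputStream defaults, Path dataDir, String fileName)
      throws IOException {
    Files.createDirectories(dataDir);
    Path target = dataDir.resolve(fileName);
    if (!Files.exists(target)) {
      Files.copy(defaults, target);
    }
    return target;
  }

  /** Reads the file as UTF-8, trimming entries and skipping blank lines. */
  static List<String> loadEntries(Path file) throws IOException {
    List<String> entries = new ArrayList<>();
    for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
      String trimmed = line.trim();
      if (!trimmed.isEmpty()) {
        entries.add(trimmed);
      }
    }
    return entries;
  }
}
```

Inside the war, the defaults stream would presumably come from getClass().getClassLoader().getResourceAsStream("spelling.txt"), matching the WEB-INF/classes packaging noted above.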

Core Application Enhancement - P1

  • BAC: Remove PFSC parameter,  convert integration tests to expression, etc.

  • BAC: fix hibernate properties in config.props (on deployments and in project)

  • DONE: RF2 loader improvements

    • Handle changes in name, obsolete, moduleId and other fields of AtomSubset, ConceptSubset, and MapSet objects
    • This affects the snapshot loader in that all three should be set properly initially
    • This affects delta loader in that manageSubset/manageMapSet should handle the case where the set already exists and update this data.
      • Additionally, when concepts are changed, ensure the idMap contains the updated copy, because these are used to derive the name, module id, and obsolete flag of the correlated concept (this is probably already true, since the concept is pulled out of the map if it already exists).
  • DONE: Validate integration tests again

  • Component Notes:
    • Add transient field on retrieval to include full user name (i.e. Patrick Granvold instead of pgranvold) for modal display
    • Allow guest users to leave notes, view other guest notes. Filter out non-guest notes based on application role.
  • PG: Implement expression searching based on ECL

    • DONE: add to TerminologyLoaderAlgorithm
      • DONE computeExpresssions
      • DONE Add to the Rf2Snapshot/Delta/Full
    • TODO: Support for codes and descriptions
    • TODO: support for non-numeric ids
      • to support other sources
      • get/fix antlr grammar and regenerate parsers.
    • TODO: support for other loaders.
    • Replace the "search criteria" mechanism with this.
    • Identify as a search by parsing and confirming it matches.  
      • Instead, added "expression" to PfsParameter
    • DONE Build SQL or lucene queries from ECL.
    • See Kai's project on github: https://github.com/IHTSDO/snomed-query-service
    • Also show expressions (like diagrams) for a concept
    • User Interface
      • Add an "active" picklist:  "All" (DEFAULT), "Active", "{{metadataLabels.obsolete_label}}"  => queryRestriction: " AND obsolete:..."
        • Use pfsParameter activeOnly, inactiveOnly flags
      • DONE Descendant of search - textbox, turn into "<< 123904234" (drag/drop??, or "search" button?) - show the name next to it also.
      • DONE Member Of search - picklist of concept subsets (may require rest method)
        • turns into "^ 12345001"
      • Attribute refinement
        • select list of additionalRelationshipType
        • Operator: =, <, or <<
        • text box attribute value concept
      • DONE Arbitrary expression - text box
        • ContentServiceRest/Jpa.countExpression - validates syntax, counts results - warn user with icon
        • In findConcepts -> count the expression, if >10K && there is a lucene query (isLuceneQueryInfo) -> fail search with an error msg.
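The UI-to-expression mappings and the result-size guard described above can be sketched like this. The class and method names are hypothetical, and the attribute refinement layout (focus : attribute = value) follows the general ECL refinement form; how the picklist operator maps onto it is an assumption. Note that in ECL "<<" means descendant-or-self, while "<" alone means strict descendants.

```java
/** Hypothetical helpers that turn search-form choices into ECL strings. */
class EclSearchSketch {

  /** Limit taken from the ">10K" note above. */
  static final long MAX_EXPRESSION_RESULTS = 10000;

  /** "Descendant of" text box: the notes map this to "<< id". */
  static String descendantSearch(String conceptId) {
    return "<< " + conceptId;
  }

  /** "Member Of" picklist: the notes map this to "^ id". */
  static String memberOfSearch(String subsetId) {
    return "^ " + subsetId;
  }

  /**
   * Attribute refinement. The operator is one of "=", "<", "<<" from the
   * picklist; "=" means the value concept itself (assumed mapping).
   */
  static String refine(String focus, String attributeId, String operator, String valueId) {
    String value = "=".equals(operator) ? valueId : operator + " " + valueId;
    return focus + " : " + attributeId + " = " + value;
  }

  /** Fails the search when a large expression is combined with a lucene query. */
  static void checkResultSize(long expressionCount, boolean hasLuceneQuery) {
    if (expressionCount > MAX_EXPRESSION_RESULTS && hasLuceneQuery) {
      throw new IllegalStateException(
          "Expression matches " + expressionCount + " concepts; narrow it before adding a query");
    }
  }
}
```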
  • DONE: Compute preferred name enhancements

    • support all modes, including multiple meta
    • tie to user preferences.
  • DONE: Review overhaul of loader architecture

    • Algorithm -> LoaderAlgorithm
    • SourceDataLoaderAlgorithm (interface)
  • Employ a consistent handleLazyInit strategy
    • enforce this at release time (add to wiki)
    • public handleLazyInit(...)
    • ContentService doesn't need it because of "graph resolver" which is entirely responsible for content objects.
    • applies to "source data" and "project" and "metadata" services.
  • Normalize use of sortable tables across the application (search in content, admin service tables) - use ng-table
  • Enabling "cancel"/progress of loaders
    • RrfLoaderAlgorithm (can implement at abstract loader algorithm level)
      • cancel() should set a cancelFlag to true
      • commit/log statements should check the cancel flag and throw a CancelException
    • Add CancelException
    • Change content service
      • Add "precondition" checking to loaders (Algorithm)
      • Put all loader logic into "compute"  -> e.g. "open readers", "sort files", etc.
    • Source Data Loader
      • have try/catch
      • support "background" parameter
      • add cancelLoad(...)
      • load -> processId
      • getRunningProcesses(...)
    • Put a ticket on wiki to rewire mojos from content service to sourceDataLoaderService (then remove the load methods from content service)
    • BUG: Terminology Remover does not correctly handle mapsets
      • Cannot delete or update a parent row: a foreign key constraint fails (`umlsminidb`.`mapsets_attributes`, CONSTRAINT `FK_a1tfp35h17fsbdl07p9xeex2h` FOREIGN KEY (`attributes_id`) REFERENCES `attributes` (`id`))
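The cancel mechanism outlined above (a flag set by cancel(), checked at commit/log points) might look roughly like this. CancelException is the name proposed in the notes; CancellableLoaderSketch and its members are hypothetical stand-ins for the abstract loader algorithm.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Thrown from commit/log checkpoints once a load has been cancelled. */
class CancelException extends Exception {
  CancelException(String message) {
    super(message);
  }
}

/** Hypothetical base class showing where the cancel flag is set and checked. */
abstract class CancellableLoaderSketch {

  /** Atomic so cancel() can safely be called from another thread (e.g. a REST call). */
  private final AtomicBoolean cancelFlag = new AtomicBoolean(false);

  private int objectCt = 0;

  /** Requests cancellation; the running load stops at its next checkpoint. */
  void cancel() {
    cancelFlag.set(true);
  }

  /** Checkpoint: throws if cancelled, otherwise counts one processed object. */
  protected void logAndCommit(String message) throws CancelException {
    if (cancelFlag.get()) {
      throw new CancelException("Load cancelled before: " + message);
    }
    objectCt++;
    // A real loader would log the message and commit the transaction here.
  }

  int getObjectCt() {
    return objectCt;
  }

  /** All loader logic (open readers, sort files, load rows) goes here. */
  abstract void compute() throws CancelException;
}
```

Because the check lives in the shared checkpoint method, every loader that already calls logAndCommit gets cancellation without further changes.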

  • Find all <i> within <a> and remove the <a> - e.g. atoms directive
  • DSS: Bring over logging stuff from refset
    • objects - LogEntry, LogEntryJpa
      • Add terminology, version, 
      • activity -> LogActivity.LOADER, RELEASE, EDITING
    • RootServiceJpa methods
    • RootServiceRestImpl methods
    • getLog REST API call from ProjectService
      • keep existing method
      • add one with: instead of "projectId": and "objectId" take "terminology" , "version", and "activity"
    • think about what should add log entries (e.g. loaders, terminology removers) 
      • idea; anything that changes the db.
      • addLogEntry statements ONLY go in RestImpl layer. (and maybe the algorithms)
    • Put logging statements for loaders - e.g. Rf2SnapshotLoaderAlgorithm
      • Add three local methods
        • commitClearBegin(...., terminology, version, activity) 
          • super.commitClearBegin(...)
          • if (objectCt % logCt == 0) {
             addLogEntry...
            }

        • logAndCommit(...., terminology, version, activity)
          • super.logAndCommit(...)
          • if (objectCt % logCt == 0) {
             addLogEntry...
            }

        • logInfo(message , terminology, version, activity) - also for warn/error, but add the word "WARNING: " or "ERROR: " to the message
          • Change all Logger.getLogger(...).info(...)  to logInfo(...) calls
          • addLogEntry(...) = use "loader" as username
          • also Logger.getLogger(...).info(... )
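The three local logging methods above can be sketched as follows. This is a simplified stand-in, not the refset implementation: LoaderLogSketch is a hypothetical class, log entries are kept as plain strings instead of LogEntry objects, and the pipe-delimited layout is only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a loader that records progress as log entries. */
class LoaderLogSketch {

  private final List<String> logEntries = new ArrayList<>();

  /** Write a progress entry every logCt objects. */
  private final int logCt;

  LoaderLogSketch(int logCt) {
    this.logCt = logCt;
  }

  /** Stand-in for the real addLogEntry; the string layout is an assumption. */
  void addLogEntry(String userName, String terminology, String version,
      String activity, String message) {
    logEntries.add(userName + "|" + terminology + "|" + version + "|" + activity + "|" + message);
  }

  /** Progress checkpoint: adds an entry only on every logCt-th object. */
  void logAndCommit(int objectCt, String terminology, String version, String activity) {
    // A real loader would call super.logAndCommit(...) first.
    if (objectCt % logCt == 0) {
      addLogEntry("loader", terminology, version, activity, objectCt + " objects processed");
    }
  }

  /** Always uses "loader" as the username, as suggested in the notes. */
  void logInfo(String message, String terminology, String version, String activity) {
    addLogEntry("loader", terminology, version, activity, message);
  }

  void logWarn(String message, String terminology, String version, String activity) {
    logInfo("WARNING: " + message, terminology, version, activity);
  }

  void logError(String message, String terminology, String version, String activity) {
    logInfo("ERROR: " + message, terminology, version, activity);
  }

  List<String> getLogEntries() {
    return logEntries;
  }
}
```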
  • PG:  Implement Diagramming
    • model transformation
    • Show ONLY for "description logic terminology"
  • Enhancements to RRF preferred name computer
    • support multiple UMLS's - e.g. have a high-level terminology/version
    • reuse the same default precedence list.
  • DSS: Project/User/UserPrefs stuff
    • User
      • Application role
      • user preferences
      • projectRoleMap (e.g. and project role map adapter)
    • Project
      • Remove "leads", "authors", "administrators"
      • remove "scope" stuff
      • Remove actionWorkflowStatusValues
      • userRoleMap (and userRoleMapAdapter)
    • Bring over UserRole from Refset -> replace term server stuff
    • Align Security Service Rest (and Jpa)
      • Bring over methods from refset that are missing
      • Reconcile differences in the implementation methods.
    • ProjectService Rest (and Jpa)
      • Remove the scope concept calculation
      • Add stuff about user/project (assign,unassign,find...)
    • next step: add an "Admin" tab.
      • basically borrow from refset and make work for this project.
  • User Preferences stuff
    • Bring over model from Refset tool.
    • Add "last query" (e.g. "brain" and whether it's "list" or "tree" mode).
    • Add "last report" (e.g. type/ui/terminology/version)
  • Remove "void addXXX" and "void removeXXX" methods from model objects. Use getObjects().add/remove(...) only
    • e.g. AtomClass doesn't need addAtom or removeAtom
    • Rewire any uses of them  
    • "addXXX" -> "getXXX().add"
    • "removeXXX" -> "getXXX().remove"
    • same if there is "clear"
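As a small illustration of the convention above, a model object would expose only the collection accessor and callers would mutate through it. AtomClassSketch is a hypothetical, stripped-down stand-in (atoms are plain strings here), not the real AtomClass.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model object without addAtom/removeAtom/clearAtoms methods. */
class AtomClassSketch {

  private List<String> atoms = null;

  /** Lazily creates the list so callers can always use getAtoms().add(...). */
  List<String> getAtoms() {
    if (atoms == null) {
      atoms = new ArrayList<>();
    }
    return atoms;
  }

  void setAtoms(List<String> atoms) {
    this.atoms = atoms;
  }
}
```

Call sites then change from concept.addAtom(a) to concept.getAtoms().add(a), and likewise for remove and clear.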
  • Database Naming Conventions – we currently have two styles
    • Decide if we want tables named e.g. "concept_X" or "concepts_X"
    • Rename all units accordingly
  • TODO: Restore functionality of SourceDataRemoverMojo -> should call a service.
    • delete the data, delete SourceData, deletes SourceDataFile, also deletes corresponding file system files
  • TODO: precondition checking for SourceData load = verify all loaders are implementing it
  • TODO: Rf2Full LoaderAlgorithm - move "Look through files to obtain ALL release versions" logic to the file sorter.
    • sorter.getReleases(): List<String>
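A possible shape for the release scan, assuming it works from the effectiveTime column (the second tab-delimited field in RF2 files) of already-read lines. ReleaseVersionScanner is a hypothetical name, and the real file sorter may gather versions differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper that collects the distinct release versions in an RF2 Full file. */
class ReleaseVersionScanner {

  /**
   * Returns the distinct effectiveTime values, sorted ascending. Values in
   * yyyymmdd form sort chronologically as plain strings. The first line is
   * treated as the header row and skipped.
   */
  static List<String> getReleases(List<String> lines) {
    TreeSet<String> releases = new TreeSet<>();
    boolean header = true;
    for (String line : lines) {
      if (header) {
        header = false;
        continue;
      }
      String[] fields = line.split("\t");
      if (fields.length > 1 && !fields[1].isEmpty()) {
        releases.add(fields[1]);
      }
    }
    return new ArrayList<>(releases);
  }
}
```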
  • TODO: Verify that concept subset members and atom subset members should appear (graph resolver)
  • TODO: SourceDataFileUtility - error handling for  extractCompressedSourceDataFile
    • // TODO Delete any successfully extracted files on failed load
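One way to approach the cleanup TODO: have extraction track every file it writes, remove them if extraction itself fails, and hand the list back so the caller can remove them after a failed load. This is a sketch with hypothetical names (ExtractionSketch, extract, deleteAll); the path check against entries that escape the target directory is an extra precaution, not something taken from the notes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical zip extraction that cleans up after itself on failure. */
class ExtractionSketch {

  /** Extracts all file entries into targetDir and returns the files written. */
  static List<Path> extract(InputStream zipData, Path targetDir) throws IOException {
    List<Path> extracted = new ArrayList<>();
    Path root = targetDir.toAbsolutePath().normalize();
    try (ZipInputStream zip = new ZipInputStream(zipData)) {
      ZipEntry entry;
      while ((entry = zip.getNextEntry()) != null) {
        if (entry.isDirectory()) {
          continue;
        }
        Path out = root.resolve(entry.getName()).normalize();
        if (!out.startsWith(root)) {
          throw new IOException("Entry escapes target directory: " + entry.getName());
        }
        if (Files.exists(out)) {
          throw new IOException("Refusing to overwrite " + out);
        }
        Files.createDirectories(out.getParent());
        // Track before copying so a partially written file is cleaned up too.
        extracted.add(out);
        Files.copy(zip, out);
      }
      return extracted;
    } catch (IOException | RuntimeException e) {
      deleteAll(extracted);
      throw e;
    }
  }

  /** Best-effort delete; also usable by the caller when the later load fails. */
  static void deleteAll(List<Path> files) {
    for (Path file : files) {
      try {
        Files.deleteIfExists(file);
      } catch (IOException ignored) {
        // Keep deleting the remaining files.
      }
    }
  }
}
```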

  • TODO: SourceDataServiceRestImpl - URLs against content service and update java and js clients.
  • TODO: popout in content controller should call a contentService method - controllers should never have URL fragments.  Also make the "simple" part a parameter.
  • TODO: remove the part of content controller that picks ICD10CM over ICD10 - actually, probably better to just remove ICD10 from CLAML load (UTS license doesn't cover it anyway).
  • TODO: Generalize the handling of "simple" in isTabShowing of tab controller
  • TODO: don't show Precedence list if it is empty (or say  "Precedence list: (EMPTY)")
  • BAC: Remove PfscParameter.
  • Bring TypeKeyValue over from tt project (including loader mojo and configurations), update tt pom.xml too.
  • Rework spelling/acronyms to be TypeKeyValue data instead -> rework spelling correction and algorithm lookup handlers.
  • loader "description logic"
  • App configuration -> don't require the configure service, so the Jersey Application doesn't need it
    • contentService.js would just check the appConfig.isConfigured??.
  • Mapping REST APIs (if not already there)
  • Content service call for finding paths between two "component info"

    • e.g. CUI "distance".  with parameters to control types of relationships (could consider ECLish for this too).
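A path search of this kind could be a breadth-first search over relationships, restricted to the requested relationship types. The sketch below is an assumption-heavy illustration: ComponentPathFinder and Edge are hypothetical, components are identified by plain strings, and relationships are walked in both directions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical breadth-first path finder over typed relationships. */
class ComponentPathFinder {

  /** One relationship between two component ids. */
  static final class Edge {
    final String from;
    final String type;
    final String to;

    Edge(String from, String type, String to) {
      this.from = from;
      this.type = type;
      this.to = to;
    }
  }

  /**
   * Returns the shortest path from start to goal using only allowed
   * relationship types, or an empty list if none exists. The "distance"
   * is path.size() - 1.
   */
  static List<String> shortestPath(List<Edge> edges, Set<String> allowedTypes,
      String start, String goal) {
    Map<String, List<String>> neighbors = new HashMap<>();
    for (Edge edge : edges) {
      if (!allowedTypes.contains(edge.type)) {
        continue;
      }
      neighbors.computeIfAbsent(edge.from, k -> new ArrayList<>()).add(edge.to);
      neighbors.computeIfAbsent(edge.to, k -> new ArrayList<>()).add(edge.from);
    }
    // Maps each visited node to the node it was reached from (null for start).
    Map<String, String> previous = new HashMap<>();
    Deque<String> queue = new ArrayDeque<>();
    previous.put(start, null);
    queue.add(start);
    while (!queue.isEmpty()) {
      String current = queue.remove();
      if (current.equals(goal)) {
        LinkedList<String> path = new LinkedList<>();
        for (String node = goal; node != null; node = previous.get(node)) {
          path.addFirst(node);
        }
        return path;
      }
      for (String next : neighbors.getOrDefault(current, Collections.<String>emptyList())) {
        if (!previous.containsKey(next)) {
          previous.put(next, current);
          queue.add(next);
        }
      }
    }
    return Collections.emptyList();
  }
}
```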
  • Enhancements for ECL
    • General
      • The way expression builder is used is different than treepos/transitive closure.  In particular, it’s not wired to properly  handle “cancel”.  We should declare at the top level and instantiate like we do with the others
    • Rf2DeltaLoaderAlgorithm
      • Like with transitiveClosure and treePos, we need to have a “reset” that gets called for expressions so that we clear whatever is there first, THEN recompute the full expression index.
      • The Ecl “reset” method should clear the indexes for the given terminology/version
    • RrfLoaderAlgorithm
      • Should implement this, just like for Rf2Snapshot. (but for each source, see how tree position computer works)
    • ClamlLoaderAlgorithm
      • Should implement this, just like for Rf2Snapshot.
    • OwlLoaderAlgorithm
      • Should implement this, just like for Rf2Snapshot.
    • Advanced search
      • For “description logic” sources, we can show/support the full ECL
      • For non-“description logic” and non-“metathesaurus”, if we’ve computed indexes, we can still support a limited form of ECL.  In particular, I’m thinking about supporting “descendant” searches.  The “semantic type” does this at the top level, but there’s no reason we can’t arbitrarily support it at a lower level.  All non-metathesaurus sources have a hierarchy.
    • We should add ECL testing to the mojo integration tests
      • e.g. perform several test searches with ECL to verify they return results. 
      • That way we're validating that the loader is computing indexes.

Core Application Enhancement - P2

  • Enhance "additional relationship type" model to capture
    • "hierarchical"
    • role attribute
    • non-defining attribute (association)
    • historical attribute
    • mapping correlation attribute
  • Consider metadata API to obtain the actual objects, not just key/value lists
  • Rf2SnapshotLoader - attempt to determine the "to" terminology/version (e.g. from the map set name?)
  •  PG: Ronald Cornet ideas for starter kit: 
    • Add some capability to annotate concepts (and save those annotations, and then recover annotated concepts and export them for later use - e.g. SIRS integration).  
      Also could support some basic refset maintenance (extensional, plus

    • "favorites list"
      • add concepts to favorites (or remove them)
      • Show notes if they are there 
        • only show notes for your user
      • support export -> TYPE, terminology, version, id, name, notes... -> \t delimited file with .xls
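The export row could be built like this. FavoriteExportSketch is a hypothetical helper; the column order follows the list above, and replacing embedded tabs and line breaks with spaces is an assumed way of keeping each favorite on one row.

```java
/** Hypothetical helper that writes favorites as tab-delimited rows. */
class FavoriteExportSketch {

  /** Header row matching the fields listed in the notes. */
  static final String HEADER = "TYPE\tterminology\tversion\tid\tname\tnotes";

  /** Builds one row; null values become empty cells. */
  static String toRow(String type, String terminology, String version,
      String id, String name, String notes) {
    return String.join("\t", clean(type), clean(terminology), clean(version),
        clean(id), clean(name), clean(notes));
  }

  /** Replaces tabs and line breaks so a value cannot break the row layout. */
  static String clean(String value) {
    if (value == null) {
      return "";
    }
    return value.replace('\t', ' ').replace('\r', ' ').replace('\n', ' ');
  }
}
```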
  • Expression Language Builder
    • Expanding tree of individual expressions
    • Allow all supported operations at each step for building of complex queries
  • User interface for mappings content
    • Have a separate tab for this
    • search by concept id
    • restrict to map set
    • search by target id
    • Show full mappings in a table
    • Need methods for searching mappings.
  • Tree Position model changes
    • Change nodeTerminologyId to terminologyId to parallelize with component wrappers (type, terminology, version, terminologyId)
    • In webapp, remove getComponentFromTree and use standard getComponent functions
  • Review all the static config info from the config file
    • determine which have a single option and which have multiple options
    • determine which can have a single implementation or need to be copied (e.g. search handler)
    • determine how API calls (at JPA) layer work, and how they work at REST layer - make the same.
    • etc
  • Refactor loader mojos and service calls to all work the same way and use source data loader.
  • Refactor Content Service (Rest)
    • Separate out concept functions so that they can be deployed without CODE and DUI  services
  • Release criteria integration tests
    • Verify no *java classes have System.out.println
    • Verify no *java classes have TODO
    • ...
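A release-criteria check of this kind can be a plain file scan. The sketch below uses hypothetical names (ReleaseCriteriaScanner, findOffenders) and simple substring matching, so it would also flag occurrences inside comments or string literals.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical scanner for forbidden text in .java source files. */
class ReleaseCriteriaScanner {

  /** Returns every .java file under sourceRoot whose text contains the forbidden string. */
  static List<Path> findOffenders(Path sourceRoot, String forbidden) throws IOException {
    List<Path> javaFiles;
    try (Stream<Path> walk = Files.walk(sourceRoot)) {
      javaFiles = walk
          .filter(Files::isRegularFile)
          .filter(path -> path.toString().endsWith(".java"))
          .sorted()
          .collect(Collectors.toList());
    }
    List<Path> offenders = new ArrayList<>();
    for (Path file : javaFiles) {
      String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
      if (text.contains(forbidden)) {
        offenders.add(file);
      }
    }
    return offenders;
  }
}
```

An integration test would assert that the returned list is empty for each forbidden string. The test class itself contains those strings as literals, so it would need to live outside the scanned root or be excluded by name.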
  • Implement precedence list saving in updateUserPreferences - currently it just sets user precedence list to null
  • Implement glass pane for an individual component so that say "deep relationships" can load in its own time, but other elements can be interactive and ready to go when ready.
  • Support RF2 association refset members for non-concept members  (e.g. descriptions)
  • Enable glass pane while switching tabs (between content/metadata)
    • UNNECESSARY, as any calls will glass pane immediately, and if no calls, rendering is immediate
  • Advanced search
    • LATER: Expression - e.g. "Search Criteria" - for "descriptionLogicTerminology" only
      • descendants of
      • has relationship -> xx
  • Show "atom" subset information in the report.  Only concept (or component) subset (or refset) information is being shown.  This may involve a change in the graph resolver to return the data, i.e. for SNOMED you should be able to tell what is just British.

Editing Features

  • Drag/Drop - see this
  • Basic metathesaurus editing
    • Add/remove STY
    • Add/remove atom
    • Move atom
    • Split atoms from concept
    • Merge concepts
    • Add/remove concept relationship
  • Publication Process
    • RRF
  • Project
    • Figure out how to capture "project scope" for SNOMED and for UMLS in a generalized way.  Update project objects to be able to properly capture (and compute) project scope.  NOTE: the scope definition may involve concepts/terminologies/semantic types.  In that event, the scope computer gets a little bit more complicated.
  • Test loading a DB with envers auditing disabled and then making changes in a DB while it is enabled. Does it properly create audit entries?
    • for the old edition of the component?
    • for the new edition?
  • Metathesaurus editing actions
    • MetathesaurusContentEditingRest
      • methods for each "edit action"
      • Create a RestImpl
      • Create a client
      • Create integration tests to run against a "stock" dev  database
    • Add a semantic type component, Remove a semantic type component
      • Have a pencil icon by  the STYs section
      • clicking gives you a list of available STYs assigned, in tree order with a filter box for typing characters of the STY you want.
        • See the metadata "semantic type" option
      • User may want to choose multiple ones (so have a "finished" button)
      • Don't allow user to choose STYs already assigned to the concept.
      • Final action is to call "contentService.addSemanticTypeComponent"
      • Consider what happens to workflow status
      • Consider how to show "NEEDS_REVIEW" content in the browser
      • Consider how to support "Undo". - perhaps an action log (atomic/molecular) is a good idea still for that
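The picker behaviour described above (tree-ordered list, type-ahead filter, no already-assigned STYs) reduces to a small filtering step. In the application this would probably live in the JavaScript front end; the Java sketch below only illustrates the rule, and SemanticTypePickerSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical filter behind the semantic type picker. */
class SemanticTypePickerSketch {

  /**
   * Returns the STYs the user may still choose: not already assigned to the
   * concept and containing the typed filter text (case-insensitive). The
   * input order (assumed to be tree order) is preserved.
   */
  static List<String> selectable(List<String> allInTreeOrder, Set<String> assigned,
      String filter) {
    String needle = filter == null ? "" : filter.trim().toLowerCase(Locale.ROOT);
    List<String> result = new ArrayList<>();
    for (String sty : allInTreeOrder) {
      if (assigned.contains(sty)) {
        continue;
      }
      if (sty.toLowerCase(Locale.ROOT).contains(needle)) {
        result.add(sty);
      }
    }
    return result;
  }
}
```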
    • Implement this completely including testing before moving on to other actions (each which requires a UI enhancement)
      • Approve a concept (e.g. set workflow status values).
      • Add an atom (e.g. MTH/PN - that needs to be generalized somehow)
      • Merge two concepts (consider the "workflow status "when this happens).
      • Move an atom (or atoms) from one concept to another
      • Split an atom (or atoms) out of a concept and specify a relationship type between the two concepts
  • Terminology Editing (first use case)
    • Add a concept (as a child of an existing concept) with one or more atoms and a PAR/CHD relationship.
    • Run the classifier
    • Show classifier results (e.g. new inferred relationships, etc)
    • NOTE: this only works with a description logic based source that tracks inferred relationships.
    • PREREQ: SNOMEDCT RF2 loader.

Other Things

  • Create initial framework for JS unit testing
  • Verify that ClaML loader makes asterisk/dagger attributes for modifier concepts - e.g. M01.4 and M01.41
  • Integration tests for static code analysis (e.g. *.code.XXTest.java)
    • REST Service
      • opened sessions are closed in finally blocks.
  • For "semantic type" mechanism - if there are tree numbers, order by that and indent
  • Labels – filtering out LABELFOR labels should happen in getPagedArray function in labels.js, and not in the html via ng-show
  • OWL - NCIt

User Interface Enhancements

  • Do something useful with websocket
  • Expression-based searching (get harold's parser?)
  • Consider adding "LABELFOR" all subsets and making the star pop up a picklist of the things to highlight (ordered by type with extensions first, subsets later)
  • RECURRING: Mobile-friendly and other style issues

Additional/Enhanced  Loaders

  • Owl loader - have a snorocket (2) reasoner for role relationships.

Services

  • Action Service
    • Implement classification.
    • Need to go to/from Owl so do Owl loader FIRST.

Testing 

  • Implement additional unit tests for model objects (PrecedenceListJpa, LabelSet, etc)
  • RRF loader -> create label set for SNOMED (both "single" and "umls")
  • Handler002Test for normal use
  • Implement Handler003/008Test - for ID assignment algorithms.  Borrow code from other project (though there may be differences).  The uuidHash algorithm is implemented properly for UMLS and may be different than for SNOMED.

Admin Tools

  • Test QA queries and flesh them out for 100% coverage.

Optimizations

  • TBD

 

Future Stuff

  • Test conditional envers auditing: http://stackoverflow.com/questions/14250612/conditional-envers-auditing
  • escg (expression grammar - research)
  • Use Lucene SynonymFilter with synonym table
  • Component-Component relationships (between any two components).
  • Value set definitions (and marking of subset.isValueSet()) and linking to definition? via attribute?
  • Owl loader, Owl export of DL terminologies (e.g. RF2-loaded SNOMED)
  • Rdf export (flat)
  • Classifier (owl interface)
  • Expression language (based on SNOMED expression constraint grammar)
  • Sub-branching
    • branchResolutionHandler - figures out how to copy and mark branched objects  and update indexes  - for different branching strategies.
  • Handle mappings - may be not worth it
  • Implement an RF2 loader (use DL features)
  • Implement a ClaML loader
  • Support semantic network (e.g. sty rels, etc).  - probably want to wait for a real ontology - and maybe even load it as such.