SPRINT 3 - concludes November 28

Overview

Sprint to develop core features of the application

Fleshing out Infrastructure

  • DONE: Implement the metadata refset entries (including equals, hashcode, and copy constructors)
    • RefsetDescriptorRefSetMember
    • DescriptionTypeRefSetMember
    • ModuleDependencyRefSetMember
  • DONE: Implement the Jpa services (add/remove/update/get/set) for these as well
  • DONE: Metadata service should handle these too:
    • attributeDescription (refset descriptor)
    • attributeType (refset descriptor)
    • descriptionFormat (description type)
    • refsets
  • DONE: Graph resolver should set transient count fields in component objects.
  • DONE: Model fields directly that are counts of connected objects, e.g. concept.getRelationshipCt().  That way an interface can know exactly, from the component level, whether there is more of a particular type of information.  This affects the add/remove/set methods of the Jpa API.
  • DONE: Currently conceptJpa and descriptionJpa wire the connected data to themselves - have the graph resolver do this instead to ensure it's defined formally.
  • DONE: ComputePreferredNameHandler
  • DONE: GraphResolutionHandler
    • handle "lastModifiedBy" on updates - are there separate methods for this?
    • handle "resolveAll" to read (e.g. deep copy)
  • DONE: GetXXXRefSetMembers(terminologyId,terminology,version) - for refset id - with paging
  • DONE: ReleaseInfo object
  • DONE: SearchCriteria object
  • DONE: How do we know something’s a release state? - it has an effective time that is null.
  • Handling default preferred name and id computation for concepts on say a description change

    • Put this into the REST service as an atomic operation (check for ID change, recompute preferred name).
  • PG: Concept - remove "inverse relationships"

    • Remove graph helper
    • add findChildren, add findDescendants to ContentService, ContentServiceJpa
      • the findChildren and findDescendants should return ConceptList
      • findDescendants should use transitive rels and take pfs parameter - fully implement it (see applyPfsToQuery).
    • Rewrite metadata handlers to use these calls
    • Verify that ICD9CM metadata services work.
  • Re-enable computation of transitive closure in delta loader - disabled for demo

  • Metadata - stated/inferred char types should be metadata methods - no way for transitive closure computer to know which rels to ignore and which to pay attention to.  In doing this, change the TerminologyUtility to answer the question differently.
  • ClaML loader transitive closure computer - commit only once - need a setting for this and to test it.

  • DONE: New lists (and history methods for each)

    • AssociationReferenceConceptRefSetMember.java
    • AssociationReferenceDescriptionRefSetMember.java
    • AttributeValueConceptRefSetMember.java
    • AttributeValueDescriptionRefSetMember.java
    • ComplexMapRefSetMember.java
    • ModuleDependencyRefSetMember.java
    • RefsetDescriptorRefSetMember.java
    • RefSetMember.java
    • SimpleMapRefSetMember.java
    • SimpleRefSetMember.java
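
The transient count fields set by the graph resolver could look roughly like the following minimal sketch. The class names ConceptSketch and GraphResolverSketch, the use of plain strings for descriptions, and the "max" cutoff are all hypothetical and are not taken from the code base. Trimming the list is an assumption about why a count is useful; only the idea of a resolver-set count comes from the notes above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a concept; not the project's ConceptJpa. */
class ConceptSketch {
  private List<String> descriptions = new ArrayList<>();

  // Transient count: set by the resolver, never persisted. -1 means "not resolved".
  private transient int descriptionCt = -1;

  List<String> getDescriptions() { return descriptions; }
  void setDescriptions(List<String> descriptions) { this.descriptions = descriptions; }
  int getDescriptionCt() { return descriptionCt; }
  void setDescriptionCt(int ct) { this.descriptionCt = ct; }
}

/** Hypothetical resolver: the only place that wires counts onto the object. */
class GraphResolverSketch {
  /** Records the full count, then keeps at most max connected objects. */
  static void resolve(ConceptSketch concept, int max) {
    List<String> all = concept.getDescriptions();
    concept.setDescriptionCt(all.size());
    if (all.size() > max) {
      // Assumption: the caller only wants the first page of connected data.
      concept.setDescriptions(new ArrayList<>(all.subList(0, max)));
    }
  }
}
```

With this shape, a caller can compare getDescriptionCt() against getDescriptions().size() to tell whether more data exists than was returned.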

Fleshing out APIs

  • Content service methods for adding/removing inferred rels in batch.

  • DONE: Wire graph resolver to REST calls for content service
  • DONE: Make sure setting of "last modified" is controlled either at the JPA level or the REST level and not by the application.
  • HistoryService
    • getRevisions (without boolean)
    • getReleaseRevision(id, String release) - pick any one with matching release - get first one
  • Configuration setting for synchronizing access to write services
  • PG: QUESTION: when setting a field and then removing an object, what happens? Does the audit trail record include this change?
    • ANSWER:  Setting a field and removing the object does not include the change.  The object must be updated first, then deleted
    • SOLUTION: don't worry about it
  • Action service timeout – configurable
  • ContentService (and content change service) - have a "deep" delete on concept and description that deletes connected refset data too.
  • Develop placeholder API calls across the board - add all calls to Rest and RestImpl and Client layer
    • SecurityService - DONE
    • MetadataService - DONE
    • ValidationService - DONE
    • ContentService/ContentChangeService - REST "change" services need to take a user and set lastModifiedBy.
      • SEPARATE read/write calls into separate services, so a simple read-only terminology server can be extracted from this.
      • TTN: For each model object (concept, description, relationship) - do refsets later
        • Add/remove/update - must take a "user" parameter and call "setLastModifiedBy" based on that user (before using graph resolver)
        • Get (by terminologyId, terminology, version)
      • Clear/compute transitive closure
        • Takes a “root” node and computes transitive closure from that starting point
      • For each non–cascaded type, have a get method by its grouping id
        • getXXXRefSetMembersByConcept(terminologyId, terminology, version)
        • getXXXRefSetMembersByDescription(terminologyId, terminology, version)
      • Get ancestors(terminologyId, terminology, version)
      • Get descendants(terminologyId, terminology, version)
      • Get children(terminologyId, terminology, version) - based on a root node and terminology - respect the ConceptJpa.childCt flag.
      • Add concepts (ConceptListJpa) – takes a list of concepts and inserts all of them (batch import)
        • Atomic operation – entirely succeeds or entirely fails
      • Add descriptions(DescriptionListJpa) – takes a list of descriptions and wires them to the correct concepts, which are presumed to exist, and inserts all of them
        • Atomic operation – entirely succeeds or entirely fails
      • Find concepts – with paging/sorting/filtering
        • Query based search – based around Lucene query syntax
        • Semantic based search – an extension of query searching that supports the following features through SearchCriteria
          • Active vs. inactive
          • Module id
          • Ancestor (e.g. find concepts that are descendants of this)
          • Primitive vs. fully defined
          • Is source concept of a certain relationship type (with a particular destination)
          • Is destination concept of a certain relationship type (with a particular source)
          • Multiple conditions can be combined for conjunction.
          • Disjunction is supported via multiple calls
    • HistoryService
      • For each model object (concept, description, relationship, etc.)
        • Get components modified since a certain date - based on last modification date
        • Get release history – obtain a list of known releases
        • Get all component revisions in a date range for a particular id – all edits
        • Get all component release revisions in a date range for a particular id – only published states
        • Get the component revision corresponding to a particular id and point in time (last one before this date or equal to it)
      • Get concepts changed since certain date – performs a “deep” search for all concepts where it or any of its components have changed in the relevant period
      • Get current release
      • Get previous release
      • Get release history(String release)
    • BAC: ActionService
      • Configure action service – takes parameters potentially needed by services and returns a session token.
      • Get progress – takes a session token and reports progress of the current action. The correct use of the service is to perform only one action at a time.
      • Cancel – takes a session token and cancels any long running operation currently in progress under that token.
      • Prepare to classify – takes a terminology and a session token and prepares data structures for full classification. This mostly involves building classifier axioms from the data. In theory, this only needs to be done once per session (assuming only add operations).
      • Classify – takes a session token, verifies that “prepare” successfully completed, and performs a full classification, leaving the classified ontology in memory for later retrieval.
      • Incremental classify – takes a session token, verifies that “prepare” and a full classification were performed, obtains changes since last classification run, adds needed axioms, and performs an incremental classification. Note: incremental classification is not supported if changes include retirement or removal of content – only additions are supported.
      • Get equivalencies – takes a session token and returns classifier reported equivalent concepts.
      • Get new/old/unchanged inferred relationships – reports on the inferred relationships resulting from classification. New relationships need to be added, old relationships need to be retired (or removed) and unchanged inferred relationships are available merely for reporting.
      • Compare concepts – takes two concepts and compares them returning a “validation result” detailing the differences. This method is essentially used for conflict analysis and is very similar to how map records are compared within the mapping tool. Resolution of the conflict is entirely up to the application.
  • Develop REST client calls for each of these as well - in preparation for integration/functional testing.
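
The HistoryService lookup "revision corresponding to a particular id and point in time (last one before this date or equal to it)" amounts to a floor lookup on an ordered map. The sketch below illustrates that rule in memory for a single component id. The class RevisionHistorySketch and its methods are hypothetical; the real service would presumably query the audit tables instead.

```java
import java.util.Date;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory revision history for one component id. */
class RevisionHistorySketch<T> {
  // Revisions keyed by last-modified date, kept in ascending order.
  private final TreeMap<Date, T> revisions = new TreeMap<>();

  void addRevision(Date lastModified, T state) {
    revisions.put(lastModified, state);
  }

  /** Returns the last revision at or before the given date, or null if none. */
  T getRevisionAt(Date pointInTime) {
    Map.Entry<Date, T> entry = revisions.floorEntry(pointInTime);
    return entry == null ? null : entry.getValue();
  }
}
```

A date that falls before the first revision yields null here; whether the real service should return null or throw is left open.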

 

ID Assignment

Choose an ID strategy

  • DONE: Application Managed - no assigning of terminologyIds, thus if they are null, it throws an explicit exception
    • No special treatment at release time
  • DONE: SCTID-UUID Identity Managed - each data type is assigned a UUID based on its "identity" fields.
    • Concepts are a special case and assigned a UUID based on parent concept and fsn? and is recomputed when it changes
    • Otherwise "equivalence" is determined later by the classifier.
    • whenever a concept changes it should be subject to recomputation of its terminology ID
      • assigned SCTIDs do not change
    • At release time, UUIDs get converted to SCTIDs based on "max" prior identifiers and the Verhoeff checksum algorithm.
  • DONE: Snomed release id handler
  • DONE: On a component change in the REST layer, verify whether "identity" fields have changed by checking the identifier of the object both before and after the change - they should be equal -  but this is only true if using hash based identity. - how do we generalize this - maybe part of the ID assignment handler - e.g. "allowIdChangeOnUpdate".  Also handle the case 
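
For reference, the check digit step of SCTID assignment can be sketched as follows. This is a plain implementation of the standard Verhoeff tables. The class name VerhoeffSketch is hypothetical, and the sketch does not cover how the "max" prior identifier or the partition digits are chosen.

```java
/** Hypothetical helper: Verhoeff check digit, as used for the last digit of an SCTID. */
class VerhoeffSketch {
  // Multiplication table of the dihedral group D5.
  private static final int[][] D = {
      {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
      {1, 2, 3, 4, 0, 6, 7, 8, 9, 5},
      {2, 3, 4, 0, 1, 7, 8, 9, 5, 6},
      {3, 4, 0, 1, 2, 8, 9, 5, 6, 7},
      {4, 0, 1, 2, 3, 9, 5, 6, 7, 8},
      {5, 9, 8, 7, 6, 0, 4, 3, 2, 1},
      {6, 5, 9, 8, 7, 1, 0, 4, 3, 2},
      {7, 6, 5, 9, 8, 2, 1, 0, 4, 3},
      {8, 7, 6, 5, 9, 3, 2, 1, 0, 4},
      {9, 8, 7, 6, 5, 4, 3, 2, 1, 0}};

  // Position-dependent permutation table.
  private static final int[][] P = {
      {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
      {1, 5, 7, 6, 2, 8, 3, 0, 9, 4},
      {5, 8, 0, 3, 7, 9, 6, 1, 4, 2},
      {8, 9, 1, 6, 0, 4, 3, 5, 2, 7},
      {9, 4, 5, 3, 1, 2, 6, 8, 7, 0},
      {4, 2, 8, 6, 5, 7, 3, 9, 0, 1},
      {2, 7, 9, 3, 8, 0, 6, 4, 1, 5},
      {7, 0, 4, 6, 9, 1, 3, 2, 5, 8}};

  // Inverse table.
  private static final int[] INV = {0, 4, 3, 2, 1, 5, 6, 7, 8, 9};

  /** Computes the check digit to append to a string of decimal digits. */
  static int checkDigit(String digits) {
    int c = 0;
    for (int i = 0; i < digits.length(); i++) {
      int digit = digits.charAt(digits.length() - 1 - i) - '0';
      c = D[c][P[(i + 1) % 8][digit]];
    }
    return INV[c];
  }

  /** Returns true if the final digit is a valid check digit for the rest. */
  static boolean isValid(String digitsWithCheck) {
    int c = 0;
    for (int i = 0; i < digitsWithCheck.length(); i++) {
      int digit = digitsWithCheck.charAt(digitsWithCheck.length() - 1 - i) - '0';
      c = D[c][P[i % 8][digit]];
    }
    return c == 0;
  }
}
```

For example, VerhoeffSketch.checkDigit("236") returns 3, and VerhoeffSketch.isValid("138875005") returns true for the SNOMED CT root concept id.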

 

Workflow listener

  • DONE: Fire events for services that "change" data and also for commit and beginTransaction

Transitive Closure Improvements

  • Incremental TC maintenance
    • New relationship
    • Retired relationship
    • New concept
    • Retired concept
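
The "new relationship" case of incremental TC maintenance can be sketched as below. The idea, assuming an acyclic hierarchy, is that a new child-to-parent edge connects the child and all of its descendants to the parent and all of its ancestors. The class TransitiveClosureSketch and the use of string ids are hypothetical, and the sketch keeps the closure in memory rather than in the transitive relationship table.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory transitive closure keyed by concept id. */
class TransitiveClosureSketch {
  private final Map<String, Set<String>> ancestors = new HashMap<>();
  private final Map<String, Set<String>> descendants = new HashMap<>();

  Set<String> getAncestors(String id) {
    return ancestors.computeIfAbsent(id, k -> new HashSet<>());
  }

  Set<String> getDescendants(String id) {
    return descendants.computeIfAbsent(id, k -> new HashSet<>());
  }

  /** New relationship: child is-a parent. Assumes the edge creates no cycle. */
  void addRelationship(String child, String parent) {
    // Everything at or below the child...
    Set<String> lower = new HashSet<>(getDescendants(child));
    lower.add(child);
    // ...now reaches everything at or above the parent.
    Set<String> upper = new HashSet<>(getAncestors(parent));
    upper.add(parent);
    for (String d : lower) {
      for (String a : upper) {
        getAncestors(d).add(a);
        getDescendants(a).add(d);
      }
    }
  }
}
```

Retirement is not shown. Removing an edge is harder than adding one, because a descendant may still reach an ancestor through another path, so the affected pairs have to be rechecked rather than simply deleted.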

Algorithms

  • Release Processing
    • Begin Release
      • Create release info object for this release
    • Perform Release (real or fake IDs)
      • Assign IDs and effective time (compare to previous release states)
      • Generate delta files
    • Finish Release
      • sets the planned release to "not planned" and "published"
      • takes a "planned next release" and creates the infrastructure for it.
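
The planned/published transitions in the release steps above can be sketched as a small state change. The classes ReleaseInfoSketch and ReleaseProcessSketch and their fields are hypothetical; the project's ReleaseInfo object may carry more state (for example effective times). ID assignment and delta generation are deliberately left out.

```java
/** Hypothetical stand-in for the release info object. */
class ReleaseInfoSketch {
  final String name;
  boolean planned;
  boolean published;

  ReleaseInfoSketch(String name, boolean planned, boolean published) {
    this.name = name;
    this.planned = planned;
    this.published = published;
  }
}

class ReleaseProcessSketch {
  /** Begin Release: create a planned, not yet published release info. */
  static ReleaseInfoSketch begin(String name) {
    return new ReleaseInfoSketch(name, true, false);
  }

  /**
   * Finish Release: mark the current release "not planned" and "published",
   * then create the release info for the planned next release.
   */
  static ReleaseInfoSketch finish(ReleaseInfoSketch current, String nextName) {
    if (!current.planned || current.published) {
      throw new IllegalStateException("Release " + current.name + " is not a planned release");
    }
    current.planned = false;
    current.published = true;
    return begin(nextName);
  }
}
```

Finishing the same release twice fails in this sketch, which mirrors the delta loader rule of failing when the release is not marked as planned.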

 

Full and Extension/Edition loaders

  • Refactor loaders
    • Use local variables for file handles, sorting, etc.
    • Refactor Rf2Snapshot as an algorithm
    • Refactor Rf2Delta as an algorithm
    • Refactor ClaML as an algorithm
    • Refactor to find corresponding files anywhere in the directory structure (e.g. FileUtils.listFiles(dir, new String[] { "txt" }, true))
  • Delta loader
    • Create or update the corresponding release info object - this is the "isPlanned" release - fail if not present or not marked as "is planned"
    • load everything into memory.
    • persist objects
    • recompute concept preferred names
    • identify and retire "removed" content - e.g. things with null effective times that are not in current refset
    • commit all
  • Snapshot loader
    • Create or update the corresponding release info objects
    • Load everything for one release into memory, cache concepts
    • recompute concept preferred names from graphs
    • For all concepts in cache, persist each one, then persist each connected object
    • commit periodically
  • Full loader
    • Same as snapshot loader but cycle over it for each "effectiveTime" value.
    • Delta loader needs to accommodate other refset file types - build on existing infrastructure
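
Finding loader input files anywhere in the directory structure can also be done with the JDK alone. The sketch below walks a directory tree and keeps regular files whose names match a prefix and suffix. The class LoaderFileFinderSketch and the prefix/suffix matching rule are assumptions for illustration, not the loaders' actual lookup logic.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for locating release files below an input directory. */
class LoaderFileFinderSketch {
  /** Returns all regular files under dir whose name has the given prefix and suffix. */
  static List<Path> findFiles(Path dir, String prefix, String suffix) throws IOException {
    // Files.walk is lazy and holds directory handles, so close it when done.
    try (Stream<Path> paths = Files.walk(dir)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> {
            String name = p.getFileName().toString();
            return name.startsWith(prefix) && name.endsWith(suffix);
          })
          .sorted()
          .collect(Collectors.toList());
    }
  }
}
```

A loader could then call, for example, findFiles(inputDir, "sct2_Concept_", ".txt") and fail early if the result is empty or ambiguous.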

Admin tools

  • Refactor to take "run.config", a profile, and all other parameters as config properties.  The "run.config" is merely used for DB and service configuration and never for mojo parameter passing. 

 

 

Unit/Integration/Functional tests

  • Test fully the logic of each call based on the demo data. (use browser)
  • BAC: Unit tests for equals, hashcode, and copy constructors for model objects (use getter/setter/tester)

Concurrency and Locking

  • support a means to synchronize write access - e.g. make it a configurable feature of ContentChangeRestImpl.
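
One possible shape for configurable write synchronization is a small guard that either takes a lock or passes straight through, depending on a configuration flag. The class WriteGuardSketch and the flag name synchronizeWrites are hypothetical; how ContentChangeRestImpl would read the setting is not specified here.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical guard that serializes write calls only when configured to. */
class WriteGuardSketch {
  private final boolean synchronizeWrites;
  private final ReentrantLock lock = new ReentrantLock();

  WriteGuardSketch(boolean synchronizeWrites) {
    this.synchronizeWrites = synchronizeWrites;
  }

  /** Runs the write action, holding the lock if synchronization is enabled. */
  <T> T write(Supplier<T> action) {
    if (!synchronizeWrites) {
      return action.get();
    }
    lock.lock();
    try {
      return action.get();
    } finally {
      // Always release, even if the action throws.
      lock.unlock();
    }
  }

  /** Exposed for testing: true while the calling thread is inside a guarded write. */
  boolean isHeldByCurrentThread() {
    return lock.isHeldByCurrentThread();
  }
}
```

Each REST "change" method would wrap its body in guard.write(...). A single in-process lock only helps when one server instance handles all writes, which is an assumption of this sketch.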

Classifier

classifier integration

  • Sequence of events
    • Call actionService.configure() and pass it a string list with "PUBLISHED" as its only element
    • prepareToClassify();
      • this loads classifier data structures
      • periodically, close and open the content service (every 2000 concepts)
        • or add a clear() to root service
    • handle progress monitoring - just put a progress into the tokenProgressMap.
    • when handling relationships for classification - ONLY treat stated rels (not inferred or additional)
    • classify()
      • handle classification
      • save concept equivalents
      • save old inferred relationships
      • save new relationships
    • getConceptEquivalents - just look up classification data
    • getOldInferredRelationships - just look up these
    • getNewInferredRelationships - just look up these
  • Other things to pay attention to in action service
    • have a "clear" method that takes a session token and removes all tracked data
    • tokenCheck should clear your data structures on token timeout.
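
The token bookkeeping described above (a tokenProgressMap, a "clear" method, and a tokenCheck that clears data on timeout) might be sketched as follows. The class ActionSessionSketch, the explicit "now" parameter (used so the timeout can be exercised without sleeping), and the exception type are assumptions; only the names tokenProgressMap, clear, and tokenCheck come from the notes.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical session tracking for the action service. */
class ActionSessionSketch {
  private final long timeoutMillis;
  private final Map<String, Integer> tokenProgressMap = new ConcurrentHashMap<>();
  private final Map<String, Long> lastAccess = new ConcurrentHashMap<>();

  ActionSessionSketch(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  /** Starts a session and returns its token. */
  String configure(long now) {
    String token = UUID.randomUUID().toString();
    tokenProgressMap.put(token, 0);
    lastAccess.put(token, now);
    return token;
  }

  void setProgress(String token, int percent, long now) {
    tokenCheck(token, now);
    tokenProgressMap.put(token, percent);
  }

  int getProgress(String token, long now) {
    tokenCheck(token, now);
    return tokenProgressMap.get(token);
  }

  /** Removes all data tracked for the token. */
  void clear(String token) {
    tokenProgressMap.remove(token);
    lastAccess.remove(token);
  }

  /** Verifies the token; on timeout, clears its data before failing. */
  void tokenCheck(String token, long now) {
    Long last = lastAccess.get(token);
    if (last == null) {
      throw new IllegalStateException("Unknown or expired token");
    }
    if (now - last > timeoutMillis) {
      clear(token);
      throw new IllegalStateException("Token timed out");
    }
    lastAccess.put(token, now);
  }

  boolean isTracked(String token) {
    return tokenProgressMap.containsKey(token);
  }
}
```

In a real service the classifier data structures for the session would be released in clear() alongside the progress entry.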

Supporting OWL 2 EL

  • Nesting - supported by anonymous concepts
  • GCI
    • multiple sufficient conditions - anonymous concepts with "equivalentClass" relationship
    • complex left hand side - anonymous concepts with "equivalentClass" relationship
  • Metadata model
    • Role (isTransitive, isReflexive)
      • DataRole(getDomain - e.g.)
      • Domain - anonymous concept expression
      • Range - anonymous concept expression
  • Disjoint sets - set of sets of concepts
  • property chains - new objects and services - classifier uses this.
    • PropertyChain (List<Concept>->Concept)
  • never group rels (metadata should be in terminology)
  • data type properties (data relationships) - new objects and services
    • like relationship but with an operator, and a value (could reuse the relationships table).
  • Concept instances - new objects and services (e.g. states, etc)
  • Templates - build a service for this that replaces the "edit" service

 

Supporting Expressions

  • Post-coordinated expression service using expression syntax - creates a concept without descriptions and a UUID concept id
  • Query language - sparql?