...

https://wcinformatics.atlassian.net/secure/RapidBoard.jspa?rapidView=1&projectKey=NE&view=planning.nodetail&selectedIssue=NE-8&epics=visible

Questions

...

  • Production deploy
    • 6/12/2017 currently editing about 900 RADLEX concepts on MEME4
    • Early July will likely add May/June NCI thesaurus together into MEME4 (size and complexity are uncertain, and this also depends on the new OWL2 inversion process)
    • Transfer to new system for editing will happen 2nd half of July or later
    • Will likely try to add NCI thesaurus at that point and shortly after add MTH
    • Documented on NCI wiki on page "Loading MEME from NCI Meta"
  • Priorities
    • Documentation
      • IN PROGRESS: NE-325: Screen shots (JFW)
      • TODO: Text on wiki (BAC)
      • NE-308: 30-sec to 1 min training videos (DSS, RAW)
    • Mini run-throughs (stock dev build + running stock processes) - determine "correctness"
      • Pre-production,  Release (DONE)
        • CORRECTNESS: compare output to input (REQUIRES work on MRSAT, MRREL, MRSAB, and MRHIER(MSH))
      • Pre-production, Prod-Mid Cleanup, Feedback (requires a "release directory" (matching pre-prod) with MR files in it, can be exactly same as input MR files) (DONE)
      • NCI Insertion, SNOMEDCT_US Insertion, MTH Insertion
        • CORRECTNESS: verify assumptions and volume of data related to each step of the insertion.
      • NCI Insertion, Pre-production, Release
      • NCI Insertion, Pre-production, Prod-Mid Cleanup, Feedback
    • Scale testing
    • Feedback/Fixes - see Jira
      • NEW - support ability to edit computeHierarchy flag for root terminologies in the UI
      • NE-317: Upgrade GenerateNciMetaDataMojo to load ALL workflow configurations from files, not just QA.
      • NE-319: JFW: use your own local deployment of the tool to edit and test queries.
      • NE-318: RAW: the "make checklists for report tables" is always producing checklists of size zero. - this can't be right can it?
      • There is still some concern about management of retired project concepts
        • prod-mid cleanup should remove any old version atoms (as it does)
        • NE-322: It should then remove any "empty" concepts (no atoms) that were not assigned a CUI (e.g. id = terminologyId).
        • NE-323: Reload CUI history should ensure that there is not more than one project concept in the database with the same CUI assignment (as terminologyId). 
      • NE-302: Prepare a ResetProductionNciDatabase integration test (and corresponding mojo) - based on the current one but without stuff we don't need.
        • should load all workflow configs from files
        • ongoing work.
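
The cleanup rules described for NE-322 and NE-323 above can be sketched roughly as follows. This is only an illustration in Python, not the project's code: the `Concept` class and both function names are hypothetical, and it assumes "not assigned a CUI" means the terminologyId still equals the internal id.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical stand-in for a project concept."""
    id: int
    terminology_id: str
    atoms: list = field(default_factory=list)


def remove_empty_unassigned(concepts):
    """NE-322 idea: drop concepts that have no atoms and never got a CUI.

    Assumption: an unassigned concept has terminology_id == str(id).
    """
    return [c for c in concepts
            if c.atoms or c.terminology_id != str(c.id)]


def duplicate_cuis(concepts):
    """NE-323 idea: list CUIs carried by more than one project concept."""
    counts = Counter(c.terminology_id for c in concepts)
    return sorted(cui for cui, n in counts.items() if n > 1)
```

A reload of CUI history could run a check like `duplicate_cuis` afterwards and fail (or report) if the result is non-empty.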

Diagrams

...

  • DONE: MatrixInitializerAlgorithm
  • DONE: StampingAlgorithm
  • DONE: LexicalClassAssignmentAlgorithm
  • DONE: ComputePreferredNamesAlgorithm (code, concept, descriptor)
    • computes and sets preferred names where doesn't match
    • also computes and sets "publishable" where doesn't match
  • DONE: ReindexAlgorithm
    • indexed objects parameter
  • DONE: ReplaceAttributesAlgorithm extends AttributeLoaderAlgorithm
    • attributes.src (in the inputPath)
    • Existing attributes for any (source, attribute_name) combination found in the file are removed first - match on terminology, version, name
    • attributes.src gets loaded as normal.
    • ALTERNATE: implement as AttributeLoaderAlgorithm with a "replace" parameter (boolean)
    • ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Attributes"
  • *IN PROGRESS: ReplaceRelationshipsAlgorithm extends RelationshipsLoaderAlgorithm
    • relationships.src, contexts.src (in the inputPath)
    • Same as above
    • Match on terminology, version, relationshipType, additionalRelationshipType (and inverses)
    • ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Attributes"
  • *IN PROGRESS: ReplaceContextsAlgorithm extends ContextLoaderAlgorithm
    • Same as above
    • Match on terminology, version
    • ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Attributes"
  • *IN PROGRESS: RecomputeContextsAlgorithm - calls remove tree positions, then compute tree positions
  • DONE: use rootTerminology.hierarchyComputable to know whether to load or compute (update Context Loader)
    • root terminology editing in interface needs to support changing this.
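
The "replace" behavior described above for attributes (remove everything matching a terminology, version, name combination in the file, then load the file as normal) might look something like this. It is a minimal Python sketch under assumptions, not the actual algorithm: the `Attribute` tuple and `replace_attributes` function are hypothetical names, and in-memory lists stand in for the database and attributes.src.

```python
from typing import NamedTuple


class Attribute(NamedTuple):
    """Hypothetical, simplified attribute row."""
    terminology: str
    version: str
    name: str
    value: str


def replace_attributes(existing, incoming):
    """Remove existing attributes that match any incoming key, then load."""
    # Keys present in the incoming file: (terminology, version, name).
    keys = {(a.terminology, a.version, a.name) for a in incoming}
    # Keep only existing attributes whose key is not being replaced.
    kept = [a for a in existing
            if (a.terminology, a.version, a.name) not in keys]
    # Load the incoming attributes as normal.
    return kept + list(incoming)
```

The relationship and context variants would presumably differ only in the match key (for example adding relationshipType and additionalRelationshipType), which is why a single loader with a boolean "replace" parameter is a plausible alternative.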

...

  • DONE: WriteRrfMetadataFilesAlgorithm
    • *write release.dat also
  • DONE: WriteRrfContentFilesAlgorithm
    • Write AMBIG and CHANGE files.
  • DONE: WriteRrfHistoryFilesAlgorithm (including writing NCI files)
    • IN PROGRESS: refactoring..
  • DONE: WriteRrfIndexesFilesAlgorithm
  • DONE: ValidateReleaseAlgorithm (referential CUI checking, etc.)
  • DONE: RunMetamorphoSysAlgorithm (this makes METASUBSET which is used for packaging)
    • This also uses make_config.csh (now in project) to build the final prop files
  • DONE: PackageReleaseAlgorithm
    • Create the final .zip file? - see the wiki instructions

...

  • DONE: Application Config/ Project Management/ WorkflowConfig management
    • AccessRestriction - By Project (based on user role??)
      • READ_ONLY - null
      • ADMIN (maintenance/insertion is OK, but not authoring) - checked by meta editing service and workflow service (performWorkflowAction).
      • AUTHORING (all changes are allowed) - (double check this when actions are performed - e.g. service?)
      • Update project should send a websocket event (so that ui can prevent editing)
    • Processes need to check access restriction before starting - checked by process service
    • Process has isRunning and ProcessService will have "isProcessRunning"
      • This represents the global lock.
    • ProcessConfig/Execution, AlgorithmConfig/Execution, process execution should point back to its config, but also copy it (sub/superclass)
  • DONE: Content and Metadata Model Objects
    • DONE: Hierarchies like MSH - not based on transitive closure - need to update the RRF loader to support inserting tree positions rather than computing them (but needs configuration)
      • AUI -> CODE -> CODE -> SDUI -> SDUI*

    • DONE: AtomTreePosition/Jpa/UnitTest
    • DONE: SRC concepts - affects RrfLoaderAlgorithm
      • Load code also as a corresponding "organizing class type" (e.g. for RHT) - ?
      • Properly fix the relationships as well.
    • DONE: Rels where type of id1 is different than type of id 2
      • Could just use atom-atom rels with a relationship attribute of the real sg_type1/2
      • ComponentInfoRelationship
    • DONE: Rels represented in both directions (and UIs) 
      • Including reconciling misuse of "relationship group" in both directions.
      • Always use "blank" instead of 0.
      • Handle cases of essentially duplicate RUIs...
        • either disambiguate (e.g. with a DA flag)
        • or clean up the sample data
        • however, the loader should identify and/or correct this data condition. (e.g. identify where RUIs and inverses are ambiguous and just re-assign RUIs completely from scratch)
        • this will matter much more when handling full data.
    • DONE: Atom -> modeled as lowerNameHash and uses MD5
    • DONE: User -> "team" (for modeling groups), then project can be "isTeamBased" like mapping tool.
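
As a rough illustration of the "lowerNameHash" idea noted above for atoms: the sketch below hashes the lowercased atom name with MD5. The function name and the exact normalization (plain lowercasing, UTF-8 encoding, hex digest) are assumptions for illustration; the real model may normalize differently.

```python
import hashlib


def lower_name_hash(name: str) -> str:
    """Return the MD5 hex digest of the lowercased name (assumed scheme)."""
    return hashlib.md5(name.lower().encode("utf-8")).hexdigest()
```

A hash like this gives a fixed-length, case-insensitive key, so names that differ only by case (e.g. "Aspirin" and "ASPIRIN") map to the same value.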

  • Editing
    • DONE: REST API : MetaEditingServiceRest (Client/Impl)
      • add/removeSemanticType(Long projectId, Long conceptId, SemanticTypeComponent, authToken)
      • merge, move, split
      • add/removeRelationship (concept level)
      • add/remove/updateAtom
      • Approve concept
    • DONE: atomic/molecular actions
    • DONE: ID assignment
      • Perform UI assignment in most cases during actions
      • Do not perform terminologyId assignment for SemanticTypeComponent
      • Do not perform terminologyId assignment for ConceptRelationship (e.g. for UMLS relationships)
      • Do not perform terminologyId assignment for CUIs (e.g. UMLS concepts)
    • DONE: Support uploading an editor manual (track as project attachments - like "ReleaseArtifact" -> rename as Attachment)
  • DONE: Insertion - like a "loader"
    • Recipe
      • Steps are "algorithms" with configuration.  Just add a step, remove a step, reorder a step, or reconfigure a step.
      • all written to be agnostic about SAB.
    • Complete UI handling
    • Attribute -> Definition, Subset, Mapping, SubsetMember, Mapping (and requisite attributes and ID assignments)
    • Compute delta
    • source data loader?, link to local file system - uses ContentServiceJpa
    • insertion recipes (tracked over time). 
    • src_atom_id handling (need to track in DB because of cross-source relationships)
    • loading data (unit or batch?)
    • merge engine
    • matching/demotions
    • atom ordering
    • Ensure that update releasability is done before merging so "publishable" is a proxy for "version"
    • Maintenance Tools for insertion
      • Mark deleted CUIs as deleted (e.g. instead of bequeathing them)
      • bequeathing concepts based on matching criteria
      • Performing merges based on matching criteria (e.g. query-based merges)
  • Maintenance Tools
    • DONE: Precedence list management
    • Cluster type STY management (Brian has no recollection what this is)
    • DONE: Source information.
    • DONE: insert attributes, insert stys, insert relationships, recompute tree positions, ... (see $MEME_HOME/bin) (already done in Insertion process algorithms)
  • Workflow
    • DONE: start with workflow stuff from Refset
    • DONE: Model objects
      • TrackingRecord (origConceptIds, componentIds, clusterId, clusterType, etc).
      • Epoch
      • ME, QA, AH bins - WorkflowConfig, WorkflowBin
      • Worklist, Checklist
      • WorkflowBinStatistics
      • WorklistStatistics
      • WorkflowBinDefinition
        • Track "isRequired" as a flag indicating required for release.
    • DONE: WorkflowActionHandler
    • DONE: Services
      • clear/regenerateBin(s)
      • createChecklist/Worklist
      • getWorkflowB
    • DONE: Checklist
      • Creating checklist from a workflow bin
      • Creating checklist from a query (SQL/HQL/Lucene)
      • Creating checklist from a list of conceptId
      • Creating checklist from a list of clusterId, conceptId
      • Creating checklist from a file of (conceptId)
      • Creating checklist from a file of (clusterId, conceptId)
    • DONE: Worklist
      • Creating a worklist -> should probably specify the team
    • DONE: Stamping (batch "concept approval" action)
    • DONE: Semantic type categories - chem/nonchem
      • Model as part of Project -
    • DONE: When lists are returned (e.g. on "finish"), track the edit/review time.  
      • Worklist/Jpa .get/setAuthor/ReviewerTime
    • DONE: Track "team" of a worklist
      • Worklist/Jpa get/setTeam
      • Project/Jpa get/isTeamBased
    • DONE: Import/Export
      • export worklist (clusterId\tconceptId\tname)
      • export checklist
      • create checklist from file
      • export workflow config
      • import workflow config (on project editing)
  • QA
    • DONE: Matrix init (recompute concept status based on workflow status of embedded objects).
    • DONE: Validation for objects (concept, atom, etc.) as well as validation for actions (e.g. move, merge, split)
    • IN PROGRESS (JFW): MID Validation - query-based validation that feeds into workflow system (e.g. "create checklist")
    • DONE: EMS QA Bins - 
    • DONE: Sty-cooc
    • QA that will get left out
      • Counts, Comparisons, Adjustments, Sampling
      • STY QA
      • Research Unmapped Identifiers

  • Reporting
    • DONE: Concept reports - unit and batch modes
    • DONE: Query-based reports (role-based) (all reports from MEME4 are implemented in new system) (reporting code has been brought over)
    • n/a: Tools for researching issues in inversions (no one used these)
    • DONE: Canned reports
      • Daily editing report

        EMS v3 Daily Editing Report for Dec 01, 2016
        Database: memestg
        Time now: Fri Dec  2 06:02:22 EST 2016
        
        Concepts Approved this day: 105
                          Distinct: 105
        Number of actions this day: 193
        
        Shown below are editing statistics for each authority.  The E-{initials}
        authority shows approvals done in the interface while the S-{initials}
        authority counts batch or stamping approvals.  The percentages show
        the proportion of each, by editor.
        
        Authority  Actions  Concepts Approved  Rels Inserted  STYs Inserted  Splits  Merges
        ---------  -------  -----------------  -------------  -------------  ------  ------
           E-LAR       100        38 (100.0%)         19             1           1      15
        
           E-LLW        93        67 (100.0%)          4             4           2       0
        
        --------------------------------------------
        For more detail, follow this link to the EMS
  • Production
    • DONE: Bequeathal relationship strategies.
    • DONE: LUI reassignment!
    • DONE: CUI assignment
      • last assigned cui
      • last released cui
      • ConceptIdentity table to track max id?
    • DONE: Semantic type Component ATUI assignment
    • DONE: ConceptRelationship RUI assignment
    • DONE: CUI history
      • Model by leaving old concepts around and linking them to "live" concepts (for bequeathal)
      • Need to update concept history with new bequeathal rels, follow recursion, etc.
    • DONE: AUI history
    • DONE: When creating MRHIER, add in the SRC/RHT layer as needed
    • DONE: Incremental release (or export of a single source)
      • what about CUI assignment?
    • DONE: Begin editing cycle?
    • DONE: Ability to export RRF for just a single source?? (what about CUIs - can force temp CUIs)
  • Cross-cutting
    • DONE: Websocket
    • DONE: Disable editing (e.g. don't allow editors to make changes) - how to store this?
      • for admin processes, etc?
      • ProjectJpa flag?
    • DONE: Query engine - HQL, SQL, LUCENE -> produces clusterId, conceptId
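
Since the query engine, the checklist-from-file options, and the worklist export above all revolve around (clusterId, conceptId) pairs, here is a small Python sketch of reading such tab-delimited rows into clusters. It is an assumption-laden illustration: `read_clusters` is a hypothetical name, ids are assumed to be integers, and any extra columns (such as the name in an exported worklist) are simply ignored.

```python
def read_clusters(lines):
    """Group conceptIds by clusterId from tab-delimited lines.

    Each non-blank line is expected to start with clusterId<TAB>conceptId;
    trailing columns (e.g. a name) are ignored.
    """
    clusters = {}
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected clusterId<TAB>conceptId")
        cluster_id, concept_id = int(fields[0]), int(fields[1])
        # dicts keep insertion order, so clusters stay in file order
        clusters.setdefault(cluster_id, []).append(concept_id)
    return clusters
```

For example, `read_clusters(open(path))` would accept either a plain (clusterId, conceptId) file or a file in the exported clusterId, conceptId, name layout, where `path` is whatever file the editor supplies.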

...

Out Of Scope (mostly things for NLM)

  • QA that will get left out
    • Counts, Comparisons, Adjustments, Sampling
    • STY QA
    • Research Unmapped Identifiers
  • TOP level relationships PAR/CHD involving SRC atoms
      • need approval
  • MRAUI history tracking for past releases (just produce the current release atom movements)
  • Hierarchies with different kind of objects used (e.g. CODE-SDUI)
    • Resolution of top-level MSH hierarchy using different SG_TYPE than rest of tree. → must be the same
    • top-level SRC atoms are excepted
  • Mappings loaded from .src that do not use SRC_ATOM_ID are not supported
  • "ST", "DA", "MR" attributes (not present in NCI-META)
  • Support for tobereleased = Y, y, n, N
    • There is a "publishable" flag and it is true or false. That's it. Other mechanisms would need to be used to make this distinction.
  • Custom Safe replacement steps for different terminologies/termTypes within same set of .src files. 
  • Special support for non-ENG content
    • Hiding/showing in UI
    • Moving/Merge/Split moving non-ENG atoms along with "translation_of" ENG ones.
  • Report stuff:
    • LEXICAL_TAG, LEGACY_CODE, EZ/RN:EC NUMBER
  • Definition or LT editing
  • Some integrity checks - MGV_B2, MGV_D, ...
  • Relationship type "LK"
  • P level relationships (now there are just atom-atom and a workflowStatus of "DEMOTION").
  • Embryo concept status
  • Team assignment for worklists.
  • Content Views
  • AH bins
  • Cluster types beyond "chem" and "nonchem"
  • QA "sampling" and UI
  • Map metadata editor
  • Map set viewer
  • Source information management system
  • Handling of AQ/QB relationships in a special way for release (these are just regular relationships)

...