Page Comparison

...

https://wcinformatics.atlassian.net/secure/RapidBoard.jspa?rapidView=1&projectKey=NE&view=planning.nodetail&selectedIssue=NE-8&epics=visible

Questions

What does Approve/Next do when working through a worklist? (What if there is a filter or sort, and what if it reaches the end but you started in the middle?)
Loading a Database with Full 201610 Data (including scale testing insertions and release)
Loading a Database with Sample 201604 Data (including testing insertions and release)

...

Production deploy
- 6/12/2017 currently editing about 900 RADLEX concepts on MEME4
- Early July will likely add May/June NCI thesaurus together into MEME4 (not sure on size, complexity of this - and also dependent on new OWL2 inversion process)
- Transfer to new system for editing will happen 2nd half of July or later
- Will likely try to add NCI thesaurus at that point and shortly after add MTH
- Documented on NCI wiki on page "Loading MEME from NCI Meta"
Priorities
- Documentation
  - IN PROGRESS: NE-325: Screen shots (JFW)
  - TODO: Text on wiki (BAC)
  - NE-308: 30-sec to 1 min training videos (DSS, RAW)
    - Notes for Training Videos
- Mini run-throughs (stock dev build + running stock processes) - determine "correctness"
  - Pre-production, Release (DONE)
    - CORRECTNESS: compare output to input (REQUIRES work on MRSAT, MRREL, MRSAB, and MRHIER(MSH))
  - Pre-production, Prod-Mid Cleanup, Feedback (requires a "release directory" (matching pre-prod) with MR files in it, can be exactly same as input MR files) (DONE)
  - NCI Insertion, SNOMEDCT_US Insertion, MTH Insertion
    - CORRECTNESS: verify assumptions and volume of data related to each step of the insertion.
  - NCI Insertion, Pre-production, Release
  - NCI Insertion, Pre-production, Prod-Mid Cleanup, Feedback
- Scale testing
  - ResetNciMetaDataDatabase - using 201610 data (input.dir = <dir with https://wci1.s3.amazonaws.com/NCI/NCIM_201610.zip>)
  - NCI Insertion - DONE
  - SNOMEDCT_US Insertion - IN PROGRESS
  - MTH Insertion
  - Pre-Production, Release
  - Prod-Mid Cleanup, Feedback
- Feedback/Fixes - see Jira
  - NEW - support ability to edit computeHierarchy flag for root terminologies in the UI
  - NE-317: Upgrade GenerateNciMetaDataMojo to load ALL workflow configurations from files, not just QA.
  - NE-319: JFW: use your own local deployment of the tool to edit and test queries.
  - NE-318: RAW: the "make checklists for report tables" step is always producing checklists of size zero; this can't be right, can it?
  - There is still some concern about management of retired project concepts
    - prod-mid cleanup should remove any old version atoms (as it does)
    - NE-322: It should then remove any "empty" concepts (no atoms) that were not assigned a CUI (e.g. id = terminologyId).
    - NE-323: Reload CUI history should ensure that there is not more than one project concept in the database with the same CUI assignment (as terminologyId).
  - ongoing work.
  - NE-302: Prepare a ResetProductionNciDatabase integration test (and corresponding mojo), based on the current one but without the parts we don't need.
    - should load all workflow configs from files
  - ongoing work.

Diagrams

NCI Gliffy Diagrams

...

DONE: MatrixInitializerAlgorithm
DONE: StampingAlgorithm
DONE: LexicalClassAssignmentAlgorithm
DONE: ComputePreferredNamesAlgorithm (code, concept, descriptor)
- computes and sets preferred names where doesn't match
- also computes and sets "publishable" where doesn't match
DONE: ReindexAlgorithm
- indexed objects parameter
DONE: ReplaceAttributesAlgorithm extends AttributeLoaderAlgorithm
- attributes.src (in the inputPath)
- Any source, attribute_name combo in the file gets removed - match on terminology, version, name
- attributes.src gets loaded as normal.
- ALTERNATE: implement as AttributeLoaderAlgorithm with a "replace" parameter (boolean)
- ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Attributes"
*IN PROGRESS: ReplaceRelationshipsAlgorithm extends RelationshipsLoaderAlgorithm
- relationships.src, contexts.src (in the inputPath)
- Same as above
- Match on terminology, version, relationshipType, additionalRelationshipType (and inverses)
- ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Relationships"
*IN PROGRESS: ReplaceContextsAlgorithm extends ContextLoaderAlgorithm
- Same as above
- Match on terminology, version
- ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Contexts"
*IN PROGRESS: RecomputeContextsAlgorithm - calls remove tree positions, then compute tree positions
DONE: use rootTerminology.hierarchyComputable to know whether to load or compute (update Context Loader)
- root terminology editing in interface needs to support changing this.
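
The replace behavior described above for the attribute loader (remove anything matching on terminology, version, name, then load attributes.src as normal) can be sketched as follows. This is a minimal illustration in Python rather than the project's own code; the `Attribute` record, the `replace_attributes` function, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for one loaded attribute row."""
    terminology: str
    version: str
    name: str
    value: str


def replace_attributes(existing, incoming):
    """Remove existing attributes whose (terminology, version, name)
    appears in the incoming file, then load the incoming rows as normal."""
    keys = {(a.terminology, a.version, a.name) for a in incoming}
    kept = [a for a in existing
            if (a.terminology, a.version, a.name) not in keys]
    return kept + list(incoming)


# Made-up sample data: one attribute is replaced, one is untouched.
existing = [
    Attribute("SRC_A", "1.0", "ATTR_X", "old value"),
    Attribute("SRC_A", "1.0", "ATTR_Y", "unrelated value"),
]
incoming = [Attribute("SRC_A", "1.0", "ATTR_X", "new value")]
result = replace_attributes(existing, incoming)
```

The same shape would apply to the relationship and context variants, with only the match key changing.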

...

DONE: WriteRrfMetadataFilesAlgorithm
- *write release.dat also
DONE: WriteRrfContentFilesAlgorithm
- Write AMBIG and CHANGE files.
DONE: WriteRrfHistoryFilesAlgorithm (including writing NCI files)
- IN PROGRESS: refactoring.
DONE: WriteRrfIndexesFilesAlgorithm
DONE: ValidateReleaseAlgorithm (referential CUI checking, etc.)
DONE: RunMetamorphoSysAlgorithm (this makes METASUBSET which is used for packaging)
- This also uses make_config.csh (now in project) to build the final prop files
DONE: PackageReleaseAlgorithm
- Create the final .zip file? See the wiki instructions.

...

DONE: Application Config/ Project Management/ WorkflowConfig management
- AccessRestriction - By Project (based on user role??)
  - READ_ONLY - null
  - ADMIN (maintenance/insertion is OK, but not authoring) - checked by meta editing service and workflow service (performWorkflowAction).
  - AUTHORING (all changes are allowed) - (double check this when actions are performed - e.g. service?)
  - Update project should send a websocket event (so that ui can prevent editing)
- Processes need to check access restriction before starting - checked by process service
- Process has isRunning and ProcessService will have "isProcessRunning"
  - This represents the global lock.
- ProcessConfig/Execution, AlgorithmConfig/Execution, process execution should point back to its config, but also copy it (sub/superclass)
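
As a rough sketch of how the access restriction check and the global process lock described above might fit together: the class and method names below are hypothetical, and the assumption that a process may not start under READ_ONLY is an interpretation of the notes, not something they state outright.

```python
from enum import Enum


class AccessRestriction(Enum):
    READ_ONLY = "READ_ONLY"
    ADMIN = "ADMIN"          # maintenance/insertion OK, but not authoring
    AUTHORING = "AUTHORING"  # all changes are allowed


class ProcessService:
    """Hypothetical process service that holds the global lock."""

    def __init__(self):
        self._running = False

    def is_process_running(self):
        return self._running

    def start(self, restriction):
        # Processes check the access restriction before starting
        # (assumption: READ_ONLY blocks processes as well as edits).
        if restriction is AccessRestriction.READ_ONLY:
            raise PermissionError("project is read-only")
        # Only one process may hold the global lock at a time.
        if self._running:
            raise RuntimeError("another process is already running")
        self._running = True

    def finish(self):
        self._running = False


def can_author(restriction):
    """Authoring actions are allowed only under AUTHORING."""
    return restriction is AccessRestriction.AUTHORING
```

In this sketch an editing service would call `can_author` on each action, while the process service alone owns the running flag.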
DONE: Content and Metadata Model Objects
- DONE: Hierarchies like MSH - not based on transitive closure - need to update the RRF loader to support inserting tree positions rather than computing them (but needs configuration)
  - AUI -> CODE -> CODE -> SDUI -> SDUI*
- DONE: AtomTreePosition/Jpa/UnitTest
- DONE: SRC concepts - affects RrfLoaderAlgorithm
  - Load code also as a corresponding "organizing class type" (e.g. for RHT) - ?
  - Properly fix the relationships as well.
- DONE: Rels where type of id1 is different than type of id2
  - Could just use atom-atom rels with a relationship attribute of the real sg_type1/2
  - ComponentInfoRelationship
- DONE: Rels represented in both directions (and UIs)
  - Including reconciling misuse of "relationship group" in both directions.
  - Always use "blank" instead of 0.
  - Handle cases of essentially duplicate RUIs...
    - either disambiguate (e.g. with a DA flag)
    - or clean up the sample data
    - however, the loader should identify and/or correct this data condition. (e.g. identify where RUIs and inverses are ambiguous and just re-assign RUIs completely from scratch)
    - this will matter much more when handling full data.
- DONE: Atom -> modeled as lowerNameHash and uses MD5
- DONE: User -> "team" (for modeling groups), then project can be "isTeamBased" like mapping tool.
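
The lowerNameHash idea noted above for atoms can be illustrated with a short sketch: hash the lowercased name with MD5 so that names differing only in case share a hash. The function name, the UTF-8 encoding, and the hex-digest form are assumptions; the notes only say that the field is a lowercase name hash using MD5.

```python
import hashlib


def lower_name_hash(name):
    """MD5 hex digest of the lowercased atom name (UTF-8 encoded)."""
    return hashlib.md5(name.lower().encode("utf-8")).hexdigest()


# Names that differ only in case produce the same hash.
h1 = lower_name_hash("Heart Attack")
h2 = lower_name_hash("heart attack")
```

A fixed-length hash like this is convenient for indexed equality lookups on long names, which is presumably the motivation.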
Editing
- DONE: REST API : MetaEditingServiceRest (Client/Impl)
  - add/removeSemanticType(Long projectId, Long conceptId, SemanticTypeComponent, authToken)
  - merge, move, split
  - add/removeRelationship (concept level)
  - add/remove/updateAtom
  - Approve concept
- DONE: atomic/molecular actions
- DONE: ID assignment
  - Perform UI assignment in most cases during actions
  - Do not perform terminologyId assignment for SemanticTypeComponent
  - Do not perform terminologyId assignment for ConceptRelationship (e.g. for UMLS relationships)
  - Do not perform terminologyId assignment for CUIs (e.g. UMLS concepts)
- DONE: Support uploading an editor manual (track as project attachments - like "ReleaseArtifact" -> rename as Attachment)
DONE: Insertion - like a "loader"
- Recipe
  - Steps are "algorithms" with configuration. Just add a step, remove a step, reorder a step, or reconfigure a step.
  - all written to be agnostic about SAB.
- Complete UI handling
- Attribute -> Definition, Subset, Mapping, SubsetMember, Mapping (and requisite attributes and ID assignments)
- Compute delta
- source data loader?, link to local file system - uses ContentServiceJpa
- insertion recipes (tracked over time).
- src_atom_id handling (need to track in DB because of cross-source relationships)
- loading data (unit or batch?)
- merge engine
- matching/demotions
- atom ordering
- Ensure that update releasability is done before merging so "publishable" is a proxy for "version"
- Maintenance Tools for insertion
  - Mark deleted CUIs as deleted (e.g. instead of bequeathing them)
  - bequeathing concepts based on matching criteria
  - Performing merges based on matching criteria (e.g. query-based merges)
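
The recipe notion above (steps are algorithms with configuration that can be added, removed, reordered, or reconfigured) could be modeled roughly as below. `Step` and `Recipe` are hypothetical names for illustration; the algorithm names are borrowed from elsewhere on this page, and the ordering shown is not meant to be a meaningful insertion.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One recipe step: an algorithm name plus its configuration."""
    algorithm: str
    config: dict = field(default_factory=dict)


class Recipe:
    def __init__(self, steps=None):
        self.steps = list(steps or [])

    def add(self, step, index=None):
        if index is None:
            self.steps.append(step)
        else:
            self.steps.insert(index, step)

    def remove(self, algorithm):
        self.steps = [s for s in self.steps if s.algorithm != algorithm]

    def move(self, algorithm, new_index):
        i = next(i for i, s in enumerate(self.steps)
                 if s.algorithm == algorithm)
        self.steps.insert(new_index, self.steps.pop(i))

    def reconfigure(self, algorithm, **config):
        for s in self.steps:
            if s.algorithm == algorithm:
                s.config.update(config)

    def names(self):
        return [s.algorithm for s in self.steps]


recipe = Recipe([Step("AttributeLoaderAlgorithm"),
                 Step("ContextLoaderAlgorithm"),
                 Step("ReindexAlgorithm")])
recipe.add(Step("MatrixInitializerAlgorithm"), index=2)
recipe.reconfigure("AttributeLoaderAlgorithm", replace=True)
recipe.move("ReindexAlgorithm", 1)
recipe.remove("ContextLoaderAlgorithm")
```

Because steps carry only a name and a config dictionary, nothing in the recipe itself depends on a particular SAB.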
Maintenance Tools
- DONE: Precedence list management
- ~~Cluster type STY management (Brian has no recollection what this is)~~
- DONE: Source information.
- DONE: insert attributes, insert stys, insert relationships, recompute tree positions, ... (see $MEME_HOME/bin) (already done in Insertion process algorithms)
Workflow
- DONE: start with workflow stuff from Refset
- DONE: Model objects
  - TrackingRecord (origConceptIds, componentIds, clusterId, clusterType, etc).
  - Epoch
  - ME, QA, AH bins - WorkflowConfig, WorkflowBin
  - Worklist, Checklist
  - WorkflowBinStatistics
  - WorklistStatistics
  - WorkflowBinDefinition
    - Track "isRequired" as a flag indicating required for release.
- DONE: WorkflowActionHandler
  - Meta Editing Workflow Diagram
- DONE: Services
  - clear/regenerateBin(s)
  - createChecklist/Worklist
  - getWorkflowB
- DONE: Checklist
  - Creating checklist from a workflow bin
  - Creating checklist from a query (SQL/HQL/Lucene)
  - Creating checklist from a list of conceptId
  - Creating checklist from a list of clusterId, conceptId
  - Creating checklist from a file of (conceptId)
  - Creating checklist from a file of (clusterId, conceptId)
- DONE: Worklist
  - Creating a worklist -> should probably specify the team
- DONE: Stamping (batch "concept approval" action)
- DONE: Semantic type categories - chem/nonchem
  - Model as part of Project -
- DONE: When lists are returned (e.g. on "finish"), track the edit/review time.
  - Worklist/Jpa .get/setAuthor/ReviewerTime
- DONE: Track "team" of a worklist
  - Worklist/Jpa get/setTeam
  - Project/Jpa get/isTeamBased
- DONE: Import/Export
  - export worklist (clusterId\tconceptId\tname)
  - export checklist
  - create checklist from file
  - export workflow config
  - import workflow config (on project editing)
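
The worklist export format listed above (clusterId\tconceptId\tname) and the "create checklist from file" counterpart can be sketched as a simple round trip. The function names and sample rows are hypothetical, and the sketch assumes numeric ids with one record per line.

```python
import csv
import io


def export_worklist(rows):
    """Write (clusterId, conceptId, name) rows as tab-delimited text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for cluster_id, concept_id, name in rows:
        writer.writerow([cluster_id, concept_id, name])
    return buf.getvalue()


def parse_checklist_file(text):
    """Read (clusterId, conceptId) pairs back, ignoring any name column."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [(int(r[0]), int(r[1])) for r in reader if r]


# Made-up rows: two concepts in cluster 1, one in cluster 2.
rows = [(1, 1001, "Example concept A"),
        (1, 1002, "Example concept B"),
        (2, 1003, "Example concept C")]
text = export_worklist(rows)
pairs = parse_checklist_file(text)
```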
QA
- DONE: Matrix init (recompute concept status based on workflow status of embedded objects).
- DONE: Validation for objects (concept, atom, etc.) as well as validation for actions (e.g. move, merge, split)
- IN PROGRESS (JFW): MID Validation - query-based validation that feeds into workflow system (e.g. "create checklist")
- DONE: EMS QA Bins -
  - QA Bin Definitions - query-based QA bins
- DONE: Sty-cooc
- QA that will get left out
  - Counts, Comparisons, Adjustments, Sampling
  - STY QA
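
The matrix init item in the QA list above (recompute concept status based on the workflow status of embedded objects) might look like the following in its simplest form. The status names and the "any component needs review" rule are assumptions made for illustration; the notes do not spell out the actual rule.

```python
def recompute_concept_status(component_statuses):
    """A concept needs review if any embedded component does;
    otherwise it is treated as ready (assumed rule)."""
    if any(s == "NEEDS_REVIEW" for s in component_statuses):
        return "NEEDS_REVIEW"
    return "READY_FOR_PUBLICATION"


# Hypothetical component statuses for two concepts.
status_a = recompute_concept_status(
    ["READY_FOR_PUBLICATION", "NEEDS_REVIEW"])
status_b = recompute_concept_status(
    ["READY_FOR_PUBLICATION", "READY_FOR_PUBLICATION"])
```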

Reporting

DONE: Concept reports - unit and batch modes
DONE: Query-based reports (role-based) (all reports from MEME4 are implemented in new system) (reporting code has been brought over)
n/a: Tools for researching issues in inversions (no one used these)

DONE: Canned reports

Daily editing report


EMS v3 Daily Editing Report for Dec 01, 2016
Database: memestg
Time now: Fri Dec  2 06:02:22 EST 2016

Concepts Approved this day: 105
                  Distinct: 105
Number of actions this day: 193

Shown below are editing statistics for each authority.  The E-{initials}
authority shows approvals done in the interface while the S-{initials}
authority counts batch or stamping approvals.  The percentages show
the proportion of each, by editor.

Authority  Actions  Concepts Approved  Rels Inserted  STYs Inserted  Splits  Merges
---------  -------  -----------------  -------------  -------------  ------  ------
   E-LAR       100        38 (100.0%)         19             1           1      15

   E-LLW        93        67 (100.0%)          4             4           2       0

--------------------------------------------
For more detail, follow this link to the EMS

Production
- DONE: Bequeathal relationship strategies.
- DONE: LUI reassignment!
- DONE: CUI assignment
  - last assigned cui
  - last released cui
  - ConceptIdentity table to track max id?
- DONE: Semantic type Component ATUI assignment
- DONE: ConceptRelationship RUI assignment
- DONE: CUI history
  - Model by leaving old concepts around and linking them to "live" concepts (for bequeathal)
  - Need to update concept history with new bequeathal rels, follow recursion, etc.
- DONE: AUI history
- DONE: When creating MRHIER, add in the SRC/RHT layer as needed
- DONE: Incremental release (or export of a single source)
  - what about CUI assignment?
- DONE: Begin editing cycle?
- DONE: Ability to export RRF for just a single source?? (what about CUIs - can force temp CUIs)
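
The CUI history item above mentions leaving old concepts around, linking them to "live" concepts, and following recursion. One way to picture that traversal is the sketch below. The function name, the dictionary representation of bequeathal links, and the sample CUIs are hypothetical.

```python
def resolve_live_cuis(cui, bequeathed_to, live):
    """Follow bequeathal links from a retired CUI until live CUIs are reached."""
    found, seen, stack = set(), set(), [cui]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the history data
        seen.add(current)
        if current in live:
            found.add(current)
        else:
            stack.extend(bequeathed_to.get(current, []))
    return sorted(found)


# Made-up history: C0000001 -> C0000002 -> two live concepts.
bequeathed_to = {
    "C0000001": ["C0000002"],
    "C0000002": ["C0000003", "C0000004"],
}
live = {"C0000003", "C0000004"}
targets = resolve_live_cuis("C0000001", bequeathed_to, live)
```

The `seen` set is there so that a bad cycle in the history data cannot loop forever.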
Cross-cutting
- DONE: Websocket
- DONE: Disable editing (e.g. don't allow editors to make changes) - how to store this?
  - for admin processes, etc?
  - ProjectJpa flag?
- DONE: Query engine - HQL, SQL, LUCENE -> produces clusterId, conceptId
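
To illustrate the query engine contract noted above (a query produces clusterId, conceptId), here is a small sketch using SQL only, with Python's built-in sqlite3 standing in for the real database. The table, its columns, the status values, and the `run_cluster_query` helper are all hypothetical.

```python
import sqlite3
from collections import defaultdict


def run_cluster_query(conn, sql):
    """Run a SQL query that must yield (clusterId, conceptId) rows and
    group the concept ids by cluster."""
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description]
    if columns != ["clusterId", "conceptId"]:
        raise ValueError("query must select clusterId, conceptId")
    clusters = defaultdict(list)
    for cluster_id, concept_id in cursor:
        clusters[cluster_id].append(concept_id)
    return dict(clusters)


# Hypothetical schema and data for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE concepts (id INTEGER, workflowStatus TEXT)")
conn.executemany(
    "INSERT INTO concepts VALUES (?, ?)",
    [(1, "NEEDS_REVIEW"), (2, "READY_FOR_PUBLICATION"), (3, "NEEDS_REVIEW")])
clusters = run_cluster_query(
    conn,
    "SELECT id AS clusterId, id AS conceptId FROM concepts "
    "WHERE workflowStatus = 'NEEDS_REVIEW' ORDER BY id")
```

Checking the column names up front gives a clear error for queries that do not follow the contract, which matters when the result feeds checklist or bin creation.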

...

Out Of Scope (mostly things for NLM)

QA that will get left out
- Counts, Comparisons, Adjustments, Sampling
- STY QA
- Research Unmapped Identifiers
TOP level relationships PAR/CHD involving SRC atoms
- need approval
MRAUI history tracking for past releases (just produce the current release atom movements)
Hierarchies with different kind of objects used (e.g. CODE-SDUI)
- Resolution of top-level MSH hierarchy using different SG_TYPE than rest of tree. → must be the same
- top-level SRC atoms are excepted
Mappings loaded from .src that do not use SRC_ATOM_ID are not supported
"ST", "DA", "MR" attributes (not present in NCI-META)
Support for tobereleased = Y, y, n, N
- There is a "publishable" flag and it is true or false; that's it. Other mechanisms would need to be used to make this distinction.
Custom Safe replacement steps for different terminologies/termTypes within same set of .src files.
Special support for non-ENG content
- Hiding/showing in UI
- Moving/Merge/Split moving non-ENG atoms along with "translation_of" ENG ones.
Report stuff:
- LEXICAL_TAG, LEGACY_CODE, EZ/RN:EC NUMBER
Definition or LT editing
Some integrity checks - MGV_B2, MGV_D, ...
Relationship type "LK"
P level relationships (now there are just atom-atom and a workflowStatus of "DEMOTION").
Embryo concept status
Team assignment for worklists.
Content Views
AH bins
Cluster types beyond "chem" and "nonchem"
QA "sampling" and UI
Map metadata editor
Map set viewer
Source information management system
Handling of AQ/QB relationships in a special way for release (these are just regular relationships)

...

Version	Old Version 156	New Version 157
Changes made by	Deborah Shapiro	bcarlsen (Unlicensed)
Saved on	Jun 12, 2017	Jun 13, 2017
