...
Questions
- What does Approve/Next do when working through a worklist? (What if a filter or sort is applied? What if it reaches the end but you started in the middle?)
- Loading a Database with Full 201610 Data (including scale testing insertions and release)
- Loading a Database with Sample 201604 Data (including testing insertions and release)
...
- Production deploy
- 6/12/2017 currently editing about 900 RADLEX concepts on MEME4
- Early July will likely add May/June NCI thesaurus together into MEME4 (not sure on size, complexity of this - and also dependent on new OWL2 inversion process)
- Transfer to new system for editing will happen 2nd half of July or later
- Will likely try to add NCI thesaurus at that point and shortly after add MTH
- Documented on NCI wiki on page "Loading MEME from NCI Meta"
- Priorities
- Documentation
- IN PROGRESS: NE-325: Screen shots (JFW)
- TODO: Text on wiki (BAC)
- NE-308: 30-sec to 1 min training videos (DSS, RAW)
- Mini run-throughs (stock dev build + running stock processes) - determine "correctness"
- Pre-production, Release (DONE)
- CORRECTNESS: compare output to input (REQUIRES work on MRSAT, MRREL, MRSAB, and MRHIER(MSH))
- Pre-production, Prod-Mid Cleanup, Feedback (requires a "release directory" (matching pre-prod) with MR files in it, can be exactly same as input MR files) (DONE)
- NCI Insertion, SNOMEDCT_US Insertion, MTH Insertion
- CORRECTNESS: verify assumptions and volume of data related to each step of the insertion.
- NCI Insertion, Pre-production, Release
- NCI Insertion, Pre-production, Prod-Mid Cleanup, Feedback
- Pre-production, Release (DONE)
- Scale testing
- ResetNciMetaDataDatabase - using 201610 data (input.dir = <dir with https://wci1.s3.amazonaws.com/NCI/NCIM_201610.zip>)
- NCI Insertion - DONE
- SNOMEDCT_US Insertion - IN PROGRESS
- MTH Insertion
- Pre-Production, Release
- Prod-Mid Cleanup, Feedback
- Feedback/Fixes - see Jira
- NEW - support ability to edit computeHierarchy flag for root terminologies in the UI
- NE-317: Upgrade GenerateNciMetaDataMojo to load ALL workflow configurations from files, not just QA.
- NE-319: JFW: use your own local deployment of the tool to edit and test queries.
- NE-318: RAW: the "make checklists for report tables" is always producing checklists of size zero. This can't be right, can it?
- There is still some concern about management of retired project concepts
- prod-mid cleanup should remove any old version atoms (as it does)
- NE-322: It should then remove any "empty" concepts (no atoms) that were not assigned a CUI (e.g. id = terminologyId).
- NE-323: Reload CUI history should ensure that there is not more than one project concept in the database with the same CUI assignment (as terminologyId).
- ongoing work.
- should load all workflow configs from files
- Documentation
Diagrams
...
- DONE: MatrixInitializerAlgorithm
- DONE: StampingAlgorithm
- DONE: LexicalClassAssignmentAlgorithm
- DONE: ComputePreferredNamesAlgorithm (code, concept, descriptor)
- computes and sets preferred names where they don't match
- also computes and sets "publishable" where it doesn't match
- DONE: ReindexAlgorithm
- indexed objects parameter
- DONE: ReplaceAttributesAlgorithm extends AttributeLoaderAlgorithm
- attributes.src (in the inputPath)
- Any source, attribute_name combo in the file gets removed - match on terminology, version, name
- attributes.src gets loaded as normal.
- ALTERNATE: implement as AttributeLoaderAlgorithm with a "replace" parameter (boolean)
- ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Attributes"
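The replace behavior described for attributes.src above can be sketched roughly as follows: collect the (terminology, version, name) keys present in the file, remove existing attributes that match any of those keys, then load the file rows as normal. This is only an illustration under stated assumptions; `ReplaceAttributesSketch`, `AttributeRow`, its fields, and the sample values are hypothetical and are not the tool's actual classes or data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of replace-on-load; not the tool's actual algorithm class. */
final class ReplaceAttributesSketch {

  /** Simplified stand-in for one attribute row (the field set is assumed). */
  static final class AttributeRow {
    final String terminology;
    final String version;
    final String name;
    final String value;

    AttributeRow(String terminology, String version, String name, String value) {
      this.terminology = terminology;
      this.version = version;
      this.name = name;
      this.value = value;
    }

    /** Match key: terminology, version, name. */
    String key() {
      return terminology + "|" + version + "|" + name;
    }
  }

  /** Removes existing rows whose key appears in the file, then appends the file rows. */
  static List<AttributeRow> replace(List<AttributeRow> existing, List<AttributeRow> incoming) {
    Set<String> keysInFile = new HashSet<>();
    for (AttributeRow row : incoming) {
      keysInFile.add(row.key());
    }
    List<AttributeRow> result = new ArrayList<>();
    for (AttributeRow row : existing) {
      if (!keysInFile.contains(row.key())) {
        result.add(row); // untouched: its key is not mentioned in the file
      }
    }
    result.addAll(incoming); // the file rows are loaded as normal
    return result;
  }

  public static void main(String[] args) {
    // Illustrative values only.
    List<AttributeRow> existing = Arrays.asList(
        new AttributeRow("SRC_A", "v1", "attr_x", "old"),
        new AttributeRow("SRC_A", "v1", "attr_y", "kept"));
    List<AttributeRow> incoming = Arrays.asList(
        new AttributeRow("SRC_A", "v1", "attr_x", "new"));
    for (AttributeRow row : replace(existing, incoming)) {
      System.out.println(row.name + "=" + row.value);
    }
  }
}
```

Implementing this as a boolean "replace" parameter on the loader, as the ALTERNATE item suggests, would amount to running the removal loop only when the flag is set.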
- *IN PROGRESS: ReplaceRelationshipsAlgorithm extends RelationshipsLoaderAlgorithm
- relationships.src, contexts.src (in the inputPath)
- Same as above
- Match on terminology, version, relationshipType, additionalRelationshipType (and inverses)
- ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Attributes"
- *IN PROGRESS: ReplaceContextsAlgorithm extends ContextLoaderAlgorithm
- Same as above
- Match on terminology, version
- ALSO: make a maintenance process that runs just this with replace turned on and a name of "Replace Attributes"
- *IN PROGRESS: RecomputeContextsAlgorithm - calls remove tree positions, then compute tree positions
- DONE: use rootTerminology.hierarchyComputable to know whether to load or compute (update Context Loader)
- root terminology editing in interface needs to support changing this.
...
- DONE: WriteRrfMetadataFilesAlgorithm
- *write release.dat also
- DONE: WriteRrfContentFilesAlgorithm
- Write AMBIG and CHANGE files.
- DONE: WriteRrfHistoryFilesAlgorithm (including writing NCI files)
- IN PROGRESS: refactoring..
- DONE: WriteRrfIndexesFilesAlgorithm
- DONE: ValidateReleaseAlgorithm (referential CUI checking, etc.)
- DONE: RunMetamorphoSysAlgorithm (this makes METASUBSET which is used for packaging)
- This also uses make_config.csh (now in project) to build the final prop files
- DONE: PackageReleaseAlgorithm
- Create the final .zip file? = see the wiki instructions
...
- DONE: Application Config/ Project Management/ WorkflowConfig management
- AccessRestriction - By Project (based on user role??)
- READ_ONLY - null
- ADMIN (maintenance/insertion is OK, but not authoring) - checked by meta editing service and workflow service (performWorkflowAction).
- AUTHORING (all changes are allowed) - (double check this when actions are performed - e.g. service?)
- Update project should send a websocket event (so that ui can prevent editing)
- Processes need to check access restriction before starting - checked by process service
- Process has isRunning and ProcessService will have "isProcessRunning"
- This represents the global lock.
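The access-restriction levels and the global process lock described above can be illustrated with a small sketch. It assumes READ_ONLY permits neither maintenance nor authoring, ADMIN permits maintenance/insertion only, AUTHORING permits everything, and that a process may not start while another is running. The class and method names here are hypothetical; the real checks live in the services named in the list.

```java
/** Hypothetical sketch of the access checks; names and exact rules are assumptions. */
final class AccessRestrictionSketch {

  enum AccessRestriction {
    READ_ONLY(false, false),
    ADMIN(true, false),     // maintenance/insertion is OK, but not authoring
    AUTHORING(true, true);  // all changes are allowed

    private final boolean maintenanceAllowed;
    private final boolean authoringAllowed;

    AccessRestriction(boolean maintenanceAllowed, boolean authoringAllowed) {
      this.maintenanceAllowed = maintenanceAllowed;
      this.authoringAllowed = authoringAllowed;
    }

    boolean isMaintenanceAllowed() {
      return maintenanceAllowed;
    }

    boolean isAuthoringAllowed() {
      return authoringAllowed;
    }
  }

  /** A process may start only if maintenance is allowed and no process holds the global lock. */
  static boolean canStartProcess(AccessRestriction restriction, boolean processRunning) {
    return restriction.isMaintenanceAllowed() && !processRunning;
  }

  /** Editing actions (merge, split, approve, ...) require authoring access. */
  static boolean canAuthor(AccessRestriction restriction) {
    return restriction.isAuthoringAllowed();
  }

  public static void main(String[] args) {
    for (AccessRestriction r : AccessRestriction.values()) {
      System.out.println(r + ": startProcess=" + canStartProcess(r, false)
          + ", author=" + canAuthor(r));
    }
  }
}
```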
- ProcessConfig/Execution, AlgorithmConfig/Execution, process execution should point back to its config, but also copy it (sub/superclass)
- DONE: Content and Metadata Model Objects
- DONE: Hierarchies like MSH - not based on transitive closure - need to update the RRF loader to support inserting tree positions rather than computing them (but needs configuration)
AUI -> CODE -> CODE -> SDUI -> SDUI*
- DONE: AtomTreePosition/Jpa/UnitTest
- DONE: SRC concepts - affects RrfLoaderAlgorithm
- Load code also as a corresponding "organizing class type" (e.g. for RHT) - ?
- Properly fix the relationships as well.
- DONE: Rels where type of id1 is different than type of id 2
- Could just use atom-atom rels with a relationship attribute of the real sg_type1/2
- ComponentInfoRelationship
- DONE: Rels represented in both directions (and UIs)
- Including reconciling misuse of "relationship group" in both directions.
- Always use "blank" instead of 0.
- Handle cases of essentially duplicate RUIs...
- either disambiguate (e.g. with a DA flag)
- or clean up the sample data
- however, the loader should identify and/or correct this data condition. (e.g. identify where RUIs and inverses are ambiguous and just re-assign RUIs completely from scratch)
- this will matter much more when handling full data.
- DONE: Atom -> modeled as lowerNameHash and uses MD5
- DONE: User -> "team" (for modeling groups), then project can be "isTeamBased" like mapping tool.
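For the lowerNameHash item above, a minimal sketch of an MD5 hash over a lowercased atom name might look like this. It assumes the hash is the hex-encoded MD5 of the UTF-8 bytes of the lowercased name; the actual normalization and encoding used by the tool may differ, and `LowerNameHashSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Hypothetical sketch; the tool's real normalization and encoding are not shown here. */
final class LowerNameHashSketch {

  /** Returns the hex MD5 of the lowercased name (assumed: Locale.ROOT, UTF-8). */
  static String lowerNameHash(String name) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest =
          md5.digest(name.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b & 0xff)); // two hex digits per byte
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // MD5 is a required algorithm on standard Java platforms.
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  public static void main(String[] args) {
    // Names differing only by case produce the same hash.
    System.out.println(lowerNameHash("Heart").equals(lowerNameHash("HEART")));
  }
}
```

Storing a fixed-length hash instead of the full lowercased name keeps an indexed lookup column short, which is presumably the motivation for this modeling choice.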
- Editing
- DONE: REST API : MetaEditingServiceRest (Client/Impl)
- add/removeSemanticType(Long projectId, Long conceptId, SemanticTypeComponent, authToken)
- merge, move, split
- add/removeRelationship (concept level)
- add/remove/updateAtom
- Approve concept
- DONE: atomic/molecular actions
- DONE: ID assignment
- Perform UI assignment in most cases during actions
- Do not perform terminologyId assignment for SemanticTypeComponent
- Do not perform terminologyId assignment for ConceptRelationship (e.g. for UMLS relationships)
- Do not perform terminologyId assignment for CUIs (e.g. UMLS concepts)
- DONE: Support uploading an editor manual (track as project attachments - like "ReleaseArtifact" -> rename as Attachment)
- DONE: Insertion - like a "loader"
- Recipe
- Steps are "algorithms" with configuration. Just add a step, remove a step, reorder a step, or reconfigure a step.
- all written to be agnostic about SAB.
- Complete UI handling
- Attribute -> Definition, Subset, Mapping, SubsetMember, Mapping (and requisite attributes and ID assignments)
- Compute delta
- source data loader?, link to local file system - uses ContentServiceJpa
- insertion recipes (tracked over time).
- src_atom_id handling (need to track in DB because of cross-source relationships)
- loading data (unit or batch?)
- merge engine
- matching/demotions
- atom ordering
- Ensure that update releasability is done before merging so "publishable" is a proxy for "version"
- Maintenance Tools for insertion
- Mark deleted CUIs as deleted (e.g. instead of bequeathing them)
- bequeathing concepts based on matching criteria
- Performing merges based on matching criteria (e.g. query-based merges)
- Maintenance Tools
- DONE: Precedence list management
- Cluster type STY management (Brian has no recollection of what this is)
- DONE: Source information.
- DONE: insert attributes, insert stys, insert relationships, recompute tree positions, ... (see $MEME_HOME/bin) (already done in Insertion process algorithms)
- Workflow
- DONE: start with workflow stuff from Refset
- DONE: Model objects
- TrackingRecord (origConceptIds, componentIds, clusterId, clusterType, etc).
- Epoch
- ME, QA, AH bins - WorkflowConfig, WorkflowBin
- Worklist, Checklist
- WorkflowBinStatistics
- WorklistStatistics
- WorkflowBinDefinition
- Track "isRequired" as a flag indicating required for release.
- DONE: WorkflowActionHandler
- DONE: Services
- clear/regenerateBin(s)
- createChecklist/Worklist
- getWorkflowB
- DONE: Checklist
- Creating checklist from a workflow bin
- Creating checklist from a query (SQL/HQL/Lucene)
- Creating checklist from a list of conceptId
- Creating checklist from a list of clusterId, conceptId
- Creating checklist from a file of (conceptId)
- Creating checklist from a file of (clusterId, conceptId)
- DONE: Worklist
- Creating a worklist -> should probably specify the team
- DONE: Stamping (batch "concept approval" action)
- DONE: Semantic type categories - chem/nonchem
- Model as part of Project -
- DONE: When lists are returned (e.g. on "finish"), track the edit/review time.
- Worklist/Jpa get/setAuthor/ReviewerTime
- DONE: Track "team" of a worklist
- Worklist/Jpa get/setTeam
- Project/Jpa get/isTeamBased
- DONE: Import/Export
- export worklist (clusterId\tconceptId\tname)
- export checklist
- create checklist from file
- export workflow config
- import workflow config (on project editing)
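The worklist export format noted above (clusterId, conceptId, name separated by tabs) can be sketched as below. This assumes one line per record, no header row, and that tabs or newlines inside a name are replaced by spaces so the columns stay aligned; `WorklistExportSketch` and `Row` are hypothetical names, not the tool's API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a tab-delimited worklist export. */
final class WorklistExportSketch {

  /** Simplified stand-in for one worklist entry. */
  static final class Row {
    final long clusterId;
    final long conceptId;
    final String name;

    Row(long clusterId, long conceptId, String name) {
      this.clusterId = clusterId;
      this.conceptId = conceptId;
      this.name = name;
    }
  }

  /** Writes clusterId, conceptId, name per line, tab-separated, newline-terminated. */
  static String export(List<Row> rows) {
    StringBuilder out = new StringBuilder();
    for (Row row : rows) {
      // Assumption: embedded tabs/newlines are flattened rather than escaped.
      String safeName = row.name.replace('\t', ' ').replace('\n', ' ');
      out.append(row.clusterId).append('\t')
          .append(row.conceptId).append('\t')
          .append(safeName).append('\n');
    }
    return out.toString();
  }

  public static void main(String[] args) {
    List<Row> rows = Arrays.asList(
        new Row(1, 100, "Heart"),
        new Row(1, 101, "Cardiac structure"));
    System.out.print(export(rows));
  }
}
```

A file in this shape, with the name column ignored or dropped, lines up with the "create checklist from a file of (clusterId, conceptId)" option listed earlier.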
- QA
- DONE: Matrix init (recompute concept status based on workflow status of embedded objects).
- DONE: Validation for objects (concept, atom, etc.) as well as validation for actions (e.g. move, merge, split)
- IN PROGRESS (JFW): MID Validation - query-based validation that feeds into workflow system (e.g. "create checklist")
- DONE: EMS QA Bins -
- QA Bin Definitions - query-based QA bins
- DONE: Sty-cooc
- QA that will get left out
- Counts, Comparisons, Adjustments, Sampling
- STY QA
- Research Unmapped Identifiers
- DONE: Concept reports - unit and batch modes
- DONE: Query-based reports (role-based) (all reports from MEME4 are implemented in new system) (reporting code has been brought over)
- n/a: Tools for researching issues in inversions (no one used these)
- DONE: Canned reports
Daily editing report
    EMS v3 Daily Editing Report for Dec 01, 2016
    Database: memestg
    Time now: Fri Dec 2 06:02:22 EST 2016
    Concepts Approved this day: 105 Distinct: 105
    Number of actions this day: 193

    Shown below are editing statistics for each authority.
    The E-{initials} authority shows approvals done in the interface
    while the S-{initials} authority counts batch or stamping approvals.
    The percentages show the proportion of each, by editor.

    Authority  Actions  Concepts Approved  Rels Inserted  STYs Inserted  Splits  Merges
    ---------  -------  -----------------  -------------  -------------  ------  ------
    E-LAR          100  38 (100.0%)                   19              1       1      15
    E-LLW           93  67 (100.0%)                    4              4       2       0
    --------------------------------------------
    For more detail, follow this link to the EMS
- DONE: Bequeathal relationship strategies.
- DONE: LUI reassignment!
- DONE: CUI assignment
- last assigned cui
- last released cui
- ConceptIdentity table to track max id?
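One way to picture the "last assigned cui" bookkeeping above is a helper that derives the next CUI from the last assigned one. The sketch assumes CUIs are a non-digit prefix followed by a zero-padded number (for example the UMLS convention of "C" plus seven digits); `CuiAssignmentSketch` is a hypothetical name, and persisting the counter (e.g. in a ConceptIdentity table) is out of scope here.

```java
/** Hypothetical sketch of next-CUI computation; persistence of the counter is omitted. */
final class CuiAssignmentSketch {

  /**
   * Returns the identifier following lastAssigned, keeping its prefix and zero padding.
   * Throws NumberFormatException if lastAssigned has no numeric part.
   */
  static String nextCui(String lastAssigned) {
    int i = 0;
    while (i < lastAssigned.length() && !Character.isDigit(lastAssigned.charAt(i))) {
      i++; // skip the non-digit prefix, e.g. "C"
    }
    String prefix = lastAssigned.substring(0, i);
    String digits = lastAssigned.substring(i);
    long next = Long.parseLong(digits) + 1;
    // Pad to the original width; the number simply grows wider on overflow.
    return prefix + String.format("%0" + digits.length() + "d", next);
  }

  public static void main(String[] args) {
    System.out.println(nextCui("C0000009"));
  }
}
```

Tracking the last released CUI separately, as the list suggests, would let a release check that no identifier at or below that value is ever handed out again.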
- DONE: Semantic type Component ATUI assignment
- DONE: ConceptRelationship RUI assignment
- DONE: CUI history
- Model by leaving old concepts around and linking them to "live" concepts (for bequeathal)
- Need to update concept history with new bequeathal rels, follow recursion, etc.
- DONE: AUI history
- DONE: When creating MRHIER, add in the SRC/RHT layer as needed
- DONE: Incremental release (or export of a single source)
- what about CUI assignment?
- DONE: Begin editing cycle?
- DONE: Ability to export RRF for just a single source?? (what about CUIs - can force temp CUIs)
- DONE: Websocket
- DONE: Disable editing (e.g. don't allow editors to make changes) - how to store this?
- for admin processes, etc?
- ProjectJpa flag?
- DONE: Query engine - HQL, SQL, LUCENE -> produces clusterId, conceptId
...
Out Of Scope (mostly things for NLM)
- QA that will get left out
- Counts, Comparisons, Adjustments, Sampling
- STY QA
- Research Unmapped Identifiers
- TOP level relationships PAR/CHD involving SRC atoms
- need approval
- MRAUI history tracking for past releases (just produce the current release atom movements)
- Hierarchies with different kind of objects used (e.g. CODE-SDUI)
- Resolution of top-level MSH hierarchy using different SG_TYPE than rest of tree. → must be the same
- top-level SRC atoms are excepted
- Mappings loaded from .src that do not use SRC_ATOM_ID are not supported
- "ST", "DA", "MR" attributes (not present in NCI-META)
- Support for tobereleased = Y, y, n, N
- There is a "publishable" flag and it is either true or false; that's it. Other mechanisms would need to be used to make this distinction.
- Custom Safe replacement steps for different terminologies/termTypes within same set of .src files.
- Special support for non-ENG content
- Hiding/showing in UI
- Moving/Merge/Split moving non-ENG atoms along with "translation_of" ENG ones.
- Report stuff:
- LEXICAL_TAG, LEGACY_CODE, EZ/RN:EC NUMBER
- Definition or LT editing
- Some integrity checks - MGV_B2, MGV_D, ...
- Relationship type "LK"
- P level relationships (now there are just atom-atom and a workflowStatus of "DEMOTION").
- Embryo concept status
- Team assignment for worklists.
- Content Views
- AH bins
- Cluster types beyond "chem" and "nonchem"
- QA "sampling" and UI
- Map metadata editor
- Map set viewer
- Source information management system
- Handling of AQ/QB relationships in a special way for release (these are just regular relationships)
...