...
- Look up definitions of bins (e.g. logically what do they do, we'll need descriptions too
- Write up description
- Write and test query (with :terminology/:version interpolations)
REQUIRED bins
...
deleted_cui_split
# Source merge rules
Merged SCUI current version NCI atoms
Merged SDUI current version MDR atoms
Merged SDUI current version PDQ atoms
Merged SCUI current version CBO atoms
RXCUI Merges
# Source split rules
Split SCUI current version NCI (or sub-source) atoms
SNOMED concept clusters where the FN and PT terms are separated
RXCUI Splits
# Cross source merge rules
Concepts containing current version NCI and current version PDQ atoms
Concepts containing current version NCI and current version SNOMEDCT atoms
# NCIMTH/PN rules
Ambiguous concepts where at least one has no PN (sepstring)
Ambiguous concepts where at least one has no PN (sepstring)
Ambiguous concepts where at least one pair lacks an approved REL
Identical (same SUI) PN's in multiple concepts
Concepts with multiple MTH/PN atoms
Concept has MTH/PN atom but no ambiguous string
MTH/PN atom is ambiguous but has no matching ambiguous string
# General PN rules
MTH/PNs on their own
# STY rules
Find concepts with Clinical Drug STY and any other STY
No STY
concepts with more than 3 STYs
Reviewed concepts without releasable Semantic Types
One STY is an ancestor of another in the STY isa hierarchy
Current version CBO concepts with CHEM STYs
Current version GO concepts with CHEM STYs
Current version MDR concepts with CHEM STYs
# Relationship rules
Concepts with no releaseable relationships to any other concept
Short form in one concept, long form in another
# CUI rules
Potential missed synonmym between CL* and CO*
CUIs that are going away - will need bequeathal rel
Complex split/merge case CUIs that are going away - will need bequeathal rel
CUIS that are going away (without MTH-only) - will need bequeathal rel
# Other rules
Concepts with identical LUI atoms that have mixed suppressibility
# Awaiting Questions
Merged current version NCI/PT atoms
Demotions overlapping with a bad merge that was split
...
nci_merge: Merged SCUI current version NCI atoms
Code Block select a.id clusterId, adistinct c.id conceptId from concepts ac, concepts_atoms bca, atoms ca where ac.terminology = :terminology and ac.id = bca.concepts_id and bca.atoms_id = ca.id and a.publishable = true and ca.terminology='NCI' group by ac.id having count(distinct ca.conceptId)>1
nci_sub_split: Split SCUI current version NCI (or sub-source) atoms
Code Block TBD: select a.id clusterId, a.id conceptId from concepts a, concepts_atoms b, atoms c where a.terminology = :terminology and a.id = b.concepts_id and b.atoms_id = c.id and c.terminology in (select terminology from root_terminologies where family = 'NCI' and terminology != 'NCI') group by a.id having count(distinct c.conceptId)>1
sct_sepfnpt: SNOMED concept clusters where the FN and PT terms are separated
cdsty_coc: Find concepts with Clinical Drug STY and any other STY
multsty:concepts with more than 3 STYs|sfo_lfo: Short form in one concept, long form in another
deleted_cui: CUIs that are going away - will need bequeathal rel
styisa:One STY is an ancestor of another in the STY isa hierarchy
NON-required bins
...
- Same as nci_merge but with CBO
...
rxnorm_split: RXCUI splits
Code Block |
---|
see split query above |
...
--Identify Sub-source atoms that aren't in the same concept as the NCI atom with the same conceptId SELECT DISTINCT c.id conceptId1, c1.id conceptId2 FROM concepts c, concepts_atoms ca, atoms a, concepts c1, concepts_atoms ca1, atoms a1 WHERE c.terminology = :terminology AND c1.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND a.terminology = 'NCI' AND a1.terminology IN (SELECT terminology FROM root_terminologies WHERE family = 'NCI' AND terminology != 'NCI') AND a.conceptId = a1.conceptId AND c.id != c1.id
sct_sepfnpt: SNOMED concept clusters where the FN and PT terms are separated
Code Block SELECT DISTINCT c.id conceptId1, c1.id conceptId2 FROM concepts c, concepts_atoms ca, atoms a, concepts c1, concepts_atoms ca1, atoms a1 WHERE c.terminology = :terminology AND c1.terminology = :terminology AND a.terminology = 'SNOMEDCT_US' AND a1.terminology = 'SNOMEDCT_US' AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND a.termType = 'FN' AND a1.termType = 'PT' AND a.conceptId = a1.conceptId AND c.id != c1.id
cdsty_coc: Find concepts with Clinical Drug STY and any other STY
Code Block SELECT DISTINCT c.id conceptId FROM concepts c, concepts_semantic_type_components cs WHERE c.terminology = :terminology AND c.id = cs.concepts_id AND cs.concepts_id IN (SELECT cs.concepts_id FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s WHERE c.terminology = :terminology AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.semanticType = 'Clinical Drug') GROUP BY cs.concepts_id HAVING COUNT(cs.concepts_id) > 1
multsty:concepts with more than 3 STYs|
Code Block SELECT DISTINCT cs.concepts_id conceptId FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s WHERE c.terminology = :terminology AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id GROUP BY c.id HAVING COUNT(c.id) > 3
styisa:One STY is an ancestor of another in the STY isa hierarchy
Code Block select t1.conceptId from (select c.id conceptId, st.id styId, st.treeNumber from concepts c, concepts_semantic_type_components cstc, (select stc.id, st.treeNumber from semantic_type_components stc join semantic_types st on (stc.semanticType = st.expandedForm)) st WHERE c.terminology = :terminology AND c.id = cstc.concepts_id AND cstc.semanticTypes_id = st.id) t1 JOIN (select c.id conceptId, st.id styId, st.treeNumber from concepts c, concepts_semantic_type_components cstc, (select stc.id, st.treeNumber from semantic_type_components stc join semantic_types st on (stc.semanticType = st.expandedForm)) st WHERE c.terminology = :terminology AND c.id = cstc.concepts_id AND cstc.semanticTypes_id = st.id) t2 on (t1.conceptId = t2.conceptId and t1.styId != t2.styId and t1.treeNumber != t2.treeNumber) WHERE t2.treeNumber like concat(t1.treeNumber, '.%')
sfo_lfo: Short form in one concept, long form in another
Code Block --ShortForm/LongForm are related atoms -- These live in the DB as "SY" atom relationships with RELA value either equal to "expanded_form_of" or starting with "mth_" and ending with "_form_of" SELECT c1.id conceptId1, c2.id conceptId2 FROM concepts c1, concepts_atoms ca1, atoms a1, concepts c2, concepts_atoms ca2, atoms a2, (SELECT ar.from_id, ar.to_id FROM atom_relationships ar WHERE publishable = TRUE AND relationshipType = 'SY' AND (additionalRelationshipType = 'expanded_form_of' OR additionalRelationshipType LIKE 'mth_%_form_of')) sfoLfoRels WHERE c1.terminology = 'NCIMTH' AND c2.terminology = 'NCIMTH' AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND c2.id = ca2.concepts_id AND ca2.atoms_id = a2.id AND a1.id = sfoLfoRels.from_id AND a2.id = sfoLfoRels.to_id AND c1.id != c2.id
deleted_cui: CUIs that are going away - will need bequeathal rel
Code Block SELECT DISTINCT c.id conceptId FROM concepts c, concepts_atoms ca, atoms a WHERE c.terminology = 'NCIMTH' AND c.id != c.terminologyId AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND a.publishable = FALSE AND NOT c.id IN ( SELECT DISTINCT c.id conceptId FROM concepts c, concepts_atoms ca, atoms a WHERE c.terminology = 'NCIMTH' AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND a.publishable = TRUE ) AND NOT c.id IN ( SELECT DISTINCT c.id conceptId FROM concepts c, concept_relationships cr WHERE c.terminology = 'NCIMTH' AND c.id = cr.from_id AND cr.relationshipType like 'B%' ) AND NOT c.id IN ( SELECT c.id conceptId FROM concepts c, concepts_atoms ca WHERE c.terminology = 'NCIMTH' AND c.id = ca.concepts_id AND ca.concepts_id IN ( SELECT ca.concepts_id FROM concepts_atoms ca, atoms a WHERE ca.atoms_id = a.id AND a.terminology IN ('MTH', 'NCIMTH') AND a.termType = 'PN' ) GROUP BY ca.concepts_id HAVING COUNT(DISTINCT ca.atoms_id) = 1 ) AND NOT c.id IN ( SELECT ca.concepts_id conceptId FROM mrcui mr, atomjpa_conceptterminologyids ac, concepts_atoms ca, concepts cpt WHERE mr.cui1 = ac.conceptTerminologyIds AND ca.atoms_id = ac.AtomJpa_id AND cpt.id = ca.concepts_id AND cpt.terminology = 'NCIMTH' AND ac.conceptTerminologyIds_KEY = 'NCIMTH' AND mr.rel = 'DEL' )
NON-required bins
- rxnorm_merge: RXCUI Merges
Same as nci_merge but with RXNORM
Code Block select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='RXNORM' group by a.conceptId having count(distinct c.id)>1
- cbo_merge: Merged SCUI current version CBO atoms
Same as nci_merge but with CBO
Code Block select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='CBO' group by c.id having count(distinct a.conceptId)>1
mdr_merge: Merged SDUI current version MDR atoms
Code Block select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='MDR' group by c.id having count(distinct a.descriptorId)>1
pdq_merge: Merged SDUI current version PDQ atoms
Code Block select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='PDQ' group by c.id having count(distinct a.descriptorId)>1
- sct_sepfnpt: SNOMED concept clusters where the FN and PT terms are separated
- DUPLICATE of above
rxnorm_split: RXCUI splits
Code Block select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='RXNORM' group by a.conceptId having count(distinct c.id)>1
nci_pdq_merge: Concepts containing current version NCI and current version PDQ atoms
Code Block SELECT DISTINCT c.id conceptId FROM atoms a, atoms a1, concepts c, concepts c1, concepts_atoms ca, concepts_atoms ca1, terminologies t, terminologies t1 WHERE c.terminology = :terminology AND c1.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND a.terminology = 'NCI' AND a.terminology = t.terminology AND a.version = t.version AND t.current = TRUE AND a1.terminology = 'PDQ' AND a1.terminology = t1.terminology AND a1.version = t1.version AND t1.current = TRUE AND c.id = c1.id
nci_sct_merge: Concepts containing current version NCI and current version SNOMEDCT atoms
Code Block SELECT DISTINCT c.id conceptId FROM atoms a, atoms a1, concepts c, concepts c1, concepts_atoms ca, concepts_atoms ca1, terminologies t, terminologies t1 WHERE c.terminology = :terminology AND c1.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND a.terminology = 'NCI' AND a.terminology = t.terminology AND a.version = t.version AND t.current = TRUE AND a1.terminology = 'SNOMEDCT_US' AND a1.terminology = t1.terminology AND a1.version = t1.version AND t1.current = TRUE AND c.id = c1.id
ambig_no_pn: Ambiguous concepts where at least one has no MTH/PN nor NCIMTH/PN (sepstring)
Code Block --Ambiguous concept = concepts that share atoms that have the same case-insensitive name -- This is used by several of the below queries, so it should be created as a VIEW CREATE VIEW ambig_concepts AS SELECT DISTINCT c1.id conceptId1, c2.id conceptId2 FROM concepts c1, concepts_atoms ca1, atoms a1, concepts c2, concepts_atoms ca2, atoms a2 WHERE c1.terminology = 'NCIMTH' AND c2.terminology = 'NCIMTH' AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND c2.id = ca2.concepts_id AND ca2.atoms_id = a2.id AND c1.id < c2.id AND a1.lowerNameHash = a2.lowerNameHash AND a1.id != a2.id AND a1.publishable = TRUE AND a2.publishable = TRUE
Code Block --Use Pre-programmed View ambig_concepts SELECT conceptId1, conceptId2 FROM ambig_concepts WHERE NOT ambig_concepts.conceptId1 IN (SELECT ca.concepts_id FROM concepts_atoms ca, atoms a WHERE ca.atoms_id = a.id AND a.terminology IN ('NCIMTH' , 'MTH') AND a.termType = 'PN') AND NOT ambig_concepts.conceptId2 IN (SELECT ca.concepts_id FROM concepts_atoms ca, atoms a WHERE ca.atoms_id = a.id AND a.terminology IN ('NCIMTH' , 'MTH') AND a.termType = 'PN')
ambig_no_rel: Ambiguous concepts that lack an approved REL
Code Block --Use Pre-programmed View ambig_concepts SELECT conceptId1, conceptId2 FROM ambig_concepts WHERE NOT (conceptId1 , conceptId2) IN (SELECT cr.from_id, cr.to_id FROM concept_Relationships cr WHERE cr.publishable = TRUE AND cr.workflowStatus in ('READY_FOR_PUBLICATION','PUBLISHED'))
pn_pn_ambig: Identical (same SUI) PN's in multiple concepts
Code Block SELECT c1.id conceptId1, c2.id conceptId2 FROM concepts c1, concepts c2, concepts_atoms ca1, concepts_atoms ca2, (SELECT a1.id atomId1, a2.id atomId2 FROM atoms a1, atoms a2 WHERE a1.termType = 'PN' AND a2.termType = 'PN' AND a1.stringClassId = a2.stringClassId AND a1.id != a2.id) identicalPNAtoms WHERE c1.terminology = :terminology AND c2.terminology = :terminology AND c1.id = ca1.concepts_id AND c2.id = ca2.concepts_id AND ca1.atoms_id = identicalPNAtoms.atomId1 AND ca2.atoms_id = identicalPNAtoms.atomId2 AND c1.id != c2.id
multiple_pn: Concepts with multiple MTH/PN atoms
Code Block SELECT DISTINCT c.id conceptId FROM concepts c, atoms a, concepts_atoms ca WHERE c.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND a.terminology in ('MTH', 'NCIMTH') AND a.termType = 'PN' GROUP BY c.id HAVING COUNT(c.id) > 1
pn_no_ambig: Concept has MTH/PN atom but no ambiguous string
Code Block select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology in ('NCIMTH','MTH') and a.termType = 'PN' and not c.id in (select conceptId1 from ambig_concepts)
ambig_pn: MTH/PN atom is ambiguous but has no matching ambiguous string
Drop. No longer useful.
pn_orphan: MTH/PNs on their own
Code Block --Concepts whose only publishable atoms are MTH/PN or NCIMTH/PN SELECT c.id conceptId FROM concepts c, concepts_atoms ca WHERE c.terminology = :terminology AND c.id = ca.concepts_id AND ca.concepts_id IN (SELECT ca.concepts_id FROM concepts_atoms ca, atoms a WHERE ca.atoms_id = a.id AND a.terminology IN ('MTH' , 'NCIMTH') AND a.termType = 'PN' AND a.publishable = true) GROUP BY ca.concepts_id HAVING COUNT(DISTINCT ca.atoms_id) = 1
nosty: No STY
Code Block SELECT DISTINCT c.id conceptId FROM concepts c LEFT JOIN concepts_semantic_type_components cs ON (c.id = cs.concepts_id) WHERE c.terminology= :terminology AND c.publishable = TRUE AND cs.semanticTypes_id IS NULL
missing_sty: Reviewed concepts without releasable Semantic Types
Code Block SELECT DISTINCT c.id conceptId FROM concepts c WHERE c.terminology = :terminology AND c.publishable = TRUE AND c.workflowStatus IN ('READY_FOR_PUBLICATION' , 'PUBLISHED') AND NOT c.id IN (SELECT c.id FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s WHERE c.terminology = :terminology AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.publishable = TRUE)
cbo_chem: Current version CBO concepts with CHEM STYs
Code Block -- to find chemical semantic types (select distinct semanticTypeCategoryMap sty from projects a, ProjectJpa_semanticTypeCategoryMap b where a.id = b.ProjectJpa_id and semanticTypeCategoryMap_KEY = 'chemical' and a.terminology = :terminology); -- Rick update 3/2/2017 (select distinct semanticTypeCategoryMap_KEY sty from projects a, ProjectJpa_semanticTypeCategoryMap b where a.id = b.ProjectJpa_id and semanticTypeCategoryMap = 'chem' and a.terminology = :terminology)
Code Block SELECT DISTINCT c.id conceptId FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s, terminologies t WHERE c.terminology = 'CBO' AND c.terminology = t.terminology AND c.version = t.version AND t.current = TRUE AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.semanticType IN ((SELECT DISTINCT semanticTypeCategoryMap_KEY sty FROM projects a, ProjectJpa_semanticTypeCategoryMap b WHERE a.id = b.ProjectJpa_id AND semanticTypeCategoryMap = 'chem' AND a.terminology = :terminology))
go_chem: Current version GO concepts with CHEM STYs
Code Block SELECT DISTINCT c.id conceptId FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s, terminologies t WHERE c.terminology = 'GO' AND c.terminology = t.terminology AND c.version = t.version AND t.current = TRUE AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.semanticType IN ((SELECT DISTINCT semanticTypeCategoryMap_KEY sty FROM projects a, ProjectJpa_semanticTypeCategoryMap b WHERE a.id = b.ProjectJpa_id AND semanticTypeCategoryMap = 'chem' AND a.terminology = :terminology))
mdr_chem: Current version MDR concepts with CHEM STYs
Code Block SELECT DISTINCT c.id conceptId FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s, terminologies t WHERE c.terminology = 'MDR' AND c.terminology = t.terminology AND c.version = t.version AND t.current = TRUE AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.semanticType IN ((SELECT DISTINCT semanticTypeCategoryMap_KEY sty FROM projects a, ProjectJpa_semanticTypeCategoryMap b WHERE a.id = b.ProjectJpa_id AND semanticTypeCategoryMap = 'chem' AND a.terminology = :terminology))
true_orphan: Concepts with no releasable relationships to any other concept
Uses the "deep relationships" query; only needs to search FROM role since rels are bidirectionalCode Block SELECT c.id conceptId FROM concepts c WHERE c.terminology = :terminolgy AND c.publishable=TRUE AND NOT EXISTS (SELECT * FROM deep_concept_relationships dcr WHERE role='FROM' AND c.id = dcr.concepts_id)
deleted_cui_split: Complex split/merge case CUIs that are going away - will need bequeathal
Drop
split_demotions: Demotions overlapping with a bad merge that was split
Drop. No longer useful.nci_pt_mrg: Merged current version NCI/PT atoms
Code Block SELECT DISTINCT c.id conceptId FROM concepts c, atoms a, concepts_atoms ca WHERE c.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND a.terminology in ('NCI') AND a.termType = 'PT' GROUP BY c.id HAVING COUNT(c.id) > 1
mxsuppr: Concepts with identical LUI atoms that have mixed suppressibility
Code Block SELECT DISTINCT c.id conceptId FROM concepts c, concepts_atoms ca1, concepts_atoms ca2, (SELECT a1.id atomId1, a2.id atomId2 FROM atoms a1, atoms a2 WHERE a1.lexicalClassId = a2.lexicalClassId AND a1.id != a2.id AND a1.publishable = TRUE AND a2.publishable = TRUE AND a1.suppressible != a2.suppressible) mixedSuppresLuiAtoms WHERE c.terminology = :terminology AND c.id = ca1.concepts_id AND ca1.atoms_id = mixedSuppresLuiAtoms.atomId1 AND ca2.atoms_id = mixedSuppresLuiAtoms.atomId2 AND ca1.concepts_id = ca2.concepts_id
~ Mixed status concepts 196 Sat Jan 7 09:19:47 2017
~ Suppressible preferred name (level 0,9) 0 Sat Jan 7 09:19:58 2017