QA Bin Definitions
Overview
The parameters :terminology (e.g. NCIMTH) and :version (e.g. latest) are allowed in the queries.
TODO:
- Look up definitions of bins (e.g. logically what do they do; we'll need descriptions too)
- Write up description
- Write and test query (with :terminology/:version interpolations)
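The :terminology and :version placeholders map naturally onto named bind parameters. The following is a minimal sketch of that interpolation, assuming Python's built-in sqlite3 module as a stand-in for the real database and a one-table toy schema. The run_bin_query helper is hypothetical and not part of the application.

```python
import sqlite3

def run_bin_query(conn, sql, terminology, version="latest"):
    """Run a bin query, binding :terminology and :version by name.

    Only the placeholders that actually occur in the SQL text are bound.
    (Hypothetical helper, for illustration only.)
    """
    candidates = {"terminology": terminology, "version": version}
    params = {k: v for k, v in candidates.items() if f":{k}" in sql}
    return [row[0] for row in conn.execute(sql, params)]

# Toy data: two NCIMTH concepts and one concept from another terminology.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE concepts (id INTEGER, terminology TEXT, version TEXT)")
conn.executemany(
    "INSERT INTO concepts VALUES (?, ?, ?)",
    [(1, "NCIMTH", "latest"), (2, "NCIMTH", "latest"), (3, "OTHER", "latest")],
)

sql = (
    "SELECT c.id conceptId FROM concepts c "
    "WHERE c.terminology = :terminology ORDER BY c.id"
)
ids = run_bin_query(conn, sql, "NCIMTH")
print(ids)
```

Binding by name rather than by string substitution keeps the query text identical to what is written on this page and avoids quoting problems.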
REQUIRED bins
nci_merge: Merged SCUI current version NCI atoms
select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='NCI' group by c.id having count(distinct a.conceptId)>1
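To make the merge condition concrete, here is a small sketch that runs the nci_merge logic against an in-memory SQLite database. The three-table schema is a simplified assumption covering only the columns the query touches, the sample codes are made up, and publishable is stored as 0/1 rather than a boolean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concepts (id INTEGER PRIMARY KEY, terminology TEXT);
CREATE TABLE atoms (id INTEGER PRIMARY KEY, terminology TEXT,
                    conceptId TEXT, publishable INTEGER);
CREATE TABLE concepts_atoms (concepts_id INTEGER, atoms_id INTEGER);
INSERT INTO concepts VALUES (1, 'NCIMTH'), (2, 'NCIMTH');
-- Concept 1 holds NCI atoms from two different source codes: a merge.
INSERT INTO atoms VALUES (10, 'NCI', 'C100', 1), (11, 'NCI', 'C200', 1);
-- Concept 2 holds two NCI atoms that share one source code: not a merge.
INSERT INTO atoms VALUES (12, 'NCI', 'C300', 1), (13, 'NCI', 'C300', 1);
INSERT INTO concepts_atoms VALUES (1, 10), (1, 11), (2, 12), (2, 13);
""")

# Same shape as the nci_merge query, with publishable compared to 1.
NCI_MERGE = """
SELECT DISTINCT c.id conceptId
FROM concepts c, concepts_atoms ca, atoms a
WHERE c.terminology = :terminology
  AND c.id = ca.concepts_id AND ca.atoms_id = a.id
  AND a.publishable = 1 AND a.terminology = 'NCI'
GROUP BY c.id
HAVING COUNT(DISTINCT a.conceptId) > 1
"""

merged = [row[0] for row in conn.execute(NCI_MERGE, {"terminology": "NCIMTH"})]
print(merged)
```

Only concept 1 is reported, because it is the only concept whose publishable NCI atoms carry more than one distinct source conceptId.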
nci_sub_split: Split SCUI current version NCI (or sub-source) atoms
--Identify Sub-source atoms that aren't in the same concept as the NCI atom with the same conceptId
SELECT DISTINCT c.id conceptId1, c1.id conceptId2 FROM concepts c, concepts_atoms ca, atoms a, concepts c1, concepts_atoms ca1, atoms a1 WHERE c.terminology = :terminology AND c1.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND a.terminology = 'NCI' AND a1.terminology IN (SELECT terminology FROM root_terminologies WHERE family = 'NCI' AND terminology != 'NCI') AND a.conceptId = a1.conceptId AND c.id != c1.id
sct_sepfnpt: SNOMED concept clusters where the FN and PT terms are separated
SELECT DISTINCT c.id conceptId1, c1.id conceptId2 FROM concepts c, concepts_atoms ca, atoms a, concepts c1, concepts_atoms ca1, atoms a1 WHERE c.terminology = :terminology AND c1.terminology = :terminology AND a.terminology = 'SNOMEDCT_US' AND a1.terminology = 'SNOMEDCT_US' AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND a.termType = 'FN' AND a1.termType = 'PT' AND a.conceptId = a1.conceptId AND c.id != c1.id
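The self-join above pairs every FN atom with every PT atom that has the same source conceptId and keeps the pairs that sit in different concepts. A plain-Python sketch of the same idea, assuming atoms are given as (concept id, source concept id, term type) tuples; the function name and sample identifiers are made up for illustration:

```python
def separated_fn_pt(atoms):
    """Return sorted (conceptId1, conceptId2) pairs where the FN atom of a
    source concept is in conceptId1 and its PT atom is in conceptId2.

    atoms: iterable of (concept_id, source_concept_id, term_type).
    """
    fn = [(c, s) for c, s, t in atoms if t == "FN"]
    pt = [(c, s) for c, s, t in atoms if t == "PT"]
    return sorted(
        {(c1, c2) for c1, s1 in fn for c2, s2 in pt if s1 == s2 and c1 != c2}
    )

sample = [
    (1, "S1", "FN"),  # FN of S1 lives in concept 1 ...
    (2, "S1", "PT"),  # ... but its PT lives in concept 2: separated
    (3, "S2", "FN"),
    (3, "S2", "PT"),  # FN and PT of S2 share concept 3: fine
]
pairs = separated_fn_pt(sample)
print(pairs)
```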
cdsty_coc: Find concepts with Clinical Drug STY and any other STY
SELECT DISTINCT c.id conceptId FROM concepts c, concepts_semantic_type_components cs WHERE c.terminology = :terminology AND c.id = cs.concepts_id AND cs.concepts_id IN (SELECT cs.concepts_id FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s WHERE c.terminology = :terminology AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.semanticType = 'Clinical Drug') GROUP BY cs.concepts_id HAVING COUNT(cs.concepts_id) > 1
multsty: Concepts with more than 3 STYs
SELECT DISTINCT cs.concepts_id conceptId FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s WHERE c.terminology = :terminology AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id GROUP BY c.id HAVING COUNT(c.id) > 3
styisa: One STY is an ancestor of another in the STY isa hierarchy
select t1.conceptId from (select c.id conceptId, st.id styId, st.treeNumber from concepts c, concepts_semantic_type_components cstc, (select stc.id, st.treeNumber from semantic_type_components stc join semantic_types st on (stc.semanticType = st.expandedForm)) st WHERE c.terminology = :terminology AND c.id = cstc.concepts_id AND cstc.semanticTypes_id = st.id) t1 JOIN (select c.id conceptId, st.id styId, st.treeNumber from concepts c, concepts_semantic_type_components cstc, (select stc.id, st.treeNumber from semantic_type_components stc join semantic_types st on (stc.semanticType = st.expandedForm)) st WHERE c.terminology = :terminology AND c.id = cstc.concepts_id AND cstc.semanticTypes_id = st.id) t2 on (t1.conceptId = t2.conceptId and t1.styId != t2.styId and t1.treeNumber != t2.treeNumber) WHERE t2.treeNumber like concat(t1.treeNumber, '.%')
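The ancestor test in this query rests entirely on the tree-number prefix: one STY is an ancestor of another when the descendant's tree number starts with the ancestor's tree number followed by a dot. A minimal sketch of that check in Python; the function names are hypothetical and the tree numbers are illustrative values only:

```python
def is_sty_ancestor(ancestor_tree_number, descendant_tree_number):
    """Mirror of `t2.treeNumber LIKE CONCAT(t1.treeNumber, '.%')`.

    The trailing dot matters: without it 'A1.4' would wrongly be treated
    as an ancestor of 'A1.41'.
    """
    return descendant_tree_number.startswith(ancestor_tree_number + ".")

def has_styisa_conflict(tree_numbers):
    """True if any STY of a concept is an ancestor of another of its STYs."""
    return any(
        is_sty_ancestor(a, b)
        for a in tree_numbers
        for b in tree_numbers
        if a != b
    )

print(has_styisa_conflict(["A1.4.1", "A1.4.1.1.1"]))
print(has_styisa_conflict(["A1.4.1", "B2.1"]))
```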
sfo_lfo: Short form in one concept, long form in another
--ShortForm/LongForm are related atoms
--These live in the DB as "SY" atom relationships with RELA value either equal to "expanded_form_of" or starting with "mth_" and ending with "_form_of"
SELECT c1.id conceptId1, c2.id conceptId2 FROM concepts c1, concepts_atoms ca1, atoms a1, concepts c2, concepts_atoms ca2, atoms a2, (SELECT ar.from_id, ar.to_id FROM atom_relationships ar WHERE publishable = TRUE AND relationshipType = 'SY' AND (additionalRelationshipType = 'expanded_form_of' OR additionalRelationshipType LIKE 'mth_%_form_of')) sfoLfoRels WHERE c1.terminology = 'NCIMTH' AND c2.terminology = 'NCIMTH' AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND c2.id = ca2.concepts_id AND ca2.atoms_id = a2.id AND a1.id = sfoLfoRels.from_id AND a2.id = sfoLfoRels.to_id AND c1.id != c2.id
deleted_cui: CUIs that are going away - will need bequeathal rel
SELECT DISTINCT c.id conceptId FROM concepts c, concepts_atoms ca, atoms a WHERE c.terminology = 'NCIMTH' AND c.id != c.terminologyId AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND a.publishable = FALSE AND NOT c.id IN ( SELECT DISTINCT c.id conceptId FROM concepts c, concepts_atoms ca, atoms a WHERE c.terminology = 'NCIMTH' AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND a.publishable = TRUE ) AND NOT c.id IN ( SELECT DISTINCT c.id conceptId FROM concepts c, concept_relationships cr WHERE c.terminology = 'NCIMTH' AND c.id = cr.from_id AND cr.relationshipType like 'B%' ) AND NOT c.id IN ( SELECT c.id conceptId FROM concepts c, concepts_atoms ca WHERE c.terminology = 'NCIMTH' AND c.id = ca.concepts_id AND ca.concepts_id IN ( SELECT ca.concepts_id FROM concepts_atoms ca, atoms a WHERE ca.atoms_id = a.id AND a.terminology IN ('MTH', 'NCIMTH') AND a.termType = 'PN' ) GROUP BY ca.concepts_id HAVING COUNT(DISTINCT ca.atoms_id) = 1 ) AND NOT c.id IN ( SELECT ca.concepts_id conceptId FROM mrcui mr, atomjpa_conceptterminologyids ac, concepts_atoms ca, concepts cpt WHERE mr.cui1 = ac.conceptTerminologyIds AND ca.atoms_id = ac.AtomJpa_id AND cpt.id = ca.concepts_id AND cpt.terminology = 'NCIMTH' AND ac.conceptTerminologyIds_KEY = 'NCIMTH' AND mr.rel = 'DEL' )
NON-required bins
rxnorm_merge: RXCUI Merges
Same as nci_merge but with RXNORM
select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='RXNORM' group by c.id having count(distinct a.conceptId)>1
cbo_merge: Merged SCUI current version CBO atoms
Same as nci_merge but with CBO
select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='CBO' group by c.id having count(distinct a.conceptId)>1
mdr_merge: Merged SDUI current version MDR atoms
select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='MDR' group by c.id having count(distinct a.descriptorId)>1
pdq_merge: Merged SDUI current version PDQ atoms
select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='PDQ' group by c.id having count(distinct a.descriptorId)>1
sct_sepfnpt: SNOMED concept clusters where the FN and PT terms are separated
Duplicate of the sct_sepfnpt bin listed under the required bins.
rxnorm_split: RXCUI splits
select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology='RXNORM' group by a.conceptId having count(distinct c.id)>1
nci_pdq_merge: Concepts containing current version NCI and current version PDQ atoms
SELECT DISTINCT c.id conceptId FROM atoms a, atoms a1, concepts c, concepts c1, concepts_atoms ca, concepts_atoms ca1, terminologies t, terminologies t1 WHERE c.terminology = :terminology AND c1.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND a.terminology = 'NCI' AND a.terminology = t.terminology AND a.version = t.version AND t.current = TRUE AND a1.terminology = 'PDQ' AND a1.terminology = t1.terminology AND a1.version = t1.version AND t1.current = TRUE AND c.id = c1.id
nci_sct_merge: Concepts containing current version NCI and current version SNOMEDCT atoms
SELECT DISTINCT c.id conceptId FROM atoms a, atoms a1, concepts c, concepts c1, concepts_atoms ca, concepts_atoms ca1, terminologies t, terminologies t1 WHERE c.terminology = :terminology AND c1.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND a.terminology = 'NCI' AND a.terminology = t.terminology AND a.version = t.version AND t.current = TRUE AND a1.terminology = 'SNOMEDCT_US' AND a1.terminology = t1.terminology AND a1.version = t1.version AND t1.current = TRUE AND c.id = c1.id
ambig_no_pn: Ambiguous concepts where at least one has no MTH/PN nor NCIMTH/PN (sepstring)
--Ambiguous concept = concepts that share atoms that have the same case-insensitive name
--This is used by several of the below queries, so it should be created as a VIEW
CREATE VIEW ambig_concepts AS SELECT DISTINCT c1.id conceptId1, c2.id conceptId2 FROM concepts c1, concepts_atoms ca1, atoms a1, concepts c2, concepts_atoms ca2, atoms a2 WHERE c1.terminology = 'NCIMTH' AND c2.terminology = 'NCIMTH' AND c1.id = ca1.concepts_id AND ca1.atoms_id = a1.id AND c2.id = ca2.concepts_id AND ca2.atoms_id = a2.id AND c1.id < c2.id AND a1.lowerNameHash = a2.lowerNameHash AND a1.id != a2.id AND a1.publishable = TRUE AND a2.publishable = TRUE
--Use Pre-programmed View ambig_concepts
SELECT conceptId1, conceptId2 FROM ambig_concepts WHERE NOT ambig_concepts.conceptId1 IN (SELECT ca.concepts_id FROM concepts_atoms ca, atoms a WHERE ca.atoms_id = a.id AND a.terminology IN ('NCIMTH' , 'MTH') AND a.termType = 'PN') AND NOT ambig_concepts.conceptId2 IN (SELECT ca.concepts_id FROM concepts_atoms ca, atoms a WHERE ca.atoms_id = a.id AND a.terminology IN ('NCIMTH' , 'MTH') AND a.termType = 'PN')
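The ambig_concepts view pairs up concepts that contain publishable atoms with the same case-insensitive name, reporting each pair once with the smaller id first. A rough Python sketch of that pairing, assuming lowerNameHash is a hash of the lowercased atom name, so plain lowercasing is used here in its place. The function name and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def ambiguous_pairs(atoms):
    """Return sorted (conceptId1, conceptId2) pairs with conceptId1 < conceptId2
    whose concepts contain atoms with the same case-insensitive name.

    atoms: iterable of (concept_id, atom_name) for publishable atoms.
    """
    concepts_by_name = defaultdict(set)
    for concept_id, name in atoms:
        concepts_by_name[name.lower()].add(concept_id)
    pairs = set()
    for concept_ids in concepts_by_name.values():
        # combinations over a sorted list always yields (smaller, larger).
        pairs.update(combinations(sorted(concept_ids), 2))
    return sorted(pairs)

sample = [
    (1, "Cold"),
    (2, "cold"),         # same name as concept 1 apart from case
    (2, "Common cold"),
    (3, "Flu"),
]
pairs = ambiguous_pairs(sample)
print(pairs)
```

Because each pair appears only once, any query that asks whether a single concept is ambiguous has to look at both the conceptId1 and the conceptId2 column.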
ambig_no_rel: Ambiguous concepts that lack an approved REL
--Use Pre-programmed View ambig_concepts
SELECT conceptId1, conceptId2 FROM ambig_concepts WHERE NOT (conceptId1 , conceptId2) IN (SELECT cr.from_id, cr.to_id FROM concept_relationships cr WHERE cr.publishable = TRUE AND cr.workflowStatus in ('READY_FOR_PUBLICATION','PUBLISHED'))
pn_pn_ambig: Identical (same SUI) PN's in multiple concepts
SELECT c1.id conceptId1, c2.id conceptId2 FROM concepts c1, concepts c2, concepts_atoms ca1, concepts_atoms ca2, (SELECT a1.id atomId1, a2.id atomId2 FROM atoms a1, atoms a2 WHERE a1.termType = 'PN' AND a2.termType = 'PN' AND a1.stringClassId = a2.stringClassId AND a1.id != a2.id) identicalPNAtoms WHERE c1.terminology = :terminology AND c2.terminology = :terminology AND c1.id = ca1.concepts_id AND c2.id = ca2.concepts_id AND ca1.atoms_id = identicalPNAtoms.atomId1 AND ca2.atoms_id = identicalPNAtoms.atomId2 AND c1.id != c2.id
multiple_pn: Concepts with multiple MTH/PN atoms
SELECT DISTINCT c.id conceptId FROM concepts c, atoms a, concepts_atoms ca WHERE c.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND a.terminology in ('MTH', 'NCIMTH') AND a.termType = 'PN' GROUP BY c.id HAVING COUNT(c.id) > 1
pn_no_ambig: Concept has MTH/PN atom but no ambiguous string
select distinct c.id conceptId from concepts c, concepts_atoms ca, atoms a where c.terminology = :terminology and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true and a.terminology in ('NCIMTH','MTH') and a.termType = 'PN' and not c.id in (select conceptId1 from ambig_concepts union select conceptId2 from ambig_concepts)
ambig_pn: MTH/PN atom is ambiguous but has no matching ambiguous string
Drop. No longer useful.
pn_orphan: MTH/PNs on their own
--Concepts whose only publishable atoms are MTH/PN or NCIMTH/PN
SELECT c.id conceptId FROM concepts c, concepts_atoms ca WHERE c.terminology = :terminology AND c.id = ca.concepts_id AND ca.concepts_id IN (SELECT ca.concepts_id FROM concepts_atoms ca, atoms a WHERE ca.atoms_id = a.id AND a.terminology IN ('MTH' , 'NCIMTH') AND a.termType = 'PN' AND a.publishable = true) GROUP BY ca.concepts_id HAVING COUNT(DISTINCT ca.atoms_id) = 1
nosty: No STY
SELECT DISTINCT c.id conceptId FROM concepts c LEFT JOIN concepts_semantic_type_components cs ON (c.id = cs.concepts_id) WHERE c.terminology= :terminology AND c.publishable = TRUE AND cs.semanticTypes_id IS NULL
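The nosty query is an anti-join: the LEFT JOIN keeps every concept, and the IS NULL test then keeps only those with no matching STY row. A small sketch of that behaviour against an in-memory SQLite database, assuming a two-table toy schema with publishable stored as 0/1:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concepts (id INTEGER PRIMARY KEY, terminology TEXT,
                       publishable INTEGER);
CREATE TABLE concepts_semantic_type_components (concepts_id INTEGER,
                                                semanticTypes_id INTEGER);
-- Concept 1 has an STY; concept 2 has none; concept 3 has none but is
-- not publishable, so it must not be reported.
INSERT INTO concepts VALUES (1, 'NCIMTH', 1), (2, 'NCIMTH', 1), (3, 'NCIMTH', 0);
INSERT INTO concepts_semantic_type_components VALUES (1, 100);
""")

# Same shape as the nosty query, with publishable compared to 1.
NOSTY = """
SELECT DISTINCT c.id conceptId
FROM concepts c
LEFT JOIN concepts_semantic_type_components cs ON (c.id = cs.concepts_id)
WHERE c.terminology = :terminology
  AND c.publishable = 1
  AND cs.semanticTypes_id IS NULL
"""

no_sty = [row[0] for row in conn.execute(NOSTY, {"terminology": "NCIMTH"})]
print(no_sty)
```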
missing_sty: Reviewed concepts without releasable Semantic Types
SELECT DISTINCT c.id conceptId FROM concepts c WHERE c.terminology = :terminology AND c.publishable = TRUE AND c.workflowStatus IN ('READY_FOR_PUBLICATION' , 'PUBLISHED') AND NOT c.id IN (SELECT c.id FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s WHERE c.terminology = :terminology AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.publishable = TRUE)
cbo_chem: Current version CBO concepts with CHEM STYs
-- to find chemical semantic types
(select distinct semanticTypeCategoryMap sty from projects a, ProjectJpa_semanticTypeCategoryMap b where a.id = b.ProjectJpa_id and semanticTypeCategoryMap_KEY = 'chemical' and a.terminology = :terminology);
-- Rick update 3/2/2017
(select distinct semanticTypeCategoryMap_KEY sty from projects a, ProjectJpa_semanticTypeCategoryMap b where a.id = b.ProjectJpa_id and semanticTypeCategoryMap = 'chem' and a.terminology = :terminology)
SELECT DISTINCT c.id conceptId FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s, terminologies t WHERE c.terminology = 'CBO' AND c.terminology = t.terminology AND c.version = t.version AND t.current = TRUE AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.semanticType IN ((SELECT DISTINCT semanticTypeCategoryMap_KEY sty FROM projects a, ProjectJpa_semanticTypeCategoryMap b WHERE a.id = b.ProjectJpa_id AND semanticTypeCategoryMap = 'chem' AND a.terminology = :terminology))
go_chem: Current version GO concepts with CHEM STYs
SELECT DISTINCT c.id conceptId FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s, terminologies t WHERE c.terminology = 'GO' AND c.terminology = t.terminology AND c.version = t.version AND t.current = TRUE AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.semanticType IN ((SELECT DISTINCT semanticTypeCategoryMap_KEY sty FROM projects a, ProjectJpa_semanticTypeCategoryMap b WHERE a.id = b.ProjectJpa_id AND semanticTypeCategoryMap = 'chem' AND a.terminology = :terminology))
mdr_chem: Current version MDR concepts with CHEM STYs
SELECT DISTINCT c.id conceptId FROM concepts c, concepts_semantic_type_components cs, semantic_type_components s, terminologies t WHERE c.terminology = 'MDR' AND c.terminology = t.terminology AND c.version = t.version AND t.current = TRUE AND c.id = cs.concepts_id AND cs.semanticTypes_id = s.id AND s.semanticType IN ((SELECT DISTINCT semanticTypeCategoryMap_KEY sty FROM projects a, ProjectJpa_semanticTypeCategoryMap b WHERE a.id = b.ProjectJpa_id AND semanticTypeCategoryMap = 'chem' AND a.terminology = :terminology))
true_orphan: Concepts with no releasable relationships to any other concept
Uses the "deep relationships" query; only needs to search the FROM role since rels are bidirectional.
SELECT c.id conceptId FROM concepts c WHERE c.terminology = :terminology AND c.publishable=TRUE AND NOT EXISTS (SELECT * FROM deep_concept_relationships dcr WHERE role='FROM' AND c.id = dcr.concepts_id)
deleted_cui_split: Complex split/merge case CUIs that are going away - will need bequeathal
Drop
split_demotions: Demotions overlapping with a bad merge that was split
Drop. No longer useful.
nci_pt_mrg: Merged current version NCI/PT atoms
SELECT DISTINCT c.id conceptId FROM concepts c, atoms a, concepts_atoms ca WHERE c.terminology = :terminology AND c.id = ca.concepts_id AND ca.atoms_id = a.id AND a.terminology in ('NCI') AND a.termType = 'PT' GROUP BY c.id HAVING COUNT(c.id) > 1
mxsuppr: Concepts with identical LUI atoms that have mixed suppressibility
SELECT DISTINCT c.id conceptId FROM concepts c, concepts_atoms ca1, concepts_atoms ca2, (SELECT a1.id atomId1, a2.id atomId2 FROM atoms a1, atoms a2 WHERE a1.lexicalClassId = a2.lexicalClassId AND a1.id != a2.id AND a1.publishable = TRUE AND a2.publishable = TRUE AND a1.suppressible != a2.suppressible) mixedSuppresLuiAtoms WHERE c.terminology = :terminology AND c.id = ca1.concepts_id AND ca1.atoms_id = mixedSuppresLuiAtoms.atomId1 AND ca2.atoms_id = mixedSuppresLuiAtoms.atomId2 AND ca1.concepts_id = ca2.concepts_id
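The mxsuppr condition can also be read as a grouping problem: within one concept, collect the suppressible flags per LUI and report the concept if any LUI carries both values. A plain-Python sketch under that reading, assuming publishable atoms are given as (concept id, lexical class id, suppressible) tuples; the function name and sample data are made up:

```python
from collections import defaultdict

def mixed_suppressibility(atoms):
    """Return sorted concept ids that contain atoms sharing a LUI but
    differing in suppressibility.

    atoms: iterable of (concept_id, lexical_class_id, suppressible).
    """
    flags = defaultdict(set)
    for concept_id, lui, suppressible in atoms:
        flags[(concept_id, lui)].add(bool(suppressible))
    return sorted(
        {concept_id for (concept_id, _lui), seen in flags.items() if len(seen) > 1}
    )

sample = [
    (1, "L1", True), (1, "L1", False),   # same LUI, mixed flags: reported
    (2, "L2", True), (2, "L2", True),    # same LUI, same flag
    (3, "L3", True), (3, "L4", False),   # mixed flags but different LUIs
]
mixed = mixed_suppressibility(sample)
print(mixed)
```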
~ Mixed status concepts 196 Sat Jan 7 09:19:47 2017
~ Suppressible preferred name (level 0,9) 0 Sat Jan 7 09:19:58 2017