QA Bin Definitions

Overview

QA Bin Definitions

Parameters :terminology (e.g. NCIMTH) and :version (e.g. latest) are allowed in the queries.\

TODO:

  1. Look up definitions of bins (e.g. logically what do they do, we'll need descriptions too
  2. Write up description
  3. Write and test query (with :terminology/:version interpolations)

REQUIRED bins

  • nci_merge: Merged SCUI current version NCI atoms

    select distinct c.id conceptId 
    from concepts c, concepts_atoms ca, atoms a
    where c.terminology = :terminology
      and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true
      and a.terminology='NCI'
    group by c.id having count(distinct a.conceptId)>1
  • nci_sub_split: Split SCUI current version NCI (or sub-source) atoms

    --Identify Sub-source atoms that aren't in the same concept as the NCI atom with the same conceptId
     
    SELECT DISTINCT
        c.id conceptId1, c1.id conceptId2
    FROM
        concepts c,
        concepts_atoms ca,
        atoms a,
        concepts c1,
        concepts_atoms ca1,
        atoms a1
    WHERE
        c.terminology = :terminology
            AND c1.terminology = :terminology
            AND c.id = ca.concepts_id
            AND ca.atoms_id = a.id
            AND c1.id = ca1.concepts_id
            AND ca1.atoms_id = a1.id
            AND a.terminology = 'NCI'
            AND a1.terminology IN (SELECT 
                terminology
            FROM
                root_terminologies
            WHERE
                family = 'NCI' AND terminology != 'NCI')
            AND a.conceptId = a1.conceptId
            AND c.id != c1.id
  • sct_sepfnpt: SNOMED concept clusters where the FN and PT terms are separated

    SELECT DISTINCT
        c.id conceptId1, c1.id conceptId2
    FROM
        concepts c,
        concepts_atoms ca,
        atoms a,
        concepts c1,
        concepts_atoms ca1,
        atoms a1
    WHERE
        c.terminology = :terminology
            AND c1.terminology = :terminology
            AND a.terminology = 'SNOMEDCT_US'
            AND a1.terminology = 'SNOMEDCT_US'
            AND c.id = ca.concepts_id
            AND ca.atoms_id = a.id
            AND c1.id = ca1.concepts_id
            AND ca1.atoms_id = a1.id
            AND a.termType = 'FN'
            AND a1.termType = 'PT'
            AND a.conceptId = a1.conceptId
            AND c.id != c1.id
  • cdsty_coc: Find concepts with Clinical Drug STY and any other STY

    SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c,
        concepts_semantic_type_components cs
    WHERE
        c.terminology = :terminology
            AND c.id = cs.concepts_id
            AND cs.concepts_id IN (SELECT 
                cs.concepts_id
            FROM
                concepts c,
                concepts_semantic_type_components cs,
                semantic_type_components s
            WHERE
                c.terminology = :terminology
                    AND c.id = cs.concepts_id
                    AND cs.semanticTypes_id = s.id
                    AND s.semanticType = 'Clinical Drug')
    GROUP BY cs.concepts_id
    HAVING COUNT(cs.concepts_id) > 1
  • multsty: concepts with more than 3 STYs|

    SELECT DISTINCT
        cs.concepts_id conceptId
    FROM
    	concepts c,
        concepts_semantic_type_components cs,
        semantic_type_components s
    WHERE
    	c.terminology = :terminology
    	AND c.id = cs.concepts_id
        AND cs.semanticTypes_id = s.id
    GROUP BY c.id
    HAVING COUNT(c.id) > 3
  • styisa: One STY is an ancestor of another in the STY isa hierarchy

    select t1.conceptId from 
    (select c.id conceptId, st.id styId, st.treeNumber from concepts c, concepts_semantic_type_components cstc, (select stc.id, st.treeNumber from semantic_type_components stc join semantic_types st on (stc.semanticType = st.expandedForm)) st
    WHERE c.terminology = :terminology
    AND c.id = cstc.concepts_id
    AND cstc.semanticTypes_id = st.id) t1
    JOIN 
    (select c.id conceptId, st.id styId, st.treeNumber from concepts c, concepts_semantic_type_components cstc, (select stc.id, st.treeNumber from semantic_type_components stc join semantic_types st on (stc.semanticType = st.expandedForm)) st
    WHERE c.terminology = :terminology
    AND c.id = cstc.concepts_id
    AND cstc.semanticTypes_id = st.id) t2
    on  (t1.conceptId = t2.conceptId and t1.styId != t2.styId and t1.treeNumber != t2.treeNumber)
    WHERE t2.treeNumber like concat(t1.treeNumber, '.%')
  • sfo_lfo: Short form in one concept, long form in another

    --ShortForm/LongForm are related atoms
      -- These live in the DB as "SY" atom relationships with RELA value either equal to "expanded_form_of" or starting with "mth_" and ending with "_form_of"
     
    SELECT 
        c1.id conceptId1, c2.id conceptId2
    FROM
        concepts c1,
        concepts_atoms ca1,
        atoms a1,
        concepts c2,
        concepts_atoms ca2,
        atoms a2,
        (SELECT 
            ar.from_id, ar.to_id
        FROM
            atom_relationships ar
        WHERE
            publishable = TRUE
                AND relationshipType = 'SY'
                AND (additionalRelationshipType = 'expanded_form_of'
                OR additionalRelationshipType LIKE 'mth_%_form_of')) sfoLfoRels
    WHERE
        c1.terminology = 'NCIMTH'
            AND c2.terminology = 'NCIMTH'
            AND c1.id = ca1.concepts_id
            AND ca1.atoms_id = a1.id
            AND c2.id = ca2.concepts_id
            AND ca2.atoms_id = a2.id
            AND a1.id = sfoLfoRels.from_id
            AND a2.id = sfoLfoRels.to_id
            AND c1.id != c2.id
  • deleted_cui: CUIs that are going away - will need bequeathal rel

    SELECT
      DISTINCT c.id conceptId
    FROM
      concepts c,
      concepts_atoms ca,
      atoms a
    WHERE
      c.terminology = 'NCIMTH'
      AND c.id != c.terminologyId
      AND c.id = ca.concepts_id
      AND ca.atoms_id = a.id
      AND a.publishable = FALSE
      AND NOT c.id IN (
        SELECT
          DISTINCT c.id conceptId
        FROM
          concepts c,
          concepts_atoms ca,
          atoms a
        WHERE
          c.terminology = 'NCIMTH'
          AND c.id = ca.concepts_id
          AND ca.atoms_id = a.id
          AND a.publishable = TRUE
      )
      AND NOT c.id IN (
        SELECT
          DISTINCT c.id conceptId
        FROM
          concepts c,
          concept_relationships cr
        WHERE
          c.terminology = 'NCIMTH'
          AND c.id = cr.from_id
          AND cr.relationshipType like 'B%'
      )
      AND NOT c.id IN (
        SELECT
          c.id conceptId
        FROM
          concepts c,
          concepts_atoms ca
        WHERE
          c.terminology = 'NCIMTH'
          AND c.id = ca.concepts_id
          AND ca.concepts_id IN (
            SELECT
              ca.concepts_id
            FROM
              concepts_atoms ca,
              atoms a
            WHERE
              ca.atoms_id = a.id
              AND a.terminology IN ('MTH', 'NCIMTH')
              AND a.termType = 'PN'
          )
        GROUP BY
          ca.concepts_id
        HAVING
          COUNT(DISTINCT ca.atoms_id) = 1
      )
      AND NOT c.id IN (
        SELECT
          ca.concepts_id conceptId
        FROM
          mrcui mr,
          atomjpa_conceptterminologyids ac,
          concepts_atoms ca,
          concepts cpt
        WHERE
          mr.cui1 = ac.conceptTerminologyIds
          AND ca.atoms_id = ac.AtomJpa_id
          AND cpt.id = ca.concepts_id
          AND cpt.terminology = 'NCIMTH'
          AND ac.conceptTerminologyIds_KEY = 'NCIMTH'
          AND mr.rel = 'DEL'
      )

     

     

NON-required bins

  • rxnorm_merge: RXCUI Merges
    • Same as nci_merge but with RXNORM

      select distinct c.id conceptId 
      from concepts c, concepts_atoms ca, atoms a
      where c.terminology = :terminology
        and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true
        and a.terminology='RXNORM'
      group by a.conceptId having count(distinct c.id)>1

  • cbo_merge: Merged SCUI current version CBO atoms
    • Same as nci_merge but with CBO

      select distinct c.id conceptId 
      from concepts c, concepts_atoms ca, atoms a
      where c.terminology = :terminology
        and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true
        and a.terminology='CBO'
      group by c.id having count(distinct a.conceptId)>1
  • mdr_merge: Merged SDUI current version MDR atoms

    select distinct c.id conceptId 
    from concepts c, concepts_atoms ca, atoms a
    where c.terminology = :terminology
      and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true
      and a.terminology='MDR'
    group by c.id having count(distinct a.descriptorId)>1
  • pdq_merge: Merged SDUI current version PDQ atoms

    select distinct c.id conceptId 
    from concepts c, concepts_atoms ca, atoms a
    where c.terminology = :terminology
      and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true
      and a.terminology='PDQ'
    group by c.id having count(distinct a.descriptorId)>1
  • sct_sepfnpt: SNOMED concept clusters where the FN and PT terms are separated
    • DUPLICATE of above
  • rxnorm_split: RXCUI splits

    select distinct c.id conceptId 
    from concepts c, concepts_atoms ca, atoms a
    where c.terminology = :terminology
      and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true
      and a.terminology='RXNORM'
    group by a.conceptId having count(distinct c.id)>1
  • nci_pdq_merge: Concepts containing current version NCI and current version PDQ atoms

    SELECT DISTINCT
        c.id conceptId
    FROM
        atoms a,
        atoms a1,
        concepts c,
        concepts c1,
        concepts_atoms ca,
        concepts_atoms ca1,
        terminologies t,
        terminologies t1
    WHERE
        c.terminology = :terminology
            AND c1.terminology = :terminology
            AND c.id = ca.concepts_id
            AND ca.atoms_id = a.id
            AND c1.id = ca1.concepts_id
            AND ca1.atoms_id = a1.id
            AND a.terminology = 'NCI'
            AND a.terminology = t.terminology
            AND a.version = t.version
            AND t.current = TRUE
            AND a1.terminology = 'PDQ'
            AND a1.terminology = t1.terminology
            AND a1.version = t1.version
            AND t1.current = TRUE
            AND c.id = c1.id
  • nci_sct_merge: Concepts containing current version NCI and current version SNOMEDCT atoms

    SELECT DISTINCT
        c.id conceptId
    FROM
        atoms a,
        atoms a1,
        concepts c,
        concepts c1,
        concepts_atoms ca,
        concepts_atoms ca1,
        terminologies t,
        terminologies t1
    WHERE
        c.terminology = :terminology
            AND c1.terminology = :terminology
            AND c.id = ca.concepts_id
            AND ca.atoms_id = a.id
            AND c1.id = ca1.concepts_id
            AND ca1.atoms_id = a1.id
            AND a.terminology = 'NCI'
            AND a.terminology = t.terminology
            AND a.version = t.version
            AND t.current = TRUE
            AND a1.terminology = 'SNOMEDCT_US'
            AND a1.terminology = t1.terminology
            AND a1.version = t1.version
            AND t1.current = TRUE
            AND c.id = c1.id
  • ambig_no_pn: Ambiguous concepts where at least one has no MTH/PN nor NCIMTH/PN (sepstring)

    --Ambiguous concept = concepts that share atoms that have the same case-insensitive name
    -- This is used by several of the below queries, so it should be created as a VIEW
     
    CREATE VIEW ambig_concepts AS
        SELECT DISTINCT
            c1.id conceptId1, c2.id conceptId2
        FROM
            concepts c1,
            concepts_atoms ca1,
            atoms a1,
            concepts c2,
            concepts_atoms ca2,
            atoms a2
        WHERE
            c1.terminology = 'NCIMTH'
                AND c2.terminology = 'NCIMTH'
                AND c1.id = ca1.concepts_id
                AND ca1.atoms_id = a1.id
                AND c2.id = ca2.concepts_id
                AND ca2.atoms_id = a2.id
                AND c1.id < c2.id
                AND a1.lowerNameHash = a2.lowerNameHash
                AND a1.id != a2.id
                AND a1.publishable = TRUE
                AND a2.publishable = TRUE
    --Use Pre-programmed View ambig_concepts
     
    SELECT 
        conceptId1, conceptId2
    FROM
        ambig_concepts
    WHERE
        NOT ambig_concepts.conceptId1 IN (SELECT 
                ca.concepts_id
            FROM
                concepts_atoms ca,
                atoms a
            WHERE
                ca.atoms_id = a.id
                    AND a.terminology IN ('NCIMTH' , 'MTH')
                    AND a.termType = 'PN')
            AND NOT ambig_concepts.conceptId2 IN (SELECT 
                ca.concepts_id
            FROM
                concepts_atoms ca,
                atoms a
            WHERE
                ca.atoms_id = a.id
                    AND a.terminology IN ('NCIMTH' , 'MTH')
                    AND a.termType = 'PN')
  • ambig_no_rel: Ambiguous concepts that lack an approved REL

    --Use Pre-programmed View ambig_concepts
    
    SELECT 
        conceptId1, conceptId2
    FROM
        ambig_concepts
    WHERE
        NOT (conceptId1 , conceptId2) IN (SELECT 
                cr.from_id, cr.to_id
            FROM
                concept_Relationships cr
            WHERE
                cr.publishable = TRUE
                AND cr.workflowStatus in ('READY_FOR_PUBLICATION','PUBLISHED'))
  • pn_pn_ambig: Identical (same SUI) PN's in multiple concepts

    SELECT 
        c1.id conceptId1, c2.id conceptId2
    FROM
        concepts c1,
        concepts c2,
        concepts_atoms ca1,
        concepts_atoms ca2,
        (SELECT 
            a1.id atomId1, a2.id atomId2
        FROM
            atoms a1, atoms a2
        WHERE
            a1.termType = 'PN'
                AND a2.termType = 'PN'
                AND a1.stringClassId = a2.stringClassId
                AND a1.id != a2.id) identicalPNAtoms
    WHERE
        c1.terminology = :terminology
            AND c2.terminology = :terminology
            AND c1.id = ca1.concepts_id
            AND c2.id = ca2.concepts_id
            AND ca1.atoms_id = identicalPNAtoms.atomId1
            AND ca2.atoms_id = identicalPNAtoms.atomId2
            AND c1.id != c2.id
  • multiple_pn: Concepts with multiple MTH/PN atoms

    SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c,
        atoms a,
        concepts_atoms ca
    WHERE
        c.terminology = :terminology
            AND c.id = ca.concepts_id
            AND ca.atoms_id = a.id
            AND a.terminology in ('MTH', 'NCIMTH')
            AND a.termType = 'PN'
    GROUP BY c.id
    HAVING COUNT(c.id) > 1
  • pn_no_ambig: Concept has MTH/PN atom but no ambiguous string

    select distinct c.id conceptId 
    from concepts c, concepts_atoms ca, atoms a
    where c.terminology = :terminology
      and c.id = ca.concepts_id and ca.atoms_id = a.id and a.publishable = true
      and a.terminology in ('NCIMTH','MTH')
      and a.termType = 'PN'
      and not c.id in (select conceptId1 from ambig_concepts)
  • ambig_pn: MTH/PN atom is ambiguous but has no matching ambiguous string

    Drop. No longer useful. 

  • pn_orphan: MTH/PNs on their own

    --Concepts whose only publishable atoms are MTH/PN or NCIMTH/PN
     
    SELECT 
        c.id conceptId
    FROM
        concepts c,
        concepts_atoms ca
    WHERE
        c.terminology = :terminology
            AND c.id = ca.concepts_id
            AND ca.concepts_id IN (SELECT 
                ca.concepts_id
            FROM
                concepts_atoms ca,
                atoms a
            WHERE
                ca.atoms_id = a.id
                    AND a.terminology IN ('MTH' , 'NCIMTH')
                    AND a.termType = 'PN'
                    AND a.publishable = true)
    GROUP BY ca.concepts_id
    HAVING COUNT(DISTINCT ca.atoms_id) = 1
  • nosty: No STY

    SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c
            LEFT JOIN
        concepts_semantic_type_components cs ON (c.id = cs.concepts_id)
    WHERE
    	c.terminology= :terminology
        AND c.publishable = TRUE
        AND cs.semanticTypes_id IS NULL
  • missing_sty: Reviewed concepts without releasable Semantic Types

    SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c
    WHERE
        c.terminology = :terminology
            AND c.publishable = TRUE
            AND c.workflowStatus IN ('READY_FOR_PUBLICATION' , 'PUBLISHED')
            AND NOT c.id IN (SELECT 
                c.id
            FROM
                concepts c,
                concepts_semantic_type_components cs,
                semantic_type_components s
            WHERE
                c.terminology = :terminology
                    AND c.id = cs.concepts_id
                    AND cs.semanticTypes_id = s.id
                    AND s.publishable = TRUE)
  • cbo_chem: Current version CBO concepts with CHEM STYs

    -- to find chemical semantic types
    (select distinct semanticTypeCategoryMap sty
    from projects a, ProjectJpa_semanticTypeCategoryMap b
    where a.id = b.ProjectJpa_id
      and semanticTypeCategoryMap_KEY = 'chemical'
      and a.terminology = :terminology);
     
    -- Rick update 3/2/2017
    (select distinct semanticTypeCategoryMap_KEY sty
    from projects a, ProjectJpa_semanticTypeCategoryMap b
    where a.id = b.ProjectJpa_id
      and semanticTypeCategoryMap = 'chem'
      and a.terminology = :terminology)
     
     
    SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c,
        concepts_semantic_type_components cs,
        semantic_type_components s,
        terminologies t
    WHERE
        c.terminology = 'CBO'
            AND c.terminology = t.terminology
            AND c.version = t.version
            AND t.current = TRUE
            AND c.id = cs.concepts_id
            AND cs.semanticTypes_id = s.id
            AND s.semanticType IN ((SELECT DISTINCT
                semanticTypeCategoryMap_KEY sty
            FROM
                projects a,
                ProjectJpa_semanticTypeCategoryMap b
            WHERE
                a.id = b.ProjectJpa_id
                    AND semanticTypeCategoryMap = 'chem'
                    AND a.terminology = :terminology))



  • go_chem: Current version GO concepts with CHEM STYs

    SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c,
        concepts_semantic_type_components cs,
        semantic_type_components s,
        terminologies t
    WHERE
        c.terminology = 'GO'
            AND c.terminology = t.terminology
            AND c.version = t.version
            AND t.current = TRUE
            AND c.id = cs.concepts_id
            AND cs.semanticTypes_id = s.id
            AND s.semanticType IN ((SELECT DISTINCT
                semanticTypeCategoryMap_KEY sty
            FROM
                projects a,
                ProjectJpa_semanticTypeCategoryMap b
            WHERE
                a.id = b.ProjectJpa_id
                    AND semanticTypeCategoryMap = 'chem'
                    AND a.terminology = :terminology))
  • mdr_chem: Current version MDR concepts with CHEM STYs

    SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c,
        concepts_semantic_type_components cs,
        semantic_type_components s,
        terminologies t
    WHERE
        c.terminology = 'MDR'
            AND c.terminology = t.terminology
            AND c.version = t.version
            AND t.current = TRUE
            AND c.id = cs.concepts_id
            AND cs.semanticTypes_id = s.id
            AND s.semanticType IN ((SELECT DISTINCT
                semanticTypeCategoryMap_KEY sty
            FROM
                projects a,
                ProjectJpa_semanticTypeCategoryMap b
            WHERE
                a.id = b.ProjectJpa_id
                    AND semanticTypeCategoryMap = 'chem'
                    AND a.terminology = :terminology))
  • true_orphan: Concepts with no releasable relationships to any other concept
    Uses the "deep relationships" query; only needs to search FROM role since rels are bidirectional

    SELECT c.id conceptId
    FROM concepts c
    WHERE c.terminology = :terminolgy
    AND c.publishable=TRUE
    AND NOT EXISTS
        (SELECT * FROM deep_concept_relationships dcr
         WHERE role='FROM' AND
         c.id = dcr.concepts_id)
  • deleted_cui_split: Complex split/merge case CUIs that are going away - will need bequeathal

    Drop

  • split_demotions: Demotions overlapping with a bad merge that was split
    Drop. No longer useful. 

     

  • nci_pt_mrg: Merged current version NCI/PT atoms

    SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c,
        atoms a,
        concepts_atoms ca
    WHERE
        c.terminology = :terminology
            AND c.id = ca.concepts_id
            AND ca.atoms_id = a.id
            AND a.terminology in ('NCI')
            AND a.termType = 'PT'
    GROUP BY c.id
    HAVING COUNT(c.id) > 1
  • mxsuppr: Concepts with identical LUI atoms that have mixed suppressibility

     SELECT DISTINCT
        c.id conceptId
    FROM
        concepts c,
        concepts_atoms ca1,
        concepts_atoms ca2,
        (SELECT 
            a1.id atomId1, a2.id atomId2
        FROM
            atoms a1, atoms a2
        WHERE
            a1.lexicalClassId = a2.lexicalClassId
                AND a1.id != a2.id
                AND a1.publishable = TRUE
                AND a2.publishable = TRUE
                AND a1.suppressible != a2.suppressible) mixedSuppresLuiAtoms
    WHERE
        c.terminology = :terminology
            AND c.id = ca1.concepts_id
            AND ca1.atoms_id = mixedSuppresLuiAtoms.atomId1
            AND ca2.atoms_id = mixedSuppresLuiAtoms.atomId2
            AND ca1.concepts_id = ca2.concepts_id

~ Mixed status concepts 196 Sat Jan 7 09:19:47 2017
~ Suppressible preferred name (level 0,9) 0 Sat Jan 7 09:19:58 2017