Information on software development design, architecture, frameworks, and other details.
Software Architecture
The WCI Terminology Server is a REST service that uses Lucene (or Elasticsearch) indexes and a Postgres database to deliver terminology information through an API and a bundled user interface.
<insert diagram: TBD>
The architecture is simple: a common library and a single deployed service that runs as a Docker container and can readily be run in a clustered environment such as Kubernetes.
Data Model
The data model is inspired by the UMLS and designed to capture the core aspects common to all terminology models. It is fundamentally “Concept”-oriented, with a notion of hierarchical terminologies whose hierarchies can be computed entirely via transitive closure of the parent/child relationships. Key data model objects include:
Concept - represents a “meaning” in a terminology that has a code, one or more names, one or more semantic types, (optional) text definitions, and (optional) key/value attributes.
Atom - represents a single “name” from a terminology for a given code. All of the “atoms” in a concept are considered synonymous with each other. The term “atom” comes from the UMLS (https://umls.nlm.nih.gov).
Definition - represents a textual definition for a concept.
ConceptRelationship - represents a relationship between two Concepts with a general type (such as “parent”, “child”, “broader”, or “narrower”) and a specific type (such as “has_finding_site”).
ConceptTreePosition - represents one position of a Concept in its hierarchy (as computed by transitive closure of the parent/child relationships). In mono-hierarchy terminologies (such as ICD10CM), a concept has only a single tree position. In poly-hierarchy terminologies (such as SNOMEDCT), a concept may have multiple tree positions.
Mapset - represents a collection of mappings from one terminology to another one.
Mapping - represents a mapping from an individual code to a code in a different terminology (e.g. SNOMED → ICD10).
Metadata - provides additional information about the metadata fields used throughout the model (e.g. semantic types, relationship types, attribute names, atom term types, etc.).
Subset - represents a collection of codes within a terminology that covers the portion of the content used for a particular purpose. In other contexts this is called a ValueSet or a Refset. For example, you could create a “primary cancer diagnoses” subset of SNOMED that would represent exactly those concepts that could be legitimately used to label a primary cancer diagnosis.
Terminology - represents a terminology and associated information.
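To make the Concept, Atom, and ConceptTreePosition descriptions above more concrete, here is a minimal Python sketch. It is illustrative only and is not the server's actual implementation. The class fields, the term_type field, the tree_positions helper, and the toy codes A through D are all assumptions made for this example. It shows how tree positions follow from parent/child relationships: each tree position is one root-to-concept path, so a concept with two parents gets two positions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Atom:
    """A single name for a code in a terminology (hypothetical shape)."""
    name: str
    term_type: str  # assumed field, e.g. "PT" for a preferred term


@dataclass
class Concept:
    """A meaning in a terminology, identified by its code (hypothetical shape)."""
    code: str
    terminology: str
    atoms: List[Atom] = field(default_factory=list)
    semantic_types: List[str] = field(default_factory=list)
    definitions: List[str] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)


def tree_positions(code: str, parents: Dict[str, List[str]]) -> List[List[str]]:
    """Return every root-to-code path implied by the parent links.

    `parents` maps a code to the list of its direct parent codes.
    Assumes the hierarchy contains no cycles.
    """
    direct = parents.get(code, [])
    if not direct:
        return [[code]]  # a root has exactly one position: itself
    paths = []
    for parent in direct:
        # Extend each of the parent's positions with this code.
        for path in tree_positions(parent, parents):
            paths.append(path + [code])
    return paths


# Toy poly-hierarchy: D has two parents (B and C), which share the root A.
parents = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
positions = tree_positions("D", parents)
```

In this toy hierarchy, D ends up with two tree positions (one through B and one through C), which mirrors the poly-hierarchy case described for SNOMEDCT. In a mono-hierarchy, every concept would have exactly one path.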