SPRINT 6 - concludes 3/31/2015
Overview
Sprint to continue developing core features of the application
Finalize Read-only Server with Testing Infrastructure
- DONE: serialization of concepts should include id, terminologyId, preferred name
- Applies to Description and Relationship
- Return 0 for Long if not set
- Return "" for String if not set
- DONE: Claml loader
- make sure transitive closure computation is working
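As a minimal sketch of the serialization defaults noted above (0 for an unset Long, "" for an unset String), the concept fields embedded in a Description or Relationship could be exposed through null-safe getters. The class name ConceptSummary and its field layout are assumptions for illustration only, not the project's actual model.

```java
/** Hypothetical holder for the concept fields serialized with a Description or Relationship. */
final class ConceptSummary {
  private final Long id;               // internal id, may be unset
  private final String terminologyId;  // may be unset
  private final String preferredName;  // may be unset

  ConceptSummary(Long id, String terminologyId, String preferredName) {
    this.id = id;
    this.terminologyId = terminologyId;
    this.preferredName = preferredName;
  }

  /** Returns 0 when the id has not been set. */
  long getId() {
    return id == null ? 0L : id;
  }

  /** Returns "" when the terminology id has not been set. */
  String getTerminologyId() {
    return terminologyId == null ? "" : terminologyId;
  }

  /** Returns "" when the preferred name has not been set. */
  String getPreferredName() {
    return preferredName == null ? "" : preferredName;
  }
}
```

With this shape, XML and JSON output always carries the three fields, so clients never have to handle missing values.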
DONE: BAC: REST APIs - fix @Consumes and @Produces
- Annotations are now at the class level, with TEXT_PLAIN overrides where parameters or return values are String.
- PG: Finalize integration REST testing
- DONE: security service
- DONE: Metadata Service
- DONE: Content Service
- History service
DONE: PG: History service - handle other data structures
- Loaders
- DONE: Make sure snapshot and full create release info
- DONE: BAC: RF2 full - get this working
- DONE: DS: RF2 delta
- Handle other data structures (currently only 4) - need to add sample data (and headers) too
- Remember AttributeValue and AssociationReference need handling for both description and concept
- Need new ContentServiceJpa methods
- No need for "handle retracted changes"
- DONE: Scale to full RF2
- Monitor the number of copies of objects
- BAC: Manage memory usage and optimize commit
- Can delta loader scale to a full sized delta?
- Load full SNOMED 20140731
- then load delta SNOMED 20150131
- Can snapshot loader scale to a full sized snapshot with only a single commit?
- Consider making description/concept connections merely an id (lose referential integrity, but checks for this could be added to the QA database)
DONE: change run.config to run.config.ts
DONE: BAC: Admin tools - all should use "server" flag except for QaDatabase and Create/Updatedb
DONE: BAC: QA Admin tool - fill in queries and reuse
Add queries for project and for release info
DONE: BAC: Separate "updatedb" into
admin project called "db"
pass hibernate.hbm2ddl.auto as a parameter to the execution
call System.setProperty("hibernate.hbm2ddl.auto", hbm2ddlParameter).
THEN create the root service.
"createdb" profile
"updatedb" profile
- Remove the properties section from pom.xml
- Update documentation.
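The "db" admin project steps above could be sketched roughly as follows, assuming the "createdb" and "updatedb" profiles pass "create" and "update" respectively as the mode. DbModeSetup and applyMode are hypothetical names, and the root service creation is left as a comment because its API is not shown in these notes.

```java
/** Hypothetical entry point for the "db" admin project. */
final class DbModeSetup {
  static final String HBM2DDL_KEY = "hibernate.hbm2ddl.auto";

  /**
   * Validates the mode passed in by the profile and publishes it as a system
   * property. Returns the value now visible to code that reads the property.
   */
  static String applyMode(String hbm2ddlParameter) {
    // Assumption: only the two modes matching the two profiles are allowed.
    if (!"create".equals(hbm2ddlParameter) && !"update".equals(hbm2ddlParameter)) {
      throw new IllegalArgumentException("Unsupported hbm2ddl mode: " + hbm2ddlParameter);
    }
    System.setProperty(HBM2DDL_KEY, hbm2ddlParameter);
    // THEN create the root service here, so that it is built only after
    // the property has been set.
    return System.getProperty(HBM2DDL_KEY);
  }
}
```

The ordering is the point of the sketch: the property is set before anything that initializes the persistence layer is constructed.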
DONE: Mojo integration testing
DONE: Testing
- DS - Unit Testing
- model - add checks for @Field annotations (and analyze criteria)
- helpers
- handlers
- algorithms
- DS - Unit Testing
- DONE: ProjectService
- add project, etc.
- DONE: Very basic user interface (as part of rest project).
- Header/Footer
- Point to swagger (change that index page to swagger.html)
- Security: login/logout
- Metadata:
- show terminologies and versions
- show all metadata for a selected terminology
- Content
- find concept
- get ancestors
- get children
- get descendants
FOR FUTURE SPRINTS
Fleshing out APIs
- Domain model
- Decide where to have "count" fields (and whether this should be maintained in DB, in Lucene, or computed by REST layer).
- e.g. concept.getChildCount(), concept.getSimpleRefSetMemberCount()
- Consider separating "relationship" and "inferred relationship" into separate tables/calls. This may make working with the classifier easier.
- The "graph resolution handler" and the concept model and loaders would have to be updated too.
- Transitive closure computer - avoid cycles.
- Pass set of identifiers "seen" along the way - fail?
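One way to read the "seen" idea above is a depth-first walk that carries the set of identifiers on the current path and fails as soon as one repeats. The sketch below assumes the hierarchy is available as a parent-to-children map of string ids; the class and method names are hypothetical, and this is not the project's actual transitive closure algorithm.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical closure computer that fails on cycles. */
final class TransitiveClosure {

  /** Returns, for every node reached, the set of all its descendants. */
  static Map<String, Set<String>> compute(Map<String, List<String>> children) {
    Map<String, Set<String>> closure = new HashMap<>();
    for (String id : children.keySet()) {
      descendants(id, children, new HashSet<>(), closure);
    }
    return closure;
  }

  private static Set<String> descendants(String id, Map<String, List<String>> children,
      Set<String> seen, Map<String, Set<String>> memo) {
    // "seen" holds the ids on the current path; a repeat means a cycle.
    if (!seen.add(id)) {
      throw new IllegalStateException("Cycle detected at " + id);
    }
    Set<String> result = memo.get(id);
    if (result == null) {
      result = new HashSet<>();
      for (String child : children.getOrDefault(id, Collections.emptyList())) {
        result.add(child);
        result.addAll(descendants(child, children, seen, memo));
      }
      memo.put(id, result);
    }
    seen.remove(id); // leaving this node, so it is no longer on the path
    return result;
  }
}
```

Because ids are removed from "seen" on the way back up, a concept reachable through two different parents is not mistaken for a cycle.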
Handling effective time (new handler)
- Loaders should be allowed to set effective time
- release process should be allowed to set effective time
- want to be able to distinguish between published and unpublished things (perhaps a "published" flag is better)
Semantic search
- Support searching by hierarchy, by relationships, etc.
- SPARQL endpoint, or an expression language?
Project metadata service
- For Project (pull out of content service)
- for release info (pull out of history service)
PG Support running all admin tools through REST services
- Add additional rest calls (e.g. to action service or whatever)
- Add corresponding client calls.
- When adding a new project, have REST service set the "lastModifiedBy" to the user who authenticated.
Project-based authorization (e.g. only allow edits on concepts where a user is an author)
- This requires tracking the full concept scope with the project. (what if you add a concept?)
Application metadata service (project stuff?)
- Terminology Metadata service
- DONE Support classifier - root node (done), isa relationship (done), and role root.
- DONE stated/inferred char types should be metadata methods,
- Update TerminologyUtility accordingly to avoid hardcoded values
- new DL features for properties and data properties
- access to "RoleRelationship" objects
Security service - n/a
Content service methods for adding/removing inferred rels in batch.
- Separate read/write services of content service
- SNOMEDCT Editing Service
- HistoryService
- getRevisions (without boolean)
- getReleaseRevision(id, String release) - pick any one with matching release - get first one
- Configuration setting for synchronizing access to write services
- ContentService (and content change service) - have a "deep" delete on concept and description that deletes connected refset data too.
- REST layer
- Implement semantic search
- Implement everything across the board for Concept, Description, Language Refset, Relationship, and AssociationReference (e.g. reason for inactivation)
- Let the demo drive all use cases,
- READ only version of rest service
- Consider reorganizing snapshot loader to load concept-at-a-time.
Integration Testing
Mojo integration testing
Have a setup dev environment mojo
Have a teardown dev environment mojo (calls createdb at the end to clean up after itself, leaving an empty DB ready to go).
Identifier assignment
- Relationship and description and language refset member id assignment is tough because IDs don't exist yet when cascade is being used.
- Consider removing cascade and handling cascade manually based on graph resolution handler - e.g. however much of the graph is present is what gets updated; then we can also compare against current states to see if it has really changed. If so, then we update last modified and save it (this addresses the next issue for now)
- then relationships can have ids based on hibernate ids instead of terminology ids.
- Recomputation of concept identifier and default preferred name should probably not require a "lastModifiedBy" or "lastModified" change.
LastModified peculiarities
- add concept - for now it sets last updated for all CASCADE=ALL
- update concept - n/a
- add description - for now it sets last updated for all CASCADE=ALL
- update description - n/a
- other objects - n/a
- consider only actually calling merge if the thing changed and/or using @Version
- XML and JSON serialize dates differently - see if this causes any problems on input/output
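The "only merge if the thing changed" idea above could look roughly like the sketch below: compare the incoming values against the stored state and touch lastModified and lastModifiedBy only on a real difference. TrackedDescription and ChangeAwareUpdater are hypothetical stand-ins rather than the project's model classes, and the actual merge call is only indicated by a comment.

```java
import java.util.Date;
import java.util.Objects;

/** Hypothetical, simplified stand-in for a persisted description. */
final class TrackedDescription {
  String term;
  boolean active;
  Date lastModified;
  String lastModifiedBy;

  TrackedDescription(String term, boolean active, Date lastModified, String lastModifiedBy) {
    this.term = term;
    this.active = active;
    this.lastModified = lastModified;
    this.lastModifiedBy = lastModifiedBy;
  }
}

final class ChangeAwareUpdater {
  /**
   * Applies the new values only if they differ from the stored ones.
   * Returns true when an update (and therefore a merge) is needed.
   */
  static boolean updateIfChanged(TrackedDescription stored, String newTerm,
      boolean newActive, String user, Date now) {
    boolean changed = !Objects.equals(stored.term, newTerm) || stored.active != newActive;
    if (!changed) {
      return false; // nothing changed: skip the merge, keep lastModified as is
    }
    stored.term = newTerm;
    stored.active = newActive;
    stored.lastModified = now;
    stored.lastModifiedBy = user;
    // A real implementation would call merge on the entity manager here.
    return true;
  }
}
```

An @Version column would be a complementary mechanism for detecting concurrent updates; it is not shown here.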
Transitive Closure Improvements
- Incremental TC maintenance
- New relationship
- Retired relationship
- New concept
- Retired concept
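For the "new relationship" case above, one common approach is to connect every ancestor of the parent (plus the parent) to every descendant of the child (plus the child). The sketch below keeps the closure in memory as two maps; the class name and the use of string ids are assumptions. Retired relationships are not handled, since removing an edge generally requires recomputation for the affected nodes because alternative paths may remain.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory closure that is maintained incrementally. */
final class IncrementalClosure {
  private final Map<String, Set<String>> ancestors = new HashMap<>();
  private final Map<String, Set<String>> descendants = new HashMap<>();

  /** Records a new "child isa parent" relationship and extends the closure. */
  void addIsa(String child, String parent) {
    Set<String> ups = new HashSet<>(ancestorsOf(parent));
    ups.add(parent);
    // If the child is already at or above the parent, the new edge closes a loop.
    if (ups.contains(child)) {
      throw new IllegalStateException("Adding " + child + " isa " + parent + " creates a cycle");
    }
    Set<String> downs = new HashSet<>(descendantsOf(child));
    downs.add(child);
    for (String up : ups) {
      descendants.computeIfAbsent(up, k -> new HashSet<>()).addAll(downs);
    }
    for (String down : downs) {
      ancestors.computeIfAbsent(down, k -> new HashSet<>()).addAll(ups);
    }
  }

  Set<String> ancestorsOf(String id) {
    return ancestors.getOrDefault(id, Collections.emptySet());
  }

  Set<String> descendantsOf(String id) {
    return descendants.getOrDefault(id, Collections.emptySet());
  }
}
```

A new concept needs no closure rows until its first relationship is added, so it falls out of the same method.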
Algorithms
- Release Processing
- Begin Release
- Create release info object for this release
- Perform Release (real or fake IDs)
- Assign IDs and effective time (compare to previous release states)
- Generate delta files
- Finish Release
- sets the planned release to "not planned" and "published"
- takes a "planned next release" and creates the infrastructure for it.
- Begin Release
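The begin/finish steps above can be sketched as simple state changes on a release info object. ReleaseInfo here is a simplified, hypothetical stand-in with only a name and the two flags mentioned in the notes; the "Perform Release" step (id assignment, effective time, delta files) is deliberately left out.

```java
/** Hypothetical, simplified release info with only the flags discussed above. */
final class ReleaseInfo {
  final String name;
  boolean planned;
  boolean published;

  ReleaseInfo(String name, boolean planned, boolean published) {
    this.name = name;
    this.planned = planned;
    this.published = published;
  }
}

final class ReleaseCycle {
  /** Begin Release: create the release info object, planned and not yet published. */
  static ReleaseInfo begin(String name) {
    return new ReleaseInfo(name, true, false);
  }

  /**
   * Finish Release: mark the current release as "not planned" and "published",
   * then create the release info for the planned next release.
   */
  static ReleaseInfo finish(ReleaseInfo current, String plannedNextRelease) {
    if (!current.planned || current.published) {
      throw new IllegalStateException("Release " + current.name + " is not in progress");
    }
    current.planned = false;
    current.published = true;
    return begin(plannedNextRelease);
  }
}
```

Returning the next planned release from finish mirrors the cycle in the outline, where finishing one release leads straight into beginning the next.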
Admin tools
- Refactor
- All require a profile
- All parameters are passed as.dot.style parameters
- All require run.config for DB parameters
- Consider REST-based admin tools for interacting with the server programmatically rather than directly with DB.
Unit/Integration/Functional tests
- Unit tests for equals, hashcode, and copy constructors for model objects (use getter/setter/tester)
- Have integration tests of JPA layer
- Have client-based integration tests of REST layer
- Demo script - full life-cycle - with tests to support
- Start from scratch - empty database
- Create schema
- Clear indexes
- Load "full" SNOMED
- Create a "project"
- Start the editing cycle.
- Create a new concept
- verify id assignment
- verify last modified management
- verify transitive closure recomputation
- verify classification
- Create an RF2 daily build delta - show the result
- Change concept definition status
- Retire the concept. (with reason for inactivation)
- Create an RF2 daily build delta - show that the concept doesn't show up
- Create, edit, retire a description (with reason for inactivation)
- Create, edit, retire an "isa" relationship
- Begin a release
- Perform a release
- Finish a release
- Demo loading an RF2 delta (verify id, last modified, transitive closure, classification)
Concurrency and Locking
- support a means to synchronize write access - e.g. make it a configurable feature of ContentChangeRestImpl.
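A minimal sketch of configurable write synchronization, assuming a boolean configuration setting decides whether writes are serialized. WriteGate is a hypothetical helper; how it would be wired into ContentChangeRestImpl is not shown in these notes.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical gate that serializes write operations when enabled. */
final class WriteGate {
  private final ReentrantLock lock; // null when synchronization is disabled

  WriteGate(boolean synchronizeWrites) {
    this.lock = synchronizeWrites ? new ReentrantLock() : null;
  }

  /** Runs the write action, holding the lock for its duration if enabled. */
  <T> T write(Supplier<T> action) {
    if (lock == null) {
      return action.get();
    }
    lock.lock();
    try {
      return action.get();
    } finally {
      lock.unlock(); // always release, even if the action throws
    }
  }
}
```

Each write method of the REST implementation could then wrap its body in gate.write(...), leaving read methods untouched.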
Classifier
classifier integration
- Sequence of events
- Call actionService.configure(...), passing a string list with "PUBLISHED" as its only element
- prepareToClassify();
- this loads classifier data structures
- periodically, close and open the content service (every 2000 concepts)
- or add a clear() to root service
- handle progress monitoring - just put a progress into the tokenProgressMap.
- when handling relationships for classification - ONLY treat stated rels (not inferred or additional)
- classify()
- handle classification
- save concept equivalents
- save old inferred relationships
- save new relationships
- getConceptEquivalents - just look up classification data
- getOldInferredRelationships - just look up these
- getNewInferredRelationships - just look up these
- Other things to pay attention to in action service
- have a "clear" method that takes a session token and removes all tracked data
- Add a timeout to remove session tokens - check every 5 min.
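The "clear" method and the session token timeout above could be sketched as follows. SessionTokenTracker is a hypothetical class that tracks only a last-seen timestamp per token; the real action service would also drop the classification data tied to the token. The 5 minute sweep interval comes from the notes, while the timeout length is a constructor parameter because the notes do not specify it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical tracker for action service session tokens. */
final class SessionTokenTracker {
  private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
  private final long timeoutMillis;

  SessionTokenTracker(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  /** Records activity for a token at the given time. */
  void touch(String token, long nowMillis) {
    lastSeenMillis.put(token, nowMillis);
  }

  /** Removes all tracked data for the token. */
  void clear(String token) {
    lastSeenMillis.remove(token);
  }

  boolean isTracked(String token) {
    return lastSeenMillis.containsKey(token);
  }

  /** Removes tokens idle for at least the timeout; returns how many were removed. */
  int purgeExpired(long nowMillis) {
    int removed = 0;
    for (Map.Entry<String, Long> e : lastSeenMillis.entrySet()) {
      // remove(key, value) only succeeds if the token was not touched meanwhile
      if (nowMillis - e.getValue() >= timeoutMillis
          && lastSeenMillis.remove(e.getKey(), e.getValue())) {
        removed++;
      }
    }
    return removed;
  }

  /** Starts a daemon thread that checks for expired tokens every 5 minutes. */
  ScheduledExecutorService startSweeper() {
    ScheduledExecutorService sweeper = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "session-token-sweeper");
      t.setDaemon(true);
      return t;
    });
    sweeper.scheduleAtFixedRate(
        () -> purgeExpired(System.currentTimeMillis()), 5, 5, TimeUnit.MINUTES);
    return sweeper;
  }
}
```

Passing the current time into touch and purgeExpired keeps the expiry logic testable without waiting for the scheduler.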
FOR FUTURE DL ENHANCEMENTS
Supporting OWL 2 EL
- Nesting - supported by anonymous concepts
- GCI
- multiple sufficient conditions - anonymous concepts with "equivalentClass" relationship
- complex left hand side - anonymous concepts with "equivalentClass" relationship
- Metadata model
- Role (isTransitive, isReflexive)
- DataRole(getDomain - e.g.)
- Domain - anonymous concept expression
- Range - anonymous concept expression
- Disjoint sets - set of sets of concepts
- property chains - new objects and services - classifier uses this.
- PropertyChain (List<Concept>->Concept)
- never group rels (metadata should be in terminology)
- data type properties (data relationships) - new objects and services
- like relationship but with an operator, and a value (could reuse the relationships table).
- Concept instances - new objects and services (e.g. states, etc)
- Templates - build a service for this that replaces the "edit" service
- Export to OWL.
Supporting Expressions
- Post-coordinated expression service using expression syntax - creates a concept without descriptions and a UUID concept id
- Query language
- Template editing