EVS Status Meeting


January 6, 2009 (version 1)

February 17, 2009 (version 1)

March 17, 2009 (version 1)

April 21, 2009 (version 1)

May 19, 2009 (version 1)

June 16, 2009 (version 1)

July 2009 – No meeting held.

August 18, 2009

September 15, 2009

October 2009 – No meeting held

November 17, 2009 (version 1)

November 17, 2009 (version 2)

December 15, 2009 (version 1)

January 6, 2009 (version 1)

EVS Meeting January 6, 2009 (Version 1)

Attendees: Frank Hartel (phone), Sherri De Coronado (phone), Margaret Haber (phone), Larry Wright, Gilberto Fragoso, Laura Roth, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Nicole Thomas, Rob Wynne, John Park, Tracy Safran, Theresa Quinn (phone), Erin Muhlbrandt (phone), Nicholas Sioutos (phone), Amy Jacobs (phone), Wen-Ling Shaiu (phone), Carol Creech (phone), Sharon Quan (phone), Stephanie Lipow (phone), Joanne Wong (phone), Bob Dionne (phone), David Yee (phone).

Attachments: Header Change Proposal

Agenda items:

1. Update on the Wiki: Due to time constraints, Sherri will demonstrate the new features of the Wiki when she is in town next week, January 13 and 14.

2. cogZ mapping tool demo: Bob demonstrated the cogZ mapping tool. CogZ is being looked at as a potential tool for mapping between the mouse and human anatomy. CogZ runs with Protégé and uses Prompt. The tool looks promising for our use. Bob will be in town on January 13th and 14th, and this can be discussed more at that time. Sherri and Frank will also be here. Sherri would like to meet with editors at that time.

3. Protégé update: The BGT editors provided feedback on the UAT for Protégé 1.3. User acceptance testing has been completed. The next version of Protégé should be released in about one week. Protégé 1.3 will be released for BiomedGT, not Thesaurus, at this time. Liz reported that using the copy tab to edit two concepts at the same time was found to be causing problems (a tracker item was entered in Gforge for this and the bug has been fixed). Also, Liz mentioned that the domain and range constraints of properties prevented a full modeling test with PATO; Gilberto indicated this would cause a classification problem but the modeling should be feasible during testing. Nicole remarked that this is the first version of Protégé we have failed to break.

4. Project plan for BiomedGT: Liz discussed the project plan for BiomedGT version 1.0. This is a production version. The plan and timeline are as follows: BiomedGT clean-up, which includes kind clean-up and programmatic deletion of branches (95 days); prep for implementing BFO, the upper level ontology that we will use as the backbone in BGT (10 days); concept type remediation to help prepare for BFO (28 days); looking at restrictions and annotation properties (5 days); BFO implementation and external ontology (38 days); hands-on use of external ontologies in Protégé modeling, i.e., training the editors (2 days).

5. Referencing copyright holders and survey instruments and psychometric products for population sciences: Frank would like to know if the copyright should be included in the definition or as a separate property. He asks that someone look at how APA represents their instruments. Action Item: Lori will look at this and report back.

6. Using and citing APA definitions as NCI definitions for population science: Regarding the use of American Psychological Association (APA) definitions as NCI definitions for population science, Frank would like a volunteer to go to the APA site to be sure that we cite them properly. Action Item: Lori will go to the site and report back.

7. Introduction to Vocabulary Knowledge Center: Frank provided an introduction and overview of the Vocabulary Knowledge Center (VKC) that is being used to support caBIG. This is an organization outside of NCI; VKC is part of Mayo. They are being used to provide support to the user community, in part by providing a Helpline. Eddie is helping VKC gather EVS information; Sherri and Larry are also providing information. Per Frank, if we see incorrect information being disseminated, or information missing, please contact them via the website. Frank encourages editors to share information with VKC.

8. Update on examination of ontologies for federation with BiomedGT: Liz announced that a draft document of potential ontologies to federate with BiomedGT has been completed. Action item: an EVS subgroup will meet for a second pass at the document.

9. Header change request for ‘List Extensible’: Terry submitted a request for a header change to add the property ‘list extensible’. This will enable us to include the CDISC codelist extensible flag that is included on the spreadsheet they provide us but is not currently being entered into Thesaurus. By adding this property, all the data on the spreadsheet will be represented in Thesaurus. There was general discussion about possible use by other sources and whether this will create a problem. Because this is applicable to CDISC subsets, there should not be a problem. The group approved the request. Per Laura, the form will go on Gforge, and Liz and Nicole will help create the property.

10. New Meeting Invite to be sent out: EVS meetings will occur every two weeks starting this week; sub group meetings occur on alternate weeks. A new invite will be sent out by Lori. Per the group request, agenda items will be sent with the invite, and alone in a separate email.


Header Change Proposal

1. Use Case

CDISC has a feature of their Codelists that indicates whether the set of values is extensible or not. This information is attached to the Codelist. It needs to be in Thesaurus so that it can be pulled out with the CDISC data and so that the data in Thesaurus match exactly what is published by CDISC.


2. Summary of the proposed semantics

We would like to be able to add the property of Extensible List to CDISC concepts that are codelist terms. The possible values would be Yes or No.
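
As a rough illustration of the proposed semantics, the sketch below models the property as a two-valued attribute on a codelist concept. The identifier `Extensible_List`, the dictionary layout, and the concept code `C00000` are assumptions made for the example; the proposal itself only specifies the purpose of the property and the permitted values Yes and No.

```python
# Hypothetical names throughout: the proposal fixes only the property's purpose
# and its two permitted values, not its identifier or storage format.
ALLOWED_VALUES = ("Yes", "No")


def set_extensible_list(concept: dict, value: str) -> dict:
    """Return a copy of the concept with the Extensible_List property set."""
    if value not in ALLOWED_VALUES:
        raise ValueError(
            f"Extensible_List must be one of {ALLOWED_VALUES}, got {value!r}"
        )
    updated = dict(concept)
    # Copy the property map so the original concept is left untouched.
    updated["properties"] = {
        **concept.get("properties", {}),
        "Extensible_List": value,
    }
    return updated


# Made-up codelist concept used only to exercise the function.
codelist = {"code": "C00000", "name": "Example Codelist", "properties": {}}
tagged = set_extensible_list(codelist, "Yes")
```

Restricting the value to a closed set mirrors the Yes/No semantics described above and would reject free-text entries at load time.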

3. Impact On Thesaurus-Related Documents And Other Artifacts

Using the Guidelines provided, the following EVS-related artifacts are impacted as shown.

|Artifact |Impact Assessment by Change Sponsor |Decision |
|Modelers Guide |YES | |
|Style Guide |YES | |
|Browser Guide |NO | |
|API Guide |NO | |
|Semantics Document |YES | |
|Role_to_Use_Case_Map |NO | |
|T-Box Diagram |NO | |
|Self-Defining Terms |YES | |
|Config File |YES | |
|caCORE Change Request (CR) |NO | |

4. The modeler(s) expected to be the primary Tester(s) of the change:

Theresa Quinn

February 17, 2009 (version 1)

EVS Meeting February 17, 2009 (Version 1)

Attendees: Frank Hartel (phone), Margaret Haber, Larry Wright (phone), Gilberto Fragoso, Laura Roth, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Nicole Thomas, Rob Wynne, John Park, Tracy Safran, Theresa Quinn (phone), Erin Muhlbrandt (phone), Nicholas Sioutos (phone), Amy Jacobs (phone), Wen-Ling Shaiu (phone), Carol Creech (phone), Sharon Quan (phone), Stephanie Lipow (phone), Joanne Wong (phone), Bob Dionne (phone), Harold Solbrig (phone), David Yee (phone), Will Garcia (phone).


Agenda items:

1. Discuss newly formed semantics group within CBIIT. Submitted by Frank Hartel.

Frank described a fundamental shift at the top level of CBIIT. The new vision will shift away from focusing solely on the cancer research community to a broader, more diverse community. The new focus will be on the national clinical care community. There has also been a goal shift. The Center is looking to specify and provide a top level design of consistent semantics and syntactic interoperability to this broader community. It is shifting away from software development and looking to provide a global model, similar to the HL7 philosophy. EVS will be key in supporting this new direction, and the work will be more immediately available to people working in the clinical care community. The BGT vision may need to be brought into reconciliation with this new vision.

Tracy asked if the new vision is beyond the NCI mandate. Margaret commented that caBIG was held up as a model, which gives the Center an edge in this effort.

Nick asked how Thesaurus might be impacted. Frank said the shift may require whole new ontologies and EVS would be more tied into the overall operability of the day to day operations. The Thesaurus will still be created and produced.

2. Discuss creating a standard for modeling genetic fusions. Submitted by Liz Hahn-Dantona and John Bradsher.

Liz provided an overview. Currently we have a few Fusion_Genes and a handful of Fusion_Proteins, but they are not complementary. Liz proposed we model only Fusion_Gene concepts, as many products may derive from one fusion. If this is adopted, what do we do with existing Fusion_Protein concepts? 1) Leave them and model the complementary fusion gene concepts, 2) deprecate and then retire them and model the complementary genes, or 3) convert the concepts to fusion genes and re-tree them? Nick said he needs fusion proteins for disease modeling. Action Item: After general discussion it was decided that Nick, Liz and John will meet separately to determine the best way to handle the fusion protein.

3. Propose model restrictions for terms in molecular abnormality kind that are fusion protein expression concepts and other pre-coordinated gene and protein expression abnormalities. Submitted by Liz Hahn-Dantona.

This item relates to number two above. There are several terms in the Molecular_Abnormality_Kind that are Fusion_Protein_Expression concepts and other precoordinated gene and protein expression abnormalities. They have no modeling but are used as fillers in Disease concepts Disease_May_Have_Molecular_Abnormality. Liz proposed we use two restrictions for modeling these fusion protein concepts. 1) Molecular_Abnormality_Has_Associated_Gene could point to one or more genes and 2) Molecular_Abnormality_Has_Genetic_Abnormality, could point to abnormalities like over-expression, translocation, truncation etc.

Action Item: Nick, Liz and John to discuss and determine the best way to handle this matter.

4. Discuss potential workflow change for the weekly Prompt. Submitted by Liz Hahn-Dantona.

Editors doing batch loads can interfere with others working on creating concepts and can cause performance issues. It may be prudent to change the non-production baseline comparison to Mondays so that batch loads and batch edits can be done on Fridays and the weekends when fewer editors would be impacted. Per Laura, let’s move batch loads to Friday late in the day and evening and avoid weekends for now. This will start at the beginning of March. Batch edits do not need to adhere to this schedule except during the production build week.

Larry asked about Protégé productivity overall. Due to performance issues, it may be a while before the latest version of Protégé is released for Thesaurus.

There was a general discussion about the Change Ontology Tab and how it may affect performance. Liz is the only one who uses this tab, so it was suggested the tab be changed so that only Liz can view it. Action Item: Gilberto will check with Stanford about whether or not this tab can be removed and will report back.

5. Proposal for clean-up of radioconjugate in BiomedGT. Submitted by John Bradsher.

Due to Centra not working this will be discussed at the next meeting.

6. Discuss how to handle source code for Zebrafish strains.  Submitted by Sherri De Coronado and John Bradsher.

Due to Centra not working and Sherri’s absence, this will be discussed outside of the meeting or at next meeting if still pending.

7. Discuss tagging FDA CDER data element terminologies using associations such as “structured product label,” “UNII codes,” RPS, ICSR, device failure codes, or with the FDA_Table property. Submitted by Wen-Ling Shaiu.

Wen-Ling described the issue with tagging FDA data. Action Item: Per Margaret’s request, Wen-Ling will compile terminology sets from FDA CDER, and examine what tag (Property or association) was used for them; then the group involved will review and decide how to proceed.

March 17, 2009 (version 1)

EVS Meeting March 17, 2009 (Version 1)

Attendees: Frank Hartel, Margaret Haber, Sherri De Coronado (phone), Gilberto Fragoso, Larry Wright, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Nicole Thomas, Rob Wynne, John Park, Tracy Safran, Theresa Quinn (phone), Erin Muhlbrandt (phone), Nicholas Sioutos (phone), Amy Jacobs (phone), Wen-Ling Shaiu (phone), Carol Creech (phone), Sharon Quan (phone), Stephanie Lipow (phone), Joanne Wong (phone), Bob Dionne (phone), David Yee (phone), Will Garcia (phone), Cynthia Minnery (phone).

Attachments: None

Agenda items:

8. Update on changes to EVS program due to White House stimulus package and NCI budget increase. Submitted by Frank Hartel.

Frank spoke about stimulus money that NIH has received. The money has not yet come to NCI, but assuming we get the stimulus funds and additional money from the Office of National Coordinator to support the general healthcare community, we will need to respond quickly to change and be ready to work with new groups that work with Charlie Mead. We will be providing semantic support for information architects and their services. The focus will be on HL7 instead of caGRID. We do not expect to stop anything that we are currently doing, but we will be doing additional things.

9. FDA definition suppression problem. Submitted by Tracy Safran.

Tracy discussed ways to suppress FDA definitions in BiomedGT. There was agreement to change the attribution to something unique so it could be suppressed and not available to the outside. Action Item: Margaret would like this done before the next scheduled release.

10. Update of NCI_META_CUI and UMLS_CUI properties. Submitted by Tracy Safran.

Tracy discussed the need to do a large batch edit on NCIt (and BioPortal) to update the NCI_META_CUI and UMLS_CUI properties. We need to determine the best way and time to do that. The suggestion is to export OWL as usual, perform the batch edit in local client mode, and load the resulting OWL back into the database. Some downtime is involved: editing needs to stop for about a day. Tracy is willing to do it on a weekend. Action Item: The editing cycle will end on the 27th instead of the 30th this month, and Tracy will do the batch edit that weekend.
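
The middle step of that suggestion (editing an exported OWL file offline) might look roughly like the sketch below. It is only an illustration: the `http://example.org/thesaurus#` namespace, the use of `rdf:ID` as the concept code, and the sample codes and CUIs are all assumptions, and the actual export layout and Protégé tooling may differ.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL_NS = "http://www.w3.org/2002/07/owl#"
EX_NS = "http://example.org/thesaurus#"  # hypothetical property namespace


def batch_update_cui(owl_xml: str, new_cuis: dict[str, str],
                     prop: str = "NCI_META_CUI") -> tuple[str, int]:
    """Set `prop` on every class listed in `new_cuis`; return (xml, change count)."""
    root = ET.fromstring(owl_xml)
    changed = 0
    for cls in root.iter(f"{{{OWL_NS}}}Class"):
        code = cls.get(f"{{{RDF_NS}}}ID")
        if code not in new_cuis:
            continue  # concept not part of this batch
        elem = cls.find(f"{{{EX_NS}}}{prop}")
        if elem is None:
            # Property absent so far: create it.
            elem = ET.SubElement(cls, f"{{{EX_NS}}}{prop}")
        if elem.text != new_cuis[code]:
            elem.text = new_cuis[code]
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Tiny made-up export with two concepts; codes and CUIs are placeholders.
SAMPLE = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:owl="http://www.w3.org/2002/07/owl#" '
    'xmlns:ex="http://example.org/thesaurus#">'
    '<owl:Class rdf:ID="C0001"><ex:NCI_META_CUI>CL000001</ex:NCI_META_CUI></owl:Class>'
    '<owl:Class rdf:ID="C0002"/>'
    "</rdf:RDF>"
)

updated_xml, n_changed = batch_update_cui(
    SAMPLE, {"C0001": "CL999999", "C0002": "CL000002", "C0003": "CL000003"}
)
```

Codes in the mapping that do not occur in the file (here `C0003`) are skipped silently; a real run would probably want to report them before the file is loaded back into the database.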

11. Notice of QA report now being made available at Submitted by Tracy Safran.

The QA report is where to get detailed information about concepts. Rob looks at this regularly. Editors should go to the link and check their editing. Larry would like to know if monthly statistics could be added to the report. Action Item: Larry will send email to Tracy outlining specifics of his request.

12. Elimination of self-defining terminology in the NCIt. Submitted by Nicole Thomas.

Discussion about whether self-defining terminology concepts in NCIt are still needed. In Protégé, metadata provides this information. There was discussion about retiring these concepts. The group did not agree that retirement was the best approach. The group distinguished two parts of the problem: OWL allows for self-documentation without supporting concepts, and users use incorrect concepts for coding. How to handle this issue remains under discussion.

13. User error/bug issue in Protégé.  Submitted by Liz Hahn-Dantona.

Liz demonstrated for the group how double clicking in Protégé causes the entire branch to disappear, after which it must be rebuilt. In the 1.4 version double clicking will not cause this to happen. Action Item: Liz will send out an email alert to editors describing the problem, how it occurs, and how to avoid it. Editors, please avoid double clicking until we have the 1.4 version.

Wen-Ling asked about the simultaneous editing problem. This issue will also be fixed in the 1.4 version.

14. Partitioning of NCIt into different OWL files (e.g., for each domain) to facilitate reuse by others, particularly by BGT. Submitted by Gilberto Fragoso.

Gilberto discussed that OWL can take one monolithic file and break it into different files. We are getting close to being able to do this: it should be possible with the 1.4 version, and we can start testing it in 1.3. Frank pointed out that we will have to publish all these different files; there will be more to manage and maintain. Larry and Margaret stated this is worth pursuing. Others mentioned that caMOD and BiomedGT will be users. Frank said Value Set may be able to do some of this work. If there are use cases, 5.1 could have this available and we would have support. Larry asked Frank for a spec sheet.

15. Summary of the BiomedGT content meeting that was held in February. Submitted by Liz Hahn-Dantona.

Liz gave a summary of the BiomedGT meeting held in February. Detailed information, meeting notes, and the revised BiomedGT Project Plan can be found on the BiomedGT Gforge page under version 1.0.

Margaret asked about migrating information from BiomedGT to Thesaurus. Gilberto explained the difficulties of doing this: modeling would not convey, and definitions wouldn’t always convey. Referencing to BGT would not present this problem. How to handle updates of NCIt from common areas with BGT remains open.

16. Discussion of Radioconjugate domain in BiomedGT.  Submitted by John Bradsher.

This will be discussed in a sub-group meeting.

April 21, 2009 (version 1)

EVS Meeting April 21, 2009 (Version 1)

Attendees: Frank Hartel (phone), Margaret Haber, Sherri De Coronado (phone), Gilberto Fragoso, Larry Wright, Laura Roth, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Nicole Thomas, Rob Wynne, John Park, Tracy Safran, Theresa Quinn (phone), Erin Muhlbrandt (phone), Nicholas Sioutos (phone), Amy Jacobs (phone), Carol Creech (phone), Brian Carlsen (phone), Sharon Quan (phone), Stephanie Lipow (phone), Joanne Wong (phone), David Yee (phone), Will Garcia (phone).

Attachments: None

Agenda items:

1. Update on NCIt and NCI Meta Browsers. Submitted by Larry Wright.

Larry introduced the NCIt browser that is scheduled to launch on May 13. Tabs and hyperlinks demonstrated. Search has a bug that will be fixed before release. Hierarchy return is slow and will be looked at before release. Action Item: Larry to send URL out to all. or Please review and send comments/ suggestions to Larry.

The new Meta browser was also introduced; it is currently in phase one of development. Much of the functionality of the Thesaurus browser will be used for the Meta browser. Frank has concerns about data load issues with Meta that may affect the browser. There was a general discussion about the loader issues. Alameda has been asked to help with this. Laura asked if in the meantime, the incomplete data can be taken down. Tracy states it is on LexBIG and caDSR is using it, so it cannot be taken down.

2. Editing of retired concepts. Submitted by Gilberto Fragoso.

Gilberto described the issue of retired concepts being edited. There are three problems: 1) There is a loophole in batch loads that allows retired concepts to be edited. This will be fixed in the next version of Protégé. 2) Matching does not exclude retired concepts. This has already been fixed. 3) Editors ask a workflow manager to edit a retired concept. Action Item: There should be no editing of retired concepts. If an editor wants a retired concept edited, he/she needs to send an email to Laura for special consideration.

3. Concept_Status_Property. Submitted by Gilberto Fragoso.

Gilberto discussed retired concepts that are missing an appropriate concept status property. This is usually added automatically but is not working properly in this version of Protégé. Version 1.4 will fix the problem, but we need to deal with it in the meantime. Action Item: 1) Editors, when doing a pre-merge you must manually add a concept status property of retired concept. 2) Workflow Managers, if an editor forgets to add it, the workflow manager will need to do it at the time they approve the merge. 3) If the editor and the workflow manager miss it, Tracy, Rob or John will need to catch it and add the property then.
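
The final safety net in step 3 could be supported by a simple check such as the sketch below, which lists retired concepts that lack the status value. The names `Retired_Concepts`, `Concept_Status`, and `Retired_Concept`, as well as the dictionary layout, are assumptions for illustration and are not taken from the actual Thesaurus schema or QA report.

```python
def retired_missing_status(concepts,
                           retired_root="Retired_Concepts",
                           status_prop="Concept_Status",
                           retired_value="Retired_Concept"):
    """Return codes of concepts treed under the retired branch without the status."""
    missing = []
    for concept in concepts:
        is_retired = retired_root in concept.get("parents", [])
        statuses = concept.get("properties", {}).get(status_prop, [])
        if is_retired and retired_value not in statuses:
            missing.append(concept["code"])
    return missing


# Made-up concepts: one correctly flagged, one missed, one still active.
sample = [
    {"code": "C1", "parents": ["Retired_Concepts"],
     "properties": {"Concept_Status": ["Retired_Concept"]}},
    {"code": "C2", "parents": ["Retired_Concepts"], "properties": {}},
    {"code": "C3", "parents": ["Gene"], "properties": {}},
]
to_fix = retired_missing_status(sample)
```

Running a check like this before each build would give Tracy, Rob or John a short list to correct by hand until version 1.4 restores the automatic behavior.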

4. CTCAE 4.0 update. Submitted by Larry Wright.

Larry discussed the CTCAE Update Project, from CTCAE 3 to CTCAE 4. It is a VCDE supported project, and is making CTCAE 100% MedDRA compatible. EVS has been assisting with the update by matching to Thesaurus concepts, creating new concepts and writing definitions. Margaret remarked on the importance of this work. It is a high priority, and the quality of data (accuracy and spelling) was emphasized. Working group meetings have identified things that will require more work from EVS, such as reviewing CTCAE definitions.

CTCAE 4.0 will be a stand-alone terminology and will need to be loaded into LexBIG in June. We are creating the pre-coordinated OWL version similar to the CTCAE3 that is in NCIt so that it can be loaded to LexBIG. Per Frank, VCDE approval is not needed before it gets loaded but will be needed before it is used. We will have some long term responsibility for helping with clean up and maintaining it after the VCDE project ends in June.

5. Radlex 2.01. Per Tracy, Radlex 2.01 can be viewed at

6. miRNA Concepts. Submitted by John Bradsher.

John led a discussion about naming and defining microRNA concepts. The group decided Hugo nomenclature will be used for the PT; Sanger will be used for Full SYN. The definition will be structured like other gene definitions using the standard of normal gene expression but adding information about roles in disease. Laura states these definitions should be consistent across the board for miRNA concepts. Action Item: John to update the Style Guide noting the definition structure change for these concepts.

7. BiomedGT only—external vocabularies. Submitted by Gilberto Fragoso.

Gilberto led a discussion about external vocabularies for BiomedGT. It was noted that publication varies between the sources but we will publish only their stable versions. Regarding history, we will document the external history practices and point to the appropriate authority. It was determined that BFO binning can begin. A sub-group meeting for BiomedGT will be arranged.

May 19, 2009 (version 1)

EVS Meeting May 19, 2009 (Version 1)

Attendees: Margaret Haber (phone), Sherri De Coronado (phone), Larry Wright, Gilberto Fragoso, Laura Roth, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Nicole Thomas, Rob Wynne, John Park, Tracy Safran, Erin Muhlbrandt (phone), Nicholas Sioutos (phone), Amy Jacobs (phone), Wen-Ling Shaiu (phone), Carol Creech (phone), Sharon Quan (phone), Stephanie Lipow (phone), Joanne Wong (phone), Brian Carlsen (phone), Bob Dionne (phone), David Yee (phone), Will Garcia (phone), Cynthia Minnery (phone).

Attachments: None

Agenda items:

1. Creation of OLD_ASSOCIATION and OLD_SOURCE_ASSOCIATION properties. Submitted by Gilberto.

Regarding retiring concepts with associations, we are going to support the creation of OLD_ASSOCIATION and OLD_SOURCE_ASSOCIATION properties. These keep the association data in retired concepts from interfering with the valid use of these associations while preserving the information. For BiomedGT we will proceed to implement. Rob will look into modifying the report writer. Use in NCIt was also discussed, and the group decided it was OK to go ahead and implement this.

Action Item: Gilberto to complete change request and send out.

2. NCIt and NCIm Browser update. Submitted by Larry.

Larry gave an update on the process of the new browsers and demonstrated both. Discussion about the browsers being 508 compliant, which they will be.

NCIt Browser should reach production in the next few days. Sherri demonstrated the browser in the VCDE-Architecture F2F meeting. Positive feedback was received; the new browser is more user-friendly and faster than the old browser. Action Item: All, look at it for issues before production. Tracy recommends adding a Gforge link on the wiki so users can report issues. Sherri suggests we also send the release info to Mayo for VKC wiki and make announcement to the caBIG and EVS lists.

NCIm has issues with LexBIG data. Working on fixing discrepancies between RRF files and LexBIG. Aiming for July to launch the first version.

3. Gforge notes related to browsers and LexBIG.  Submitted by Larry.

Where and how to report what on browsers and LexBIG data issues. Gforge link to submit bugs. for Thesaurus, and for Metathesaurus.

4. CTCAE Update. Submitted by Larry.

Work is wrapping up on version 4.0. Terms are entirely MedDRA compliant, which is the international standard. Sherri will request tech writer support to help polish the document this summer. The transition from 3.0 to 4.0 will be in October. EVS will be involved in the ongoing maintenance.

5. Report back from VCDE/Architecture F2F. Submitted by Larry.

Sherri and Margaret reported information from the VCDE/Architecture F2F meeting. Regarding the Cardiovascular Research Grid, there are six projects comprising the research grid. Their data management needs include terminology. ECG is one of those terminologies. ECG is now in Bioportal. Al had worked on this project. These are NHLBI and Duke groups in conjunction with other institutions. They are using our tools and some of our terminology. Margaret also mentioned the MelaGrid, a newer smaller project to share data and resources among the NCI funded Skin SPORES. NCIt and caBIG infrastructure is being used as the backbone of all these Grids. Link for group to view is at

6. Semantic requirements clarification.  Submitted by Larry.

Harold created a good overview of the standards and he is going to create a wiki page with the information. When the wiki page is available the link will be distributed to the group.

7. Information about caBIG annual meeting. Submitted by Liz.

John Bradsher will be doing a poster on the miRNA content that we are adding. Laura suggests we do a presentation or poster of the Browsers. Mayo will be doing a presentation on LexBIG 5. Discussion of other topics for possible presentation or poster. Action Item: Liz and John to get together with Gilberto to create a poster for BFO implementation. For those presenting, May 29th is the deadline for abstracts.

8. Training for next version of Protégé. Submitted by Gilberto.

Action Item: All editors required to attend training session on Tuesday May 26th at 1 PM on site. Gilberto will arrange a room. Training to cover new features of Protégé version 1.4. This will be the production version for the next year or longer. It is important to thoroughly test this during the user acceptance period.

June 16, 2009 (version 1)

EVS Meeting June 16, 2009 (Version 1)

Attendees: Sherri De Coronado (phone), Larry Wright, Gilberto Fragoso, Laura Roth, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Nicole Thomas, Rob Wynne, John Park, Erin Muhlbrandt (phone), Nicholas Sioutos (phone), Amy Jacobs (phone), Wen-Ling Shaiu (phone), Carol Creech (phone), Sharon Quan (phone), Stephanie Lipow (phone), Joanne Wong (phone), Brian Carlsen (phone), Bob Dionne (phone), Will Garcia (phone), Cynthia Minnery (phone).

Attachments: None

Agenda items:

1. Shallow hierarchy in the retired_concepts bin, using intermediate “by year” nodes to tree under. Submitted by Gilberto.

Gilberto discussed the structure of the retired concepts. Retired concepts 2008 and 2009 can be re-treed into a more structured tree to minimize performance issues during retirements and merges, associated with retreeing in flat trees. Suggestion is to tree by year. Archive cycle can be automatic or done at the end of the year. UAT instance didn’t take very long. Group agreed to go ahead with this grouping by year.
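
The by-year grouping agreed above can be pictured with the short sketch below. The node naming pattern `Retired_Concepts_<year>` and the input format (concept code mapped to retirement date) are assumptions for illustration; the minutes do not specify how the intermediate nodes are named or how the re-treeing is carried out in Protégé.

```python
from collections import defaultdict
from datetime import date


def bin_retired_by_year(retired: dict[str, date],
                        root: str = "Retired_Concepts") -> dict[str, list[str]]:
    """Group retired concept codes under one intermediate node per retirement year."""
    bins: dict[str, list[str]] = defaultdict(list)
    for code, retired_on in sorted(retired.items()):
        bins[f"{root}_{retired_on.year}"].append(code)
    return dict(bins)


# Made-up codes and dates, used only to show the resulting shape.
tree = bin_retired_by_year({
    "C0003": date(2009, 2, 1),
    "C0001": date(2008, 11, 3),
    "C0002": date(2009, 5, 20),
})
```

Each year node then holds a bounded number of children, rather than one flat node collecting every retirement.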

2. Feedback from Editors on the UAT for Protégé 1.4. Submitted by Gilberto.

Laura mentioned that performance didn’t seem better than 1.2 for Thesaurus editors.

Fridays are particularly slow in 1.2, taking about 1 hour per concept. This has been a problem for some time. Liz stated that the Friday slowdown has not been observed in 1.4, but others had not really tested that or noticed a difference.

John saw a real slowdown: a concept with 17-18 properties took 6 minutes to save. But this was prior to the performance fix introduced mid-UAT.

Liz stated that before the fix it took 5-10 minutes to save modeled concepts; after the fix it took only 30 seconds.

Per Erin, searching is going quicker. Setting up the query is slightly easier. It is easier to traverse the hierarchies, and moving between tabs was faster and did not freeze as much.

Wen-Ling said the advanced query isn’t any faster on big queries like FDA.

Nick reported it suddenly freezes like in 1.2, and it isn’t any faster.

Wen-Ling said the GUI refresh was faster.

Liz commented that in 1.2 slowdowns were exponential: if something took 2-3 minutes, it would then take 15 once things slowed down. We are not seeing that in 1.4; if an action takes 2 minutes, it still takes 2 minutes later on.

Liz added that after the configuration fix it takes no more than 1 minute to save.

Discussion about it accepting an inappropriate filler value. Per Gilberto, this was not fixed but gets caught in classification. There are about 15 Gforge items that are non-performance issues; 3 or 4 are show stoppers and are being fixed.

Laura reiterated this is the tool we will be using so if there are issues they must be voiced and addressed now.

3. Batch loading in Protégé 1.4. Submitted by Gilberto.

There is no improvement with batch loads but batch edit did improve enough so that concurrent editing can continue. Batch loading will take about 1 hour, but this depends on the number of concepts batch loaded. Per Laura we will need to schedule time for batch loads but will also need flexibility for things that need to be done quickly. Bob commented anything above about 100-150 concepts really taxes the system.

Action item: The same batch loads that were used last week will be tested in BiomedGT this week to check performance.

The other issue with batch loading is that we are able to tree under a child of a retired concept. Per Gilberto, that will be fixed in the final version.

Discussion about refreshing the server: the more people logged in, the more the server has to update. When people log out of Protégé the session hangs on for some reason, so when they log back in they are logged into multiple instances. Refreshing the server gets rid of the stale log-ins. Action Item: Bob will talk to Stanford about the extra connections and get that cleaned up.

There was a discussion that stand alone batch loads would be quicker but server would still need to be shut-down.

Doing batch-edit is OK. Action Item: Editors to notify others that you are doing a batch-edit so they are aware.

4. Update on transition to Protégé 1.4. Submitted by Gilberto.

All please continue testing and posting Gforge items. On Friday the final triage of Gforge will be done. Action Item: Build on Tuesday will need to be tested. Those builds will also go through regression testing. If everything is OK we push to production. We will have to stop editing for a day or two while we rebuild the databases. This would also be published on Monday June 29. BGT baseline will be done on Tuesday, June 30. Laura says let’s get Thesaurus back up first. Gilberto is gone next week through June 26. Laura wants us to touch base before we push things forward given that Gilberto will be unavailable.

5. Editor suggestions for the NCI Meta Browser. Submitted by Larry.

Update on Meta browser by Larry. Issues have been identified and cleaned up. Performance is an issue that we are working on. Action Item: Carol, Al, Terry to look at the data because they are more familiar with it. Other editors to look at it as well. Please send comments and recommendations to Larry. Links were sent out with the last meeting notes.

Larry reports the issue with the NCIt browser link and firewall is being worked on by NIH and may take a couple of days to resolve.

6. Legacy Concept Name property to store old concept name identifiers. Submitted by Larry.

Per Larry, concept names are no longer being used, but we still allow people to look up concepts by their old concept names, so the proposal is to create a property legacy_concept_name. This will only be done for existing concepts, including retired concepts, but not for any new concepts created after the legacy_concept_name is assigned. The group agrees to the new property. Action Item: We need a list of concept names. A header change needs to be done. Every concept currently active in NCIt will be given this property through a batch load. Gilberto will do the data load. Larry will do the header change.

July 2009 – No meeting held.

August 18, 2009

EVS Meeting August 18, 2009 (Version 1)

Attendees: Sherri De Coronado (phone), Larry Wright, Gilberto Fragoso, Laura Roth, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Nicole Thomas, Rob Wynne, John Park, Erin Muhlbrandt (phone), Nicholas Sioutos (phone), Amy Jacobs (phone), Wen-Ling Shaiu (phone), Carol Creech (phone), Sharon Quan (phone), Stephanie Lipow (phone), Joanne Wong (phone), Brian Carlsen (phone), David Yee (phone), Kim Ong (phone), Cynthia Minnery (phone).

Attachments: None


1. Gilberto gave an update on Protégé. We are close to next version. User acceptance testing will begin next week. NCIt editors will test first; BGT editors will test the following week.

2. Margaret informed the group that MedDRA production was discussed at the management meeting, and the decision was made that MedDRA version 12 is to be extracted from RRF and loaded in LexBIG one time only, as a standalone terminology. There was discussion about the issue with SMQ. Action Item: Alameda to begin work on this and report back.

Agenda items:

7. Creation of special concept_status to denote concepts that should not be used for coding. Submitted by Gilberto.

Discussion about outside users unknowingly using header, retired, and self-defining concepts for coding, and how to alert them not to use these concepts for that purpose. In-depth discussion about the Concept Status property and how to flag self-defining terminology as being for internal administration and not recommended for coding. Larry voiced concern about gray areas such as the antiquated terms. The group agreed to flag the concepts with a concept status property; the notation is yet to be determined. Laura suggested: “not recommended for coding”. It was decided that we will do the self-defining terminology first; the rest can be done in increments. Once it is complete, an announcement can be made.

8. History and reasoning behind the use of NCBI Taxon ID in Thesaurus and request for creation of a new property that is ITIS taxonomy specific. Submitted by Erin and Terry.

Erin reported that we will be creating lists of microbes for CDISC controlled terminology in conjunction with FDA. This will be a terminology coded subset and will point to an outside terminology list for anything we don’t have. This will be accomplished by attaching a Taxon ID. The question is which taxonomy to use. The group felt NCBI is the taxonomy most used for organisms. Wen-Ling suggested ITIS as a superior taxonomy. There was no objection from the group to creating a new property if we decide not to use NCBI. Action Item: Wen-Ling to send out info on ITIS. From Sherri: I have to agree with Wen-Ling. This is an excellent resource, containing foreign names, subspecies, when and who named, and references. Far more complete than NCBI, and being used as the Tree of Life Taxonomic Backbone. It includes research species like Rattus norvegicus as well. Although note, if you are looking for zebrafish the research organism, it is Danio rerio (zebra danio) and it does have the synonym Brachydanio as well. If you look up zebrafish, you get a different organism. If you look up zebra fish (with a space) you get still something different. So, like anything else, its search is not perfect.

9. Internationalization and character data based on second part of Gforge item Submitted by Gilberto.

Gilberto summarized the internationalization and character data issues that we need to deal with. From input into Protégé through publication in the LexBIG database, we are covered for Unicode characters encoded in UTF-8; this covers international and special characters. However, we are not covered for what happens prior to input into Protégé. For background, Unicode (and its encoding forms: UTF-32, UTF-16, and UTF-8) and other international ISO character sets were reviewed. If we receive something from someone who is using, for example, Microsoft-specific characters, UTF-16, or another international character set, and we do a cut and paste, there can be conversion issues. Data must be reviewed. Laura asks if we can have a program that scans for question marks and other symbols before we go to production. Gilberto suggests that when we get a request to include Greek letters, we should spell them out. Per Stephanie, LVG from NLM can be used for conversion. Editors need to be aware of these symbols and that there will be issues in Protégé (as well as in any other programs). Margaret suggests a filtering system. Rob says we currently have a script that catches these characters. A problem character could make it to publishing but would be caught and corrected in the next version. Sherri suggests we check whether Microsoft has a program to prevent the conversion problems. Action Item: Sherri will inquire with Stanford; Tracy will check Microsoft support sites; Larry can contact OCE. Alameda will also look into the matter.
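The kind of pre-production scan Laura asked about could look roughly like the sketch below. It is only an illustration and is not the script Rob mentioned. The function names (scan_text, spell_out_greek) and the list of suspect characters are assumptions made for this example; the real list of problem characters would come from the editors.

```python
import unicodedata

# Assumption: punctuation that often arrives via cut and paste from Word or Excel.
SUSPECT = {
    "\u2018", "\u2019",  # curly single quotes
    "\u201c", "\u201d",  # curly double quotes
    "\u2013", "\u2014",  # en and em dashes
    "\u2026",            # ellipsis
    "\ufffd",            # replacement character: the original was already lost
}


def scan_text(text):
    """Return (line, column, character, note) for every character worth reviewing."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                findings.append((line_no, col, ch, "suspect punctuation"))
            elif ch == "?":
                # A question mark may be legitimate or may mark a failed conversion.
                findings.append((line_no, col, ch, "question mark"))
            elif ord(ch) > 127:
                findings.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return findings


def spell_out_greek(text):
    """Replace plain Greek letters with their names, e.g. beta for U+03B2."""
    out = []
    for ch in text:
        name = unicodedata.name(ch, "")
        spelled = None
        for prefix, capital in (("GREEK SMALL LETTER ", False),
                                ("GREEK CAPITAL LETTER ", True)):
            if name.startswith(prefix):
                rest = name[len(prefix):]
                # Only handle simple one-word names; accented forms are left alone.
                if " " not in rest:
                    spelled = rest.capitalize() if capital else rest.lower()
        out.append(spelled if spelled is not None else ch)
    return "".join(out)
```

Every finding still needs human review: a question mark or a non-ASCII character is often correct, so the scan only narrows down where to look.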

10. Auditing ST assignments of ST: Experimental Model of Disease. Submitted by Wen-Ling.

Sherri discussed an analysis done by Yehoshua Perl on NCIt looking at the STY of Experimental Model of Disease vs. Experimental Organism Diagnosis. The analysis indicated we had assigned the wrong STY. We disagreed with the analysis and declined to change the STY. Laura pointed out that sometimes the analysis does not accurately reflect what the concept represents, but we appreciate the feedback and it gives us things to look at. Margaret agreed and stated that a mouse diagnosis is not a model, and therefore the STY cannot be changed. Sherri has already responded to Perl about this matter.

11. Proactively developing support terminology in NCIt. Submitted by Margaret and Sherri.

Margaret and Sherri discussed that we are receiving more and more requests for support terminology such as IT concepts, Nanotech, etc. Laura thinks discussion with Management is warranted to have some guidelines as to what we want to add and how much. Examples given are Liz’s work with Nanotech; John’s work with microRNA, Terry and Erin’s work with CDISC on organisms, and Nicole’s work on IT concept requests. Action Item: Editors to discuss at next LM staff meeting.

September 15, 2009

EVS Meeting September 15, 2009 (Version 1)

Attendees: Margaret Haber, Sherri De Coronado (phone), Larry Wright, Gilberto Fragoso, Laura Roth, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Rob Wynne, John Park, Amy Jacobs (phone), Wen-Ling Shaiu, Sharon Quan (phone), Joanne Wong (phone), Brian Carlsen (phone), Cynthia Minnery (phone), Bob Dionne (phone), Tracy Safran

Attachments: None

Agenda items:

1. UTF-8 Discussion: In follow-up to last month’s meeting. How do we make sure what we receive is UTF-8? Frank wants to be sure the character issue is resolved. Tracy demonstrated how to convert and save documents. On the Confluence wiki under the EVS Protégé page, there is a child page for Protégé and UTF-8. Instructions are there to make sure that your characters are UTF-8 before you take text out of Word or Excel and paste it into Protégé. There are instructions for both Word 2003 and Word 2007. There is also a list of characters that are problems.
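As a complement to the wiki instructions, a received file can also be checked programmatically. The sketch below is an assumption-laden illustration and not part of the documented procedure. The function names are invented for this example, and the fallback to Windows-1252 (cp1252) is only a guess at what a Word or Excel export might contain.

```python
def check_utf8(data: bytes):
    """Return (True, None) if data is valid UTF-8, else (False, offset of first bad byte)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return False, err.start
    return True, None


def to_utf8(data: bytes, fallback: str = "cp1252") -> bytes:
    """Return data re-encoded as UTF-8.

    Assumption: anything that is not valid UTF-8 is in the fallback encoding.
    That is only a guess; UTF-16 or other character sets would be mangled,
    and a few byte values are undefined in cp1252 and will raise an error.
    """
    ok, _ = check_utf8(data)
    if ok:
        return data
    return data.decode(fallback).encode("utf-8")
```

For example, check_utf8(b"caf\xe9") reports the bad byte at offset 3, and to_utf8 then converts the same bytes on the assumption that they are Windows-1252. The converted text should still be reviewed by an editor before it goes into Protégé.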

2. Protégé Update: Gilberto gave the update. Development and testing are complete and we are moving toward deploying version 1.4. A number of things need to be done as we update the software. Next Monday we will start the process for NCIt, which will take 2 to 2 ½ days to complete. The plan is to return to editing late Wednesday or on Thursday. Editors need to be prepared for 3 days of downtime. Prompt will happen on Monday for this version. Does anybody have a big project that needs to go in and would push this back? Action item: Laura/Lori to check with those editors not on the call regarding projects.

3. Protégé Refresher Training: Next week we will meet for Protégé refresher training. For now, Gilberto outlined a few items related to data and its cleanup happening during the deployment: clean-up concept status; CUI update; eliminate empty attributes; eliminate kinds. The items that required a bit more explanation were the processing of HT, versus AQ, versus PT; populating a new property legacy_concept_name, which will be done one time only and the editors won’t need to create it or edit it in the future, and the creation of header concepts in retired concepts. We are going to use Retired_Concept_2003, 2004, etc. up to current year.

4. MedDRA Update: Sherri gave the update. We wanted to get MedDRA 12 up as soon as possible. We can’t load directly from the distribution file format, and no loader currently exists. Rob is extracting MedDRA 12 from the Meta export and will post it. Frank wants to keep updated with new releases, which will require a loader to be created. If Mayo has trouble getting to the loader, or trouble with it, we have the option of creating a “fake” RRF MedDRA to load, without going through the whole NCI Meta cycle. CTCAE may want to keep in synch with MedDRA as well. Per Margaret, there will be about 100 hours of consulting now going through contracts that can be used as a resource to keep things in synch.

5. CTCAE Update: Larry gave the update. There is a new release, 4.0.2, which is mainly a clean-up of truncated definitions, typos, etc. There is no need to make an announcement because nothing will change in terms of content. This version is being used to produce the published paper document. Per Sherri: we are going to try to update in BGT as well. A full load is preferred to manual updating. Larry reported there will be a new procedure going forward: we will no longer be using editor spreadsheets. We will be doing it in the OWL file with Protégé. (At some point, we will switch to a version that can be edited with NCI Edit Tab.) Action Item: Begin thinking about the best way for EVS to track these and who is going to do it. Also, we may want to antiquate the 3.0 version.

6. New Browsers Update: Larry gave the update. We are now working on the first release of a replacement for Bioportal. We are in the second phase for the NCIt and NCIm browsers. For Thesaurus, we want to be able to search on relationships and properties. Action Item: If you have any specific thoughts about things we need, or right or wrong ways to do it, let Larry know as soon as possible for this release. Development of replacement for Bioportal can be found at

7. BiomedGT Semantics: Liz gave the update. Our group met with Harold Solbrig. He was updated on our progress and we wanted his opinions on future direction. Regarding modeling needs, it was concluded that BFO does have a role hierarchy we can use. For the thesaurus nodes in BGT, our navigational concepts need to be moved into there. Procedures and technicalities of federating with other ontologies were discussed. There is a question of whether we need the federated ontologies in LexBIG and LexEVS. GO and NCI Taxon will be two to go there. We want to get on the concept modeler mailing list for IHTSDO. Per Margaret: It is OK to monitor lists, but if we are going to collaborate with clinical groups, then our clinical editors need to be informed and involved. This will be discussed in greater detail outside of the meeting.

October 2009 – No meeting held

November 17, 2009 (version 1)

EVS Meeting November 17, 2009 (Version 1)

Attendees: Frank Hartel (phone), Margaret Haber, Larry Wright, Gilberto Fragoso, Laura Roth, Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Rob Wynne, John Park, Tracy Safran, Nicole Thomas, Bob Dionne (phone), Nicholas Sioutos (phone), Theresa Quinn (phone)

Attachments: Protégé Batch Loading doc.

Agenda items:

1. Batch Loading: Overall look at and discussion of batch loading. Margaret discussed concerns about the database being down during working hours due to speed issues. Our goal is to maximize efficiency and minimize downtime. Gilberto explained that Protégé is a GUI tool that supports interactive editing and small batch edits and loads. Large batch loads are incompatible with an interactive environment because they run into technology issues such as indexing DB tables. There have been recent upgrades to Protégé, but it is not working as well as needed. Planned software updates: 1.4.1 will eliminate simultaneous batch and interactive mode support. In testing it has been considerably faster, batching 400 concepts in 2 minutes. (See attachment Protégé Batch Loading doc.) General discussion and questions from the group about recent batching experience. We won’t know until UAT whether batch times will be improved for remote users.

Another issue is Citrix timeouts for remote users. While running a lengthy batch job, you will get timed out after a given time period of inactivity. This behavior can’t be changed.

In the current version of Protégé (1.4.0) the tipping point between a small job and what a workflow manager needs to do is 750 batch edits per file. Action Item: Frank asks Gilberto or Tracy to talk to Doug Hosier (the Citrix people, remote apps) to find out what Citrix is supposed to do. Should things run to completion? And if not, then are we doing something wrong?

- In follow-up with the Citrix group, we were advised that a process should indeed run to completion even after a user is timed out from Citrix. Furthermore, you should be able to connect to that timed-out session and pick up where you left off. If editors find that this is not the case, Gilberto requests that they document it as follows and submit it to him to forward to the Citrix group:

1) while logged in as "username"

2) a timeout happened during a period of inactivity

3) on day and time

4) while you were running "program"

5) but you were unable to connect to that session afterwards

Where “username” is the name that you use to login to remoteapps and “program” is the Protégé version that you are running from the NAL, e.g. protege14-NCIT-GF.
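For the 750-edit tipping point mentioned under this agenda item, a quick check of a batch file before it is run might look like the sketch below. This is only an illustration. It assumes one edit per non-blank line, which may not match the actual batch file format, and it assumes that a file of exactly 750 edits still counts as a small job. The function name is made up for this example.

```python
def route_batch_file(lines, threshold=750):
    """Count the edits in a batch file and say who should run it.

    Assumptions: each non-blank line is one edit, and a file with more
    than `threshold` edits should go to a workflow manager.
    """
    edit_count = sum(1 for line in lines if line.strip())
    handler = "workflow manager" if edit_count > threshold else "editor"
    return edit_count, handler
```

For example, route_batch_file(open(path, encoding="utf-8")) could be called on a file before deciding whether to run it interactively, where path is whatever batch file the editor has prepared.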

2. Troubleshooting vs. exiting properly in Citrix by Gilberto Fragoso: If we exit Protégé by clicking the frame, Protégé continues to run, causing problems in Citrix (neither licenses nor connections are reused). To exit properly: Go up to the File menu and first click on “Close Project”, then again from the File menu click on “Exit Protégé”. Action Item: Editors, please adhere to this procedure for exiting. Please feel free to ask questions or to ask for another demo over Centra.

3. Disease Modeling in Protégé by Larry Wright: The problem is with inherited roles. In Protégé and the Browser you have to look for which things are necessary and sufficient. What can be done so that editors and users can see the defining conditions (the modeling information) that we want them to see for each concept? One solution would be to store that information at the time of classification; otherwise we have to run classification each time to see the information, because it is not preserved. Can relations be tagged as defining in LexBIG? Action Item: Tracy will check. The problem is that the Protégé software doesn’t show the inferred view. Continue to follow up outside the meeting on this matter. Gilberto suggests we try to label defining conditions that come from a superclass.

4. How to make classes defined in OWL Protégé by Gilberto Fragoso. Both superclasses and restrictions need to be added to the block of Necessary&Sufficient conditions. This is especially true for classes that rely on inheritance for their definitions. Action item: Editors need to include the superclass in the Necessary&Sufficient section of the GUI.

5. New Browsers update by Larry Wright. New version of Meta browser coming out in early December. Data for Metathesaurus is relatively complete and performance much improved.  How to build hierarchy for each source is still being worked on.  Help file describes the changes in more detail.  New version of Thesaurus Browser on the development server with better features and performance, and it now covers all of our standalone terminologies (plus NCIm), visible and searchable in the Term Browser interface. Can search on a concept’s terms and codes, other properties, or the concepts related to concepts with terms/codes matching a search string.  There is a Viewed Concept button.

6. Announcements: New hire starting on Dec. 14: Dr. Mike Cantwell, a medical doctor who will take the lead in organizing the clinical needs of the drug/chemical space. CBIIT holiday party Dec. 16.

December 15, 2009 (version 1)

EVS Meeting December 15, 2009 (Version 1)

Attendees: Frank Hartel (phone), Margaret Haber (phone), Larry Wright, Gilberto Fragoso, Sherri De Coronado, Laura Roth (phone), Lori Whiteman, Elizabeth Hahn-Dantona, John Bradsher, Rob Wynne, John Park, Tracy Safran, Nicole Thomas, Bob Dionne, Nicholas Sioutos (phone), Theresa Quinn (phone), Wen-Ling Shaiu (phone), Mike Cantwell (phone), Amy Jacobs (phone), Erin Muhlbradt (phone), Brian Carlsen (phone), Stephanie Lipow (phone), Sharon Quan (phone), David Yee (phone), Joanne Wong (phone).

Attachments: None

Agenda items:

1. Introduction of new team member.

Laura introduced Mike Cantwell, M.D., a new Lockheed Martin employee who will support the EVS contract.

2. Discuss problems or potential features in future releases of Protégé and issues in 1.4.1 release with Bob Dionne. Submitted by Liz.

UAT is ongoing, in 2nd week. Gilberto wants feedback on how it is going.

Liz notes that things are slower. She is not sure whether it is the QA environment or the tool. Searching is slower. Wen-Ling says GUI refresh is much slower. Both “contains” and “exact match” searches are slower. Larry asks about batch loading. Batch editing is a lot faster. Loading is faster, but not as fast as batch editing. We can’t edit in parallel while loading, and we still can’t edit anything during batching, but we can get more done in a shorter period of time. There was discussion that memory may be an issue causing the slowness. Also, what is taking so long to export the file after PROMPT? Laura asked about the speed of editing when a lot of people are editing at the same time. Action Item: Plan a time to get this tested on UAT with multiple people editing at the same time.

3. Information about NCICB Intranet being turned off. Submitted by Rob Wynne.

Rob informed the group to please update your bookmarks with the following link.

Meme summary files are being moved to the wiki; some documents are being moved to Gforge and others to the internal FTP site. Larry suggests we look at where Carol has put things and restructure the site so we have one internal page that can help us find other things.

4. UAT feedback and if possible Gforge categorization into 1.4.1 fix vs. 1.4.2 fix. Submitted by Gilberto.

Gilberto went through the Protégé bugs with the group to identify show-stopper items for 1.4.1.

#24801: show stopper

#24803: per Bob, this has been fixed

#24805: will be fixed in the next version

#24813: has been sent to Tim for investigation

#24837: needs to be investigated further

#24870: needs to be fixed

#24878: show stopper

#24929: show stopper

#24940: may require more investigation

#24942: this is in Prod, so it needs to be checked in UAT. Action Item: Liz will try to reproduce this in UAT.

#24955: needs to be looked at; discuss with Tim on Friday and follow up from there. Applies to workflow managers only.

Discussion about 30 retired concepts that need to find a home. These were not recorded in EVS history. They constitute only about 1% of retired concepts. Action Item: Send the list to Larry; he may be able to find information on some of them. Those that can’t be found will be assigned to 2005.

5. EVS Browsers update. Submitted by Larry.

(New versions: NCIt/Term Browser (report bugs by 12/16!); NCIm (to launch before holidays))

Larry gave an update regarding the Meta Browser. We got a lot of data in and it is much more complete; performance is significantly improved, and it provides extensive, usable results in a reasonable amount of time. It is due to launch any day now but may be delayed due to LexEVS delays and then the holidays and people taking time off.

Next to come: the NCI Term Browser, built from the new NCIt Browser, to support all the standalone terminologies. It can search on name, code, property, and relationship. Sources now have their own home page. Launch is planned for late January. All bugs and features need to be finalized by Dec. 16. Erin asked if the legacy concept name is the class name in Protégé. Yes. Batch loading and editing are dependent on the legacy concept name, and this is useful when Protégé is unavailable. Discussion ensued because the Legacy Concept Name is not currently being added. The issue of continuing to add values for this property for concepts created since the initial load was not resolved. NCIt will be going to code, possibly in 1.4.2, after which concept names will no longer be needed for ongoing batch and other processes.

6. Common Terminology Service 2 (CTS2) LexEVS reference implementation. Submitted by Larry.


Larry gave a quick heads-up to the group regarding extending the current server into the next year. This is important because it ties into subsets, report writer, and mapping.

7. Whether to assert ‘Gene found in Organism Human’ at a high level for many of the branches of Gene. Submitted by John Bradsher.

John explained that the restriction for genes, “gene found in organism”, is used to define wild type alleles. A small minority are not human. Do we want to assert “Gene found in organism Human” at a high level? Then it could be inherited, but other organisms would have to be moved out. Per Gilberto, we decided to do the organisms at the lowest level. Liz reports we use the HUGO name for concept names. Action Item: This matter will be discussed further and determined in a subgroup meeting.


