FRONT PAGE - CORDIS



PROJECT PERIODIC REPORT

Grant Agreement number: FP7-ICT-2007-1-215847

Project acronym: EU-ADR

Project title: Exploring and Understanding Adverse Drug Reactions by Integrative Mining of Clinical Records and Biomedical Knowledge

Funding Scheme: Collaborative Project – Small or medium-scale focused research project

Date of latest version of Annex I against which the assessment will be made: 25/05/2011

Periodic report: 1st □ 2nd □ 3rd ☑ 4th □

Period covered: from 01/02/2010 to 31/07/2011

Name, title and organisation of the scientific representative of the project's coordinator:

Prof.dr. Johan van der Lei, Erasmus Universitair Medisch Centrum Rotterdam

Tel: +31 10 704 3050

Fax: +31 10 704 4722

E-mail: j.vanderlei@erasmusmc.nl

Project website address:

euadr-

Declaration by the scientific representative of the project coordinator

| |

|I, as scientific representative of the coordinator of this project and in line with the obligations as stated in Article II.2.3 of the Grant |

|Agreement declare that: |

| |

|The attached periodic report represents an accurate description of the work carried out in this project for this reporting period; |

| |

|The project (tick as appropriate) [1]: |

|☑ has fully achieved its objectives and technical goals for the period; |

|□ has achieved most of its objectives and technical goals for the period with relatively minor deviations. |

|□ has failed to achieve critical objectives and/or is not at all on schedule. |

| |

|The public website, if applicable |

| |

|☑ is up to date |

|□ is not up to date |

| |

|To my best knowledge, the financial statements which are being submitted as part of this report are in line with the actual work carried out and are |

|consistent with the report on the resources used for the project (section 3.4) and if applicable with the certificate on financial statement. |

| |

|All beneficiaries, in particular non-profit public bodies, secondary and higher education establishments, research organisations and SMEs, have |

|declared to have verified their legal status. Any changes have been reported under section 3.2.3 (Project Management) in accordance with Article |

|II.3.f of the Grant Agreement. |

| |

| |

|Name of scientific representative of the Coordinator: Johan van der Lei |

| |

| |

|Date: 13/12/2011 |

| |

| |

|For most of the projects, the signature of this declaration could be done directly via the IT reporting tool through an adapted IT mechanism. |

1 Publishable summary


During its third period, the EU-ADR project has continued its progress towards the production of the main results expected from the core work packages.

The project successfully passed its second review, carried out with the help of external independent experts, during the period. It has reinforced its visibility and its contacts with key stakeholders (including regulatory authorities and other projects), and has developed plans for long-term sustainability. EU-ADR has become a seminal project, spinning out several other projects (mainly funded via FP7-Health) that apply EU-ADR technologies and methods to the study of specific drug and vaccine safety problems.

On the scientific and technical front, the most important advances in the project have been the near completion of work in the core WPs and a progressive shift towards a ‘system view’ of the development efforts, in which the different pieces (signal detection, substantiation, evidence combination, validation, exploitation) have been integrated and the focus has moved to the challenges that such integration presents. This process has been helped by a ‘cycle-based’ configuration of the workplan which, superimposed on the original workplan, has helped steer efforts through a common ‘pipeline’ framework designed to reproduce the sequential steps of the system, revealing input-output relationships and dependencies.

Notably, the project has also made great progress in devising a strategy for long term sustainability.

Particular outputs that deserve to be highlighted include:

• Reappraisal and rework of the set of true-positive and true-negative signals, now named ‘reference sets’, so as to maximise their usefulness, especially for the assessment of signal detection method performance and for the validation activities.

• Addition of three new events to the priority list of ten: Hip Fracture, Pancreatitis and Progressive Multifocal Leukoencephalopathy. Event definition for these events, and completion of terminology mapping, data extraction, and query harmonisation tasks related to all events.

• Development and testing of algorithms and methods for signal detection. Studies on how the parametrisation of methods affects their performance.

• Development and completion of signal substantiation methods, implemented as web services.

• Development of evidence combination methods based on Dempster-Shafer theory. Implementation of tools allowing visualisation, exploration and user interaction with such methods.

• Development of the integrated EU-ADR Web Platform, which allows drug-event dataset management and sharing, as well as visualisation, comparison and exploration of the results of signal detection methods, of signal substantiation, and of evidence combination, including custom user weightings given to the different methods based on belief and plausibility.

• Progress in system validation using both retrospective and prospective approaches, comparing results with those of spontaneous reporting systems. Work on event validation activities, aimed at verifying the reliability of the events extracted from the databases.
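To illustrate the evidence combination idea listed above, Dempster's rule and the resulting belief and plausibility values can be sketched as follows. This is a minimal sketch and not the EU-ADR implementation: the two evidence sources, their mass values and the two-hypothesis frame are hypothetical and chosen only for illustration.

```python
from itertools import product

ADR, NO_ADR = "adr", "no_adr"
THETA = frozenset({ADR, NO_ADR})  # frame of discernment (assumed: two hypotheses)

def combine(m1, m2):
    """Dempster's rule: multiply masses, keep non-empty intersections, renormalise."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

def belief(m, hypothesis):
    """Total mass of all subsets contained in the hypothesis."""
    return sum(w for s, w in m.items() if s <= hypothesis)

def plausibility(m, hypothesis):
    """Total mass of all subsets that do not contradict the hypothesis."""
    return sum(w for s, w in m.items() if s & hypothesis)

# Hypothetical masses from two evidence sources for one drug-event pair
detection = {frozenset({ADR}): 0.6, THETA: 0.4}
substantiation = {frozenset({ADR}): 0.5, frozenset({NO_ADR}): 0.2, THETA: 0.3}

fused = combine(detection, substantiation)
print(belief(fused, frozenset({ADR})), plausibility(fused, frozenset({ADR})))
```

Belief and plausibility bracket the support for a hypothesis from below and above; the gap between them is the mass left on the whole frame, that is, evidence that does not discriminate.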

Overall, the project progress has been substantial, and the Consortium believes that it has allowed EU-ADR to become a breakthrough reference project, not only in the application of IT to pharmacoepidemiology, but also more generally in demonstrating the feasibility of a federated model for re-using massive amounts of existing EHR data for research purposes; a model that, while respecting local and national ethico-legal limitations, makes it possible to unleash the power of available medical information on millions of patients.

2 Core of the report for the period: Project objectives, work progress and achievements, project management

2.1 Project objectives for the period

The overall objective of the EU-ADR project is the design, development and validation of a computerized system that exploits data from electronic healthcare records and biomedical databases for the early detection of adverse drug reactions. The EU-ADR system intends to generate signals using data mining, epidemiological, computational and text mining techniques, and subsequently substantiate these signals in the light of current knowledge and understanding of biological mechanisms. The system should be able to detect signals better and faster than spontaneous reporting systems and should allow for identification of subpopulations at higher risk for ADRs. For the system to operate as an adjunct to safety reviewers during signal evaluation and follow-up, it should also enable easy access to the underlying data sources, allowing reviewers to focus quickly on the information that is pertinent to a suspected ADR.

In this project, electronic healthcare records (EHRs) comprising demographics, drug use and clinical data of over 30 million patients from several European countries are available. These EHR databases form the foundation of the project, insofar as they supply the patient data on top of which the system is built.

A number of designs and techniques are used to process these electronic medical records. One of the objectives of this project is to study and compare a number of different techniques that, in essence, all aim to detect unexpected or disproportional rates of events.

Once generated, the signals are substantiated to place them in the context of the current biomedical knowledge. Essentially, this means searching for evidence that supports causal inference of the signal. The list of signals will be assessed by automatically investigating feasible paths that connect the drug and the adverse reaction involved in the signal. The general strategy is the automatic linkage of biomedical entities (drugs, proteins and their genetic variants, biological pathways, and clinical events) by means of data mining approaches and in silico predictions based on biomolecular structures.

The signal detection and substantiation algorithms are integrated in a computerized ADR detection and monitoring system. This involves the development of an evidence weighting scheme to combine the various pieces of information and present the user with a final list of ranked signals.

The system is expected to be tested retrospectively using test sets that are based on recent literature, including both known side effects and spurious signals. The system’s ability to rediscover drug-event combinations from the test set with known side effects provides an indication of the sensitivity of the system. The system’s ability not to signal drug-event combinations from the test set with spurious signals provides an indication of the specificity of the system.
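The two measures described above can be written down in a few lines. The following is a minimal sketch, assuming the test sets are available as sets of drug-event pairs; the pair names are placeholders and not items from the project's reference sets.

```python
def sensitivity_specificity(signalled, known_positive, known_negative):
    """Sensitivity: share of known side effects that the system signals.
    Specificity: share of spurious combinations that the system does not signal."""
    true_positives = len(signalled & known_positive)
    true_negatives = len(known_negative - signalled)
    return true_positives / len(known_positive), true_negatives / len(known_negative)

# Placeholder drug-event pairs (hypothetical)
known_positive = {("drug_a", "event_x"), ("drug_b", "event_x"),
                  ("drug_c", "event_y"), ("drug_d", "event_y")}
known_negative = {("drug_e", "event_x"), ("drug_f", "event_x"), ("drug_g", "event_y"),
                  ("drug_h", "event_y"), ("drug_i", "event_z")}
signalled = {("drug_a", "event_x"), ("drug_b", "event_x"),
             ("drug_c", "event_y"), ("drug_e", "event_x")}

sens, spec = sensitivity_specificity(signalled, known_positive, known_negative)
print(sens, spec)
```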

After the system has been validated retrospectively, a prospective evaluation is also among the original objectives, centred on further investigating the top-ranking signals generated by the system.

The ultimate objective of the project is to demonstrate that an earlier detection of adverse side effects of drugs is possible using EHRs.

There is no definition of project objectives per period in the EU-ADR Description of Work. For our second report, submitted last year, we distilled from the work plan what the project was expected to deliver during its second period. Activities in this third project period were obviously expected to extend and deepen the tasks of the previous periods, and result in the bulk of outputs expected from the project.

As regards WP1, the third period of the project was expected to continue the maintenance of appropriate communication and work dynamics among partners and with external stakeholders. It was also expected to entail continuing ethical surveillance and, following the experience of the previous period, to repeat the project assessment exercise and compare the results obtained.

WP2 was officially finished in previous periods; however, the previous period already showed that a re-assessment of the true negative signals in the test set was needed. In this period, it was also planned to re-assess the test sets as needed, in case the results obtained from other WPs showed any important shortcoming in those sets. WP2 was also expected to cover, as needed, terminology mapping exercises derived from new events of interest beyond the 10 selected by the project, time and resources permitting.

The WP2/WP3 task force created in the previous period was also expected to continue its work in the third period, harmonising queries among databases by monitoring and optimising the actual codes used in each case to detect a specific event.

For WP3, once the use of Jerboa as the platform to extract, aggregate and export data was well established, work on signal detection was expected to advance decisively, so that a ‘final’ set of methodologies and parameters could be obtained in this third period.

Similarly, WP4 was expected in this period to complete the development of the different web services used to filter and substantiate signals output by WP3.

Additionally, this period was central for WP5, covering the development of both the evidence combination tools and the EU-ADR platform itself, with re-evaluation of design decisions as needed to enable full development of the system.

Although not expected to be completed in the period, activities in WP6 (System Validation) were expected to quickly unfold and show results as the full system ‘pipeline’ became progressively available.

As regards WP7, aside from deepening the communication activities, with specific emphasis on liaison with relevant stakeholders and the preparation of scientific publications, the focus was on developing the exploitation strategies and defining post-project use scenarios for the long-term sustainability of the EU-ADR system.

Finally, management activities in WP8 were expected to consistently support all other activities in the project, ensuring efficiency and compliance with contractual requirements.

2.2 Work progress and achievements during the period

WP1: Scientific Coordination

WP leader: Johan van der Lei – EMC

Work for WP1 has continued as expected during the third project period, providing overall scientific leadership to the initiative, steering the project and balancing scientific excellence with pragmatic approaches. As in previous periods, cross-talk between WPs has been a specific concern, all the more so as the results of the central technical WPs (3, 4, 5) progressively evolved towards their ‘almost-final’ shape. Accordingly, system integration has been a continuing focus in the period.

A noteworthy activity in the period, developed in collaboration with WP8, was the devising of development ‘cycles’ to promote alignment of activities across WPs. This was necessary because, as WPs 3, 4 and 5 progressed, it was noted that there could be mismatches in the input-output relationships between them, derived from the different stages of development of each WP, a situation that could hardly be avoided given the significant overlap in the time schedules of these WPs as set out in the original plan. Therefore, three development cycles were devised (“silver”, “gold” and “bonus”), so that all WPs could follow the intended system ‘sequence’ from terminology mapping to query harmonisation, signal detection, substantiation, evidence combination, system integration and validation. This helped clarify the dependencies among WPs, highlighted the iterative nature of the project developments, and served to test the system pipeline in order to detect bottlenecks or specific risks that could feed the exploitation plan. As an example, the time schedule of the cycles, which was updated during the period as needed, was originally devised as depicted below:

[pic]

The different cycles were defined according to the events that were included, the data coverage used in the process, and the Jerboa version used to extract and pool the data. Overall, the definition of development cycles has proven invaluable in focusing the efforts of partners and achieving the expected results.

It should be noted that the last cycle (“bonus”) was explicitly devised as an optional activity, as it resulted from interactions with relevant stakeholders. Indeed, the attendance of representatives of the Dutch national regulatory authorities at one EU-ADR meeting resulted in a ‘request’ to explore what the project could contribute to two events of recent interest regarding drug safety (Progressive Multifocal Leukoencephalopathy and Acute Pancreatitis). The project agreed to look into these events after the bulk of the work with the ‘committed’ ten main events was complete. As the period elapsed, and given some of the difficulties encountered with these events, the Consortium agreed to also look at Hip Fracture as an additional event of interest.

Aside from strengthening the relationships with regulators, the project has also continued to ‘spin out’ other projects in the FP7 Health and other programmes, in which subsets of partners participate. This is the case of VAESCO and SAFEGUARD, which, together with the ongoing SOS and ARITMO projects, make use of EU-ADR technology to implement specific safety studies. At the end of the period, the project has also established solid bonds with the very relevant EHR4CR IMI project; this has resulted in a joint application to the recent IMI call. Relationships are also ongoing with the EMA-coordinated ENCePP initiative. Importantly, collaboration with US initiatives (FDA Sentinel, OMOP) has been reinforced during the reporting period (including mutual visits), and the lead developer of Jerboa won the international “OMOP Cup” competition thanks to one of the algorithms for signal detection developed in EU-ADR.

In the previous period, an interim assessment of the progress of the project (Deliverable D1.3) was elaborated. For this, two complementary methods were used. Firstly, an ‘objective’ assessment was carried out following standard project management practices for progress evaluation (earned value), considering both cost and schedule performance. Secondly, a survey was carried out among partners to provide a ‘subjective’ assessment counterpart, by scoring the degree of achievement of each original objective on a 0-10 scale. Both exercises yielded a progress estimation of around 64% of the project completed at the end of month 24.

The exercise was repeated at the end of the period currently being reported, in order to compare the results and evaluate the interim progress achieved. Using the same earning rule (50/50), the ‘objective’ estimation at the end of July 2011 yields a “progress” (Budgeted Cost of Work Performed, BCWP, or Earned Value) of 89% of the project’s budget. For comparison, taking into account that the project was granted a 6-month extension during the period (see the Management section of this report for details), a linear time reference at month 42 of a total duration of 48 months means that by July 2011 87.5% of the project had elapsed. However, the bulk of activities were expected to be completed by that date, and the estimated BCWS (Budgeted Cost of Work Scheduled) is actually 96.7%. Therefore, it could be concluded that the project was behind schedule (Schedule Performance Index, SPI, of 0.92, where unity means perfectly on track). However, it is noteworthy that both WP1 and WP8 (and, to a lesser extent, WP7) are continuing WPs throughout the project, and in these cases the 50/50 earning rule artificially reduces the earned value estimation at the late stages of the project. Taking into account only WPs 2 to 7, the BCWP is 94%, while the BCWS is 98%, for an improved SPI of 0.95. Looking at the costs incurred to achieve this progress, the calculations (pending approval of the period costs by the EC) show an Actual Cost of Work Performed (ACWP) of 101% of the budget, which clearly indicates that the project is over budget (Cost Performance Index, CPI, of 0.88). Taking into account WPs 2 to 7 only, the CPI improves to 0.92, which indicates only a slight overspending compared to the original budget. The ‘objective’ analysis, therefore, shows the project slightly behind schedule and over budget, but reasonably on track.

When looking at the ‘subjective’ assessment, partners (n=18) score the degree of achievement of the objectives with a mean of 8.5 (which can be interpreted as 85% of the project completed):
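The indices quoted above follow the standard earned-value definitions (SPI = BCWP/BCWS, CPI = BCWP/ACWP). The sketch below reproduces the overall figures from the percentages given in the text; the task list used to illustrate the 50/50 earning rule is hypothetical.

```python
def earned_value_50_50(tasks):
    """50/50 rule: a task earns half of its budget when started, the rest when completed.
    tasks: iterable of (budget, status), status in {'not_started', 'started', 'done'}."""
    credit = {"not_started": 0.0, "started": 0.5, "done": 1.0}
    return sum(budget * credit[status] for budget, status in tasks)

def performance_indices(bcwp, bcws, acwp):
    """Return (SPI, CPI); values below 1 mean behind schedule / over budget."""
    return bcwp / bcws, bcwp / acwp

# Whole-project figures from the text, in percent of total budget
spi, cpi = performance_indices(bcwp=89.0, bcws=96.7, acwp=101.0)
print(round(spi, 2), round(cpi, 2))

# Hypothetical tasks: a continuing task stays at 50% credit until it is closed
print(earned_value_50_50([(10, "done"), (20, "started"), (5, "not_started")]))
```

The second call shows why continuing WPs depress the estimate: a task that runs until the end of the project is credited with only half of its budget until the day it closes.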

|WP1 |WP2 |WP3 |WP4 |WP5 |WP6 |

|ALI |N03AF01 |Carbamazepine |CARDFIB |No drug with sufficient exposure that satisfies PubMed criterion for True Positive |

|ALI |N03AG01 |Valproic acid |CARDFIB | |

|ALI |M01AX17 |Nimesulide |CARDFIB | |

|ALI |J01CR02 |Amoxicillin and clavulanic acid |CARDFIB | |

|ALI |A07EC01 |Sulfasalazine |CARDFIB | |

|AMI |M01AH02 |Rofecoxib |NEUTROP |

|BE |N03AF01 |Carbamazepine |UGIB |N02BA01 |Acetylsalicylic acid |

|ALI |R03AC13 |Formoterol |CARDFIB |

|Frequentist |Proportional Reporting Ratio (PRR) |Incidence Rate Ratio (IRR) |Matched case-control (CC) |

| |Reporting Odds Ratio (ROR) | |Self-Controlled Case Series (SCCS) |

|Bayesian |Gamma Poisson Shrinker (GPS) |Longitudinal GPS (LGPS) | |

| |Bayesian Confidence Propagation Neural Network (BCPNN) |Bayesian Hierarchical Model (BHM)* | |

|Elimination of protopathic bias |LEOPARD |

* Note: although the BHM has so far only been applied to the IRR, it can be applied to other types of estimates as well.

Method performance was assessed on a total of 146,830,906 patient-years of follow-up from 20,042,652 subjects, spanning 10-15 years (see Figure 2).

Figure 3 shows the area under the ROC curve for the different methods that have been employed for signal detection in EU-ADR. This figure shows that all methods perform better than the random baseline, that the LEOPARD filtering for protopathic bias always improves performance (but less so for methods that are already performing well), and that performance differs little between methods. In general, LGPS and case-control adjusting for drug count seem to slightly outperform the other methods, although the difference is not statistically significant. In general the performance of the methods is high, with the best performing method achieving an area under the ROC curve of 0.83, and a sensitivity and specificity of 0.80 and 0.70, respectively. This is not surprising, as the reference set was limited to drugs with a large amount of exposure in the different databases.
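For readers unfamiliar with the frequentist measures in the table, the textbook definitions of the PRR, ROR and IRR can be sketched as follows. This is only an illustration of the standard formulas, not the Jerboa implementation, and all counts are hypothetical.

```python
def prr(a, b, c, d):
    """Proportional Reporting Ratio from a 2x2 table:
    a = drug & event, b = drug & other events,
    c = other drugs & event, d = other drugs & other events."""
    return (a / (a + b)) / (c / (c + d))

def ror(a, b, c, d):
    """Reporting Odds Ratio from the same 2x2 table."""
    return (a * d) / (b * c)

def irr(events_exposed, time_exposed, events_unexposed, time_unexposed):
    """Incidence Rate Ratio: event rate during exposure over rate outside exposure."""
    return (events_exposed / time_exposed) / (events_unexposed / time_unexposed)

# Hypothetical counts for one drug-event pair
print(prr(20, 80, 100, 800))      # proportion 0.20 versus 0.111...
print(ror(20, 80, 100, 800))      # odds 0.25 versus 0.125
print(irr(20, 1000.0, 50, 10000.0))  # rates per patient-year
```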

[pic]

Figure 2. Distribution of patient data per database over time.

[pic]

Figure 3. Area under the ROC curve for all methods, with and without LEOPARD filtering. Combination across databases was performed by pooling data. Error bars indicate 95% confidence interval.

2) Effect of the change in parameter settings across the signal detection algorithms

For all methods, the following specifications were used to define exposures and outcomes:

• Incident events. Only the first occurrence of an event was considered. Patient time after an event was completely ignored. The main reason for this is that, in EHR data, it is often difficult to distinguish a recurrence of an event from a reference to the event that occurred earlier.

• Run-in period of 365 days. In order to determine that an event is incident, some patient time has to be available before the event occurred. Hence, during the first year of observation subjects were not considered for events or exposure counts, but events during this so-called run-in period were used to determine whether later events were truly incident events. This run-in period was not used for children younger than one year at the start of observation.

• Exposure window definition. Exposure to a drug was defined as the duration of the prescription, excluding the first day of the prescription. If two prescriptions of the same drug overlapped in time, the exposure was assumed to start the day after the first day of the first prescription, and end on the last day of the last prescription.

• Age stratification. Whenever appropriate, age was stratified in 5-year age ranges.

• Independence of drug risks. Currently, every drug-event pair is evaluated separately. Co-medication is not taken into account.
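The exposure window and incident event rules above can be sketched as follows. This is a minimal sketch under stated assumptions, not the Jerboa code: prescriptions are taken as inclusive (start, end) date pairs for one drug in one patient, "overlap" is taken to mean that a prescription starts on or before the end of the previous one, and "younger than one year" is approximated as fewer than 365 days of age. The function names are illustrative.

```python
from datetime import date, timedelta

def exposure_windows(prescriptions):
    """Merge overlapping prescriptions of one drug and drop the first day of each episode."""
    merged = []
    for start, end in sorted(prescriptions):
        if merged and start <= merged[-1][1]:
            # Overlap: the episode runs to the last day of the last prescription
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    windows = [(s + timedelta(days=1), e) for s, e in merged]
    return [(s, e) for s, e in windows if s <= e]  # a one-day prescription leaves no exposure

def first_incident_event(observation_start, event_dates, birth_date, run_in_days=365):
    """Return the date of the incident event, or None if there is none.
    An event inside the run-in period marks the subject as a prevalent case."""
    if not event_dates:
        return None
    age_days = (observation_start - birth_date).days
    run_in = 0 if age_days < 365 else run_in_days  # no run-in for infants
    first = min(event_dates)  # only the first occurrence is considered
    if first < observation_start + timedelta(days=run_in):
        return None
    return first

rx = [(date(2010, 1, 1), date(2010, 1, 30)),
      (date(2010, 1, 20), date(2010, 2, 18)),
      (date(2010, 6, 1), date(2010, 6, 30))]
print(exposure_windows(rx))
```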

LEOPARD was considered to be potentially complementary to all methods, and was therefore applied as a filter to the output of each method. LEOPARD can be applied at the level of the individual drug, but it can also be applied to a group of drugs. By grouping drugs that share the same four higher-level ATC digits (i.e. drugs with the same indication), LEOPARD has proven more able to detect protopathic bias. When calculating the AUC, signals flagged by LEOPARD at either the individual or the group level were ranked lower in the list of signals than signals that were not flagged.
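The effect of this re-ranking on the AUC can be illustrated with a small sketch. It assumes that each method yields one score per drug-event pair, that the reference set supplies a true/false label, and that a flag says whether LEOPARD marked the pair; the scores, labels and flags below are hypothetical, and the demotion by a fixed offset is just one simple way of placing flagged signals below unflagged ones.

```python
def auc(scores, labels):
    """Area under the ROC curve as the probability that a true signal outranks a false one."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def demote_flagged(scores, flagged):
    """Rank every flagged signal below every unflagged one, keeping order within each group."""
    offset = max(scores) - min(scores) + 1.0
    return [s - offset if f else s for s, f in zip(scores, flagged)]

# Hypothetical example: the second pair is a false signal flagged as protopathic bias
scores = [3.0, 2.5, 2.0, 1.0]
labels = [True, False, True, False]
flagged = [False, True, False, False]

print(auc(scores, labels))
print(auc(demote_flagged(scores, flagged), labels))
```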

Currently we are testing the method performance along a wide range of parameter settings.

3) Methods for prime suspect selection

One of the use cases of the EU-ADR system is the ranking of potential ADRs that require attention first. During the past months we have developed different strategies for the selection of prime suspects. The methods have been presented at the EU-ADR consortium meetings for discussion with the wider audience, and are still under development. A prime suspect has been defined as a drug-event combination that is not yet known, cannot be immediately explained by bias or confounding, and affects many people. Methods for the selection of prime suspects combine the number of excess cases with the requirements that the drug has enough exposure, that the signal passes the LEOPARD test, and that it does not seem to be confounded.

WP4: Signal Substantiation

WP leader: Ferran Sanz / Laura Furlong – UPF

During the reporting period, the work focussed on finishing the development of the software tools that accomplish the tasks required for the substantiation of signals, and on the subsequent evaluation of such tools. The software tools include both methods implemented as web services and workflows that combine the web services.

The annotation of the EU-ADR corpus was finalized, and work was performed towards the use of the corpus for the development of a relation extraction system. Moreover, considerable effort was put into writing manuscripts describing the work performed in the WP.

During the reporting period several teleconferences and web-seminars were organized to monitor the progress in each of the tasks. In addition, several face-to-face meetings were organized to address issues related to web services and workflow design and implementation. The progress in the individual activities is described hereunder.

Database and literature mining

During this reporting period the annotation of the EU-ADR corpus was finalized. The details on the development of the EU-ADR corpus were provided in Deliverable 4.4: Report on literature and DB mining, and in a manuscript submitted for publication to the Journal of Biomedical Informatics.

The EU-ADR corpus consists of 300 Medline abstracts containing semantic annotations on biomedical entities and their relationships. The annotations were performed by domain experts who were capable of deciding if a text describes a relationship, and thus the corpus represents a “gold standard” dataset. To aid the work of annotators, a web-based annotation tool was developed to provide automatic annotation of the entities and propose relationships to be annotated.

The entities annotated were target (gene/protein, sequence variants), disorder (disease phenotypes of the adverse drug reactions), and drug (biologically active chemicals, marketed drugs and drug metabolites). The relationships considered were the following: target-disorder, target-drug and drug-disorder. Moreover, the level of certainty of each relationship was also specified by providing the relationship types:

• Positive association (PA): the sentence clearly states that there is an association between the entities.

• Negative association (NA): the sentence clearly states that there is no association between the entities.

• Speculative association (SA): the sentence describes a putative relationship between the target and the disease. This might be confirmed or refuted later in the abstract, but in the sentence under study the relationship is presented as a speculation.

Once the expert annotation was finalized, work was conducted to harmonize annotations and evaluate the agreement between the annotators. The final EU-ADR corpus consists of the consensus annotations performed by the experts both at the level of entities and at the level of relationships. Based on all annotations of entities and relationships, we analysed the number of entities and relations for which a majority exists and which were consequently included in the final EU-ADR corpus (Table 4).

Table 4. Number of annotated entities and relationships and their agreement in the EU-ADR corpus. For the relationships agreement the second percentage shows the agreement given agreement on the entities.

|Relationship type |Entities |Agreement Entities |Relationships |Agreement Relations |

|Drug-disorder |1849 |1464 (79.2%) |655 |294 (44.8%, 71.6%) |

|Target-drug |2214 |1701 (76.8%) |802 |324 (40.4%, 68.5%) |

In order to test the agreement of each annotator with the EU-ADR corpus, we computed the agreement statistics for both the entities (Table 5) and the relations (Table 6). The agreement figures show a good correspondence between the different annotations. From the results we can see that, apart from annotator A4, all annotators show a good agreement with the EU-ADR corpus.

Table 5. Agreement between the annotators (A1-A5) and the automatic tool against the EU-ADR corpus for the annotated entities.

|Relationship type |A1 |A2 |A3 |A4 |A5 |Computer |

|Drug-disorder |0.83 | | |0.77 |0.87 |0.73 |

|Target-drug |0.82 |0.83 |0.87 | | |0.67 |

Table 6. Agreement between the annotators (A1-A5) and the automatic tool against the EU-ADR corpus for the annotated relationships.

|Relationship type |A1 |A2 |A3 |A4 |A5 |Computer |

|Drug-disorder |0.75 | | |0.51 |0.83 |0.69 |

|Target-drug |0.77 |0.79 |0.50 | | |0.79 |

In addition to comparing the annotations against the annotated corpus we also computed the inter-annotator agreement for each relationship (Table 7).

Table 7. Inter-annotator agreement statistics per relationship type.

|Drug-Disorder |A1 |A4 |A5 |Computer |

|A1 |1.00 |0.78 |0.72 |0.59 |

|A4 |0.72 |1.00 |0.70 |0.56 |

|A5 |0.78 |0.70 |1.00 |0.64 |

|Computer |0.59 |0.64 |0.64 |1.00 |

|Target-Disorder |A1 |A2 |A3 |Computer |

|A1 |1.00 |0.73 |0.74 |0.46 |

|A2 |0.73 |1.00 |0.75 |0.49 |

|A3 |0.74 |0.75 |1.00 |0.58 |

|Computer |0.46 |0.49 |0.58 |1.00 |

|Target-Drug |A1 |A2 |A3 |Computer |

|A1 |1.00 |0.78 |0.75 |0.49 |

|A2 |0.78 |1.00 |0.74 |0.52 |

|A3 |0.75 |0.74 |1.00 |0.58 |

|Computer |0.49 |0.52 |0.58 |1.00 |

The agreement statistics are comparable with what has been shown in other annotation efforts[2] [3]. The agreement on the entity annotation is a little higher than on the relationships. One reason for this is that it may be difficult for annotators to distinguish between a relationship being described in the text and the relationship actually being true. Even though a named entity recognition (NER) system has been used to suggest annotations to the annotators, we can see that the agreement between this system and the annotators is lower than the inter-annotator agreement. This means that the annotators modified the suggested annotations and were consistent in the changes they made. Nevertheless, in our experience the use of a NER system is highly recommended to facilitate the annotation, since it made it possible for the annotators to focus on the annotation of relationships.

The EU-ADR corpus can be downloaded from: . The annotation tool is available online at: .

Work in the period has also been directed to implementing a system for the detection of relationships between biomedical entities in text. The system is based on the JSRE implementation from Giuliano and colleagues[4], a Java implementation of a supervised machine learning approach developed for the identification of interactions between proteins, which achieves state-of-the-art performance. It is a kernel-based approach that uses only shallow linguistic information, such as tokenization, sentence splitting, Part of Speech tagging and lemmatization. The original JSRE system uses a linear combination of two kernel functions to represent the global context where entities appear and their local contexts. The global context kernel (GC kernel) considers the whole sentence to discover a relationship between two entities. The local context kernel (LC kernel) uses windows of predefined size around the entities to identify the roles of the entities within a relationship. The kernel-based machine learning algorithm used by JSRE is the Support Vector Machine. We have added a third kernel to the JSRE system to incorporate deep syntactic information (D kernel) obtained from dependency parse trees, in order to develop a method to detect the following relationships: target-disorder, target-drug and drug-disorder. The EU-ADR corpus was used for training and evaluation of the system. More details will be provided in a manuscript in preparation.
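The idea of a linear combination of kernels can be illustrated with a much simplified sketch. It is not the JSRE code: the two kernels below are plain bag-of-words dot products, the example sentences and the field names (tokens, e1, e2) are hypothetical, and a dependency-based kernel would simply be appended to the list as a third term. The resulting similarity matrix could then be supplied to any SVM implementation that accepts a precomputed kernel.

```python
from collections import Counter
from math import sqrt

def dot(a, b):
    """Dot product of two sparse feature vectors stored as Counters."""
    return sum(a[k] * b[k] for k in a if k in b)

def global_context(x, y):
    """Simplified GC kernel: bag of words over the whole sentence."""
    return dot(Counter(t.lower() for t in x["tokens"]),
               Counter(t.lower() for t in y["tokens"]))

def local_features(ex, window=2):
    """Tokens within a fixed window around each entity, tagged with role and offset."""
    feats = Counter()
    for role, pos in (("e1", ex["e1"]), ("e2", ex["e2"])):
        for i in range(max(0, pos - window), min(len(ex["tokens"]), pos + window + 1)):
            feats[(role, i - pos, ex["tokens"][i].lower())] += 1
    return feats

def local_context(x, y):
    """Simplified LC kernel: overlap of the windows around the two entities."""
    return dot(local_features(x), local_features(y))

def normalized(kernel):
    """Scale a kernel so that k(x, x) == 1 for every example."""
    def k(x, y):
        denom = sqrt(kernel(x, x) * kernel(y, y))
        return kernel(x, y) / denom if denom else 0.0
    return k

def combined(x, y, kernels=(global_context, local_context), weights=(1.0, 1.0)):
    """Linear combination of normalized kernels."""
    return sum(w * normalized(k)(x, y) for k, w in zip(kernels, weights))

# Hypothetical candidate relations: entity positions are token indices
x = {"tokens": "GeneA mutations cause DiseaseB".split(), "e1": 0, "e2": 3}
y = {"tokens": "DrugC inhibits GeneA".split(), "e1": 0, "e2": 2}
print(combined(x, y))
```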

In a first step, we evaluated the contribution of dependency parse information to the detection of the relationships of interest. The results for the target-disorder relation are summarized in Table 8.

Table 8. Results for the target-disorder relation.

|System |Precision |Recall |F1 score |

|Original JSRE (LC kernel + GC kernel) |0.72 |0.61 |0.66 |

|Dependency kernel |0.76 |0.60 |0.65 |

|Original JSRE + Dependency kernel |0.77 |0.71 |0.73 |

|Original JSRE + Dependency kernel + keywords |0.77 |0.70 |0.73 |

These results indicate that the best system incorporates the three kernels without the use of association keywords: adding keywords did not improve the F1 score (0.73 in both cases) and slightly lowered recall (0.70 versus 0.71). The association keywords are words or expressions that denote an association between the two entities; for instance, the word "bind" is a keyword for associations between proteins. Similar results were obtained for the other two relationships (not shown).
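For reference, the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it for the first and last rows of Table 8. Values recomputed from the rounded precision and recall need not match every reported F1 exactly, for example if the reported scores were averaged over several runs (an assumption; the report does not say how they were aggregated).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall taken from Table 8.
original_jsre = f1_score(0.72, 0.61)
with_keywords = f1_score(0.77, 0.70)
print(round(original_jsre, 2), round(with_keywords, 2))
```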

Once the optimal configuration of the system was determined, it was used to train models for the detection of the above-mentioned relationships. The purpose of these models is to distinguish a genuine association between the entities from a mere co-occurrence that does not denote an association. For this purpose, we used the EU-ADR corpus and 10-fold cross-validation. The results are shown in the table below and indicate good performance of the system for the detection of the relationships.

| |Precision |Recall |F1 score |

|Drug-Disorder |0.66 |0.75 |0.69 |

|Drug-Target |0.78 |0.70 |0.73 |

|Target-Disorder |0.72 |0.67 |0.69 |
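The 10-fold cross-validation scheme mentioned above can be sketched as follows. This is a plain round-robin split written for illustration; the report does not state how the folds were actually built (for example, whether they were shuffled or stratified), so those details are assumptions.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; each item is used for testing exactly once."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical corpus of 25 annotated sentences.
splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

In each round a model is trained on the training indices and scored on the held-out fold, and the per-fold scores are then aggregated.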

Drug-target-pathway-adverse event mapping

In the reporting period, work focused on finalizing and evaluating the developed tools (web services and workflows) and on writing manuscripts to report the results. Moreover, we finalized an analysis of the DisGeNET database[5] aimed at exploring the modular nature of human disease and of adverse drug reactions, which was published in PLoS One[6]. The methodologies developed in this task were already reported in Deliverable 4.5: Report on Drug-Target-Pathway-Adverse event mapping, and are the subject of the publication entitled “Automatic filtering and substantiation of drug safety signals”, currently under evaluation in the journal PLoS Computational Biology. Here we provide an overview of the methodology for signal filtering and substantiation and present an example application. Details on the implementation of the web services and workflows can be found in Deliverable 4.5 and in the above-mentioned manuscript.

The framework developed for the filtering and substantiation of drug safety signals places the signal in the context of current knowledge of biological mechanisms that might explain it. Essentially, we search for evidence that supports causal inference of the signal, i.e. feasible paths that connect the drug with the clinical event of the adverse reaction. The signal filtering analysis looks for evidence reporting the drug-event association in the biomedical literature and biomedical databases (Figure 4).

[pic]

Figure 4. Schematic representation of the signal filtering process. Two workflows are available for the signal filtering process: the ADR-FM workflow uses a MeSH®-based approach to find drug-event pairs in Medline® citations, while the ADR-FD workflow uses text-mining to find the drug-event pairs in Medline® abstracts, databases such as DrugBank and drug labels available at DailyMed®.

The signal substantiation process can be framed as a closed knowledge discovery process, analogous to the Swanson model based on hidden literature relationships[7]. We extend this framework by considering not only relationships found in the literature, but also those discovered by mining other data sources or found by applying different bioinformatics methods. For a drug-event association, we collect information about the targets of the drug by querying publicly available databases and by applying drug-target profiling methods[8]. In parallel, we retrieve information about the genes and proteins associated with the clinical event from a database covering knowledge about the genetic basis of diseases. Then, we combine these two pieces of information assuming that if the disease phenotype elicited by the drug is similar to the phenotype observed in a genetic disease, then the drug acts on the same molecular processes that are altered in the disease. Currently we consider two scenarios able to provide a causal inference of the signal (see Figure 5).

[pic]

Figure 5. Schematic representation of the signal substantiation process. A. Signal substantiation through proteins. B. Signal substantiation through pathways.

First, we look for connections between the drug and the event through their associated protein profiles. Here, a connection is established if the drug-target profile and the event-protein profile have proteins in common (Figure 5A). Many ADRs are caused by altered drug metabolism, for which genetic variants in metabolizing enzymes are often responsible. Consequently, we also consider drug metabolism as an underlying mechanism of the observed ADR by assessing whether the drug metabolites target proteins that are known to be associated with the clinical event. The profile of targets of the drug and its metabolites is obtained by in silico profiling methods (Drug-Target-Profile). The profile of proteins associated with the clinical event is obtained by mining DisGeNET (Event-Protein Profile). The two profiles are compared to find the proteins they have in common (Drug-Event Linking Proteins). The evidence that supports the association of the drug and of the event with the Drug-Event Linking Proteins is then explored to determine whether it supports the causal inference of the signal.
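The profile comparison described above amounts to a set intersection. The sketch below illustrates it for the haloperidol and QTPROL example discussed later in this section. The function name and the profile contents are illustrative assumptions: KCNH1, KCNH2, CACNA1C and ABCB1 are named in the caption of Figure 7, while the remaining entries are placeholders and do not reproduce the actual output of the profiling methods or of DisGeNET.

```python
def linking_proteins(drug_target_profile, event_protein_profile):
    """Return the proteins present in both profiles, sorted for stable output."""
    return sorted(set(drug_target_profile) & set(event_protein_profile))


# Illustrative Drug-Target-Profile (targets of the drug and of its metabolites).
drug_target_profile = {"KCNH1", "KCNH2", "CACNA1C", "ABCB1", "PLACEHOLDER_TARGET"}

# Illustrative Event-Protein Profile for the clinical event.
event_protein_profile = {"KCNH1", "KCNH2", "CACNA1C", "PLACEHOLDER_EVENT_PROTEIN"}

drug_event_linking = linking_proteins(drug_target_profile, event_protein_profile)
print(drug_event_linking)
```

An empty result means that no direct protein-level connection was found, in which case only the pathway-based scenario can provide a link.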

Second, the association between the drug and the clinical event can involve proteins that are not directly associated with the drug and the clinical event, but indirectly in the context of biological networks. The final consequence of the drug action is the observed clinical event. Thus, proteins in the Drug-Target-Profile and in the Event-Protein Profile are searched in The Human Protein Atlas database to determine if they are expressed in the same tissue and cell type. Proteins that share expression at both levels (tissue and cell type) are used to query the Reactome database, and pathways that contain at least one protein from the Drug-Target-Profile and one protein from the Event-Protein Profile are retrieved. Then, these pathways are explored to determine if they support the causal inference of the signal (Figure 5B).
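The pathway-based scenario can be sketched in two steps: keep the proteins of each profile that share a tissue and cell type with a protein of the other profile, then retrieve the pathways that contain at least one retained protein from each profile. All names and data structures below are hypothetical placeholders; the actual workflow queries The Human Protein Atlas and Reactome rather than in-memory dictionaries.

```python
def coexpressed(drug_targets, event_proteins, expression):
    """Keep proteins that share a (tissue, cell type) pair with the other profile."""
    drug_sites = {p: expression.get(p, set()) for p in drug_targets}
    event_sites = {p: expression.get(p, set()) for p in event_proteins}
    kept_drug = {d for d, sites in drug_sites.items()
                 if any(sites & other for other in event_sites.values())}
    kept_event = {e for e, sites in event_sites.items()
                  if any(sites & other for other in drug_sites.values())}
    return kept_drug, kept_event


def linking_pathways(kept_drug, kept_event, pathways):
    """Pathways with at least one protein from each of the two retained sets."""
    return sorted(name for name, members in pathways.items()
                  if members & kept_drug and members & kept_event)


# Hypothetical expression annotations: protein -> {(tissue, cell type), ...}
expression = {
    "DT1": {("heart", "myocyte")},
    "DT2": {("liver", "hepatocyte")},
    "EP1": {("heart", "myocyte")},
    "EP2": {("kidney", "tubule cell")},
}
# Hypothetical pathway membership: pathway -> set of proteins
pathways = {
    "pathway_A": {"DT1", "EP1", "OTHER"},
    "pathway_B": {"DT2", "EP1"},
    "pathway_C": {"DT1"},
}

kept_drug, kept_event = coexpressed({"DT1", "DT2"}, {"EP1", "EP2"}, expression)
print(linking_pathways(kept_drug, kept_event, pathways))
```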

[pic]

Figure 6. Integration of diverse biomedical sources and bioinformatics tools for the implementation of the filtering and substantiation frameworks. Data sources and bioinformatics methods relevant for signal filtering and substantiation are accessed by means of SOAP web services and integrated using Taverna workflows.

Our approaches for signal filtering and signal substantiation were implemented using dedicated bioinformatics methods that are accessed through web services and integrated into processing pipelines by means of Taverna workflows (Figure 6). The results of the substantiation workflow can be visualized and analyzed with other bioinformatics tools such as Cytoscape[9], a software package for network visualization and analysis (see Figure 7 and the tables below for example results). For the signal filtering process, we have implemented two Taverna workflows (ADR-FM and ADR-FD) that access data mined from databases such as DrugBank[10], DailyMed[11] and Medline® (see the tables below for example results). A third Taverna workflow, ADR-S, performs the signal substantiation process and was implemented by combining in silico target profiling, text mining and pathway analysis, among other bioinformatics approaches.

[pic]

Figure 7. Cytoscape graph for QTPROL-haloperidol. The results of the ADR-S workflow can be visualized as a graph in which the nodes are proteins, compounds and clinical events. A: Detail of the network depicting the haloperidol targets, the proteins associated with QTPROL and the connections between them. The proteins encoded by the genes KCNH1, KCNH2 and CACNA1C constitute Drug-Event Linking Proteins between haloperidol and the terms corresponding to QTPROL. B: Detail of the targets of haloperidol, showing the adrenergic receptors (light blue) and the drug transporter encoded by the gene ABCB1 (magenta). In both graphs, multiple edges between two nodes represent different pieces of evidence for the corresponding association between the nodes.

Table 9. Antipsychotics with low and high risk of producing prolongation of the QT interval (QTPROL) analyzed with the filtering workflows (ADR-FM and ADR-FD). For the ADR-FD, the individual results obtained from the three different sources used (Medline, DailyMed and DrugBank) are shown. The table shows the number of records found in each case. NA: Not Available.

| |Workflow |

| |ADR-FM |ADR-FD |

|Risk of QTPROL |


|Milestone no. |Milestone name |Work package no. |Lead beneficiary |Delivery date from Annex I (dd/mm/yyyy) |Achieved (Yes/No) |Actual / Forecast achievement date (dd/mm/yyyy) |Comments |

|1 |Definition of event list |2 |UB2 |31/05/2008 |Yes |04/09/2008 |D2.1 |

|2 |Completion of validation sets |2 |UB2 |31/05/2008 |Yes |16/10/2008 |D2.2 |

|3 |Finalisation of standardisation and mapping terminologies |2 |UB2 |31/10/2008 |Yes |09/02/2009 |D2.3 |

|4 |Completion of 1st versions of software and algorithms for data extraction and mining of databases and repositories |3, 4 |EMC, UPF |30/04/2009 |Yes |30/03/2010 |Basic functionality of prototype software and algorithms for mining of clinical and biomedical dbs and repositories demonstrated, and reported as D3.1, D4.1, D4.2 and D5.2. |

|5 |Completion of mid-term assessment of the project |1 |EMC |31/10/2009 |Yes |30/03/2010 |Project explicitly evaluated as regards fulfilment of its objectives, as reported in D1.3 |

|6 |Completion of EU-ADR system software version 1, including underlying software components |3, 4, 5 |EMC, UPF, UAVR |31/03/2010 |Yes |22/09/2010 |Prototype of EU-ADR web system accessible and running flawlessly with basic functionality, reported as D5.2, including completed underlying software components, documented and reported as D3.1, D3.2, D4.2 and D4.3 |

|7 |Finalisation of an evidence combination framework |5 |UAVR |31/01/2011 |Yes |11/02/2011 |Combination framework documented and reported as D5.3 |

|8 |Completion of final version of the EU-ADR System software, including underlying software components and algorithms |3, 4, 5 |EMC, UPF, UAVR |31/07/2011 |Yes |07/10/2011 |Prototype of final EU-ADR web system running flawlessly with full functionality, reported as D5.4, including completed underlying software components and algorithms for data and literature mining, pathway mapping, etc. documented and reported as D3.3, D4.4 and D4.5 |

|9 |Completion of retrospective validation studies |6 |EMC |31/07/2011 |Yes |20/09/2011 |Results from retrospective validation documented and reported as D6.3 |

4 Explanation of the use of the resources

Below is an explanation of the personnel costs, subcontracting and any major direct costs incurred by each beneficiary, such as the purchase of important equipment, travel costs and large consumable items, linked to the corresponding work packages.

These are listed in the following tables:

|Table 4.1 Personnel, subcontracting and other major cost items for Beneficiary EMC |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1,3,4,5,6,7,8 |Personnel direct costs |437,170.00€ |Salaries of 8 researchers for a total of 70.5 pm: J van der Lei (3.5 pm), J Kors (4.1 pm), M Sturkenboom (3.4 pm), E van Mulligen (2.0 pm), M Schuemie (15.9 pm), P Coloma (19.5 pm), G Trifiró (9.5 pm), B Mosseveld (9.1 pm) |

| | | |Salaries of 2 financial management staff for a total of 3.5 pm: T de Ben (1.7 pm), A Woerdeman (1.8 pm) |

|3,6 |Subcontracting |65,200.00€ |Subcontractor SIMG |

|1,3,4,5,6,7 |Other direct costs |48,301.94€ |Travel costs: |

| | | |7th Consortium Meeting (CM), Sitges, February 15-16, 2010, E van Mulligen, J van der Lei,  M Schuemie, M Sturkenboom, P Coloma, J Kors, |

| | | |G Trifiró, S Romio, |

| | | |2nd Project Review, Brussels April 13-14, 2010, E van Mulligen, J van der Lei,  M Schuemie, M Sturkenboom,J Kors, G Trifiró, |

| | | |8th CM, Rotterdam, May 25-26, 2010, E van Mulligen, J van der Lei,  M Schuemie, M Sturkenboom, P Coloma, J Kors, G Trifiró, S Romio, T |

| | | |de Ben |

| | | |9th CM, Florence, October 25-26, 2010, E van Mulligen, J van der Lei,  M Schuemie, M Sturkenboom, P Coloma, J Kors, G Trifiró, S Romio, |

| | | |V Patadia, |

| | | |10th CM, Aveiro, February 2-3, 2011, E van Mulligen, J van der Lei,  M Schuemie, M Sturkenboom, P Coloma, J Kors,  G Trifiró, |

| | | |11th CM, Barcelona, April 4-5, 2011, E van Mulligen, J van der Lei,  M Schuemie, M Sturkenboom, P Coloma, G Trifiró, B Singh |

| | | |12th CM, Barcelona, June 22-23, 2011, E van Mulligen, J van der Lei,  M Schuemie, M Sturkenboom, P Coloma, G Trifiró, J Kors, B Singh |

| | | |Other travel costs and travel costs related to dissemination activities, including the conferences (March 2010, September 2010, January 2011) attended by M Schuemie, the conference (July 2010) attended by G Trifiró, the conference (July 2010) attended by P Coloma and the conferences (April 2010, June 2010, October 2010, December 2010, January 2011) attended by M Sturkenboom |

|1,3,4,5,6,7 |Remaining direct costs |4,184.06€ |Various costs: mailing, teleconferencing, tuition (P Coloma, training programme in Clinical Epidemiology) and hardware costs. |

| |Indirect costs |293,793.00€ |60% rate for indirect costs |

|TOTAL COSTS |848,649.00€ | |

|Table 4.2 Personnel, subcontracting and other major cost items for Beneficiary FIMIM |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1,7,8 |Personnel direct costs |41,169.00€ |Salaries of Carlos Díaz (2.5 p/m), Eva Molero (2.9 p/m), Sandra Pla (3.7 p/m) |

|7 |Subcontracting |220.00€ |Website Maintenance carried out by the subcontractor “” |

|7,8 |Travel costs and meeting |22,125.00€ |7th Consortium Meeting (CM), Sitges, February 15-16, 2010. Meeting organised and hosted by FIMIM. The organisation costs include the |

| |organisation | |dinner, catering and hiring meeting rooms (4,673.79€). |

| | | |2nd Project Review, Brussels April 13-14, 2010, C Diaz, E Molero, S Pla |

| | | |8th CM, Rotterdam, May 25-26, 2010, C Diaz, E Molero, |

| | | |9th CM, Florence, October 25-26, 2010, C Diaz, E Molero, S Pla |

| | | |10th CM, Aveiro, February 2-3, 2011, C Diaz, E Molero, S Pla |

| | | |11th CM, Barcelona, April 4-5, 2011. Meeting organised and hosted by FIMIM. The organisation costs include the dinner, catering and |

| | | |hiring meeting room (4,416.54€). |

| | | |12th CM, Barcelona, June 22-23, 2011. Meeting organised and hosted by FIMIM. The organisation costs include the dinner, catering and |

| | | |hiring meeting room (2,528.85€). |

| | | | |

| | | |Travel costs for Medinfo Conference (September, 2010) attended by Eva Molero |

|7,8 |Remaining direct costs |893.00€ |Print of Flyer and courier costs. |

| |Indirect costs |38,511.00€ |60% rate for indirect costs |

|TOTAL COSTS |102,918.00€ | |

|Table 4.3 Personnel, subcontracting and other major cost items for Beneficiary UPF |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1,4,5,7,8 |Personnel direct costs |56,842.47€ |Salaries of own personnel, 9.81 p/m (2 senior scientists and 1 pre-doctoral student), and of new personnel for the project, 11 p/m (1 pre-doctoral student). |

|4,8 |Subcontracting |7,389.00€ |Subcontracting of Work performed by TAU, and Audit Certificates. |

|1,4,5 |Travel costs |7,623.27€ |Travel costs for Project meetings: |

| | | |7th Consortium Meeting (CM), Sitges, February 15-16, 2010, L Furlong, A Bauer-Mehren, F Sanz |

| | | |2nd Project Review, Brussels April 13-14, 2010, L Furlong, J Mestres, F Sanz |

| | | |8th CM, Rotterdam, May 25-26, 2010, L Furlong |

| | | |26th. ICPE 2010 Project dissemination, Brighton, England. August 19-22, 2010 L Furlong |

| | | |ECCB10 Project dissemination, Ghent, Belgium. September 26, 2010, A Bauer-Mehren |

| | | |Collaboration Meeting, Paris, France. September 28, 2010, A Bauer-Mehren, L Furlong, |

| | | |ICSB2010 Project dissemination, Edinburgh, Scotland. October 11-14, 2010, A Bauer-Mehren, |

| | | |9th CM, Florence, October 25-26, 2010, L Furlong, A Bauer-Mehren, F Sanz, J Mestres |

| | | |10th CM, Aveiro, February 2-3, 2011, A Bauer-Mehren, L Furlong |

| |Remaining direct costs |5,563.04€ |Scientific collaborations on “Mapping side effects on drug-target networks”; these collaborations were specific contributions from visiting experts. |

| | | |Congress fees for Laura Furlong (ICPE2010) and Anna Bauer-Mehren (ECCB2010). |

| |Indirect costs |42,017.27€ |60% rate for indirect costs |

|TOTAL COSTS |119,435.05€ | |
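The figures in Table 4.3 can be cross-checked with a short calculation. The sketch assumes that the 60% rate is applied to the direct costs excluding subcontracting; this assumption is consistent with the amounts reported in the table, but the calculation rule itself is not spelled out in this section.

```python
from decimal import Decimal, ROUND_HALF_UP

# Amounts for Beneficiary UPF taken from Table 4.3.
direct_costs = {
    "personnel": Decimal("56842.47"),
    "travel": Decimal("7623.27"),
    "remaining": Decimal("5563.04"),
}
subcontracting = Decimal("7389.00")


def indirect_costs(costs, rate=Decimal("0.60")):
    """Flat-rate indirect costs on direct costs (subcontracting excluded), to 2 decimals."""
    base = sum(costs.values(), Decimal("0"))
    return (base * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


indirect = indirect_costs(direct_costs)
total = sum(direct_costs.values(), Decimal("0")) + subcontracting + indirect
print(indirect, total)
```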

|Table 4.4 Personnel, subcontracting and other major cost items for Beneficiary UAVR |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1,5,6,7,8 |Personnel direct costs |133,741.94€ |Salaries corresponding to 24.20 p/m: |

| | | |José Luís Oliveira: 6.35 p/m |

| | | |José Pinto: 8.5 p/m |

| | | |Carlos Costa: 4.25 p/m |

| | | |Joaquim Arnaldo  Martins: 5.1 p/m |

|1,5,6,7,8 |Other direct costs |12,977.39€ |General meetings of the Project and several joint activities, namely: |

| | | |7th Consortium Meeting (CM), Sitges, February 15-16, 2010, Jose Luis Oliveira |

| | | |2nd Project Review, Brussels April 13-14, 2010, Jose Luis Oliveira, |

| | | |8th CM, Rotterdam, May 25-26, 2010, Jose Luis Oliveira |

| | | |9th CM, Florence, October 25-26, 2010, Jose Luis Oliveira |

| | | |10th CM, Aveiro, February 2-3, 2011, Local Organisation. |

| | | |11th CM, Barcelona, April 4-5, 2011, Jose Luis Oliveira |

| | | |12th CM, Barcelona, June 22-23, 2011, Jose Luis Oliveira |

| | | | |

| | | |Presentation of Project results in international conferences (PACBB2010, ITAB2010). |

| |Indirect costs |88,031.60€ |60% rate for indirect costs |

|TOTAL COSTS |234,750.93€ |  |

|Table 4.5 Personnel, subcontracting and other major cost items for Beneficiary UB2 |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1,5,6,7,8 |Personnel direct costs |44,510.89 € |Salaries for partial workload of 7 researchers for the period according to time sheets and as detailed below: |

| | | |N. Moore: 1.3 p/m, A Fourrier-Réglat: 3.9 p/m, A Pariente: 0.8 p/m, F. Salvo: 0.5 p/m, P Avillach: 0.3 p/m, F. Mougin: 0.03 p/m, G |

| | | |Diallo: 0.6 p/m |

|5,6 |Other direct costs |17,102.73 € |Travel costs for Project meetings: |

| | | |7th Consortium Meeting (CM), Sitges, February 15-16, 2010, A. Fourrier-Réglat, F. Mougin, P. Avillach, G. Diallo |

| | | |2nd Project Review, Brussels April 13-14, 2010, A. Fourrier-Réglat, F. Thiessard |

| | | |8th CM, Rotterdam, May 25-26, 2010, P. Avillach, G. Diallo |

| | | |9th CM, Florence, October 25-26, 2010, A. Fourrier-Réglat, P. Avillach, G. Diallo |

| | | |10th CM, Aveiro, February 2-3, 2011, G. Diallo |

| | | |11th CM, Barcelona, April 4-5, 2011, A. Fourrier-Réglat, G. Diallo |

| | | | |

| | | |Travel costs for Medinfo Conference (September, 2010) attended by P Avillach |

| |Indirect costs | 36,968.17€ |60% rate for indirect costs |

|TOTAL COSTS |98,581.79€ |  |

|Table 4.6 Personnel, subcontracting and other major cost items for Beneficiary LSHTM |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|2,3,4,5,6,7 |Personnel direct costs |73,315.43€ |Contribution to salaries of one computer programmer (Ferran Orsola, 0.11 p/m), one data manager (Richard Jackson, 0.9 p/m), one Project co-ordinator (Ruqayya Suleman, 6.6 p/m), one statistician (David Prieto, 0.4 p/m) and one epidemiologist (Justin Matthews, 7 p/m). Total: 15 p/m |

|2,3,4,5,6 |Other direct costs |6,141.36€ |Travel costs for Project meetings: |

| | | |7th Consortium Meeting (CM), Sitges, February 15-16, 2010, Mariam Molokhia |

| | | |2nd Project Review, Brussels April 13-14, 2010, Mariam Molokhia |

| | | |8th CM, Rotterdam, May 25-26, 2010, Mariam Molokhia, Justin Matthews |

| | | |ICPE conference Brighton, UK Aug 2010, David Prieto |

| | | |9th CM, Florence, October 25-26, 2010, Mariam Molokhia |

| | | |11th CM, Barcelona, April 4-5, 2011, Mariam Molokhia, Justin Matthews, David Prieto |

| | | |12th CM, Barcelona, June 22-23, 2011, Mariam Molokhia, Justin Matthews |

| |Indirect Costs |47,674.07€ |60% rate for indirect costs |

|TOTAL COSTS |127,130.86€ |  |

|Table 4.7 Personnel, subcontracting and other major cost items for Beneficiary AUH-AS |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1 |Personnel direct costs |8,926.70€ |Salaries for one admin person for 1 month and for one statistician for 0.5 month |

|3 |Personnel direct costs |18,124.32€ |Salaries for two statisticians for 1 month and 3 months, respectively |

|3 |Personnel direct costs |20,461.96€ |Salaries for two statisticians for 3.7 months and 0.8 months, respectively |

|6 |Personnel direct costs |18,407.23€ |Salaries for two statisticians for 1 month and 3 months, respectively |

|6 |Personnel direct costs |4,764.37€ |Salary for one statistician for 1 month |

|7 |Personnel direct costs |5,341.48€ |Salaries for one statistician for 0.3 month and for one admin person for 0.6 month |

|1,3,6 |Travel costs |4,280.91€ |Costs for one person attending: |

| | | |8th Consortium Meeting (CM) Rotterdam, May 25-26, 2010 |

| | | |9th CM, Florence, October 25-26, 2010 |

| | | |10th CM, Aveiro, February 2-3, 2011 |

| | | |11th CM, Barcelona, April 4-5, 2011 |

|  |Indirect costs |48,184.18€ |60% rate for indirect costs |

|TOTAL COSTS |128,491.14€ |  |

|Table 4.8 Personnel, subcontracting and other major cost items for Beneficiary AZ |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|4,5,7 |Personnel direct costs |106,431.23€ |Salaries for a post doc for 9 months (Ernst Ahlberg), a researcher for 2.5 months (Samuel Andersson), a PA for 2 months (Marita Franzén) and a senior researcher for 2 months (Scott Boyer) |

|4,5,7 |Travel costs |22,825.90€ |Travel costs for Project meetings: |

| | | |7th Consortium Meeting (CM), Sitges, February 15-16, 2010, Scott Boyer, Samuel Andersson |

| | | |8th CM, Rotterdam, May 25-26, 2010, Scott Boyer, Samuel Andersson, Ernst Ahlberg |

| | | |9th CM, Florence, October 25-26, 2010, Scott Boyer, Ernst Ahlberg |

| | | |10th CM, Aveiro, February 2-3, 2011, Scott Boyer, Ernst Ahlberg |

| | | |11th CM, Barcelona, April 4-5, 2011, Scott Boyer, Ernst Ahlberg |

| | | |12th CM, Barcelona, June 22-23, 2011, Scott Boyer, Ernst Ahlberg |

| | | |Additional travel costs for 3 conferences attended by Scott Boyer and one attended by Ernst Ahlberg. |

| |Indirect costs |38,777.14€ |Actual indirect costs |

|TOTAL COSTS |168,034.27€ |  |

|Table 4.9 Personnel, subcontracting and other major cost items for Beneficiary UNOTT |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1,2,3,5,6,7 |Personnel direct costs |29,722.00€ |Salary for Professor Julia Hippisley-Cox, 01/02/2010-31/07/2011 |

|1 |Personnel direct costs | 751.00€ |April McCambridge-general administrative tasks including completion of forms, travel arrangements and timesheets |

|1 |Personnel direct costs | 53.00€ |Project Admin Support-Jill Harris (Accounting, Research Innovation Services) |

|1,2,3,5,6 |Travel costs | 2,075.00€ |Travel costs for Project meetings: |

| | | |2nd Project Review, Brussels, April 13-14, 2010, Julia Hippisley-Cox |

| | | |8th CM, Rotterdam, May 25-26, 2010, Julia Hippisley-Cox |

| | | |11th CM, Barcelona, April 4-5, 2011, Julia Hippisley-Cox |

|3,7 |Remaining direct costs | 58,695.00€ |QResearch data access: |

| | | |April 10 extract costs for WP3 €11,792.45 |

| | | |July 10 extract costs for WP3 €11,792.45 |

| | | |April 11 Facilities cost transfer for WP3 €11,461.32 |

| | | |Extract costs Gold run for WP3 €11,792.45 |

| | | |Extract costs Silver run for WP3 €11,792.45 |

| | | |Printing of poster for display at University of Nottingham €63.88 June 2010 WP7 |

|  |Indirect costs | 54,777.60€ |60% rate for indirect costs |

|TOTAL COSTS | 146,073.60€ |  |

|Table 4.10 Personnel, subcontracting and other major cost items for Beneficiary UNIMIB |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1,3,6,7 |Personnel direct costs |15,334.03€ |Salaries of ordinary staff: 1 professor for 0.5 p/m, 2 researchers for 1.5 p/m, 1 associate professor for 1 p/m |

|3 |Personnel direct costs |39,394.42€ |Salaries of temporary staff: 1 stat-info technician for 14.4 p/m |

|1,3,6 |Travel costs |2,099.61€ |Travel costs for Project meetings: |

| | | |7th Consortium Meeting (CM), Sitges, February 15-16, 2010, Lorenza Scotti |

| | | |8th CM, Rotterdam, May 25-26, 2010, Lorenza Scotti |

| | | |9th CM, Florence, October 25-26, 2010, Lorenza Scotti, Arianna Ghirardi |

| | | |10th CM, Aveiro, February 2-3, 2011, Lorenza Scotti, Arianna Ghirardi |

| | | |11th CM, Barcelona, April 4-5, 2011, Lorenza Scotti, Arianna Ghirardi |

|1,3,6 |Remaining direct costs |807.51€ |Depreciation of PC, printer, monitor and memory unit |

| |Indirect costs |34,581.64€ |60% rate for indirect costs |

|TOTAL COSTS |92,217.71€ | |

|Table 4.11 Personnel, subcontracting and other major cost items for Beneficiary ARS |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|1,2,3,6,7 |Personnel direct costs |39,072.67€ |Salaries of 4 researchers: 3 p/m, 5 p/m, 3.2 p/m and 0.6 p/m |

|6 |Subcontracting |440.80€ |Chart review from collaborating hospital |

|1,2,3,6 |Travel costs |5,057.20€ |Travel costs for Project meetings (1,829.54€): |

| | | |7th Consortium Meeting (CM), Sitges, February 14, 2010, Rosa Gini; |

| | | |10th CM, Aveiro, February 2-6, 2011, Rosa Gini; Francesco Innocenti |

| | | |11th CM, Barcelona, April 4-5, 2011, Rosa Gini; |

| | | |12th CM, Barcelona, June 22-23, 2011, Rosa Gini; |

| | | | |

| | | |Consortium meetings (3,227.66€): |

| | | |9th CM, Florence, October 25-26, 2010 |

|7 |Travel costs |3,243.25€ |Travel costs for Medinfo Conference (September, 2010) attended by Rosa Gini |

| |Indirect costs |28,423.87€ |60% rate for indirect costs |

|TOTAL COSTS |76,237.79€ | |

|Table 4.12 Personnel, subcontracting and other major cost items for Beneficiary PHARMO |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|2,3,6,7 |Personnel direct costs |56,113.00€ |Salaries of statistician, scientific director, software engineer, managing director, data logistic manager for a total 6.6 p/m |

|2,3,6 |Other direct costs |7,081.70€ |Data costs: 3,000.00€ |

| | | | |

| | | |Travel costs for Project meetings (4,081.70€): |

| | | |7th Consortium meeting (CM), Sitges February 14, 2010. 2 travelers (Huub M.P.M. Straatman, Ron Herings) |

| | | |9th CM Florence, October 25-26, 2010. 2 travelers (Huub M.P.M. Straatman, Ron Herings) |

| | | |10th CM, Aveiro, February 2-6, 2011. 2 travelers (Huub M.P.M. Straatman, Ron Herings) |

| | | |11th CM, Barcelona, April 4-5, 2011. 1 traveler (Ron Herings) |

| | | |12th CM, Barcelona, June 22-23, 2011. 1 traveler (Ron Herings) |

|  |Indirect costs |37,916.82€ |60% rate for indirect costs |

|TOTAL COSTS |101,111.52€ |  |

|Table 4.13 Personnel, subcontracting and other major cost items for Beneficiary PEDIANET |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|3,6 |Personnel direct costs |37,802.66€ |Salary of Data Manager (15.3 p/m) |

|3,6 |Subcontracting |15,000.00€ |These costs were incurred for the payment to the International Pharmaco-epidemiology and Pharmaco-economics Research Center (IPPRC) for |

| | | |the elaboration of large databases and the statistical analysis of epidemiological studies (3 p/m) |

| 3 |Other direct costs |4,219.76€ |Purchase of personal computers and of a server for data collection and management (depreciation costs only are charged) |

|  |Indirect costs |25,213.45€ |60% rate for indirect costs |

|TOTAL COSTS |82,235.87€ |  |

|Table 4.14 Personnel, subcontracting and other major cost items for Beneficiary USC |

|for the period 01/02/2010-31/07/2011 |

|Work Package |Item description |Amount in € with 2 decimals |Explanations |

|4,6 |Personnel direct costs |25,985.00€ |Salaries of Prof. M. Isabel Loza (1.35 p/m), José Manuel Brea (3.02 p/m), Prof. María Angeles Castro (2.04 p/m), José Manuel Santamaría |

| | | |(0.69 p/m) |

|7 |Personnel direct costs |1,338.00€ |Salary of Prof. M. Isabel Loza (0.30 p/m) |

|4,6 |Travel costs |5,902.00€ |Travel costs for Project meetings: |

| | | |7th Consortium Meeting (CM), Sitges, February 15-16, 2010, Mabel Loza, José Manuel Brea |

| | | |8th CM, Rotterdam, May 25-26, 2010, Mabel Loza, Ainhoa Nieto |

| | | |9th CM, Florence, October 25-26, 2010, Mabel Loza, Ainhoa Nieto |

| | | |10th CM, Aveiro, February 2-3, 2011, Mabel Loza, Ainhoa Nieto, José Manuel Santamaría |

| | | |11th CM, Barcelona, April 4-5, 2011, Ainhoa Nieto |

|  |Indirect costs |19,934.00€ |60% rate for indirect costs |

|TOTAL COSTS |53,159.00€ |  |

5 Financial statements – Form C and Summary financial report

Please see PDF file with all the Form Cs and the summary of costs reported.

6 Certificates

|Beneficiary |Organisation short name|Certificate on the financial |Any useful comment, in particular if a certificate is not |

| | |statements provided? |provided |

| | |yes / no | |

|1 |EMC |Yes |Expenditure threshold reached |

|2 |FIMIM |No |Expenditure threshold not reached |

|3 |UPF |Yes |Expenditure threshold reached |

|4 |UAVR |No |Expenditure threshold not reached |

|6 |UB2 |No |Expenditure threshold not reached |

|7 |LSHTM |No |Expenditure threshold not reached |

|8 |AUH-AS |No |Expenditure threshold not reached |

|9 |AZ |No |Expenditure threshold not reached |

|10 |UNOTT |No |Expenditure threshold not reached |

|11 |UNIMIB |No |Expenditure threshold not reached |

|12 |ARS |No |Expenditure threshold not reached |

|13 |PHARMO |No |Expenditure threshold not reached |

|14 |PEDIANET |No |Expenditure threshold not reached |

|15 |USC |No |Expenditure threshold not reached |

7 Annexes – Current Top Risks

RISK DOCUMENTATION FORM

Risk ID Resolved:

|TYPE OF RISK: |Threat |Opportunity |

|CLASSIFICATION: |

|WORK PACKAGE/ACTIVITY: WP3 |

|RISK OWNER[12]: |Martijn Schuemie (EMC) |

|LAST UPDATE: |30/09/2011 |

|DESCRIPTION: |

|The risk lies in the performance of the Jerboa software, as it is difficult to predict the actual performance of Jerboa on large datasets in various environments. However, the work in WP3 has been mostly carried out without major problems, and data extraction is undertaken off-line (outside the EU-ADR Web Platform), so the impact on the project is regarded as low. The risk is currently perceived as more relevant to the post-project phase, if many databases participate and more dynamic data turnover becomes essential. It is therefore an important risk for the future, so the overall Impact is deemed to be Medium. |

Proximity in time:

Impact on project: (a)

Probability of happening: (b)

Exposure: (a)*(b)

|Mitigation Approaches[13] |

|Use of Jerboa in different environments (e.g. benchmarking exercises with Asian and American databases) and in other projects (e.g. SOS, ARITMO, |

|VAESCO, SAFEGUARD), with datasets that are as realistic as possible. |

|Trigger Events |

|Performance problems in Jerboa when used on large datasets (errors, processing delays, etc.). |

|Contingency Plans[14] |

|Initiate and resource a technical workgroup dedicated to analysing performance issues and ways to solve them. |

RISK DOCUMENTATION FORM

Risk ID Resolved:

|TYPE OF RISK: |Threat |Opportunity |

|CLASSIFICATION: |

|WORK PACKAGE/ACTIVITY: WP7 |

|RISK OWNER: |Eva Molero (FIMIM) |

|LAST UPDATE: |30/09/2011 |

|DESCRIPTION: |

|A lack of consensus regarding the use of results after the project could jeopardise the post-project phase and adequate exploitation of results. Partners are |

|focused on the technical and scientific work of the project; therefore, they may postpone decisions on exploitation until the end of the |

|project, when agreements will be more difficult to reach. |

Proximity in time:

Impact on project: (a)

Probability of happening: (b)

Exposure: (a)*(b)

|Mitigation Approaches |

|Arrange brainstorming sessions within the consortium to learn partners' views regarding the results after the project's life (DONE). |

|Ask partners (through a survey form) for their opinion on the implications of the post-project phase (DONE). Analyse lightweight approaches to |

|facilitate initial exploitation (DONE). |

|Trigger Events |

|Lack of appropriate feedback derived from the mitigation activities above. |

|Contingency Plans |

|Reach agreement with selected partners, and look for consensus and IPR arrangements that ensure continuation of activities with the involvement of |

|motivated partners only. |

RISK DOCUMENTATION FORM

Risk ID Resolved:

|TYPE OF RISK: |Threat |Opportunity |

|CLASSIFICATION: |

|WORK PACKAGE/ACTIVITY: WP3 |

|RISK OWNER: |Johan van der Lei (EMC) |

|LAST UPDATE: |07/09/2011 |

|DESCRIPTION: |

|Difficulty in proving superiority over traditional methods. EU-ADR is intended to identify signals better and faster than spontaneous |

|reporting systems (SRS). However, comparison is difficult. |

Proximity in time:

Impact on project: (a)

Probability of happening: (b)

Exposure: (a)*(b)

|Mitigation Approaches |

|Focus on validation activities, with all WPs supporting them in the final months of the project. |

|Trigger Events |

|Inconclusive results obtained in validation deliverables. |

|Contingency Plans |

|Reinforce communication activities that present EU-ADR as complementary to (i.e. not a replacement for) SRS, emphasizing areas in which |

|it performs better. |

-----------------------

[1] If either of these boxes below is ticked, the report should reflect these and any remedial actions taken.

[2] Roberts A, Gaizauskas R, Hepple M, Davis N, Demetriou G, et al. (2007) The CLEF corpus: Semantic annotation of clinical text. AMIA Annu Symp Proc 2007: 625-629.

[3] Kolárik C, Klinger R, Friedrich CM, Hofmann-Apitius M, Fluck J. (2008) Chemical names: Terminological resources and corpora annotation. Proceedings of the LREC 2008 Workshop on Building and Evaluating Resources for Biomedical Text Mining: 51.

[4] Giuliano C, Lavelli A, Romano L. (2006) Exploiting shallow linguistic information for relation extraction from biomedical literature. : 5-7.

[5] Bauer-Mehren A, Rautschka M, Sanz F, Furlong LI. (2010) DisGeNET: A cytoscape plugin to visualize, integrate, search and analyze gene-disease networks. Bioinformatics 26(22): 2924-2926.

[6] Bauer-Mehren A, Bundschus M, Rautschka M, Mayer MA, Sanz F, et al. (2011) Gene-disease network analysis reveals functional modules in mendelian, complex and environmental diseases. PLoS One 6(6): e20284.

[7] Swanson DR. (1986) Fish oil, Raynaud's syndrome, and undiscovered public knowledge. Perspect Biol Med 30(1): 7-18.

[8] Garcia-Serna R, Mestres J. (2010) Anticipating drug side effects by comparative pharmacology. Expert Opin Drug Metab Toxicol 6(10): 1253-1263.

[9] Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, et al. (2003) Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res 13(11): 2498-2504.

[10] Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, et al. (2008) DrugBank: A knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res 36: D901-906.

[11] [Anonymous]. DailyMed. Available: via the Internet.

[12] Partner in the best position to recommend mitigation strategies, develop contingency plans and monitor the status of the risk.

[13] All threat risks with medium or high probability or impact should have a mitigation strategy.

[14] All risks with an exposure equal to or greater than 4 should have a contingency plan in advance.

-----------------------

01-013

05-0003

05-0004
