To summarize, three goals motivate this paper. The first is to increase the precision of outcome extraction from sentences to entities. The second is to evaluate the predictive performance of a list model and multiple machine learning algorithms using three standards: the initial set of seed outcomes, a manual evaluation of overall outcomes, and an estimated evaluation of overall outcomes. Achieving these goals enables the third goal, which is to compare the automatically extracted clinical and surrogate outcomes with respect to 5 different breast cancer treatment strategies that are based on 3 different biological processes.
2. Related work
There are multiple efforts underway to unify medical outcomes, but the work most related to this paper focuses on breast cancer. Information extraction techniques have been developed for many of the PICO elements, but we focus on outcome extraction.
2.1. Cancer outcomes
There is a concerted effort within the general medical community to better articulate the key outcomes for a given medical condition and to differentiate between clinically relevant and surrogate outcomes. The GRADE framework, for example, provides specific recommendations for conducting a systematic review and has been officially endorsed by over 100 organizations worldwide.1 The CONSORT (Consolidated Standards of Reporting Trials) group also provides reporting guidelines for randomized trials. In particular, CONSORT states that outcomes should be “completely defined pre-specified primary and secondary outcome measures, including how and when they were addressed”.
Journal of Biomedical Informatics: X 1 (2019) 100005
Several standards have been developed to capture key outcomes for breast cancer treatments, and for cancer more broadly. The STEEP system, which was created to provide standard definitions for efficacy endpoints in adjuvant breast cancer trials, reported that “overall survival has been recognized as the least ambiguous and most clinically relevant clinical end point in clinical trials of cancer therapy” [18, p. 2128]. Overall survival (OS) was also referred to as a primary endpoint in the DATECAN standard, where OS is defined as “the time from randomization to patients’ death (all causes)”. Since this paper was submitted for publication, a manually constructed ontology for cancer care that includes treatment, health services, physical, and psychosocial health–related concepts has been developed using endpoints reported in ClinicalTrials.gov.
Interestingly, the domain experts who created the Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK) noted that “The endpoint should be defined precisely and not referred to just as ‘survival’ or ‘overall survival’”. Consider the time that elapses between the start of a study and the point at which survival is evaluated. Authors sometimes reported this time as part of the survivorship noun phrase, such as 1-year disease-free survival or five year overall survival (authors frequently use 5-year and 1-year), but the precision varied from days to months to years, and the time was often not included in the phrase at all. The REMARK standard specifically states that “time origin should always be specified”, but an earlier study found that the “time origin was not stated for at least one endpoint in 48% of 132 papers in cancer journals reporting survival analyses.” This gap between ideal and actual outcomes will inevitably continue as new measurement methods are created and as we better understand the biological processes involved in a disease.
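To make the variation in time-qualified survivorship phrases concrete, the following minimal sketch normalizes phrases such as 1-year disease-free survival and five year overall survival into a (count, unit, endpoint) triple. It is an illustration only and not the extraction method evaluated in this paper: the pattern, the number-word table, and the function name `find_timed_survival` are all assumptions, and the sketch covers only three endpoint names.

```python
import re

# Hypothetical lookup table (an assumption for this sketch): spelled-out
# numbers that may precede a time unit, e.g. "five year overall survival".
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

# Illustrative pattern: <number> <unit> <endpoint>, where the number and the
# unit are joined by a hyphen or whitespace ("1-year", "five year").
PATTERN = re.compile(
    r"\b(?P<n>\d+|" + "|".join(NUMBER_WORDS) + r")[\s-]"
    r"(?P<unit>day|month|year)s?\s+"
    r"(?P<endpoint>(?:overall|disease-free|progression-free)\s+survival)\b",
    re.IGNORECASE,
)


def find_timed_survival(text):
    """Return (count, unit, endpoint) for each time-qualified survival phrase."""
    results = []
    for match in PATTERN.finditer(text):
        raw = match.group("n").lower()
        count = int(raw) if raw.isdigit() else NUMBER_WORDS[raw]
        results.append(
            (count, match.group("unit").lower(), match.group("endpoint").lower())
        )
    return results


if __name__ == "__main__":
    print(find_timed_survival("The 1-year disease-free survival was reported."))
    print(find_timed_survival("five year overall survival"))
    # A phrase with no time component yields no match, which mirrors the
    # case where the time is left out of the noun phrase.
    print(find_timed_survival("overall survival improved"))
```

A phrase that omits the time component, as many reported outcomes do, produces no match, so a pattern of this kind can only recover the time when the author wrote it into the noun phrase.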
In addition to overall survival, new treatment regimens have spurred several surrogate measures, such as progression-free survival and disease-free survival, that enable authors to report the results of a trial sooner and can therefore reduce the duration and cost of a randomized clinical trial. There is, however, much discussion within the breast cancer community regarding how each of these alternative outcome measures is operationalized and defined when reporting the results of a clinical trial. Differences in surrogate endpoints have started to be addressed in the DATECAN initiative, which involves an international community of experts who harmonize time-to-event endpoints. The group notes that it is important to realize that the “selection of TTE [time to event] end points to assess a therapeutic strategy depends on the characteristics of a given trial including settings (adjuvant versus metastatic) and treatments (systemic, local, or any combination thereof). As such, the choice of the end points is trial-specific. Once the end point is identified, it then has to be appropriately defined, ideally using a standardized definition to enable future comparisons.” [15, p. 874]. Thus, the goal of DATECAN is not to create a single unifying survivorship endpoint, but to better define the set of endpoints that can be used appropriately when reporting clinical trial results.