Conference proceeding

Large Language Models for Risk of Bias Assessment: A Case Study

YHEC authors: Mary Edwards, Lavinia Ferrante di Ruffano
Publication date: November 2024
Conference: ISPOR EU, Barcelona
Type of conference proceeding: Poster

Abstract

OBJECTIVES: Risk of bias assessment (RoBA) of primary studies is a key part of any systematic review. As a repetitive and structured task, RoBA would initially appear to be well suited to automation or AI support. We assessed the chat interface to Claude 3 Opus for accuracy, consistency, presentation of data, and time savings in the context of RoBA of RCTs for a systematic review.

METHODS: Six RCTs were selected from three reviews conducted by our consultancy over the past five years. Following an initial prompt engineering phase using a report of a seventh RCT, the LLM was used to: 1. Conduct fully automated assessment of each paper using the Cochrane RoB 1 tool (Method 1), and 2. Supply information only, to facilitate joint human / LLM assessment using the same tool (Method 2). The results were compared to fully human assessment (Method 3).

RESULTS: Method 1 resulted in very brief answers, with little supporting information provided by the model. Asking for supporting information only (Method 2) resulted in better quality and more complete data, although no judgement was made by the LLM. Agreement between the three methods was mixed, ranging from 16.7% to 100% across domains. The lowest agreement was seen on questions relating to treatment allocation, incomplete outcome data and other sources of potential bias. In these instances, the LLM appeared to have misinterpreted the questions, resulting in answers that differed from those of the human assessor. However, there were also a few occasions where the LLM picked up information that the human did not.
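The abstract does not state how the agreement percentage was calculated. As a rough illustration only, the sketch below assumes that agreement for a domain means all three methods gave the same judgement for a trial, expressed as a share of the six trials. The judgements and the function name `percent_agreement` are hypothetical and are not taken from the study.

```python
def percent_agreement(judgements_by_method):
    """Share of trials (in %) where every method gave the same judgement.

    Assumption: 'agreement' means identical judgements across all methods.
    """
    trials = list(zip(*judgements_by_method))
    agreed = sum(1 for trial in trials if len(set(trial)) == 1)
    return 100.0 * agreed / len(trials)


# Hypothetical judgements for one RoB domain across six trials (not study data).
method_1 = ["low", "high", "unclear", "low", "high", "unclear"]
method_2 = ["low", "low", "low", "unclear", "unclear", "high"]
method_3 = ["low", "unclear", "high", "high", "low", "low"]

agreement = percent_agreement([method_1, method_2, method_3])
print(f"Agreement: {agreement:.1f}%")
```

Under this assumed definition, agreement on one trial out of six gives 16.7%, which is the lower bound reported above.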

CONCLUSIONS: Using LLMs for fully automated RoBA is not recommended at this stage, as such models can misinterpret questions and provide limited or incorrect justification for judgments. However, with suitable prompt engineering and fine-tuning using existing RoBA data, the performance of these models may improve with time.

Conference proceeding

Measuring Environmental Outcomes is More Complex Than we Think

YHEC authors: Matthew Taylor
Publication date: November 2024
Conference: ISPOR EU, Barcelona
Type of conference proceeding: Poster

Abstract

OBJECTIVES: Accounting for the environmental impact of healthcare is an important issue. In recent years, there has been a substantial increase in the number of economic evaluations that also report an environmental outcome (usually carbon emissions). However, the true impact of changes in the care pathway is more complex than is currently being reported.

METHODS: A case study comparing two separate surgical devices is used to demonstrate the various consequences of a healthcare decision on the environment. Two devices are compared, with different levels of cost, health outcomes and environmental outcomes.

RESULTS: Patients who receive Device A incur costs of €3,820, compared with €5,410 for Device B. Device A is more effective (121 months of life expectancy compared with 119 for Device B) and is less harmful to the environment (25kg of CO2 emissions, compared with 30kg for Device B). Based on these outcomes, Device A would normally be considered 'dominant', since it has better outcomes on all three metrics. However, because Device A results in two additional months of life expectancy, overall CO2 emissions will increase (i.e. including non-healthcare-related emissions). Based on an estimate of 12.7 tons of emissions per year of life, the emissions associated with two additional months would be 2,117kg, vastly outweighing the short-term reduction associated with Device A. In addition, the money saved by Device A will be reinvested into other healthcare, which will increase healthcare emissions by an estimated 0.235kg.
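The arithmetic behind these figures can be reproduced in a few lines. This is a sketch under one stated assumption: that "tons" means metric tonnes (1,000 kg), which is consistent with the 2,117kg figure reported. The variable names are illustrative and all input values are taken from the abstract.

```python
KG_PER_TONNE = 1000  # assumption: "tons" in the abstract means metric tonnes

emissions_per_life_year_kg = 12.7 * KG_PER_TONNE
extra_life_months = 121 - 119  # Device A minus Device B

# Changes in emissions for Device A relative to Device B (kg CO2).
direct_change_kg = 25 - 30  # device-related emissions
indirect_change_kg = emissions_per_life_year_kg * extra_life_months / 12
reinvestment_change_kg = 0.235  # from reinvested savings, as reported

net_change_kg = direct_change_kg + indirect_change_kg + reinvestment_change_kg

print(f"Emissions from additional life expectancy: {indirect_change_kg:,.0f} kg")
print(f"Net change, Device A vs Device B: {net_change_kg:,.0f} kg")
```

The first line printed is 2,117 kg, matching the abstract. Under the same assumption, the net change is an increase of about 2,112 kg, so the 5kg direct saving is outweighed by the indirect effects.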

CONCLUSIONS: Measuring the true environmental impact of medicines is complex, and ignoring key indirect consequences could result in suboptimal (or even counter-productive) decisions. Moving healthcare systems towards a net zero target will require wider thinking than simply choosing between individual therapies.

Conference proceeding

Reappraisal Following the Loss of Medicine Patent in HTA Guidelines

YHEC authors: Matthew Taylor, James Mahon
Publication date: November 2024
Conference: ISPOR EU, Barcelona
Type of conference proceeding: Poster

Abstract

OBJECTIVES: We aimed to explore whether a proactive approach should be used to update HTA guidance when: (i) A medicine that had originally received a negative recommendation becomes unbranded (and has a lower price), (ii) An original medicine was not appraised but is now off patent, and (iii) A comparator treatment in an original appraisal is now unbranded.

METHODS: We undertook a review of existing National Institute for Health and Care Excellence (NICE) appraisals to explore the practical implications of using existing evidence from previous appraisals to assess whether, and how, this type of information could be repurposed to facilitate a rapid assessment of unbranded technologies. We also undertook a detailed assessment of seven appraisals where an intervention within the pathway had since become unbranded.

RESULTS: Of 74 appraisals that had originally received a negative recommendation, 8 of the originator drugs were subsequently recommended following later appraisals, and 60 led to changed guidance. In 6 cases, recommendations stood, despite the medicine becoming unbranded. Of the 7 detailed assessments, in 6 cases we found that a rapid re-assessment would not be possible because: (i) The pathway had changed, (ii) The clinical effectiveness evidence for the originator was considered poor, or (iii) The model had been deemed unreliable due to structural flaws, implausible assumptions or errors that the ERG had not been able to rectify. Only the documentation from 1 of these STAs had the potential to be repurposed to inform a rapid assessment of an unbranded version of the originator.

CONCLUSIONS: In most cases, current approaches to HTA were able to deal with losses of patent, due to the regular appraisal of disease areas when new medicines reach market. In only a small proportion of cases would a proactive approach (i.e. at the point a medicine loses its patent) be useful.

Conference proceeding

Recommended Standards for Managing and Reporting Missing Utility Data for Health Technology Appraisal

YHEC authors: Neil Hansell
Publication date: November 2024
Conference: ISPOR EU, Barcelona
Type of conference proceeding: Podium

Abstract

OBJECTIVES: Health technology assessment (HTA) in the UK often requires that health-related quality of life is considered in a cost-utility analysis (CUA). Most studies contain missing data at some level. Missing utility data can misrepresent the denominator of the ICER. Despite this, there are no definitive guidelines on how to manage missingness for UK HTA, and analysts rely on judgment to address missingness. We intend for this research to formalise our recommendations for dealing with missing utility data for UK HTA.

METHODS: A simulated individual patient dataset, similar to those used to estimate utility values for a CUA submission for HTA in the UK, was developed. With this dataset, we simulated missingness at various levels and by different mechanisms. We assessed the performance of: complete case analysis (CCA), mean square estimation (MSE), linear mixed modelling (LMM) and multiple imputation via chained equations (MICE), and used this to make recommendations for handling and reporting of missing data in IPD that will be submitted to decision-makers in the UK.
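To illustrate why the missingness mechanism matters for complete case analysis, the sketch below simulates a toy utility dataset and removes 30% of values in two ways. It is not the authors' simulation: the data-generating model, the two mechanisms chosen (values dropped completely at random, and values dropped with a probability that rises with age), and all function names are assumptions made for illustration. It uses only the Python standard library and does not implement MICE or LMM.

```python
import random
import statistics


def simulate_utilities(n, rng):
    """Return (age, utility) pairs; utility declines with age plus noise."""
    patients = []
    for _ in range(n):
        age = rng.uniform(40, 80)
        utility = 0.95 - 0.01 * (age - 40) + rng.gauss(0, 0.1)
        patients.append((age, min(1.0, max(0.0, utility))))  # clip to [0, 1]
    return patients


def drop_completely_at_random(patients, rate, rng):
    """Every utility has the same probability of being missing."""
    return [(age, None if rng.random() < rate else u) for age, u in patients]


def drop_by_age(patients, rate, rng):
    """Older patients are more likely to be missing; average rate is about `rate`."""
    out = []
    for age, u in patients:
        p_missing = min(1.0, 2 * rate * (age - 40) / 40)
        out.append((age, None if rng.random() < p_missing else u))
    return out


def complete_case_mean(patients):
    """Mean utility over patients with an observed value only."""
    return statistics.mean(u for _, u in patients if u is not None)


rng = random.Random(2024)
full = simulate_utilities(20_000, rng)
true_mean = statistics.mean(u for _, u in full)

random_error = complete_case_mean(drop_completely_at_random(full, 0.30, rng)) - true_mean
age_error = complete_case_mean(drop_by_age(full, 0.30, rng)) - true_mean

print(f"Complete-case error, dropped at random: {random_error:+.4f}")
print(f"Complete-case error, dropped by age:    {age_error:+.4f}")
```

In this toy setup, the complete-case mean should sit close to the full-data mean when values are dropped completely at random, and above it when older, lower-utility patients are more likely to be missing.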

RESULTS: Regardless of the mechanism or the level of missingness, MICE and LMM always resulted in substantially less error when calculating health state utility. Where data were missing at 30%, CCA and MSE were often associated with a substantially higher mean difference than MICE and LMM. The standard deviation was always substantially depressed regardless of mechanism or level of missingness when LMM was used.

CONCLUSIONS: We recommend that, as a minimum standard, the mechanism and magnitude of missingness be assessed and reported for all data sets used to inform economic models for HTA in the UK. Further, we consider that CCA is almost never appropriate, and our research, echoed by others, supports the use of MICE as standard.

Conference proceeding

Single-Arm Studies: Are All Created Equal?

YHEC authors: Mary Chappell, Deborah Watkins, Lavinia Ferrante di Ruffano, Rachael McCool
Publication date: November 2024
Conference: ISPOR EU, Barcelona
Type of conference proceeding: Poster

Abstract

OBJECTIVES: Randomized controlled trials (RCTs) are the gold standard for evaluating effectiveness of interventions. Interventional single-arm trials (SATs) are increasingly being considered, despite a lack of agreement on their validity and position in the hierarchy of evidence. Notably, it is unclear whether SATs are superior to observational single-arm studies (case series). We investigated whether there are systematic differences in outcome and between-study heterogeneity for SATs compared with case series.

METHODS: We conducted a pragmatic literature review for systematic reviews (SRs) of pharmacological interventions including single-arm studies. A single reviewer identified SRs and extracted primary study characteristics and outcome data. For each SR, meta-analysis of dichotomous outcomes was conducted, with sub-group analysis of the included SATs versus case series. To investigate whether statistical heterogeneity was explained by clinical heterogeneity or indicated bias, clinically different primary studies were removed in a sensitivity analysis.

RESULTS: Thirteen SRs were included. When primary studies were sub-grouped by study design, there was no significant difference for SATs versus case series across SRs (risk difference -0.02, 95% CI -0.09, 0.05). There were high levels of between-study heterogeneity within both SATs (median I²: 55%) and case series (median I²: 77%). When clinically heterogeneous studies were removed, effect size tended to be greater for case series, but not significantly so (risk difference -0.071, 95% CI -0.161, 0.019). Levels of within-group statistical heterogeneity remained high, suggesting that bias may have been a moderator of effect in both SATs and case series.

CONCLUSIONS: There do not appear to be systematic differences in outcome between SATs and case series. However, levels of heterogeneity in effect size are high within both designs, even after attempts to reduce clinical heterogeneity, indicating that bias may have an impact on outcomes. Future work should utilize larger samples and additional methods to further clarify the relative validity of single-arm designs.

Conference proceeding

Submission Processes and Requirements for Health Technology Assessment in Australia, Canada, England and Spain

YHEC authors: Emily Gregg, Charlotte Graham, Karina Watts, Karin Butler, Stuart Mealing
Publication date: November 2024
Conference: ISPOR EU, Barcelona
Type of conference proceeding: Poster

Abstract

OBJECTIVES: The health technology assessment (HTA) submission process is becoming increasingly diverse between countries. This study assesses the HTA requirements in Australia, Canada, England and Spain: four countries where pharmacoeconomic evidence forms an integral part of the value assessment. Technology developers can use these insights to identify where efficiencies can be made in the global market access strategy for new technologies, such as when to submit HTA dossiers.

METHODS: A pragmatic review and desk-based research were conducted in May 2024. Published articles, HTA guidelines, process documents, conference abstracts, and white papers were reviewed to identify country-specific processes. Where available, data were extracted about the general submission process and stakeholders involved (including regulatory, HTA and pricing authorities), as well as the clinical and pharmacoeconomic evidence requirements for HTA submission. Comparisons of the median time from marketing authorization to HTA decision within each country were also conducted. The key findings and between-country differences were synthesized in a narrative summary.

RESULTS: The review identified several areas with implications for market access strategy. All countries offer a parallel regulatory/HTA process. The median HTA review time between 2014 and 2018 was shortest in Australia (125 days) and longest in England (266 days). Australia demonstrated general consistency in HTA review time between submissions (interquartile range = 9 days), and England had the most variation in the duration of HTA reviews (interquartile range = 216 days). All countries require comparative clinical evidence within the indication and pharmacoeconomic evidence. A cost-utility analysis is the preferred analytical tool. However, England also readily accepts cost-effectiveness analysis.

CONCLUSIONS: While the median HTA review time varied between countries, similar requirements in clinical and pharmacoeconomic evidence allow efficiencies in the preparation of submission documentation. Future research should investigate the impact of the EU HTA Regulation on market access and how this could affect strategic decision making.

Conference proceeding

Technical Validation of an Environmental Model of Aurora EV-ICD: Recommendations to Guide Environmental Criteria in Health Technology Assessment

YHEC authors: Melissa Pegg
Publication date: November 2024
Conference: ISPOR EU, Barcelona
Type of conference proceeding: Podium

Abstract

OBJECTIVES: Climate breakdown is affecting human health globally. The National Health Service (NHS) generates 26 million tonnes of CO2e per annum, equivalent to Croatia's annual emissions. Healthcare suppliers possess a sizeable opportunity to support health technology environmental sustainability (HTES) underpinned by leaner pathways and resource optimization. HTA is developing approaches to include environmental sustainability (ES). Reporting a broad range of environmental outcomes is important, but there is a lack of published guidance. This study aims to develop recommendations for decision makers on how broader environmental criteria should be included in HTA, based on the technical validation of an environmental model of Aurora EV-ICD by Medtronic.

METHODS: The Medtronic environmental model estimates the impact of the device on CO2e, water usage and waste volumes over a 10-year time horizon. A multi-step process was used to internally validate the model, using a specifically designed checklist for HTES models being reported as an information conduit. Recommendations were provided to support appropriate reporting of the data to the NHS.

RESULTS: The model structure was deemed appropriate and suitable for submission to the NHS. Recommendations for reporting ES to the NHS include: 1) using and referencing multiple environmental management guidelines for reporting CO2e to ensure transparency and reproducibility; 2) applying the same principles to quantify other environmental outcomes that will enable a more holistic evaluation of unintended consequences, including human health impact, resource use and biodiversity loss; 3) using life cycle assessment software (such as OpenLCA) to overcome data challenges and to facilitate reporting appropriate and comprehensive environmental endpoint categories.

CONCLUSIONS: This multi-perspective collaboration between industry and researchers supports ES in HTA framework development. These recommendations can be used by decision makers to aid the inclusion of environmental models that support reporting a broader range of environmental outcomes over a long-term time horizon.
