Estimating the economic benefit of iCBT

A mapping function to aid prediction of QALY gains from routinely collected data


The COVID-19 pandemic has seen reports of increasing prevalence of mental health difficulties (Santomauro et al., 2021) and enormous growth in the use of tele- and digitally delivered healthcare (Bestsennyy et al., 2021). The traditional limitations in the availability of trained mental health professionals, and long wait-list times, mean that guided digitally delivered interventions for depression and anxiety are increasingly being employed to deliver services to those in need. Digitally delivered interventions, such as the SilverCloud® by Amwell® platform, have demonstrated good clinical outcomes in a scalable and potentially cost-effective way (Donker et al., 2015; Massoudi et al., 2019; Moshe et al., 2021; Paganini et al., 2018; Pauley et al., 2021; Richards et al., 2020). For organisations seeking to adopt digitally delivered services, the cost-effectiveness of an intervention is a critical factor in choosing between different approaches to care. It is therefore important to appropriately quantify the cost-effectiveness of digital mental health interventions, to ensure decisions are informed by the best available information.

Determining whether an intervention is cost-effective requires careful consideration of how to quantify its potential long-term benefits for an individual’s wellbeing. Despite the proliferation of digital mental health interventions, there is a scarcity of tools available to reliably estimate the long-term benefits of any given treatment. Here, we describe the development of a mapping function to aid prediction of health-related quality-of-life changes from routinely collected data, making it possible to produce cost-effectiveness estimates that are potentially directly comparable across different contexts and interventions.


Cost-effectiveness and cost-utility analyses 

Cost-effectiveness analyses seek to compare the monetary costs of different interventions, relative to their effectiveness in treating the condition of interest (e.g., reduction in symptoms of depression). However, these analyses are often limited by a reliance on symptom change as a measure of effectiveness, which creates two challenges. First, focusing only on symptom change can overlook broader changes in health-related quality of life for an individual. This is particularly notable in the context of mental health conditions, where there is an increasing focus on the importance of patients’ experiences of recovery (Keetharuth et al., 2018). Second, when different effectiveness measures are used with different interventions, it is not possible to directly compare them.

Cost-utility analyses are a sub-type of cost-effectiveness analysis that focus on preference-based measures of health status, rather than symptom change. Preference-based measures compare an individual patient’s current health state with country-specific health preferences (e.g. generated via surveys of general population preferences for different combinations of health problems), creating ‘utility scores’. Utility scores can then be combined with estimated years of life over which a particular health state is lived, to estimate ‘quality-adjusted life years’ (QALYs). As such, the QALY provides a valuable metric for directly comparing the health-related benefits of different treatments on the lives of those receiving them. 
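The step from utility scores to QALYs can be illustrated with a minimal sketch. It assumes the conventional scale on which a utility of 1 represents full health, and it treats each health state as held constant for a stated duration. The function name and the example profile are hypothetical and are not taken from any study cited here.

```python
def total_qalys(health_states):
    """Sum utility x duration over a sequence of (utility, years) pairs."""
    total = 0.0
    for utility, years in health_states:
        if years < 0:
            raise ValueError("duration in years must be non-negative")
        total += utility * years
    return total


# Hypothetical profile: 6 months at utility 0.60, then 18 months at utility 0.80.
example_profile = [(0.60, 0.5), (0.80, 1.5)]
example_qalys = total_qalys(example_profile)
print(f"{example_qalys:.2f} QALYs over 2 years")
```

In this illustration, two years lived in less than full health yield fewer than two QALYs, which is what allows treatments that improve health status to be compared on a common scale.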

Organisations such as the National Institute for Health and Care Excellence (NICE) in the UK, and the Institute for Clinical and Economic Review (ICER) in the US recommend ‘cost per QALY’ analysis as part of their guidance for assessing health innovations (NICE, 2022a; ICER, 2022). Cost per QALY can be calculated by dividing the difference in cost between two treatments, by the difference in QALYs. The cost per QALY is compared against a ‘cost-effectiveness threshold’, which is a pre-specified monetary value below which an innovative treatment is considered ‘cost-effective’, i.e., a use of resources that is considered value for money. This means that even if a new treatment is more expensive than an existing treatment, if it demonstrates a greater effect on health-related quality of life over a specified time horizon, it may be deemed cost-effective if the cost per QALY gained is below the cost-effectiveness threshold. 
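The cost per QALY calculation and the threshold comparison described above can be sketched as follows. This is an illustration only: the function names, costs, QALY values and threshold are hypothetical, and the sketch covers only the case where the new treatment yields more QALYs than the comparator.

```python
def cost_per_qaly(cost_new, cost_comparator, qalys_new, qalys_comparator):
    """Difference in cost between two treatments divided by the difference in QALYs."""
    delta_qalys = qalys_new - qalys_comparator
    if delta_qalys <= 0:
        # Other cases need separate interpretation and are outside this sketch.
        raise ValueError("sketch assumes the new treatment gains QALYs")
    return (cost_new - cost_comparator) / delta_qalys


def is_cost_effective(ratio, threshold):
    """True when the cost per QALY gained falls below the chosen threshold."""
    return ratio < threshold


# Hypothetical figures: the new treatment costs 200 more and gains 0.05 QALYs.
example_ratio = cost_per_qaly(
    cost_new=500.0, cost_comparator=300.0, qalys_new=0.75, qalys_comparator=0.70
)
example_threshold = 20_000  # hypothetical threshold, in the same currency as the costs
print(f"Cost per QALY gained: {example_ratio:,.0f}")
print("Cost-effective:", is_cost_effective(example_ratio, example_threshold))
```

Here the new treatment is the more expensive option, yet its cost per QALY gained sits below the threshold, so it would be considered cost-effective under these assumed figures.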

For example, in a recently drafted update to NICE guidelines for the management of depression in adults, computerised CBT was more costly than treatment as usual, but was also more effective and was judged likely to be cost-effective:

“Based on a NHS perspective, computerised CBT was more costly and more effective than treatment as usual, with an ICER [incremental cost-effectiveness ratio] ranging from £2,678 to £10,614 per QALY (depending on package, uplifted to 2020 prices). The probability of computerised CBT being cost-effective ranged from 0.54 to 0.87 at a cost effectiveness threshold of £44,000 per QALY, suggesting that computerised CBT may overall be a cost-effective intervention.” (NICE, 2022b)

This resulted in computerised CBT being included in new UK NICE guidelines as an alternative to group therapy for individuals with mild depression. 

In the US, ICER has not issued a formal analysis of computerized CBT for depression or anxiety. However, an ICER evaluation found the integration of behavioral health care for depression and anxiety into primary care (behavioral health integration, BHI; of which digital interventions form a part) to be generally cost-effective:

“...findings from multiple evaluations across a variety of integration models and populations suggest that BHI falls within generally-acceptable thresholds for cost-effectiveness ($15,000 - $80,000 per QALY gained vs. usual care).” (ICER, 2015)

ICER also note the importance of improving the quality of economic evaluations of behavioural health strategies, and identify computerized CBT as a technological intervention to be assessed further (ICER, 2015).


Developing a mapping function to aid calculation of QALYs from routine outcome measures

Despite the benefits of QALYs, many clinical studies and routine data collection in healthcare services do not include a preference-based measure, limiting the ability to assess cost-effectiveness based on QALYs. To address this issue, a new statistical tool has been developed which can predict (or ‘map’) utility scores from data that are routinely collected in healthcare settings, enabling estimation of QALYs (Franklin & Alava Hernández, in prep). This mapping function was developed using data from SilverCloud®’s large randomised controlled trial of iCBT programmes for depression and anxiety in an NHS trust in the UK (Richards et al., 2020), which included a preference-based measure (EQ-5D Five-Level; Herdman et al., 2011). The resulting best-fitting model was able to estimate utility scores with reasonable goodness-of-fit and low predictive errors, using the summary scores from the Patient Health Questionnaire-9 (PHQ-9; depression severity) and the Generalized Anxiety Disorder-7 (GAD-7; anxiety severity).

How the mapping function is used

One important use of the mapping function is to predict utility scores in order to calculate QALYs from studies and datasets where a preference-based measure was not originally included, but the PHQ-9 and GAD-7 were collected. This innovation unlocks the potential for cost-effectiveness analysis across a wide range of datasets, collected in different contexts, that include only routine clinical outcome measures. To calculate the economic benefit of iCBT for an individual organisation, the mapping function is first used to estimate utility scores from the PHQ-9 and GAD-7 scores observed before and after the intervention. The utility scores can then be used to calculate the associated QALYs.

One approach to converting estimated gains in health-related quality of life to an economic impact for an organisation is to use nationally relevant cost-effectiveness thresholds. If we assume the monetary value of one QALY is equivalent to the local estimated cost-effectiveness threshold, the average number of QALYs gained can be used to determine a ‘monetary gain’ for each individual using the intervention. This method has some limitations worth noting. In particular, applying this method to a non-randomised study means that potential bias in the sample is uncontrolled (i.e., there is no matched control group against which results can be directly compared). This may limit the generalisability of the findings to novel samples, although more work is underway to address these limitations.

Value of the mapping function

Healthcare systems, payors, employers, individuals and other responsible parties will likely find this type of function helpful as they evaluate potential therapies and treatments for depression and anxiety. Additionally, the function can be used retrospectively to evaluate actual results for a population without needing additional data from other sources.

UK case study

The SilverCloud® platform assesses users’ clinical outcomes (symptoms of depression and anxiety measured using the PHQ-9 and GAD-7) at various points in their mental health journeys. We applied the mapping function to PHQ-9 and GAD-7 data (EQ-5D-5L crosswalk; Van Hout et al., 2012) to examine changes in QALYs for several customers. One such customer, an NHS Foundation Trust in the UK, had 1,904 users in 2021 with a scorable engagement with SilverCloud® programmes (i.e., completion of two or more PHQ-9 and GAD-7 questionnaires during treatment).

Pre-treatment PHQ-9 and GAD-7 scores predicted 1,204.25 QALYs, increasing to 1,319.85 QALYs post-treatment. This gain of 115.6 QALYs equates to a 0.061 QALY gain per patient. Assuming a £25,000 cost-effectiveness threshold (NICE, 2022a), the change in QALYs from pre- to post-SilverCloud® equated to a monetised gain of £1,518 per user (i.e. 115.6 QALYs / 1,904 users * £25,000 = £1,518 per user, calculated from the unrounded per-patient gain; see figure).

US case study

The SilverCloud® platform assesses users’ clinical outcomes (symptoms of depression and anxiety measured using the PHQ-9 and GAD-7) at various points in their mental health journeys. We applied the mapping function to PHQ-9 and GAD-7 data (EQ-5D-5L US value set) to examine changes in QALYs for several customers. One such customer, a large provider organization, had 1,096 users in 2021 with a scorable engagement with SilverCloud® (i.e., completion of two or more PHQ-9 and GAD-7 questionnaires during treatment).

Pre-treatment PHQ-9 and GAD-7 scores predicted 776.51 QALYs, increasing to 834.41 QALYs post-treatment. This gain of 57.90 QALYs equates to a 0.053 QALY gain per patient. Assuming a $50,000 willingness-to-pay threshold (ICER, 2022), the change in QALYs from pre- to post-SilverCloud® equated to a monetised gain of $2,650 per user (i.e. 0.053 QALYs * $50,000 = $2,650 per user, see figure).
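The arithmetic behind the two case studies can be checked with a short sketch. It uses only the totals, user counts and thresholds reported above; the function name is hypothetical, and the sketch does not reproduce the mapping function itself.

```python
def monetised_gain_per_user(qalys_pre, qalys_post, n_users, threshold):
    """Return (QALY gain per user, monetised gain per user)."""
    gain_per_user = (qalys_post - qalys_pre) / n_users
    return gain_per_user, gain_per_user * threshold


# Totals, user counts and thresholds as reported in the UK and US case studies.
uk_gain, uk_value = monetised_gain_per_user(1204.25, 1319.85, 1904, 25_000)
us_gain, us_value = monetised_gain_per_user(776.51, 834.41, 1096, 50_000)

print(f"UK: {uk_gain:.3f} QALYs per user, £{uk_value:,.0f} per user")
print(f"US: {us_gain:.3f} QALYs per user, ${us_value:,.0f} per user")
```

Carried through without intermediate rounding, this gives approximately £1,518 per user for the UK case and approximately $2,641 per user for the US case. The $2,650 quoted for the US case results from rounding the per-user gain to 0.053 before multiplying by the threshold.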

Future plans

Amwell® will continue to use this mapping function to work with customers to understand the QALYs generated for their populations and the scale of economic impacts achieved. Further efforts will be undertaken to determine the relationships between QALYs and additional metrics, such as medical costs, within treated populations. The aim is to understand the impact of cost-effective mental health treatment not only on the life of the individual, but also on the potential broader healthcare cost savings associated with improved mental wellbeing.



Bestsennyy, O., Gilbert, G., Harris, A., & Rost, J. (2021). Telehealth: a quarter-trillion-dollar post-COVID-19 reality. McKinsey & Company. 

Donker, T., Blankers, M., Hedman, E., Ljótsson, B., Petrie, K., & Christensen, H. (2015). Economic evaluations of Internet interventions for mental health: a systematic review. Psychological Medicine, 45(16), 3357–3376.

Franklin, M., & Alava Hernández, M. (2022). Enabling QALY estimation in mental health trials and care settings: mapping from the PHQ-9 and GAD-7 to the ReQoL-UI or EQ-5D-5L UK and US utility scores using mixture models. Paper draft in progress.

Keetharuth, A. D., Brazier, J., Connell, J., Bjorner, J. B., Carlton, J., Taylor Buck, E., Ricketts, T., McKendrick, K., Browne, J., Croudace, T., & Barkham, M. (2018). Recovering Quality of Life (ReQoL): a new generic self-reported outcome measure for use with people experiencing mental health difficulties. The British Journal of Psychiatry, 212(1), 42–49.

ICER. (2015). Institute for Clinical and Economic Review: Integrating Behavioral Health into Primary Care: A Technology Assessment.

ICER. (2022). Institute for Clinical and Economic Review: 2020-2023 Value Assessment Framework.

Massoudi, B., Holvast, F., Bockting, C. L. H., Burger, H., & Blanker, M. H. (2019). The effectiveness and cost-effectiveness of e-health interventions for depression and anxiety in primary care: A systematic review and meta-analysis. Journal of Affective Disorders, 245, 728–743.

Moshe, I., Terhorst, Y., Philippi, P., Domhardt, M., Cuijpers, P., Cristea, I., Pulkki-Råback, L., Baumeister, H., & Sander, L. B. (2021). Digital interventions for the treatment of depression: A meta-analytic review. Psychological Bulletin, 147(8).

NICE. (2022a). Evidence standards framework for digital health technologies.

NICE. (2022b). Depression in adults: [B] Treatment of a new episode of depression. NICE guideline NG222.

Paganini, S., Teigelkötter, W., Buntrock, C., & Baumeister, H. (2018). Economic evaluations of internet- and mobile-based interventions for the treatment and prevention of depression: A systematic review. Journal of Affective Disorders, 225, 733–755.

Pauley, D., Cuijpers, P., Papola, D., Miguel, C., & Karyotaki, E. (2021). Two decades of digital interventions for anxiety disorders: A systematic review and meta-analysis of treatment effectiveness. Psychological Medicine.

Richards, D., Duffy, D., Blackburn, B., Earley, C., Enrique, A., Palacios, J., ... & Timulak, L. (2018). Digital IAPT: the effectiveness & cost-effectiveness of internet-delivered interventions for depression and anxiety disorders in the Improving Access to Psychological Therapies programme: study protocol for a randomised control trial. BMC Psychiatry, 18(1), 1–13.

Santomauro, D. F., Mantilla Herrera, A. M., Shadid, J., Zheng, P., Ashbaugh, C., Pigott, D. M., Abbafati, C., Adolph, C., Amlag, J. O., Aravkin, A. Y., Bang-Jensen, B. L., Bertolacci, G. J., Bloom, S. S., Castellano, R., Castro, E., Chakrabarti, S., Chattopadhyay, J., Cogen, R. M., Collins, J. K., … Ferrari, A. J. (2021). Global prevalence and burden of depressive and anxiety disorders in 204 countries and territories in 2020 due to the COVID-19 pandemic. The Lancet, 398(10312), 1700–1712.

Van Hout, B., Janssen, M. F., Feng, Y. S., Kohlmann, T., Busschbach, J., Golicki, D., ... & Pickard, A. S. (2012). Interim scoring for the EQ-5D-5L: mapping the EQ-5D-5L to EQ-5D-3L value sets. Value in Health, 15(5), 708–715.


About the Authors

Derek Richards, Ph.D.

Chief Science Officer

Dr. Richards is part of the executive team at SilverCloud Health. As Chief Science Officer, Derek leads on the strategic research objectives for the company. He is also director of the e-mental health research group in the School of Psychology, Trinity College Dublin. Since 2002, he has been extensively involved in clinical research, development and implementation of technology-delivered interventions for mental health. Over the last 15 years, Derek has built a team of world-class scientists whose published work is cited widely and whose impact has spearheaded continued innovation in digital mental health care.


Jorge Palacios, MD, PhD

Senior Digital Health Scientist

Jorge is a key driver in the research strategy and agenda at SilverCloud through the design, execution, and publication of our trials, also working closely with our collaborators across the world which include some of the most highly cited and regarded researchers in the field of psychology and digital mental health. He is a thought leader in the space and a keynote speaker in principal academic and commercial conferences and global events on digital therapeutics. He completed his medical degree in Mexico City, and thereafter won scholarships to undertake a Masters in Psychiatric Research and PhD in Psychological Medicine in London at the Institute of Psychiatry, Psychology, and Neuroscience (IoPPN), before joining SilverCloud in 2017. He has earned numerous awards such as the Young Investigator of the Year prize from Elsevier and the European Association of Psychosomatic Medicine. Jorge believes passionately in the potential of digital interventions to improve the mental health of populations at a large scale, helping more and more individuals have access to wellbeing solutions that fit their particular needs.


Katherine Young, PhD

Digital Health Scientist

Katie Young is a Digital Health Scientist at SilverCloud. She received her PhD in Psychiatry from the University of Oxford and has over a decade of experience in mental health research, most recently as a Lecturer at King’s College London. At SilverCloud, her research interests focus on using digital interventions to better understand and treat mental health problems in diverse populations, with a particular focus on children and young people’s mental health.


Michael Anselmo

Vice President of Healthcare Economics