Natalie Nelissen is the evidence and evaluation lead for mHabitat. In addition to its ongoing commitment to supporting people-centred digital innovation in health and social care, mHabitat is partnering with the Yorkshire & Humber Academic Health Science Network (AHSN) to deliver Propel@YH – the region’s first digital health accelerator programme, which will help SMEs innovating in digital health to navigate the complex healthcare landscape and build an NHS-relevant business case.
While the number of tools for digital health and care – such as apps, websites, software and wearables – is growing rapidly, the evidence to show these tools are actually beneficial is lagging behind.
For example, there are over 318,000 health apps available, yet only 22 stand-alone health apps have reported randomised controlled trials (RCTs, the gold standard for evidence-based medicine). Of these, 11 were able to demonstrate some meaningful impact on health, and overall the evidence was considered to be of very low quality1.
Broadly speaking, there seems to be a distinction between commercially produced digital tools and those developed within academic or government institutions. The latter have more potential health benefits, but are slower to develop, are often not very good at engaging users, and don’t always make it to market2.
In addition, high-quality digital tools may not be the first ones presented to potential users. For example, the highest-ranked smoking cessation apps in the app stores were of poor quality, while high-quality ones ranked lowest3.
There is debate as to whether digital health can, and should, be held to evidence-based medicine standards – similar to drugs or therapies. In a publicly funded healthcare system, digital interventions compete directly with non-digital ones for the scarce resources available, and therefore should be evaluated by the same standards4.
Traditional evidence collection methods, such as RCTs, usually require a stable intervention and a controlled environment over several years. Digital trials, however, can recruit people more quickly, collect real-time and continuous data, and remove the need for test locations and trained staff. A less controlled, noisier trial can be offset by richer, more naturalistic data. New, more flexible and iterative evaluation designs are currently being tested (e.g. MOST and SMART), as well as other means of deploying evaluations digitally (e.g. ResearchKit)5.
Sleepio, a digital platform to treat insomnia, shows it is possible to run an RCT, and even has a placebo control in place6. Placebo-controlled trials are almost non-existent in digital health, meaning that some observed benefits may be due to a placebo effect rather than a real impact of the specific tool, especially considering the strong relationship people have with their smartphones1.
Another important bias is that the people participating in digital health trials are often not representative of the general population, and may even be those least in need of additional or alternative support; health app users tend to be younger, more highly educated, in better health and on a higher income7.
While digital health science is still in its infancy, consensus among experts is slowly growing and, as a result, best practice guidelines and policies have been proposed – though these are likely to change as the field rapidly evolves.
The WHO has recommended the mobile health evidence reporting and assessment (mERA) checklist to improve the reporting of mobile health interventions, which should help ensure they can be replicated8.
Various criteria have been proposed for rating the quality of digital health tools, such as the Mobile App Rating Scale (MARS)9, the Royal College of Physicians’ checklist10, the APEASE criteria11 and the Digital Assessment Questionnaire (DAQ)12.
These criteria cover multiple domains, including effectiveness, but don’t set explicit expectations for what best practice evidence looks like. Filling this information gap, NICE has recently published the first version of its evidence standards for the effectiveness and economic value of digital health technologies13.
The NICE framework classifies tools according to their function and associated potential risk, ranging from low-risk transactional tools (such as booking an appointment) to high-risk diagnostic or therapeutic replacements (such as online therapy). Riskier tools require more rigorous evidence of their benefits. This balancing of anticipated benefits against risks reflects how healthcare professionals – and, at a higher level, their organisations and commissioning groups – tend to select any treatment.
The framework stays deliberately vague on exact research designs; RCTs, for example, are mentioned but not presented as the only, or best, approach. All the guidelines mentioned above, including NICE, MARS and the DAQ, recognise that (cost) effectiveness does not exist in a bubble and should not be treated as an independent component.
The most obvious interdependent domains are clinical safety (risk assessment and mitigation) and user engagement (including user-friendliness and retention). The latter is especially important to consider alongside effectiveness: if people are not using the digital tool correctly (for example, not frequently enough, or not using all components), the expected benefits will be smaller (for example, a reduced or no effect on health). Related considerations are large-scale processes such as the adoption of a tool (will people be motivated and able to use it?) and its sustainability (will the tool still be relevant and available in the future?). The NASSS framework invites developers and implementers to reflect on these topics, as well as other important considerations such as digital exclusion and organisational context14.
The current guidelines and frameworks are mainly written by experts in the field, often working in academia or government organisations. However, most digital tools are developed outside this sector, with little or no access to expert support. The guidelines also describe an end outcome, such as an RCT-like study, without the often lengthy process needed to get there (including feasibility studies and process evaluation).
Moreover, software developers have a different perspective and approach compared to researchers – for example, favouring continuous refinement and integration over static, controlled testing15. The commercial sector may also question the return on the considerable investment associated with complying with these guidelines: will it make a large enough difference to their ability to sell their products? High-profile failures of previous attempts to accredit apps do not help; for example, 12 of the 14 depression apps recommended in an earlier version of the NHS apps library lacked good practice evidence of effectiveness16. The gap between (commercial) development and the existing knowledge base urgently needs to be bridged, with information flowing both ways.
Best practice guidance for digital health is starting to emerge but is likely to keep evolving in the foreseeable future, as feedback on its implementation and new developments in research methods and technology become available. Evidence for the effectiveness and cost-effectiveness of a digital tool should not be considered in isolation, but rather as part of a larger evaluation encompassing domains such as user engagement, clinical and data safety, adoption and sustainability. A closer dialogue is needed between those who develop digital tools and those who create and interpret the knowledge base.
For more information on evidence in health, visit our evid.health pages.
Sources:
1 Byambasuren O, Sanders S, Beller E, Glasziou P. Prescribable mHealth apps identified from an overview of systematic reviews. npj Digital Medicine. 2018 May 9;1(1):12. https://www.nature.com/articles/s41746-018-0021-9
2 Anthes E. Mental health: there’s an app for that. Nature News. 2016 Apr 7;532(7597):20. https://www.nature.com/news/mental-health-there-s-an-app-for-that-1.19694
3 Wyatt JC. How can clinicians, specialty societies and others evaluate and improve the quality of apps for patient use? BMC Medicine. 2018 Dec;16(1):225. https://bmcmedicine.biomedcentral.com/articles/10.1186/s12916-018-1211-7
4 Greaves F, Joshi I, Campbell M, Roberts S, Patel N, Powell J. What is an appropriate level of evidence for a digital health intervention? The Lancet. 2018 Dec 22;392(10165):2665-7. https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(18)33129-5/fulltext
5 Pham Q, Wiljer D, Cafazzo JA. Beyond the randomized controlled trial: a review of alternatives in mHealth clinical trial methods. JMIR mHealth and uHealth. 2016 Jul;4(3). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5035379/
6 Espie CA, Kyle SD, Williams C, Ong JC, Douglas NJ, Hames P, Brown JS. A randomized, placebo-controlled trial of online cognitive behavioral therapy for chronic insomnia disorder delivered via an automated media-rich web application. Sleep. 2012 Jun 1;35(6):769-81. https://www.ncbi.nlm.nih.gov/pubmed/22654196
7 Carroll JK, Moorhead A, Bond R, LeBlanc WG, Petrella RJ, Fiscella K. Who uses mobile phone health apps and does use matter? A secondary data analytics approach. Journal of medical Internet research. 2017 Apr;19(4). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5415654/
8 Agarwal S, LeFevre AE, Lee J, L’Engle K, Mehl G, Sinha C, Labrique A. Guidelines for reporting of health interventions using mobile phones: mobile health (mHealth) evidence reporting and assessment (mERA) checklist. BMJ. 2016;352:i1174. http://www.nweurope.eu/media/3204/mhealth-evidence-reporting-and-assessment-mera-checklist__bmj2016.pdf
9 Stoyanov SR, Hides L, Kavanagh DJ, Zelenko O, Tjondronegoro D, Mani M. Mobile app rating scale: a new tool for assessing the quality of health mobile apps. JMIR mHealth and uHealth. 2015 Jan;3(1). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4376132/
10 Wyatt JC, Thimbleby H, Rastall P, Hoogewerf J, Wooldridge D, Williams J. What makes a good clinical app? Introducing the RCP Health Informatics Unit checklist. Clinical Medicine. 2015 Dec 1;15(6):519-21. http://www.clinmed.rcpjournal.org/content/15/6/519
11 Michie S, Atkins L, West R. The Behaviour Change Wheel: a guide to designing interventions (summary, including the APEASE criteria). https://www.ucl.ac.uk/behaviour-change/files/bcw-summary.pdf
12 NHS Digital. Digital Assessment Questionnaire (DAQ). https://developer.nhs.uk/digital-tools/daq/
13 NICE. Evidence standards framework for digital health technologies. https://www.nice.org.uk/about/what-we-do/our-programmes/evidence-standards-framework-for-digital-health-technologies
14 Greenhalgh T, Wherton J, Papoutsi C, Lynch J, Hughes G, A’Court C, Hinder S, Fahy N, Procter R, Shaw S. Beyond adoption: a new framework for theorizing and evaluating nonadoption, abandonment, and challenges to the scale-up, spread, and sustainability of health and care technologies. Journal of medical Internet research. 2017 Nov;19(11). https://www.ncbi.nlm.nih.gov/pubmed/29092808
15 Peiris D, Miranda JJ, Mohr DC. Going beyond killer apps: building a better mHealth evidence base. BMJ Global Health. 2018;3(1):e000676. https://gh.bmj.com/content/3/1/e000676
16 Leigh S, Flatt S. App-based psychological interventions: friend or foe? Evidence Based Mental Health. 2015;18(4):97. https://ebmh.bmj.com/content/18/4/97