class: center, middle .pull-left[ # Funnel Plots <br><br> __Chris Mainey__ <br> Senior Statistical Intelligence Analyst <br> Healthcare Evaluation Data (HED) .small[Univeristy Hospitals Birmingham NHS FT] <br><br> [www.hed.nhs.uk](https://www.hed.nhs.uk) ] .pull-right[ <img src="https://github.com/chrismainey/FunnelPlotR/raw/master/man/figures/logo.png" width=250 fig.align="center"> <img src="https://chrismainey.github.io/Building_funnel_plots_in_R/Assets/logo.png" width=250 fig.align="center"> ] --- ## Healthcare Evaluation Data (HED) <a href="https://www.hed.nhs.uk">www.hed.nhs.uk </a> .pull-left[ - Online hospital benchmarking system - Statistical models and analysis tools - Activity, Mortality, Readmissions, Length-of-Stay, Marketshare etc. - Used by ~60 NHS and other organisations - Training and support <br><br> - __Using national NHS HES data__ ] .pull-right[ <br> <img src="https://chrismainey.github.io/EARL_2019_presentation/assets/HED_system.png"> <img src="https://chrismainey.github.io/EARL_2019_presentation/assets/HED_system.png"> ] --- # HED R Training We offer a variety of R training courses, both public or onsite. + Two day introduction course, public or onsite. __(24th-25th March, 9th-10th June)__ + We also offer other courses, including: + Introduction to R Markdown - __26th Feb__ + Machine Learning methods in R __28th - 29th April__ + Regression Modelling in R - __22nd - 23rd September__ + R Essentials - __20th October__ <br> Discounted price for HED customers <br> More info, or book at: https://www.hed.nhs.uk/Info/hed-courses.aspx --- # Overview + What's a funnel plot? + What's overdispersion? + What do we do about it? -- <br><br> + _If you are interested (or if we have time):_ + How to set up and build an R package + What I've learned from the process + How to publish it to CRAN <br> [https://github.com/chrismainey/FunnelPlotR](https://github.com/chrismainey/FunnelPlotR) --- ## Context: ### Monitoring Standardised Mortality + How do we compare between organisations with different patients? -- + Direct standardisation: + Adjust all to common standard (e.g. European Standard Popn) + Indirectly Standardised Ratios: + Predict an expected rate for each unit, based on average (regression model) + Case-mix adjustment $$ SR=\frac{\Sigma{(\text{observed deaths})}}{\Sigma{(\text{predicted deaths})}} $$ -- + <b>Predicted = Expected, but `\(\ne\)` 'preventable'</b> --- # Standardised Ratio example + 10 identical patients: + Risk factor of model + E.g. age, sex, primary diagnosis etc. -- + If these patients had a probability of death of 0.3 (30%) + Expected deaths = 10 * 0.3 = 3 -- + If we then observed 4 deaths: + `\(>1\)` = higher than expected + `\(<1\)` = lower than expected --- class: middle, center # Question: ### How can we present these data appropriately? --- # Visualising Uncertainty in SMRs + Confidence intervals + Statistical process control charts + Funnel plots (Speigelhalter, 2005a) ``` ## $plot ``` <img src="Building_funnel_plots_in_R_files/figure-html/plot1-1.png" width="504" style="display: block; margin: auto;" /> --- ## Funnel Limits and Overdispersion + Plot above is based on Poisson disribution, used for counts -- + Poisson distribution has fixed variance, mean and variance = `\(\mu\)`: `$$Pr(Y=y) = \frac{\mu^ye^{-\mu}}{y!}$$` __where:__ `\(\mu\)` is the expected average count or rate, + `\(e\)` is Euler’s number (the base of the natural logarithm: ~2.71828), + and `\(y!\)` is the factorial of y. -- ### "Real-world" data isn't perfectly Poisson distributed! + Causes us to underestimate the error in the data + Often caused by under-specification, poor parameterisation, clustering, aggregation --- # Clustering of organisations .pull-left[ + Regression assume independence + We've got repeated measurements form the same organsiation. + Both _within_ trust variation `\(\sigma\)` <br> + And _between_ trust variation `\(\tau\)` + Can model with a __random-intercept__ Traditional regression model has single-intercept ] .pull-right[ <br><br><br> <img src="Building_funnel_plots_in_R_files/figure-html/clustersetup-1.png" width="360" style="display: block; margin: auto;" /> ] --- # Random-intercepts <img src="Building_funnel_plots_in_R_files/figure-html/rintplot2-1.png" width="792" style="display: block; margin: auto;" /> --- # z-scores (1) + Compare different indicators on standard scale + Scale indicators by their standard deviation: `\(\frac{Y - Target}{\sigma^2}\)` -- + This is only valid for normally distributed data + Proportions, counts & standardised ratios not normally distributed -- <br> ### Transformations: + __CQC/Spiegelhalter:__ square-root + Winsorization + __SHMI:__ natural logarithm + truncation -- ### Dispersion ratio is calculated on winsorised/truncated scores: $$ \phi = \frac{\sum_{i=1}^n Z_i^2}{n}$$ --- # z-scores (2) The dispersion ratio ( `\(\phi\)` ): + <=1, the set `\(\tau^2 = 0\)` + Otherwise: -- `$$\tau^2 = \frac{(n \phi-(n-1))}{\sum_{i=1}^n W_i - (\sum_{i=1}^n W_i^2 / \sum_{i=1}^n W_i)}$$` _Where:_ + `\(W_i\)` is `\(1/\sigma^2\)`, the within Trust ( `\(i\)` ) standard deviation + `\(\phi\)` is the dispersion ratio <br><br> ___These techniques are imprecise estimates of random intercepts in models___ --- # Funnel Plots - Control limits commonly 95 and 99% (≈ 2 and 3 σ ) - OD issue: - Limits are independent of data, and relate to sample size (predicted) - Limits are too tight. -- - We can inflate them using the `\(\tau^2\)` calculated above. - Control limits are then, functionally: `\(1 \pm z * (\frac{\sigma + \tau}{n})\)` - Taken >= 99.8% limits --- ## Seeing the plot again: __Poisson limit__ ```r library(FunnelPlotR) # lets use NHSD's SHMI dataset: https://digital.nhs.uk/data-and-information/publications/clinical-indicators/shmi/current/shmi-data SHMIdt <- read.csv("Data/SHMI data at trust level, Oct18-Sep19 (csv).csv") SHMIdt$PROVIDER_NAME <- factor(SHMIdt$PROVIDER_NAME) funnel_plot(numerator=SHMIdt$OBSERVED, denominator=SHMIdt$EXPECTED, group = SHMIdt$PROVIDER_NAME, title = "Funnel plot of NHSDigital's SHMI data Oct18-Spet19", Poisson_limits = TRUE, OD_adjust = FALSE,label_outliers = TRUE, return_elements = "plot") ``` ``` ## $plot ``` <img src="Building_funnel_plots_in_R_files/figure-html/plot2-1.png" width="504" style="display: block; margin: auto;" /> --- ## Seeing the plot again: __OD adjustment__ ```r library(FunnelPlotR) # lets use NHSD's SHMI dataset: https://digital.nhs.uk/data-and-information/publications/clinical-indicators/shmi/current/shmi-data SHMIdt <- read.csv("Data/SHMI data at trust level, Oct18-Sep19 (csv).csv") SHMIdt$PROVIDER_NAME <- factor(SHMIdt$PROVIDER_NAME) funnel_plot(numerator=SHMIdt$OBSERVED, denominator=SHMIdt$EXPECTED, group = SHMIdt$PROVIDER_NAME, title = "Funnel plot of NHSDigital's SHMI data Oct18-Spet19", Poisson_limits = TRUE, OD_adjust = TRUE,label_outliers = TRUE, return_elements = "plot") ``` ``` ## $plot ``` <img src="Building_funnel_plots_in_R_files/figure-html/plot3-1.png" width="504" style="display: block; margin: auto;" /> --- # Issues with Funnel Plots + Are we asking the right question? + Implicit ranking + Which limits do we use? + Overdispersion + Assumed to be clustering + Adjustment proposed (Spiegelhalter, 2005b) + Alternative used by NHSD (with log/truncation) + Strictly cross-sectional <br> We could do it better with a mixed model! --- # Thank you for your time! .pull-left[ + `FunnelPlotR` 📦 + Available on CRAN. + Contributions and bug reports welcome! [https://chrismainey.github.io/FunnelPlotR/](https://github.com/chrismainey/FunnelPlotR) [https://github.com/chrismainey/FunnelPlotR](https://chrismainey.github.io/FunnelPlotR/) ] .pull-right[ <img src="https://github.com/chrismainey/FunnelPlotR/raw/master/man/figures/logo.png" width=200 fig.align="center"> ] <br> + HED R Training: [https://www.hed.nhs.uk/Info/hed-courses.aspx](https://www.hed.nhs.uk/Info/hed-courses.aspx) + Intro to R + Rmarkdown + Regression modelling + Machine Learning in R --- # References .reference[ + CAMPBELL, M.J., JACQUES, R.M., FOTHERINGHAM, J., MAHESWARAN, R. and NICHOLL, J., 2012. Developing a summary hospital mortality index: retrospective analysis in English hospitals over five years. *BMJ,* **344** , pp. e1001 + KEOGH, B. 2013. *Keogh review on hospital deaths published - NHSUK* [Online]. Department of Health. Available: https://www.nhs.uk/NHSEngland/bruce-keogh-review/Documents/outcomes/keogh-review-final-report.pdf [Accessed 13/11/2017 2017]. + LILFORD, R. and PRONOVOST, P., 2010. Using hospital mortality rates to judge hospital performance: a bad idea that just won't go away. *BMJ,* **340** , pp. c2016. + LILFORD, R., MOHAMMED, M.A., SPIEGELHALTER, D. and THOMSON, R., 2004. Use and misuse of process and outcome data in managing performance of acute medical care: avoiding institutional stigma. *The Lancet,* **363** (9415), pp. 1147-1154. + SPIEGELHALTER, D., GRIGG, O., KiINSMAN, R., FAREWELL, V., T., TREASURE, T., Risk-adjusted sequential probability ratio tests: applications to Bristol, Shipman, and adult cardiac surgery. I *Int J Qual Health Care;* **15** , pp 7-13 + SPIEGELHALTER, D.J., 2005a. Funnel plots for comparing institutional performance. *Stat Med,* **24** (8), pp. 1185-1202. + SPIEGELHALTER, D.J., 2005b. Handling over-dispersion of performance indicators. *Qual Saf Health Care,* **14** (5), pp. 347-351 [https://www.nhs.uk/NHSEngland/bruce-keogh-review/Documents/outcomes/keogh-review-final-report.pdf](https://www.nhs.uk/NHSEngland/bruce-keogh-review/Documents/outcomes/keogh-review-final-report.pdf) ]