Introduction to R

class: center, middle

.pull-left[
# Funnel Plots
 
__Chris Mainey__ 
 
 Senior Statistical Intelligence Analyst 
 
Healthcare Evaluation Data (HED)

.small[Univeristy Hospitals Birmingham NHS FT]
 
[www.hed.nhs.uk](https://www.hed.nhs.uk)
]

.pull-right[
<img src="https://github.com/chrismainey/FunnelPlotR/raw/master/man/figures/logo.png" width=250 fig.align="center">
<img src="https://chrismainey.github.io/Building_funnel_plots_in_R/Assets/logo.png" width=250 fig.align="center">
]

---

## Healthcare Evaluation Data (HED)

.pull-left[
- Online hospital benchmarking system
- Statistical models and analysis tools
- Activity, Mortality, Readmissions, Length-of-Stay, Marketshare etc.
- Used by ~60 NHS and other organisations
- Training and support
 
- __Using national NHS HES data__
]
.pull-right[
 
<img src="https://chrismainey.github.io/EARL_2019_presentation/assets/HED_system.png">
<img src="https://chrismainey.github.io/EARL_2019_presentation/assets/HED_system.png">
]

---
# HED R Training

We offer a variety of R training courses, both public or onsite.

+ Two day introduction course, public or onsite. __(24th-25th March, 9th-10th June)__

+ We also offer other courses, including:
 + Introduction to R Markdown - __26th Feb__
 + Machine Learning methods in R __28th - 29th April__
 + Regression Modelling in R - __22nd - 23rd September__
 + R Essentials - __20th October__

Discounted price for HED customers

More info, or book at: https://www.hed.nhs.uk/Info/hed-courses.aspx
---

#  Overview

+ What's a funnel plot?
+ What's overdispersion?
+ What do we do about it?

+ _If you are interested (or if we have time):_
 + How to set up and build an R package
 + What I've learned from the process
 + How to publish it to CRAN

[https://github.com/chrismainey/FunnelPlotR](https://github.com/chrismainey/FunnelPlotR)

---
##  Context:
### Monitoring Standardised Mortality

+ How do we compare between organisations with different patients?

--
+ Direct standardisation:
 + Adjust all to common standard (e.g. European Standard Popn)
+ Indirectly Standardised Ratios:
 + Predict an expected rate for each unit, based on average (regression model)
 + Case-mix adjustment

$$ SR=\frac{\Sigma{(\text{observed deaths})}}{\Sigma{(\text{predicted deaths})}} $$
--

+ Predicted = Expected, but `$\ne$` 'preventable'

---
#  Standardised Ratio example

+ 10 identical patients:
 + Risk factor of model
 + E.g. age, sex, primary diagnosis etc.

+ If these patients had a probability of death of 0.3 (30%)
 + Expected deaths = 10 * 0.3 = 3

+ If we then observed 4 deaths:
 + `$>1$` = higher than expected
 + `$<1$` = lower than expected

---
class: middle, center

# Question:
### How can we present these data appropriately?

---
#  Visualising Uncertainty in SMRs

+ Confidence intervals
+ Statistical process control charts
+ Funnel plots (Speigelhalter, 2005a)

```
## $plot
```

---

## Funnel Limits and Overdispersion

+ Plot above is based on Poisson disribution, used for counts

+ Poisson distribution has fixed variance, mean and variance = `$\mu$`:

`$$Pr(Y=y) = \frac{\mu^ye^{-\mu}}{y!}$$`
__where:__ `$\mu$` is the expected average count or rate, 
+ `$e$` is Euler’s number (the base of the natural logarithm: ~2.71828), 
+ and `$y!$` is the factorial of y.

### "Real-world" data isn't perfectly Poisson distributed!
+ Causes us to underestimate the error in the data
+ Often caused by under-specification, poor parameterisation, clustering, aggregation
 
---

# Clustering of organisations

.pull-left[
+ Regression assume independence
+ We've got repeated measurements form the same organsiation.
+ Both _within_ trust variation `$\sigma$`
 
+ And _between_ trust variation `$\tau$`
+ Can model with a __random-intercept__

Traditional regression model has single-intercept
]

.pull-right[
 
<img src="Building_funnel_plots_in_R_files/figure-html/clustersetup-1.png" width="360" style="display: block; margin: auto;" />
]

---

# Random-intercepts

---

# z-scores (1)

+ Compare different indicators on standard scale
+ Scale indicators by their standard deviation: `$\frac{Y - Target}{\sigma^2}$`

+ This is only valid for normally distributed data
+ Proportions, counts & standardised ratios not normally distributed

### Transformations:
+ __CQC/Spiegelhalter:__ square-root + Winsorization
+ __SHMI:__ natural logarithm + truncation

### Dispersion ratio is calculated on winsorised/truncated scores:

$$ \phi = \frac{\sum_{i=1}^n Z_i^2}{n}$$
---
# z-scores (2)

The dispersion ratio ( `$\phi$` ):
+ <=1, the set `$\tau^2 = 0$`
+ Otherwise:

`$$\tau^2 = \frac{(n \phi-(n-1))}{\sum_{i=1}^n W_i - (\sum_{i=1}^n W_i^2 / \sum_{i=1}^n W_i)}$$`
_Where:_ 
 + `$W_i$` is `$1/\sigma^2$`, the within Trust ( `$i$` ) standard deviation
 + `$\phi$` is the dispersion ratio

___These techniques are imprecise estimates of random intercepts in models___

---

# Funnel Plots

- Control limits commonly 95 and 99% (≈ 2 and 3 σ )
- OD issue:
 - Limits are independent of data, and relate to sample size (predicted)
 - Limits are too tight.
--

- We can inflate them using the `$\tau^2$` calculated above.

- Control limits are then, functionally: `$1 \pm z * (\frac{\sigma + \tau}{n})$`
- Taken >= 99.8% limits

---

## Seeing the plot again:
__Poisson limit__

```r
library(FunnelPlotR)
# lets use NHSD's SHMI dataset: https://digital.nhs.uk/data-and-information/publications/clinical-indicators/shmi/current/shmi-data
SHMIdt <- read.csv("Data/SHMI data at trust level, Oct18-Sep19 (csv).csv")
SHMIdt$PROVIDER_NAME <- factor(SHMIdt$PROVIDER_NAME)

funnel_plot(numerator=SHMIdt$OBSERVED, denominator=SHMIdt$EXPECTED, group = SHMIdt$PROVIDER_NAME, 
          title = "Funnel plot of NHSDigital's SHMI data Oct18-Spet19", 
          Poisson_limits = TRUE,  OD_adjust = FALSE,label_outliers = TRUE, return_elements = "plot")
```

```
## $plot
```

<img src="Building_funnel_plots_in_R_files/figure-html/plot2-1.png" width="504" style="display: block; margin: auto;" />
---
## Seeing the plot again:
__OD adjustment__

funnel_plot(numerator=SHMIdt$OBSERVED, denominator=SHMIdt$EXPECTED, group = SHMIdt$PROVIDER_NAME, 
          title = "Funnel plot of NHSDigital's SHMI data Oct18-Spet19", 
          Poisson_limits = TRUE,  OD_adjust = TRUE,label_outliers = TRUE, return_elements = "plot")
```

```
## $plot
```

---
#  Issues with Funnel Plots

+ Are we asking the right question?
+ Implicit ranking
+ Which limits do we use?
+ Overdispersion
	+ Assumed to be clustering
	+ Adjustment proposed (Spiegelhalter, 2005b)
	+ Alternative used by NHSD  (with log/truncation)
+ Strictly cross-sectional

We could do it better with a mixed model!

---
#  Thank you for your time!

.pull-left[
+ `FunnelPlotR` 📦
	+ Available on CRAN.
	+ Contributions and bug reports welcome!

[https://chrismainey.github.io/FunnelPlotR/](https://github.com/chrismainey/FunnelPlotR)
[https://github.com/chrismainey/FunnelPlotR](https://chrismainey.github.io/FunnelPlotR/)
]

.pull-right[
<img src="https://github.com/chrismainey/FunnelPlotR/raw/master/man/figures/logo.png" width=200 fig.align="center">
]

+ HED R Training:  [https://www.hed.nhs.uk/Info/hed-courses.aspx](https://www.hed.nhs.uk/Info/hed-courses.aspx)
 + Intro to R
 + Rmarkdown
 + Regression modelling
 + Machine Learning in R

---
#  References

.reference[
+ CAMPBELL, M.J., JACQUES, R.M., FOTHERINGHAM, J., MAHESWARAN, R. and NICHOLL, J., 2012. Developing a summary hospital mortality index: retrospective analysis in English hospitals over five years. *BMJ,* **344** , pp. e1001
+ KEOGH, B. 2013. *Keogh review on hospital deaths published - NHSUK* [Online]. Department of Health. Available: https://www.nhs.uk/NHSEngland/bruce-keogh-review/Documents/outcomes/keogh-review-final-report.pdf [Accessed 13/11/2017 2017].
+ LILFORD, R. and PRONOVOST, P., 2010. Using hospital mortality rates to judge hospital performance: a bad idea that just won't go away. *BMJ,* **340** , pp. c2016.
+ LILFORD, R., MOHAMMED, M.A., SPIEGELHALTER, D. and THOMSON, R., 2004. Use and misuse of process and outcome data in managing performance of acute medical care: avoiding institutional stigma. *The Lancet,* **363** (9415), pp. 1147-1154.
+ SPIEGELHALTER, D., GRIGG, O., KiINSMAN, R., FAREWELL, V., T., TREASURE, T., Risk-adjusted sequential probability ratio tests: applications to Bristol, Shipman, and adult cardiac surgery. I *Int J Qual Health Care;* **15** , pp 7-13
+ SPIEGELHALTER, D.J., 2005a. Funnel plots for comparing institutional performance. *Stat Med,* **24** (8), pp. 1185-1202.
+ SPIEGELHALTER, D.J., 2005b. Handling over-dispersion of performance indicators. *Qual Saf Health Care,* **14** (5), pp. 347-351
[https://www.nhs.uk/NHSEngland/bruce-keogh-review/Documents/outcomes/keogh-review-final-report.pdf](https://www.nhs.uk/NHSEngland/bruce-keogh-review/Documents/outcomes/keogh-review-final-report.pdf)
]