
21 October 2025
“…what to do when you’ve got no data, no clear questions, and no clue if the service is working…”
A few general thoughts, and a worked example, including:
What is the real question?
Complex system, or more direct question?
What are the questions on the way to answering it?
What do we already know?
How could we answer it with a degree of confidence?


“The thing we are estimating”
We are often supplied with a pre-determined estimand:
Can it be measured directly?
Are there confounders / biases to consider?
Do you need to make any adjustment (sets)?
Community Care Collaboratives - part of NHS plans to shift care out of hospitals
System had already defined a benefit of 30,000 bed days saved per year

Simulate previous fiscal year, to see if it is possible
Bed days saved!

We needed to randomly assign new discharge dates
Red herring: this will give little to no variation.


Set earliest discharge date (EDD) as day 0
Count days from EDD to actual discharge date (DD) as MAX
Random value from 0 to MAX
Need to have equal probability of any day
Monte Carlo method
‘discrete uniform’ distribution
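As a rough sketch of what the discrete uniform assumption implies, the expected saving can be written down before running anything. The notation is introduced here for illustration only: $M_i$ is MAX for stay $i$, $D_i$ is the randomly drawn offset from the EDD, $f$ is the assumed fraction of the effect, and $S$ is total bed days saved.

```latex
% Each offset is equally likely on 0, 1, ..., M_i
D_i \sim \mathcal{U}\{0, M_i\}, \qquad
P(D_i = d) = \frac{1}{M_i + 1}, \qquad
\mathbb{E}[D_i] = \frac{M_i}{2}

% Bed days saved for stay i is M_i - D_i, so in expectation
\mathbb{E}[S] = f \sum_i \left( M_i - \mathbb{E}[D_i] \right)
             = \frac{f}{2} \sum_i M_i
```

Under these assumptions the simulation centres on half of the total days between EDD and actual discharge, scaled by $f$.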

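sim_fun <- function(.data, fraction_effect = 1) {
  # Draw a new discharge date for each stay, uniformly between the
  # earliest discharge date (EDD) and the actual discharge date
  new_dates <- as.Date(
    .data$earliest_discharge_date +
      extraDistr::rdunif(
        nrow(.data),
        # Offset 0 = discharged on the EDD
        min = 0,
        # No. of days from EDD to actual discharge date
        max = .data$DischargeDate_range
      )
  )
  # Total bed days saved, scaled by the assumed fraction of the effect
  fraction_effect * as.numeric(sum(.data$DischargeDate - new_dates))
}

Even being generous, 30,000 is unlikely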
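A minimal sketch of how the Monte Carlo step might be repeated and compared with the 30,000 target. The toy data frame is made up purely so the example runs (it is not the real discharge data), and `rdunif_base` and `sim_once` are hypothetical names. `rdunif_base` is a base R stand-in for `extraDistr::rdunif`, used here only to avoid a package dependency.

```r
set.seed(42)  # reproducible draws

# Hypothetical toy data: 500 stays, each with an earliest discharge date (EDD)
# and an actual discharge date 0 to 14 days later. Not real figures.
n <- 500
edd <- as.Date("2024-04-01") + sample.int(365, n, replace = TRUE) - 1
toy <- data.frame(
  earliest_discharge_date = edd,
  DischargeDate_range = sample(0:14, n, replace = TRUE)
)
toy$DischargeDate <- toy$earliest_discharge_date + toy$DischargeDate_range

# Base R stand-in for a discrete uniform draw on 0..max_days (vectorised)
rdunif_base <- function(max_days) {
  floor(stats::runif(length(max_days), min = 0, max = max_days + 1))
}

# One simulated year: new discharge dates, then total bed days saved
sim_once <- function(.data, fraction_effect = 1) {
  new_dates <- .data$earliest_discharge_date +
    rdunif_base(.data$DischargeDate_range)
  fraction_effect * as.numeric(sum(.data$DischargeDate - new_dates))
}

# Repeat the simulation many times to get a distribution of outcomes
sims <- replicate(1000, sim_once(toy))

# Central estimate and 95% interval of simulated bed days saved
quantile(sims, c(0.025, 0.5, 0.975))

# Share of simulations that reach the pre-defined target
mean(sims >= 30000)
```

With real data, the same pattern would pass the actual discharge records in place of `toy`, and `fraction_effect` could be varied to test more or less generous assumptions.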

Be clear on the questions you are trying to answer
Take time to draw it out and think about it
Not having data is not the end!
What would it look like, according to your best assumptions?
