True positive labels come from the roughly 1% of claims that carry a so-called Z-Code, and we are reasonably sure of our zeros (though, based on our priors, probably 1–3 in ten of them are mislabeled).
Statisticians will cringe, but deep learning is fairly robust to label noise.
One-sided label noise is a little different, because estimating the real class-conditional distributions requires stricter theoretical assumptions (we will have an upcoming blog about this!).
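One-sided noise of this kind is the positive-unlabeled (PU) setting: a Z-Code means "positive," but its absence means "unlabeled," not "negative." A minimal sketch of one standard correction (the Elkan–Noto scaling trick) on synthetic data, purely for illustration; this is an assumption about the general technique, not the exact method used here:

```python
# Positive-unlabeled sketch: s=1 means "has a Z-Code", s=0 means
# "unlabeled" (not necessarily negative). Synthetic data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8))
true_y = (X[:, 0] + X[:, 1] > 0).astype(int)         # hidden true class
# Only ~30% of true positives ever receive a code (one-sided noise).
s = ((true_y == 1) & (rng.random(5000) < 0.3)).astype(int)

X_tr, X_ho, s_tr, s_ho = train_test_split(X, s, random_state=0)

# Step 1: train a "labeled vs unlabeled" classifier.
clf = LogisticRegression().fit(X_tr, s_tr)

# Step 2: estimate c = P(labeled | truly positive) as the mean score
# over held-out labeled examples (Elkan & Noto's estimator).
c = clf.predict_proba(X_ho[s_ho == 1])[:, 1].mean()

# Step 3: rescale to recover an estimate of P(truly positive | x).
p_true = np.clip(clf.predict_proba(X)[:, 1] / c, 0.0, 1.0)
```

The key stricter assumption this relies on is that labeled examples are selected completely at random from the positives, which is exactly the kind of condition that makes one-sided noise theoretically touchier than symmetric noise.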
Using everything we know about a member (demographics, neighborhood, disease burden, utilization, medications) we can reasonably predict which members may be struggling with these same social factors as in the labeled set.
We built 10 models (one for each set of diagnosis codes) and scored our entire membership with them.
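The "one model per code set, then score everyone" loop can be sketched as follows. The feature and label column names are invented for illustration, and the classifier choice is an assumption; only three of the 10 factor groups are shown:

```python
# Hypothetical per-factor training loop on synthetic member data.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
feat_cols = [f"feat_{i}" for i in range(6)]
members = pd.DataFrame(rng.normal(size=(2000, 6)), columns=feat_cols)

# One noisy 0/1 label column per social-factor code group (names invented;
# the post uses 10 such groups).
label_cols = ["housing", "food", "transport"]
for col in label_cols:
    members[f"label_{col}"] = (rng.random(2000) < 0.05).astype(int)

scores = {}
for col in label_cols:
    model = GradientBoostingClassifier(random_state=0)
    model.fit(members[feat_cols], members[f"label_{col}"])
    # Score the entire membership with the fitted model.
    scores[f"score_{col}"] = model.predict_proba(members[feat_cols])[:, 1]

risk_scores = pd.DataFrame(scores)
```

One score column per factor then becomes the feature set referenced below.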
Armed with this new knowledge, our population health team, care managers, and provider partners can more readily reach out to members who might be struggling with one or more social factors and connect them with helpful resources like PATH for housing and SNAP for food insecurity.
These risk scores are valuable on their own and also provide a rich new feature-set for further analysis.
The health insurance industry hasn’t figured out how to incorporate these scores into a risk model yet, but that doesn’t stop us from using them to impact clinical outcomes.
The financial and housing stress risk score is the second most important variable in our Plan All Cause Readmissions model, after “have you been readmitted in the past?” Indeed, the first principal component of all 10 scores acts as a decent readmissions model itself, with an AUC of 0.68 and a Brier score of 0.
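Using the first principal component of correlated risk scores as a single predictor looks like this in sketch form. The data here is synthetic with a planted shared "social burden" factor, so the numbers are illustrative only, not the real-world result reported above:

```python
# First principal component of several risk scores as a one-number
# readmissions predictor, on synthetic data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
latent = rng.normal(size=1000)                          # shared burden factor
scores = latent[:, None] + 0.5 * rng.normal(size=(1000, 10))  # 10 correlated scores
readmit = (latent + rng.normal(size=1000) > 1).astype(int)    # synthetic outcome

pc1 = PCA(n_components=1).fit_transform(scores).ravel()
auc = roc_auc_score(readmit, pc1)
auc = max(auc, 1 - auc)   # PCA's sign is arbitrary; orient the component
```

The sign flip matters in practice: a principal component is only defined up to its sign, so a raw AUC below 0.5 just means the component points the "wrong" way.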
Our goal is to use them as a standard descriptive feature in any analysis, much like you would age or ethnicity.
Here’s a density map of our members with known social determinant status. [Figure: density map of members with known status.] And here we have all members classified by our models, a factor of 10 more! [Figure: density map of all members classified by the models.]

VALIDATION AND IMPACT

As a provider-sponsored health plan, we are lucky enough to have the ultimate validation set of data: something that was nowhere near the data scientists during model training.
We have survey data from sponsor hospitals that aims to identify members struggling with social burdens.
They built the survey not knowing we were modeling and we built the models not knowing they were going to survey.
The results: a c-statistic of 0.71 and a Brier score of 0.16, which means we are directionally correct. (You can practically hear the chorus of data scientists singing, “All models are wrong, some are useful.”)
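Both validation metrics are one-liners in scikit-learn. A sketch on synthetic stand-in data, since the survey responses themselves obviously can't be shown; the variable names are invented:

```python
# c-statistic (AUC) and Brier score against an independent label,
# using synthetic stand-ins for the survey outcome and model scores.
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(3)
survey_label = rng.integers(0, 2, size=500)               # stand-in survey outcome
predicted = survey_label * 0.4 + rng.random(500) * 0.6    # stand-in scores in [0, 1]

c_stat = roc_auc_score(survey_label, predicted)           # rank discrimination
brier = brier_score_loss(survey_label, predicted)         # mean squared error of probs
```

The two metrics answer different questions: the c-statistic measures whether higher-risk members are ranked above lower-risk ones, while the Brier score also penalizes miscalibrated probabilities.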
Our understanding of the social determinants needs to inform every clinical interaction we have with our members.
If we notice members are struggling to afford food (or even if we have a good guess), we can connect them to important services they might not know about.
As we progress in our vision to better understand the needs of our membership, we will improve provider recognition of these conditions and partner with more sponsors, both of which give us better labels, which in turn yield better models.
Let’s definitely do that.
But also, we can use the ideas developed here to immediately impact our members.
That’s why I show up to work.