Predicting a true positive event (e.g. morbidity, mortality, adverse event) carries a different epidemiological, financial, and emotional weight than predicting a true negative. It follows that a model generating many false negatives may be far worse than one generating many false positives. Yet early warning systems (EWS) are developed on the assumption that all outcomes are equal. That is a huge mistake when evaluating health outcomes. What can be done to remedy this?

Take a look at publications describing the creation of (take your pick) MEWS, NEWS, ViEWS, PEWS, the Worthing score, etc. How is the accuracy of the “alert” measured? Invariably it will be some combination of sensitivity (% TP/events), specificity (% TN/non-events), positive predictive value (% events/triggers), and negative predictive value (% non-events/no triggers). There may be a gratuitous nod that these metrics aren’t equivalent, but very little is done about it.
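To make those four metrics concrete, here is a minimal sketch that computes them from raw confusion-matrix counts. The function name and the example counts are illustrative, not taken from any published EWS study:

```python
def ews_metrics(tp, fp, fn, tn):
    """Compute the four 'vanilla' EWS accuracy metrics from
    confusion-matrix counts (tp/fp/fn/tn are illustrative names)."""
    return {
        "sensitivity": tp / (tp + fn),  # share of real events that triggered an alert
        "specificity": tn / (tn + fp),  # share of non-events with no alert
        "ppv": tp / (tp + fp),          # share of alerts that were real events
        "npv": tn / (tn + fn),          # share of silent cases that stayed event-free
    }

# Hypothetical counts: 80 true alerts, 120 false alarms,
# 20 missed events, 780 correctly silent non-events
print(ews_metrics(80, 120, 20, 780))
```

Note how the same alert can look strong on sensitivity (0.80 here) while its PPV is only 0.40, i.e. most alerts are false alarms. Nothing in these four numbers says which error is more expensive.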

There are ways to address this. One is at the evaluation stage: assigning a cost to each outcome will tell you whether the EWS is effective. For example, suppose you wanted to create an alert that a non-ambulatory patient is at risk for a pressure ulcer. You could assign a dollar value to each outcome and thereby determine whether the alert is cost-effective. The other way is a bit more complicated and involves weighting outcomes at the model development stage. In the pressure ulcer example above, weights could be assigned to the cost matrix (also known as the penalty or loss matrix), and an optimizer such as a genetic algorithm could be used to minimize the expected cost of alerting.
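Here is a sketch of the evaluation-stage idea for the pressure ulcer example. The dollar figures are invented for illustration, and a simple exhaustive threshold search stands in for the genetic-algorithm optimizer; the structure of the cost matrix is the point:

```python
# Hypothetical cost matrix (dollar figures are made up for illustration).
COSTS = {
    "tp": 500.0,    # alert fires, risk is real: cost of preventive care
    "fp": 500.0,    # false alarm: same preventive care, spent needlessly
    "fn": 20000.0,  # missed case: cost of treating a full pressure ulcer
    "tn": 0.0,      # correctly silent: no added cost
}

def total_cost(scores, events, threshold, costs=COSTS):
    """Sum the dollar cost of every prediction at a given alert threshold."""
    total = 0.0
    for score, event in zip(scores, events):
        alert = score >= threshold
        if alert and event:
            total += costs["tp"]
        elif alert and not event:
            total += costs["fp"]
        elif not alert and event:
            total += costs["fn"]
        else:
            total += costs["tn"]
    return total

def best_threshold(scores, events, candidates, costs=COSTS):
    """Pick the candidate threshold with the lowest total cost
    (a grid search stand-in for the genetic algorithm mentioned above)."""
    return min(candidates, key=lambda t: total_cost(scores, events, t, costs))
```

With an asymmetric cost matrix like this one, the optimizer will happily tolerate extra false alarms ($500 each) to avoid a single missed ulcer ($20,000), which is exactly the trade-off the vanilla metrics hide.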

Either way, you’ll be going beyond the typical vanilla metrics reported for EWS. More importantly, you’ll have ROI data that can be used to justify implementing the alert at your institution. Because truly, all outcomes are not equal in importance.