Wednesday, August 29, 2018
The Truth About Hurricane Maria and Puerto Rico
In September 2017, two huge hurricanes-- Irma and Maria-- hit Puerto Rico. The scene was horrifying, but the reported death toll was surprisingly small: the official tally was 64 dead. It turns out, though, that the actual number is probably closer to 3,000.
How could the reported number be so wrong? A lot of the problem, it seems, had to do with the way deaths were recorded-- that is, the cause of death did not reflect a relationship to the storm. Some might suspect political motives played a role, though there does not seem to be much real evidence of that.
One thing this brings to the surface is how hard it can be to measure things that appear simple. We often promote the use of data, but too rarely talk about the value of data and the problems with collection and analysis.
In my own field, we see this in sentencing all the time. Someone's criminal history (their prior convictions) should be simple to determine-- there is a computer database, right? Well, of course there is. And people rely on it all the time. And it is often wrong, with disastrous consequences. It is wrong because people put in the wrong numbers or code when they log in a conviction, or records are incomplete. Some jurisdictions are good and some are bad. Often the people who do the data entry are low-paid clerks for whom crucial distinctions are insignificant. What's the difference between probation and deferred adjudication, after all? A lot, in subsequent sentencings.
We want certainty. We want data to tell us what is what. But, in many areas, we are not there yet. If we over-rely on data, we risk determining freedom by false measures. That is injustice, just as much as injustice reached by any other route.
Comments:
An interesting recent example of this problem with data--particularly relevant here given (a) the general sense that data are "fact" and (b) that public policy can flow from bad data just as easily as good data--is NPR's investigation of a Department of Education report identifying 240 school shootings in the 2015-2016 school year. That number, obviously, is shocking. It is also, NPR discovered, way too high. And the explanation, per the other, less prestigious Razor, is that the data were collected by surveying a huge number of schools (96,000) and some of the respondents simply made erroneous clicks. In short, NPR was able to confirm 11 of the 240 reported shootings--within the margin of error for a sample size of nearly 100k, but with major implications for public perception.
NPR: The School Shootings That Weren't
Yes, the NPR story--as well as the criminal history database example--only reinforces, to a shocking degree, that some percentage of large amounts of data will inevitably be entered wrong. It's tempting to lay those errors at the feet of "low-paid clerks for whom crucial distinctions are insignificant"; however, I would respectfully push back on that characterization. Getting low pay doesn't mean that you don't care about the accuracy of your work, or even that you don't know what the terms mean. In the NPR story, at least, it sounds as though administrators made some of the errors or didn't catch errors made by others.
And then there’s another level where employees higher up the ladder intentionally manipulate data, e.g., the college administrators who game the US News rankings by not reporting accurate 6-year graduation rates, or by claiming lower-than-actual enrollment so that per-student spending will appear higher. Lower-down employees may account for every student correctly, but that accuracy gets ruined by the decisions their bosses make in how they represent the data. (https://www.insidehighered.com/admissions/article/2018/08/27/eight-more-colleges-identified-submitting-incorrect-data-us-news)
Otherwise yes, now is an especially good time to be reminded that we all need to question everything we read, and be shown the questions to ask of data if we don’t know. The media have a crucial role here.
People speak of data as though they come to us from some mystical place of perfection. But data are collected and analyzed by humans, so they are subject to the same imperfections that plague humans. As the school shootings that weren't show, the story and the people behind the data are as important as the data themselves.