Book: Weapons of Math Destruction
Author(s): Cathy O'Neil

Problems of big data: wrong models and spurious data.

Models that rule the world and change the objectives and priorities of institutions (the U.S. News & World Report university rankings, for example).

Models are not always the best answer to everything. New approaches are good if they are tested and have a feedback loop. In many instances, policies like foot patrols are still better than data-driven and automated approaches.

We still have a lot to learn.


Notes and Highlights


“The math-powered applications powering the data economy were based on choices made by fallible human beings. Some of these choices were no doubt made with the best intentions. Nevertheless, many of these models encoded human prejudice, misunderstanding, and bias into the software systems that increasingly managed our lives.”

Computers cannot be expected to be perfect; they are, after all, a human product.


“The privileged are processed more by people, the masses by machines.”


“the folks building WMDs routinely lack data for the behaviors they’re most interested in. So they substitute stand-in data, proxies.”

This is related to the substitution heuristic that Kahneman describes in his book. A proxy might look similar to the thing we actually care about, but substituting one for the other usually leads to wrong conclusions.


“Our own values and desires influence our choices, from the data we choose to collect to the questions we ask.”

Q: “Models are opinions embedded in mathematics”

There’s always a risk when modelling the world: we can’t be sure we are being completely objective.


“WMD: Weapon of Math Destruction.

The big-data algorithms that are constantly used every day, everywhere.

The three elements of a WMD: Opacity, Scale and Damage"


“They paid more attention to customer satisfaction than to the accuracy of their models”

I’d say a conflict of interest is at play here, a very dangerous one.


“I saw all kinds of parallels between finance and Big Data. Both industries gobble up the same pool of talent, much of it from elite universities like MIT, Princeton, or Stanford. These new hires are ravenous for success and have been focused on external metrics (e.g. SAT scores) their entire lives. Whether in finance or in tech, the message they’ve received is that they will be rich, that they will run the world.”

External metrics and competition based on standardized tests are doing more harm than good all over the place. We stop focusing on learning and understanding and start worrying about some odd score.


On Big-Data:

“I worried about the separation between technical models and real people, and about the moral repercussions of that separation”

This is related to “Skin in the game”


“A formula, whether it’s a diet or a tax code, might be perfectly innocuous in theory. But if it grows to become a national or global standard, it creates its own distorted and dystopian economy.”

“This is what happened in higher education.” The rise of university rankings has distorted the focus of universities and higher education. Many universities focus on profit, research or other variables rather than on ensuring a quality education.


“Our entire society has embraced not only the idea that a college education is essential but the idea that a degree from a highly ranked school can catapult a student into a life of power and privilege”

Having studied at Oxford, I understand what the author is trying to convey, but there are definitely benefits to studying at “elite” universities.


“Anywhere you find the combination of great need and ignorance, you’ll likely see predatory ads.”


“Fairness isn’t calculated into WMDs. And the result is a massive, industrial production of unfairness. If you think of a WMD as a factory, unfairness is the black stuff belching out of the smoke stacks. It’s an emission, a toxic one.”

Again, these programs are made by humans and not all of us have the best intentions.


“…a crucial part of justice is equality. And that means, among many other things, experiencing criminal justice equally. People who favor policies like stop and frisk should experience it themselves.”

Again, skin in the game!


“As technology advances, we’re sure to see a dramatic growth of surveillance. The good news, if you want to call it that, is that once thousands of security cameras in our cities are sending up our images for analysis, police won’t have to discriminate as much.

But it means that we’ll all be subject to a digital form of stop and frisk, our faces matched against databases of known criminals and terrorists."

Challenges, many challenges. It is very hard to get the benefits of technology without giving up a bit of privacy.


“From a mathematical point of view, however, trust is hard to quantify. That’s a challenge for people building models.”


“mathematical models can sift through data to locate people who are likely to face great challenges, whether from crime, poverty or education. It’s up to society whether to use that intelligence to reject and punish them - or to reach out to them with the resources they need.”


“Under the inefficient status quo, workers had not only predictable hours but also a certain amount of downtime. You could argue that they benefited from the inefficiency: some were able to read on the job, even study. Now, with software choreographing the work, every minute should be busy. And these minutes will come whenever the program demands it, even if it means clopening from Friday to Saturday”

Not everything new is good just because it is new. New systems need to prove they are robust and bring more benefits than past methods.


“‘The more data, the better’ is the guiding principle of the Information Age. Yet in the name of fairness, some of this data should remain uncrunched.”


Sometimes algorithms target us because we have a very common name or look similar to somebody else.

Perhaps the strange names that Venezuelans come up with are not a bad idea after all; they might shield us from being targeted or wrongly flagged, haha.


“These automatic programs will increasingly determine how we are treated by the other machines, the ones that choose the ads we see, set prices for us, line us up for a dermatologist appointment, or map our routes. They will be highly efficient, seemingly arbitrary and utterly unaccountable.

If we don’t wrest back a measure of control, these future WMDs will feel mysterious and powerful. They’ll have their way with us, and we’ll barely know it’s happening."

A bit too pessimistic here, but I agree that we need to be very careful with automation and with letting machines rule over us.


“Much of the proxy data collected, whether step counts or sleeping patterns, is not protected by law, so it would theoretically be perfectly legal. And it would make sense. As we’ve seen, they routinely reject applicants on the basis of credit scores and personality tests. Health scores represent a natural - and frightening - next step.”

This is a bit paradoxical, since being healthy would seem to benefit not only ourselves but society as well. If anything, we can use health data to start developing good habits, and maybe then avoid being ruled out by these algorithms.


“Some 73 percent of Americans, according to a Pew Research report, believe that search results are both accurate and impartial.”

Good luck with this one! We can learn from past elections: there is so much misinformation around the internet.