Tamora Pierce, Big Data and Other Books
Still a bit behind on the blogging - but a couple of weeks ago, I read Big Data, Terrier, Bloodhound, Mastiff and Cosmicomics.
Terrier, Bloodhound and Mastiff are the three books in Tamora Pierce’s Beka Cooper series. Tamora Pierce was a staple of my childhood - I loved her books, which are high fantasy with female main characters and feminist themes. She handles gender relations and cultural issues well! In fantasy! I still am a huge fan of hers as an adult. I read her Trickster’s Choice and Trickster’s Queen books in college and they are far more nuanced and powerful than the Song of the Lioness series I read as a kid. So if you haven’t read anything by her, and are looking for some light reading, any of her books are worth picking up.
The Beka Cooper series, which I devoured over the course of a weekend, is set in Tortall’s past, many years before most of her other books. It centers around a “Provost’s Dog” (basically, a cop) named Beka Cooper, who has the power to sometimes speak to the dead. It basically reads as a fantasy-CSI, but in the best way - good plotting, surprising turns, compelling characters. Like all of Tamora Pierce’s female characters, Beka is someone I can identify with, while still admitting that she’s flawed.
Cosmicomics is by Italo Calvino. I enjoyed his “Invisible Cities” in school, and Cosmicomics is very similar in style but very different in content. It’s a series of stories about cosmic bodies and early life, broadly construed, as they go through things like the beginning of the universe or the gaining of legs. It’s engagingly written, and sometimes beautiful.
Finally, I read Big Data by Viktor Mayer-Schönberger and Kenneth Neil Cukier. I wasn’t a huge fan of this book. The authors fall into the classic trap of organizing the book based on general themes, and they put risks at the end. So when the authors finally get to the risks, the fact that they are relegated to the end of the book makes them feel like an afterthought. It also means that the first hundred and fifty pages read like a paean to current tech trends. “Look what Google, Facebook, the City of New York, etc. did with Big Data! It’s amazing and awesome!”
The book also breaks little new ground - the major thesis that represents a contribution to the field, as far as I could tell, is “who needs causal evidence when you can use correlational evidence based on ALL the data.” There are a couple of major issues with that, but the most easily identified is that the big data sources that the authors are talking about don’t necessarily include all the data - so it’s easy to jump to correlational patterns that don’t match or explain parts of the dataset that aren’t included.
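To make that concrete, here is a toy sketch in Python. It is entirely my own made-up example, not anything from the book: the data, the `pearson` helper and the variable names are all hypothetical. The idea is that the "collected" data only covers part of the range, so the correlation it shows looks very strong, while the full dataset shows no linear relationship at all.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical "true" relationship: y = x^2 over the whole range -5..5.
full_x = list(range(-5, 6))
full_y = [x * x for x in full_x]

# Assumption for illustration: the data source only ever records x >= 0.
seen_x = [x for x in full_x if x >= 0]
seen_y = [x * x for x in seen_x]

r_seen = pearson(seen_x, seen_y)  # strong positive correlation
r_full = pearson(full_x, full_y)  # essentially no linear correlation

print(f"correlation in the collected data: {r_seen:.2f}")
print(f"correlation in the full data:      {r_full:.2f}")
```

Anyone who only had the collected half would confidently conclude that y rises with x, and that conclusion is wrong for every point they never saw.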
Additionally, companies that rely on correlational data open themselves up to manipulation from anyone who understands causal factors in the data. For example, Mayer-Schönberger and Cukier cite predicting airline prices as an example of not needing to understand the underlying causal factors if given enough correlational data. However, an airline could pretty easily manipulate any service that relies purely on correlational data to determine when prices will be the highest. This doesn’t break their argument, of course - certainly, correlational data may often be good enough. But this is symptomatic of the authors’ general lack of engagement with the consequences of the trends they suggest, and for a book that is rehashing old (in Internet years) territory, that’s a pretty bad thing.