Ubiquitous computers and troves of data gathered about all aspects of our lives: all this is part of the brave new world of big data. “Bit by Bit: Social Research in the Digital Age,” a new book by Princeton University sociology professor Matthew J. Salganik, shows the promise and the peril of this emerging field of data science.
Published by the Princeton University Press, Bit by Bit is intended, says Salganik, for “social scientists who want to do more data science, data scientists who want to do more social science, and anyone interested in the hybrid of these two fields.” The following excerpt from the book’s introduction gives the layman an overview:
Welcome to the digital age. The digital age is everywhere, it’s growing, and it changes what is possible for researchers . . .
Researchers can now observe behavior, ask questions, run experiments, and collaborate in ways that were simply impossible in the recent past. Along with these new opportunities come new risks: researchers can now harm people in ways that were impossible in the recent past. The source of these opportunities and risks is the transition from the analog age to the digital age. This transition has not happened all at once . . . and, in fact, it is not yet complete. However, we’ve seen enough by now to know that something big is going on.
One way to notice this transition is to look for changes in your daily life. Many things in your life that used to be analog are now digital. Maybe you used to use a camera with film, but now you use a digital camera (which is probably part of your smartphone). Maybe you used to read a physical newspaper, but now you read an online newspaper. Maybe you used to pay for things with cash, but now you pay with a credit card. In each case, the change from analog to digital means that more data about you are being captured and stored digitally.
In fact, when looked at in aggregate, the effects of the transition are astonishing. The amount of information in the world is rapidly increasing, and more of that information is stored digitally, which facilitates analysis, transmission, and merging. All of this digital information has come to be called “big data.” In addition to this explosion of digital data, there is a parallel growth in our access to computing power . . .
For the purposes of social research, I think the most important feature of the digital age is computers everywhere. Beginning as room-sized machines that were available only to governments and big companies, computers have been shrinking in size and increasing in ubiquity. Each decade since the 1980s has seen a new kind of computing emerge: personal computers, laptops, smartphones, and now embedded processors in the “Internet of Things” (i.e., computers inside of devices such as cars, watches, and thermostats). Increasingly, these ubiquitous computers do more than just calculate: they also sense, store, and transmit information.
For researchers, the implications of the presence of computers everywhere are easiest to see online, an environment that is fully measured and amenable to experimentation. For example, an online store can easily collect incredibly precise data about the shopping patterns of millions of customers. Further, it can randomize groups of customers to receive different shopping experiences. This ability to randomize on top of tracking means that online stores can constantly run randomized controlled experiments. In fact, if you’ve ever bought anything from an online store, your behavior has been tracked and you’ve almost certainly been a participant in an experiment, whether you knew it or not.
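To make the idea of randomizing on top of tracking concrete, here is a minimal sketch in Python of how an online store might split customers into two groups at random and compare an outcome between them. It is an illustration only, not code from the book: the function names, the customer IDs, and the simulated purchase rates (10 percent and 12 percent) are all assumptions made up for this example.

```python
import random
from statistics import mean


def assign_groups(customer_ids, seed=0):
    """Randomly split customers into equal-sized control and treatment groups."""
    rng = random.Random(seed)          # seeded so the split is reproducible
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}


def difference_in_means(outcomes, groups):
    """Estimate the effect as mean(treatment outcome) - mean(control outcome)."""
    control = mean(outcomes[c] for c in groups["control"])
    treatment = mean(outcomes[c] for c in groups["treatment"])
    return treatment - control


# Hypothetical customer IDs; a real store would use its own identifiers.
customers = [f"customer_{i}" for i in range(1000)]
groups = assign_groups(customers, seed=42)

# Simulated tracking data (an assumption for illustration): 1 means the
# customer bought something, 0 means they did not. Treated customers are
# given a slightly higher purchase probability.
treated = set(groups["treatment"])
sim = random.Random(7)
purchased = {
    c: 1 if sim.random() < (0.12 if c in treated else 0.10) else 0
    for c in customers
}

effect = difference_in_means(purchased, groups)
print(f"Estimated change in purchase rate: {effect:+.3f}")
```

Because chance alone decides who sees which experience, a difference in purchase rates between the two groups can be attributed to the experience itself rather than to differences among customers. A real analysis would also report the uncertainty around that estimate.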
Matthew Salganik’s book is not only about the digital age; it was also edited in part through an unusual “Open Review” made possible by the digital age. In an interview posted on the Princeton University Press’s blog, at http://blog.press.princeton.edu/, Salganik explains the new process.
This book is about social research in the digital age, so I also wanted to publish it in a digital age way. As soon as I submitted the book manuscript for peer review, I also posted it online for an Open Review during which anyone in the world could read it and annotate it. During this Open Review process dozens of people left hundreds of annotations, and I combined these annotations with the feedback from peer review to produce a final manuscript. I was really happy with the annotations that I received, and they really helped me improve the book.
The Open Review process also allowed us to collect valuable data. We could see which parts of the book were being read, how people arrived at the book, and which parts of the book were causing people to stop reading.
Finally, the Open Review process helped us get the ideas in the book in front of the largest possible audience. During Open Review, we had readers from all over the world, and we even had a few course adoptions. Also, in addition to posting the manuscript in English, we machine translated it into more than 100 languages, and we saw that these other languages increased our traffic by about 20 percent.
Q.: Was putting your book through Open Review scary?
No, it was exhilarating. Our back-end analytics allowed me to see that people from around the world were reading it, and I loved the feedback that I received. Of course, I didn’t agree with all the annotations, but they were offered in a helpful spirit, and many of them really improved the book.
Actually, the thing that is really scary to me is putting out a physical book that can’t be changed anymore. I wanted to get as much feedback as possible before the really scary thing happened.
Q.: And now you’ve made it easy for other authors to put their manuscripts through Open Review?
With a grant from the Sloan Foundation, we’ve released the Open Review Toolkit, open source software that enables authors and publishers to convert manuscripts into a website that can be used for Open Review. During Open Review, you can also collect valuable data to help launch your book. Furthermore, all of these good things are happening at the same time that you are increasing access to scientific research, which is a core value of many authors and academic publishers.