5.2.1 Descriptive Data Quality

If it is not apparent from the preceding chapters, let me be explicit: gathering social network data is resource intensive. Gathering network data takes up more space within any project than gathering comparable data at the individual level. The resources this type of work requires, ranging from computational demands to simple things like the amount of space on the pages of a survey, can be intimidating. In particular, network studies often demand more of their respondents’ time than other research of similar scale. This is because every name interpreter must be answered for each alter elicited by a name generator, and the burden compounds if you use multiple name generators. In individual-level surveys, adding a data point that you may or may not use in analyses requires only a single additional question, so it rarely has to compete too vigorously for the researchers’ attention. With most network data, it is much more important to know in advance how you anticipate using it, so that you don’t spend excess resources on things that are ultimately unnecessary (McCarty, Killworth, and Rennell 2007). Even if you’re relying on data collected by passive means rather than surveys, both encoding and retrieving relational data often require similar additional resources. If you use APIs to scrape network data, each additional call may seem trivial on its own, but multiplied over the scale of data often involved in such work, those calls can make even automated data gathering more demanding in the network context than tracking down individual attribute information for a sample of similar size.
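
To make the multiplication concrete, here is a minimal sketch of the arithmetic for a single respondent. The function name and the survey dimensions are hypothetical, chosen only for illustration, and the sketch assumes that alters named under different name generators are not deduplicated.

```python
def interpreter_items(alters_per_generator, n_interpreters):
    """Count the name-interpreter questions one respondent answers.

    alters_per_generator: number of alters named under each name generator
    (assumption: the same alter named twice is asked about twice).
    n_interpreters: number of name interpreters asked about every alter.
    """
    return sum(alters_per_generator) * n_interpreters


# Hypothetical survey: 2 name generators, 5 alters each, 4 name interpreters.
network_items = interpreter_items([5, 5], n_interpreters=4)

# Adding one more interpreter costs one question per alter,
# whereas an individual-level survey would grow by a single question.
added_cost = interpreter_items([5, 5], n_interpreters=5) - network_items

print(network_items, added_cost)  # 40 10
```

In this hypothetical case, a single extra name interpreter adds ten questions per respondent, where one extra individual-level item would add just one.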