5.2.5 Incorporating Data Uncertainty into Network Models

One question that remained relatively unexplored until recently is how we should treat the uncertainty found in network data, given the patterns documented in the literature, a selection of which I’ve summarized above. One strategy occasionally deployed is to bootstrap a series of estimates of any modeled associations, which lets researchers assess whether the conclusions they reach are robust to assumptions about data completeness and cleanliness. While this gives researchers somewhat more confidence in their conclusions, it is still not a fully satisfactory answer to the question of what to do with data that we know have identifiable limitations.

In a study of exchange networks drawn from two separate samples, one of students in China and the other of rural households in South Africa, An and Schramski (2015) proposed a method that leverages the differences in contested reports to provide modelable estimates of the presence or absence of various relationships. Because an accurate report of providing a resource should be matched by the other party’s report of receiving that same resource, their data offer several strategies for comparing, and aggregating across, multiple reports of the same exchange relationship. When discrepant reports arise, the traditional union or intersection strategies for tie inclusion are available.114 However, they show that either of these strategies inherently excludes substantial amounts of the gathered data, essentially treating all of it as equally uninformative, which is clearly less than ideal. They propose using the multiple comparisons available across an individual’s ties to others in the population to generate a “credibility” score for each actor.115 With these credibility scores in hand, researchers can then re-estimate the networks of interest either by taking the more credible actor’s report as the best information available (a strategy they refer to as the deterministic method), or by drawing from a probability distribution over the discrepant reports that is weighted by the respective actors’ credibility scores (a strategy they refer to as the random method). Moreover, they adapt this strategy for weighted ties in addition to dichotomous exchange relationships. While the method An and Schramski (2015) developed relied on exchange relationships, the approach is readily adaptable to other types of network data.
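
To make the logic of these strategies concrete, the following is a minimal sketch in Python and not An and Schramski’s actual estimator. It rests on several assumptions of my own: the toy data are invented, the credibility score is simplified to the share of an actor’s reports that the other party corroborates, the random method keeps the giver’s report with probability proportional to the giver’s share of the two actors’ combined credibility, and equal credibility falls back to the intersection rule. All function names are hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy data: for each directed exchange (giver, receiver), the pair
# (giver reports giving, receiver reports receiving).
reports = {
    ("a", "b"): (True, True),
    ("a", "c"): (True, False),
    ("b", "c"): (False, True),
    ("b", "a"): (True, True),
    ("c", "a"): (False, False),
    ("c", "b"): (True, False),
}


def union_ties(reports):
    """Keep a tie if either party reports it."""
    return {dyad for dyad, (g, r) in reports.items() if g or r}


def intersection_ties(reports):
    """Keep a tie only if both parties report it."""
    return {dyad for dyad, (g, r) in reports.items() if g and r}


def credibility(reports):
    """Simplified score (an assumption, not the published formula): the share
    of an actor's reports that the other party corroborates."""
    agree, total = defaultdict(int), defaultdict(int)
    for (giver, receiver), (g, r) in reports.items():
        for actor in (giver, receiver):
            total[actor] += 1
            agree[actor] += int(g == r)
    return {actor: agree[actor] / total[actor] for actor in total}


def resolve(reports, scores, method="deterministic", rng=None):
    """Rebuild the tie set, settling discrepant reports by credibility."""
    rng = rng or random.Random()
    ties = set()
    for (giver, receiver), (g, r) in reports.items():
        cg, cr = scores[giver], scores[receiver]
        if g == r:
            present = g  # concordant reports need no adjudication
        elif method == "deterministic":
            # Take the more credible actor's report; on equal scores fall
            # back to the intersection rule (an assumption of this sketch).
            present = g if cg > cr else r if cr > cg else False
        else:
            # Random method: keep the giver's report with probability
            # proportional to the giver's share of combined credibility.
            p_giver = cg / (cg + cr) if (cg + cr) > 0 else 0.5
            present = g if rng.random() < p_giver else r
        if present:
            ties.add((giver, receiver))
    return ties


scores = credibility(reports)
deterministic_net = resolve(reports, scores, method="deterministic")
random_net = resolve(reports, scores, method="random", rng=random.Random(0))
```

In this toy example the three discrepant dyads are dropped entirely under the intersection rule and kept entirely under the union rule, whereas the credibility-based methods settle each one separately. Because the random method draws a different network each time, in practice one would presumably repeat the draw many times and summarize model estimates across the resulting networks.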