The “Tastes, Ties, and Time” (T3) dataset compiled Facebook profile data from a cohort of college students in the mid-2000s and was used to analyze the relationship between social networks and personal cultural preferences (Lewis et al. 2008). The data were also intended to be made publicly available for other researchers’ use. Upon publication of the data’s codebook, it rapidly became apparent that the school that was the source of the data was readily identifiable, even without accessing the data itself (Zimmer 2010). Moreover, in datasets like these, unique combinations of a relatively small number of individual characteristics can make individuals readily identifiable by anyone with access to the purportedly “de-identified” data and publicly available resources (Arfer and Jones 2018). Scholars have generally criticized the deductive disclosure of individual identities as unethical (Poor and Davidson 2018). Computer scientists have devoted considerable attention to optimal strategies for protecting against data de-anonymization, including for social network data with particular structural patterns (Onaran, Garg, and Erkip 2016).
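The point about unique combinations can be illustrated with a minimal sketch. The records and attribute names below (`major`, `state`, `birth_year`) are invented for illustration and are not drawn from the T3 data or any real dataset; the helper `unique_combinations` is likewise hypothetical. The sketch simply counts how many records share each combination of attributes and reports the combinations held by exactly one record, which are the cases most exposed to deductive disclosure.

```python
from collections import Counter

# Hypothetical, made-up records; attribute names are illustrative only.
records = [
    {"major": "Sociology", "state": "NY", "birth_year": 1987},
    {"major": "Sociology", "state": "NY", "birth_year": 1987},
    {"major": "Physics", "state": "MT", "birth_year": 1986},
    {"major": "Economics", "state": "CA", "birth_year": 1987},
    {"major": "Economics", "state": "CA", "birth_year": 1987},
    {"major": "Economics", "state": "TX", "birth_year": 1988},
]


def unique_combinations(rows, keys):
    """Return the attribute combinations held by exactly one row."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Combining three attributes singles out more records than one attribute alone.
singletons = unique_combinations(records, ["major", "state", "birth_year"])
by_major_only = unique_combinations(records, ["major"])
```

In this toy example, `major` alone isolates one record, while the three attributes together isolate two, even though no name or identifier appears anywhere in the data.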
Furthermore, network data offer additional analytic means of identifying research subjects. For individual-level data, guidelines exist for protecting against the disclosure of PII. These standards restrict the presentation of detailed analytic combinations that would reduce the number of specified cases below a certain threshold (e.g., not presenting any table cells with fewer than five cases for “sensitive data” according to the Office for National Statistics (2006)). Along similar lines, reporting structural position within network data can reveal individual identities (e.g., those of individuals who are especially central, peripheral, or otherwise occupying unique positions). Given that social networks research so commonly relies on the visual presentation of data (Freeman 2004), researchers must evaluate whether such presentation would potentially violate confidentiality agreements with their research subjects (Borgatti and Molina 2003). The answer to this question is not always apparent, and the potential problems are broader than for most individually oriented research: in the SNA case, identifying one node may unravel the identities of many others to which it is directly or indirectly connected.
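One way a researcher might screen for structurally conspicuous nodes before publishing a figure is sketched below. Applying a minimum-count threshold to degree, by analogy with the small-cell rule for tables, is an assumption made here for illustration and not a standard stated in the sources cited above. The edge list and the function names `degrees` and `rare_positions` are hypothetical, and degree is only one of many structural properties that could single out a node.

```python
from collections import Counter, defaultdict

# Hypothetical undirected edge list over anonymized node labels.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"),
    ("b", "c"), ("d", "e"), ("e", "f"),
]


def degrees(edge_list):
    """Count the number of ties incident to each node."""
    deg = defaultdict(int)
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def rare_positions(edge_list, threshold):
    """Return nodes whose degree is shared by fewer than `threshold` nodes."""
    deg = degrees(edge_list)
    freq = Counter(deg.values())
    return sorted(node for node, d in deg.items() if freq[d] < threshold)


# With threshold=2, any node with a one-of-a-kind degree is flagged.
flagged = rare_positions(edges, threshold=2)
```

Here the most central node (`a`), the second most connected node (`e`), and the single pendant node (`f`) are flagged, since each has a degree that no other node shares. A fuller check would also consider other positional measures and the neighbors whose identities a flagged node could expose.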