Disclosure Risk of Contextual Data: The Role of Identified Geography, Spatial Scale, and Nesting of Information in Public-Use Files
Kristine Witkowski, University of Michigan
This project investigates how contextual data attached to individual-level records in a dataset relate to disclosure risk. Analyzing an array of contextual data at different spatial scales, I simulate models to measure the likelihood of pinpointing geographic location under various distributional scenarios. Specifically, I investigate how disclosure risk for geographic units is affected by (1) spatial scale; (2) providing contextual data at multiple geographic levels; and (3) identifying region and state. I use individual geographic units as well as the “test data collection” as my units of analysis, with the amount of disclosure risk as the outcome of interest and associated experimental traits (e.g., masking technique) as explanatory variables, and I produce descriptive statistics, maps, and multivariate analyses. As expected, disclosure risk increases when the geographic scope of a study is constrained to a sub-national level. Reidentification risk increases with the spatial scale of contextual data and is compounded when data are provided at multiple scales.