Differential Privacy in Census Data

Recent work on differential privacy for census data focuses on algorithms that balance privacy against accuracy, allowing for more detailed and informative data releases. A key technique is adding noise drawn from a discrete Gaussian distribution to published statistics, which has been shown to satisfy variants of differential privacy such as zero-concentrated differential privacy (zCDP). Another important direction is the development of hierarchical methods for differentially private counting queries, which can handle complex datasets with multiple categorical features. Noteworthy papers in this area include:

  • PHSafe, which implemented a disclosure avoidance algorithm for the 2020 Census Supplemental Demographic and Housing Characteristics File.
  • SafeTab-P, which introduced an adaptive approach to choosing the granularity of data release.
  • InfTDA, which extended a TopDown mechanism for mobility datasets to a general setting, allowing for differentially private synthetic datasets that answer hierarchical queries.
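
To make the discrete Gaussian idea concrete, the following is a minimal sketch of adding discrete Gaussian noise to integer counts. It is not taken from PHSafe, SafeTab-P, or InfTDA. It follows the exact rejection sampler described by Canonne, Kamath, and Steinke and relies on their result that noise with variance parameter sigma^2 added to a query with L2 sensitivity Delta satisfies rho-zCDP for rho = Delta^2 / (2 sigma^2). The function names, the `l2_sensitivity` parameter, and the example counts are illustrative assumptions.

```python
import math
import random
from fractions import Fraction


def bernoulli(p, rng):
    """Return True with probability p, where p is a Fraction in [0, 1]."""
    return rng.randrange(p.denominator) < p.numerator


def bernoulli_exp(x, rng):
    """Return True with probability exp(-x) for a Fraction x >= 0."""
    while x > 1:
        # exp(-x) = exp(-1) * exp(-(x - 1)); peel off one factor at a time.
        if not bernoulli_exp(Fraction(1), rng):
            return False
        x -= 1
    k = 1
    while bernoulli(x / k, rng):
        k += 1
    return k % 2 == 1


def discrete_laplace(t, rng):
    """Sample y with P(y) proportional to exp(-|y| / t), t a positive int."""
    while True:
        magnitude = 0
        while bernoulli_exp(Fraction(1, t), rng):
            magnitude += 1
        negative = rng.randrange(2) == 1
        if negative and magnitude == 0:
            continue  # reject "-0" so that zero is not counted twice
        return -magnitude if negative else magnitude


def discrete_gaussian(sigma2, rng):
    """Sample y with P(y) proportional to exp(-y^2 / (2 * sigma2))."""
    sigma2 = Fraction(sigma2)
    t = math.isqrt(math.floor(sigma2)) + 1
    while True:
        y = discrete_laplace(t, rng)
        bias = (abs(y) - sigma2 / t) ** 2 / (2 * sigma2)
        if bernoulli_exp(bias, rng):
            return y


def noisy_counts(counts, rho, l2_sensitivity=1, rng=None):
    """Add discrete Gaussian noise calibrated to a rho-zCDP budget.

    Assumption: l2_sensitivity bounds the L2 change of the whole vector
    of counts between neighboring datasets; the caller must justify it.
    """
    if rng is None:
        rng = random.SystemRandom()  # OS entropy rather than a seeded PRNG
    sigma2 = Fraction(l2_sensitivity) ** 2 / (2 * Fraction(rho))
    return [c + discrete_gaussian(sigma2, rng) for c in counts]


if __name__ == "__main__":
    # Hypothetical block-level counts; rho = 1/8 gives sigma^2 = 4.
    print(noisy_counts([120, 0, 37, 5], rho=Fraction(1, 8)))
```

The sampler works with exact rational arithmetic instead of floating point, and the noisy outputs stay integers, which is one reason discrete noise is a natural fit for count data. Noisy counts can be negative, so a production system would need a separate post-processing step, which this sketch omits.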

Sources

PHSafe: Disclosure Avoidance for the 2020 Census Supplemental Demographic and Housing Characteristics File (S-DHC)

SafeTab-P: Disclosure Avoidance for the 2020 Census Detailed Demographic and Housing Characteristics File A (Detailed DHC-A)

SafeTab-H: Disclosure Avoidance for the 2020 Census Detailed Demographic and Housing Characteristics File B (Detailed DHC-B)

InfTDA: A Simple TopDown Mechanism for Hierarchical Differentially Private Counting Queries
