Transactions on
Data Privacy
Foundations and Technologies

http://www.tdp.cat



Volume 12 Issue 2


Analyzing the disclosure risk of regression coefficients

Felix Ritchie(a),(*)

Transactions on Data Privacy 12:2 (2019) 145 - 173

(a) Bristol Centre for Economics and Finance, University of the West of England, Coldharbour Lane, Bristol BS16 1QY. UK.

e-mail: felix.ritchie@uwe.ac.uk


Abstract

A major growth area in social science research this century has been access to highly sensitive confidential microdata, often via restricted-access remote facilities. These give researchers largely unrestricted freedom to manipulate the data, but check for disclosure risk before statistical results can be published. Effective output-based statistical disclosure control (OSDC) is therefore central to the effective use of confidential microdata for research.

Multiple regression is a key analytical tool for researchers, and so knowing whether multiple regression results are 'safe' for release is essential for research facilities. This is a relatively unexplored field; guidelines used by almost all restricted-access facilities reference an informal document from 2006, but more recent work suggests that problems may exist.

This paper demonstrates that linear regression coefficients show no substantive disclosure risks in realistic environments, and so should be considered as 'safe statistics' in the terminology of this field. Conflicting results in the literature reflect institutional perceptions rather than statistical differences, the confusion of statistical quality with disclosure risk, or the failure to identify the source of risk. The result has important implications for those responsible for providing research access to sensitive data.

The paper explores this result on simple linear regression models; more complex models are shown to be 'safer' subsets. Non-linear models pose slightly different problems, but this paper indicates a way such models may be tackled.

* Corresponding author.

ISSN: 1888-5063; ISSN (Digital): 2013-1631; D.L.:B-11873-2008; Web Site: http://www.tdp.cat/
Contact: Transactions on Data Privacy; Vicenç Torra; Umeå University; 90187 Umeå (Sweden); e-mail:tdp@tdp.cat

 


Vicenç Torra, Last modified: 00:08 May 19 2020.