Generating Synthetic Data for Statistical Disclosure Control
02/12/2014 - 03/12/2014
Southampton Statistical Sciences Research Institute, Building 39, University of Southampton, Highfield Campus, Southampton
View in Google Maps (SO17 1BJ)
NOTE: THIS COURSE IS NOW FULLY BOOKED AND THE ON-LINE BOOKING SYSTEM HAS BEEN CLOSED.
Course No. ADRCE-Training011 Drechsler
Short Summary of Course
This short course provides a detailed overview of the synthetic data approach to statistical disclosure control, covering all of its important aspects. Starting with a short introduction to data confidentiality in general and synthetic data in particular, the workshop will discuss the different approaches to generating synthetic datasets in detail. Possible modelling strategies and methods for evaluating analytical validity will be assessed, and potential measures to quantify the remaining risk of disclosure will be presented. Finally, recent extensions of the synthetic data approach will be reviewed, and the opportunities and obstacles of the approach will be discussed. To give participants hands-on experience, all steps will be illustrated using simulated and real data examples in R.
The course covers:
This event includes computer workshops.
The practical implementation of the approach will be illustrated using the statistical software R.
Jörg Drechsler is the deputy head of the Department for Statistical Methods at the Institute for Employment Research in Nürnberg. He studied business administration in Nürnberg and obtained his Ph.D. from the University of Bamberg in 2009. During the winter term 2011/2012 he held an interim professorship at the Institute for Statistics at the Ludwig-Maximilians-University in Munich. His main research interests are data confidentiality and nonresponse in surveys. He has received several awards for his research on synthetic data and recently published a book on the topic.
The course aims to summarize the state of the art in synthetic data. The main focus will be on practical implementation rather than on the motivation of the underlying statistical theory. Participants may be academic researchers or practitioners from statistical agencies working in the area of data confidentiality and data access. Some background in Bayesian statistics is helpful but not obligatory.
This is a two-day course. On day one, registration opens at 9.30; formal teaching will commence at 10.00 and finish at around 17.00. On day two, teaching will start at 9.00 and finish at around 16.00.
Event Outline (Programme)
1. A Brief History of Data Confidentiality
Some background regarding general linear modelling is expected. Familiarity with the concept of Bayesian statistics is helpful but not required. The statistical software R will be used to illustrate the implementation of the approach.
Kinney, S. K., Reiter, J. P., Reznek, A. P., Miranda, J., Jarmin, R. S., and Abowd, J. M. (2011), Towards unrestricted public use business microdata: The synthetic Longitudinal Business Database, International Statistical Review, 79, 363-384.
Participants will receive written course notes.
University of Southampton/ADRC-E
Intermediate (some prior knowledge)
Thanks to ESRC funding we are able to offer this course at reduced rates as follows:
1) £30 per day for UK registered students
2) £60 per day for staff at UK academic institutions, RCUK funded researchers, UK public sector staff and staff in UK registered charity organisations
3) £220 per day for all other participants
4) Free place for ADRC-E and ADRN/ADS staff
The course fee includes course materials, lunches, and morning and afternoon refreshments. Travel and accommodation are to be arranged and paid for by the participant.