The Impact of Synthetic Data Generation Techniques in IDI Research

Synthetic data is a versatile tool. When used correctly, it is compliant with the data confidentiality principles, rules and methods set out by Statistics NZ. We believe synthetic data in general may carry two risks: singling out an individual in the dataset who has an extreme value, and linkage attacks that combine more than one dataset. To mitigate these risks, we will follow Statistics NZ release guidance and draw on the expertise of the Statistics NZ release team.

Digitizing Birth Records for Intergenerational Research

Intergenerational links are important for assessing intergenerational transfer of wealth, intergenerational socio-economic mobility, and familial influences on health and wellbeing. Department of Internal Affairs (DIA) birth records in the Integrated Data Infrastructure (IDI) can be used for intergenerational research, but these records are currently available only from 1985 onwards, which limits the ability to investigate multigenerational links. Digitization of earlier records back to 1972 (when full dates of birth - a necessary linking variable - were first recorded) would enable greater exploration of multigenerational links for the full population of New Zealand births over a 50-year period.

What's in the IDI?

The IDI Search App allows researchers to explore what variables are available in the IDI, and provides descriptions and SQL information about each variable. Check out the app here: https://idi-search.web.app

Intergenerational Analyses using the IDI

This report documents the potential for intergenerational links in the Integrated Data Infrastructure (IDI). Three questions are answered: How many people can be linked back one, two, and three generations? How does this vary by decade of birth? How does this vary by ethnicity and deprivation?
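
As an illustration of the first question only, the following is a minimal sketch of how "linked back one, two, and three generations" could be counted from child-to-parent records. It is not the report's method. The `ParentMap` structure, the function names, and all identifiers in the example data are hypothetical, and the sketch assumes a person counts as linked at a given generation if at least one ancestor at that level is recorded.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical lookup: person ID -> (mother ID, father ID); None means no recorded parent.
ParentMap = Dict[str, Tuple[Optional[str], Optional[str]]]


def generations_linked(person: str, parents: ParentMap, max_depth: int = 3) -> int:
    """Return how many generations back (0..max_depth) at least one ancestor is recorded."""
    depth = 0
    current = [person]
    while depth < max_depth:
        # Collect every recorded parent of everyone at the current level.
        next_level = [
            parent
            for p in current
            for parent in parents.get(p, (None, None))
            if parent is not None
        ]
        if not next_level:
            break
        depth += 1
        current = next_level
    return depth


def count_by_depth(people: Iterable[str], parents: ParentMap, max_depth: int = 3) -> Dict[int, int]:
    """Count how many people can be linked back at least 1, 2, ... max_depth generations."""
    counts = {d: 0 for d in range(1, max_depth + 1)}
    for person in people:
        reached = generations_linked(person, parents, max_depth)
        for d in range(1, reached + 1):
            counts[d] += 1
    return counts


# Made-up example data, for illustration only.
example_parents: ParentMap = {
    "c1": ("m1", "f1"),
    "m1": ("gm1", None),
    "gm1": ("ggm1", None),
    "c2": ("m2", None),
    "c3": (None, None),
}
example_counts = count_by_depth(["c1", "c2", "c3"], example_parents)
print(example_counts)
```

Breakdowns by decade of birth, ethnicity, or deprivation could follow the same pattern by grouping the people passed to `count_by_depth`, assuming those attributes are available for each person.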