The Impact of Synthetic Data Generation Techniques in IDI Research

Synthetic data is a versatile tool. When used correctly, it complies with the data confidentiality principles, rules, and methods set out by Statistics NZ. We believe synthetic data in general carries two risks: singling out an individual who has an extreme value in the dataset, and linkage attacks that combine more than one dataset. To mitigate these risks, we will follow Statistics NZ release guidance and draw on the expertise of the Statistics NZ release team.

IDI Search App

The IDI Search App allows researchers to explore what variables are available in the IDI, and provides descriptions and SQL information about each variable. The app uses Data Dictionaries from Statistics NZ to collate collection, dataset, and variable information, which is stored within a relational database. The front end is powered by Next.js, allowing us to provide an interface for users to list all variables or, more usefully, only those variables, datasets, and collections that match a specific search term.
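As a rough illustration of the collection, dataset, and variable hierarchy described above, the following is a minimal sketch of how such a relational store could be searched by term. It is not the app's actual schema or query: the table names, column names, sample rows, and the functions `build_demo_db` and `search_variables` are all hypothetical, and SQLite is used only because it runs in memory without setup.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-level schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE collection (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE dataset (
            id            INTEGER PRIMARY KEY,
            collection_id INTEGER NOT NULL REFERENCES collection(id),
            name          TEXT NOT NULL
        );
        CREATE TABLE variable (
            id          INTEGER PRIMARY KEY,
            dataset_id  INTEGER NOT NULL REFERENCES dataset(id),
            name        TEXT NOT NULL,
            description TEXT
        );
        """
    )
    # Made-up rows for illustration only; these are not real IDI names.
    conn.executemany(
        "INSERT INTO collection VALUES (?, ?)",
        [(1, "demo_collection_a"), (2, "demo_collection_b")],
    )
    conn.executemany(
        "INSERT INTO dataset VALUES (?, ?, ?)",
        [(1, 1, "demo_births"), (2, 2, "demo_income")],
    )
    conn.executemany(
        "INSERT INTO variable VALUES (?, ?, ?, ?)",
        [
            (1, 1, "birth_year", "Year of birth (illustrative)"),
            (2, 1, "parent_id", "Identifier of a parent (illustrative)"),
            (3, 2, "income_amt", "Annual income (illustrative)"),
        ],
    )
    return conn


def search_variables(conn, term):
    """Return (collection, dataset, variable, description) rows matching term.

    A row matches when the term appears in the variable name, the variable
    description, the dataset name, or the collection name. SQLite's LIKE is
    case-insensitive for ASCII text. The wildcard characters % and _ inside
    the term are not escaped in this sketch.
    """
    pattern = f"%{term}%"
    return conn.execute(
        """
        SELECT c.name, d.name, v.name, v.description
        FROM variable AS v
        JOIN dataset AS d ON d.id = v.dataset_id
        JOIN collection AS c ON c.id = d.collection_id
        WHERE v.name LIKE ?
           OR v.description LIKE ?
           OR d.name LIKE ?
           OR c.name LIKE ?
        ORDER BY c.name, d.name, v.name
        """,
        (pattern, pattern, pattern, pattern),
    ).fetchall()
```

Calling `search_variables(build_demo_db(), "birth")` returns the two variables in the hypothetical `demo_births` dataset: one matches on its own name and the other on the name of its dataset. Passing the term as a bound parameter rather than formatting it into the SQL string keeps user input out of the query text.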

Digitizing Birth Records for Intergenerational Research

Intergenerational links are important for assessing intergenerational transfer of wealth, intergenerational socio-economic mobility, and familial influences on health and wellbeing. Department of Internal Affairs (DIA) birth records in the Integrated Data Infrastructure (IDI) can be used for intergenerational research, but these records are currently available only from 1985 onwards, which limits the ability to investigate multigenerational links. Digitization of earlier records back to 1972 (when full dates of birth, a necessary linking variable, were first recorded) would enable greater exploration of multigenerational links for the full population of New Zealand births over a 50-year period.

What's in the IDI?

The IDI Search App allows researchers to explore what variables are available in the IDI, and provides descriptions and SQL information about each variable. This build is a prototype of the features we would like, and served as a proof of concept before building a more fully featured and user-friendly version. Check out the prototype app here: https://idi-search.web.app. The full version is here:

Intergenerational Analyses using the IDI

This report documents the potential for intergenerational links in the Integrated Data Infrastructure (IDI). Three questions are answered: How many people can be linked back one, two, and three generations? How does this vary by decade of birth? How does this vary by ethnicity and deprivation?
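To make the first question concrete, the sketch below counts how many people can be linked back one, two, and three generations, given a mapping from each person to their recorded parents. It is an illustration only and not the method used in the report: the function names, the input format, and the rule that a person counts as linked back d generations when at least one ancestor is recorded at generation d are all assumptions.

```python
def generations_linked(person, parents_of, max_depth=3):
    """Return how many generations back a person can be traced (0 to max_depth).

    parents_of maps a person identifier to a list of recorded parent
    identifiers. Under the assumption used here, a generation counts as
    reached when at least one ancestor at that level has a record.
    """
    frontier = {person}
    depth = 0
    while depth < max_depth:
        next_generation = set()
        for individual in frontier:
            next_generation.update(parents_of.get(individual, ()))
        if not next_generation:
            break  # No recorded ancestors at the next level.
        frontier = next_generation
        depth += 1
    return depth


def count_by_depth(people, parents_of, max_depth=3):
    """Count people linkable back at least 1, 2, ..., max_depth generations."""
    counts = {d: 0 for d in range(1, max_depth + 1)}
    for person in people:
        reached = generations_linked(person, parents_of, max_depth)
        for d in range(1, reached + 1):
            counts[d] += 1
    return counts


# Hypothetical identifiers for illustration only.
example_links = {"c": ["p"], "p": ["g"], "g": ["gg"], "x": ["y"]}
example_counts = count_by_depth(["c", "p", "x", "z"], example_links)
```

In this toy input, three people have at least one recorded parent, two can be traced to a grandparent, and one to a great-grandparent. Breaking the same counts down by decade of birth, ethnicity, or deprivation would amount to grouping the people by that attribute before counting.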