Contributed talks

[Presented in te reo Māori] He Ara Poutama mō te Reo Māori — Forecasting te Reo Māori Acquisition
Speaker: Pip Bennett

He Ara Poutama mō te Reo Māori is a project that focuses on forecasting levels of te Reo Māori acquisition into the future, including consideration of the impacts of changing language revitalisation initiatives. We are working with subject matter experts and organisations to help ensure this taonga is accessible for present and future generations. Nicholson Consulting is a data science consultancy in Te Whanganui-a-Tara; we are partnering with Kōtātā Insights to use quantitative methods to assist our clients on their haerenga to see Māori as a living language and as an ordinary means of communication. For more information contact

Accelerating Dataset Assembly
Speaker: Simon Anastasiadis

Data preparation is a key stage for analytic and research projects. However, as the number of data sources increases so does the complexity of preparation. Without a consistent method for assembling analysis-ready datasets, this process can become time-consuming, expensive, and error prone.

In response to this, we have developed the Dataset Assembly Tool. By standardising and automating the data preparation and dataset assembly stages of analytic projects, the tool helps staff deliver higher quality work faster. The assembly tool is now available for other researchers to use. This presentation will describe the tool and its advantages.

Adopting an improvement science approach to inform ongoing implementation of the Mana Ake initiative in Canterbury
Speaker: Caralyn Purvis

Recognising the value of data and analytics saw the integration of an improvement science approach to inform service delivery within a cross-sector mental health initiative, Mana Ake. An increased prevalence of wellbeing concerns resulted in a holistic, locally-informed, collaborative initiative for tamariki. Internal review of mixed-methods data has allowed for a truly data-informed iterative approach to service improvement. Consequently, evaluation has influenced the responsiveness of the initiative to optimally meet the needs of the communities it supports. Mana Ake is a collaborative initiative between health, education, and wider communities. Evaluation domains include: tamariki, whānau, schools/kura, communities, system.

More information:

AIS Explorer — Integrating technology to prevent aquatic invasive species in Minnesota
Speaker: Nick Snellgrove and Petra Muellner

Minnesota in the United States is well known for its beautiful “10,000 lakes” - however the complex system of waterbodies is threatened by the spread of aquatic invasive species. We developed the AIS Explorer in collaboration with the Aquatic Invasive Species Research Center of the University of Minnesota to bridge research and decision making, for example to optimise watercraft inspections.

AIS Explorer:

In this talk we will showcase how a diverse set of technologies, like R, R Shiny, Python and cloud computing, were integrated to create a tool that is easy to use and matches stakeholder needs.

An analysis of household changes using Housing Register data
Speaker: Joel Gibbs

There are a number of data sources and research methods that can be used to understand how households change over time. This project used Social Housing Register data collated via the Integrated Data Infrastructure to analyse the housing pathways of individuals and households who require public housing, and sought to understand the likelihood that an individual or household that leaves the Housing Register will re-enter the Housing Register in the future. For individuals that re-enter the Housing Register, this research also looked at how households tend to change between applications.

Analysing Surveys with iNZight
Speaker: Tom Elliott

Survey data is hugely important for many research groups, but many software tools require users to understand the survey design and know how to specify it to the relevant program. iNZight is a graphical user interface for R that lets researchers quickly and efficiently visualise and explore their data. Recent changes to iNZight allow users to forget the design and focus on what matters: exploring their data.

For more information, visit

Application of data science methodologies to explore, predict, and model wellbeing outcomes using the New Zealand Integrated Data Infrastructure (IDI).
Speaker: Anantha Narayanan, Tom Stewart, and Scott Duncan

Wellbeing measures currently available in New Zealand are not sensitive enough to capture the effects of policy change – this is partly due to the lack of detailed population-level wellbeing data. Microdata within NZ’s IDI promises to provide a better understanding of wellbeing; however, detailed measures of wellbeing (from the General Social Survey) are only available for a smaller subset of IDI records (i.e., ~8,500 individuals). Extrapolating these wellbeing data to the full IDI population may be possible via advanced data modelling. In this research, we aim to apply machine learning techniques that can predict wellbeing outcomes from IDI administrative data.

Association between Occupation and Ischaemic Heart Disease (IHD) by sex and ethnicity in the whole New Zealand population using the Integrated Data Infrastructure (IDI)
Speaker: Marine Corbin

Associations between IHD and occupations are poorly understood. This whole-population study aimed to identify occupations associated with increased IHD risk in NZ by sex and ethnicity. A cohort of workers was constructed for the whole NZ adult working population using the Statistics NZ IDI. Occupation was obtained from census data and incident IHD was determined using hospitalisation, prescription, and death records from 2013 to 2018. This study confirmed an increased IHD risk for several occupations previously identified as being high risk for IHD and has also identified some potentially new occupational groups. Important sex and ethnic differences were also observed.

Augmented Decisions - even when you don’t have all the data!
Speaker: Peter Hong

Houston We Have uses proprietary software, data, technology and mathematics to help humans make better decisions, especially when most at risk. Sectors where we already develop predictions, forecasts and risk models for enhanced strategic and operational decision making include defence, financial services and fintech, natural resources, and health. Our software, technology and data insight capability can be deployed in any area where better quality, visibility and traceability around decision making is important.

Automating national land cover mapping using artificial intelligence
Speaker: Alexander Amies and Jan Schindler

Land cover mapping is important for national- and regional-scale policy decision making, and environment and biodiversity monitoring. Previous time steps of Manaaki Whenua’s Land Cover Database (LCDB) have leveraged expert analysis of large volumes of multi-temporal satellite imagery to identify areas of change. This approach traditionally relies on rule-based decision systems in conjunction with time consuming manual data cleaning. We explore the potential of artificial intelligence for developing more flexible, automated processes for land cover classification and change detection. We compare machine learning and object-based methods with image-based convolutional encoder-decoders enabling more fine-grained spatial and temporal predictions.

Bayesian Demography
Speaker: John Bryant

If your data consists of tables rather than individual records, and if one of your variables is age, then you are doing applied demography. Bayesian statistical methods help you do applied demography better. Bayesian methods are particularly helpful if your data is disaggregated, or if you are forecasting. The technical barriers to doing Bayesian demography can, however, be daunting. The talk describes a long-term project to develop methods and tools that reduce these barriers.

Building for wellbeing
Speaker: Michael Nuth

The aim of this project is to develop a digital post occupancy evaluation to efficiently capture the self-reported qualitative perspectives of building occupants about the wellbeing performance of residential buildings. By doing so, my intention is to develop an app-based wellbeing assessment and data collection tool that is informed by the Government’s Living Standards Framework. Data collected via the app will ultimately help inform the planning, design and construction of residential buildings that meet the wellbeing needs of New Zealanders.

The app will be developed and trialled in collaboration with Auckland University of Technology, Tether Ltd and Kāinga Ora.

Contact Michael Nuth @ for more information.

Calibrating Economic Agent-Based Models with Microdata
Speaker: Will Scarrold

Despite the complexity of modern socio-economic systems, current benchmark models assume the economy is much simpler than it really is: households are often assumed to be identical, and firms are assumed to use the same equipment to produce the same “representative” product.

Recent developments in Agent-Based Modelling have provided an alternative to conventional models. Such models show comparable forecasting performance despite being at a relatively early stage of development.

This presentation will discuss agent-based modelling and the use of microdata from the New Zealand IDI to calibrate these models.

More information on recent macroeconomic agent-based models can be found at

Can behavioural science help stop callers from hanging up?
Speaker: Caitlin Spence

Thousands of people ring the Ministry of Justice every week, but some abandon the call (hang up) before they reach an agent. With the aim of keeping callers on the line, Behavioural Science Aotearoa worked with the contact centre to develop three sets of hold messages designed to reduce caller abandonment. The messages are currently being trialled using a quasi-experimental ‘reversal’ design. The team take advantage of administrative call-level data to measure impact by applying survival analysis techniques.
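The abstract does not say which survival analysis technique is applied to the call-level data. As a rough illustration only, the sketch below uses a Kaplan-Meier estimator, treating a hang-up as the event and a call answered by an agent as censored. The function name, the censoring rule and the sample durations are all assumptions made for this example, not the team's method or data.

```python
def kaplan_meier(durations, abandoned):
    """Kaplan-Meier estimate of the probability a caller is still on hold.

    durations: seconds each caller spent on hold (hypothetical data).
    abandoned: True if the caller hung up (the event), False if the call
               was answered by an agent (treated here as censored).
    Returns a list of (time, survival_probability) at each hang-up time.
    """
    data = sorted(zip(durations, abandoned))
    at_risk = len(data)
    survival = 1.0
    curve = []
    idx = 0
    while idx < len(data):
        t = data[idx][0]
        events = 0   # hang-ups at time t
        leaving = 0  # everyone who stops being at risk at time t
        while idx < len(data) and data[idx][0] == t:
            events += 1 if data[idx][1] else 0
            leaving += 1
            idx += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical hold times in seconds and whether each caller hung up.
hold_seconds = [10, 20, 20, 30, 40]
hung_up = [True, False, True, False, True]
curve = kaplan_meier(hold_seconds, hung_up)
```

Curves estimated separately for each set of hold messages could then be compared to see which keeps callers on the line longer.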

Behavioural Science Aotearoa apply behavioural science across the Justice Sector to build more effective, people-centred justice services and policies that improve outcomes for all.

Connecting research data with stakeholders – the Gambling Data Explorer
Speaker: Uli Muellner

The National Gambling Study (NGS) is the first New Zealand population representative longitudinal study into gambling behaviours and attitudes, health, and lifestyles. To support the dissemination and use of the findings from the study the National Gambling Study Explorer provides easy and user-friendly access for stakeholders to information on gambling participation and epidemiological risk factors. The Gambling Data Explorer is freely available at This project was delivered for the Ministry of Health in collaboration with AUT. It demonstrates an innovative way to share research findings and data with stakeholders through online dashboards.

Constructing a spatial index of public transport supply across Auckland, Canterbury and Greater Wellington.
Speaker: Adam Ward

In this study we constructed a low-level spatial index of public transport (PT) supply across the 25,000 meshblocks in the Auckland, Canterbury and Greater Wellington regions. Using Open Route planner and Google Transit Feed data, for each meshblock in the study area we calculated the walking distance to the nearest PT location, the number of accessible locations (per capita), the number of accessible routes and (peak-time) services and the transit time from place of residence to place of work via PT networks. These metrics were then combined into an overall PT Supply Score. The results of this study are visualised in an interactive Power BI dashboard.

Constructing an individual level interaction network for modelling COVID-19 in Aotearoa NZ
Speaker: Emily Harvey

Illnesses and deaths from the COVID-19 pandemic have not been evenly distributed across communities. Risk factors for serious disease (including age, ethnicity, and health conditions) and risk factors for infection (including dwelling size, inter-generational living, and type of work) are not independent. In order to capture the interactions between these factors we build on linked data from the Statistics NZ Integrated Data Infrastructure (IDI) to construct a network of the ~5 million individuals and their interaction contexts, in particular homes, workplaces, and schools. I will highlight our progress so far, and priorities for future refinements.

Data Wrangling At Warp Speed with Segna
Speaker: Will Haringa

As a data scientist you want your data in an analysis-ready form faster, delivered through pipelines that don’t break as the shape of your data changes. Harnessing the power of machine learning, Segna’s smart wrangling tool aggregates and cleans data from multiple sources up to 1,600 times faster!

More information:

Epidemiology for non-epidemiologists
Speaker: Andrew Sporle and Daniel Barnett

Comparing population health outcomes for different places, population groups or time periods is complicated by access to data, different population sizes and population age structures. There are robust epidemiological methods for doing these calculations, but they are not always straightforward, especially when depicting the precision estimates in the results. Mātau is a tool that embeds various robust epidemiological methods and population data, and imports outcome data, to create a point-and-click way to calculate levels of health outcomes and compare populations. Already in use in Aotearoa and overseas, it is quick and easy to learn, and presents all results in table and graph format.

Epidemix - visualising analytical complexity to improve decision-making for disease control
Speaker: Petra Muellner

Epidemix^[Muellner U, Fournie G, Muellner P, Ahlstrom C, Pfeiffer D. epidemix - an Interactive Multi-Model Application for Teaching and Visualizing Infectious Disease Transmission. Epidemics, doi: 10.1016/j.epidem.2017.12.003, 2017.] allows users to develop an understanding of the impact of disease modelling assumptions on the trajectory of an epidemic and the impact of control interventions, without having to directly deal with the complexity of equations and programming languages. The app provides a visual interface for nine generic models, plus two disease-specific models of international relevance (COVID-19 and ASF). Epidemix supports the teaching of mathematical modelling to non-specialists – including policy makers – by demonstrating key concepts of disease dynamics and control in a hands-on way. Funded by the Royal Veterinary College and the City University of Hong Kong (

Exploratory Drivers of Transition as Applied on Workforce Modelling
Speaker: Joel E. Bancolita

Describing factors affecting individuals’ transitions addresses several policy questions. For instance, differentiating retention and exit rates of groups of the workforce based on geography, demography or intervention may inform decision-makers in formulating more targeted policies that can influence funding and the lives of people. We set out to develop a framework for analysing the drivers of transition in the IDI and employed transition modelling to answer these questions, with the aim of replicating this across several aspects of people’s lives, such as labour force, education and health. We also employed classification models to address some of the limitations, such as imputing occupation.

Family Violence in the News: An analysis of media reporting of extreme family violence in New Zealand
Speaker: Harini Dissanayake

This study investigated whether coverage of extreme family violence in New Zealand media is biased across a range of key factors: gender, ethnicity, and age of victims, as well as the victim’s relationship to the killer. Our results were derived from a cohort of 946 articles published online by New Zealand media outlets. Analyzing the number of media articles relating to victims from each group (exposure) and their online presence (prominence), we found that although media coverage is generally quite equitable, certain groups of victims are severely under-represented. This is particularly true for victims from Pacifica communities and elderly victims.

Growth monitoring Aotearoa: Scoping a national system for tamariki and rangatahi
Speaker: Teresa Gontijo de Castro

In New Zealand (NZ) three in ten 2-14-year-olds are affected by overweight/obesity, with ethnic disparities. The existing data used to monitor growth of 0-19-year-olds are fragmented and mostly aggregated at the national level. Thus, NZ lacks group-specific information on prevalence, trends, and determinants of healthy growth for this group, crucial information if equitable improvements are to be achieved. We will map and assess diverse national sources of anthropometric data of 0-19-year-olds (last 2 decades) to establish what would be required and which sources would be suitable to be included in a national monitoring platform. For more information:

He Ara Poutama mō te Reo Māori — Forecasting te Reo Māori Acquisition
Speaker: Pip Bennett

He Ara Poutama mō te Reo Māori is a project that focuses on forecasting levels of te Reo Māori acquisition into the future, including consideration of the impacts of changing language revitalisation initiatives. We are working with subject matter experts and organisations to help ensure this taonga is accessible for present and future generations. Nicholson Consulting is a data science consultancy in Te Whanganui-a-Tara; we are partnering with Kōtātā Insights to use quantitative methods to assist our clients on their haerenga to see Māori as a living language and as an ordinary means of communication. For more information contact

HEEP2 - Energy Insights from NZ Homes
Speaker: Greg Overton

The Household Energy End-Use Project #2 (HEEP2) is collecting data on how, and why, energy is used in households across New Zealand. From July 2021, HEEP2 will be recruiting 280 households from the Stats NZ Household Economic Survey. These households will each be monitored over 12 months, using a range of instrumentation and surveys, with the final up-to-date picture of energy use being available in the Stats NZ Data Lab. The HEEP2 data will allow researchers to answer a whole raft of questions, with the aim of understanding how to enable and motivate people to affordably create healthy home environments, in ways that contribute to a low-emissions economy.

For more information visit

How do victims of crime interact with the government?
Speaker: Callum Sleigh

A large proportion of victimisation in New Zealand is unreported. Because of this, the Ministry of Justice has commissioned a series of surveys to understand the nature and prevalence of unreported victimisation. By linking this survey data to administrative datasets in the IDI it is possible to understand government service use for victims who do not necessarily report to the Police.

How people learn Te Reo Māori: Evidence from the IDI
Speaker: Matt Jones

This presentation uses the IDI to combine Census and education data to estimate the number of young people who learn te reo Māori through education versus other methods. It also estimates the amount of formal education required to become proficient in the language. Finally it suggests how a model might work for understanding the growth of te reo speakers over time.

Improved synthetic data method for Stats NZ datalab microdata output
Speaker: Alistair Ramsden

Stats NZ has approved a variation to Microdata Output Guide confidentiality rules, for the ‘He Ara Poutama mō te reo Māori SURF’ datalab project.

The synthetic data method is Classification And Regression Tree (CART) modelling, implemented using the Synthpop R package. The difference compared to existing rules is to generate, test, and release synthetic data counts tables with noised counts values {0,3,6,9,12,…}. Previously such values were {‘Suppressed’,6,9,12,…}.

The new method releases data with better utility (inferential validity) while retaining adequate safety (disclosure control).

Next steps are to calculate Differential Privacy (DP) parameters {epsilon, delta} for this data and method.
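The abstract states that released counts take values in {0,3,6,9,12,…} but does not say how the noise is applied. One common way to obtain such values is unbiased random rounding to base 3, sketched below. This is an illustrative assumption, not the approved Stats NZ rule or anything provided by the Synthpop package; the function name and the example table are hypothetical.

```python
import random


def random_round_base3(count, rng=random):
    """Randomly round a non-negative integer count to a multiple of 3.

    A count already divisible by 3 is left unchanged. Otherwise it is
    rounded up with probability remainder/3 and down with probability
    1 - remainder/3, so the expected value equals the original count.
    """
    remainder = count % 3
    if remainder == 0:
        return count
    lower = count - remainder
    if rng.random() < remainder / 3:
        return lower + 3
    return lower


# Hypothetical counts table; small counts can now be released as 0 or 3
# rather than being suppressed.
rng = random.Random(0)
table = {"group_a": 1, "group_b": 4, "group_c": 12, "group_d": 0}
noised = {group: random_round_base3(n, rng) for group, n in table.items()}
```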

Increasing fine payments: A behavioural science approach
Speaker: Olivia Wills

Over half a million New Zealanders owe fines to the Ministry of Justice, and total collections debt is around US$395m, or US$78 for every person in New Zealand. There are consequences for people who do not pay, as they can be hit with enforcement actions or may be summoned to Court. We used evidence from behavioural science to test new approaches to improve fines payment behaviour, using letters and text message reminders. Our results demonstrate that small tweaks to existing processes can be effective at changing behaviour, and pave the way for the future application of behavioural science in the fines collection space.

Integrating Contact Patterns into Simple Models of Disease Spread
Speaker: Nic Steyn

Simple models of disease spread often assume homogeneous mixing. While these are easy to construct and quick to solve, they ignore important detail. Contact matrices are an easy way of including heterogeneities in these simple models without introducing much complexity but require robust data. I will highlight why these contact matrices are important for our COVID-19 response, outline the best data we have, discuss what we are missing, and give examples of what other countries are doing in this space.
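To make the idea concrete, here is a minimal sketch of how a contact matrix enters a simple SIR model: the force of infection on each group becomes a weighted sum over the infectious fractions of every group. The two-group matrix, the parameter values and the Euler time-stepping are illustrative assumptions, not the speaker's model or New Zealand data.

```python
import numpy as np


def simulate_group_sir(contact, beta, gamma, population, initial_infected,
                       days, dt=0.1):
    """Euler-integrate an SIR model with group-structured mixing.

    contact[i, j] is the average number of daily contacts a person in
    group i has with people in group j (hypothetical values here).
    beta is the transmission probability per contact; gamma is the
    recovery rate per day.
    """
    population = np.asarray(population, dtype=float)
    infected = np.asarray(initial_infected, dtype=float)
    susceptible = population - infected
    recovered = np.zeros_like(population)
    for _ in range(int(days / dt)):
        # Force of infection on group i: beta * sum_j C[i, j] * I_j / N_j
        force = beta * (contact @ (infected / population))
        new_infections = force * susceptible * dt
        new_recoveries = gamma * infected * dt
        susceptible = susceptible - new_infections
        infected = infected + new_infections - new_recoveries
        recovered = recovered + new_recoveries
    return susceptible, infected, recovered


# Two hypothetical groups: group 0 has many more daily contacts than group 1.
contact = np.array([[8.0, 2.0],
                    [2.0, 3.0]])
population = np.array([1000.0, 1000.0])
s, i, r = simulate_group_sir(contact, beta=0.05, gamma=0.2,
                             population=population,
                             initial_infected=[1.0, 1.0], days=200)
attack_rate = r / population
```

Replacing the matrix with a constant (every entry equal) recovers homogeneous mixing, which makes it easy to see how much the heterogeneity changes the outcome for each group.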

Linked data is vital: evaluating the impact of prehospital care on mortality following major trauma
Speaker: Gabrielle Davie

The potentially fatal or severe consequences of many injuries can be reduced through an optimally structured prehospital trauma system that can provide timely and appropriate care. This retrospective cohort study linked data from St John and Wellington Free Ambulances, the NZ Trauma Registry, Ministry of Health (Mortality, Hospital discharge and NHI) and Coronial post-mortem reports. The project’s aim is to help identify opportunities to optimise the delivery of Emergency Medical Services (EMS) care in NZ. The collaborative multidisciplinary team includes researchers from St John, Universities of Auckland and Otago. More details about this project are in our protocol paper

Modelling the transfer of tacit knowledge on temporal bipartite networks
Speaker: Adrian Ortiz-Cervantes

In this work we analyze employee mobility in New Zealand and its repercussions on knowledge and skills transfer amongst different industries, in the context of temporal complex networks. Using tax records from all the employees and firms in New Zealand from the year 2000 to 2017, we created a bipartite temporal network of employers and employees that allows us to implement a model for the transfer of tacit knowledge between employees, and estimate the stock of knowledge inside firms and sectors across different skills.

New Zealand Activity Index
Speaker: Jamas Enright

Economic conditions evolve very quickly during crisis periods, and we (the New Zealand government) needed a timely and high-frequency measure of economic conditions to inform policy-making. Gross Domestic Product (GDP) is released quarterly and 10 weeks after the reference quarter. The New Zealand Activity Index (NZAC) was developed by combining eight source indicators and uses principal component analysis to capture as much co-movement of these indicators as possible. This measure is monthly and comes out 2-3 weeks after the end of the month. While not as comprehensive as GDP the NZAC provides a good leading indicator for monthly economic activity.
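As a rough sketch of the principal component step, the example below extracts the first principal component from eight indicator series so that a single index captures their shared movement. The synthetic data, the standardisation step and the sign convention are assumptions made for illustration; they are not the NZAC's actual inputs or construction.

```python
import numpy as np


def first_principal_component(indicators):
    """Summarise several indicator series with their first principal component.

    indicators: array of shape (n_periods, n_indicators).
    Returns the index series, the loadings, and the share of total
    variance the first component explains.
    """
    x = np.asarray(indicators, dtype=float)
    # Standardise each indicator so no series dominates through its units.
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    # Rows of vt are the principal directions of the standardised data.
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a principal component is arbitrary; orient it so the
    # index rises when the indicators rise on average.
    if loadings.sum() < 0:
        loadings = -loadings
    index = z @ loadings
    explained = singular_values[0] ** 2 / np.sum(singular_values ** 2)
    return index, loadings, explained


# Synthetic example: 60 months of 8 indicators driven by one common factor.
rng = np.random.default_rng(0)
common_factor = rng.normal(size=60)
indicators = common_factor[:, None] + 0.5 * rng.normal(size=(60, 8))
index, loadings, explained = first_principal_component(indicators)
```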

Ngā Tapuae: Stepping stones for Māori Student Transitions
Speaker: Marianna Pekar and Joel E. Bancolita

The Social Wellbeing Agency worked in partnership with Tokona Te Raki, a Ngāi Tahu-led collaborative established to increase Māori participation, success and progression in education and employment outcomes. Our research focuses on data discovery to learn actionable insights that identify the most important barriers, levers, and boosters that help young Māori to succeed. The research builds on the existing evidence base investigating the need for better transition pathways for Māori aged 15-29 years. Mixed methods were applied to answer the research question. We conducted the quantitative analysis in New Zealand’s Integrated Data Infrastructure (IDI).

NZ motor vehicle VKT (vehicle kilometres travelled) and fleet size estimation
Speaker: Kain Glensor

Information on the number of motor vehicles in use in NZ and how much they are driven is of great interest to the public and stakeholders. This talk describes the process for using Waka Kotahi NZTA’s MVR (motor vehicle register) data to estimate these, and how, in R, the data is cleaned, processed and projected forward to allow for a real-time estimate of the current figures for both. More information available from Kain Glensor (

Occupational Exposures and Ischaemic Heart Disease (IHD): results from the entire New Zealand population using the Integrated Data Infrastructure (IDI)
Speaker: Amanda Eng

Common occupational exposures have been associated with IHD, but evidence is conflicting. For the whole NZ adult working population at the time of the 2013 census, data were extracted from the Statistics NZ IDI on occupation and incident IHD from 2013 to 2018. The number of working hours was extracted from the census, and exposure to sedentary work, loud noise, and night shift was assessed through NZ job exposure matrices. This study suggests occupational exposure to high levels of noise and night shifts increases IHD risk, while there was no evidence of association with sedentary work and long working hours.

Participatory surveillance of influenza and COVID-19 symptoms in New Zealand
Speaker: Frank Mackenzie

FluTracking is an online participatory surveillance system that has been used for collecting data on influenza seasons in Australia and New Zealand, wherein participants fill out a weekly survey on whether they have experienced symptoms. This has several advantages over sentinel-based data collection, allowing for near real-time reporting of symptoms and mitigating some bias. This study analyses New Zealand FluTracking data from April 2020 to April 2021, examining the impact of various factors on the incidence of flu- and COVID-like symptoms. I will discuss the various ways bias is introduced into these data, the methods employed to mitigate bias, and the potential of systems like FluTracking for public health surveillance.

Pathway modelling to optimise long-term policy impact in New Zealand
Speaker: Suzanne Woodward, Anna Brown, and Mike O’Sullivan

Evaluating policy initiatives is inherently difficult. We can address this by considering how policy decisions inform social pathways across multiple sectors, and how pathway trajectories affect individual wellbeing. Trajectory data and narratives of lived experience that complement large, linked datasets, provide the opportunity to construct models that can accurately predict wellbeing outcomes at scale. Policymakers can use these models to assess the efficacy of policy initiatives, quantifying their contribution to wellbeing. This approach combines ideas and methods from the policy and social sciences, co-design, data science and mathematical modelling to address the wicked problem of policy design. Visit

Population outcome visualisation
Speaker: Daniel Barnett and Andrew Sporle

Communicating analysis of social data to non-statistical decision makers is made even trickier when the results are subject to high levels of variability and uncertainty. Two such situations are the estimation of regional impact of the Covid-19 epidemic and the impact of Census counts on the number of Māori electorates. We have created two public domain tools with simple interfaces that allow non-statisticians to explore how the outcome measures vary with changes to multiple determining factors.

Precision Driven Health: Partnership model for health data science
Speaker: Kevin Ross

Precision Driven Health is New Zealand’s health data science partnership. For the past five years, we have been forming collaborations between New Zealand’s health IT sector, health providers and universities, aimed at improving health outcomes through data science. Our work seeks to personalise health by integrating new data sources, developing predictive models, optimising decision making and empowering people with new tools. We will share some lessons learned from over 100 projects, including the challenges of working with personal information and ensuring equity is improved by design.

Project Monty: an agent based model of the New Zealand transport system
Speaker: Julie Mugford

The Ministry of Transport is developing an agent based model of the NZ transport system. The model builds a representation of the entire transport system as the sum of millions of different interactions as everyone seeks to achieve their own sets of goals and ambitions. We aim to offer better insights into complex systems by aggregating the interactions of individual agents (people) as they attempt to complete activities. Through their interactions with a transport network and each other, we get a more realistic representation of how individual travel choices impact the system.

Providing customised health intelligence in real-time – COVID-19 dashboards
Speaker: Uli Muellner

Real-time health intelligence is essential to an effective public health response. The NZ COVID-19 dashboards were built with this need in mind, drawing on ESR’s EpiSurv database for notifiable diseases. Once published, the dashboards and underlying data streams had to evolve as the response progressed, e.g. to differentiate between cases occurring in the community and MIQ or to provide insight into identified outbreaks. Using an R Shiny framework allowed us to dynamically adjust the dashboards to response requirements, which was critical. The COVID-19 dashboards were built for the Ministry of Health in collaboration with ESR. Public dashboard available at:

RAPping in the public sector: binding your legacy code into a pipeline with Python
Speaker: Shrividya Ravi

Straggling legacy SAS scripts requiring manual steps are a common “feature” of data processing tasks in the public sector. However, refactoring such scripts to modern, reproducible analytical pipelines (RAP) can be challenging due to a lack of IT infrastructure or high complexity. In such situations, interim solutions can at least reduce manual effort and mental overhead. One such solution is using Python as an effective glue to create one click execution pipelines. Manual tasks like downloading data from email, updating new data file names in scripts, running scripts in sequence and more, can be managed with Python and its rich ecosystem of packages. In this talk, I will showcase how three Python packages, exchangelib, jupyter and saspy, can create quick and easy automated versions of legacy SAS scripts that contain many types of manual steps.

Representative timeline modelling
Speaker: Simon Anastasiadis

Timelines are sequences of events and periods which reflect significant parts of a person’s experience. They can be an effective tool for researchers seeking to understand life events, and interactions between events, over time.

To share timeline information while respecting privacy and confidentiality, we have developed a representative timeline methodology. This method produces a timeline that captures experiences that are common across a group of people — similar to a group average.

This presentation will demonstrate the technique, drawing examples from our first application of it – a study of South Auckland families’ experiences around the birth of a child.

Seeking feedback on the Administrative Population Census
Speaker: Christine Bycroft and Vinayak Anand Kumar
Read abstract

Stats NZ’s Census Transformation programme will release the first iteration of the experimental Administrative Population Census (APC) in August 2021. The APC uses administrative data to construct an annual census file, and its release will be an opportunity for groups to provide feedback on an early iteration of admin-first census outputs. At this presentation, we will outline:

  • the features of the 2021 APC to demonstrate its potential value to the research community;
  • how future iterations of the APC will build on the 2021 release.

Please contact Christine Bycroft or Vinayak Anand-Kumar for more information about the APC.

Sharing our experiences from COVID-19 and Te Pokapū Hātepe o Aotearoa, the New Zealand Algorithm Hub
Speaker: Pieta Brown
Read abstract

In support of the response to COVID-19, New Zealand initiated and rapidly delivered a solution for national algorithm management. During this project, we stood up a national instance of a machine learning platform, developed a governance process tailored to our local context in Aotearoa, deployed 30 models, and went live with a website to support user engagement and interaction. The underlying technology solution has evolved over the last five years through our research partnership focused on the development and delivery of data science in healthcare; it was built in response to the need for tooling to deploy, manage and monitor algorithms safely and effectively in a healthcare context. This talk will share the key lessons learned from our COVID-19 experience and discuss the technology, user engagement and governance processes that made this successful.

Smart Search for Electronic Health Records: An NLP-powered approach
Speaker: Edmond Zhang
Read abstract

This work looks at how advanced information search software can improve how doctors and nurses find what they need to know from patient electronic records. It is hoped that clinicians can find patient-specific information faster, more comprehensively and more accurately. While electronic patient records can keep information in one physical area, this does not necessarily make it easier to find specific information. This is especially so for patients with complicated illnesses, or illnesses that affect many different organs at once. With Smart Search, clinicians should be more confident that they have found all the information needed for patient care.

TAWA - Treasury’s microsimulation model of the New Zealand personal tax and transfer system
Speaker: Michael Eglinton
Read abstract

Tax and Welfare Analysis (TAWA) is the Treasury’s microsimulation model of the New Zealand personal tax and transfer system. The TAWA model uses a combination of survey and administrative data within Stats NZ’s Integrated Data Infrastructure to model potential policy changes. It is used extensively within Treasury and in external work related to policy analysis of tax and welfare settings.

Te Matatini o te Horapa - a contagion network model for Aotearoa NZ
Speaker: Dion O'Neale
Read abstract

Contagion models for infectious disease can only be as good as the assumptions that they are built on. One common simplifying assumption is that the disease spreads through a well-mixed, homogeneous population (or populations). This has some obvious consequences for modelling the differential impact of disease on different groups. I will introduce an individual-level, network-based contagion model, built from a range of data sources, including census data, and will show how it has been used for modelling the spread of COVID-19 in Aotearoa NZ.
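
To make the contrast with a well-mixed population concrete, here is a minimal sketch of a discrete-time susceptible-infected-recovered process on a contact network. It is a toy under stated assumptions and not the model presented in the talk: the function name, the fixed per-contact transmission probability and the fixed recovery time are all illustrative choices.

```python
import random


def simulate_network_sir(contacts, seeds, p_transmit, recovery_steps, n_steps, rng):
    """Discrete-time SIR on a contact network (illustrative toy model).

    contacts: dict mapping each person to the set of people they meet.
    seeds: people infected at the start.
    Returns the set of everyone who was ever infected."""
    infected_for = {person: 0 for person in seeds}  # steps since infection
    recovered = set()
    for _ in range(n_steps):
        newly_infected = set()
        # Infection can only travel along edges of the network.
        for person in infected_for:
            for neighbour in contacts[person]:
                susceptible = (neighbour not in infected_for
                               and neighbour not in recovered)
                if susceptible and rng.random() < p_transmit:
                    newly_infected.add(neighbour)
        # Age existing infections and move finished ones to recovered.
        for person in list(infected_for):
            infected_for[person] += 1
            if infected_for[person] >= recovery_steps:
                del infected_for[person]
                recovered.add(person)
        for person in newly_infected:
            infected_for[person] = 0
    return recovered | set(infected_for)


if __name__ == "__main__":
    # Two households with no contact between them.
    contacts = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}}
    ever_infected = simulate_network_sir(
        contacts, seeds=[0], p_transmit=1.0, recovery_steps=2,
        n_steps=10, rng=random.Random(1))
    print(sorted(ever_infected))
```

Because transmission follows contacts, the second household is never reached even with certain transmission, which is the kind of group-level difference a well-mixed model cannot represent.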

Uncertainty Quantification for Complex Network Contagion Simulation Models
Speaker: Frankie Patten-Elliott
Read abstract

As the complexity of a model increases, so does the uncertainty inherent in the model’s output. I consider Uncertainty Quantification (UQ) techniques to assess the reliability of a complex network contagion model we have developed for informing Aotearoa New Zealand’s COVID-19 response. By fitting Gaussian process surrogate models to output from the network contagion model, we can quickly and efficiently apply UQ methods and perform inference on model parameters. Using New Zealand COVID-19 case number data, we can also condition these surrogates to make predictions without having to rerun the computationally expensive network contagion model.
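
The surrogate idea can be sketched in a few lines: run the expensive model at a handful of parameter values, fit a Gaussian process to those outputs, then query the cheap surrogate instead. The example below is a minimal NumPy sketch under stated assumptions, not the authors' implementation. `expensive_simulator` is a stand-in function, and the squared-exponential kernel, length scale and noise level are illustrative choices.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two 1-D arrays of inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def fit_gp_surrogate(x_train, y_train, length_scale=1.0, noise=1e-6):
    """Fit a zero-mean GP and return a function giving posterior mean and variance."""
    k = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))

    def predict(x_new):
        k_star = rbf_kernel(x_new, x_train, length_scale)
        mean = k_star @ alpha
        v = np.linalg.solve(chol, k_star.T)
        prior_var = rbf_kernel(x_new, x_new, length_scale).diagonal()
        var = prior_var - np.sum(v**2, axis=0)
        return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

    return predict


def expensive_simulator(theta):
    """Stand-in for a costly simulation run at parameter value(s) theta."""
    return np.sin(3.0 * theta) + 0.5 * theta


if __name__ == "__main__":
    x_train = np.linspace(0.0, 2.0, 6)      # a few affordable simulator runs
    y_train = expensive_simulator(x_train)
    surrogate = fit_gp_surrogate(x_train, y_train, length_scale=0.4)
    mean, var = surrogate(np.array([0.5, 1.5, 10.0]))
    print(mean, var)
```

The predictive variance is what makes the surrogate useful for uncertainty quantification: it is small near parameter values where the simulator was run and returns to the prior variance far from them.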

Understanding gaps in income for fathers with new babies
Speaker: Raj Kulkarni and Tze Ming Mok
Read abstract

This piece of work followed the Having a Baby in South Auckland project, which applied the Social Wellbeing Agency’s representative timeline modelling method to produce insights into South Auckland families’ experiences around the birth of a child. As a result of one of these insights, we investigated fluctuations in new fathers’ income around the time of the birth and found that a substantial proportion of fathers who usually earn around the minimum wage have income dips that suggest taking unpaid time off work. Yet only half of them were eligible for two weeks of parental leave. This presentation discusses certain aspects of fathers’ lives around the birth.

Urban Trees and Data Science: Examining Wellbeing
Speaker: Peter Edwards
Read abstract

Trees and forests have major impacts on planetary and human wellbeing. In an increasingly urbanised world, urban trees and forests become important for human wellbeing. Many studies show the benefits of urban trees and forests from health, climate change, urban planning and ecological perspectives. Using a wellbeing framework, quantitative data from remote sensing, modelling, and social and cultural administrative data, we aim to understand the impact of urban trees across a range of wellbeing domains. Using epidemiological methods, we aim to discover correlations and patterns between urban trees and human wellbeing in Singapore and Wellington. For more information, contact Dr Peter Edwards.

Using administrative data to better model receipt of transfers
Speaker: Cory Davis and Luke Symes
Read abstract

Microsimulation models such as the New Zealand Treasury’s Tax and Welfare Analysis (TAWA) model play an important role in estimating the impacts of potential policy changes. A challenge facing these models is estimating the take-up of programmes like the Accommodation Supplement (AS). This can lead to significant uncertainty in estimated outcomes, such as fiscal costs and reductions in child poverty. In this presentation the Treasury’s TAWA team outlines work undertaken to better model AS by using administrative data on this programme contained in Stats NZ’s Integrated Data Infrastructure (IDI).

Using cell phone data to monitor population mobility and tourism recovery
Speaker: Hattie Plant
Read abstract

Analytics on population movement has become increasingly important during the COVID-19 pandemic. In response, Data Ventures partnered with Vodafone and Spark to produce aggregated device counts at an hourly level for small geographies. We then developed a novel weighting methodology to create population estimates, broken down by whether people are local to an area, domestic visitors, or international visitors. Our daily data provided insight into high mobility areas during lockdown. We now continue to use the data to inform the post-COVID-19 recovery of international and domestic tourism.

Using risk-adjusted Days Alive and Out of Hospital (DAOH) to compare health outcomes across NZ after surgery
Speaker: Luke Boyle
Read abstract

DAOH scores can be an effective way to measure health outcomes by collapsing many negative outcomes after surgery, such as death or readmission, into one number. This study used routine data from the Ministry of Health and applied novel risk adjustment methods to illustrate how DAOH scores can detect differences between patients, different types of operations and DHBs in NZ.

Using this data, we can identify important areas of difference, for example between hospitals, that can be further audited to improve outcomes after surgery or to investigate optimal patient pathways for recovery.
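
As a worked illustration of how death and readmission collapse into one number, the sketch below computes an unadjusted DAOH score for a single patient. It assumes one common convention (days in hospital and days after death are both subtracted from the follow-up window); definitions vary, and some assign a score of zero to any patient who dies within the window. The 90-day window and the function name are hypothetical, and the risk adjustment described in the talk is not reproduced.

```python
def days_alive_and_out_of_hospital(window_days, hospital_stays, death_day=None):
    """Unadjusted DAOH for one patient over a follow-up window.

    window_days: length of follow-up, counted from surgery as day 0.
    hospital_stays: (admit_day, discharge_day) pairs; each stay covers
        days admit_day up to but not including discharge_day.
    death_day: day of death, or None if the patient survived the window."""
    lost_days = set()
    for admit_day, discharge_day in hospital_stays:
        lost_days.update(range(max(admit_day, 0), min(discharge_day, window_days)))
    if death_day is not None:
        # Every day from death to the end of the window counts as lost.
        lost_days.update(range(max(death_day, 0), window_days))
    return window_days - len(lost_days)


if __name__ == "__main__":
    # Index stay of 5 days, then a 7-day readmission on day 20.
    print(days_alive_and_out_of_hospital(90, [(0, 5), (20, 27)]))
    # Same index stay, but the patient dies on day 30.
    print(days_alive_and_out_of_hospital(90, [(0, 5)], death_day=30))
```

Collecting lost days in a set means overlapping records, such as a death that occurs during a hospital stay, are not counted twice.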

Who is a visitor and who you think is a visitor, are they the same thing?
Speaker: Taylor Winter
Read abstract

Population definitions are well established in our mahi at Stats NZ. However, when Data Ventures reached out to our tourism customers, we identified a key disconnect between expected definitions of visitors and actual definitions. Specifically, our customers thought of visitors as tourists and needed these numbers to inform their COVID-19 response. We developed a new measure of visitors, named ‘short-term visitors’, that predominantly represents tourists and was validated against visa and migration data. We present a comparison between visitor definitions and discuss how these definitions may differ in value based on the context and customer expectation.

Workload of university students in New Zealand
Speaker: Daniel Wrench
Read abstract

This descriptive study uses education and taxation data to analyse patterns in employment for tertiary students in Aotearoa New Zealand. Using the Integrated Data Infrastructure to link tax data and tertiary enrolment data, and the Household Labour Force Survey to estimate hourly rates, we observe a strong seasonal pattern in employment rate and hours worked for full-time students, and to a lesser extent, part-time students. We also find key differences between international and domestic students. This work can give universities better insight into a key part of students’ lives, which has until now only been understood from surveys and anecdotes. To find out more, contact Daniel Wrench.