Using 'big data' to identify new risk factors for sudden unexpected death in infancy

Professor Ed Mitchell
University of Auckland

What is the problem and who does it affect? 

Rates of sudden unexpected death in infancy (SUDI) – previously known as SIDS or cot death – have fallen significantly since the late 1980s. A large share of this reduction is due to the “Back to Sleep” campaign, and the research informing it, which advised mothers to place their babies on their backs to sleep.

While the drop in SUDI rates was significant, the rate started to plateau in the 2000s. Maori families are significantly overrepresented – Maori rates are higher than those of any other ethnic group in the world.


What does this research hope to achieve?

Using the Integrated Data Infrastructure (IDI), Professor Ed Mitchell and his team from the University of Auckland hope to uncover unknown factors which may be contributing to the risk of a baby dying suddenly and unexpectedly. The IDI is a large database containing linked data about people and households. The data are from a range of government agencies, Statistics NZ surveys – including the census – and NGOs. The size of the IDI allows patterns and trends to be identified with greater confidence than is possible with smaller datasets.

The IDI represents an incredibly rich source of information. The data are detailed, and can be linked across contact with different services, surveys, and NGOs. To protect privacy, individuals cannot be identified: every link is made using an encrypted NHI number.

The main aspect of the project will be looking at SUDI cases over the past ten years to see if there is an association with the families' contact with services, including Child, Youth and Family (CYF) and Corrections, alongside data from the 2013 census. Usually these questions aren’t asked of parents who’ve lost a baby, as they are grieving, nor of control families, who would be reluctant to join the study if they were asked to disclose such information. The anonymity of the IDI means these issues don’t arise.

Mitchell and team will use data covering the ten years from 2006 to 2015. Because rates fell between 2010 and 2015, the study period will allow the researchers to identify potential causes or risks associated with this period of falling rates.

Adjusting for known risk factors – smoking, deprivation etc. – the team will use complex statistical modelling to identify trends and associations in the data. The insights gleaned from this work could quickly provide the medical community with the tools they need to tailor safe sleep messaging, as well as providing families with safe sleep devices where deemed necessary.