Programme And Abstracts For Tuesday 28th Of November
Keynote: Tuesday 28th 9:00 Mantra
Cluster Capture-Recapture: A New Framework For Estimating Population Size
Rachel Fewster
University of Auckland
Ask any wildlife manager: their first burning question is “How many are there?”, and their second is “Are they trending upwards or downwards?” Capture-recapture is one of the most popular methods for estimating population size and trends. As the name suggests, it relies on being able to identify the same animal upon multiple capture occasions. The pattern of captures and recaptures among identified animals is used to estimate the number of animals never captured.
Physically capturing and tagging animals can be a dangerous and stressful experience for both the animals and their human investigators - or if it transpires that the animals actually enjoy it, biased inference may result. Consequently, researchers increasingly favour non-invasive sampling using natural tags that allow animals to be identified by features such as coat markings, dropped DNA samples, acoustic profiles, or spatial locations. These innovations greatly broaden the scope of capture-recapture estimation and the number of capture samples achievable. However, they are imperfect measures of identity, effectively sacrificing sample quality for quantity and accessibility. As a result, capture-recapture samples no longer generate capture histories in which the matching of repeated samples to a single identity is certain. Instead, they generate data that are informative—but not definitive—about animal identity.
I will describe a new framework for drawing inference from capture-recapture studies when there is uncertainty in animal identity. In the cluster capture-recapture framework, we assume that repeated samples from the same animal will be similar, but not necessarily identical, to each other. Overlap is also possible between clusters of samples generated by different animals. We treat the sample data as a clustered point process, and derive the necessary probabilistic properties of the process to estimate abundance and other parameters using a Palm likelihood approach.
Because it avoids any attempts at explicit sample-matching, the cluster capture-recapture method can be very fast, taking much the same time to analyse millions of sample-comparisons as it does to analyse hundreds. I will describe a preliminary framework for abundance estimation from acoustic monitoring. Cluster capture-recapture can also be used for behavioural studies, and I will show an example using camera-trap data from a partially-marked population of forest ship rats.
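The pairwise idea behind cluster capture-recapture can be illustrated with a toy simulation. The sketch below is not the Palm likelihood approach itself, but a crude moment estimator built on the same principle: abundance is recovered from counts of similar sample pairs, without ever matching samples to individual identities. The Gaussian "identity feature" model and all numerical settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

N_true = 500          # true number of animals (the target of estimation)
lam = 3.0             # mean number of samples per animal (Poisson)
noise = 0.01          # within-animal measurement noise
match_radius = 0.05   # pairs closer than this count as "likely same animal"

# Each animal has a latent 2-D "identity" feature; its samples are noisy copies.
centres = rng.uniform(0, 10, size=(N_true, 2))
counts = rng.poisson(lam, size=N_true)
samples = np.vstack([
    c + rng.normal(0, noise, size=(k, 2))
    for c, k in zip(centres, counts) if k > 0
])
n = len(samples)

# Count close pairs without ever deciding which samples share an identity.
diff = samples[:, None, :] - samples[None, :, :]
dist = np.sqrt((diff ** 2).sum(-1))
W = int((dist[np.triu_indices(n, k=1)] < match_radius).sum())

# Moment estimator: E[W] ~ N * lam^2 / 2 and E[n] = N * lam,
# so N ~ n^2 / (2 W) when clusters of different animals rarely overlap.
N_hat = n ** 2 / (2 * W)
print(f"n = {n} samples, W = {W} close pairs, N_hat = {N_hat:.0f}")
```

Because only pairwise comparisons are counted, the cost of the comparison step scales with the number of pairs but involves no combinatorial matching, which is the property that keeps the real method fast.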
Tuesday 28th 10:30 Narrabeen
Propensity Score Approaches In The Presence Of Missing Data: Comparison Of Balance And Treatment Effect Estimates
Jannah Baker1, Tim Watkins1,2, and Laurent Billot1,3
1The George Institute for Global Health
2University of Sydney
3University of New South Wales
Tuesday 28th 10:30 Gunnamatta
Visualising Model Selection Stability In High-Dimensional Regression Models
Garth Tarr
University of Sydney
Tuesday 28th 10:50 Narrabeen
The Missing Link: An Equivalence Result For Likelihood Based Methods In Missing Data Problems
Firouzeh Noghrehchi, Jakub Stoklosa, Spiridon Penev, and David I. Warton
University of New South Wales
Tuesday 28th 10:50 Gunnamatta
Dimensionality Reduction Of LIBS Data For Bayesian Analysis
Anjali Gupta1, James Curran1, Sally Coulson2, and Christopher Triggs1
1University of Auckland
2ESR
Tuesday 28th 11:10 Narrabeen
Analysis Of Melanoma Data With A Mixture Of Survival Models Utilising Multi-Class DLDA To Inform Mixture Class
Sarah Romanes, John Ormerod, and Jean Yang
University of Sydney
Tuesday 28th 11:10 Gunnamatta
Forecasting Hotspots Of Potentially Preventable Hospitalisations With Spatially Aggregated Longitudinal Health Data: All Subset Model Selection With A Novel Implementation Of Repeated K-Fold Cross-Validation
Matthew Tuson, Berwin Turlach, Kevin Murray, Mei Ruu Kok, Alistair Vickery, and David Whyatt
University of Western Australia
It is sometimes difficult to target individuals for health intervention because of limited information on their behaviour and risk factors. In such cases, place-based interventions targeting geographical ‘hotspots’ with higher than average rates of health service utilisation may be effective. Many studies examine predictors of hotspots, but they often do not consider that place-based interventions are typically costly and take time to develop and implement, and that hotspots often regress to the mean in the short term. Long-term geographical forecasting of hotspots using validated statistical models is therefore essential for effectively prioritising place-based health interventions.
Existing methods for forecasting hotspots tend to prioritise positive predictive value (the proportion of predicted hotspots that prove to be true hotspots) at the expense of sensitivity. This work introduces methods for developing models that optimise positive predictive value and sensitivity concurrently. These methods utilise spatially aggregated administrative health data, WA census population data, and ABS geographic boundaries, combining all subset model selection with a novel implementation of repeated cross-validation for longitudinal data. Results are presented from models forecasting 3-year hotspots for four potentially preventable hospitalisations: type II diabetes mellitus, heart failure, high risk foot, and chronic obstructive pulmonary disease (COPD).
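The selection loop described above might be sketched as follows. The simulated data, the crude mean-of-past-years risk score, and the choice to score each candidate model by the minimum of PPV and sensitivity are all illustrative assumptions, not the authors' implementation; the sketch only shows repeated k-fold cross-validation that keeps all years of an area in the same fold, combined with all subset selection.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

n_areas, n_years = 200, 6
# Toy longitudinal data: one row per area, columns are years;
# a latent area effect makes hotspot status persist over time.
area_risk = rng.normal(0, 1, n_areas)
rates = area_risk[:, None] + rng.normal(0, 1, (n_areas, n_years))
y = (rates[:, -1] > np.quantile(rates[:, -1], 0.9)).astype(int)  # final-year hotspot
X = rates[:, :-1]                                                # earlier years as predictors

def ppv_sens(y_true, y_pred):
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    ppv = tp / max(int((y_pred == 1).sum()), 1)
    sens = tp / max(int((y_true == 1).sum()), 1)
    return ppv, sens

def cv_score(cols, k=5, repeats=10):
    """Repeated k-fold CV over areas (every year of an area stays in one fold),
    scoring a simple mean-of-selected-years rule by min(PPV, sensitivity)."""
    scores = []
    for _ in range(repeats):
        order = rng.permutation(n_areas)
        for fold in np.array_split(order, k):
            train = np.setdiff1d(order, fold)
            score = X[:, cols].mean(axis=1)           # crude risk score
            cut = np.quantile(score[train], 0.9)      # threshold fitted on training areas
            pred = (score[fold] >= cut).astype(int)
            ppv, sens = ppv_sens(y[fold], pred)
            scores.append(min(ppv, sens))
    return float(np.mean(scores))

# "All subset" selection over which past years to include as predictors.
subsets = [list(c) for r in range(1, n_years)
           for c in combinations(range(n_years - 1), r)]
best = max(subsets, key=cv_score)
print("best predictor-year subset:", best)
```

Scoring by min(PPV, sensitivity) is one simple way to force both quantities up at once; any concurrent criterion could be substituted at that line.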
Tuesday 28th 11:30 Narrabeen
Identifying Clusters Of Patients With Diabetes Using A Markov Birth-Death Process
Mugdha Manda, Thomas Lumley, and Susan Wells
University of Auckland
Tuesday 28th 11:30 Gunnamatta
Challenges Analysing Combined Agricultural Field Trials With Partially Overlapping Treatments
Kerry Bell and Michael Mumford
Queensland Department of Agriculture and Fisheries
To make recommendations on which management practices have the potential to increase crop yield, a consistent pattern needs to be demonstrated across trials from many environments. This presentation considers a case study of 31 mungbean trials in northern Australia from 2014 to 2016. The trials did not always have consistent factors (e.g. variety, row spacing or target plant density) or even consistent factor levels. To overcome the issue of inconsistent factors, environments were defined as the combination of site, year and any management factors not common across trials (e.g. time of sowing, irrigation, fertiliser).
There were numerous full factorial combinations within subsets of the data that could be considered, so the first challenge was to determine which factorial combinations to focus on to best address the research questions and reporting requirements. Once this was determined, all the data from the trials contributing to the factorial were included in a combined analysis using linear mixed models. In this model, the factorial of interest was partitioned in the test of fixed effects, while each trial's design parameters and residual variances were estimated using all the data from that trial. An example of such a factorial combination is environment by row spacing for one particular variety. The next challenge was that, with so many environments, there was usually an environment by row spacing interaction, which was not useful for making recommendations about row spacing.
Clustering the environments made it possible to form groups within which there was no significant row spacing by environment interaction. These groups were then generalised to types of environments with particular responses to row spacing.
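One way to picture the clustering step: treat each environment's vector of predicted row-spacing means as a response profile and group environments with similar profiles. The simulated means, the two latent "response types", and the average-linkage rule below are illustrative assumptions, not the analysis actually used in the case study.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(42)

# Toy predicted means: 12 environments x 3 row spacings, drawn from
# two latent "response types" plus noise.
base = np.array([[2.0, 2.2, 2.4],    # type A: yield rises with wider rows
                 [2.5, 2.3, 2.1]])   # type B: yield falls with wider rows
env_type = rng.integers(0, 2, 12)
means = base[env_type] + rng.normal(0, 0.05, (12, 3))

# Centre each profile so clustering reflects the interaction
# (shape of the response), not the overall yield level.
profiles = means - means.mean(axis=1, keepdims=True)
groups = fcluster(linkage(profiles, method="average"), t=2, criterion="maxclust")
for g in np.unique(groups):
    print(f"group {g}: environments {np.where(groups == g)[0].tolist()}")
```

Within each recovered group the row-spacing response is roughly parallel, so a single row-spacing recommendation per group becomes defensible.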
Tuesday 28th 11:50 Narrabeen
Sparse Phenotyping Designs For Early Stage Selection Experiments In Plant Breeding Programs
Nicole Cocks, Alison Smith, David Butler, and Brian Cullis
University of Wollongong
The early stages of cereal and pulse breeding programs typically involve in excess of 500 test lines. The test lines are promoted through a series of trials based on their performance (yield) and other desirable traits such as heat/drought tolerance, disease resistance, etc. It is therefore important to ensure that the design (and analysis) of these trials is efficient, in order to appropriately and accurately guide the breeders through their selection decisions until only a small number of elite lines remain.
The design of early stage variety trials in Australia provided the motivation for developing a new design strategy. The preliminary stages of these programs have limited seed supply, which limits the number of trials and replicates of test lines that can be sown. Traditionally, completely balanced block designs or grid plot designs were sown at a small number of environments in order to select the highest performing lines for promotion to the later stages of the program. Given our understanding of variety (i.e. line) by environment interaction, this approach is not a sensible or optimal use of the limited resources available.
A new method will be discussed that allows a larger number of environments to be sampled when seed supply is limited and the number of test lines is large. This strategy, referred to as sparse phenotyping, is developed within the linear mixed model framework as a model-based design approach to generating optimal trial designs for early stage selection experiments.
Tuesday 28th 12:10 Narrabeen
Comparisons Of Two Large Long-Term Studies In Alzheimer’s Disease
Charley Budgeon
University of Western Australia
The incidence of Alzheimer’s disease (AD), the leading cause of dementia, is predicted to increase at least threefold by 2050. Curing this disease is a global priority. Currently, two major studies are attempting to gain further understanding of this disease: the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and the Australian Imaging, Biomarker and Lifestyle Study (AIBL). We describe these two cohorts to assess the impact of combining them to provide a larger cohort for analyses.
An initial comparison of the protocols was carried out, and recruitment strategies were shown to be marginally different between the studies. Inclusion criteria specified ages between 55 and 90 years in ADNI and > 65 years in AIBL. Marginally different specifications were observed for the disease stage classifications of healthy control (HC), mild cognitive impairment (MCI) and AD individuals, for example, different Mini-Mental State Exam (MMSE) cut-offs. However, both studies had AD diagnosis supported by the NINCDS-ADRDA criteria. Baseline characteristics were compared between the ADNI and AIBL cohorts. Overall, AIBL had a higher proportion of HCs than ADNI (69% vs 30%) but fewer MCI individuals (12% vs 50%). The ADNI cohort had a higher level of education and, generally, within a disease classification there were minimal differences in baseline age, sex, MMSE, and Preclinical Alzheimer Cognitive Composite (PACC) scores.
Longitudinal analyses compared the change over time for the two cohorts and disease classifications for PACC and MMSE. There were no significant differences in cohorts within the HC and MCI groups, but within the AD group, subjects in the ADNI cohort had generally higher predicted PACC and MMSE scores over time than those in AIBL.
Our results suggest there is the potential to combine the ADNI and AIBL cohorts for analysis purposes to provide one more powerful data set; however, care should be taken with some measures.
Tuesday 28th 12:10 Gunnamatta
A One-Stage Mixed Model Analysis Of Canola Chemistry Trials
Daniel Tolhurst, Ky Mathews, Alison Smith, and Brian Cullis
University of Wollongong
The National Variety Trials (NVT) program is used by plant breeding companies to evaluate the yield potential of new crop varieties independently across a large range of Australian growing conditions. Grower decisions are further complicated in canola, relative to the remaining NVT crops, because of its vulnerability to weed infestation. A measure historically used by farmers to manage weeds is the application of a herbicide (chemistry) treatment. The choice of chemistry is important as it restricts variety selection to those bred with the specific tolerance. The varieties currently evaluated in NVT are tolerant to one of three chemistries, namely imidazolinone (I; marketed as Clearfield), glyphosate (Roundup Ready; R) or triazine (T), or have no specific tolerance (i.e. conventional canola; C). Consequently, canola has a more complex testing regime than the remaining NVT crops, as each trial has a nested treatment structure involving both chemistries and varieties.
Canola trials are conducted in locations across the Australian grain belt and reflect best farmer practice for each district. Every site is partitioned into several field blocks, and plots are allocated to treatments according to orthogonal block designs. Each chemistry is administered with a spray boom which, for practical reasons, treats large areas simultaneously. This precludes the application of different sprays to plots in the same block, so randomisation is restricted: all varieties in a single block are tolerant to the same chemistry. However, because the number of chemistries exactly equals the number of blocks, chemistry effects are completely confounded with block effects and there is no information with which to estimate the experimental error variation. Consequently, growers are limited to evaluating varieties with the same tolerance, as comparisons across chemistries are invalid. This also has important implications for the statistical analysis, which are discussed in this talk.
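The confounding can be verified directly from the design matrices: when each block receives exactly one chemistry and the numbers of blocks and chemistries are equal, the chemistry indicator columns lie entirely within the column space of the block indicators. A minimal sketch (the 4-block, 6-plot layout is assumed purely for illustration):

```python
import numpy as np

# Toy layout: 4 blocks of 6 plots; block i receives a single chemistry.
blocks = np.repeat(np.arange(4), 6)          # 24 plots
chems = np.array([2, 0, 3, 1])[blocks]       # one chemistry per whole block

B = np.eye(4)[blocks]                        # 24 x 4 block indicator matrix
C = np.eye(4)[chems]                         # 24 x 4 chemistry indicator matrix

# Adding the chemistry columns to the block columns adds no rank:
r_B = int(np.linalg.matrix_rank(B))
r_BC = int(np.linalg.matrix_rank(np.hstack([B, C])))
print(r_B, r_BC)  # prints "4 4": chemistry effects lie entirely in the block space
```

Equal ranks mean no contrast separates chemistry from block, so chemistry differences cannot be tested against experimental error.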
Tuesday 28th 12:30 Narrabeen
A Semi-Parametric Linear Mixed Model For Longitudinally Measured Fasting Blood Sugar Level Of Adult Diabetic Patients
Tafere Tilahun1, Belay Birlie1, and Legesse Kassa Debusho2
1Jimma University
2University of South Africa
Tuesday 28th 12:30 Gunnamatta
Individual And Joint Analyses Of Sugarcane Experiments To Select Test Lines
Alessandra Dos Santos1, Chris Brien2,4, Clarice G. B. Demétrio1, Renata Alcarde Sermarini1,5, Guilherme A. P. Silva3, and Sandro R. Fuzatto3
1University of São Paulo
2University of South Australia
3CTC - Piracicaba
4University of Adelaide
5University of Adelaide