Full article title: A new numerical method for processing longitudinal data: Clinical applications
Journal: Epidemiology Biostatistics and Public Health
Author(s): Stura, Ilaria; Perracchione, Emma; Migliaretti, Giuseppe; Cavallo, Franco
Author affiliation(s): Università di Torino, Università di Padova
Primary contact: Email: Ilaria dot stura at unito dot it
Year published: 2018
Volume and issue: 15(2)
Page(s): e12881
DOI: 10.2427/12881
ISSN: 2282-0930
Distribution license: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
Website: https://ebph.it/index.php/ebph/article/view/12881
Download: https://ebph.it/article/view/12881/11630 (PDF)

Abstract

Background: Processing longitudinal data is a computational issue that arises in many applications, such as aircraft design, medicine, optimal control, and weather forecasting. Given some longitudinal data, i.e., scattered measurements, the aim is to approximate the parameters involved in the dynamics of the considered process. For this problem, a large variety of well-known methods have already been developed.

Results: Here, we propose an alternative approach to be used as an effective and accurate tool for parameter fitting and for the prediction of individual trajectories from sparse longitudinal data. In particular, our mixed model, which combines radial basis functions (RBFs) with stochastic optimization algorithms (SOMs), is presented and tested on clinical data. We also carry out comparisons with other methods that are widely used in this framework.

Conclusion: The main advantages of the proposed method are its flexibility with respect to the datasets, meaning that it is also effective for truly irregularly distributed data, and its ability to extract reliable information on the evolution of the dynamics.

Keywords: statistical method, radial basis function, stochastic optimization algorithm, longitudinal data

Introduction

Longitudinal data are the object of study in many fields, e.g., sociology, meteorology, and medicine. In medicine, repeated measurements are used to monitor patients’ behaviors and to adjust therapies accordingly. However, many problems occur when these data are analyzed. Each time series could have a different number of observations, and the observations may not be equally spaced. In addition, the sampling period could vary from patient to patient, and measurement errors and missing data often occur. Since common methods such as linear regression usually fail in these cases, recent research has turned toward more robust statistical methods. For instance, longitudinal data are commonly analyzed using parametric models such as Bayesian ones[1], as well as functional data analysis (FDA).[2][3] In both cases, a large amount of data is required in order to model the behavior of the studied variable(s). These methods, in fact, try to find an "average curve" using all the data, including truncated series and observations with missing information.

However, clinical applications may require an estimate of the future dynamics of a single series given only a few previous values; consider, for instance, tumor volumes during a treatment, the height and weight of children during growth, or the concentration of some substance in the body. Each patient is different and may have different growth behavior and different growth parameters, so an "average curve" may not be sufficient. An important piece of information could be, for example, the likely future development of the subject, given his/her previous growth and clinical background (e.g., treatments). These predictions could then be compared with the real dynamics in order to see whether the patient's response to the treatment is stable (the parameters do not vary in the future) or not (the parameters change).

The aim of our work is to propose a numerical tool that can provide information on the future dynamics given only a few follow-up data. To do so, we first model longitudinal data via mathematical models that are widely used in population dynamics. On the one hand, we aim at validating such a model by approximating the parameters involved in the dynamics. On the other hand, we are also interested in giving reliable information on the future dynamics of the curves.

In order to achieve our goal, we propose a numerical tool based on optimization methods coupled with interpolation techniques. Specifically, we approximate the parameters involved in the dynamics by means of stochastic optimization algorithms (SOMs).[4][5][6][7] Moreover, for each data series, we improve the performance of the optimization tools by means of radial basis function (RBF) interpolation; see Fasshauer and McCourt[8] for a general overview and Cavoretto et al.[9][10] for particular instances of the topic and its applications. In the interpolation process, we also take into account the critical computational issue of carrying out stable computations. For this reason, and since data are subject to noise, we adopt a kind of Tikhonov regularization.[11]
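As a rough illustration of the interpolation step described above, the following minimal sketch fits an RBF expansion to irregularly sampled data and adds a Tikhonov (ridge) term to the interpolation matrix. The Gaussian kernel, the shape parameter `eps`, the regularization weight `mu`, the helper names, and the sample data are all assumptions made for this example; the source does not specify them.

```python
import math

def gaussian_rbf(r, eps=1.0):
    """Gaussian radial kernel; eps is an assumed shape parameter."""
    return math.exp(-(eps * r) ** 2)

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_rbf(times, values, mu=1e-3, eps=1.0):
    """Return s(t) = sum_j c_j * phi(|t - t_j|) with (A + mu*I) c = y."""
    n = len(times)
    a = [[gaussian_rbf(abs(times[i] - times[j]), eps) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        a[i][i] += mu  # Tikhonov term stabilizes the solve and damps noise
    coeffs = solve_linear(a, values)

    def s(t):
        return sum(c * gaussian_rbf(abs(t - tj), eps)
                   for c, tj in zip(coeffs, times))
    return s

# Hypothetical, irregularly spaced measurements (not from the paper).
times = [0.0, 0.7, 1.9, 3.2, 4.0]
values = [2.0, 3.1, 5.2, 6.8, 7.4]
exact = fit_rbf(times, values, mu=0.0)    # pure interpolation
smooth = fit_rbf(times, values, mu=0.1)   # regularized approximation
```

With `mu=0.0` the expansion passes through the data; a positive `mu` trades exact reproduction of the data for a better-conditioned system, which is the usual motivation for regularizing when measurements are noisy.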

The method, namely RBF-SOM, is here tested on two different datasets:

  • height measurements of children with a diagnosis of growth hormone deficiency (GHD) during treatment, and
  • prostate-specific antigen (PSA) values of prostatectomized patients with a recurrence of prostate cancer.

In the next section of this paper, the RBF-SOM technique is described. Afterwards, the two datasets used for the validation are presented. The "Results" section is devoted to the numerical results and it is divided into two subsections: in the first one, all the data of each series are considered in order to reconstruct the curves and approximate the parameters, while, in the second one, only a few initial data of each series are used to predict the curve behavior. The last two sections offer a discussion and conclusions.

Methods

This section describes the method used to fit a given data series and to approximate the parameters involved in the dynamics.

Given several scattered measurements f0, ..., fn sampled at different times t0, ..., tn, the basic idea of the RBF-SOM proposed here is to consider a theoretical function f, depending on the time t and on several parameters λ = (λ1,..., λp), and to approximate such parameters in order to obtain reliable information on the biological or physical phenomenon.


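In the proposed examples, we use, as theoretical growth curve f, the so-called Gompertzian function, which in a standard parameterization matching the definitions below reads

$$f(t, \lambda) = \lambda_2 \exp\left( \ln\left( \frac{f_0}{\lambda_2} \right) e^{-\lambda_1 (t - t_0)} \right),$$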

where f0 is the measurement at time t0 (i.e., the first measurement), λ1 is the growth rate, and λ2 is the carrying capacity, i.e., the maximum value that can be asymptotically achieved by f.

The Gompertzian function is characterized by a fast-growing initial period followed by a progressive slowdown, approaching the carrying capacity after a certain time. Depending on the values of its parameters, this curve is able to model a variety of types of growth, from that of humans to that of cancer cells; see [12-16] for details. For this reason, we will use in Section 5 the same function for both datasets. Moreover, its form is particularly suitable for this study because its parameters cannot be estimated via simple methods such as least squares approximation.
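The behavior described above can be checked numerically with a short sketch. The closed form used here is an assumption: it is a common Gompertz parameterization chosen to agree with the stated roles of f0 (value at t0), λ1 (growth rate), and λ2 (carrying capacity), and the function and argument names are hypothetical.

```python
import math

def gompertz(t, f0, lam1, lam2, t0=0.0):
    """Assumed Gompertz form: equals f0 at t = t0 and tends to lam2 as t grows."""
    return lam2 * math.exp(math.log(f0 / lam2) * math.exp(-lam1 * (t - t0)))

# Illustrative values only: start at 2, growth rate 0.5, carrying capacity 10.
curve = [gompertz(t, f0=2.0, lam1=0.5, lam2=10.0) for t in range(0, 21)]
increments = [b - a for a, b in zip(curve, curve[1:])]
```

In this example the curve rises monotonically from f0 and its increments shrink toward zero as it approaches the carrying capacity.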

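Trivially, the parameters are approximated by finding the values that minimize the misfit between f and the measurements, e.g., in the least-squares sense,

$$\hat{\lambda} = \arg\min_{\lambda} \sum_{i} \left( f(t_i, \lambda) - f_i \right)^2,$$

where fi denotes the measurement taken at time ti.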
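To make the minimization target concrete, the sketch below evaluates a sum-of-squared-residuals misfit for a candidate parameter pair. Both the squared-error norm and the Gompertz closed form are assumptions made for illustration, as are the function names and the synthetic data.

```python
import math

def gompertz(t, f0, lam1, lam2, t0=0.0):
    """Assumed Gompertz form with growth rate lam1 and carrying capacity lam2."""
    return lam2 * math.exp(math.log(f0 / lam2) * math.exp(-lam1 * (t - t0)))

def misfit(lam, times, values):
    """Sum of squared residuals; f0 and t0 are taken from the first sample."""
    lam1, lam2 = lam
    f0, t0 = values[0], times[0]
    return sum((gompertz(t, f0, lam1, lam2, t0) - v) ** 2
               for t, v in zip(times, values))

# Synthetic, irregularly spaced series generated from known parameters.
times = [0.0, 0.5, 1.3, 2.0, 3.5, 5.0, 8.0]
values = [gompertz(t, 2.0, 0.5, 10.0) for t in times]

at_truth = misfit((0.5, 10.0), times, values)
off_truth = misfit((1.0, 12.5), times, values)
```

Because the parameters enter f through nested exponentials, this misfit is nonlinear in λ, which is why a general-purpose optimizer is needed rather than a closed-form linear solve.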

Note that we need optimization methods that can be used when f is non-linear, as in the considered cases. In particular, we focus on stochastic methods, which have been designed by considering analogies with natural phenomena. The most popular are evolution strategies and genetic algorithms, both based on competition among individuals. In contrast, other methods proposed in recent decades mainly focus on cooperation. Among them, particle swarm optimization (PSO), cuckoo search (CS), and ant colony optimization are widely used techniques based on the mutual interaction and exchange of information between individuals. Here we will consider PSO and CS, briefly described in what follows.
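As an illustration of the cooperative idea behind PSO, the following is a minimal textbook-style sketch applied to a two-parameter Gompertz fit. It is not the authors' implementation: the inertia weight `w`, the acceleration constants `c1` and `c2`, the swarm size, the bounds, the assumed Gompertz form, the squared-error misfit, and the synthetic data are all choices made for this example.

```python
import math
import random

def gompertz(t, f0, lam1, lam2, t0=0.0):
    """Assumed Gompertz form with growth rate lam1 and carrying capacity lam2."""
    return lam2 * math.exp(math.log(f0 / lam2) * math.exp(-lam1 * (t - t0)))

def pso(cost, bounds, n_particles=30, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO with positions clamped to the given bounds."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # each particle's best position
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward own best + pull toward swarm best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost

# Synthetic series generated from lam1 = 0.5, lam2 = 10 (illustrative only).
times = [0.0, 0.5, 1.3, 2.0, 3.5, 5.0, 8.0]
values = [gompertz(t, 2.0, 0.5, 10.0) for t in times]

def misfit(lam):
    return sum((gompertz(t, values[0], lam[0], lam[1], times[0]) - v) ** 2
               for t, v in zip(times, values))

bounds = [(0.01, 2.0), (5.0, 20.0)]  # assumed search ranges for lam1, lam2
best, best_cost = pso(misfit, bounds)
```

Each particle shares information only through the swarm-wide best position, which is the "exchange of information between individuals" the text refers to; cuckoo search follows a different update rule and is not sketched here.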

References

  1. Rao, C.R. (1987). "Prediction of Future Observations in Growth Curve Models". Statistical Science 2 (4): 434–47. doi:10.1214/ss/1177013119. 
  2. Ji, H.; Müller, H.-G. (2017). "Optimal designs for longitudinal and functional data". Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79 (3): 859–76. doi:10.1111/rssb.12192. 
  3. Ramsay, J.; Silverman, B.W. (2005). Functional Data Analysis. Springer-Verlag. pp. 428. ISBN 9780387400808. 
  4. Kennedy, J.; Eberhart, R. (1995). "Particle swarm optimization". Proceedings of ICNN'95 - International Conference on Neural Networks 4: 1942–8. doi:10.1109/ICNN.1995.488968. 
  5. Parsopoulos, K.; Vrahatis, M. (2002). "Particle swarm optimization method for constrained optimization problems". In Sincák, P.; Kvasnicka, V.; Vascák, J.; Pospíchal, J.. Intelligent Technologies: from Theory to Applications. Frontiers in Artificial Intelligence and Applications. 76. IOS Press. pp. 214–20. ISBN 9781586032562. 
  6. Pedersen, M.E.H.; Chipperfield, A.J. (2010). "Simplifying Particle Swarm Optimization". Applied Soft Computing 10 (2): 618–28. doi:10.1016/j.asoc.2009.08.029. 
  7. Shi, Y.; Eberhart, R. (1998). "A modified particle swarm optimizer". 1998 IEEE International Conference on Evolutionary Computation Proceedings: 69–73. doi:10.1109/ICEC.1998.699146. 
  8. Fasshauer, G.; McCourt, M. (2015). Kernel-based Approximation Methods using MATLAB. Interdisciplinary Mathematical Sciences. 19. World Scientific. pp. 536. doi:10.1142/9335. ISBN 9789814630139. 
  9. Cavoretto, R.; De Rossi, A.; Perracchione, E. (2018). "Optimal Selection of Local Approximants in RBF-PU Interpolation". Journal of Scientific Computing 74 (1): 1–22. doi:10.1007/s10915-017-0418-7. 
  10. Cavoretto, R.; De Rossi, A.; Qiao, H. (2018). "Topology analysis of global and local RBF transformations for image registration". Mathematics and Computers in Simulation 147 (5): 52–72. doi:10.1016/j.matcom.2017.10.010. 
  11. Cancelliere, R.; Gai, M.; Gallinari, P.; Rubini, L. (2015). "OCReP: An Optimally Conditioned Regularization for pseudoinversion based neural training". Neural Networks 71 (11): 76–87. doi:10.1016/j.neunet.2015.07.015. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation, spelling, and grammar. We also added PMCID and DOI when they were missing from the original reference. No other modifications were made in accordance with the "no derivatives" portion of the distribution license.