Difference between revisions of "User:Shawndouglas/sandbox/sublevel6"

From LIMSWiki
 
(190 intermediate revisions by the same user not shown)
<div class="nonumtoc">__TOC__</div>
==This is demo code demoing math==
{{ombox
| type      = notice
| style    = width: 960px;
| text      = This is sublevel2 of my sandbox, where I play with features and test MediaWiki code. If you wish to leave a comment for me, please see [[User_talk:Shawndouglas|my discussion page]] instead.<p></p>
}}


==Sandbox begins below==
As a typical example, from a [[calibration plot]] following a [[linear equation]] taken here as the simplest possible model:
{{Infobox journal article
|name        =
|image        =
|alt          = <!-- Alternative text for images -->
|caption      =
|title_full  = A new numerical method for processing longitudinal data: Clinical applications
|journal      = ''Epidemiology Biostatistics and Public Health''
|authors      = Stura, Ilaria; Perracchione, Emma; Migliaretti, Giuseppe; Cavallo, Franco
|affiliations = Università di Torino, Università di Padova
|contact      = Email: Ilaria dot stura at unito dot it
|editors      =
|pub_year    = 2018
|vol_iss      = '''15'''(2)
|pages        = e12881
|doi          = [https://doi.org/10.2427/12881 10.2427/12881]
|issn        = 2282-0930
|license      = [https://creativecommons.org/licenses/by-nc-nd/4.0/ Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International]
|website      = [https://ebph.it/index.php/ebph/article/view/12881 https://ebph.it/index.php/ebph/article/view/12881]
|download    = [https://ebph.it/article/view/12881/11630 https://ebph.it/article/view/12881/11630] (PDF)
}}
{{ombox
| type      = content
| style    = width: 500px;
| text      = This article should not be considered complete until this message box has been removed. This is a work in progress.
}}
==Abstract==
'''Background''': Processing longitudinal data is a computational issue that arises in many applications, such as aircraft design, medicine, optimal control, and weather forecasting. Given some longitudinal data, i.e., scattered measurements, the aim is to approximate the parameters involved in the dynamics of the considered process. For this problem, a large variety of well-known methods have already been developed.


'''Results''': Here, we propose an alternative approach to be used as an effective and accurate tool for parameter fitting and the prediction of individual trajectories from sparse longitudinal data. In particular, our mixed model, which uses radial basis functions (RBFs) combined with stochastic optimization algorithms (SOMs), is presented here and tested on clinical data. Further, we also carry out comparisons with other methods that are widely used in this framework.
: <math>f(x) = ax + b </math>


'''Conclusion''': The main advantages of the proposed method are its flexibility with respect to the datasets, meaning that it is also effective for truly irregularly distributed data, and its ability to extract reliable [[information]] on the evolution of the dynamics.
where <math>f(x)</math> corresponds to the measured signal (e.g., voltage, luminescence, or energy).
 
'''Keywords''': statistical method, radial basis function, stochastic optimization algorithm, longitudinal data
 
==Introduction==
Longitudinal data are often the object of study in many fields, e.g., sociology, meteorology, and medicine. In medicine, repeated measurements are used to monitor patients’ behaviors and also to adjust therapies accordingly. However, many problems occur when these data are analyzed. Indeed, each time series could have a different number of observations and not be equally spaced. In addition, the sampling period could vary from patient to patient, and measurement errors and missing data often occur. Thus, since common methods such as linear regression usually fail in these cases, recent research is directed towards more robust statistical methods. For instance, longitudinal data are commonly analyzed using parametric models such as Bayesian ones<ref name="RaoPrediction87">{{cite journal |title=Prediction of Future Observations in Growth Curve Models |journal=Statistical Science |author=Rao, C.R. |volume=2 |issue=4 |pages=434–47 |year=1987 |doi=10.1214/ss/1177013119}}</ref>, as well as functional data analysis (FDA).<ref name="JiOptimal17">{{cite journal |title=Optimal designs for longitudinal and functional data |journal=Journal of the Royal Statistical Society: Series B (Statistical Methodology) |author=Ji, H; Müller, H.-G. |volume=79 |issue=3 |pages=859–76 |year=2017 |doi=10.1111/rssb.12192}}</ref><ref name="RamsayFunctional05">{{cite book |title=Functional Data Analysis |author=Ramsay, J.; Silverman, B.W. |publisher=Springer-Verlag |pages=428 |year=2005 |isbn=9780387400808}}</ref> In both cases, a large amount of data is required in order to model the behavior of the studied variable(s). These methods, in fact, try to find an "average curve" using all the data, including truncated series and observations with missing information.
 
However, in clinical applications, an estimate of the future dynamics of a single series, given a few previous values, may be needed; consider, for instance, tumor volumes during a treatment, the height and weight of children during growth, and the concentration of some substance in the body. Each patient is different and could have different growth behavior and different growth parameters, so an "average curve" may not be sufficient. An important piece of information could be, for example, the possible future development of the subject, given his/her previous growth and the clinical background (e.g., treatments). These data could be compared with the real dynamics, in order to see if the response of the patient to the treatment is stable (the parameters do not vary in the future) or not (the parameters change).
 
The aim of our work is to propose a numerical tool that can provide information on the future dynamics given a few follow-up data. Thus, we first model longitudinal data via mathematical models that are widely used in population dynamics. On one hand, we aim at validating such a model by approximating the parameters involved in the dynamics. On the other hand, we are also interested in giving reliable information on the future dynamics of the curves.
 
In order to achieve our goal, we propose a numerical tool based on optimization methods coupled with interpolation techniques. Specifically, we approximate the parameters involved in the dynamics by means of stochastic optimization algorithms (SOMs).<ref name="KennedyParticle95">{{cite journal |title=Particle swarm optimization |journal=Proceedings of ICNN'95 - International Conference on Neural Networks |author=Kennedy, J.; Eberhart, R. |volume=4 |pages=1942–8 |year=1995 |doi=10.1109/ICNN.1995.488968}}</ref><ref name="ParsopoulosParticle02">{{cite book |chapter=Particle swarm optimization method for constrained optimization problems |title=Intelligent Technologies: from Theory to Applications  |author=Parsopoulos, K.; Vrahatis, M. |editor=Sincák, P.; Kvasnicka, V.; Vascák, J.; Pospíchal, J. |publisher=IOS Press |volume=76 |series=Frontiers in Artificial Intelligence and Applications |pages=214–20 |year=2002 |isbn=9781586032562}}</ref><ref name="PedersenSimp10">{{cite journal |title=Simplifying Particle Swarm Optimization |journal=Applied Soft Computing |author=Pedersen, M.E.H.; Chipperfield, A.J. |volume=10 |issue=2 |pages=618–28 |year=2010 |doi=10.1016/j.asoc.2009.08.029}}</ref><ref name="ShiAMod98">{{cite journal |title=A modified particle swarm optimizer |journal=1998 IEEE International Conference on Evolutionary Computation Proceedings |author=Shi, Y.; Eberhart, R. |pages=69–73 |year=1998 |doi=10.1109/ICEC.1998.699146}}</ref> Moreover, for each data series, we improve the performance of the optimization tools by means of radial basis function (RBF) interpolation; see Fasshauer and McCourt<ref name="FasshauerKernel15">{{cite book |title=Kernel-based Approximation Methods using MATLAB |author=Fasshauer, G.; McCourt, M. 
|publisher=World Scientific |series=Interdisciplinary Mathematical Sciences |volume=19 |pages=536 |year=2015 |isbn=9789814630139 |doi=10.1142/9335}}</ref> for a general overview and Cavoretto ''et al.''<ref name="CavorettoOptimal18">{{cite journal |title=Optimal Selection of Local Approximants in RBF-PU Interpolation |journal=Journal of Scientific Computing |author=Cavoretto, R.; De Rossi, A.; Perracchione, E. |volume=74 |issue=1 |pages=1–22 |year=2018 |doi=10.1007/s10915-017-0418-7}}</ref><ref name="CavorettoTopology18">{{cite journal |title=Topology analysis of global and local RBF transformations for image registration |journal=Mathematics and Computers in Simulation |author=Cavoretto, R.; De Rossi, A.; Qiao, H. |volume=147 |issue=5 |pages=52–72 |year=2018 |doi=10.1016/j.matcom.2017.10.010}}</ref> for particular instances on the topic and applications. In the interpolation process, we also take into account the critical computational issue of carrying out stable computations. For this reason, and since data are subject to noise, we adopt a kind of Tikhonov regularization.<ref name="CancelliereOCReP15">{{cite journal |title=OCReP: An Optimally Conditioned Regularization for pseudoinversion based neural training |journal=Neural Networks |author=Cancelliere, R.; Gai, M.; Gallinari, P.; Rubini, L. |volume=71 |issue=11 |pages=76–87 |year=2015 |doi=10.1016/j.neunet.2015.07.015}}</ref>
 
The method, namely RBF-SOM, is here tested on two different datasets:
 
* height measurements of children with a diagnosis of growth hormone deficiency (GHD) during treatment, and
* prostate-specific antigen (PSA) values of prostatectomized patients with a recurrence of prostate cancer.
 
In the next section of this paper, the RBF-SOM technique is described. Afterwards, the two datasets used for the validation are presented. The "Results" section is devoted to the numerical results and it is divided into two subsections: in the first one, all the data of each series are considered in order to reconstruct the curves and approximate the parameters, while, in the second one, only a few initial data of each series are used to predict the curve behavior. The last two sections offer a discussion and conclusions.
 
==Methods==
This section describes the method used to fit a given data series and to approximate the parameters involved in the dynamics.
 
Given several scattered measurements <math> \{ y_{i} \}_{i=1}^N</math> sampled at different times <math> \{ t_{i} \}_{i=1}^N</math>, the basic idea of the RBF-SOM proposed here is to consider the theoretical function ''f'', depending on the time ''t'' and on several parameters λ = (λ<sub>1</sub>,..., λ<sub>p</sub>), and to approximate these parameters in order to obtain reliable information on the biological or physical phenomenon.
 
 
In the proposed examples, we use, as theoretical growth curve ''f'', the so-called Gompertzian function:
 
:<math> f(t,\lambda_{1}, \lambda_{2}) = \lambda_{2} \exp\left( -\log\left( \frac{\lambda_{2}}{f_{0}} \right) \exp\left(\lambda_{1} (t-t_{0}) \right) \right)</math>,
 
where ''f<sub>0</sub>'' is the measurement at time ''t<sub>0</sub>'' (i.e., the first measurement), λ<sub>1</sub> is the growth rate, and λ<sub>2</sub> is the carrying capacity, i.e., the maximum value that can be asymptotically achieved by ''f''.
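
As an illustration, the Gompertzian function above can be evaluated with a few lines of Python. This is only a minimal sketch: the function name <code>gompertz</code>, its argument names, and the numerical values are assumptions made for the example and do not come from the original article. The code follows the formula exactly as written above; with that sign convention, the curve saturates towards λ<sub>2</sub> only when the value passed as λ<sub>1</sub> is negative, which is why the example uses a negative rate.

```python
import math


def gompertz(t, lam1, lam2, f0, t0=0.0):
    """Evaluate f(t) = lam2 * exp(-log(lam2 / f0) * exp(lam1 * (t - t0))).

    lam1 : rate parameter (negative here, so that f tends to lam2)
    lam2 : carrying capacity (must be positive, as must f0)
    f0   : first measurement, taken at time t0
    """
    return lam2 * math.exp(-math.log(lam2 / f0) * math.exp(lam1 * (t - t0)))


# Hypothetical values chosen only to show the shape of the curve.
start_value = gompertz(0.0, lam1=-0.5, lam2=100.0, f0=10.0)   # equals f0 at t = t0
late_value = gompertz(50.0, lam1=-0.5, lam2=100.0, f0=10.0)   # close to lam2
```

At ''t'' = ''t<sub>0</sub>'' the inner exponential equals 1, so the function returns ''f<sub>0</sub>''; for large ''t'' the inner exponential vanishes and the value approaches the carrying capacity.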
 
The Gompertzian function is characterized by a fast-growing initial period and by a progressive slowdown, reaching a carrying capacity after a certain time. Depending on the values of the parameters, this curve is able to model a variety of types of growth, from human growth to that of cancer cells; see [12-16] for details. For this reason, we will use in Section 5 the same function for both datasets. Moreover, its form is particularly suitable in this study because its parameters cannot be estimated via simple methods such as least squares approximation.
 
Trivially, the parameters are approximated by finding
 
:<math> \tilde{\lambda} = \arg\min_{\lambda} \left( \sum_{i=1}^{N} \left(y_{i} - f(t_{i},\lambda_{1}, \lambda_{2}) \right)^2 \right)</math>.
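
The quantity being minimized is the sum of squared residuals between the measurements and the Gompertzian curve. The sketch below shows one way to write that objective in Python. The names <code>gompertz</code> and <code>sum_squared_residuals</code> are hypothetical, and the data are synthetic values generated from the model itself purely for illustration; they are not taken from the article's datasets.

```python
import math


def gompertz(t, lam1, lam2, f0, t0):
    """Gompertzian function in the form used in this section."""
    return lam2 * math.exp(-math.log(lam2 / f0) * math.exp(lam1 * (t - t0)))


def sum_squared_residuals(params, times, values):
    """Sum over i of (y_i - f(t_i, lam1, lam2))^2.

    f0 and t0 are taken from the first measurement, as in the text.
    Assumes lam2 > 0 and values[0] > 0 so that the logarithm is defined.
    """
    lam1, lam2 = params
    f0, t0 = values[0], times[0]
    return sum(
        (y - gompertz(t, lam1, lam2, f0, t0)) ** 2
        for t, y in zip(times, values)
    )


# Synthetic, noise-free series generated from known (made-up) parameters.
true_params = (-0.3, 180.0)
times = [0.0, 1.0, 2.5, 4.0, 7.0]
values = [gompertz(t, true_params[0], true_params[1], 120.0, 0.0) for t in times]

error_at_truth = sum_squared_residuals(true_params, times, values)
error_elsewhere = sum_squared_residuals((-0.1, 150.0), times, values)
```

Because the series is generated without noise, the objective is essentially zero at the generating parameters and strictly larger elsewhere; with real measurements the minimum would be positive.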
 
Note that we need optimization methods that can be used when ''f'' is non-linear, as in the considered cases. In particular, we direct our research towards stochastic methods, which have been designed by considering analogies with natural phenomena. The most popular are evolution strategies and genetic algorithms, both based on competition among individuals. In contrast, other methods proposed in recent decades mainly focus on cooperation. Among them, particle swarm optimization (PSO), cuckoo search (CS), and ant colony optimization are widely used techniques, based on the mutual interaction and exchange of information between individuals. In particular, here we will consider PSO and CS, briefly described in what follows.
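
To make the idea of cooperation between individuals concrete, the following is a minimal sketch of a generic, textbook-style PSO with an inertia weight. It is not the authors' implementation: the function name <code>particle_swarm_minimize</code>, the parameter values (swarm size, inertia, acceleration coefficients), the clipping of positions to the bounds, and the test objective are all assumptions chosen for illustration.

```python
import random


def particle_swarm_minimize(objective, bounds, n_particles=30, n_iters=200,
                            inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Minimize objective(list_of_floats) inside box bounds [(lo, hi), ...]."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position (personal best).
    personal_pos = [p[:] for p in positions]
    personal_val = [objective(p) for p in positions]

    # The swarm shares the best position found so far (global best).
    g = min(range(n_particles), key=personal_val.__getitem__)
    global_pos, global_val = personal_pos[g][:], personal_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_cognitive * r1 * (personal_pos[i][d] - positions[i][d])
                    + c_social * r2 * (global_pos[d] - positions[i][d])
                )
                lo, hi = bounds[d]
                # Keep the particle inside the search box.
                positions[i][d] = min(max(positions[i][d] + velocities[i][d], lo), hi)
            val = objective(positions[i])
            if val < personal_val[i]:
                personal_val[i] = val
                personal_pos[i] = positions[i][:]
                if val < global_val:
                    global_val = val
                    global_pos = positions[i][:]
    return global_pos, global_val


# Hypothetical smooth test objective with its minimum at (1, -2).
def shifted_quadratic(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_pos, best_val = particle_swarm_minimize(shifted_quadratic,
                                             [(-5.0, 5.0), (-5.0, 5.0)])
```

The exchange of information happens through the shared global best: every velocity update pulls a particle both towards its own best position and towards the best position known to the whole swarm. In a parameter-fitting setting, the objective passed in would be the sum of squared residuals and the bounds would delimit plausible values of λ<sub>1</sub> and λ<sub>2</sub>.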
 
==References==
{{Reflist|colwidth=30em}}
 
==Notes==
This presentation is faithful to the original, with only a few minor changes to presentation, spelling, and grammar. We also added PMCID and DOI when they were missing from the original reference. No other modifications were made in accordance with the "no derivatives" portion of the distribution license.  
 
<!--Place all category tags here-->
[[Category:LIMSwiki journal articles (added in 2018)]]
[[Category:LIMSwiki journal articles (all)]]
[[Category:LIMSwiki journal articles on public health informatics]]

Latest revision as of 20:31, 18 September 2022

This is demo code demoing math

As a typical example, from a calibration plot following a linear equation taken here as the simplest possible model:

: <math>f(x) = ax + b</math>

where <math>f(x)</math> corresponds to the measured signal (e.g., voltage, luminescence, or energy).
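
A linear calibration model of this kind is usually obtained by an ordinary least-squares fit of signal against a known quantity. The following Python sketch shows the closed-form fit and how the fitted line can be inverted to read an unknown value from a measured signal. The function names, the interpretation of ''x'' as a known standard quantity (e.g., concentration), and the numbers are assumptions made for illustration only.

```python
def fit_calibration_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired (x, y) points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx              # slope (sensitivity)
    b = mean_y - a * mean_x    # intercept (signal at x = 0)
    return a, b


def invert_calibration(signal, a, b):
    """Solve signal = a*x + b for x."""
    return (signal - b) / a


# Made-up calibration standards lying exactly on f(x) = 2x + 0.5.
standards = [0.0, 1.0, 2.0, 3.0, 4.0]
signals = [0.5, 2.5, 4.5, 6.5, 8.5]

slope, intercept = fit_calibration_line(standards, signals)
unknown_x = invert_calibration(6.5, slope, intercept)
```

With real calibration data the points would scatter around the line, and the fitted slope and intercept would carry uncertainty that this sketch does not estimate.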