Support Vector Machine: Research Papers and Abstracts
                                                                                    compiled by Subasish Das

[1] Ya Li, Xinmei Tian, Mingli Song, and Dacheng Tao. Multi-task proximal support vector machine. Pattern Recognition, 48(10):3249 - 3257, 2015. Discriminative Feature Learning from Big Data for Visual Recognition.
Abstract With the explosive growth of the use of imagery, visual recognition plays an important role in many applications and attracts increasing research attention. Given several related tasks, single-task learning learns each task separately and ignores the relationships among these tasks. Different from single-task learning, multi-task learning can explore more information to learn all tasks jointly by using relationships among these tasks. In this paper, we propose a novel multi-task learning model based on the proximal support vector machine. The proximal support vector machine uses the large-margin idea, as do standard support vector machines, but with looser constraints and much lower computational cost. Our multi-task proximal support vector machine inherits the merits of the proximal support vector machine and achieves better performance compared with other popular multi-task learning models. Experiments are conducted on several multi-task learning datasets, including two classification datasets and one regression dataset. All results demonstrate the effectiveness and efficiency of our proposed multi-task proximal support vector machine.

Keywords: Multi-task learning
[2] W. Zhao, J.K. Liu, and Y.Y. Chen. Material behavior modeling with multi-output support vector regression. Applied Mathematical Modelling, 39(17):5216 - 5229, 2015.
Abstract Based on neural network material-modeling technologies, a new paradigm, called multi-output support vector regression, is developed to model complex stress/strain behavior of materials. The constitutive information generally implicitly contained in the results of experiments, i.e., the relationships between stresses and strains, can be captured by training a support vector regression model within a unified architecture from experimental data. This model, inheriting the merits of the neural network based models, can be employed to model the behavior of modern, complex materials such as composites. Moreover, the architectures of the support vector regression built in this research can be more easily determined than that of the neural network. Therefore, the proposed constitutive models can be more conveniently applied to finite element analysis and other application fields. As an illustration, the behaviors of concrete in the state of plane stress under monotonic biaxial loading and compressive uniaxial cycle loading are modeled with the multi-output and single-output support vector regression, respectively. The excellent results show that the support vector regression provides another effective approach for material modeling.

Keywords: Multi-support vector regression
[3] Xiaobing Kong, Xiangjie Liu, Ruifeng Shi, and Kwang Y. Lee. Wind speed prediction using reduced support vector machines with feature selection. Neurocomputing, 169:449 - 456, 2015. Learning for Visual Semantic Understanding in Big Data; ESANN 2014; Industrial Data Processing and Analysis; Selected papers from the 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2014); Selected papers from the 11th World Congress on Intelligent Control and Automation (WCICA2014).
Abstract Accurate prediction of wind speed is one of the most effective ways to solve the problems of reliability, security, stability and quality, which are caused by wind energy production in power systems. This paper presents a wind speed prediction concept with high efficiency convex optimization support vector machine for data regression (SVR). Based on the SVR, a reduced support vector machine (RSVM) is proposed, which preselects a subset of data as support vectors and solves a smaller optimization problem. The principal component analysis is utilized to determine the outcome of the major factors affecting the wind speed. With an increasing number of input variables in the RSVM regression structure, particle swarm optimization (PSO) is incorporated to optimize the parameters. Detailed analysis and simulations using the real time wind power plant data demonstrate the effectiveness of the RSVM-based forecasting approach.

Keywords: Reduced support vector machine for regression
[4] Qifa Xu, Jinxiu Zhang, Cuixia Jiang, Xue Huang, and Yaoyao He. Weighted quantile regression via support vector machine. Expert Systems with Applications, 42(13):5441 - 5451, 2015.
Abstract We propose a new support vector weighted quantile regression approach that is closely built upon the idea of support vector machine. We extend the methodology of several popular quantile regressions to a more general approach. It can be estimated by solving a Lagrangian dual problem of quadratic programming and is able to implement the nonlinear quantile regression by introducing a kernel function. The Monte Carlo simulation studies show that the proposed approach outperforms some widely used quantile regression methods in terms of prediction accuracy. Finally, we demonstrate the efficacy of our proposed method on three benchmark data sets. It reveals that our method performs better in terms of prediction accuracy, which illustrates the importance of taking into account the heterogeneous nonlinear structure among predictors across quantiles.

Keywords: Quantile regression
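
Note on entry [4]: quantile regression is conventionally fitted by minimizing the asymmetric "check" (pinball) loss. The sketch below shows only that standard loss, in plain Python; the function name `pinball_loss` is a hypothetical choice for illustration, and the weighted, kernelized formulation proposed in the paper is not reproduced here.

```python
def pinball_loss(residual, tau):
    """Standard check (pinball) loss for a quantile level tau in (0, 1).

    residual is (observed - predicted). Under-predictions (residual >= 0)
    are weighted by tau, over-predictions by (1 - tau).
    This is the textbook loss, not the weighted variant of entry [4].
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if residual >= 0:
        return tau * residual
    return (tau - 1.0) * residual


if __name__ == "__main__":
    # For the 0.9 quantile, under-predicting by 2 costs nine times
    # as much as over-predicting by 2.
    print(pinball_loss(2.0, 0.9), pinball_loss(-2.0, 0.9))
```

With tau = 0.5 the loss is symmetric and reduces to half the absolute error, which is why the median is the 0.5 quantile estimate.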
[5] G.E. Lee and A. Zaknich. A mixed-integer programming approach to GRNN parameter estimation. Information Sciences, 320:1 - 11, 2015.
Abstract A mixed-integer programming formulation for sparse general regression neural networks (GRNNs) is presented, along with a method for estimating GRNN parameters based on techniques drawn from support vector machines (SVMs) and evolutionary computation. GRNNs have been widely used for regression estimation, learning a function from a set of input/output examples, but they utilise the full set of training examples to evaluate the interpolation function. Sparse GRNNs choose a subset of the training examples, analogous to the support vectors chosen by SVMs. Experimental comparisons are made with non-sparse GRNNs and with sparse GRNNs whose centres are randomly chosen or are chosen using vector quantisation of the input domain. It is shown that the mixed-integer programming approach leads to lower prediction errors compared with previous approaches, especially when using a small fraction of the training examples.

Keywords: General regression neural network
[6] Maher Maalouf and Dirar Homouz. Kernel ridge regression using truncated Newton method. Knowledge-Based Systems, 71:339 - 344, 2014.
Abstract Kernel Ridge Regression (KRR) is a powerful nonlinear regression method. The combination of KRR and the truncated-regularized Newton method, which is based on the conjugate gradient (CG) method, leads to a powerful regression method. The proposed algorithm is called Truncated-Regularized Kernel Ridge Regression (TR-KRR). Compared to the closed-form solution of KRR, Support Vector Machines (SVM) and Least-Squares Support Vector Machines (LS-SVM) algorithms on six data sets, the proposed TR-KRR algorithm is as accurate as, and much faster than, all of the other algorithms.

Keywords: Regression
[7] Hongzhe Dai, Boyi Zhang, and Wei Wang. A multiwavelet support vector regression method for efficient reliability assessment. Reliability Engineering & System Safety, 136:132 - 139, 2015.
Abstract As a new sparse kernel modeling technique, support vector regression has become a promising method in structural reliability analysis. However, in the standard quadratic programming support vector regression, its implementation is computationally expensive and sufficient model sparsity cannot be guaranteed. In order to mitigate these difficulties, this paper presents a new multiwavelet linear programming support vector regression method for reliability analysis. The method develops a novel multiwavelet kernel by constructing the autocorrelation function of multiwavelets and employs this kernel in the context of linear programming support vector regression for approximating the limit states of structures. Three examples involving one finite element-based problem illustrate the effectiveness of the proposed method, which indicate that the new method is more efficient than the classical support vector regression method for response surface function approximation.

Keywords: Structural reliability
[8] Yong-Ping Zhao, Bing Li, Ye-Bo Li, and Kang-Kang Wang. Householder transformation based sparse least squares support vector regression. Neurocomputing, 161:243 - 253, 2015.
Abstract Sparseness is a key problem in modeling problems. To sparsify the solution of normal least squares support vector regression (LSSVR), a novel sparse method is proposed in this paper, which recruits support vectors sequentially by virtue of Householder transformation, here HSLSSVR for short. In HSLSSVR, there are two benefits. On one hand, a recursive strategy is adopted to solve the linear equation set instead of solving it from scratch. During each iteration, the training sample incurring the maximum reduction on the residuals is recruited as support vector. On the other hand, in the process of solving the linear equation set, its condition number does not deteriorate, so the numerical stability is guaranteed. The reports from experiments on benchmark data sets and a real-world mechanical system to calculate the inverse dynamics of a robot arm demonstrate the effectiveness and feasibility of the proposed HSLSSVR.

Keywords: Least squares support vector machine
[9] Wentao Zhu, Jun Miao, and Laiyun Qing. Robust regression with extreme support vectors. Pattern Recognition Letters, 45:205 - 210, 2014.
Abstract Extreme Support Vector Machine (ESVM) is a nonlinear robust SVM algorithm based on regularized least squares optimization for binary-class classification. In this paper, a novel algorithm for regression tasks, Extreme Support Vector Regression (ESVR), is proposed based on ESVM. Moreover, kernel ESVR is suggested as well. Experiments show that ESVR has a better generalization than some other traditional single hidden layer feedforward neural networks, such as Extreme Learning Machine (ELM), Support Vector Regression (SVR) and Least Squares-Support Vector Regression (LS-SVR). Furthermore, ESVR has much faster learning speed than SVR and LS-SVR. Stabilities and robustnesses of these algorithms are also studied in the paper, which shows that the ESVR is more robust and stable.

Keywords: Extreme Support Vector Regression
[10] Nikola Marković, Sanjin Milinković, Konstantin S. Tikhonov, and Paul Schonfeld. Analyzing passenger train arrival delays with support vector regression. Transportation Research Part C: Emerging Technologies, 56:251 - 262, 2015.
Abstract We propose machine learning models that capture the relation between passenger train arrival delays and various characteristics of a railway system. Such models can be used at the tactical level to evaluate effects of various changes in a railway system on train delays. We present the first application of support vector regression in the analysis of train delays and compare its performance with the artificial neural networks which have been commonly used for such problems. Statistical comparison of the two models indicates that the support vector regression outperforms the artificial neural networks. Data for this analysis are collected from Serbian Railways and include expert opinions about the influence of infrastructure along different routes on train arrival delays.

Keywords: Train arrival delays
[11] Yong-Ping Zhao, Kang-Kang Wang, and Fu Li. A pruning method of refining recursive reduced least squares support vector regression. Information Sciences, 296:160 - 174, 2015.
Abstract In this paper, a pruning method is proposed to refine the recursive reduced least squares support vector regression (RRLSSVR) and its improved version (IRRLSSVR), and thus two novel algorithms PruRRLSSVR and PruIRRLSSVR are yielded. This pruning method ranks support vectors by defining a contribution function to the objective function, and then the support vector with the least contribution is pruned unless it is the most recently selected support vector. Consequently, PruRRLSSVR and PruIRRLSSVR outperform RRLSSVR and IRRLSSVR respectively in terms of the number of support vectors while not impairing the generalization performance. In addition, a speedup scheme is presented that reduces the computational burden of computing the contribution function. To show the effectiveness and feasibility of the proposed PruRRLSSVR and PruIRRLSSVR, experiments are performed on ten benchmark data sets and a gas furnace instance.

Keywords: Support vector machine
[12] Jie Hu and Kai Zheng. A novel support vector regression for data set with outliers. Applied Soft Computing, 31:405 - 411, 2015.
Abstract Support vector machine (SVM) is sensitive to outliers, which reduces its generalization ability. This paper presents a novel support vector regression (SVR) together with fuzzification theory, inconsistency matrix and neighbors match operator to address this critical issue. Fuzzification method is exploited to assign similarities on the input space and on the output response to each pair of training samples respectively. The inconsistency matrix is used to calculate the weights of input variables, followed by searching outliers through a novel neighborhood matching algorithm and then eliminating them. Finally, the processed data is sent to the original SVR, and the prediction results are acquired. A simulation example and three real-world applications demonstrate the proposed method for data sets with outliers.

Keywords: Support vector regression
[13] Yongqiao Wang, He Ni, and Shouyang Wang. Multiple-ν support vector regression based on spectral risk measure minimization. Neurocomputing, 101:217 - 228, 2013.
Abstract Statistical learning theory provides the justification of the ϵ-insensitive loss in support vector regression, but offers little guidance on the determination of the critical hyper-parameter ϵ. Instead of predefining ϵ, ν-support vector regression automatically selects ϵ by making the percent of deviations larger than ϵ be asymptotically equal to ν. In stochastic programming terminology, the goal of ν-support vector regression is to minimize the conditional Value-at-Risk measure of deviations, i.e. the expectation of the larger ν-percent deviations. This paper tackles the determination of the critical hyper-parameter ν in ν-support vector regression when the error term follows a complex distribution. Instead of one singleton ν, the paper assumes ν to be a combination of multiple, finite or infinite, candidate choices. Thus, the cost function becomes a weighted sum of component conditional value-at-risk measures associated with these base νs. This paper shows that this cost function can be represented with a spectral risk measure and its minimization can be reformulated to a linear programming problem. Experiments on three artificial data sets show that this multiple-ν support vector regression has great advantage over the classical ν-support vector regression when the error terms follow mixed polynomial distributions. Experiments on 10 real-world data sets also clearly demonstrate that this new method can achieve better performance than ϵ-support vector regression and ν-support vector regression.

Keywords: Conditional value-at-risk
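
Note on entry [13]: the two quantities the abstract relies on can be illustrated with a few lines of plain Python. This is a minimal sketch, not the paper's method: `eps_insensitive_loss` is the standard ϵ-insensitive loss, and `empirical_cvar` is an assumed, simple empirical estimate of conditional value-at-risk (the average of the largest ν-fraction of absolute deviations). Both function names are hypothetical, and the spectral risk measure and linear program from the paper are not reproduced.

```python
import math


def eps_insensitive_loss(residual, eps):
    """Standard eps-insensitive loss: deviations inside the tube cost nothing."""
    return max(0.0, abs(residual) - eps)


def empirical_cvar(abs_devs, nu):
    """Average of the largest nu-fraction of absolute deviations.

    A simple empirical stand-in for conditional value-at-risk at level nu
    (0 < nu <= 1); at least one deviation is always included.
    """
    if not 0.0 < nu <= 1.0:
        raise ValueError("nu must lie in (0, 1]")
    k = max(1, math.ceil(nu * len(abs_devs)))
    worst = sorted(abs_devs, reverse=True)[:k]
    return sum(worst) / k


if __name__ == "__main__":
    devs = [1.0, 2.0, 3.0, 4.0]
    # With nu = 0.5 the two largest deviations (4 and 3) are averaged.
    print(empirical_cvar(devs, 0.5))
    print(eps_insensitive_loss(0.05, 0.1), eps_insensitive_loss(-0.3, 0.1))
```

Setting nu = 1 averages all deviations, while a small nu focuses the measure on the worst-fitted samples, which is the sense in which ν trades off average fit against tail error.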
[14] Jooyong Shim and Changha Hwang. Varying coefficient modeling via least squares support vector regression. Neurocomputing, 161:254 - 259, 2015.
Abstract The varying coefficient regression model has received a great deal of attention as an important tool for modeling the dynamic changes of regression coefficients in the social and natural sciences. Considerable effort has been devoted to developing effective estimation methods for such regression models. In this paper we propose a method for fitting the varying coefficient regression model using the least squares support vector regression technique, which analyzes the dynamic relation between a response and a group of covariates. We also consider a generalized cross validation method for choosing the hyperparameters which affect the performance of the proposed method. We provide a method for estimating the confidence intervals of coefficient functions. The proposed method is evaluated through simulation and real example studies.

Keywords: Confidence interval
[15] Jiqiang Chen, Witold Pedrycz, Minghu Ha, and Litao Ma. Set-valued samples based support vector regression and its applications. Expert Systems with Applications, 42(5):2502 - 2509, 2015.
Abstract In this study, we address the regression problem on set-valued samples that appear in applications. To solve this problem, we propose a support vector regression approach for set-valued samples that generalizes the classical ε-support vector regression. First, an initial representative point (or an element) for every set-valued sample is selected, and a weighted distance between the initial representative point and other points is determined. Second, based on the classification consistency principle, a search algorithm to determine the best representative point for every set-valued datum is designed. Thus, the set-valued samples are converted into numeric samples. Finally, a support vector regression that is based on set-valued data is constructed, and the regression results of the set-valued samples can be approximated using the method used for the numeric samples. Furthermore, the feasibility and efficiency of the proposed method are demonstrated using experiments with real-world examples concerning wind speed prediction and the prediction of peak particle velocity.

Keywords: Support vector machine
[16] Xiao Yao, Jonathan Crook, and Galina Andreeva. Support vector regression for loss given default modelling. European Journal of Operational Research, 240(2):528 - 538, 2015.
Abstract Loss given default modelling has become crucially important for banks due to the requirement that they comply with the Basel Accords and to their internal computations of economic capital. In this paper, support vector regression (SVR) techniques are applied to predict loss given default of corporate bonds, where improvements are proposed to increase prediction accuracy by modifying the SVR algorithm to account for heterogeneity of bond seniorities. We compare the predictions from SVR techniques with thirteen other algorithms. Our paper has three important results. First, at an aggregated level, the proposed improved versions of support vector regression techniques outperform other methods significantly. Second, at a segmented level, by bond seniority, least square support vector regression demonstrates significantly better predictive abilities compared with the other statistical models. Third, standard transformations of loss given default do not improve prediction accuracy. Overall our empirical results show that support vector regression techniques are a promising technique for banks to use to predict loss given default.

Keywords: Support vector regression
[17] Jiawei Xiang, Ming Liang, and Yumin He. Experimental investigation of frequency-based multi-damage detection for beams using support vector regression. Engineering Fracture Mechanics, 131:257 - 268, 2014.
Abstract A frequency-based damage detection method in conjunction with the support vector regression is presented. The wavelet finite element method is used for numerical simulation to determine the relationship database between multi-damage locations/depths and natural frequencies of a beam. Then, support vector regression is applied to extract the damage locations and depths from the database due to its ability in handling nonlinearity, finding global solutions, and processing high dimensional input vectors. Finally, a large number of experiments have been carried out to further examine the performance of the proposed method.

Keywords: Multi-damage detection
[18] G. Santamaría-Bonfil, A. Reyes-Ballesteros, and C. Gershenson. Wind speed forecasting for wind farms: A method based on support vector regression. Renewable Energy, 85:790 - 809, 2016.
Abstract In this paper, a hybrid methodology based on Support Vector Regression for wind speed forecasting is proposed. Using the autoregressive model called Time Delay Coordinates, feature selection is performed by the Phase Space Reconstruction procedure. Then, a Support Vector Regression model is trained using univariate wind speed time series. Parameters of Support Vector Regression are tuned by a genetic algorithm. The proposed method is compared against the persistence model, and autoregressive models (AR, ARMA, and ARIMA) tuned by Akaike's Information Criterion and Ordinary Least Squares method. The stationary transformation of time series is also evaluated for the proposed method. Using historical wind speed data from the Mexican Wind Energy Technology Center (CERTE) located at La Ventosa, Oaxaca, México, the accuracy of the proposed forecasting method is evaluated for a whole range of short term forecasting horizons (from 1 to 24 h ahead). Results show that forecasts made with our method are more accurate for medium (5–23 h ahead) short term WSF and WPF than those made with persistence and autoregressive models.

Keywords: Wind speed forecasting
[19] Chen Yongqi. Least squares support vector fuzzy regression. Energy Procedia, 17, Part A:711 - 716, 2012. 2012 International Conference on Future Electrical Power and Energy System.
Abstract A least squares support vector fuzzy regression model (LS_SVFR) is proposed to estimate uncertain and imprecise data by applying the fuzzy sets principle to the weight vector. Determining the weight vector and the bias term of this model requires only the solution of a set of linear equations, as against the solution of a complicated quadratic programming problem in the existing support vector fuzzy regression model. A numerical example is given to demonstrate the effectiveness and applicability of the proposed model.

Keywords: Interval analysis
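
Note on entry [19]: the remark that a least squares formulation needs only a set of linear equations can be made concrete with the standard (crisp, non-fuzzy) least squares support vector regression system. The sketch below is an illustration under stated assumptions, not the LS_SVFR model of the paper: it uses NumPy, a Gaussian (RBF) kernel, and hypothetical names (`rbf_kernel`, `lssvr_fit`, `lssvr_predict`) and hyperparameters (`gamma`, `sigma`) chosen for this example.

```python
import numpy as np


def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def lssvr_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the standard LS-SVR linear system for the bias b and weights alpha.

        [ 0   1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                      # constraint: sum(alpha) = 0
    A[1:, 0] = 1.0                      # bias column
    A[1:, 1:] = K + np.eye(n) / gamma   # regularized kernel block
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)       # one linear solve, no QP
    return sol[0], sol[1:]


def lssvr_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i k(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b


if __name__ == "__main__":
    X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
    y = np.sin(2.0 * np.pi * X[:, 0])
    b, alpha = lssvr_fit(X, y, gamma=10.0, sigma=0.2)
    pred = lssvr_predict(X, b, alpha, X, sigma=0.2)
    print("max training residual:", np.abs(y - pred).max())
```

Because every training sample receives a nonzero alpha in general, the plain least squares solution is not sparse, which is the motivation for the sparsification and pruning methods in entries [8] and [11].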
[20] J. Phillips, E. Cripps, John W. Lau, and M.R. Hodkiewicz. Classifying machinery condition using oil samples and binary logistic regression. Mechanical Systems and Signal Processing, 60–61:316 - 325, 2015.
Abstract The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically “black box” approaches such as artificial neural networks (ANN) and support vector machines (SVM) can be difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight to the drivers of the human classification process and to the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real world oil analysis data set from engines on mining trucks is presented and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

Keywords: Logistic regression
[21] Rui Ji, Yupu Yang, and Weidong Zhang. Incremental smooth support vector regression for Takagi–Sugeno fuzzy modeling. Neurocomputing, 123:281 - 291, 2014. Contains Special issue articles: Advances in Pattern Recognition Applications and Methods.
Abstract We propose an architecture for Takagi–Sugeno (TS) fuzzy system and develop an incremental smooth support vector regression (ISSVR) algorithm to build the TS fuzzy system. ISSVR is based on the ε-insensitive smooth support vector regression (ε-SSVR), a smoothing strategy for solving ε-SVR, and incremental reduced support vector machine (RSVM). The ISSVR incrementally selects representative samples from the given dataset as support vectors. We show that TS fuzzy modeling is equivalent to the ISSVR problem under certain assumptions. A TS fuzzy system can be generated from the given training data based on the ISSVR learning with each fuzzy rule given by a support vector. Compared with other fuzzy modeling methods, more forms of membership functions can be used in our model, and the number of fuzzy rules of our model is much smaller. The performance of our model is illustrated by extensive experiments and comparisons.

Keywords: Takagi–Sugeno fuzzy systems
[22] Paulo R. Filgueiras, Luciana A. Terra, Eustáquio V.R. Castro, Lize M.S.L. Oliveira, Júlio C.M. Dias, and Ronei J. Poppi. Prediction of the distillation temperatures of crude oils using 1H NMR and support vector regression with estimated confidence intervals. Talanta, 142:197 - 205, 2015.
Abstract This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using 1H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6 °C was obtained in comparison with 15.6 °C for PLS, 15.1 °C for ePLS and 28.4 °C for SVR. The RMSEPs for T50% were 24.2 °C, 23.4 °C, 22.8 °C and 14.4 °C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0 °C, 39.9 °C and 39.9 °C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS.

Keywords: Boosting
[23] A. Troncoso, S. Salcedo-Sanz, C. Casanova-Mateo, J.C. Riquelme, and L. Prieto. Local models-based regression trees for very short-term wind speed prediction. Renewable Energy, 81:589 - 598, 2015.
Abstract This paper evaluates the performance of different types of Regression Trees (RTs) in a real problem of very short-term wind speed prediction from measuring data in wind farms. RT is a solidly established methodology that, contrary to other soft-computing approaches, has been under-explored in problems of wind speed prediction in wind farms. In this paper we comparatively evaluate eight different types of RT algorithms, and we show that they are able to obtain excellent results in real problems of very short-term wind speed prediction, improving on existing classical and soft-computing approaches such as multi-linear regression approaches, different types of neural networks and support vector regression algorithms in this problem. We also show that RTs have a very small computation time, which allows the retraining of the algorithms whenever new wind speed data are collected from the measuring towers.

Keywords: Wind speed prediction
[24] J. García-Gutiérrez, F. Martínez-Álvarez, A. Troncoso, and J.C. Riquelme. A comparison of machine learning regression techniques for LiDAR-derived estimation of forest variables. Neurocomputing, 167:24 - 31, 2015.
Abstract Light Detection and Ranging (LiDAR) is a remote sensor able to extract three-dimensional information. Environmental models in forest areas have benefited from the use of LiDAR-derived information in recent years. A multiple linear regression (MLR) with previous stepwise feature selection is the most common method in the literature to develop those models. MLR defines the relation between the set of field measurements and the statistics extracted from a LiDAR flight. Machine learning has emerged as a suitable tool to improve classic stepwise MLR results on LiDAR. Unfortunately, few studies have been proposed to compare the quality of the multiple machine learning approaches. This paper presents a comparison between the classic MLR-based methodology and regression techniques in machine learning (neural networks, support vector machines, nearest neighbour, ensembles such as random forests) with special emphasis on regression trees. The selected techniques are applied to real LiDAR data from two areas in the province of Lugo (Galizia, Spain). The results confirm that classic MLR is outperformed by machine learning techniques and concretely, our experiments suggest that Support Vector Regression with Gaussian kernels statistically outperforms the rest of the techniques.

Keywords: LiDAR
[25] Guangyu Zhu, Da Huang, Peng Zhang, and Weijie Ban. ε-proximal support vector machine for binary classification and its application in vehicle recognition. Neurocomputing, 161:260 - 266, 2015.
Abstract In this paper, we propose a novel proximal support vector machine (PSVM), named ε-proximal support vector machine (ε-PSVM), for binary classification. By introducing the ε-insensitive loss function instead of the quadratic loss function into PSVM, the proposed ε-PSVM has several advantages compared with the traditional PSVM: (1) It is sparse, with the sparsity controlled by the parameter ε. (2) It is actually a kind of ε-support vector regression (ε-SVR); the only difference here is that it takes the binary classification problem as a special kind of regression problem. (3) By weighting a different sparseness parameter ε for each class, the unbalanced problem can be solved successfully; furthermore, a useful choice of the parameter ε is proposed. (4) It can be solved efficiently for large scale problems by the Successive Overrelaxation (SOR) technique. Experimental results on several benchmark datasets show the effectiveness of our method in sparseness, balance performance and classification accuracy, and therefore confirm the above conclusion further. Finally, we also apply this new method to vehicle recognition and the results show its efficiency.

Keywords: Proximal support vector machines
[26] Bryan R. Herman, Benoit Forget, and Kord Smith. Progress toward Monte Carlo–thermal hydraulic coupling using low-order nonlinear diffusion acceleration methods. Annals of Nuclear Energy, 84:63 - 72, 2015. Multi-Physics Modelling of LWR Static and Transient Behaviour.
Abstract A new approach for coupled Monte Carlo (MC) and thermal hydraulics (TH) simulations is proposed using low-order nonlinear diffusion acceleration methods. This approach uses new features such as coarse mesh finite difference diffusion (CMFD), multipole representation for fuel temperature feedback on microscopic cross sections, and support vector machine learning algorithms (SVM) for iterations between CMFD and TH equations. The multipole representation method showed small differences of about 0.3% root mean square (RMS) error in converged assembly source distribution compared to a conventional MC simulation with ACE data at the same temperature. This is within two standard deviations of the real uncertainty. Eigenvalue differences were on the order of 10 pcm. Support vector machine regression was performed on-the-fly during MC simulations. Regression results of macroscopic cross sections parametrized by coolant density and fuel temperature were successful and eliminated the need of partial derivative tables generated from lattice codes. All of these new tools were integrated together to perform MC–CMFD–TH–SVM iterations. Results showed that inner iterations between CMFD–TH–SVM are needed to obtain a stable solution.

Keywords: Monte Carlo
[27] Jingwen Zhang, Pan Liu, Hao Wang, Xiaohui Lei, and Yanlai Zhou. A Bayesian model averaging method for the derivation of reservoir operating rules. Journal of Hydrology, 528:276 - 285, 2015.
Summary Because the intrinsic dynamics among optimal decision making, inflow processes and reservoir characteristics are complex, functional forms of reservoir operating rules are always determined subjectively. As a result, the uncertainty of selecting form and/or model involved in reservoir operating rules must be analyzed and evaluated. In this study, we analyze the uncertainty of reservoir operating rules using the Bayesian model averaging (BMA) model. Three popular operating rules, namely piecewise linear regression, surface fitting and a least-squares support vector machine, are established based on the optimal deterministic reservoir operation. These individual models provide three-member decisions for the BMA combination, enabling the 90% release interval to be estimated by the Markov Chain Monte Carlo simulation. A case study of China’s Baise reservoir shows that: (1) the optimal deterministic reservoir operation, superior to any reservoir operating rules, is used as the samples to derive the rules; (2) the least-squares support vector machine model is more effective than both piecewise linear regression and surface fitting; (3) BMA outperforms any individual model of operating rules based on the optimal trajectories. It is revealed that the proposed model can reduce the uncertainty of operating rules, which is of great potential benefit in evaluating the confidence interval of decisions.

Keywords: Reservoir operation
[28] Hamid Taghavifar, Aref Mardani, and Haleh Karim Maslak. A comparative study between artificial neural networks and support vector regression for modeling of the dissipated energy through tire-obstacle collision dynamics. Energy, 2015.
Abstract Energy dissipation control has long been synthesized addressing the trafficking of wheeled vehicles. Studies of wheel-obstacle collision have focused more on ride comfort, stability, maneuvering, and suspension purposes. This paper communicates, for the first time, the energy dissipation analysis through tire-obstacle collision, which frequently occurs for wheeled vehicles, particularly off-road vehicles. To this aim, a soil bin facility equipped with a single wheel-tester is employed, considering input parameters of wheel load, speed, slippage, and obstacle height, each at three different levels. In the next step, the potential of classic artificial neural networks was appraised against support vector regression with the two kernels of radial basis function and polynomial function. On account of performance metrics, it was revealed that radial basis function based support vector regression outperforms the other tested methods for the prediction of dissipated energy through tire-obstacle collision dynamics. The details are documented in the paper.

Keywords: Energy dissipation
[29] Kuo-Chen Hung and Kuo-Ping Lin. Long-term business cycle forecasting through a potential intuitionistic fuzzy least-squares support vector regression approach. Information Sciences, 224:37 - 48, 2013.
This paper developed a novel intuitionistic fuzzy least-squares support vector regression with genetic algorithms (IFLS-SVRGAs) to accurately forecast the long-term indexes of business cycles. Long-term business cycle forecasting is an important issue in economic evaluation, as business cycle indexes may contain uncertain factors or phenomena such as government policies and financial meltdowns. In order to effectively handle such factors and accidental forecasting indexes of business cycles, the proposed method combined intuitionistic fuzzy technology with least-squares support vector regression (LS-SVR). The LS-SVR method has been successfully applied to forecasting problems, especially time series problems. The prediction model in this paper adopted two LS-SVRs with intuitionistic fuzzy sets, in order to approach the intuitionistic fuzzy upper and lower bounds and to provide numeric prediction values. Furthermore, genetic algorithms (GAs) were simultaneously employed in order to select the parameters of the IFLS-SVR models. In this study, IFLS-SVRGA, intuitionistic fuzzy support vector regression (IFSVR), fuzzy support vector regression (FSVR), least-squares support vector regression (LS-SVR), support vector regression (SVR) and the autoregressive integrated moving average (ARIMA) were employed for the long-term index forecasting of Taiwanese businesses. The empirical results indicated that the proposed IFLS-SVRGA model has better performance in terms of forecasting accuracy than the other methods. Therefore, the IFLS-SVRGA model can efficiently provide credible long-term predictions for business index forecasting in Taiwan.

Keywords: Long-term business cycle forecasting
[30] João Dallyson Sousa de Almeida, Aristófanes Corrêa Silva, Jorge Antonio Meireles Teixeira, Anselmo Cardoso Paiva, and Marcelo Gattass. Surgical planning for horizontal strabismus using support vector regression. Computers in Biology and Medicine, 63:178 - 186, 2015.
Abstract Strabismus is a pathology which affects about 4% of the population, causing esthetic problems (reversible at any age) and irreversible sensory disorders, altering the vision mechanism. Many techniques can be applied to settle the muscular balance, thus eliminating strabismus. However, when the conservative treatment is not enough, the surgical treatment is adopted, applying recoils or resections to the ocular muscles affected. The factors involved in the surgical strategy in cases of strabismus are complex, demanding both theoretical knowledge and experience from the surgeon. So, the present work proposes a methodology based on Support Vector Regression to help the physician with decisions related to horizontal strabismus surgeries. The efficiency of the method at the indication of the surgical plan was evaluated through the average difference between the values that it provided and the values indicated by the specialists. In the planning of medial rectus muscle surgeries, the average error was 0.5 mm for recoil and 0.7 for resection. For lateral rectus muscles, the mean error was 0.6 for recoil and 0.8 for resection. The results are promising and prove the feasibility of the use of Support Vector Regression in the indication of strabismus surgeries.

Keywords: Surgical planning
[31] Hamid Reza Ansari and Amin Gholami. An improved support vector regression model for estimation of saturation pressure of crude oils. Fluid Phase Equilibria, 402:124 - 132, 2015.
Abstract The use of intelligence-based approaches for modeling crude oil saturation pressure is a viable alternative, since this parameter plays an influential role in reservoir calculations. The objective of the current study is to develop a smart model that fuses a support vector regression model with an optimization technique to learn the relation between the saturation pressure and compositional data, viz. temperature, hydrocarbon and non-hydrocarbon compositions of crudes, and heptane-plus specifications. The optimization methods improve the performance of the support vector regression (SVR) model by finding proper values of its free parameters. The optimization methods embedded in the SVR formulation in this study are the genetic algorithm (GA), imperialist competitive algorithm (ICA), particle swarm optimization algorithm (PSO), cuckoo search algorithm (CS), and bat-inspired algorithm (BA). The optimized models were applied to experimental data given in the open literature, and the performance of each optimization algorithm was assessed by statistical criteria. This evaluation clearly shows the superiority of BA when integrated with support vector regression for determining the optimal values of its parameters. In addition, the results of the aforementioned optimized models were compared with currently available predictive approaches. The comparative results revealed that the hybrid of BA and SVR yields a robust model that outperforms other models in terms of a higher correlation coefficient and lower mean square error.

Keywords: Support vector regression (SVR)
[32] S. Balasundaram and Deepak Gupta. Training Lagrangian twin support vector regression via unconstrained convex minimization. Knowledge-Based Systems, 59:85 - 96, 2014.
Abstract In this paper, a new unconstrained convex minimization problem formulation is proposed as the Lagrangian dual of the 2-norm twin support vector regression (TSVR). The proposed formulation leads to two smaller sized unconstrained minimization problems having their objective functions piece-wise quadratic and differentiable. It is further proposed to apply gradient based iterative method for solving them. However, since their objective functions contain the non-smooth ‘plus’ function, two approaches are taken: (i) either considering their generalized Hessian or introducing a smooth function in place of the ‘plus’ function, and applying Newton–Armijo algorithm; (ii) obtaining their critical points by functional iterative algorithm. Computational results obtained on a number of synthetic and real-world benchmark datasets clearly illustrate the superiority of the proposed unconstrained Lagrangian twin support vector regression formulation as comparable generalization performance is achieved with much faster learning speed in accordance with the classical support vector regression and TSVR.

Keywords: Generalized Hessian
[33] Qi Wu, Rob Law, Edmond Wu, and Jinxing Lin. A hybrid-forecasting model reducing Gaussian noise based on the Gaussian support vector regression machine and chaotic particle swarm optimization. Information Sciences, 238:96 - 110, 2013.
In this paper, the relationship between Gaussian noise and the loss function of the support vector regression machine (SVRM) is analyzed, and then a Gaussian loss function is proposed to reduce the effect of such noise on the regression estimates. Since the ε-insensitive loss function cannot reduce noise, a novel support vector regression machine, g-SVRM, is proposed, and a chaotic particle swarm optimization (CPSO) algorithm is developed to estimate its unknown parameters. Finally, a hybrid-forecasting model combining g-SVRM with CPSO is proposed to forecast a multi-dimensional time series. The results of two experiments demonstrate the feasibility of this approach.

Keywords: Support vector regression machine
[34] Qinghua Hu, Shiguang Zhang, Zongxia Xie, Jusheng Mi, and Jie Wan. Noise model based ν-support vector regression with its application to short-term wind speed forecasting. Neural Networks, 57:1 - 11, 2014.
Abstract Support vector regression (SVR) techniques are aimed at discovering a linear or nonlinear structure hidden in sample data. Most existing regression techniques assume that the error distribution is Gaussian. However, it was observed that the noise in some real-world applications, such as wind power forecasting and the direction-of-arrival estimation problem, does not follow a Gaussian distribution, but a beta distribution, Laplacian distribution, or other models. In these cases the current regression techniques are not optimal. According to the Bayesian approach, we derive a general loss function and develop a technique of the uniform model of ν-support vector regression for the general noise model (N-SVR). The Augmented Lagrange Multiplier method is introduced to solve N-SVR. Numerical experiments on artificial data sets, UCI data and short-term wind speed prediction are conducted. The results show the effectiveness of the proposed technique.

Keywords: Support vector regression
[35] Maojin Tan, Xiaodong Song, Xuan Yang, and Qingzhao Wu. Support-vector-regression machine technology for total organic carbon content prediction from wireline logs in organic shale: A comparative study. Journal of Natural Gas Science and Engineering, 26:792 - 802, 2015.
Abstract Organic shale is one of the most important unconventional oil and gas resources. Hydrocarbon potential prediction of organic shale such as total organic carbon (TOC) is an important evaluation tool, which primarily uses empirical equations. A support-vector machine is a set of supervised tools used for classification and regression problems. In this study, a support-vector machine for regression (SVR) is investigated to estimate the TOC content in gas-bearing shale. First, SVR technology is introduced including its basic concepts, associated regression algorithms and kernel functions, and a TOC prediction sketch that uses wireline logs. Then, one example is considered to compare three different regression algorithms and four different kernel functions in a packet dataset validation process and a leave-one-out cross-validation process. Error analysis indicates that the SVR method with the Epsilon-SVR regression algorithm and the Gaussian kernel produces the best results. The method of choosing the optimum Gamma value in the Gaussian kernel function is also introduced. Next, for comparison, the SVR-derived TOC with the optimal model and parameters is compared with the empirical formula and the ΔlogR methods. Finally, in a real continuous TOC prediction using wireline logs, TOC prediction tests are performed using SVR to choose the optimal logs as inputs, and the optimal input is finally chosen. Additionally, the radial basis network (RBF) is also applied to perform tests with different inputs; the results of these tests are compared with those of the SVR method. This study shows that SVR technology is a powerful tool for TOC prediction and is more effective and applicable than a single empirical model, ΔlogR and some network methods.

Keywords: Organic shale
[36] Stefan Tötterman and Hannu T. Toivonen. Support vector method for identification of Wiener models. Journal of Process Control, 19(7):1174 - 1181, 2009.
Support vector regression is applied to identify nonlinear systems represented by Wiener models, consisting of a linear dynamic system in series with a static nonlinear block. The linear block is expanded in terms of basis functions, such as Laguerre or Kautz filters, and the static nonlinear block is determined using support vector machine regression.

Keywords: Support vector machines
[37] Shreenivas Londhe and Seema S. Gavraskar. Forecasting one day ahead stream flow using support vector regression. Aquatic Procedia, 4:900 - 907, 2015. International Conference on Water Resources, Coastal and Ocean Engineering (ICWRCOE'15).
Abstract Effective stream flow forecasting for different lead times is useful in water resource management in arid regions, in the design of hydraulic structures and in almost all water-resources-related issues. Support Vector Machines are learning systems that use a hypothesis space of linear functions in a kernel-induced higher-dimensional feature space, and are trained with a learning algorithm from optimization theory. Support vector machines are supervised learning methods that are commonly used for classification and regression purposes. An SVM constructs a separating hyperplane between the classes in the n-dimensional space of the inputs. Support Vector Regression attempts to fit a curve, with respect to the kernel used in the SVM, on data points such that the points lie between two marginal hyperplanes, which helps in minimizing the regression error. For non-linear regression problems, kernel functions are used to map the data into a higher-dimensional space where linear regression is performed. The current paper presents the use of the data-driven technique of Support Vector Regression (SVR) to forecast stream flow one day ahead at two stations in India, namely Nighoje in the Krishna river basin and Mandaleshwar in the Narmada river basin. For forecasting stream flow one day in advance, previous values of measured stream flow and rainfall were used for building the models. The relevant inputs were fixed on the basis of autocorrelation, cross-correlation and trial and error. The model results were reasonable, as evident from the low value of Root Mean Square Error (RMSE) accompanied by scatter plots and hydrographs.

Keywords: Stream flow forecast
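The abstract above describes building one-day-ahead models from previous values of stream flow and rainfall. A minimal sketch of how such lagged input rows might be assembled is given below; the function name `make_lagged_samples`, the fixed lag count and the toy numbers are all hypothetical, and the paper itself selects its inputs by autocorrelation, cross-correlation and trial and error rather than by a fixed lag.

```python
def make_lagged_samples(flow, rain, n_lags=2):
    """Build (inputs, targets) pairs for one-day-ahead forecasting.

    Each input row holds the previous n_lags flow values followed by the
    previous n_lags rainfall values; the target is the flow on the next day.
    The lag count is an illustrative assumption, not the paper's choice.
    """
    if len(flow) != len(rain):
        raise ValueError("flow and rain must have the same length")
    inputs, targets = [], []
    for t in range(n_lags, len(flow)):
        row = list(flow[t - n_lags:t]) + list(rain[t - n_lags:t])
        inputs.append(row)
        targets.append(flow[t])
    return inputs, targets


# Toy series (made-up numbers, not data from the paper).
flow = [10.0, 12.0, 15.0, 14.0, 13.0]
rain = [0.0, 5.0, 2.0, 0.0, 1.0]
X, y = make_lagged_samples(flow, rain, n_lags=2)
```

The resulting rows and targets could then be passed to any regression routine, for example an SVR implementation.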
[38] Xinjun Peng, Dong Xu, and Jindong Shen. A twin projection support vector machine for data regression. Neurocomputing, 138:131 - 141, 2014.
Abstract In this paper, an efficient twin projection support vector regression (TPSVR) algorithm for data regression is proposed. This TPSVR determines the regression function indirectly through a pair of nonparallel up- and down-bound functions solved by two smaller sized support vector machine (SVM)-type problems. In each optimization problem of TPSVR, it seeks a projection axis such that the variance of the projected points is minimized by introducing a new term, which makes it not only minimize the empirical variance of the projected inputs, but also maximize the empirical correlation coefficient between the up- or down-bound targets and the projected inputs. In terms of generalization performance, the experimental results indicate that TPSVR not only obtains better and more stable prediction performance than the classical SVR and some other algorithms, but also needs fewer support vectors (SVs) than the classical SVR.

Keywords: Support vector regression
[39] Bin Gu, Victor S. Sheng, Zhijie Wang, Derek Ho, Said Osman, and Shuo Li. Incremental learning for ν-support vector regression. Neural Networks, 67:140 - 150, 2015.
Abstract The ν-Support Vector Regression (ν-SVR) is an effective regression learning algorithm, which has the advantage of using a parameter ν on controlling the number of support vectors and adjusting the width of the tube automatically. However, compared to ν-Support Vector Classification (ν-SVC) (Schölkopf et al., 2000), ν-SVR introduces an additional linear term into its objective function. Thus, directly applying the accurate on-line ν-SVC algorithm (AONSVM) to ν-SVR will not generate an effective initial solution. It is the main challenge to design an incremental ν-SVR learning algorithm. To overcome this challenge, we propose a special procedure called initial adjustments in this paper. This procedure adjusts the weights of ν-SVC based on the Karush–Kuhn–Tucker (KKT) conditions to prepare an initial solution for the incremental learning. Combining the initial adjustments with the two steps of AONSVM produces an exact and effective incremental ν-SVR learning algorithm (INSVR). Theoretical analysis has proven the existence of the three key inverse matrices, which are the cornerstones of the three steps of INSVR (including the initial adjustments), respectively. The experiments on benchmark datasets demonstrate that INSVR can avoid the infeasible updating paths as far as possible, and successfully converges to the optimal solution. The results also show that INSVR is faster than batch ν-SVR algorithms with both cold and warm starts.

Keywords: Incremental learning
[40] Zhi-Min Yang, Xiang-Yu Hua, Yuan-Hai Shao, and Ya-Fen Ye. A novel parametric-insensitive nonparallel support vector machine for regression. Neurocomputing, 2015.
Abstract In this paper, a novel parametric-insensitive nonparallel support vector regression (PINSVR) algorithm for data regression is proposed. PINSVR indirectly finds a pair of nonparallel proximal functions with a pair of different parametric-insensitive nonparallel proximal functions by solving two smaller sized quadratic programming problems (QPPs). By using new parametric-insensitive loss functions, the proposed PINSVR automatically adjusts a flexible parametric-insensitive zone of arbitrary shape and minimal size to include the given data to capture data structure and boundary information more accurately. The experiment results compared with the ε-SVR, ε-TSVR, and TPISVR indicate that our PINSVR not only obtains comparable regression performance, but also obtains better bound estimations.

Keywords: Support vector machine
[41] Jooyong Shim and Changha Hwang. Estimating small area mean with mixed and fixed effects support vector median regressions. Neurocomputing, 145:174 - 181, 2014.
Abstract Small area estimation has been extensively studied under linear mixed effects models. However, when the functional form of the relationship between the response and the covariates is not linear, it may lead to biased estimators of the small area parameters. In this paper, we relax the assumption of linear regression for the fixed part of the model and replace it by using the underlying concept of support vector quantile regression. This makes it possible to express the nonparametric small area estimation problem as mixed or fixed effects model regression. Through numerical studies we compare the efficiency of different models in estimating small area mean.

Keywords: Fixed effect
[42] M. Braun, T. Bernard, O. Piller, and F. Sedehizade. 24-hours demand forecasting based on SARIMA and support vector machines. Procedia Engineering, 89:926 - 933, 2014. 16th Water Distribution System Analysis Conference, WDSA2014: Urban Water Hydroinformatics and Strategic Planning.
Abstract In time series analysis, autoregressive integrated moving average (ARIMA) models have been used for decades and in a wide variety of scientific applications. In recent years, the growing popularity of machine learning algorithms like the artificial neural network (ANN) and support vector machine (SVM) has led to new approaches in time series analysis. The forecasting model presented in this paper combines an autoregressive approach with a regression model respecting additional parameters. Two modelling approaches are presented which are based on seasonal autoregressive integrated moving average (SARIMA) models and support vector regression (SVR). These models are evaluated on data from a residential district in Berlin.

Keywords: SARIMA
[43] Mathieu Wauters and Mario Vanhoucke. Support vector machine regression for project control forecasting. Automation in Construction, 47:92 - 106, 2014.
Abstract Support Vector Machines are methods that stem from Artificial Intelligence and attempt to learn the relation between data inputs and one or multiple output values. However, the application of these methods has barely been explored in a project control context. In this paper, a forecasting analysis is presented that compares the proposed Support Vector Regression model with the best performing Earned Value and Earned Schedule methods. The parameters of the SVM are tuned using a cross-validation and grid search procedure, after which a large computational experiment is conducted. The results show that the Support Vector Machine Regression outperforms the currently available forecasting methods. Additionally, a robustness experiment has been set up to investigate the performance of the proposed method when the discrepancy between training and test set becomes larger.

Keywords: Project management
[44] Yongqiao Wang and He Ni. Multivariate convex support vector regression with semidefinite programming. Knowledge-Based Systems, 30:87 - 94, 2012.
As an important nonparametric regression method, support vector regression can achieve nonlinear capability through the kernel trick. This paper discusses multivariate support vector regression when its regression function is restricted to be convex. This paper approximates this convex shape restriction with a series of linear matrix inequality constraints and transforms its training into a semidefinite programming problem, which is computationally tractable. Extensions to the multivariate concave case, ℓ2-norm regularization, and ℓ1- and ℓ2-norm loss functions are also studied in this paper. Experimental results on both toy data sets and a real data set clearly show that, by exploiting this prior shape knowledge, this method can achieve better performance than the classical support vector regression.

Keywords: Support vector regression
[45] A. Reşit Kavsaoğlu, Kemal Polat, and M. Hariharan. Non-invasive prediction of hemoglobin level using machine learning techniques with the PPG signal's characteristic features. Applied Soft Computing, 2015.
Abstract Hemoglobin is normally measured by analyzing a blood sample taken from the body, and this measurement is termed invasive. Hemoglobin must be measured continuously to control the disease and its progression in people who undergo hemodialysis and have diseases such as oligocythemia and anemia. This gives these people a perpetual feeling of pain. This paper proposes a non-invasive method for the prediction of hemoglobin using the characteristic features of PPG signals and different machine learning algorithms. In this work, PPG signals from 33 people were included in 10 periods and 40 characteristic features were extracted from them. In addition to these features, gender information (male or female), height (in cm), weight (in kg) and age of each subject were also considered as features. Blood count and hemoglobin level were measured simultaneously by using the Hemocue Hb-201™ device. The following machine learning regression techniques were used: classification and regression trees – CART, least squares regression – LSR, generalized linear regression – GLR, multivariate linear regression – MVLR, partial least squares regression – PLSR, generalized regression neural network – GRNN, multilayer perceptron – MLP, and support vector regression – SVR. RELIEFF feature selection (RFS) and correlation-based feature selection (CFS) were used to select the best features. Original features and selected features using RFS (10 features) and CFS (11 features) were used to predict the hemoglobin level using the different machine learning techniques. To evaluate the performance of the machine learning techniques, different performance measures such as mean absolute error – MAE, mean square error – MSE, R² (coefficient of determination), root mean square error – RMSE, Mean Absolute Percentage Error (MAPE) and Index of Agreement – IA were used. Promising results were obtained (MSE-0.0027) using the features selected by RFS and SVR. Hence, the proposed method may be used clinically to predict the hemoglobin level of human beings without taking and analyzing blood samples.

Keywords: Photoplethysmography (PPG)
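The performance measures listed in the abstract above can be sketched in a few lines. This is a minimal illustration using the common textbook definitions; the function name `regression_metrics` and the toy values are hypothetical, and the Index of Agreement is assumed here to be Willmott's index, which the abstract does not confirm.

```python
import math


def regression_metrics(observed, predicted):
    """Return common regression error measures for paired value lists."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when an observed value is zero.
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot

    # Assumption: Willmott's index of agreement.
    ia_den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                 for o, p in zip(observed, predicted))
    ia = 1.0 - ss_res / ia_den

    return {"MAE": mae, "MSE": mse, "RMSE": rmse,
            "MAPE": mape, "R2": r2, "IA": ia}


# Toy values (made up for illustration, not hemoglobin data from the paper).
metrics = regression_metrics([10.0, 12.0, 14.0], [11.0, 12.0, 13.0])
```

For these toy values the residuals are -1, 0 and 1, so MAE and MSE both equal 2/3 and R² equals 0.75.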
[46] Francisco M. Ortuño, Olga Valenzuela, Beatriz Prieto, Maria Jose Saez-Lara, Carolina Torres, Hector Pomares, and Ignacio Rojas. Comparing different machine learning and mathematical regression models to evaluate multiple sequence alignments. Neurocomputing, 164:123 - 136, 2015.
Abstract The evaluation of multiple sequence alignments (MSAs) is still an open task in bioinformatics. Current MSA scores do not agree about how alignments must be accurately evaluated. Consequently, it is not trivial to know the quality of MSAs when reference alignments are not provided. Recent scores tend to use more complex evaluations adding supplementary biological features. In this work, a set of novel regression approaches is proposed for MSA evaluation, comparing several supervised learning and mathematical methodologies. Therefore, the following models specifically designed for regression are applied: regression trees, a bootstrap aggregation of regression trees (bagging trees), least-squares support vector machines (LS-SVMs) and Gaussian processes. These algorithms consider a heterogeneous set of biological features together with other standard MSA scores in order to predict the quality of alignments. The most relevant features are then applied to build novel score schemes for the evaluation of alignments. The proposed algorithms are validated by using the BAliBASE benchmark. Additionally, a statistical ANOVA test is performed to study the relevance of these scores considering three alignment factors. According to the obtained results, the four regression models provide accurate evaluations, even outperforming other standard scores such as BLOSUM, PAM or STRIKE.

Keywords: Multiple sequence alignments (MSAs)
[47] Neophytos Stylianou, Artur Akbarov, Evangelos Kontopantelis, Iain Buchan, and Ken W. Dunn. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches. Burns, 41(5):925 - 934, 2015.
Abstract Introduction: Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. Methods: An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. Results: All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. Discussion: The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts.

Keywords: Machine learning
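Several of the evaluation measures named in the abstract above follow directly from a 2×2 confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the counts are hypothetical and are not taken from the study. The area under the ROC curve is not covered, since it requires ranked prediction scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute threshold-based measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    youden = sensitivity + specificity - 1.0  # Youden's index J
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "youden": youden}


# Made-up counts for illustration only.
result = binary_metrics(tp=40, fp=10, tn=90, fn=10)
```

With these counts, sensitivity is 0.8, specificity is 0.9, positive predictive value is 0.8, and Youden's index is approximately 0.7.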
[48] Yong-Ping Zhao, Jian-Guo Sun, and Xian-Quan Zou. Reducing samples for accelerating multikernel semiparametric support vector regression. Expert Systems with Applications, 37(6):4519 - 4525, 2010.
In this paper, the reducing samples strategy instead of classical ν-support vector regression (ν-SVR), viz. single kernel ν-SVR, is utilized to select training samples for admissible functions so as to curtail the computational complexity. The proposed multikernel learning algorithm, namely reducing samples based multikernel semiparametric support vector regression (RS-MSSVR), has an advantage over the single kernel support vector regression (classical ε-SVR) in regression accuracy. Meanwhile, in comparison with multikernel semiparametric support vector regression (MSSVR), the algorithm is also favorable for computational complexity with comparable generalization performance. Finally, the efficacy and feasibility of RS-MSSVR are corroborated by experiments on synthetic and real-world benchmark data sets.

Keywords: Support vector regression
[49] Marcin Orchel. Support vector regression based on data shifting. Neurocomputing, 96:2 - 11, 2012. Adaptive and Natural Computing Algorithms.
In this article, we provide some preliminary theoretical analysis and extended practical experiments of a recently proposed regression method, which is based on representing regression problems as classification ones with duplicated and shifted data. The main results concern partial equivalency of Bayes solutions for regression problems and the transformed classification ones, and improved Vapnik–Chervonenkis bounds for the proposed method compared to Support Vector Machines. We conducted experiments comparing the proposed method with ε-insensitive Support Vector Regression (ε-SVR) on various synthetic and real-world data sets. The results indicate that the new method can achieve generalization performance comparable to ε-SVR with a significantly improved number of support vectors.

Keywords: Support vector machines
[50] Michel Ballings, Dirk Van den Poel, Nathalie Hespeels, and Ruben Gryp. Evaluating multiple classifiers for stock price direction prediction. Expert Systems with Applications, 42(20):7046 - 7056, 2015. [ bib | DOI | http ]
Abstract Stock price direction prediction is an important issue in the financial world. Even small improvements in predictive performance can be very profitable. The purpose of this paper is to benchmark ensemble methods (Random Forest, AdaBoost and Kernel Factory) against single classifier models (Neural Networks, Logistic Regression, Support Vector Machines and K-Nearest Neighbor). We gathered data from 5767 publicly listed European companies and used the area under the receiver operating characteristic curve (AUC) as a performance measure. Our predictions are one year ahead. The results indicate that Random Forest is the top algorithm followed by Support Vector Machines, Kernel Factory, AdaBoost, Neural Networks, K-Nearest Neighbors and Logistic Regression. This study contributes to literature in that it is, to the best of our knowledge, the first to make such an extensive benchmark. The results clearly suggest that novel studies in the domain of stock price direction prediction should include ensembles in their sets of algorithms. Our extensive literature review evidently indicates that this is currently not the case.

Keywords: Ensemble methods
[51] L.G. Sun, C.C. de Visser, Q.P. Chu, and J.A. Mulder. A novel online adaptive kernel method with kernel centers determined by a support vector regression approach. Neurocomputing, 124:111 - 119, 2014. [ bib | DOI | http ]
Abstract The optimality of the kernel number and kernel centers plays a significant role in determining the approximation power of nearly all kernel methods. However, the process of choosing optimal kernels is always formulated as a global optimization task, which is hard to accomplish. Recently, an improved algorithm called recursive reduced least squares support vector regression (IRR-LSSVR) was proposed for establishing a global nonparametric offline model. IRR-LSSVR demonstrates a significant advantage in choosing representing support vectors compared with others. Inspired by the IRR-LSSVR, a new online adaptive parametric kernel method called Weights Varying Least Squares Support Vector Regression (WV-LSSVR) is proposed in this paper using the same type of kernels and the same centers as those used in the IRR-LSSVR. Furthermore, inspired by the multikernel semiparametric support vector regression, the effect of the kernel extension is investigated in a recursive regression framework, and a recursive kernel method called Gaussian Process Kernel Least Squares Support Vector Regression (GPK-LSSVR) is proposed using a compound kernel type which is recommended for Gaussian process regression. Numerical experiments on benchmark data sets confirm the validity and effectiveness of the presented algorithms. The WV-LSSVR algorithm shows higher approximation accuracy than the recursive parametric kernel method using the centers calculated by the k-means clustering approach. The extended recursive kernel method (i.e. GPK-LSSVR) has not shown any advantage in terms of global approximation accuracy when validating the test data set without real-time updates, but it can increase modeling accuracy if real-time identification is involved.

Keywords: Support vector machine
[52] Jinjiang Wang, Peng Wang, and Robert X. Gao. Enhanced particle filter for tool wear prediction. Journal of Manufacturing Systems, 36:35 - 45, 2015. [ bib | DOI | http ]
Abstract Timely assessment and prediction of tool wear is essential to ensuring part quality, minimizing material waste, and contributing to sustainable manufacturing. This paper presents a probabilistic method based on particle filtering to account for uncertainties in the tool wear process. Tool wear state is predicted by recursively updating a physics-based tool wear rate model with online measurement, following a Bayesian inference scheme. For long term prediction where online measurement is not available, regression analysis methods such as autoregressive model and support vector regression are investigated by incorporating predicted measurement into particle filter. The effectiveness of the developed method is demonstrated using experiments performed on a CNC milling machine.

[53] Yan Zhao and Qingshan Liu. Generalized recurrent neural network for ϵ-insensitive support vector regression. Mathematics and Computers in Simulation, 86:2 - 9, 2012. The Seventh International Symposium on Neural Networks + The Conference on Modelling and Optimization of Structures, Processes and Systems. [ bib | DOI | http ]
In this paper, a generalized recurrent neural network is proposed for solving ϵ-insensitive support vector regression (ϵ-ISVR). The ϵ-ISVR is first formulated as a convex non-smooth programming problem, and then a generalized recurrent neural network with lower model complexity is designed for training the support vector machine. Furthermore, simulation results are given to demonstrate the effectiveness and performance of the proposed neural network.

Keywords: Non-smooth optimization
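For orientation, the convex non-smooth problem mentioned in the abstract of [53] can be illustrated with the standard ϵ-insensitive objective. The following is a minimal sketch, assuming a linear model and the textbook form of the objective; the function names and the parameters `C` and `eps` are illustrative and are not taken from the paper, whose exact formulation is not given in the abstract.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Return max(0, |y_true - y_pred| - eps): zero inside the tube, linear outside."""
    return max(0.0, abs(y_true - y_pred) - eps)


def svr_objective(w, b, xs, ys, C=1.0, eps=0.1):
    """Textbook objective 0.5*||w||^2 + C * sum of eps-insensitive losses
    for a linear model f(x) = w.x + b (an assumed, illustrative form)."""
    regulariser = 0.5 * sum(wi * wi for wi in w)
    total_loss = 0.0
    for x, y in zip(xs, ys):
        prediction = sum(wi * xi for wi, xi in zip(w, x)) + b
        total_loss += eps_insensitive_loss(y, prediction, eps)
    return regulariser + C * total_loss
```

The `max` term is what makes the objective non-smooth: it has a kink wherever a residual sits exactly on the edge of the ϵ-tube.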
[54] Irwanda Laory, Thanh N. Trinh, Ian F.C. Smith, and James M.W. Brownjohn. Methodologies for predicting natural frequency variation of a suspension bridge. Engineering Structures, 80:211 - 221, 2014. [ bib | DOI | http ]
Abstract In vibration-based structural health monitoring, changes in the natural frequency of a structure are used to identify changes in the structural conditions due to damage and deterioration. However, natural frequency values also vary with changes in environmental factors such as temperature and wind. Therefore, it is important to differentiate between the effects due to environmental variations and those resulting from structural damage. In this paper, this task is accomplished by predicting the natural frequency of a structure using measurements of environmental conditions. Five methodologies – multiple linear regression, artificial neural networks, support vector regression, regression tree and random forest – are implemented to predict the natural frequencies of the Tamar Suspension Bridge (UK) using measurements taken from 3 years of continuous monitoring. The effects of environmental factors and traffic loading on natural frequencies are also evaluated by measuring the relative importance of input variables in regression analysis. Results show that support vector regression and random forest are the most suitable methods for predicting variations in natural frequencies. In addition, traffic loading and temperature are found to be two important parameters that need to be measured. Results show potential for application to continuously monitored structures that have complex relationships between natural frequencies and parameters such as loading and environmental factors.

Keywords: Environmental effect
[55] L. Zhu, M.S. Li, Q.H. Wu, and L. Jiang. Short-term natural gas demand prediction based on support vector regression with false neighbours filtered. Energy, 80:428 - 436, 2015. [ bib | DOI | http ]
Abstract This paper presents a novel approach, named the SVR (support vector regression) based SVRLP (support vector regression local predictor) with FNF-SVRLP (false neighbours filtered-support vector regression local predictor), to predict short-term natural gas demand. This method integrates the SVR algorithm with the reconstruction properties of a time series, and optimises the original local predictor by removing false neighbours. A unified model, named the SM (“Standard Model”), is presented to process the entire dataset. To further improve the predicted accuracy, an AM (“Advanced Model”) is proposed, and is based on specific customer behaviours during different days of the week. The AM contains seven individual models for the seven days of the week. The FNF-SVRLP based AM has been used to predict natural gas demand for the National Grid of the United Kingdom (UK). This model outperforms the SVRLP, the ARMA (autoregressive moving average) and the ANN (artificial neural network) methods when applied to real-world data obtained from National Grid and has been successfully applied to daily gas operations for National Grid.

Keywords: Short-term prediction
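The time-series reconstruction step that a local predictor such as the one in [55] relies on can be sketched as a delay embedding followed by a nearest-neighbour search. This is only an illustration under assumptions: the names `delay_embed` and `nearest_neighbours`, the embedding dimension `dim` and the lag `tau` are hypothetical, and the false-neighbour filter itself is not implemented because the abstract does not specify it.

```python
import math


def delay_embed(series, dim, tau=1):
    """Build lagged vectors [x(t-(dim-1)*tau), ..., x(t)] with target x(t+1)."""
    span = (dim - 1) * tau
    vectors, targets = [], []
    for t in range(span, len(series) - 1):
        vectors.append([series[t - k * tau] for k in range(dim - 1, -1, -1)])
        targets.append(series[t + 1])
    return vectors, targets


def nearest_neighbours(vectors, query, k):
    """Return indices of the k embedded vectors closest to `query` (Euclidean)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return order[:k]
```

A local predictor would then fit a regression model (for example an SVR) on the selected neighbours and their targets only, rather than on the whole history.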
[56] Ibrahim Berkan Aydilek and Ahmet Arslan. A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm. Information Sciences, 233:25 - 35, 2013. [ bib | DOI | http ]
Missing values in datasets should be extracted from the datasets or should be estimated before they are used for classification, association rules or clustering in the preprocessing stage of data mining. In this study, we utilize a fuzzy c-means clustering hybrid approach that combines support vector regression and a genetic algorithm. In this method, the fuzzy clustering parameters, cluster size and weighting factor are optimized and missing values are estimated. The proposed novel hybrid method yields sufficient and sensible imputation performance results. The results are compared with those of fuzzy c-means genetic algorithm imputation, support vector regression genetic algorithm imputation and zero imputation.

Keywords: Missing data
[57] Mahesh Pal, N.K. Singh, and N.K. Tiwari. Support vector regression based modeling of pier scour using field data. Engineering Applications of Artificial Intelligence, 24(5):911 - 916, 2011. [ bib | DOI | http ]
This paper investigates the potential of support vector machines based regression approach to model the local scour around bridge piers using field data. A dataset consisting of 232 pier scour measurements taken from BSDMS was used for this analysis. Results obtained by using radial basis function and polynomial kernel based support vector regression were compared with four empirical relations as well as with a backpropagation neural network and a generalized regression neural network. A total of 154 data were used for training different algorithms whereas the remaining 78 data were used to test the created model. A coefficient of determination value of 0.897 (root mean square error=0.356) was achieved by radial basis kernel based support vector regression in comparison to 0.880 and 0.835 (root mean square error=0.388 and 0.438) by the backpropagation neural network and the generalized regression neural network. Comparisons of results with four predictive equations suggest an improved performance by support vector regression. Results with dimensionless data using all three algorithms suggest a better performance by dimensional data with this dataset. Sensitivity analysis suggests the importance of depth of flow and pier width in predicting the scour depth when using support vector regression based modeling approach.

Keywords: Pier scour
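The two scores reported in [57] can be computed as in the sketch below. It assumes the common definition of the coefficient of determination, 1 - SS_res/SS_tot; the abstract does not say which definition the authors used (some studies report the squared correlation instead), and the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot
```

Both scores would be evaluated on the held-out test measurements, not on the training data.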
[58] Yingjie Tian, Xuchan Ju, Zhiquan Qi, and Yong Shi. Efficient sparse least squares support vector machines for pattern classification. Computers & Mathematics with Applications, 66(10):1935 - 1947, 2013. ICNC-FSKD 2012. [ bib | DOI | http ]
Abstract We propose a novel least squares support vector machine, named ε-least squares support vector machine (ε-LSSVM), for binary classification. By introducing the ε-insensitive loss function instead of the quadratic loss function into LSSVM, ε-LSSVM has several advantages compared with the plain LSSVM. (1) It has sparseness, which is controlled by the parameter ε. (2) By weighting different sparseness parameters ε for each class, the unbalanced problem can be solved successfully; furthermore, a useful choice of the parameter ε is proposed. (3) It is actually a kind of ε-support vector regression (ε-SVR); the only difference here is that it takes the binary classification problem as a special kind of regression problem. (4) Therefore it can be implemented efficiently by the sequential minimal optimization (SMO) method for large scale problems. Experimental results on several benchmark datasets show the effectiveness of our method in sparseness, balance performance and classification accuracy, and therefore confirm the above conclusion further.

Keywords: Least squares support vector machine
[59] Mohamad Hasan Bahari, Mitchell McLaren, Hugo Van hamme, and David A. van Leeuwen. Speaker age estimation using i-vectors. Engineering Applications of Artificial Intelligence, 34:99 - 108, 2014. [ bib | DOI | http ]
Abstract In this paper, a new approach for age estimation from speech signals based on i-vectors is proposed. In this method, each utterance is modeled by its corresponding i-vector. Then, a Within-Class Covariance Normalization technique is used for session variability compensation. Finally, a least squares support vector regression (LSSVR) is applied to estimate the age of speakers. The proposed method is trained and tested on telephone conversations of the National Institute of Standards and Technology (NIST) 2010 and 2008 speaker recognition evaluation databases. Evaluation results show that the proposed method yields significantly lower mean absolute error and higher Pearson correlation coefficient between chronological speaker age and estimated speaker age compared to different conventional schemes. The obtained relative improvements of mean absolute error and correlation coefficient compared to our best baseline system are around 5% and 2% respectively. Finally, the effect of some major factors influencing the proposed age estimation system, namely utterance length and spoken language, is analyzed.

Keywords: Speaker age estimation
[60] Zhao Yongping and Sun Jianguo. Fast online approximation for hard support vector regression and its application to analytical redundancy for aeroengines. Chinese Journal of Aeronautics, 23(2):145 - 152, 2010. [ bib | DOI | http ]
The hard support vector regression attracts little attention owing to the overfitting phenomenon. Recently, a fast offline method has been proposed to approximately train the hard support vector regression with generalization performance comparable to the soft support vector regression. Based on this achievement, this article advances a fast online approximation for the hard support vector regression (FOAHSVR for short). By adopting the greedy stagewise and iterative strategies, it is capable of estimating parameters of complicated systems online. In order to verify the effectiveness of the FOAHSVR, an FOAHSVR-based analytical redundancy for aeroengines is developed. Experiments on sensor failure and drift evidence the viability and feasibility of the analytical redundancy for aeroengines together with its base, FOAHSVR. In addition, the FOAHSVR is anticipated to find applications in other scientific-technical fields.

Keywords: Support vector machines
[61] Xinjun Peng and Yifei Wang. The robust and efficient adaptive normal direction support vector regression. Expert Systems with Applications, 38(4):2998 - 3008, 2011. [ bib | DOI | http ]
The recently proposed reduced convex hull support vector regression (RH-SVR) treats support vector regression (SVR) as a classification problem in the dual feature space by introducing an epsilon-tube. In this paper, an efficient and robust adaptive normal direction support vector regression (AND-SVR) is developed by combining the geometric algorithm for support vector machine (SVM) classification. AND-SVR finds a better shift direction for training samples based on the normal direction of output function in the feature space compared with RH-SVR. Numerical examples on several artificial and UCI benchmark datasets with comparisons show that the proposed AND-SVR derives good generalization performance.

Keywords: Support vector regression
[62] Chao Gao and Xiao jun Wu. Kernel support tensor regression. Procedia Engineering, 29:3986 - 3990, 2012. 2012 International Workshop on Information and Electronics Engineering. [ bib | DOI | http ]
Support vector machine (SVM) can be used not only for classification but also for regression problems, by the introduction of an alternative loss function. Most current regression algorithms take vectors as input, but in many real cases input samples are tensors; the support tensor machine (STM) by Cai and He is a typical learning machine for second order tensors. In this paper, we propose an algorithm named kernel support tensor regression (KSTR) using tensors as input for function regression. In this algorithm, after mapping each row of every original tensor or of every tensor converted from an original vector into a high dimensional space, we can get associated points in a new high dimensional feature space, and then compute the regression function. We compare the results of KSTR with the traditional SVR algorithm, and find that KSTR is more effective according to the analysis of the experimental results.

Keywords: Support Vector Machine (SVM)
[63] Xiaodan Yu, Zhiquan Qi, and Yuanmeng Zhao. Support vector regression for newspaper/magazine sales forecasting. Procedia Computer Science, 17:1055 - 1062, 2013. First International Conference on Information Technology and Quantitative Management. [ bib | DOI | http ]
Abstract Advances in information technologies have changed our lives in many ways. There is a trend that people look for news and stories on the internet. Under this circumstance, it is more urgent than ever for traditional media companies to predict sales of print media (i.e. newspapers/magazines). Previous approaches in newspapers/magazines’ sales forecasting are mainly focused on building regression models based on sample data sets. But such regression models can suffer from the over-fitting problem. Recent theoretical studies in statistics proposed a novel method, namely support vector regression (SVR), to overcome the over-fitting problem. In contrast to traditional regression models, the objective of SVR is to achieve the minimum structural risk rather than the minimum empirical risk. This study, therefore, applied support vector regression to the newspaper/magazines’ sales forecasting problem. The experiment showed that SVR is a superior method in this kind of task.

Keywords: Sales forecasting
[64] Zhenpeng He, Yigang Sun, Guichang Zhang, Zhenyu Hong, Weisong Xie, Xin Lu, and Junhong Zhang. Tribilogical performances of connecting rod and by using orthogonal experiment, regression method and response surface methodology. Applied Soft Computing, 29:436 - 449, 2015. [ bib | DOI | http ]
Abstract Dynamic lubrication analysis of connecting rod is a very complex problem. Some factors have great effect on lubrication, such as clearance, oil viscosity, oil supplying hole, bearing elastic modulus, surface roughness, oil supplying pressure, engine speed and bearing width. In this paper, ten indexes are used as the input parameters to evaluate the bearing performances: minimum oil film thickness (MOFT), friction loss, the maximum oil film pressure (MOFP) and average of the oil leakages (OLK). Two orthogonal experiments are combined to identify the factors dominating the bearing behavior. The stepwise regression is used to establish the regression model without insignificant variables, and the two most important variables are used as the input to carry out the surface response analysis for each model. At last, the support vector machine (SVM) is used to identify the asperity contact. Compared with the SVM model, the particle swarm optimization-support vector machine (PSO–SVM) can predict the asperity contact more precisely, especially for samples near the dividing line. In future work, more soft computing methods with statistical characteristics will be applied to tribology analyses.

Keywords: Connecting rod
[65] Eric Bastos Görgens, Alessandro Montaghi, and Luiz Carlos Estraviz Rodriguez. A performance comparison of machine learning methods to estimate the fast-growing forest plantation yield based on laser scanning metrics. Computers and Electronics in Agriculture, 116:221 - 227, 2015. [ bib | DOI | http ]
Abstract Machine learning models appear to be an attractive route towards tackling high-dimensional problems, particularly in areas where a lack of knowledge exists regarding the development of effective algorithms, and where programs must dynamically adapt to changing conditions. The objective of this study was to evaluate the performance of three machine learning tools for predicting stand volume of fast-growing forest plantations, based on statistical vegetation metrics extracted from an Airborne Laser Scanning (ALS) survey. The forests used in this study were composed of 1138 ha of commercial plantations that consisted of hybrids of Eucalyptus grandis and Eucalyptus urophylla, managed for pulp production. Three machine learning tools were implemented: neural network (NN), random forest (RF) and support vector regression (SV); and their performance was compared to a regression model (RM). The RF and the RM presented an RMSE in the leave-one-out cross-validation of 31.80 and 30.56 m³ ha⁻¹ respectively. The NN and SV presented a higher RMSE than the others, equal to 64.44 and 65.30 m³ ha⁻¹. The coefficient of determination and bias were similar for all modeling techniques. The ranking of ALS metrics based on their relative importance for the estimation of stand volume showed some differences. Rather than being limited to a subset of predictor variables, machine learning techniques explored the complete metrics set, looking for patterns between them and the dependent variable.

Keywords: Forest quantification
[66] Ping Liu, Jianmin Sun, Liying Han, and Bo Wang. Research on the construction of macro assets price index based on support vector machine. Procedia Computer Science, 29:1801 - 1815, 2014. 2014 International Conference on Computational Science. [ bib | DOI | http ]
Abstract In this paper, a new macro assets price index (MAPI) is constructed based on support vector machine. In fact, 12 indicators, which can represent the macro economy well both economically and statistically, are chosen to build our new index. Here, different from the traditional econometric method, a novel machine learning method, the support vector regression machine (SVR), is employed to produce the predictor of the consumer price index (CPI) in China. In addition, in the experiment part, we also compare the result of SVR with that of least square regression (LSR) and vector autoregressive (VAR) impulse response analysis. The comparison shows that the latter two methods struggle to satisfy the requirements both economically and statistically. On the contrary, SVR gives a good predictor of CPI and exhibits a manifest lead over CPI. In other words, our index can forecast the trends by 4 to 6 months, which is useful for investment and policy making.

Keywords: Macro assets price index
[67] Zibo Dong, Dazhi Yang, Thomas Reindl, and Wilfred M. Walsh. A novel hybrid approach based on self-organizing maps, support vector regression and particle swarm optimization to forecast solar irradiance. Energy, 82:570 - 577, 2015. [ bib | DOI | http ]
Abstract We forecast hourly solar irradiance time series using a novel hybrid model based on SOM (self-organizing maps), SVR (support vector regression) and PSO (particle swarm optimization). In order to solve the noise and stationarity problems in the statistical time series forecasting modelling process, SOM is applied to partition the whole input space into several disjointed regions with different characteristic information on the correlation between the input and the output. Then SVR is used to model each disjointed region to identify the characteristic correlation. In order to reduce the performance volatility of SVM (support vector machine) with different parameters, PSO is implemented to automatically perform the parameter selection in SVR modelling. This hybrid model has been used to forecast hourly solar irradiance in Colorado, USA, and Singapore. The technique is found to outperform traditional forecasting models.

Keywords: Hourly solar irradiance forecasting
[68] Pengfei Zhu and Qinghua Hu. Rule extraction from support vector machines based on consistent region covering reduction. Knowledge-Based Systems, 42:1 - 8, 2013. [ bib | DOI | http ]
Due to good performance in classification and regression, support vector machines have attracted much attention and become one of the most popular learning machines in the last decade. As a black box, the support vector machine is difficult for users to understand and explain. In many application domains including medical diagnosis or credit scoring, understandability and interpretability are very important for the practicability of the learned models. To improve the comprehensibility of SVMs, we propose a rule extraction technique from support vector machines via analyzing the distribution of samples. We define the consistent region of samples in terms of classification boundary, and form a consistent region covering of the sample space. Then a covering reduction algorithm is developed for extracting a compact representation of classes, thus a minimal set of decision rules is derived. Experiment analysis shows that the extracted models perform well in comparison with decision tree algorithms and other support vector machine rule extraction methods.

Keywords: Classification learning
[69] P. Yuvaraj, A. Ramachandra Murthy, Nagesh R. Iyer, S.K. Sekar, and Pijush Samui. Support vector regression based models to predict fracture characteristics of high strength and ultra high strength concrete beams. Engineering Fracture Mechanics, 98:29 - 43, 2013. [ bib | DOI | http ]
This paper examines the applicability of support vector machine (SVM) based regression to predict fracture characteristics and failure load (Pmax) of high strength and ultra high strength concrete beams. Characterization of mix and testing of beams of high strength and ultra high strength concrete have been described briefly. Methodologies for evaluation of fracture energy, critical stress intensity factor and critical crack tip opening displacement have been outlined. Support Vector Regression (SVR) is the extension of SVMs to solve regression and prediction problems. The main characteristics of SVR include minimizing the observed training error and attempting to minimize the generalized error bound so as to achieve generalized performance. Four Support Vector Regression (SVR) models have been developed using MATLAB software for training and prediction of fracture characteristics. It is observed that the predicted values from the SVR models are in good agreement with the experimental values.

Keywords: Support vector machine
[70] Allaeddine Djouama and Myoung-Seob Lim. Reduction of the feedback delay effect on a proportional fair scheduler in LTE downlink using nonlinear support vector machine prediction. AEU - International Journal of Electronics and Communications, pages -, 2015. [ bib | DOI | http ]
Abstract The scheduling of mobile users often relies on accurate feedback from the channel quality indicator (CQI). In this paper, we determine the strength of the effect of feedback delay on the scheduler in a Long-Term Evolution (LTE) system. We study this degradation under fairness constraints using a proportional fair scheduler. We propose a nonlinear support vector machine regression with a modified cost function in order to reduce the effect of feedback delay on the scheduler, which operates by predicting the CQI from previous feedback and using that for scheduling instead of the delayed feedback. The simulation results show important improvements in terms of throughput.

Keywords: Support vector machines
[71] Kadir Kavaklioglu. Support vector regression model based predictive control of water level of u-tube steam generators. Nuclear Engineering and Design, 278:651 - 660, 2014. [ bib | DOI | http ]
Abstract A predictive control algorithm using support vector regression based models was proposed for controlling the water level of U-tube steam generators of pressurized water reactors. Steam generator data were obtained using a transfer function model of U-tube steam generators. Support vector regression based models were built using a time series type model structure for five different operating powers. Feedwater flow controls were calculated by minimizing a cost function that includes the level error, the feedwater change and the mismatch between feedwater and steam flow rates. The proposed algorithm was applied to a scenario consisting of a level setpoint change and a steam flow disturbance. The results showed that steam generator level can be controlled at all powers effectively by the proposed method.
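
The three-term cost function described in the abstract of [71] might look like the sketch below. It assumes quadratic penalties and scalar weights; the function name `control_cost`, the weights `w_level`, `w_change` and `w_mismatch`, and the quadratic form are all illustrative assumptions, since the abstract names the terms but not their exact form.

```python
def control_cost(levels, setpoint, feedwater, prev_feedwater, steam,
                 w_level=1.0, w_change=1.0, w_mismatch=1.0):
    """Sum over the prediction horizon of three assumed quadratic penalties:
    level error, change in feedwater flow, and feedwater/steam flow mismatch."""
    cost = 0.0
    last_flow = prev_feedwater
    for level, flow, steam_flow in zip(levels, feedwater, steam):
        cost += w_level * (level - setpoint) ** 2        # level error
        cost += w_change * (flow - last_flow) ** 2       # feedwater change
        cost += w_mismatch * (flow - steam_flow) ** 2    # flow mismatch
        last_flow = flow
    return cost
```

In a predictive controller of this kind, `levels` would come from the regression model's predictions for a candidate feedwater sequence, and an optimiser would search for the sequence with the lowest cost.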

[72] Raghuram Karthik Desu, Sharath Chandra Guntuku, Aditya B, and Amit Kumar Gupta. Support vector regression based flow stress prediction in austenitic stainless steel 304. Procedia Materials Science, 6:368 - 375, 2014. 3rd International Conference on Materials Processing and Characterisation (ICMPC 2014). [ bib | DOI | http ]
Abstract This paper focuses on modelling the relationship between flow stress and strain, strain rate and temperature using the Support Vector Regression technique. Data obtained for both regions (non-Dynamic Strain Aging and Dynamic Strain Aging) are analysed using Support Vector Machine, where a nonlinear model is learned by a linear learning machine by mapping the data into a high-dimensional kernel-induced feature space. A number of semi-empirical models based on mathematical relationships and Artificial Intelligence techniques were reported by researchers to predict the flow stress during deformation. This work attempts to show the prowess of Support Vector Regression based modelling applied to flow stress prediction, delineating the flexibility that the user is presented with while modelling the problem. The model is successfully trained based on the training data and employed to predict the flow stress values for the testing data, which were compared with the experimental values. It was found that the correlation coefficient between the predicted and experimental data is 0.9978 for the non-Dynamic Strain Aging regime and 0.9989 for the Dynamic Strain Aging regime, showcasing the excellent predictability of this model when compared with other models that are prominently used for flow stress prediction. The model is trained at different values of the insensitive loss function of the Support Vector Regression to showcase the unique features of this technique. The results produced encourage researchers to explore this Artificial Intelligence technique for data modelling.

Keywords: Austenitic Stainless Steel
[73] Samik Dutta, Surjya K. Pal, and Ranjan Sen. On-machine tool prediction of flank wear from machined surface images using texture analyses and support vector regression. Precision Engineering, pages -, 2015. [ bib | DOI | http ]
Abstract In this paper, a method for on-machine tool condition monitoring by processing the turned surface images has been proposed. Progressive monitoring of cutting tool condition is indispensable for maintaining product quality. Thus, image texture analyses using gray level co-occurrence matrix, Voronoi tessellation and discrete wavelet transform based methods have been applied on turned surface images for extracting eight useful features to describe progressive tool flank wear. Prediction of cutting tool flank wear has also been performed using these eight features as predictors by utilizing a linear support vector machine based regression technique, with a maximum prediction error of 4.9%.

Keywords: Tool flank wear prediction
[74] Hong wei ZHANG, Zhi qiang GE, Xiao feng YUAN, Zhi huan SONG, and Ling jian YE. Rapid vision-based system for secondary copper content estimation. Transactions of Nonferrous Metals Society of China, 24(8):2665 - 2676, 2014. [ bib | DOI | http ]
Abstract A vision-based color analysis system was developed for rapid estimation of copper content in the secondary copper smelting process. Firstly, cross section images of secondary copper samples were captured by the designed vision system. After the preprocessing and segmenting procedures, the images were selected according to their grayscale standard deviations of pixels and percentages of edge pixels in the luminance component. The selected images were then used to extract the information of the improved color vector angles, from which the copper content estimation model was developed based on the least squares support vector regression (LSSVR) method. For comparison, three additional LSSVR models, namely, only with sample selection, only with improved color vector angle, without sample selection or improved color vector angle, were developed. In addition, two exponential models, namely, with sample selection, without sample selection, were developed. Experimental results indicate that the proposed method is more effective for improving the copper content estimation accuracy, particularly when the sample size is small.

Keywords: secondary copper
[75] Fazil Kaytez, M. Cengiz Taplamacioglu, Ertugrul Cam, and Firat Hardalac. Forecasting electricity consumption: A comparison of regression analysis, neural networks and least squares support vector machines. International Journal of Electrical Power & Energy Systems, 67:431 - 438, 2015. [ bib | DOI | http ]
Abstract Accurate electricity consumption forecasting is of primary importance in the energy planning of developing countries. During the last decade, several new techniques have been used in electricity consumption planning to accurately predict future electricity consumption needs. Support vector machines (SVMs) and least squares support vector machines (LS-SVMs) are new techniques being adopted for energy consumption forecasting. In this study, the LS-SVM is implemented for the prediction of the electricity energy consumption of Turkey. In addition, traditional regression analysis and artificial neural networks (ANNs) are considered. In the models, gross electricity generation, installed capacity, total subscribership and population are used as independent variables, using historical data from 1970 to 2009. Forecasting results are compared with each other using diverse performance criteria. Receiver operating characteristic (ROC) analysis is carried out to determine the specificity and sensitivity of the empirical results. The results indicate that the proposed LS-SVM model is an accurate and quick prediction method.

Keywords: Electricity consumption forecasting
[76] A. Srinivasan, P. Venkatesh, B. Dineshkumar, and N. Ramkumar. Dynamic available transfer capability determination in power system restructuring environment using support vector regression. International Journal of Electrical Power & Energy Systems, 69:123 - 130, 2015. [ bib | DOI | http ]
Abstract This paper presents dynamic available transfer capability (DATC) determination in power system restructuring environment using support vector regression (SVR). Dynamic available transfer capability is first determined based on the conventional method of potential energy boundary surface transient energy function. Simulations were carried out on a WSCC 3-machine 9-bus system and a Practical South Indian Grid test system by considering load increases as the contingency. The data collected from the conventional method is then used as an input training sample to the SVR in determining DATC. To reduce training time and improve accuracy of the SVR, the kernel function type and kernel parameter are considered. The performance of the proposed SVR-based method is validated by comparison with the multilayer perceptron neural network (MLPNN). Studies show that the SVR gives faster and more accurate results for DATC determination compared with MLPNN.

Keywords: Dynamic available transfer capability
[77] Pierre M.L. Drezet and Robert F. Harrison. A new method for sparsity control in support vector classification and regression. Pattern Recognition, 34(1):111 - 125, 2001. [ bib | DOI | http ]
A new method of implementing Support Vector learning algorithms for classification and regression is presented which deals with problems of over-defined solutions and excessive complexity. Classification problems are solved with a minimum number of support vectors, irrespective of the degree of overlap in the training data. Support vector regression can deliver a sparse solution, without requiring Vapnik's ε-insensitive zone. This paper generalises sparsity control for both support vector classification and regression. The novelty in this work is in the method of achieving a sparse support vector set which forms a minimal basis for the prediction function.

Keywords: Support vector machines
[78] Kennedy Were, Dieu Tien Bui, Øystein B. Dick, and Bal Ram Singh. A comparative assessment of support vector regression, artificial neural networks, and random forests for predicting and mapping soil organic carbon stocks across an afromontane landscape. Ecological Indicators, 52:394 - 403, 2015. [ bib | DOI | http ]
Abstract Soil organic carbon (SOC) is a key indicator of ecosystem health, with a great potential to affect climate change. This study aimed to develop, evaluate, and compare the performance of support vector regression (SVR), artificial neural network (ANN), and random forest (RF) models in predicting and mapping SOC stocks in the Eastern Mau Forest Reserve, Kenya. Auxiliary data, including soil sampling, climatic, topographic, and remotely-sensed data were used for model calibration. The calibrated models were applied to create prediction maps of SOC stocks that were validated using independent testing data. The results showed that the models overestimated SOC stocks. Random forest model with a mean error (ME) of −6.5 Mg C ha−1 had the highest tendency for overestimation, while SVR model with an ME of −4.4 Mg C ha−1 had the lowest tendency. Support vector regression model also had the lowest root mean squared error (RMSE) and the highest R2 values (14.9 Mg C ha−1 and 0.6, respectively); hence, it was the best method to predict SOC stocks. Artificial neural network predictions followed closely with RMSE, ME, and R2 values of 15.5, −4.7, and 0.6, respectively. The three prediction maps broadly depicted similar spatial patterns of SOC stocks, with an increasing gradient of SOC stocks from east to west. The highest stocks were on the forest-dominated western and north-western parts, while the lowest stocks were on the cropland-dominated eastern part. The most important variable for explaining the observed spatial patterns of SOC stocks was total nitrogen concentration. Based on the close performance of SVR and ANN models, we proposed that both models should be calibrated, and then the best result applied for spatial prediction of target soil properties in other contexts.

Keywords: Random forests
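Note: the preceding abstract compares models by mean error (ME), root mean squared error (RMSE) and R2, and reads a negative ME as overestimation. The sketch below shows one way these metrics could be computed. It assumes ME is defined as observed minus predicted (which is consistent with negative values indicating overestimation) and that R2 is the coefficient of determination; the paper may define either differently. The data values are hypothetical.

```python
import math


def mean_error(observed, predicted):
    # Assumed sign convention: observed minus predicted,
    # so a model that overestimates yields a negative ME.
    return sum(o - p for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    # Root of the mean squared residual.
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical stock values (illustrative only); every prediction is too high.
observed = [40.0, 60.0, 80.0, 100.0]
predicted = [46.0, 62.0, 86.0, 102.0]

print(mean_error(observed, predicted))  # negative, because the model overestimates
print(rmse(observed, predicted))
print(r_squared(observed, predicted))
```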
[79] Abdulrahman Alenezi, Scott A. Moses, and Theodore B. Trafalis. Real-time prediction of order flowtimes using support vector regression. Computers & Operations Research, 35(11):3489 - 3503, 2008. Part Special Issue: Topics in Real-time Supply Chain Management. [ bib | DOI | http ]
In a make-to-order production system, a due date must be assigned to new orders that arrive dynamically, which requires predicting the order flowtime in real-time. This study develops a support vector regression model for real-time flowtime prediction in multi-resource, multi-product systems. Several combinations of kernel and loss functions are examined, and results indicate that the linear kernel and the ε-insensitive loss function yield the best generalization performance. The prediction error of the support vector regression model for three different multi-resource systems of varying complexity is compared to that of classic time series models (exponential smoothing and moving average) and to a feedforward artificial neural network. Results show that the support vector regression model has lower flowtime prediction error and is more robust. More accurately predicting flowtime using support vector regression will improve due-date performance and reduce expenses in make-to-order production environments.

Keywords: Due-date assignment
[80] Fu-Kwun Wang and Timon Du. Implementing support vector regression with differential evolution to forecast motherboard shipments. Expert Systems with Applications, 41(8):3850 - 3855, 2014. [ bib | DOI | http ]
Abstract In this study, we investigate the forecasting accuracy of motherboard shipments from Taiwan manufacturers. A generalized Bass diffusion model with external variables can provide better forecasting performance. We present a hybrid particle swarm optimization (HPSO) algorithm to improve the parameter estimates of the generalized Bass diffusion model. A support vector regression (SVR) model was recently used successfully to solve forecasting problems. We propose an SVR model with a differential evolution (DE) algorithm to improve forecasting accuracy. We compare our proposed model with the Bass diffusion and generalized Bass diffusion models. The SVR model with a DE algorithm outperforms the other models on both model fit and forecasting accuracy.

Keywords: Generalized Bass diffusion model
[81] V. Ceperic, G. Gielen, and A. Baric. Recurrent sparse support vector regression machines trained by active learning in the time-domain. Expert Systems with Applications, 39(12):10933 - 10942, 2012. [ bib | DOI | http ]
A method for the sparse solution of recurrent support vector regression machines is presented. The proposed method achieves a high accuracy versus complexity and allows the user to adjust the complexity of the resulting model. The sparse representation is guaranteed by limiting the number of training data points for the support vector regression method. Each training data point is selected based on the accuracy of the fully recurrent model using the active learning principle applied to the successive time-domain data. The user can adjust the training time by selecting how often the hyper-parameters of the algorithm should be optimised. The advantages of the proposed method are illustrated on several examples, and the experiments clearly show that it is possible to reduce the number of support vectors and to significantly improve the accuracy versus complexity of recurrent support vector regression machines.

Keywords: Support vector machines
[82] Lu Han, Liyan Han, and Hongwei Zhao. Orthogonal support vector machine for credit scoring. Engineering Applications of Artificial Intelligence, 26(2):848 - 862, 2013. [ bib | DOI | http ]
The most commonly used technique for credit scoring is logistic regression, and more recent research has proposed that the support vector machine is a more effective method. However, both logistic regression and the support vector machine suffer from the curse of dimensionality. In this paper, we introduce a new way to address this problem, which is defined as orthogonal dimension reduction. We discuss the related properties of this method in detail and test it against other common statistical approaches (principal component analysis and hybridizing logistic regression) to better solve and evaluate the data. In experiments on the German data set, there is also an interesting phenomenon with respect to the use of the support vector machine, which we define as ‘Dimensional interference’ and discuss in general. Based on the results of cross-validation, it can be found that, through the use of logistic regression to filter the dummy variables and orthogonal feature extraction, the support vector machine not only reduces complexity and accelerates convergence, but also achieves better performance.

Keywords: Dimension curse
[83] Qi Wu. The complex fuzzy system forecasting model based on triangular fuzzy robust wavelet ν-support vector machine. Expert Systems with Applications, 38(12):14478 - 14489, 2011. [ bib | DOI | http ]
This paper presents a new version of fuzzy wavelet support vector regression machine to forecast the nonlinear fuzzy system with multi-dimensional input variables. The input and output variables of the proposed model are described as triangular fuzzy numbers. Then, by integrating the triangular fuzzy theory, wavelet analysis theory and ν-support vector regression machine, and by designing a polynomial slack variable, the triangular fuzzy robust wavelet ν-support vector regression machine (TFRWν-SVM) is proposed. Particle swarm optimization (PSO) is applied to seek the optimal parameters of TFRWν-SVM. A forecasting method based on TFRWν-SVRM and PSO is put forward. The results of the application in sale system forecasts confirm the feasibility and the validity of the forecasting method. Compared with the traditional model, the TFRWν-SVM method requires fewer samples and has better forecasting precision.

Keywords: Fuzzy ν-support vector machine
[84] Shanshan Qiu, Liping Gao, and Jun Wang. Classification and regression of ELM, LVQ and SVM for e-nose data of strawberry juice. Journal of Food Engineering, 144:77 - 85, 2015. [ bib | DOI | http ]
Abstract An electronic nose (E-nose) has been used to characterize five types of strawberry juices based on different processing approaches (i.e., Microwave Pasteurization, Steam Blanching, High Temperature Short Time Pasteurization, Frozen–Thawed, and Freshly Squeezed). Juice quality parameters (vitamin C and total acid) were detected by traditional measuring methods. Multivariate statistical methods (Principal Component Analysis, Linear Discriminant Analysis, Multiple Linear Regression, and Partial Least Squares Regression) and neural networks (Extreme Learning Machine (ELM), Learning Vector Quantization and Library Support Vector Machines) were employed for qualitative classification and quantitative regression. ELM showed the best performance on classification and regression, indicating that ELM would be a good choice for E-nose data treatment. Results provide promising principles for the elaboration of E-nose which could be used to discriminate processed juices and to predict juice quality parameters based on appropriate algorithms for the beverage industry.

Keywords: Electronic nose
[85] Shanshan Chen, Fangfang Zhang, Jifeng Ning, Xu Liu, Zhenwen Zhang, and Shuqin Yang. Predicting the anthocyanin content of wine grapes by NIR hyperspectral imaging. Food Chemistry, 172:788 - 793, 2015. [ bib | DOI | http ]
Abstract The aim of this study was to demonstrate the capability of hyperspectral imaging in predicting anthocyanin content changes in wine grapes during ripening. One hundred twenty groups of Cabernet Sauvignon grapes were collected periodically after veraison. The hyperspectral images were recorded by a hyperspectral imaging system with a spectral range from 900 to 1700 nm. The anthocyanin content was measured by the pH differential method. A quantitative model was developed using partial least squares regression (PLSR) or support vector regression (SVR) for calculating the anthocyanin content. The best model was obtained using SVR, yielding a coefficient of validation (P-R2) of 0.9414 and a root mean square error of prediction (RMSEP) of 0.0046, better than the PLSR model, which had a P-R2 of 0.8407 and an RMSEP of 0.0129. Therefore, hyperspectral imaging can be a fast and non-destructive method for predicting the anthocyanin content of wine grapes during ripening.

Keywords: Wine grapes
[86] Shien-Tsung Chen, Pao-Shan Yu, and Bin-Wu Liu. Comparison of neural network architectures and inputs for radar rainfall adjustment for typhoon events. Journal of Hydrology, 405(1–2):150 - 160, 2011. [ bib | DOI | http ]
Summary This work presents a radar rainfall adjustment approach that uses two neural network architectures, support vector regression and the radial basis function neural network. The proposed approach can increase the accuracy of radar rainfall estimates that are underestimated, especially in mountainous regions. Hourly rainfall data observed at 126 raingauges in typhoon events provide the ground-truth information for adjusting radar rainfall estimates. Various inputs to the adjustment model are variable combinations of the radar rainfall, the coordinates, the elevation and the distance to the radar station. Simulation results and their intercomparison indicate that including additional topographic variables in the input vector can enhance the model performance. Validation results pertaining to three typhoon events further demonstrate that the adjustment models can reduce radar rainfall errors. Moreover, the support vector regression outperforms the radial basis function neural network in terms of radar rainfall adjustment. The spatial rainfall distribution of adjusted radar rainfall is also presented, as well as the model calibration and validation by two sets of gauges to show the generality of the method.

Keywords: Radar rainfall adjustment
[87] Jiayi Li, Hongyan Zhang, and Liangpei Zhang. Column-generation kernel nonlocal joint collaborative representation for hyperspectral image classification. ISPRS Journal of Photogrammetry and Remote Sensing, 94:25 - 36, 2014. [ bib | DOI | http ]
Abstract We propose a kernel nonlocal joint collaborative representation classification method based on column generation for hyperspectral imagery. The proposed approach first maps the original spectral space to a higher implicit kernel space by directly taking the similarity measures between spectral pixels as a feature, and then utilizes a nonlocal joint collaborative regression model for kernel signal reconstruction and the subsequent pixel classification. We also develop two kinds of specific radial basis function kernels for measuring the similarities. The experimental results indicate that the proposed algorithms obtain a competitive performance and outperform other state-of-the-art regression-based classifiers and the classical support vector machines classifier.

Keywords: Kernel method
[88] Ting Hu, Dao-Hong Xiang, and Ding-Xuan Zhou. Online learning for quantile regression and support vector regression. Journal of Statistical Planning and Inference, 142(12):3107 - 3122, 2012. [ bib | DOI | http ]
We consider for quantile regression and support vector regression a kernel-based online learning algorithm associated with a sequence of insensitive pinball loss functions. Our error analysis and derived learning rates show quantitatively that the statistical performance of the learning algorithm may vary with the quantile parameter τ. In our analysis we overcome the technical difficulty caused by the varying insensitive parameter introduced with a motivation of sparsity.

Keywords: Quantile regression
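Note: the preceding abstract refers to insensitive pinball loss functions, which combine the asymmetric pinball loss of quantile regression with an ε-insensitive zone. The sketch below assumes one plausible form: zero for residuals within ±ε, slope τ above the zone and slope (1 − τ) below it. The paper's exact definition may differ, and the function name is hypothetical.

```python
def insensitive_pinball_loss(residual, tau, epsilon):
    """Assumed form of an epsilon-insensitive pinball loss.

    residual: target minus prediction
    tau:      quantile parameter in (0, 1)
    epsilon:  half-width of the zone in which no loss is incurred
    """
    if residual > epsilon:
        # Under-prediction beyond the zone is weighted by tau.
        return tau * (residual - epsilon)
    if residual < -epsilon:
        # Over-prediction beyond the zone is weighted by (1 - tau).
        return (1.0 - tau) * (-residual - epsilon)
    return 0.0


# With tau = 0.9, under-prediction costs far more than over-prediction.
for r in (-2.0, -0.3, 0.3, 2.0):
    print(r, insensitive_pinball_loss(r, tau=0.9, epsilon=0.5))
```

Under this assumed form, setting ε = 0 recovers the ordinary pinball loss, and setting τ = 0.5 gives half of the ε-insensitive loss used in support vector regression.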
[89] Felipe Avila, Marco Mora, Miguel Oyarce, Alex Zuñiga, and Claudio Fredes. A method to construct fruit maturity color scales based on support machines for regression: Application to olives and grape seeds. Journal of Food Engineering, 162:9 - 17, 2015. [ bib | DOI | http ]
Abstract Color scales are a powerful tool used in agriculture to estimate the maturity of fruits. Fruit maturity is an important parameter to determine the harvest time. Typically, to obtain the maturity grade, a human expert visually associates the fruit color with a color present in the scale. In this paper, a computer-based method to create color scales is proposed. The proposed method performs a multidimensional regression based on Support Vector Regression (SVR) to generate color scales. The experimentation considers two color scale examples, the first one for grape seeds, the second one for olives. The grape seed data set contains 250 samples and the olive data set has 200 samples. Color scales developed by SVR were validated through the K-fold Cross Validation method, using mean squared error as the performance function. The proposed method generates scales that adequately follow the evolution of color in the fruit maturity process and provides a tool to define different phenolic pre-harvest stages, which may be of interest to the human expert.

Keywords: Color scales
[90] Wenle Zhang, Na Li, Yuyan Feng, Shujun Su, Tao Li, and Bing Liang. A unique quantitative method of acid value of edible oils and studying the impact of heating on edible oils by UV–Vis spectrometry. Food Chemistry, 185:326 - 332, 2015. [ bib | DOI | http ]
Abstract UV–Vis spectroscopy coupled with chemometrics was used effectively to study the impact of heating on edible oils (corn oil, sunflower oil, rapeseed oil, peanut oil, soybean oil and sesame oil) and determine their acid value. Analysis of their first derivative spectra showed that the peak at 370 nm was a common indicator of the heated oils. Partial least squares regression (PLS) and principal component regression (PCR) were applied to building individual quantitative models of acid value for each kind of oil, respectively. The PLS models had a better performance than PCR models, with determination coefficients (R2) of 0.9904–0.9977 and root mean square errors (RMSE) of 0.0230–0.0794 for the prediction sets of each kind of oil, respectively. An integrated quantitative model built by support vector regression for all six kinds of oils was also developed and gave a satisfactory prediction with an R2 of 0.9932 and an RMSE of 0.0656.

Keywords: Edible oil
[91] Shunli Zhang, Yao Sui, Xin Yu, Sicong Zhao, and Li Zhang. Hybrid support vector machines for robust object tracking. Pattern Recognition, 48(8):2474 - 2488, 2015. [ bib | DOI | http ]
Abstract Tracking-by-detection techniques always formulate tracking as a binary classification problem. However, in this formulation, there exists a potential issue that the boundary of the positive targets and the negative background samples is fuzzy, which may be an important factor causing drift. To address this problem, we propose a novel hybrid formulation for tracking based on binary classification, regression and one-class classification, which comprehensively represents the appearance from different perspectives. In particular, the proposed regression model is a novel formulation for tracking and plays an important role in solving the fuzzy boundary problem. Moreover, we present a new tracking approach with different support vector machines (SVMs) and a novel distribution-based collaboration strategy as a specific implementation. Experimental results demonstrate that our method is robust and can achieve the state-of-the-art performance.

Keywords: Object tracking
[92] F. Salazar, M.A. Toledo, E. Oñate, and R. Morán. An empirical comparison of machine learning techniques for dam behaviour modelling. Structural Safety, 56:9 - 17, 2015. [ bib | DOI | http ]
Abstract Predictive models are essential in dam safety assessment. Both deterministic and statistical models applied in day-to-day practice have proven useful, although they show relevant limitations at the same time. On another note, powerful learning algorithms have been developed in the field of machine learning (ML), which have been applied to solve practical problems. The work aims at testing the prediction capability of some state-of-the-art algorithms to model dam behaviour, in terms of displacements and leakage. Models based on random forests (RF), boosted regression trees (BRT), neural networks (NN), support vector machines (SVM) and multivariate adaptive regression splines (MARS) are fitted to predict 14 target variables. Prediction accuracy is compared with the conventional statistical model, which shows poorer performance on average. BRT models stand out as the most accurate overall, followed by NN and RF. It was also verified that the model fit can be improved by removing the records of the first years of dam functioning from the training set.

Keywords: Dam monitoring
[93] S. Moncayo, S. Manzoor, F. Navarro-Villoslada, and J.O. Caceres. Evaluation of supervised chemometric methods for sample classification by laser induced breakdown spectroscopy. Chemometrics and Intelligent Laboratory Systems, 146:354 - 364, 2015. [ bib | DOI | http ]
Abstract In this work seven supervised chemometric methods have been evaluated in a real world application for the classification of human bone remains with similar elemental composition based on Laser Induced Breakdown Spectroscopy (LIBS) measurements. Bone samples belonging to five individuals were obtained from a local cemetery, exposed to uncontrolled conditions. LIBS data were processed with different linear and non-linear supervised chemometric approaches. The performance of each chemometric model was assessed by three validation procedures taking into account their sensitivity (internal validation), generalization ability and robustness (independent external validation). The accuracy of each method increased in the following order: 42% for Linear Discriminant Analysis (LDA), 48% for Classification and Regression Tree (CART), 56% for Support Vector Machines (SVM), 58% for Soft Independent Modeling of Class Analogy (SIMCA), 58% for Partial Least Squares–Discriminant Analysis (PLS-DA), 66% for Binary Logistic Regression (BLR) and 100% for Artificial Neural Networks (NN). The results showed that NN outperforms the other methods in terms of sensitivity, generalization ability and robustness, whereas SIMCA, PLS-DA, LDA, CART, Logistic Regression and SVM did not show significant accuracy to discriminate the bone samples with a high degree of similarity.

Keywords: Laser Induced Breakdown Spectroscopy
[94] Chen-Chia Chuang and Zne-Jung Lee. Hybrid robust support vector machines for regression with outliers. Applied Soft Computing, 11(1):64 - 72, 2011. [ bib | DOI | http ]
In this study, a hybrid robust support vector machine for regression is proposed to deal with training data sets with outliers. The proposed approach consists of two stages of strategies. The first stage is for data preprocessing, and a support vector machine for regression is used to filter out outliers in the training data set. Since the outliers in the training data set are removed, the concept of robust statistics is not needed for reducing the outliers’ effects in the later stage. Then, the training data set without the outliers, called the reduced training data set, is directly used in training the non-robust least squares support vector machines for regression (LS-SVMR) or the non-robust support vector regression networks (SVRNs) in the second stage. Consequently, the learning mechanism of the proposed approach is much easier than that of the robust support vector regression networks (RSVRNs) approach and of the weighted LS-SVMR approach. Based on the simulation results, the performance of the proposed approach with non-robust LS-SVMR is superior to the weighted LS-SVMR approach when outliers exist. Moreover, the performance of the proposed approach with non-robust SVRNs is also superior to the RSVRNs approach.

Keywords: Outliers
[95] V. Ceperic, G. Gielen, and A. Baric. Sparse multikernel support vector regression machines trained by active learning. Expert Systems with Applications, 39(12):11029 - 11035, 2012. [ bib | DOI | http ]
A method for sparse multikernel support vector regression machines is presented. The proposed method achieves a high accuracy versus complexity ratio and allows the user to adjust the complexity of the resulting models. The sparse representation is guaranteed by limiting the number of training data points for the support vector regression method. Each training data point is selected based on its influence on the accuracy of the model using the active learning principle. A different kernel function is attributed to each training data point, yielding a multikernel regressor. The advantages of the proposed method are illustrated on several examples and confirmed by the experiments.

Keywords: Support vector machines
[96] Hossein Bonakdari, Amir Hossein Zaji, Shahaboddin Shamshirband, Roslan Hashim, and Dalibor Petkovic. Sensitivity analysis of the discharge coefficient of a modified triangular side weir by adaptive neuro-fuzzy methodology. Measurement, 73:74 - 81, 2015. [ bib | DOI | http ]
Abstract The discharge coefficient of a modified triangular side weir is analyzed regarding various non-dimensional input sets. It is desirable to select and analyze factors or parameters that are truly relevant or the most influential to triangular side weir discharge coefficient estimation and prediction. The Adaptive Neuro-Fuzzy Inference System (ANFIS) is applied for the selection of the most prominent triangular side weir discharge coefficient parameters based on ten input parameters. The input variables were searched using the ANFIS network to specify the input parameters’ effects on the discharge coefficients. According to the obtained results, the side weir included angle has the most effect on modeling the discharge coefficient. Then by using the selected input variables, the discharge coefficient was modeled with ANFIS, artificial neural network, support vector machine and multi non linear regression methods. The results show that ANFIS could predict the discharge coefficient significantly better than the other investigated models.

Keywords: ANFIS
[97] Ping Zhu, Yu Zhang, and Guanlong Chen. Metamodeling development for reliability-based design optimization of automotive body structure. Computers in Industry, 62(7):729 - 741, 2011. [ bib | DOI | http ]
Metamodels are commonly used in reliability-based design optimization (RBDO) due to the enormously expensive computation cost of numerical simulations. However, for large-scale design optimization of automotive body structure, with the increasing number of design variables and enhanced nonlinearity degree of structural performance, the polynomial response surface, which is commonly used for vehicle design optimization, often suffers from an exponentially increased computation burden and a serious loss of approximation accuracy. In this paper, support vector regression, along with four other complex metamodeling techniques including moving least square, artificial neural network, radial basis function and Kriging, is investigated for approximating frontal crashworthiness performance, which is one of the most highly nonlinear performances. It aims at testing support vector regression and providing an advanced metamodeling technique for RBDO of automotive body structure. Approximation results are compared in both accuracy and computational efficiency. Based on the frontal crashworthiness example, it is found that support vector regression and moving least square are preferable techniques to approximate structural performances with good accuracy. But support vector regression is recommended for its computational efficiency and better approximation potential. Moreover, the ensemble of support vector regression, moving least square, Kriging and artificial neural network is an effective alternative and is proved, in the RBDO example for the lightweight design of front body structure, to outperform any other single metamodel. The remarkable predominance indicates that the ensemble of support vector regression, moving least square, Kriging and artificial neural network holds great potential in approximating highly nonlinear performances for RBDO of automotive body structure.

Keywords: RBDO
[98] Qisheng Yan, Mingjing Guo, and Junpo Jiang. Study on the support vector regression model for order's prediction. Procedia Engineering, 15:1471 - 1475, 2011. CEIS 2011. [ bib | DOI | http ]
Predicting an enterprise's orders is very important. Support vector machine is a kind of learning technique based on the structural risk minimization principle, and it is also a class of regression method with good generalization ability. In this paper, a support vector machine is used to model order prediction. A simulation example is taken to demonstrate the correctness and effectiveness of the proposed approach. The selection method of the model parameters is presented.

Keywords: Order ;Support Vector Regression ;Neural Network ;Prediction
[99] Hsu-Yung Cheng, Chih-Chang Yu, and Sian-Jing Lin. Bi-model short-term solar irradiance prediction using support vector regressors. Energy, 70:121 - 127, 2014. [ bib | DOI | http ]
Abstract This paper proposes an accurate short-term solar irradiance prediction scheme via support vector regression. Utilizing clearness index conversion and appropriate features, the support vector regression models are able to output satisfying prediction results. The prediction results are further improved by the proposed ramp-down event forecasting and solar irradiance refinement procedures. With the help of all-sky image analysis, two separated regression models are constructed based on the cloud obstruction conditions near the solar disk. With bi-model prediction, the behavior of the changing irradiance can be captured more accurately. Moreover, if a ramp-down event is forecasted, the predicted irradiance is corrected based on the cloud cover ratio in the area near the sun. The experiments have shown that the proposed method can effectively improve the prediction accuracy on a highly challenging dataset.

Keywords: Solar irradiance prediction
[100] Ji Huang, Yucheng Bo, and Huiyuan Wang. Electromechanical equipment state forecasting based on genetic algorithm – support vector regression. Expert Systems with Applications, 38(7):8399 - 8402, 2011. [ bib | DOI | http ]
Effectively predicting the state of electromechanical equipment under nonlinear and non-stationary conditions is important for forecasting the lifetime of the equipment. In order to forecast the state of electromechanical equipment accurately, support vector regression optimized by a genetic algorithm is proposed. In the model, the genetic algorithm is employed to choose the training parameters of the support vector machine, and an SVR forecasting model of electromechanical equipment state with good forecasting ability is obtained. The proposed forecasting model is applied to state forecasting for industrial smokes and a gas turbine. The experimental results demonstrate that the proposed GA-SVR model provides better prediction capability. Therefore, the method is considered a promising alternative for forecasting the state of electromechanical equipment.

Keywords: Support vector machine
[101] Melda Akın. A novel approach to model selection in tourism demand modeling. Tourism Management, 48:64 - 72, 2015. [ bib | DOI | http ]
Abstract In many studies on tourism demand modeling, the main conclusion is that none of the considered modeling approaches consistently outperforms the others. We consider Seasonal AutoRegressive Integrated Moving Average, ν-Support Vector Regression, and multi-layer perceptron type Neural Network models and optimize their parameters using different techniques for each and compare their performances on monthly tourist arrival data to Turkey from different countries. Based on these results, this study proposes a novel approach to model selection for a given tourism time series. Our approach is based on identifying the components of the given time series using structural time series modeling. Using the identified components we construct a decision tree and obtain a rule set for model selection.

Keywords: Time series
[102] Nasser H. Sweilam, A.A. Tharwat, and N.K. Abdel Moniem. Support vector machine for diagnosis cancer disease: A comparative study. Egyptian Informatics Journal, 11(2):81 - 92, 2010. [ bib | DOI | http ]
The support vector machine has become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. Training a support vector machine requires the solution of a very large quadratic programming problem. Traditional optimization methods cannot be directly applied due to memory restrictions. Several approaches now exist that circumvent these shortcomings and work well. Another learning algorithm, particle swarm optimization, and quantum-behaved particle swarm for training SVM are introduced. Another approach, the least square support vector machine (LSSVM), and an active set strategy are also introduced. The results obtained by these methods are tested on a breast cancer dataset and compared with the exact solution model problem.

Keywords: Breast cancer diagnosis mathematical model
[103] J. Antonanzas, R. Urraca, F.J. Martinez de Pison, and F. Antonanzas-Torres. Solar irradiation mapping with exogenous data from support vector regression machines estimations. Energy Conversion and Management, 100:380 - 390, 2015. [ bib | DOI | http ]
Abstract Exactly how to estimate solar resources in areas without pyranometers is of great concern for solar energy planners and developers. This study addresses the mapping of daily global irradiation by combining geostatistical interpolation techniques with support vector regression machines. The support vector regression machines training process incorporated commonly measured meteorological variables (temperatures, rainfall, humidity and wind speed) to estimate solar irradiation and was performed with data of 35 pyranometers over continental Spain. Genetic algorithms were used to simultaneously perform feature selection and model parameter optimization in the calibration process. The model was then used to estimate solar irradiation in a massive set of exogenous stations, 365 sites without irradiation sensors, so as to overcome the lack of pyranometers. Then, different spatial techniques for interpolation, fed with both measured and estimated irradiation values, were evaluated and compared, which led to the conclusion that ordinary kriging demonstrated the best performance. Training and interpolation mean absolute errors were as low as 1.81 MJ/m² day and 1.74 MJ/m² day, respectively. Errors improved significantly as compared to interpolation without exogenous stations and others referred in the bibliography for the same region. This study presents an innovative methodology for estimating solar irradiation, which is especially promising since it may be implemented broadly across other regions and countries under similar circumstances.

Keywords: Solar resource estimation
[104] A.A. Yusuff, A.A. Jimoh, and J.L. Munda. Fault location in transmission lines based on stationary wavelet transform, determinant function feature and support vector regression. Electric Power Systems Research, 110:73 - 83, 2014. [ bib | DOI | http ]
Abstract This paper proposes a novel transmission line fault location scheme, combining stationary wavelet transform (SWT), determinant function feature (DFF), support vector machine (SVM) and support vector regression (SVR). Various types of faults at different locations, fault impedance and fault inception angles on a 400 kV, 361.297 km transmission line are investigated. The system only utilizes single-end measurements. DFF is used to extract distinctive fault features from 1/4 cycle of post fault signals after noise and the decaying DC offset have been eliminated by a filtering scheme based on SWT. A classifier (SVM) and regression (SVR) schemes are subsequently trained with features obtained from DFF. The scheme is then used in precise location of fault on the transmission line. The result shows that fault location on transmission lines can be determined rapidly and correctly irrespective of fault impedance.

Keywords: Fault location
[105] S.R. Na’imi, S.R. Shadizadeh, M.A. Riahi, and M. Mirzakhanian. Estimation of reservoir porosity and water saturation based on seismic attributes using support vector regression approach. Journal of Applied Geophysics, 107:93 - 101, 2014. [ bib | DOI | http ]
Abstract Porosity and fluid saturation distributions are crucial properties of hydrocarbon reservoirs and are involved in almost all calculations related to reservoir and production. True measurements of these parameters, derived from laboratory measurements, are only available at isolated localities of a reservoir and are also expensive and time-consuming. Therefore, other methodologies that are robust, simple, and inexpensive are needed. The Support Vector Regression approach is a relatively novel method for function estimation in regression problems. Contrary to conventional neural networks, which minimize the error on the training data using the usual Empirical Risk Minimization principle, Support Vector Regression minimizes an upper bound on the expected risk by means of the Structural Risk Minimization principle. This difference, which is the goal in statistical learning, gives the approach a greater ability to generalize. In this study, first, appropriate seismic attributes which have an underlying dependency with reservoir porosity and water saturation are extracted. Subsequently, a non-linear support vector regression algorithm is utilized to obtain a quantitative formulation between the porosity and water saturation parameters and the selected seismic attributes. For an undrilled reservoir, for which core and log data are insufficient, it is reasonably possible to characterize the hydrocarbon-bearing formation by means of this method.

Keywords: Porosity
[106] Zhao Lu and Jing Sun. Non-Mercer hybrid kernel for linear programming support vector regression in nonlinear systems identification. Applied Soft Computing, 9(1):94 - 99, 2009. [ bib | DOI | http ]
As a new sparse kernel modeling method, support vector regression (SVR) has been regarded as the state-of-the-art technique for regression and approximation. In [V.N. Vapnik, The Nature of Statistical Learning Theory, second ed., Springer-Verlag, 2000], Vapnik developed the ɛ-insensitive loss function for support vector regression as a trade-off between the robust loss function of Huber and one that enables sparsity within the support vectors. The use of the support vector kernel expansion provides a potential avenue to represent nonlinear dynamical systems and underpin advanced analysis. However, the standard quadratic programming support vector regression (QP-SVR) is often computationally expensive to implement, and sufficient model sparsity cannot be guaranteed. In an attempt to mitigate these drawbacks, this article focuses on the application of the soft-constrained linear programming support vector regression (LP-SVR) with a hybrid kernel in nonlinear black-box systems identification. An innovative non-Mercer hybrid kernel is explored by leveraging the flexibility of LP-SVR in choosing the kernel functions. The simulation results demonstrate the ability to use more general kernel functions and the inherent performance advantage of LP-SVR over QP-SVR in terms of model sparsity and computational efficiency.

Keywords: Support vector regression
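The ɛ-insensitive loss mentioned in entry [106] can be illustrated with a minimal sketch. The function name and the default tube width `epsilon=0.1` are illustrative assumptions and are not taken from the paper. Residuals smaller than ɛ incur no penalty, which is the property that permits sparsity in the support vectors.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss that is zero inside the epsilon tube and grows linearly outside it.

    epsilon is an assumed, illustrative tube half-width.
    """
    residual = abs(y_true - y_pred)
    return max(0.0, residual - epsilon)


# A prediction inside the tube costs nothing; one outside is penalized
# only by the amount it exceeds the tube.
inside = epsilon_insensitive_loss(1.0, 1.05)
outside = epsilon_insensitive_loss(1.0, 1.5)
```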
[107] R. Viswanathan and Pijush Samui. Determination of rock depth using artificial intelligence techniques. Geoscience Frontiers, pages -, 2015. [ bib | DOI | http ]
Abstract This article adopts three artificial intelligence techniques, Gaussian Process Regression (GPR), Least Square Support Vector Machine (LSSVM) and Extreme Learning Machine (ELM), for prediction of rock depth (d) at any point in Chennai. GPR, ELM and LSSVM have been used as regression techniques. Latitude and longitude are also adopted as inputs of the GPR, ELM and LSSVM models. The performance of the ELM, GPR and LSSVM models has been compared. The developed ELM, GPR and LSSVM models produce spatial variability of rock depth and offer robust models for the prediction of rock depth.

Keywords: Rock depth
[108] M. Hirtl, S. Mantovani, B.C. Krüger, G. Triebnig, C. Flandorfer, M. Bottoni, and M. Cavicchi. Improvement of air quality forecasts with satellite and ground based particulate matter observations. Atmospheric Environment, 84:20 - 27, 2014. [ bib | DOI | http ]
Abstract Daily regional scale forecasts of particulate air pollution are simulated for public information and warning. An increasing amount of air pollution measurements is available in real-time from ground stations as well as from satellite observations. In this paper, the Support Vector Regression technique is applied to derive highly-resolved PM10 initial fields for air quality modeling from satellite measurements of the Aerosol Optical Thickness. Additionally, PM10-ground measurements are assimilated using optimum interpolation. The performance of both approaches is shown for a selected PM10 episode.

Keywords: PM10 forecasts
[109] Mohammad Alizadeh and Turaj Amraee. Adaptive scheme for local prediction of post-contingency power system frequency. Electric Power Systems Research, 107:240 - 249, 2014. [ bib | DOI | http ]
Abstract The power system frequency should always be kept above a minimum threshold determined by the limitations of system equipment such as synchronous generators. In this paper a new method is proposed for local prediction of the maximum post-contingency deviation of power system frequency using Artificial Neural Network (ANN) and Support Vector Regression (SVR) learning machines. Because the network oscillation modes change under different contingencies, the proposed predictors adjust the data sampling time to improve performance. For ANN and SVR training, a comprehensive list of scenarios is created considering all credible disturbances. The performance of the proposed algorithm is simulated and verified on a dynamic test system.

Keywords: Frequency response
[110] Qing Li, Licheng Jiao, and Yingjuan Hao. Adaptive simplification of solution for support vector machine. Pattern Recognition, 40(3):972 - 980, 2007. [ bib | DOI | http ]
SVM has been receiving increasing interest in areas ranging from its original application in pattern recognition to other applications such as regression estimation, due to its remarkable generalization performance. Unfortunately, SVM is currently considerably slower in the test phase because of the number of support vectors, which has been a serious limitation for some applications. To overcome this problem, we propose an adaptive algorithm named feature vectors selection (FVS) to select the feature vectors from the support vector solutions, which is based on the vector correlation principle and a greedy algorithm. Through the adaptive algorithm, the sparsity of the solution is improved and the time cost in testing is reduced. By selecting the number of feature vectors adaptively according to the requirements, the trade-off between generalization and complexity can be directly controlled. The computer simulations on regression estimation and pattern recognition show that FVS is a promising algorithm to simplify the solution for support vector machine.

Keywords: Support vector machine
[111] Piotr Bilski. Application of support vector machines to the induction motor parameters identification. Measurement, 51:377 - 386, 2014. [ bib | DOI | http ]
Abstract The paper presents the application of Support Vector Machines (SVM) to identify the parameters of the induction machine. The problem is identical to the regression task, solved here with the help of multiple SVM modules, each identifying a separate parameter of the system. The work regime of the induction motor and the significance of its accurate modelling are introduced. The application of SVM to the task is discussed, both as a standalone regression method and combined with a preceding classification approach (such as decision trees). Methods of measuring the regression accuracy in both scenarios are introduced. Experimental results of the model identification are presented in detail and discussed. The SVM optimization is performed, including selection of the kernel and the values of its parameters, maximizing the diagnostic accuracy. The paper concludes with a discussion of the results, conclusions and future prospects.

Keywords: Electrical machines
[112] Ch. Suryanarayana, Ch. Sudheer, Vazeer Mahammood, and B.K. Panigrahi. An integrated wavelet-support vector machine for groundwater level prediction in Visakhapatnam, India. Neurocomputing, 145:324 - 335, 2014. [ bib | DOI | http ]
Abstract Accurate and reliable prediction of the groundwater level variation is significant and essential in water resources management of a basin. The situation is complicated by the fact that the variation of groundwater level is highly nonlinear in nature because of interdependencies and uncertainties in the hydro-geological process. Models such as Artificial Neural Networks (ANN) and Support Vector Machine (SVM) have proved to be effective in modeling virtually any nonlinear function with a greater degree of accuracy. In recent times, combining several techniques to form a hybrid tool to improve the accuracy of prediction has become a common practice for various applications. This integrated method increases the efficiency of the model by combining the unique features of the constituent models to capture different patterns in the data. In the present study, an attempt is made to predict monthly groundwater level fluctuations using integrated wavelet and support vector machine modeling. The discrete wavelet transform with two coefficients (db2 wavelet) is adopted for decomposing the input data into wavelet series. These series are further used as input variables in different combinations for Support Vector Regression (SVR) model to forecast groundwater level fluctuations. The monthly data of precipitation, maximum temperature, mean temperature and groundwater depth for the period 2001–2012 are used as the input variables. The proposed Wavelet-Support Vector Regression (WA-SVR) model is applied to predict the groundwater level variations for three observation wells in the city of Visakhapatnam, India. The performance of the WA-SVR model is compared with SVR, ANN and also with the traditional Auto Regressive Integrated Moving Average (ARIMA) models. Results indicate that WA-SVR model gives better accuracy in predicting groundwater levels in the study area when compared to other models.

Keywords: Predicting
[113] Xinjun Peng. Efficient twin parametric insensitive support vector regression model. Neurocomputing, 79:26 - 38, 2012. [ bib | DOI | http ]
In this paper, an efficient twin parametric insensitive support vector regression (TPISVR) is proposed. The TPISVR determines the regression function indirectly through a pair of nonparallel parametric-insensitive up- and down-bound functions obtained by solving two smaller support vector machine (SVM)-type problems. As a result, the TPISVR not only learns faster than the classical SVR but is also suitable for many cases, especially when the noise is heteroscedastic, that is, when the noise strongly depends on the input value. The proposed method has the advantage of using the ratio of the parameters ν and c to control the bounds on the fractions of support vectors and errors. The experimental results on several artificial and benchmark datasets indicate that the TPISVR not only has fast learning speed, but also shows good generalization performance.

Keywords: Support vector machine
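As a rough illustration of how a regressor can be obtained indirectly from a pair of bound functions, as described in entry [113], the sketch below averages a down-bound and an up-bound function. Taking the mean of the two bounds is the usual convention in twin support vector regression and is assumed here. The two linear bound functions and their coefficients are hypothetical placeholders, not fitted models.

```python
def down_bound(x):
    # Hypothetical down-bound function with placeholder coefficients.
    return 1.5 * x - 1.0


def up_bound(x):
    # Hypothetical up-bound function; its slope differs from the
    # down-bound, so the two bounds are nonparallel.
    return 2.5 * x + 1.0


def twin_regressor(x, f_down=down_bound, f_up=up_bound):
    """Assumed convention: the regressor is the mean of the two bounds."""
    return 0.5 * (f_down(x) + f_up(x))


estimate = twin_regressor(2.0)
```

Because the bounds are nonparallel, the gap between them widens with x, which is how this family of models can represent input-dependent noise.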
[114] P. Lingras and C.J. Butz. Rough support vector regression. European Journal of Operational Research, 206(2):445 - 455, 2010. [ bib | DOI | http ]
This paper describes the relationship between support vector regression (SVR) and rough (or interval) patterns. SVR is the prediction component of the support vector techniques. Rough patterns are based on the notion of rough values, which consist of upper and lower bounds, and are used to effectively represent a range of variable values. Predictions of rough values in a variety of different forms within the context of interval algebra and fuzzy theory are attracting research interest. An extension of SVR, called rough support vector regression (RSVR), is proposed to improve the modeling of rough patterns. In particular, it is argued that the upper and lower bounds should be modeled separately. The proposal is shown to be a more flexible version of lower possibilistic regression model using ϵ-insensitivity. Experimental results on the Dow Jones Industrial Average demonstrate the suggested RSVR modeling technique.

Keywords: Rough set
[115] Xinjun Peng. Primal twin support vector regression and its sparse approximation. Neurocomputing, 73(16–18):2846 - 2858, 2010. 10th Brazilian Symposium on Neural Networks (SBRN2008). [ bib | DOI | http ]
Twin support vector regression (TSVR) obtains faster learning speed by solving a pair of smaller sized support vector machine (SVM)-type problems than classical support vector regression (SVR). In this paper, a primal version for TSVR, termed primal TSVR (PTSVR), is first presented. By introducing a quadratic function to approximate its loss function, PTSVR directly optimizes the pair of quadratic programming problems (QPPs) of TSVR in the primal space based on a series of sets of linear equations. PTSVR can obviously improve the learning speed of TSVR without loss of the generalization. To improve the prediction speed, a greedy-based sparse TSVR (STSVR) in the primal space is further suggested. STSVR uses a simple back-fitting strategy to iteratively select its basis functions and update the augmented vectors. Computational results on several synthetic as well as benchmark datasets confirm the merits of PTSVR and STSVR.

Keywords: Twin support vector regression
[116] Yun Hwan Kim, Seong Joon Yoo, Yeong Hyeon Gu, Jin Hee Lim, Dongil Han, and Sung Wook Baik. Crop pests prediction method using regression and machine learning technology: Survey. IERI Procedia, 6:52 - 56, 2014. 2013 International Conference on Future Software Engineering and Multimedia Engineering (ICFM 2013). [ bib | DOI | http ]
Abstract This paper describes current trends in the prediction of crop pests using machine learning technology. With the advent of data mining, the field of agriculture is also focused on it. Currently, various studies, domestic and overseas, are in progress using machine learning technology, and cases of its utilization are increasing. This paper classifies and introduces SVM (Support Vector Machine), Multiple Linear Regression, Neural Network, and Bayesian Network based techniques, and describes some cases of their utilization.

Keywords: Regression
[117] X. Sun, K.J. Chen, E.P. Berg, D.J. Newman, C.A. Schwartz, W.L. Keller, and K.R. Maddock Carlin. Prediction of troponin-T degradation using color image texture features in 10 d aged beef longissimus steaks. Meat Science, 96(2, Part A):837 - 842, 2014. [ bib | DOI | http ]
Abstract The objective was to use digital color image texture features to predict troponin-T degradation in beef. Image texture features, including 88 gray level co-occurrence texture features, 81 two-dimension fast Fourier transformation texture features, and 48 Gabor wavelet filter texture features, were extracted from color images of beef strip steaks (longissimus dorsi, n = 102) aged for 10 d obtained using a digital camera and additional lighting. Steaks were designated degraded or not-degraded based on troponin-T degradation determined on d 3 and d 10 postmortem by immunoblotting. Statistical analysis (STEPWISE regression model) and artificial neural network (support vector machine model, SVM) methods were designed to classify protein degradation. The d 3 and d 10 STEPWISE models were 94% and 86% accurate, respectively, while the d 3 and d 10 SVM models were 63% and 71%, respectively, in predicting protein degradation in aged meat. STEPWISE and SVM models based on image texture features show potential to predict troponin-T degradation in meat.

Keywords: Beef
[118] Weilin Luo, Lúcia Moreira, and C. Guedes Soares. Manoeuvring simulation of catamaran by using implicit models based on support vector machines. Ocean Engineering, 82:150 - 159, 2014. [ bib | DOI | http ]
Abstract Manoeuvring models based on support vector machines (SVMs) are proposed for the manoeuvring simulation of a catamaran. Implicit models of manoeuvring motion are derived from the SVM regression instead of using the traditional methods for identification of the hydrodynamic coefficients. Data obtained from full-scale trials are used for regression analysis. Disturbances induced by current and wind are estimated. At the training stage, the inputs to the SVMs are the surge speed, sway speed, yaw rate and rudder angle, while the outputs are the derivatives of the surge speed, sway speed and yaw rate, respectively. At the simulation stage a predictive model is constructed with the obtained support vectors, Lagrangian factors and a constant. The Gauss function kernel is employed in the SVMs to guarantee the performance of the approximation and the robustness of the SVM regressor. The turning circle manoeuvre is simulated based on the regression manoeuvring models. Comparisons between the trials and the simulated results are conducted to demonstrate the validity of the proposed modelling method.

Keywords: Catamaran
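The predictive model in entry [118], built from support vectors, Lagrangian factors and a constant with a Gauss function kernel, has the general form of a kernel expansion. The sketch below shows that form only; the function names, the `gamma` kernel width and the toy values are illustrative assumptions and do not come from the paper.

```python
import math


def gaussian_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel exp(-gamma * ||x - z||^2); gamma is assumed."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, coefficients, bias, gamma=0.5):
    """Evaluate f(x) = sum_i coef_i * K(sv_i, x) + bias.

    coefficients play the role of the Lagrangian factors and bias is the
    constant term; in a trained model both come from the optimizer.
    """
    expansion = sum(
        coef * gaussian_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, coefficients)
    )
    return expansion + bias


# Toy values: a single support vector that coincides with the query point,
# so the kernel evaluates to 1 and the prediction is coefficient + bias.
prediction = svr_predict([1.0, 2.0], [[1.0, 2.0]], [2.0], 1.0)
```

For the manoeuvring application, one such regressor would be evaluated per output (the derivatives of surge speed, sway speed and yaw rate), with the four inputs named in the abstract forming the vector x.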
[119] Divya Tomar and Sonali Agarwal. Twin support vector machine: A review from 2007 to 2014. Egyptian Informatics Journal, 16(1):55 - 69, 2015. [ bib | DOI | http ]
Abstract Twin Support Vector Machine (TWSVM) is an emerging machine learning method suitable for both classification and regression problems. It utilizes the concept of Generalized Eigen-values Proximal Support Vector Machine (GEPSVM) and finds two non-parallel planes for each class by solving a pair of Quadratic Programming Problems. It enhances the computational speed as compared to the traditional Support Vector Machine (SVM). TWSVM was initially constructed to solve binary classification problems; later researchers successfully extended it to the multi-class problem domain. TWSVM always gives promising empirical results, due to which it has many attractive features which enhance its applicability. This paper presents the research development of TWSVM in recent years. This study is divided into two main broad categories - variant based and multi-class based TWSVM methods. The paper primarily discusses the basic concept of TWSVM and highlights its applications in recent years. A comparative analysis of various research contributions based on TWSVM is also presented. This is helpful for researchers to effectively utilize the TWSVM as an emergent research methodology and encourage them to work further in the performance enhancement of TWSVM.

Keywords: Twin Support Vector Machine
[120] Feng-Ping An, Da-Chao Lin, Ying-Ang Li, and Xian-Wei Zhou. Edge effects of BEMD improved by expansion of support-vector-regression extrapolation and mirror-image signals. Optik - International Journal for Light and Electron Optics, pages -, 2015. [ bib | DOI | http ]
Abstract In the operation of bidimensional empirical mode decomposition, expansion with mirror-image signals is an effective approach to weaken the edge effect. To meet the basic requirement that mirrors should be placed at the extrema, however, there is a problem to make full use of the information involved in the original signal. To address this problem, we propose an approach with the expansion of both support-vector-regression (SVR) extrapolation and mirror-image signals, in which the extrema are captured from the data of SVR extrapolation. The SVR model is constructed with the support vector method (SVM) based on the original signal data. Its extrapolation results in the estimation of the signal data beyond the edge for capturing the extrema so that the information of the original signal can be fully used in locating the mirror. Once all of these extrema points are determined, the traditional mirror expansion method is used and finally edge effects of the BEMD are eliminated. Results from numerical experiments show that the proposed approach has a good capability of improving edge effects of the BEMD operation process, and the reconstruction image from the decomposed components of the intrinsic mode function (IMF) confirms its high coherency with the original one.

Keywords: BEMD
[121] Yong-Ping Zhao, Jing Zhao, and Min Zhao. Twin least squares support vector regression. Neurocomputing, 118:225 - 236, 2013. [ bib | DOI | http ]
Abstract In this paper, combining the spirit of twin hyperplanes with the fast speed of least squares support vector regression (LSSVR) yields a new regressor, termed as twin least squares support vector regression (TLSSVR). As a result, TLSSVR outperforms normal LSSVR in the generalization performance, and as opposed to other algorithms of twin hyperplanes, TLSSVR owns faster computational speed. When coping with large scale problems, this advantage is obvious. To accelerate the testing speed of TLSSVR, TLSSVR is sparsified using a simple mechanism, thus obtaining STLSSVR. In addition to introducing these algorithms above, a lot of experiments including a toy problem, several small and large scale data sets, and a gas furnace example are done. These applications demonstrate the effectiveness and efficiency of the proposed algorithms.

Keywords: Support vector machine
[122] Jaehun Lee, Wooyong Chung, and Euntai Kim. A new kernelized approach to wireless sensor network localization. Information Sciences, 243:20 - 38, 2013. [ bib | DOI | http ]
Abstract In this paper, a new approach to range-free localization in Wireless Sensor Networks (WSNs) is proposed using nonlinear mapping, and the kernel function is introduced. The localization problem in the WSN is formulated as a kernelized regression problem, which is solved by support vector regression (SVR) and multi-dimensional support vector regression (MSVR). The proposed methods are simple and efficient in that no additional hardware is required for the measurements, and only proximity information and position information of the anchor nodes are used for the localization. The proposed methods are composed of three steps: the measurement step, kernelized regression step, and localization step. In the measurement step, the proximity information of the given network is measured. In the regression step, the relationships between the geographical distances and the proximity among sensor nodes are built using kernelized regression. In the localization step, each sensor node finds its own position in a distributed manner using a kernelized regressor. The simulation result demonstrates that the proposed methods exhibit excellent and robust location estimation performance.

Keywords: Wireless sensor network
[123] Parisa Bagheripour, Amin Gholami, Mojtaba Asoodeh, and Mohsen Vaezzadeh-Asadi. Support vector regression based determination of shear wave velocity. Journal of Petroleum Science and Engineering, 125:95 - 99, 2015. [ bib | DOI | http ]
Abstract Shear wave velocity, together with compressional wave velocity, provides an invaluable source of information for geomechanical and geophysical studies. Although compressional wave velocity measurements exist in almost all wells, shear wave velocity is not recorded for most older wells due to the lack of technological tools at the time and the incapability of recent tools in cased holes. Furthermore, measurement of shear wave velocity is to some extent costly. This study proposes a novel methodology to address the aforementioned problems by use of the support vector regression tool originally invented by Vapnik (1995, The Nature of Statistical Learning Theory. Springer, New York, NY). Support vector regression (SVR) is a supervised learning algorithm based on statistical learning theory (SLT). It is used in this study to formulate conventional well log data into shear wave velocity in a quick, cheap, and accurate manner. SVR is preferred for model construction because it utilizes the structural risk minimization (SRM) principle, which is superior to the empirical risk minimization (ERM) principle used in traditional learning algorithms such as neural networks. A group of 2879 data points was used for model construction and 1176 data points were employed for assessment of the SVR model. A comparison between measured and SVR-predicted data showed that SVR was capable of accurately extracting shear wave velocity hidden in conventional well log data. Finally, a comparison among SVR, neural network, and four well-known empirical correlations demonstrated that the SVR model outperformed the other methods. This strategy was successfully applied in one of the carbonate reservoir rocks of Iran Gas-Fields.

Keywords: Shear wave velocity
[124] M. Herrera, J. Izquierdo, R. Pérez-García, and D. Ayala-Cabrera. On-line learning of predictive kernel models for urban water demand in a smart city. Procedia Engineering, 70:791 - 799, 2014. 12th International Conference on Computing and Control for the Water Industry, CCWI2013. [ bib | DOI | http ]
Abstract This paper proposes a multiple kernel regression (MKr) to predict water demand in the presence of a continuous source of information. MKr extends the simple support vector regression (SVR) to a combination of kernels from as many distinct types as kinds of input data are available. In addition, two on-line learning methods to obtain real time predictions as new data arrives to the system are tested by a real-world case study. The accuracy and computational efficiency of the results indicate that our proposal is a suitable tool for making adequate management decisions in the smart cities environment.

Keywords: Smart cities
[125] A. Candelieri and F. Archetti. Identifying typical urban water demand patterns for a reliable short-term forecasting – the ICeWater project approach. Procedia Engineering, 89:1004 - 1012, 2014. 16th Water Distribution System Analysis Conference, WDSA2014 Urban Water Hydroinformatics and Strategic Planning. [ bib | DOI | http ]
Abstract This paper presents a computational framework that performs, in two stages, urban water demand pattern characterization through time series clustering and reliable hourly water demand forecasting for the entire day based on Support Vector Machine (SVM) regression. An SVM regression model is trained for each cluster identified and for each hour of the day, taking the hourly water demand data acquired at the very first m hours of the day. The approach has been validated on a real case study, namely the urban water demand of the Water Distribution Network (WDN) in Milan, managed by Metropolitana Milanese, one of the partners of the EU-FP7-ICT ICeWater project.

Keywords: urban water demand
[126] Daniel J. Griffin, Martha A. Grover, Yoshiaki Kawajiri, and Ronald W. Rousseau. Robust multicomponent IR-to-concentration model regression. Chemical Engineering Science, 116:77 - 90, 2014. [ bib | DOI | http ]
Abstract Infrared absorbance measurements can be made in situ and rapidly. Calibrating these measurements to give solution compositions can therefore yield a powerful tool for process monitoring and control. In many applications it is desirable to monitor the concentrations of multiple components in a complex solution under varying process conditions (which may introduce error in the absorbance measurements). Establishing a model that is capable of accurately predicting the concentrations of multiple components from infrared absorbance measurements that may be corrupted by error requires a carefully designed calibration procedure—a key part of which is model regression. In this article, a number of commonly used multivariate regression techniques are examined in the context of developing a model for simultaneously predicting the concentrations of four solutes from noisy infrared absorbance measurements. In addition, a tailored support vector regression algorithm—designed to produce a robust (measurement error-insensitive) calibration model—is developed, tested, and compared against these established regression algorithms.

Keywords: Multi-component calibration
[127] Elena Montañés, Ana Suárez-Vázquez, and José Ramón Quevedo. Ordinal classification/regression for analyzing the influence of superstars on spectators in cinema marketing. Expert Systems with Applications, 41(18):8101 - 8111, 2014. [ bib | DOI | http ]
Abstract This paper studies the influence of superstars on spectators in cinema marketing. Casting superstars is a common risk-mitigation strategy in the cinema industry. Anecdotal evidence suggests that the presence of superstars is not always a guarantee of success and hence, a deeper study is required to analyze the potential audience of a movie. In this sense, knowledge, attitudes and emotions of spectators towards stars are analyzed as potential factors of influencing the intention of seeing a movie with stars in its cast. This analysis is performed through machine learning techniques. In particular, the problem is stated as an ordinal classification/regression task rather than a traditional classification or regression task, since the intention of watching a movie is measured in a graded scale, hence, its values exhibit an order. Several methods are discussed for this purpose, but Support Vector Ordinal Regression shows its superiority over other ordinal classification/regression techniques. Moreover, exhaustive experiments carried out confirm that the formulation of the problem as an ordinal classification/regression is a success, since powerful traditional classifiers and regressors show worse performance. The study also confirms that talent and popularity expressed by means of knowledge, attitude and emotions satisfactorily explain superstar persuasion. Finally, the impact of these three components is also checked.

Keywords: Ordinal classification
[128] Ruijin Liao, Hanbo Zheng, Stanislaw Grzybowski, and Lijun Yang. Particle swarm optimization-least squares support vector regression based forecasting model on dissolved gases in oil-filled power transformers. Electric Power Systems Research, 81(12):2074 - 2080, 2011. [ bib | DOI | http ]
This paper presents a forecasting model based upon least squares support vector machine (LS-SVM) regression and particle swarm optimization (PSO) algorithm on dissolved gases in oil-filled power transformers. First, the LS-SVM regression model, with radial basis function (RBF) kernel, is established to facilitate the forecasting model. Then a global optimizer, PSO, is employed to optimize the hyper-parameters needed in LS-SVM regression. Afterward, a procedure is put forward to serve as an effective tool for forecasting of gas contents in transformer oil. The application of the proposed model on actual transformer gas data has given promising results. Moreover, four other forecasting models, derived from back propagation neural network (BPNN), radial basis function neural network (RBFNN), generalized regression neural network (GRNN) and support vector regression (SVR), are selected for comparisons. The experimental results further demonstrate that the proposed model achieves better forecasting performance than its counterparts under the circumstances of limited samples.

Keywords: Least squares support vector machine (LS-SVM)
[129] Ozgur Kisi and Mesut Cimen. A wavelet-support vector machine conjunction model for monthly streamflow forecasting. Journal of Hydrology, 399(1–2):132 - 140, 2011. [ bib | DOI | http ]
Summary The study investigates the accuracy of a wavelet and support vector machine conjunction model in monthly streamflow forecasting. The conjunction method is obtained by combining two methods, discrete wavelet transform and support vector machine, and is compared with the single support vector machine. Monthly flow data from two stations, Gerdelli Station on Canakdere River and Isakoy Station on Goksudere River, in the Eastern Black Sea region of Turkey are used in the study. The root mean square error (RMSE), mean absolute error (MAE) and correlation coefficient (R) statistics are used as the comparison criteria. The comparison of results reveals that the conjunction model could increase the forecast accuracy of the support vector machine model in monthly streamflow forecasting. For the Gerdelli and Isakoy stations, it is found that the conjunction models, with RMSE = 13.9 m³/s, MAE = 8.14 m³/s, R = 0.700 and RMSE = 8.43 m³/s, MAE = 5.62 m³/s, R = 0.768 in the test period, are superior in forecasting monthly streamflows to the most accurate support vector regression models, with RMSE = 15.7 m³/s, MAE = 10 m³/s, R = 0.590 and RMSE = 11.6 m³/s, MAE = 7.74 m³/s, R = 0.525, respectively.

Keywords: Monthly streamflows
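The three comparison statistics named in the abstract of [129] can be sketched as follows. This is a minimal illustration using the standard textbook definitions of RMSE, MAE and the Pearson correlation coefficient; the function names and the flow values are made up for the example and are not taken from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (R) between observations and forecasts."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical monthly flows in m^3/s (illustrative values only).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print("RMSE:", rmse(observed, predicted))
print("MAE: ", mae(observed, predicted))
print("R:   ", pearson_r(observed, predicted))
```

Lower RMSE and MAE and a higher R indicate the better forecast, which is the sense in which the abstract ranks the conjunction model above the single model.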
[130] Yang Zhao, Shengwei Wang, and Fu Xiao. A statistical fault detection and diagnosis method for centrifugal chillers based on exponentially-weighted moving average control charts and support vector regression. Applied Thermal Engineering, 51(1–2):560 - 572, 2013. [ bib | DOI | http ]
This paper presents a new fault detection and diagnosis (FDD) method for centrifugal chillers of building air-conditioning systems. Firstly, the Support Vector Regression (SVR) is adopted to develop the reference PI models. A new PI, namely the heat transfer efficiency of the sub-cooling section (ɛsc), is proposed to improve the FDD performance. Secondly, the Exponentially-Weighted Moving Average (EWMA) control charts are introduced to detect faults in a statistical way to improve the ratios of correctly detected points. Thirdly, when faults are detected, diagnosis follows which is based on a proposed FDD rule table. Six typical chiller component faults are concerned in this paper. This method is validated using the real-time experimental data from the ASHRAE RP-1043. Test results show that the combined use of SVR and EWMA can achieve the best performance. Results also show that significant improvements are achieved compared with a commonly used method using Multiple Linear Regression (MLR) and t-statistic.

Keywords: Fault detection
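The EWMA control chart mentioned in the abstract of [130] can be illustrated with a minimal sketch. It assumes the textbook EWMA recursion and time-varying control limits; the smoothing weight, limit width, noise level and residual values are hypothetical and do not come from the paper, whose charts are applied to the outputs of its SVR reference models.

```python
import math


def ewma_chart(values, target, sigma, lam=0.2, width=3.0):
    """Return one (z, lower, upper, alarm) tuple per sample.

    z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = target.
    Limits: target +/- width * sigma * sqrt(lam/(2-lam) * (1 - (1-lam)^(2t))).
    """
    z = target
    rows = []
    for t, x in enumerate(values, start=1):
        z = lam * x + (1.0 - lam) * z
        half = width * sigma * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        rows.append((z, target - half, target + half, abs(z - target) > half))
    return rows


# Hypothetical residuals (measured minus model-predicted value):
# small noise for five samples, then a sustained positive shift.
residuals = [0.1, -0.2, 0.0, 0.15, -0.1, 0.8, 1.2, 0.9, 1.1, 1.3]
rows = ewma_chart(residuals, target=0.0, sigma=0.2)
alarms = [alarm for (_, _, _, alarm) in rows]

for t, (z, lo, hi, alarm) in enumerate(rows, start=1):
    print(f"t={t:2d}  z={z:+.3f}  limits=[{lo:+.3f}, {hi:+.3f}]  alarm={alarm}")
```

Because the statistic averages over recent samples, a single large residual does not necessarily trigger an alarm, while a sustained shift does after a short delay.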
[131] Samuele Salti and Luigi Di Stefano. On-line support vector regression of the transition model for the Kalman filter. Image and Vision Computing, 31(6–7):487 - 501, 2013. Machine learning in motion analysis: New advances. [ bib | DOI | http ]
Recursive Bayesian Estimation (RBE) is a widespread solution for visual tracking as well as for applications in other domains where a hidden state is estimated recursively from noisy measurements. From a practical point of view, deployment of RBE filters is limited by the assumption of complete knowledge on the process and measurement statistics. These missing tokens of information lead to an approximate or even uninformed assignment of filter parameters. Unfortunately, the use of the wrong transition or measurement model may lead to large estimation errors or to divergence, even when the otherwise optimal filter is deployed. In this paper on-line learning of the transition model via Support Vector Regression is proposed. The specialization of this general framework for linear/Gaussian filters, which we dub Support Vector Kalman (SVK), is then introduced and shown to outperform a standard, non adaptive Kalman filter as well as a widespread solution to cope with unknown transition models such as the Interacting Multiple Models (IMM) filter.

Keywords: Adaptive transition model
[132] Jianjun Wang, Li Li, Dongxiao Niu, and Zhongfu Tan. An annual load forecasting model based on support vector regression with differential evolution algorithm. Applied Energy, 94:65 - 70, 2012. [ bib | DOI | http ]
Annual load forecasting is very important for the electric power industry. As influenced by various factors, an annual load curve shows a non-linear characteristic, which demonstrates that the annual load forecasting is a non-linear problem. Support vector regression (SVR) is proven to be useful in dealing with non-linear forecasting problems in recent years. The key point in using SVR for forecasting is how to determine the appropriate parameters. This paper proposes a hybrid load forecasting model combining differential evolution (DE) algorithm and support vector regression to deal with this problem, where the DE algorithm is used to choose the appropriate parameters for the SVR load forecasting model. The effectiveness of this model has been proved by the final simulation which shows that the proposed model outperforms the SVR model with default parameters, back propagation artificial neural network (BPNN) and regression forecasting models in the annual load forecasting.

Keywords: Support vector regression (SVR)
[133] Zaobao Liu, Jianfu Shao, Weiya Xu, Yu Zhang, and Hongjie Chen. Prediction of elastic compressibility of rock material with soft computing techniques. Applied Soft Computing, 22:118 - 125, 2014. [ bib | DOI | http ]
Abstract Mechanical and physical properties of sandstone are interesting scientifically and have great practical significance as well as their relations to the mineralogy and pore features. These relations are however highly nonlinear and cannot be easily formulated by conventional methods. This paper investigates the potential of the technique named as the relevance vector machine (RVM) for prediction of the elastic compressibility of sandstone based on its characteristics of physical properties. Based on the fact that the hyper-parameters may have effects on the RVM performance, an iteration method is proposed in this paper to search for optimal hyper-parameter value so that it can produce best predictions. Also, the qualitative sensitivity of the physical properties is investigated by the backward regression analysis. Meanwhile, the hyper-parameter effect of the RVM approach is discussed in the prediction of the elastic compressibility of sandstone. The predicted results of the RVM demonstrate that hyper-parameter values have evident effects on the RVM performance. Comparisons on the results of the RVM, the artificial neural network and the support vector machine prove that the proposed strategy is feasible and reliable for prediction of the elastic compressibility of sandstone based on its physical properties.

Keywords: Soft computing
[134] Kuo-Ping Lin, Ping-Feng Pai, Yu-Ming Lu, and Ping-Teng Chang. Revenue forecasting using a least-squares support vector regression model in a fuzzy environment. Information Sciences, 220:196 - 209, 2013. Online Fuzzy Machine Learning and Data Mining. [ bib | DOI | http ]
Revenue forecasting is difficult but essential for companies that want to create high-quality revenue budgets, especially in an uncertain economic environment with changing government policies. Under these conditions, the subjective judgment of decision makers is a crucial factor in making accurate forecasts. This investigation develops a fuzzy least-squares support vector regression model with genetic algorithms (FLSSVRGA) to forecast seasonal revenues. The FLSSVRGA uses the H-level to control the possibility distribution range yielded by the fuzzy model and to provide the fuzzy prediction interval. Depending on various factors, such as the global economy and government policies, a decision maker can elect a different level for H using the FLSSVRGA. The proposed FLSSVRGA model is a rolling forecasting model with time series data updated monthly that predicts revenue for the coming month. Four other forecasting models: the seasonal autoregressive integrated moving average (SARIMA), generalized regression neural networks (GRNN), support vector regression with genetic algorithms (SVRGA) and least-squares support vector regression with genetic algorithms (LSSVRGA), are employed to forecast the same data sets. The experimental results indicate that the FLSSVRGA model outperforms all four models in terms of forecasting accuracy. Thus, the FLSSVRGA model is a useful alternative for forecasting seasonal time series data in an uncertain environment; it can provide a user-defined fuzzy prediction interval for decision makers.

Keywords: Least-squares support vector regression
[135] Ming-Wei Li, Duan-Feng Han, and Wen long Wang. Vessel traffic flow forecasting by RSVR with chaotic cloud simulated annealing genetic algorithm and KPCA. Neurocomputing, 157:243 - 255, 2015. [ bib | DOI | http ]
Abstract The prediction of vessel traffic flow is complicated; its accuracy is influenced by uncertain socio-economic factors, especially by the singular points existing in the statistical data. Recently, the robust ν-support vector regression model (RSVR) has been successfully employed to solve non-linear regression and time-series problems with singular points. This paper first proposes a novel hybrid algorithm, namely the chaotic cloud simulated annealing genetic algorithm (CcatCSAGA), for optimizing the parameters of RSVR to improve the performance of vessel traffic flow prediction. The proposed CcatCSAGA employs cat mapping to carefully expand the variable searching space and overcome premature local optima, and uses the cloud model to search efficiently for a better solution in a small neighborhood of the current optimal solution, improving the search efficiency. Secondly, the kernel principal component analysis (KPCA) algorithm is adopted to determine the final input vectors from the candidate input variables. Finally, a numerical example of vessel traffic flow and its influence factors data from Tianjin is employed to test the forecasting performance of the proposed KRSVR-CcatCSAGA model.

Keywords: Vessel traffic flow forecasting
[136] Wei Zhang, Leiqing Pan, Sicong Tu, Ge Zhan, and Kang Tu. Non-destructive internal quality assessment of eggs using a synthesis of hyperspectral imaging and multivariate analysis. Journal of Food Engineering, 157:41 - 48, 2015. [ bib | DOI | http ]
Abstract The study develops a nondestructive test based on hyperspectral imaging using a combination of existing analytical techniques to determine the internal quality of eggs, including freshness, bubble formation or scattered yolk. Successive projections algorithm (SPA) combined with support vector regression established a freshness detection model, which achieved a determination coefficient of 0.87, a root mean squared error of 4.01%, and the ratio of prediction to deviation of 2.80 in the validation set. In addition, eggs with internal bubbles and scattered yolk could be discriminated by support vector classification (SVC) model with identification accuracy of 90.0% and 96.3% respectively. Our findings suggest that hyperspectral imaging can be useful to non-destructively and rapidly assess egg internal quality.

Keywords: Egg internal quality
[137] Xixiang Yang and Weihua Zhang. A faster optimization method based on support vector regression for aerodynamic problems. Advances in Space Research, 52(6):1008 - 1017, 2013. [ bib | DOI | http ]
Abstract In this paper, a new strategy for optimal design of complex aerodynamic configuration with a reasonable low computational effort is proposed. In order to solve the formulated aerodynamic optimization problem with heavy computation complexity, two steps are taken: (1) a sequential approximation method based on support vector regression (SVR) and hybrid cross validation strategy, is proposed to predict aerodynamic coefficients, and thus approximates the objective function and constraint conditions of the originally formulated optimization problem with given limited sample points; (2) a sequential optimization algorithm is proposed to ensure the obtained optimal solution by solving the approximation optimization problem in step (1) is very close to the optimal solution of the originally formulated optimization problem. In the end, we adopt a complex aerodynamic design problem, that is optimal aerodynamic design of a flight vehicle with grid fins, to demonstrate our proposed optimization methods, and numerical results show that better results can be obtained with a significantly lower computational effort than using classical optimization techniques.

Keywords: Aerodynamic configuration
[138] Yongping Zhao and Jianguo Sun. Recursive reduced least squares support vector regression. Pattern Recognition, 42(5):837 - 842, 2009. [ bib | DOI | http ]
Combining reduced technique with iterative strategy, we propose a recursive reduced least squares support vector regression. The proposed algorithm chooses the data which make more contribution to target function as support vectors, and it considers all the constraints generated by the whole training set. Thus it acquires less support vectors, the number of which can be arbitrarily predefined, to construct the model with the similar generalization performance. In comparison with other methods, our algorithm also gains excellent parsimoniousness. Numerical experiments on benchmark data sets confirm the validity and feasibility of the presented algorithm. In addition, this algorithm can be extended to classification.

Keywords: Least squares support vector regression
[139] Feilong Cao and Yubo Yuan. Learning errors of linear programming support vector regression. Applied Mathematical Modelling, 35(4):1820 - 1828, 2011. [ bib | DOI | http ]
In this paper, we give several results of learning errors for linear programming support vector regression. The corresponding theorems are proved in the reproducing kernel Hilbert space. With the covering number, the approximation property and the capacity of the reproducing kernel Hilbert space are measured. The obtained result (Theorem 2.1) shows that the learning error can be controlled by the sample error and regularization error. The mentioned sample error is summarized by the errors of learning regression function and regularizing function in the reproducing kernel Hilbert space. After estimating the generalization error of learning regression function (Theorem 2.2), the upper bound (Theorem 2.3) of the regularized learning algorithm associated with linear programming support vector regression is estimated.

Keywords: Regression
[140] Jie Liu, Redouane Seraoui, Valeria Vitelli, and Enrico Zio. Nuclear power plant components condition monitoring by probabilistic support vector machine. Annals of Nuclear Energy, 56:23 - 33, 2013. [ bib | DOI | http ]
In this paper, an approach for the prediction of the condition of Nuclear Power Plant (NPP) components is proposed, for the purposes of condition monitoring. It builds on a modified version of the Probabilistic Support Vector Regression (PSVR) method, which is based on the Bayesian probabilistic paradigm with a Gaussian prior. Specific techniques are introduced for the tuning of the PSVR hyperparameters, the model identification and the uncertainty analysis. A real case study is considered, regarding the prediction of a drifting process parameter of an NPP component.

Keywords: Probabilistic support vector machine
[141] Zhenbo Wei, Jun Wang, and Yongwei Wang. Classification of monofloral honeys from different floral origins and geographical origins based on rheometer. Journal of Food Engineering, 96(3):469 - 479, 2010. [ bib | DOI | http ]
A rheometer was used to classify commercial honeys. Five kinds of Yichun honeys from different floral origins and five kinds of Acacia honeys from different geographical origins were classified based on a rheometer by four pattern recognition techniques: Principal Component Analysis (PCA), Cluster Analysis (CA), Partial Least Squares (PLS), and Support Vector Machines (SVM). All the samples for different floral origins or different geographical origins were demarcated clearly by PCA and PLS. The samples from different floral origins could be classified by SVM, and the samples from different geographical origins also had a high correct classification rate (97.5%). The classification rates for different floral origins and geographical origins were 95% and 97.50% by CA, respectively. Three regression models, Principal Component Regression Analysis (PCR), Partial Least Squares Regression (PLSR) and Support Vector Regression (SVR), were used for category forecast. The regression analysis showed that SVR with radial basis function kernel worked most effectively.

Keywords: Rheometer
[142] Geraldo da Silva e Souza and Eliane Gonçalves Gomes. A performance measure to support decision-making in agricultural research centers in Brazil. Procedia Computer Science, 55:405 - 414, 2015. 3rd International Conference on Information Technology and Quantitative Management, ITQM 2015. [ bib | DOI | http ]
Abstract The assessment of productive efficiency of a public research institution is of fundamental importance for its administration. A better management of available resources may be accomplished if managers have at their disposal meaningful quantitative measurements of the production process. In this paper we use Multivariate Analysis and Data Envelopment Analysis to define a performance measure for the research centers of the Brazilian Agricultural Research Corporation. Multiple production indicators are reduced to three output variables by means of maximum likelihood factor analysis. Performance is determined on the basis of this output vector and a three dimensional input vector defined by cost components. We impose restrictions on the optimization algorithm to guarantee usage of all outputs and inputs in the optimal solutions. Types of research centers are compared by using fractional regression models, quasi-maximum likelihood estimation and bootstrap. The analysis also provides a weighting system to compute a goal achievement index and therefore support managerial decision-making.

Keywords: Factor Analysis
[143] Siqi Yi, Yong Shi, and Yibing Chen. Establishment of China information technology outsourcing early warning index based on SVR. Procedia Computer Science, 55:802 - 808, 2015. 3rd International Conference on Information Technology and Quantitative Management, ITQM 2015. [ bib | DOI | http ]
Abstract Information technology outsourcing (ITO) in China has developed fast and plays an increasingly important role in the economic development of China. Economic analysis and an early warning system for information technology outsourcing, which reflect the status of ITO, can promote the healthy development of the industry. This paper constructed the indicator system by the method of time difference relevance and peak-valley. The weight vector of each indicator is attained by using support vector regression. It also calculated the comprehensive early warning index and established the early warning index system. Finally, a group of signal lamps is used to reflect the status at every time. Based on the reality of ITO in China, this paper found that the development speed of ITO has been slowing in recent months and that the government should take some positive measures.

Keywords: information technology outsourcing
[144] Jing Geng, Min-Liang Huang, Ming-Wei Li, and Wei-Chiang Hong. Hybridization of seasonal chaotic cloud simulated annealing algorithm in a SVR-based load forecasting model. Neurocomputing, 151, Part 3:1362 - 1373, 2015. [ bib | DOI | http ]
Abstract Support vector regression with chaotic sequence and simulated annealing algorithm has shown, in previous forecasting research, its superiority in effectively avoiding being trapped in a local optimum. However, the chaotic simulated annealing (CSA) algorithm proposed in previously published literature, as well as the original SA algorithm, could not realize the mechanism of temperature decreasing continuously. In addition, many chaotic sequences adopt the Logistic mapping function, whose values are concentrated at both ends of the interval [0,1]; thus, it could not excellently strengthen the chaotic distribution characteristics. To continue exploring any possible improvements of the proposed CSA and chaotic sequence, this paper employs the innovative cloud theory to be hybridized with CSA to overcome the discrete temperature annealing process, and applies the Cat mapping function to ensure the chaotic distribution characteristics. Furthermore, a seasonal mechanism is also proposed to fit the cyclic tendency of electric load, caused by economic activities or the cyclic nature of climate. This investigation eventually presents a load forecasting model which hybridizes the seasonal support vector regression model and the chaotic cloud simulated annealing algorithm (namely SSVRCCSA) to achieve more accurate forecasting performance. Experimental results indicate that the proposed SSVRCCSA model yields more accurate forecasting results than other alternatives.

Keywords: Support vector regression (SVR)
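The remark in the abstract of [144] that Logistic-map sequences pile up at both ends of [0,1] can be checked with a short sketch. It assumes the fully chaotic logistic map x_{n+1} = 4 x_n (1 - x_n); the seed, sequence length and decile binning are arbitrary choices for illustration and are not taken from the paper.

```python
def logistic_sequence(x0, n, r=4.0):
    """Generate n iterates of the logistic map x -> r * x * (1 - x)."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


def decile_shares(xs):
    """Fraction of values falling in each of the ten bins [0, 0.1), ..., [0.9, 1]."""
    counts = [0] * 10
    for x in xs:
        counts[min(int(x * 10), 9)] += 1
    return [c / len(xs) for c in counts]


# Arbitrary seed and length (assumptions made for this illustration).
shares = decile_shares(logistic_sequence(0.123, 20000))
end_share = shares[0] + shares[-1]

for i, s in enumerate(shares):
    print(f"[{i / 10:.1f}, {(i + 1) / 10:.1f}): {s:.3f}")
print("share in the two outer bins:", round(end_share, 3))
```

A uniformly distributed sequence would place about 20% of its values in the two outer bins. For r = 4 the invariant density of the logistic map is 1/(π√(x(1−x))), which puts roughly 41% there, so the outer bins should come out clearly heavier than the middle ones.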
[145] Ömer Eskidere, Figen Ertaş, and Cemal Hanilçi. A comparison of regression methods for remote tracking of Parkinson’s disease progression. Expert Systems with Applications, 39(5):5523 - 5528, 2012. [ bib | DOI | http ]
Remote patient tracking has recently gained increased attention, due to its lower cost and non-invasive nature. In this paper, the performance of Support Vector Machines (SVM), Least Square Support Vector Machines (LS-SVM), Multilayer Perceptron Neural Network (MLPNN), and General Regression Neural Network (GRNN) regression methods is studied in application to remote tracking of Parkinson’s disease progression. Results indicate that the LS-SVM provides the best performance among the other three, and its performance is superior to that of the latest proposed regression method published in the literature.

Keywords: Parkinson’s disease
[146] Hongdong Li, Yizeng Liang, and Qingsong Xu. Support vector machines and its applications in chemistry. Chemometrics and Intelligent Laboratory Systems, 95(2):188 - 198, 2009. [ bib | DOI | http ]
Support vector machines (SVMs) are a promising machine learning method originally developed for pattern recognition problems based on structural risk minimization. Functionally, SVMs can be divided into two categories: support vector classification (SVC) machines and support vector regression (SVR) machines. According to this classification, their basic elements and algorithms are discussed in some detail, and selected applications on two real world datasets and two simulated datasets are conducted to elucidate the good generalization performance of SVMs, which are especially good for treating data with some nonlinearity.

Keywords: Support vector machines
[147] Shanshan Qiu, Jun Wang, Chen Tang, and Dongdong Du. Comparison of ELM, RF, and SVM on E-nose and E-tongue to trace the quality status of mandarin (Citrus unshiu Marc.). Journal of Food Engineering, 166:193 - 203, 2015. [ bib | DOI | http ]
Abstract This paper demonstrates a joint way employing both an electronic nose (E-nose) and an electronic tongue (E-tongue) to discriminate two types of satsuma mandarins from different development stages and to trace the internal quality changes (i.e. ascorbic acid, soluble solids content, total acid, and sugar/acid ratio). Extreme Learning Machine (ELM), Random Forest (RF) and Support Vector Machine (SVM) were applied for qualitative classification and quantitative prediction. The models were compared according to accuracy rate and regression parameters. For classification, the three systems (E-nose, E-tongue, and the fusion system) achieved perfect results respectively. For internal quality prediction, the RF and ELM models obtained better performance than the SVM models. The fusion systems had an advantage when compared with the signal system. This study shows that the E-nose and E-tongue systems combined with RF or ELM could be a fast and objective detection system to trace fruit internal quality changes.

Keywords: E-nose
[148] Fei Feng, Qiongshui Wu, and Libo Zeng. Rapid analysis of diesel fuel properties by near infrared reflectance spectra. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 149:271 - 278, 2015. [ bib | DOI | http ]
Abstract In this study, based on near infrared reflectance spectra (NIRS) of 441 samples from four diesel groups (−10# diesel, −20# diesel, −35# diesel, and inferior diesel), three spectral analysis models were established by using partial least square (PLS) regression for the six diesel properties (i.e., boiling point, cetane number, density, freezing temperature, total aromatics, and viscosity) respectively. In model 1, all the samples were processed as a whole; in model 2 and model 3, samples were firstly classified into four groups by least square support vector machine (LS-SVM), and then partial least square regression models were applied to each group and each property. The main difference between model 2 and model 3 was that the latter used the direct orthogonal signal correction (DOSC), which helped to get rid of the non-relevant variation in the spectra. Comparing these three models, two results could be concluded: (1) models for grouped samples had higher precision and smaller prediction error; (2) models with DOSC after LS-SVM classification yielded a considerable error reduction compared to models without DOSC.

Keywords: Near infrared reflectance spectra
[149] Rolands Kromanis and Prakash Kripakaran. Predicting thermal response of bridges using regression models derived from measurement histories. Computers & Structures, 136:64 - 77, 2014. [ bib | DOI | http ]
Abstract This study investigates the application of novel computational techniques for structural performance monitoring of bridges that enable quantification of temperature-induced response during the measurement interpretation process. The goal is to support evaluation of bridge response to diurnal and seasonal changes in environmental conditions, which have widely been cited to produce significantly large deformations that exceed even the effects of live loads and damage. This paper proposes a regression-based methodology to generate numerical models, which capture the relationships between temperature distributions and structural response, from distributed measurements collected during a reference period. It compares the performance of various regression algorithms such as multiple linear regression (MLR), robust regression (RR) and support vector regression (SVR) for application within the proposed methodology. The methodology is successfully validated on measurements collected from two structures – a laboratory truss and a concrete footbridge. Results show that the methodology is capable of accurately predicting thermal response and can therefore help with interpreting measurements from continuous bridge monitoring.

Keywords: Structural health monitoring
[150] JinXing Che. Support vector regression based on optimal training subset and adaptive particle swarm optimization algorithm. Applied Soft Computing, 13(8):3473 - 3481, 2013. [ bib | DOI | http ]
Abstract Support vector regression (SVR) has become very promising and popular in the field of machine learning due to its attractive features and strong empirical performance for small-sample, nonlinear and high-dimensional data applications. However, most existing support vector regression learning algorithms are limited by parameter selection and slow learning for large samples. This paper considers an adaptive particle swarm optimization (APSO) algorithm for the parameter selection of the support vector regression model. In order to accelerate its training process while keeping highly accurate forecasting in each parameter selection step of the APSO iteration, an optimal training subset (OTS) method is carried out to choose the representative data points of the full training data set. Furthermore, the optimal parameter setting of SVR and the optimal size of OTS are studied preliminarily. Experimental results on a UCI data set and electric load forecasting in New South Wales show that the proposed model is effective and produces better generalization performance.

Keywords: Support vector regression
[151] Nadia Abd-Alsabour. Investigating the effect of fixing the subset length on the performance of ant colony optimization for feature selection for supervised learning. Computers & Electrical Engineering, 45:1 - 9, 2015. [ bib | DOI | http ]
Abstract This paper studies the effect of fixing the length of the selected feature subsets on the performance of ant colony optimization (ACO) for feature selection (FS) for supervised learning. It addresses this concern by investigating: (1) determining the optimal feature subset from a data mining perspective, (2) demonstrating the solution convergence in case of fixing the length of the selected feature subsets, (3) determining the subset length in ACO for subset selection problems, and (4) different stopping criteria when solving FS by ACO. Besides, two types of experiments on ACO algorithms for FS for classification and regression problems were carried out using artificial and real world datasets in two cases, fixing and not fixing the length of the selected feature subsets, with the use of a support vector machine. The obtained results showed that not fixing the length of the selected feature subsets is better than fixing the length of the selected feature subsets.

Keywords: Ant colony optimization
[152] Yunling Liu, Lan Tao, Jianjun Lu, Shuo Xu, Qin Ma, and Qingling Duan. A novel force field parameter optimization method based on LSSVR for ECEPP. FEBS Letters, 585(6):888 - 892, 2011. [ bib | DOI | http ]
In this paper, we propose a novel force field parameter optimization method based on LSSVR and optimize the torsion energy parameters of the ECEPP force field. In this method, the force field parameter optimization problem is turned into a support vector regression problem. Protein samples for regression model training are chosen from the Protein Data Bank. The experiments show that the optimized force-field parameters make both α-helix and β-hairpin structures more consistent with the experimental implications than the original parameters.

Keywords: Force field
[153] Zeynab Ramedani, Mahmoud Omid, Alireza Keyhani, Benyamin Khoshnevisan, and Hadi Saboohi. A comparative study between fuzzy linear regression and support vector regression for global solar radiation prediction in Iran. Solar Energy, 109:135 - 143, 2014. [ bib | DOI | http ]
Abstract Energy is fundamental to, and plays a prominent role in, the quality of life. Sustainable energy is important for the benefits it yields. Sustainable energy technologies are clean sources of energy that have a much lower environmental impact than conventional energy technologies. Among the different forms of clean energy, solar energy has attracted a lot of attention as it is not only sustainable, but is also renewable. Because the number of meteorological stations where global solar radiation (GSR) is recorded is limited in Iran, the aim was to develop three distinctive models in order to prognosticate GSR in Tehran Province, Iran. Accordingly, the fuzzy linear regression (FLR), polynomial and radial basis function (RBF) were applied as the kernel function of support vector regression (SVR). Input energies from different meteorological data obtained from the only station in the study region were selected as the model inputs while GSR was chosen as the model output. Instead of minimizing the observed training error, SVR_poly and SVR_rbf attempted to minimize the generalization error bounds so as to achieve generalized performance. The experimental results show that it is possible to achieve enhanced predictive accuracy and capability of generalization via the proposed approach. The calculated root mean square error and correlation coefficient disclosed that SVR_rbf performed well in predicting GSR compared with FLR.

Keywords: Renewable energy
[154] Jianhong Yang, Cancan Yi, Jinwu Xu, and Xianghong Ma. Laser-induced breakdown spectroscopy quantitative analysis method via adaptive analytical line selection and relevance vector machine regression model. Spectrochimica Acta Part B: Atomic Spectroscopy, 107:45 - 55, 2015. [ bib | DOI | http ]
Abstract A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.

Keywords: Laser-induced breakdown spectroscopy
[155] Daniel Mirman, Yongsheng Zhang, Ze Wang, H. Branch Coslett, and Myrna F. Schwartz. The ins and outs of meaning: Behavioral and neuroanatomical dissociation of semantically-driven word retrieval and multimodal semantic recognition in aphasia. Neuropsychologia, pages -, 2015. [ bib | DOI | http ]
Abstract Theories about the architecture of language processing differ with regard to whether verbal and nonverbal comprehension share a functional and neural substrate and how meaning extraction in comprehension relates to the ability to use meaning to drive verbal production. We (re-)evaluate data from 17 cognitive-linguistic performance measures of 99 participants with chronic aphasia using factor analysis to establish functional components and support vector regression-based lesion-symptom mapping to determine the neural correlates of deficits on these functional components. The results are highly consistent with our previous findings: production of semantic errors is behaviorally and neuroanatomically distinct from verbal and nonverbal comprehension. Semantic errors were most strongly associated with left ATL damage whereas deficits on tests of verbal and non-verbal semantic recognition were most strongly associated with damage to deep white matter underlying the frontal lobe at the confluence of multiple tracts, including the inferior fronto-occipital fasciculus, the uncinate fasciculus, and the anterior thalamic radiations. These results suggest that traditional views based on grey matter hub(s) for semantic processing are incomplete and that the role of white matter in semantic cognition has been underappreciated.

Keywords: Semantic memory
[156] Stefan J. Teipel, Jens Kurth, Bernd Krause, and Michel J. Grothe. The relative importance of imaging markers for the prediction of Alzheimer's disease dementia in mild cognitive impairment – beyond classical regression. NeuroImage: Clinical, 8:583 - 593, 2015. [ bib | DOI | http ]
Abstract Selecting a set of relevant markers to predict conversion from mild cognitive impairment (MCI) to Alzheimer's disease (AD) has become a challenging task given the wealth of regional pathologic information that can be extracted from multimodal imaging data. Here, we used regularized regression approaches with an elastic net penalty for best subset selection of multiregional information from AV45-PET, FDG-PET and volumetric MRI data to predict conversion from MCI to AD. The study sample consisted of 127 MCI subjects from ADNI-2 who had a clinical follow-up between 6 and 31 months. Additional analyses assessed the effect of partial volume correction on predictive performance of AV45- and FDG-PET data. Predictor variables were highly collinear within and across imaging modalities. Penalized Cox regression yielded more parsimonious prediction models compared to unpenalized Cox regression. Within single modalities, time to conversion was best predicted by increased AV45-PET signal in posterior medial and lateral cortical regions, decreased FDG-PET signal in medial temporal and temporobasal regions, and reduced gray matter volume in medial, basal, and lateral temporal regions. Logistic regression models reached up to 72% cross-validated accuracy for prediction of conversion status, which was comparable to cross-validated accuracy of non-linear support vector machine classification. Regularized regression outperformed unpenalized stepwise regression when number of parameters approached or exceeded the number of training cases. Partial volume correction had a negative effect on the predictive performance of AV45-PET, but slightly improved the predictive value of FDG-PET data. Penalized regression yielded more parsimonious models than unpenalized stepwise regression for the integration of multiregional and multimodal imaging information. The advantage of penalized regression was particularly strong with a high number of collinear predictors.

[157] Jennifer N. Cooper, Lai Wei, Soledad A. Fernandez, Peter C. Minneci, and Katherine J. Deans. Pre-operative prediction of surgical morbidity in children: Comparison of five statistical models. Computers in Biology and Medicine, 57:54 - 65, 2015. [ bib | DOI | http ]
Abstract Background: The accurate prediction of surgical risk is important to patients and physicians. Logistic regression (LR) models are typically used to estimate these risks. However, in the fields of data mining and machine learning, many alternative classification and prediction algorithms have been developed. This study aimed to compare the performance of LR to several data mining algorithms for predicting 30-day surgical morbidity in children. Methods: We used the 2012 National Surgical Quality Improvement Program-Pediatric dataset to compare the performance of (1) an LR model that assumed linearity and additivity (simple LR model), (2) an LR model incorporating restricted cubic splines and interactions (flexible LR model), (3) a support vector machine, (4) a random forest, and (5) boosted classification trees for predicting surgical morbidity. Results: The ensemble-based methods showed significantly higher accuracy, sensitivity, specificity, PPV, and NPV than the simple LR model. However, none of the models performed better than the flexible LR model in terms of the aforementioned measures or in model calibration or discrimination. Conclusion: Support vector machines, random forests, and boosted classification trees do not show better performance than LR for predicting pediatric surgical morbidity. After further validation, the flexible LR model derived in this study could be used to assist with clinical decision-making based on patient-specific surgical risks.

Keywords: Data mining
[158] Hu Yuxia and Zhang Hongtao. Prediction of the chaotic time series based on chaotic simulated annealing and support vector machine. Physics Procedia, 25:506 - 512, 2012. International Conference on Solid State Devices and Materials Science, April 1-2, 2012, Macao. [ bib | DOI | http ]
The regression accuracy and generalization performance of the support vector regression (SVR) model depend on a proper setting of its parameters. An optimal selection approach for SVR parameters was put forward based on the chaotic simulated annealing algorithm (CSAA); the key parameters C and ɛ of SVM and the radial basis kernel parameter g were optimized within the global scope. The support vector regression model was established for chaotic time series prediction by using the optimum parameters. The time series of the Lorenz system was used to verify the effectiveness of the model. The root mean square error of prediction reached 8.756 × 10⁻⁴. Simulation results show that the optimal selection approach based on CSAA is effective and the CSAA-SVR model can predict the chaotic time series accurately.

Keywords: support vector machine
[159] Hu Yuxia and Zhang Hongtao. Chaos optimization method of {SVM} parameters selection for chaotic time series forecasting. Physics Procedia, 25:588 - 594, 2012. International Conference on Solid State Devices and Materials Science, April 1-2, 2012, Macao. [ bib | DOI | http ]
For support vector regression (SVR), the setting of key parameters is very important, as it determines the regression accuracy and generalization performance of the SVR model. In this paper, an optimal selection approach for SVR parameters was put forward based on the mutative scale optimization algorithm (MSCOA); the key parameters C and ɛ of SVM and the radial basis kernel parameter g were optimized within the global scope. The support vector regression model was established for chaotic time series prediction by using the optimum parameters. The time series of the Lorenz system was used to verify the effectiveness of the model. The root mean square error of prediction reached RMSE = 3.0335 × 10⁻³. Simulation results show that the optimal selection approach based on MSCOA is an effective approach and the MSCOA-SVR model has a good performance for chaotic time series forecasting.

Keywords: support vector machine
[160] Hui YI, Xiao-Feng SONG, Bin JIANG, Yu-Fang LIU, and Zhi-Hua ZHOU. Flexible support vector regression and its application to fault detection. Acta Automatica Sinica, 39(3):272 - 284, 2013. [ bib | DOI | http ]
Abstract Hyper-parameters, which determine the ability of learning and generalization for support vector regression (SVR), are usually fixed during training. Thus when SVR is applied to complex system modeling, this parameters-fixed strategy leaves the SVR in a dilemma of selecting rigorous or slack parameters due to complicated distributions of sample dataset. Therefore in this paper we proposed a flexible support vector regression (F-SVR) in which parameters are adaptive to sample dataset distributions during training. The method F-SVR divides the training sample dataset into several domains according to the distribution complexity, and generates a different parameter set for each domain. The efficacy of the proposed method is validated on an artificial dataset, where F-SVR yields better generalization ability than conventional SVR methods while maintaining good learning ability. Finally, we also apply F-SVR successfully to practical fault detection of a high frequency power supply.

Keywords: Support vector regression (SVR)
[161] G. Farias, S. Dormido-Canto, J. Vega, and N. Díaz. Initial results with time series forecasting of TJ-II heliac waveforms. Fusion Engineering and Design, pages -, 2015. [ bib | DOI | http ]
Abstract This article discusses how to apply forecasting techniques to predict future samples of plasma signals during a discharge. One application of the forecasting could be to detect anomalous behaviors in fusion waveforms in real time. The work describes the implementation of three prediction techniques, two of them based on machine learning methods, namely artificial neural networks and support vector machines for regression. The results show that, depending on the temporal horizon, the predictions match the real samples in most cases with an error of less than 5%; moreover, forecasting five samples ahead can reach an accuracy of over 90% in most of the signals analyzed.

Keywords: Signals
[162] K.C. Assi, H. Labelle, and F. Cheriet. Statistical model based 3D shape prediction of postoperative trunks for non-invasive scoliosis surgery planning. Computers in Biology and Medicine, 48:85 - 93, 2014. [ bib | DOI | http ]
Abstract One of the major concerns of scoliosis patients undergoing surgical treatment is the aesthetic aspect of the surgery outcome. It would be useful to predict the postoperative appearance of the patient trunk in the course of a surgery planning process in order to take into account the expectations of the patient. In this paper, we propose to use least squares support vector regression for the prediction of the postoperative trunk 3D shape after spine surgery for adolescent idiopathic scoliosis. Five dimensionality reduction techniques used in conjunction with the support vector machine are compared. The methods are evaluated in terms of their accuracy, based on the leave-one-out cross-validation performed on a database of 141 cases. The results indicate that the 3D shape predictions using a dimensionality reduction obtained by simultaneous decomposition of the predictors and response variables have the best accuracy.

Keywords: Scoliosis
[163] M.M. Krell, D. Feess, and S. Straube. Balanced relative margin machine – the missing piece between FDA and SVM classification. Pattern Recognition Letters, 41:43 - 52, 2014. Supervised and Unsupervised Classification Techniques and their Applications. [ bib | DOI | http ]
Abstract In this theoretical work we approach the class of relative margin classification algorithms from the mathematical programming perspective. In particular, we propose a Balanced Relative Margin Machine (BRMM) and then extend it by a 1-norm regularization. We show that this new classifier concept connects Support Vector Machines (SVM) with Fisher’s Discriminant Analysis (FDA) by the insertion of a range parameter. It is also strongly connected to the Support Vector Regression. Using this BRMM it is now possible to optimize the classifier type instead of choosing it beforehand. We verify our findings empirically by means of simulated and benchmark data.

Keywords: Support vector machines
[164] Chih-Fong Tsai and Che-Wei Chang. SVOIS: Support vector oriented instance selection for text classification. Information Systems, 38(8):1070 - 1083, 2013. [ bib | DOI | http ]
Abstract Automatic text classification is usually based on models constructed through learning from training examples. However, as the size of text document repositories grows rapidly, the storage requirements and computational cost of model learning are becoming ever higher. Instance selection is one solution to overcoming this limitation. The aim is to reduce the amount of data by filtering out noisy data from a given training dataset. A number of instance selection algorithms have been proposed in the literature, such as ENN, IB3, ICF, and DROP3. However, all of these methods have been developed for the k-nearest neighbor (k-NN) classifier. In addition, their performance has not been examined over the text classification domain where the dimensionality of the dataset is usually very high. The support vector machines (SVM) are core text classification techniques. In this study, a novel instance selection method, called Support Vector Oriented Instance Selection (SVOIS), is proposed. First of all, a regression plane in the original feature space is identified by utilizing a threshold distance between the given training instances and their class centers. Then, another threshold distance, between the identified data (forming the regression plane) and the regression plane, is used to decide on the support vectors for the selected instances. The experimental results based on the TechTC-100 dataset show the superior performance of SVOIS over other state-of-the-art algorithms. In particular, using SVOIS to select text documents allows the k-NN and SVM classifiers to perform better than without instance selection.

Keywords: Instance selection
[165] Kohji Omata. Screening of new additives to heteropoly acid catalyst for Friedel–Crafts reaction by microwave heated HTS and by Gaussian process regression. Applied Catalysis A: General, 407(1–2):112 - 117, 2011. [ bib | DOI | http ]
Activity of heteropoly acid (HPA) catalyst for Friedel–Crafts reaction was promoted by Pt addition, the effect of which was discovered by means of microwave heated high-throughput screening (HTS) and Gaussian process regression (GPR). In the screening, activities of Na, Mg, Mn, Zn, Pd, Cs, Pr and W promoted HPA were measured, and every activity test using microwave irradiation required only 150 s. The results and physicochemical properties of these 8 elements were used to construct regression models by a radial basis function network (RBFN), a support vector machine (SVM), and GPR. The regression model by GPR predicted that Pt is an effective additive, which promotes the activity, and the activity was experimentally verified to be 8 times higher than that of the unpromoted HPA catalyst. The performance of the regression model by GPR was superior to those by RBFN or by SVM because an excellent effect of Pt addition was discovered only by GPR. In addition to the extrapolative prediction, advantages of the GPR model are that the performance and accuracy of the regression model are increased by using expected improvement, which can suggest the additional experiments necessary for the improvement of the regression model.

Keywords: Friedel–Crafts reaction
[166] Cong Liu, Simon X. Yang, and Lie Deng. A comparative study for least angle regression on NIR spectra analysis to determine internal qualities of navel oranges. Expert Systems with Applications, pages -, 2015. [ bib | DOI | http ]
Abstract Internal qualities of navel oranges are the key factors for their market value and of major concern to customers. Unlike traditional subjective quality assessment, near infrared (NIR) spectroscopy based techniques are quantitative, convenient and non-destructive. Various machine learning methods have been applied to NIR spectra analysis to determine the fruit qualities. NIR spectra are usually of very high dimension. Explicit or implicit variable selection is essential to ensure prediction performance. Least angle regression (LAR) is a relatively new and efficient machine learning algorithm for regression analysis and is good for variable selection. We investigate the potential of the LAR algorithm for NIR spectra analysis to determine the internal qualities of navel oranges. A total of 1535 navel orange samples from 15 origins were prepared for NIR spectra collection and quality parameters measurement. Spectra are of 1500 dimensions with wavelengths ranging from 1000 nm to 2499 nm. The LAR was compared with the most widely used linear and nonlinear methods in three aspects: prediction accuracy, computational efficiency, and model interpretability. The results showed that the prediction performance of LAR was better than that of PLS, while slightly inferior to that of least squares support vector machines (LS-SVM). LAR was computationally more efficient than both PLS and LS-SVM. By concentrating on the most important predictors, LAR reveals the most relevant predictors much more easily than PLS; LS-SVM was hardly interpretable because of its nonlinear kernel.

Keywords: Least angle regression
[167] Georgios Sermpinis, Charalampos Stasinakis, Konstantinos Theofilatos, and Andreas Karathanasopoulos. Modeling, forecasting and trading the EUR exchange rates with hybrid rolling genetic algorithms–support vector regression forecast combinations. European Journal of Operational Research, pages -, 2015. [ bib | DOI | http ]
Abstract The motivation of this paper is to introduce a hybrid Rolling Genetic Algorithm-Support Vector Regression (RG-SVR) model for optimal parameter selection and feature subset combination. The algorithm is applied to the task of forecasting and trading the EUR/USD, EUR/GBP and EUR/JPY exchange rates. The proposed methodology genetically searches over a feature space (pool of individual forecasts) and then combines the optimal feature subsets (SVR forecast combinations) for each exchange rate. This is achieved by applying a fitness function specialized for financial purposes and adopting a sliding window approach. The individual forecasts are derived from several linear and non-linear models. RG-SVR is benchmarked against genetically and non-genetically optimized SVR and SVM models that are dominating the relevant literature, along with the robust ARBF-PSO neural network. The statistical and trading performance of all models is investigated during the period of 1999–2012. As it turns out, RG-SVR presents the best performance in terms of statistical accuracy and trading efficiency for all the exchange rates under study. This superiority confirms the success of the implemented fitness function and training procedure, while it validates the benefits of the proposed algorithm.

Keywords: Genetic algorithms
[168] Yong-Ping Zhao, Jian-Guo Sun, Zhong-Hua Du, Zhi-An Zhang, Yu-Chen Zhang, and Hai-Bo Zhang. An improved recursive reduced least squares support vector regression. Neurocomputing, 87:1 - 9, 2012. [ bib | DOI | http ]
Recently, an algorithm, namely recursive reduced least squares support vector regression (RR-LSSVR), was proposed to reduce the number of support vectors, which demonstrates better sparseness compared with other algorithms. However, it does not consider the interaction between the previously selected support vectors and the ones yet to be selected during the selection process. Actually, they are not independent. Hence, in this paper, an improved scheme, named IRR-LSSVR, is proposed to update the support weights immediately when a new sample is selected as a support vector. As a result, the training sample leading to the largest reduction in the target function is chosen to construct the approximation subset. To show the efficacy and feasibility of the proposed IRR-LSSVR, a number of experiments are carried out, all of which support these claims. That is, IRR-LSSVR needs fewer support vectors to reach almost the same generalization performance as RR-LSSVR, which is beneficial for reducing the testing time and favorable for real-time applications.

Keywords: Support vector machine
[169] Seokho Kang and Sungzoon Cho. Approximating support vector machine with artificial neural network for fast prediction. Expert Systems with Applications, 41(10):4989 - 4995, 2014. [ bib | DOI | http ]
Abstract Support vector machine (SVM) is a powerful algorithm for classification and regression problems and is widely applied to real-world applications. However, its high computational load in the test phase makes it difficult to use in practice. In this paper, we propose hybrid neural network (HNN), a method to accelerate an SVM in the test phase by approximating the SVM. The proposed method approximates the SVM using an artificial neural network (ANN). The resulting regression function of the ANN replaces the decision function or the regression function of the SVM. Since the prediction of the ANN requires significantly less computation than that of the SVM, the proposed method yields faster test speed. The proposed method is evaluated by experiments on real-world benchmark datasets. Experimental results show that the proposed method successfully accelerates SVM in the test phase with little or no prediction loss.

Keywords: Support vector machine
[170] Jan Luts, Fabian Ojeda, Raf Van de Plas, Bart De Moor, Sabine Van Huffel, and Johan A.K. Suykens. A tutorial on support vector machine-based methods for classification problems in chemometrics. Analytica Chimica Acta, 665(2):129 - 145, 2010. [ bib | DOI | http ]
This tutorial provides a concise overview of support vector machines and different closely related techniques for pattern classification. The tutorial starts with the formulation of support vector machines for classification. The method of least squares support vector machines is explained. Approaches to retrieve a probabilistic interpretation are covered and it is explained how the binary classification techniques can be extended to multi-class methods. Kernel logistic regression, which is closely related to iteratively weighted least squares support vector machines, is discussed. Different practical aspects of these methods are addressed: the issue of feature selection, parameter tuning, unbalanced data sets, model evaluation and statistical comparison. The different concepts are illustrated on three real-life applications in the field of metabolomics, genetics and proteomics.

Keywords: Support vector machine
[171] Bo Yang, Hung-Yu Chou, and Tsung-Hsun Yang. Color reproduction method by support vector regression for color computer vision. Optik - International Journal for Light and Electron Optics, 124(22):5649 - 5656, 2013. [ bib | DOI | http ]
Abstract In the color computer vision system, the nonlinearity of the camera and computer screen may result in differences between the colors shown on the screen and the actual colors of objects, which requires color calibration. In this paper, the support vector regression (SVR) method was introduced to reproduce the colors of the nonlinear imaging system. Firstly, the successive 3σ method was used to eliminate the large errors found in the color measurement. Then, based on the training set measured in advance, an SVR model with an RBF kernel was applied to map the nonlinear imaging system. In this step, two important parameters (C, γ) were optimized by the Least Mean Squared Validating Errors algorithm to get the best SVR model. Finally, this optimized model could predict the real values displayed on the screen. Compared with quadratic polynomial regression, BP neural network and relevance vector machine, the optimized SVR model has better color reproduction performance and generalization ability.

Keywords: Color reproduction
[172] Gao Guo and Jiang-She Zhang. Reducing examples to accelerate support vector regression. Pattern Recognition Letters, 28(16):2173 - 2183, 2007. [ bib | DOI | http ]
As the number of training examples increases, the training time of a support vector regression machine grows greatly. In this paper we develop a method to cut the training time by reducing the number of training examples, based on the observation that a support vector's target value is usually a local extremum or near an extremum. The proposed method first extracts extremal examples from the full training set, and then the extracted examples are used to train a support vector regression machine. Numerical results show that the proposed method can considerably reduce the training time of the support vector regression machine and that the obtained model has generalization capability comparable to that of a model trained on the full training set.

Keywords: Support vector machine
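The example-reduction idea in entry [172] can be illustrated with a small sketch. It is only an assumption about how "extremal examples" might be picked for one-dimensional inputs: examples are ordered by input value, and one is kept when its target is a local maximum or minimum relative to its two neighbours. The function name and the neighbour rule are hypothetical and are not taken from the paper.

```python
def extract_extremal_examples(xs, ys):
    """Return indices of examples whose target is a local extremum.

    Hypothetical 1-D rule: sort examples by input value and keep those whose
    target is >= both neighbours or <= both neighbours. The two endpoints
    are always kept so the reduced set spans the same input range.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    if len(order) <= 2:
        return order
    kept = [order[0]]
    for pos in range(1, len(order) - 1):
        prev_y = ys[order[pos - 1]]
        cur_y = ys[order[pos]]
        next_y = ys[order[pos + 1]]
        is_local_max = cur_y >= prev_y and cur_y >= next_y
        is_local_min = cur_y <= prev_y and cur_y <= next_y
        if is_local_max or is_local_min:
            kept.append(order[pos])
    kept.append(order[-1])
    return kept


# Toy data: the reduced set would then be passed to any SVR trainer.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [0, 1, 3, 2, 0, 1, 4]
reduced = extract_extremal_examples(xs, ys)
```

On the toy data, four of the seven examples survive; whether such a reduction preserves accuracy on real data is the question the paper studies.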
[173] Wei Zhou, Shubo Wu, Zhijun Dai, Yuan Chen, Yan Xiang, Jianrong Chen, Chunyu Sun, Qingming Zhou, and Zheming Yuan. Nonlinear QSAR models with high-dimensional descriptor selection and SVR improve toxicity prediction and evaluation of phenols on Photobacterium phosphoreum. Chemometrics and Intelligent Laboratory Systems, 145:30 - 38, 2015. [ bib | DOI | http ]
Abstract Assessment of the risk of chemicals is an important task in environmental protection. In this paper, we developed quantitative structure–activity relationship (QSAR) methods to evaluate the toxicity of phenol to Photobacterium phosphoreum, which is an important indicator for water quality. We first built a support vector regression (SVR) model using three descriptors, and the SVR model (t = 2) had the highest external prediction ability (MSEext = 0.068, Qext2 = 0.682), about 40% higher than that of the literature model. Second, to identify more effective descriptors, we applied in-house methods to select descriptors with clear meanings from 2835 descriptors calculated by PCLIENT and used them to construct the SVR models. Our results showed that our twenty new QSAR models significantly increased the standard regression coefficient on the test set (MSEext values ranged from 0.003 to 0.063 and Qext2 values ranged from 0.708 to 0.985). The Y random response permutation test and different splits of training/test datasets also supported the excellent predictive power of the best SVR model. We further evaluated the regression significance of our SVR model and the importance of each single descriptor of the model according to the interpretability analysis. Our work provided useful theoretical understanding of the toxicity of phenol analogues.

Keywords: Phenol
[174] Jaime Alonso, Alfonso Villa, and Antonio Bahamonde. Improved estimation of bovine weight trajectories using support vector machine classification. Computers and Electronics in Agriculture, 110:36 - 41, 2015. [ bib | DOI | http ]
Abstract The profits of livestock breeders are usually closely related to the weight of their animals. In this paper we present a method to anticipate the weight of each animal provided we know the past evolution of the herd. Our approach exploits the geometrical relationships of the trajectories of weights over time. Starting from a collection of data from a set of animals, we learn a family of parallel functions that fits the whole data set, instead of having one regression function for each individual. In this way, our method enables animals with only one or a few weights to have an accurate estimation of their future evolution. Thus, we learn a function F defined on the space of weights and time that separates the trajectories in such a way that F has constant values on each trajectory. The key point is that the specification of F can be done in terms of ordering constraints, in the same way as preference functions or ordinal regressors. Therefore, F can be obtained from a classification SVM (Support Vector Machine). To evaluate the method, we have used a collection of real world data sets of bovines of different breeds and ages. We will show that our method outperforms the separate regression of each animal when there are only a few weights available and we need medium or long term predictions.

Keywords: Support Vector Machines (SVM)
[175] Bingtao Zhao, Yaxin Su, and Wenwen Tao. Mass transfer performance of CO2 capture in rotating packed bed: Dimensionless modeling and intelligent prediction. Applied Energy, 136:132 - 142, 2014. [ bib | DOI | http ]
Abstract Rotating packed beds have been demonstrated to be able to intensify the physicochemical process of multiphase transportation and reaction in the fields of energy and environment, and have been successfully applied in the field of CO2 emission control. However, modeling and prediction of gas–liquid mass transfer, especially for mass transfer with chemical reaction, are rare due to the complexity of multiphase fluid flow and transportation. In view of the inaccuracy of semi-empirical models and the complexity of computational fluid dynamics models, an intelligent correlation model was developed in this work to predict the mass transfer coefficient more accurately for CO2 capture with NaOH solution in different types of rotating packed beds. This model used dimensional analysis to determine the independent variables affecting the mass transfer coefficients, and then used least squares support vector regression (LSSVR) for prediction. An optimized radial basis function was obtained as the kernel function based on grid search coupled with simulated annealing (SA) and 10-fold cross-validation (CV) algorithms. The proposed model had a mean squared error of 0.0016 for the training set and 0.0012 for the testing set. Compared with the models based on multiple nonlinear regression (MNR) and artificial neural network (ANN), the present model decreased the mean squared error by 91.06% and 38.46% for the training set and 94.57% and 53.85% for the testing set, respectively, suggesting it had superior prediction accuracy and generalization ability.

Keywords: CO2 capture
[176] Özlem Baydaroğlu and Kasım Koçak. SVR-based prediction of evaporation combined with chaotic approach. Journal of Hydrology, 508:356 - 363, 2014. [ bib | DOI | http ]
Summary Evaporation, temperature, wind speed, solar radiation and relative humidity time series are used to predict water losses. Prediction of evaporation amounts is performed using Support Vector Regression (SVR), which originated from the Support Vector Machine (SVM). To prepare the input data for SVR, phase space reconstructions are realized using both univariate and multivariate time series embedding methods. The idea behind SVR is based on the computation of a linear regression in a multidimensional feature space. Observation vectors in the input space are transformed to feature space by way of a kernel function. In this study, the Radial Basis Function (RBF) is preferred as a kernel function because of its flexibility with observations from many diverse fields. It is widely accepted that SVR is the most effective method for prediction when compared to other classical and modern methods like Artificial Neural Network (ANN), Autoregressive Integrated Moving Average (ARIMA), Group Method of Data Handling (GMDH) (Samsudin et al., 2011). Thus SVR has been chosen to predict evaporation amounts because of its good generalization capability. The results show that SVR-based predictions are very successful, with determination coefficients as high as 83% and 97% for univariate and multivariate time series embeddings, respectively.

Keywords: Prediction
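The univariate phase space reconstruction mentioned in entry [176] is commonly done by time-delay embedding. The sketch below shows that generic construction; the embedding dimension, the delay, the one-step-ahead target and the function name are all assumptions made for illustration, since the abstract does not state the values or the exact scheme the authors used.

```python
def delay_embed(series, dim, delay):
    """Build (input vector, next value) pairs by time-delay embedding.

    Each input vector is [x[t], x[t - delay], ..., x[t - (dim - 1) * delay]]
    and its target is x[t + 1]. Both dim and delay are assumed to be chosen
    beforehand (for example by a false-nearest-neighbour or mutual
    information analysis, which is not shown here).
    """
    if dim < 1 or delay < 1:
        raise ValueError("dim and delay must be positive integers")
    start = (dim - 1) * delay
    inputs, targets = [], []
    for t in range(start, len(series) - 1):
        inputs.append([series[t - k * delay] for k in range(dim)])
        targets.append(series[t + 1])
    return inputs, targets


# Toy series; the pairs would serve as training data for a regressor.
inputs, targets = delay_embed([1, 2, 3, 4, 5, 6], dim=2, delay=2)
```

A multivariate embedding would concatenate such delay vectors from several series (for example temperature and wind speed) into one input vector.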
[177] V. Rodriguez-Galiano, M. Sanchez-Castillo, M. Chica-Olmo, and M. Chica-Rivas. Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines. Ore Geology Reviews, pages -, 2015. [ bib | DOI | http ]
Abstract Machine learning algorithms (MLAs) such as artificial neural networks (ANNs), regression trees (RTs), random forest (RF) and support vector machines (SVMs) are powerful data driven methods that are relatively less widely used in the mapping of mineral prospectivity, and thus have not been comparatively evaluated together thoroughly in this field. The performances of a series of MLAs, namely, artificial neural networks (ANNs), regression trees (RTs), random forest (RF) and support vector machines (SVMs) in mineral prospectivity modelling are compared based on the following criteria: i) the accuracy in the delineation of prospective areas; ii) the sensitivity to the estimation of hyper-parameters; iii) the sensitivity to the size of training data; and iv) the interpretability of model parameters. The results of applying the above algorithms to epithermal Au prospectivity mapping of the Rodalquilar district, Spain, indicate that the RF outperformed the other MLA algorithms (ANNs, RTs and SVMs). The RF algorithm showed higher stability and robustness with varying training parameters and better success rates and ROC analysis results. On the other hand, all MLA algorithms can be used when ore deposit evidences are scarce. Moreover the model parameters of RF and RT can be interpreted to gain insights into the geological controls of mineralization.

Keywords: Mineral prospectivity mapping
[178] Yong-Ping Zhao and Jian-Guo Sun. Robust truncated support vector regression. Expert Systems with Applications, 37(7):5126 - 5133, 2010. [ bib | DOI | http ]
In this paper, we utilize two ε-insensitive loss functions to construct a non-convex loss function. Based on this non-convex loss function, a robust truncated support vector regression (TSVR) is proposed. To solve the TSVR, the concave–convex procedure is used, which circumvents the non-convexity by transforming the non-convex problem into a sequence of convex ones. The TSVR is more robust to outliers than classical support vector regression, which gives it advantages in generalization ability and in the number of support vectors. Finally, experiments on synthetic and real-world benchmark data sets further confirm the effectiveness of the proposed TSVR.

Keywords: Non-convex loss function
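One common way to build a non-convex loss from two ε-insensitive losses, as entry [178] describes, is to subtract a loss with a wider tube from one with a narrower tube. The sketch below assumes that form; the `cutoff` parameter name and the exact expression are illustrative assumptions, and the paper's own definition may differ.

```python
def eps_insensitive(residual, eps):
    """Standard eps-insensitive loss: zero inside the tube, linear outside."""
    return max(0.0, abs(residual) - eps)


def truncated_loss(residual, eps, cutoff):
    """Difference of two eps-insensitive losses (assumed form).

    Grows linearly for eps < |residual| < cutoff and stays flat at
    cutoff - eps beyond that, so a large outlier cannot dominate the fit.
    """
    if cutoff <= eps:
        raise ValueError("cutoff must be larger than eps")
    return eps_insensitive(residual, eps) - eps_insensitive(residual, cutoff)


# Inside the tube, in the linear region, and past the cutoff.
losses = [truncated_loss(r, eps=0.5, cutoff=2.0) for r in (0.25, 1.0, 3.0, 10.0)]
```

Because the result is a convex function minus a convex function, it fits the setting in which a concave–convex procedure is applicable.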
[179] Zhe-Ming YUAN and Xian-Sheng TAN. Nonlinear screening indicators of drought resistance at seedling stage of rice based on support vector machine. Acta Agronomica Sinica, 36(7):1176 - 1182, 2010. [ bib | DOI | http ]
Screening indexes for drought resistance in crops is a puzzle characterized by few samples, multiple indexes, and nonlinearity. The rationality of the linear regression model and of the indexes obtained by linear screening based on empirical risk minimization is controversial. By contrast, the support vector machine, based on structural risk minimization, has the advantages of nonlinear characteristics, suitability for few samples, avoidance of over-fitting, strong generalization ability, and high prediction precision. In this paper, setting the survival percentage under repeated drought conditions as the target and support vector regression as the nonlinear screening tool, 6 integrated indexes including plant height, proline content, malondialdehyde content, leaf age, area of the first leaf under the central leaf and ascorbic acid were highlighted from 24 morphological and physiological indexes in 15 paddy rice cultivars. The results showed that the support vector regression model with the 6 integrated indexes had a more distinct improvement in fitting and prediction precision than the linear reference models. Considering the simplicity of index measurement, the support vector regression model with only 6 morphological indexes including shoot dry weight, area of the second leaf under the central leaf, root shoot ratio, leaf age, leaf fresh weight, and area of the first leaf under the central leaf was also feasible. Furthermore, an explanatory system including the significance of the regression model and the importance of each single index was established based on support vector regression and the F-test.

Keywords: rice
[180] Jamshid Piri, Shahaboddin Shamshirband, Dalibor Petković, Chong Wen Tong, and Muhammad Habib ur Rehman. Prediction of the solar radiation on the earth using support vector regression technique. Infrared Physics & Technology, 68:179 - 185, 2015. [ bib | DOI | http ]
Abstract Solar radiation at the surface of the Earth is one of the major factors in water resources, environmental and agricultural modeling. The main environmental factors influencing plant growth are temperature, moisture, and solar radiation. Solar radiation is rarely measured at weather stations; as a result, many empirical approaches have been applied to estimate it by using other parameters. In this study, a soft computing technique named support vector regression (SVR) has been used to estimate the solar radiation. The data were collected from two synoptic stations with different climate conditions (Zahedan and Bojnurd) during periods of 5 and 7 years, respectively. These data contain sunshine hours, maximum temperature, minimum temperature, average relative humidity and daily solar radiation. In this study, the polynomial and radial basis functions (RBF) are applied as the SVR kernel function to estimate solar radiation. The performance of the proposed estimators is confirmed with the simulation results.

Keywords: SVR
[181] Rong Chen, Chang-Yong Liang, Wei-Chiang Hong, and Dong-Xiao Gu. Forecasting holiday daily tourist flow based on seasonal support vector regression with adaptive genetic algorithm. Applied Soft Computing, 26:435 - 443, 2015. [ bib | DOI | http ]
Abstract Accurate holiday daily tourist flow forecasting is always the most important issue in tourism industry. However, it is found that holiday daily tourist flow demonstrates a complex nonlinear characteristic and obvious seasonal tendency from different periods of holidays as well as the seasonal nature of climates. Support vector regression (SVR) has been widely applied to deal with nonlinear time series forecasting problems, but it suffers from the critical parameters selection and the influence of seasonal tendency. This article proposes an approach which hybridizes SVR model with adaptive genetic algorithm (AGA) and the seasonal index adjustment, namely AGA-SSVR, to forecast holiday daily tourist flow. In addition, holiday daily tourist flow data from 2008 to 2012 for Mountain Huangshan in China are employed as numerical examples to validate the performance of the proposed model. The experimental results indicate that the AGA-SSVR model is an effective approach with more accuracy than the other alternative models including AGA-SVR and back-propagation neural network (BPNN).

Keywords: Holiday daily tourist flow forecasting
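The seasonal index adjustment named in entry [181] can be illustrated with a generic ratio-to-mean scheme. This is an assumption, since the abstract does not define the index: each position in the seasonal cycle gets the ratio of its mean to the overall mean, the series is divided by that index before modeling, and forecasts are multiplied by it afterwards. All function names are hypothetical.

```python
def seasonal_indices(series, period):
    """Ratio of each phase's mean to the overall mean (assumed definition)."""
    overall_mean = sum(series) / len(series)
    indices = []
    for phase in range(period):
        values = series[phase::period]  # every observation at this phase
        indices.append((sum(values) / len(values)) / overall_mean)
    return indices


def deseasonalize(series, indices):
    """Divide each observation by the index of its phase."""
    period = len(indices)
    return [value / indices[t % period] for t, value in enumerate(series)]


def reseasonalize(forecasts, indices, start):
    """Multiply forecasts for time steps start, start + 1, ... by their index."""
    period = len(indices)
    return [f * indices[(start + k) % period] for k, f in enumerate(forecasts)]


flow = [10, 20, 30, 10, 20, 30]              # toy series with period 3
idx = seasonal_indices(flow, period=3)
flat = deseasonalize(flow, idx)              # input to a regressor such as SVR
restored = reseasonalize([20.0, 20.0], idx, start=len(flow))
```

In a hybrid of the kind the abstract describes, the regressor would be trained on the deseasonalized values and its forecasts rescaled in the last step.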
[182] Mitsuo Hirata, Yohei Hashimoto, Sakae Noguchi, and Shuichi Adachi. A hybrid modeling method for mechanical systems. Mechatronics, 20(1):59 - 66, 2010. Special Issue on “Servo Control for Data Storage and Precision Systems”, from 17th IFAC World Congress 2008. [ bib | DOI | http ]
In this paper, a system identification method for hybrid systems switched by the magnitude of velocity and displacement is proposed. First, it is shown that the regression vector space of a mechanical system switched by the magnitude of velocity cannot be separated by a hyperplane. Then a method based on support vector machines with a polynomial kernel is proposed. The effectiveness of the proposed method is shown by simulations and experiments.

Keywords: System identification
[183] Zhengzong Wu, Enbo Xu, Jie Long, Yujing Zhang, Fang Wang, Xueming Xu, Zhengyu Jin, and Aiquan Jiao. Monitoring of fermentation process parameters of Chinese rice wine using attenuated total reflectance mid-infrared spectroscopy. Food Control, 50:405 - 412, 2015. [ bib | DOI | http ]
Abstract There is a growing need for effective fermentation monitoring during the manufacture of wine due to the rapid pace of change in the industry. In this study, the potential of attenuated total reflectance mid-infrared (ATR-MIR) spectroscopy to monitor time-related changes during Chinese rice wine (CRW) fermentation was investigated. Interval partial least-squares (i-PLS) and support vector machine (SVM) were used to improve the performances of partial least-squares (PLS) models. In total, four different calibration models, namely PLS, i-PLS, SVM and interval support vector machine (i-SVM), were established. It was observed that the performances of models based on the efficient spectra intervals selected by i-PLS were much better than those based on the full spectrum. In addition, nonlinear models outperformed linear models in predicting fermentation parameters. After systematic comparison and discussion, it was found that the i-SVM model gave the best result with excellent prediction accuracy. The correlation coefficients (R2 (pre)), root mean square error (RMSEP (%)) and the residual predictive deviation (RPD) for the prediction set were 0.96, 6.92 and 14.34 for total sugar, 0.97, 3.32 and 12.64 for ethanol, 0.93, 3.24 and 9.3 for total acid and 0.95, 6.33 and 8.46 for amino nitrogen, respectively. The results demonstrated that ATR-MIR combined with an efficient variable selection algorithm and a nonlinear regression tool was feasible as a rapid method to monitor and control the CRW fermentation process.

Keywords: Chinese rice wine
[184] Weiya Guo, Xuezhi Xia, and Xiaofei Wang. A remote sensing ship recognition method of entropy-based hierarchical discriminant regression. Optik - International Journal for Light and Electron Optics, pages -, 2015. [ bib | DOI | http ]
Abstract Aiming at recognizing the battlefield's ship targets on the sea reliably and timely, a discriminative method for ship recognition using optical remote sensing data entropy-based hierarchical discriminant regression (E-HDR) is presented. First, target features including size, texture, shape, and moment invariants features, as well as area ratio codes are extracted as candidate features, and then information entropy is used to choose the attributes in target recognition, which can reduce the interference of redundant attributes to target recognition, and the valid recognition features are selected automatically. Next, entropy is also used to realize the sub nodes splitting adaptively and automatically, which avoids manual intervention well. Ultimately, according to entropy, a decision tree based on hierarchical discriminant regression (HDR) theory is built to recognize ships in data from optical remote sensing systems. Experimental results on real data show that the proposed approach can get better classification rates at a higher speed than k-nearest neighbor (KNN), support vector machines (SVM), affinity propagation (AP) and traditional hierarchical discriminant regression (HDR) methods.

Keywords: Ship recognition
[185] Saeid Shokri, Mahdi Ahmadi Marvast, Mohammad Taghi Sadeghi, and Shankar Narasimhan. Combination of data rectification techniques and soft sensor model for robust prediction of sulfur content in HDS process. Journal of the Taiwan Institute of Chemical Engineers, pages -, 2015. [ bib | DOI | http ]
Abstract A novel approach based on integration of data rectification techniques and support vector regression (SVR) is proposed to predict the sulfur content of treated product in the gas oil hydrodesulfurization (HDS) process. Simultaneous approaches consisting of the robust estimation method (REM) and wavelet transform (WT) were proposed to reduce outliers and noise in the input data for the SVR model. Results indicated that implementation of outlier detection and noise reduction techniques gives a considerable improvement in the prediction error. The proposed approach delivered satisfactory performance in computation time (CT) and prediction accuracy (AARE = 0.079 and CT = 74 s). The proposed method can provide a robust soft sensor for prediction of the sulfur content of industrial treated gas oil.

Keywords: Gas oil hydrodesulfurization
[186] Pei-Yi Hao. New support vector algorithms with parametric insensitive/margin model. Neural Networks, 23(1):60 - 73, 2010. [ bib | DOI | http ]
In this paper, a modification of ν-support vector machines (ν-SVM) for regression and classification is described, and the use of a parametric insensitive/margin model with an arbitrary shape is demonstrated. This can be useful in many cases, especially when the noise is heteroscedastic, that is, the noise strongly depends on the input value x. Like the previous ν-SVM, the proposed support vector algorithms have the advantage of using the parameter 0 ≤ ν ≤ 1 for controlling the number of support vectors. To be more precise, ν is an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors. The algorithms are analyzed theoretically and experimentally.

Keywords: Support vector machines (SVMs)
[187] P.J. García Nieto, E. García-Gonzalo, F. Sánchez Lasheras, and F.J. de Cos Juez. Hybrid PSO–SVM-based method for forecasting of the remaining useful life for aircraft engines and evaluation of its reliability. Reliability Engineering & System Safety, 138:219 - 231, 2015. [ bib | DOI | http ]
Abstract The present paper describes a hybrid PSO–SVM-based model for the prediction of the remaining useful life of aircraft engines. The proposed hybrid model combines support vector machines (SVMs), which have been successfully adopted for regression problems, with the particle swarm optimization (PSO) technique. This optimization technique involves kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not yet been widely explored. Bearing this in mind, remaining useful life values have been successfully predicted here by using the hybrid PSO–SVM-based model from the remaining measured parameters (input variables) for aircraft engines. A coefficient of determination equal to 0.9034 was obtained when this hybrid PSO–RBF–SVM-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. One of the main advantages of this predictive model is that it does not require information about the previous operation states of the engine. Finally, the main conclusions of this study are presented.

Keywords: Support vector machines (SVMs)
[188] Cheng-Wei Fei and Guang-Chen Bai. Distributed collaborative probabilistic design for turbine blade-tip radial running clearance using support vector machine of regression. Mechanical Systems and Signal Processing, 49(1–2):196 - 208, 2014. [ bib | DOI | http ]
Abstract To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assembly like the blade-tip radial running clearance (BTRRC) of a gas turbine, a distribution collaborative probabilistic design method based on the support vector machine of regression (SR), called DCSRM, is proposed by integrating the distribution collaborative response surface method and the support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea of DCSRM is introduced. The dynamic assembly probabilistic design of the aeroengine high-pressure turbine (HPT) BTRRC is accomplished to verify the proposed DCSRM. The analysis results reveal that the optimal static blade-tip clearance of the HPT is gained for designing the BTRRC and improving the performance and reliability of the aeroengine. The comparison of methods shows that the DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assembly and enriches mechanical reliability theory and method.

Keywords: Mechanical dynamic assembly
[189] Hancheng Dong, Xiaoning Jin, Yangbing Lou, and Changhong Wang. Lithium-ion battery state of health monitoring and remaining useful life prediction based on support vector regression-particle filter. Journal of Power Sources, 271:114 - 123, 2014. [ bib | DOI | http ]
Abstract Lithium-ion batteries are used as the main power source in many electronic and electrical devices. In particular, with the growth in battery-powered electric vehicle development, the lithium-ion battery plays a critical role in the reliability of vehicle systems. In order to provide timely maintenance and replacement of battery systems, it is necessary to develop a reliable and accurate battery health diagnostic that takes a prognostic approach. Therefore, this paper focuses on two main methods to determine a battery's health: (1) Battery State-of-Health (SOH) monitoring and (2) Remaining Useful Life (RUL) prediction. Both of these are calculated by using a filter algorithm known as the Support Vector Regression-Particle Filter (SVR-PF). Models for battery SOH monitoring based on SVR-PF are developed with novel capacity degradation parameters introduced to determine battery health in real time. Moreover, the RUL prediction model is proposed, which is able to provide the RUL value and update the RUL probability distribution to the End-of-Life cycle. Results for both methods are presented, showing that the proposed SOH monitoring and RUL prediction methods have good performance and that the SVR-PF has better monitoring and prediction capability than the standard particle filter (PF).

Keywords: Lithium-ion battery
[190] Yongming Wang, Jian Li, Junzhong Gu, Zili Zhou, and Zhijin Wang. Artificial neural networks for infectious diarrhea prediction using meteorological factors in Shanghai (China). Applied Soft Computing, 35:280 - 290, 2015. [ bib | DOI | http ]
Abstract Infectious diarrhea is an important public health problem around the world. Meteorological factors have been strongly linked to the incidence of infectious diarrhea. Therefore, accurately forecasting the number of infectious diarrhea cases under the effect of meteorological factors is critical to control efforts. In recent decades, the development of artificial neural network (ANN) models as predictors for infectious diseases has created a great change in infectious disease predictions. In this paper, a three-layered feed-forward back-propagation ANN (BPNN) model trained by the Levenberg–Marquardt algorithm was developed to predict the weekly number of infectious diarrhea cases by using meteorological factors as input variables. The meteorological factors were chosen based on their strong relationship with infectious diarrhea. Also, as a comparison study, support vector regression (SVR), random forests regression (RFR) and multivariate linear regression (MLR) were applied as prediction models using the same dataset in addition to the BPNN model. The 5-fold cross validation technique was used to avoid the problem of overfitting in the model training period. Further, since one of the drawbacks of ANN models is the interpretation of the final model in terms of the relative importance of input variables, a sensitivity analysis is performed to determine the parametric influence on the model outputs. The simulation results obtained from the BPNN confirm the feasibility of this model in terms of applicability and show better agreement with the actual data, compared to those from the SVR, RFR and MLR models. The BPNN model described in this paper is an efficient quantitative tool to evaluate and predict infectious diarrhea using meteorological factors.

Keywords: Artificial neural networks
[191] Xuezhen Hong, Jun Wang, and Guande Qi. E-nose combined with chemometrics to trace tomato-juice quality. Journal of Food Engineering, 149:38 - 43, 2015. [ bib | DOI | http ]
Abstract An e-nose was presented to trace freshness of cherry tomatoes that were squeezed for juice consumption. Four supervised approaches (linear discriminant analysis, quadratic discriminant analysis, support vector machines and back propagation neural network) and one semi-supervised approach (Cluster-then-Label) were applied to classify the juices, and the semi-supervised classifier outperformed the supervised approaches. Meanwhile, quality indices of the tomatoes (storage time, pH, soluble solids content (SSC), Vitamin C (VC) and firmness) were predicted by partial least squares regression (PLSR). Two sizes of training sets (20% and 70% of the whole dataset, respectively) were considered, and R2 > 0.737 for all quality indices in both cases, suggesting it is possible to trace fruit quality through detecting the squeezed juices. However, PLSR models trained by the small dataset were not very good. Thus, our next plan is to explore semi-supervised regression methods for regression cases where only a few experimental data are available.

Keywords: Electronic nose
[192] Xinjun Peng. TSVR: An efficient twin support vector machine for regression. Neural Networks, 23(3):365 - 372, 2010. [ bib | DOI | http ]
The learning speed of classical Support Vector Regression (SVR) is low, since it is constructed based on the minimization of a convex quadratic function subject to the pair groups of linear inequality constraints for all training samples. In this paper we propose Twin Support Vector Regression (TSVR), a novel regressor that determines a pair of ϵ-insensitive up- and down-bound functions by solving two related SVM-type problems, each of which is smaller than that in a classical SVR. The TSVR formulation is in the spirit of Twin Support Vector Machine (TSVM) via two nonparallel planes. The experimental results on several artificial and benchmark datasets indicate that the proposed TSVR is not only fast, but also shows good generalization performance.

Keywords: Machine learning
[193] Hung Chak Ho, Anders Knudby, Paul Sirovyak, Yongming Xu, Matus Hodul, and Sarah B. Henderson. Mapping maximum urban air temperature on hot summer days. Remote Sensing of Environment, 154:38 - 45, 2014. [ bib | DOI | http ]
Abstract Air temperature is an essential component in microclimate and environmental health research, but difficult to map in urban environments because of strong temperature gradients. We introduce a spatial regression approach to map the peak daytime air temperature relative to a reference station on typical hot summer days using Vancouver, Canada as a case study. Three regression models, ordinary least squares regression, support vector machine, and random forest, were all calibrated using Landsat TM/ETM+ data and field observations from two sources: Environment Canada and the Weather Underground. Results based on cross-validation indicate that the random forest model produced the lowest prediction errors (RMSE = 2.31 °C). Some weather stations were consistently cooler/hotter than the reference station and were predicted well, while other stations, particularly those close to the ocean, showed greater temperature variability and were predicted with greater errors. A few stations, most of which were from the Weather Underground data set, were very poorly predicted and possibly unrepresentative of air temperature in the area. The random forest model generally produced a sensible map of temperature distribution in the area. The spatial regression approach appears useful for mapping intra-urban air temperature variability and can easily be applied to other cities.

Keywords: Landsat
[194] Gert Loterman, Iain Brown, David Martens, Christophe Mues, and Bart Baesens. Benchmarking regression algorithms for loss given default modeling. International Journal of Forecasting, 28(1):161 - 170, 2012. Special Section 1: The Predictability of Financial Markets; Special Section 2: Credit Risk Modelling and Forecasting. [ bib | DOI | http ]
The introduction of the Basel II Accord has had a huge impact on financial institutions, allowing them to build credit risk models for three key risk parameters: PD (probability of default), LGD (loss given default) and EAD (exposure at default). Until recently, credit risk research has focused largely on the estimation and validation of the PD parameter, and much less on LGD modeling. In this first large-scale LGD benchmarking study, various regression techniques for modeling and predicting LGD are investigated. These include one-stage models, such as those built by ordinary least squares regression, beta regression, robust regression, ridge regression, regression splines, neural networks, support vector machines and regression trees, as well as two-stage models which combine multiple techniques. A total of 24 techniques are compared using six real-life loss datasets from major international banks. It is found that much of the variance in LGD remains unexplained, as the average prediction performance of the models in terms of R2 ranges from 4% to 43%. Nonetheless, there is a clear trend that non-linear techniques, and in particular support vector machines and neural networks, perform significantly better than more traditional linear techniques. Also, two-stage models built by a combination of linear and non-linear techniques are shown to have a similarly good predictive power, with the added advantage of having a comprehensible linear model component.

Keywords: Basel II
[195] Jui-Sheng Chou and Anh-Duc Pham. Hybrid computational model for predicting bridge scour depth near piers and abutments. Automation in Construction, 48:88 - 96, 2014. [ bib | DOI | http ]
Abstract Efficient bridge design and maintenance requires a clear understanding of channel bottom scouring near piers and abutment foundations. Bridge scour, a dynamic phenomenon that varies according to numerous factors (e.g., water depth, flow angle and strength, pier and abutment shape and width, material properties of the sediment), is a major cause of bridge failure and is critical to the total construction and maintenance costs of bridge building. Accurately estimating the equilibrium depths of local scouring near piers and abutments is vital for bridge design and management. Therefore, an efficient technique that can be used to enhance the estimation capability, safety, and cost reduction when designing and managing bridge projects is required. This study investigated the potential use of a genetic algorithm (GA)-based support vector regression (SVR) model to predict bridge scour depth near piers and abutments. An SVR model developed by using MATLAB® was optimized using a GA, maximizing generalization performance. Data collected from the literature were used to evaluate the bridge scour depth prediction accuracy of the hybrid model. To demonstrate the capability of the computational model, the GA–SVR modeling results were compared with those obtained using numeric predictive models (i.e., classification and regression tree, chi-squared automatic interaction detector, multiple regression, artificial neural network, and ensemble models) and empirical methods. The proposed hybrid model achieved error rates that were 81.3% to 96.4% more accurate than those obtained using other methods. The GA–SVR model effectively outperformed existing methods and can be used by civil engineers to efficiently design safer and more cost-effective bridge substructures.

Keywords: Bridge foundation
[196] Stefan Platikanov, Jordi Martín, and Romà Tauler. Linear and non-linear chemometric modeling of THM formation in Barcelona's water treatment plant. Science of The Total Environment, 432:365 - 374, 2012. [ bib | DOI | http ]
The complex behavior observed for the dependence of trihalomethane formation on forty-one water treatment plant (WTP) operational variables is investigated by means of linear and non-linear regression methods, including kernel-partial least squares (K-PLS) and support vector machine regression (SVR). Lower prediction errors of total trihalomethane concentrations (lower than 14% for external validation samples) were obtained when these two methods were applied in comparison to when linear regression methods were applied. A new visualization technique revealed the complex nonlinear relationships among the operational variables and displayed the existing correlations between input variables and the kernel matrix on one side and the support vectors on the other side. Whereas some water treatment plant variables like river water TOC and chloride concentrations, and breakpoint chlorination were not considered to be significant due to the multi-collinear effect in straight linear regression modeling methods, they were now confirmed to be significant using K-PLS and SVR non-linear modeling regression methods, proving the better performance of these methods for the prediction of complex formation of trihalomethanes in water disinfection plants.

Keywords: Drinking water
[197] Oscar González-Recio, Guilherme J.M. Rosa, and Daniel Gianola. Machine learning methods and predictive ability metrics for genome-wide prediction of complex traits. Livestock Science, 166:217 - 231, 2014. Genomics Applied to Livestock Production. [ bib | DOI | http ]
Abstract Genome-wide prediction of complex traits has become increasingly important in animal and plant breeding, and is receiving increasing attention in human genetics. Most common approaches are whole-genome regression models where phenotypes are regressed on thousands of markers concurrently, applying different prior distributions to marker effects. While use of shrinkage or regularization in SNP regression models has delivered improvements in predictive ability in genome-based evaluations, serious over-fitting problems may be encountered as the ratio between markers and available phenotypes continues increasing. Machine learning is an alternative approach for prediction and classification, capable of dealing with the dimensionality problem in a computationally flexible manner. In this article we provide an overview of non-parametric and machine learning methods used in genome-wide prediction, and discuss their similarities as well as their relationship to some well-known parametric approaches. Although the most suitable method is usually case dependent, we suggest the use of support vector machines and random forests for classification problems, whereas Reproducing Kernel Hilbert Spaces regression and boosting may better suit regression problems, with the former having the more consistently higher predictive ability. Neural Networks may suffer from over-fitting and may be too computationally demanding when the number of neurons is large. We further discuss the metrics used to evaluate predictive ability in model comparison under cross-validation from a genomic selection point of view. We suggest use of predictive mean squared error as a main but not only metric for model comparison. Visual tools may greatly assist in the choice of the most accurate model.

Keywords: Animal breeding
[198] Jatin Alreja, Shantaram Parab, Shivam Mathur, and Pijush Samui. Estimating hysteretic energy demand in steel moment resisting frames using multivariate adaptive regression spline and least square support vector machine. Ain Shams Engineering Journal, 6(2):449 - 455, 2015. [ bib | DOI | http ]
Abstract This paper uses Multivariate Adaptive Regression Spline (MARS) and Least Squares Support Vector Machines (LSSVMs) to predict hysteretic energy demand in steel moment resisting frames. These models are used to establish a relation between the hysteretic energy demand and several effective parameters such as earthquake intensity, number of stories, soil type, period, strength index, and the energy imparted to the structure. A total of 27 datasets (input–output pairs) are used, 23 of which are used to train the model and 4 are used to test the models. The data-sets used in this study are derived from experimental results. The performance and validity of the model are further tested on different steel moment resisting structures. The developed models have been compared with Genetic-based simulated annealing method (GSA) and accurate results portray the strong potential of MARS and LSSVM as reliable tools to predict the hysteretic energy demand.

Keywords: Multivariate Adaptive Regression Spline
[199] S.K. Lahiri and K.C. Ghanta. Prediction of pressure drop of slurry flow in pipeline by hybrid support vector regression and genetic algorithm model. Chinese Journal of Chemical Engineering, 16(6):841 - 848, 2008. [ bib | DOI | http ]
This paper describes a robust support vector regression (SVR) methodology, which can offer superior performance for important process engineering problems. The method incorporates hybrid support vector regression and genetic algorithm technique (SVR-GA) for efficient tuning of SVR meta-parameters. The algorithm has been applied for prediction of pressure drop of solid liquid slurry flow. A comparison with selected correlations in the literature showed that the developed SVR correlation noticeably improved the prediction of pressure drop over a wide range of operating conditions, physical properties, and pipe diameters.

Keywords: support vector regression
[200] Yue Huang, Guorong Du, Yanjun Ma, and Jun Zhou. Near-infrared determination of polyphenols using linear and nonlinear regression algorithms. Optik - International Journal for Light and Electron Optics, pages -, 2015. [ bib | DOI | http ]
Abstract In the present study, the possibility of using Fourier transform near-infrared spectroscopy (FT-NIR) to measure the concentration of polyphenols in Yunnan tobacco was investigated. Selected samples representing a wide range of varieties and regions were analyzed by high performance liquid chromatography (HPLC) for the concentrations of polyphenols in tobacco. Results showed that positive correlations existed between NIR spectra and concentration of the target compound in the established linear and nonlinear regression models. The optimal model was obtained by comparing different modeling processes. It was demonstrated that the PLS regression covering the range of 5450–4250 cm−1 could lead to a good linear relationship between spectra and polyphenols with an R² of 0.9170. The optimal model generated an RMSEP of 0.254, RSEP of 0.0554, and RPD of 3.47, revealing that the linear model was able to predict the content of polyphenols in tobacco. Support vector regression (SVR) preprocessed by SNV obtained predictable results with an R² of 0.8461, RMSEP of 0.374, and RPD of 2.36, which was inferior to PLS modeling.

Keywords: Polyphenols
[201] Kuaini Wang and Ping Zhong. Robust non-convex least squares loss function for regression with outliers. Knowledge-Based Systems, 71:290 - 302, 2014. [ bib | DOI | http ]
Abstract In this paper, we propose a robust scheme for least squares support vector regression (LS-SVR), termed as RLS-SVR, which employs non-convex least squares loss function to overcome the limitation of LS-SVR that it is sensitive to outliers. Non-convex loss gives a constant penalty for any large outliers. The proposed loss function can be expressed by a difference of convex functions (DC). The resultant optimization is a DC program. It can be solved by utilizing the Concave–Convex Procedure (CCCP). RLS-SVR iteratively builds the regression function by solving a set of linear equations at one time. The proposed RLS-SVR includes the classical LS-SVR as its special case. Numerical experiments on both artificial datasets and benchmark datasets confirm the promising results of the proposed algorithm.

Keywords: Least squares support vector regression
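One common way to obtain a non-convex least squares loss with a constant penalty for large residuals is to truncate the squared loss. The form below is an illustrative assumption and may differ from the exact loss used in [201]; r denotes the residual and τ > 0 is a hypothetical truncation threshold. It shows how such a loss splits into a difference of two convex functions, which is the structure the Concave–Convex Procedure needs.

```latex
% Truncated squared loss written as a difference of convex (DC) functions.
% r: residual y - f(x); \tau > 0: assumed truncation threshold.
L_\tau(r) \;=\; \min\{r^2,\ \tau^2\}
        \;=\; \underbrace{r^2}_{\text{convex}}
        \;-\; \underbrace{\max\{0,\ r^2 - \tau^2\}}_{\text{convex}}
% Check: if r^2 \le \tau^2 the second term vanishes and L_\tau(r) = r^2;
% if r^2 > \tau^2 then L_\tau(r) = r^2 - (r^2 - \tau^2) = \tau^2 (constant penalty).
```

Under this assumed form, linearizing the second term at the current iterate leaves a quadratic subproblem, which is consistent with the abstract's remark that each iteration solves a set of linear equations.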
[202] Petr Hájek and Vladimír Olej. Ozone prediction on the basis of neural networks, support vector regression and methods with uncertainty. Ecological Informatics, 12:31 - 42, 2012. [ bib | DOI | http ]
The article presents modeling of daily average ozone level prediction by means of neural networks, support vector regression and methods based on uncertainty. Based on data measured by a monitoring station of the Pardubice micro-region, the Czech Republic, and optimization of the number of parameters by a defined objective function and genetic algorithm, a model of daily average ozone level prediction in a certain time has been designed. The designed model has been optimized in light of its input parameters. The goal of prediction by various methods was to compare the results of prediction with the aim of various recommendations to micro-regional public administration management. Modeling is performed by means of feed-forward perceptron type neural networks, time delay neural networks, radial basis function neural networks, ε-support vector regression, fuzzy inference systems and Takagi–Sugeno intuitionistic fuzzy inference systems. Special attention is paid to the adaptation of the Takagi–Sugeno intuitionistic fuzzy inference system and adaptation of fuzzy logic-based systems using evolutionary algorithms. Based on data obtained, the daily average ozone level prediction in a certain time is characterized by a root mean squared error. The best possible results were obtained by means of an ε-support vector regression with polynomial kernel functions and Takagi–Sugeno intuitionistic fuzzy inference systems with adaptation by means of a Kalman filter.

Keywords: Ozone prediction
[203] Jui-Sheng Chou and Chih-Fong Tsai. Preliminary cost estimates for thin-film transistor liquid–crystal display inspection and repair equipment: A hybrid hierarchical approach. Computers & Industrial Engineering, 62(2):661 - 669, 2012. [ bib | DOI | http ]
The thin-film transistor liquid–crystal display (TFT-LCD) industry has developed rapidly in recent years. Because TFT-LCD manufacturing is highly complex and requires different tools for different products, accurately estimating the cost of manufacturing TFT-LCD equipment is essential. Conventional cost estimation models include linear regression (LR), artificial neural networks (ANNs), and support vector regression (SVR). Nevertheless, in accordance with recent evidence that a hierarchical structure outperforms a flat structure, this study proposes a hierarchical classification and regression (HCR) approach for improving the accuracy of cost predictions for TFT-LCD inspection and repair equipment. Specifically, first-level analyses by HCR classify new unknown cases into specific classes. The cases are then inputted into the corresponding prediction models for the final output. In this study, experimental results based on a real world dataset containing data for TFT-LCD equipment development projects performed by a leading Taiwan provider show that three prediction models based on the HCR approach are generally comparable to or better than three conventional flat models (LR, ANN, and SVR) in terms of prediction accuracy. In particular, the 4-class and 5-class support vector machines in the first-level HCR combined with individual SVR obtain the lowest root mean square error (RMSE) and mean absolute percentage error (MAPE) rates, respectively.

Keywords: TFT-LCD
[204] Ozgur Kisi and Mesut Cimen. Precipitation forecasting by using wavelet-support vector machine conjunction model. Engineering Applications of Artificial Intelligence, 25(4):783 - 792, 2012. Special Section: Dependable System Modelling and Analysis. [ bib | DOI | http ]
A new wavelet-support vector machine conjunction model for daily precipitation forecast is proposed in this study. The conjunction method combining two methods, discrete wavelet transform and support vector machine, is compared with the single support vector machine for one-day-ahead precipitation forecasting. Daily precipitation data from Izmir and Afyon stations in Turkey are used in the study. The root mean square errors (RMSE), mean absolute errors (MAE), and correlation coefficient (R) statistics are used as the comparison criteria. The comparison results indicate that the conjunction method could increase the forecast accuracy and perform better than the single support vector machine. For the Izmir and Afyon stations, it is found that the conjunction models with RMSE=46.5 mm, MAE=13.6 mm, R=0.782 and RMSE=21.4 mm, MAE=9.0 mm, R=0.815 in the test period are superior in forecasting daily precipitation to the most accurate support vector regression models with RMSE=71.6 mm, MAE=19.6 mm, R=0.276 and RMSE=38.7 mm, MAE=14.2 mm, R=0.103, respectively. The ANN method was also employed for the same data set, and it was found that there is a slight difference between the ANN and SVR methods.

Keywords: Precipitation
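The three comparison criteria named in entry [204] (RMSE, MAE, and the correlation coefficient R) can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function names and the sample values in the usage note are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson correlation coefficient R (assumes neither series is constant)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)
```

For example, with hypothetical daily totals `observed = [0.0, 2.0, 4.0]` and forecasts `predicted = [1.0, 2.0, 3.0]`, lower RMSE and MAE and an R closer to 1 would indicate the better model, which is how the abstract ranks the conjunction model above the single SVM.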
[205] Tony Bellotti, Roman Matousek, and Chris Stewart. A note comparing support vector machines and ordered choice models’ predictions of international banks’ ratings. Decision Support Systems, 51(3):682 - 687, 2011. [ bib | DOI | http ]
We find that support vector machines can produce notably better predictions of international bank ratings than the standard method currently used for this purpose, ordered choice models. This appears due to the support vector machine's ability to estimate a large number of country dummies unrestrictedly, which was not possible with the ordered choice models due to the low sample size.

Keywords: International bank ratings
[206] Ting-Yu Hsu, Shieh-Kung Huang, Yu-Weng Chang, Chun-Hsiang Kuo, Che-Min Lin, Tao-Ming Chang, Kuo-Liang Wen, and Chin-Hsiung Loh. Rapid on-site peak ground acceleration estimation based on support vector regression and P-wave features in Taiwan. Soil Dynamics and Earthquake Engineering, 49:210 - 217, 2013. [ bib | DOI | http ]
This study extracted some P-wave features from the first few seconds of vertical ground acceleration of a single station. These features include the predominant period, peak acceleration amplitude, peak velocity amplitude, peak displacement amplitude, cumulative absolute velocity and integral of the squared velocity. The support vector regression was employed to establish a regression model which can predict the peak ground acceleration according to these features. Some representative earthquake records of the Taiwan Strong Motion Instrumentation Program from 1992 to 2006 were used to train and validate the support vector regression model. Then the constructed model was tested using the whole earthquake records of the same period as well as the 2010 Kaohsiung earthquake with 6.4 ML. The effects on the performance of the regression models using different P-wave features and different lengths of time window to extract these features are studied. The results illustrated that, if the first 3 s of the vertical ground acceleration was used, the standard deviation of the predicted peak ground acceleration error of the whole tested 15-year earthquake records is 20.89 gal. The length of the time window could be shortened, e.g. to 1 s, at the cost of a slightly larger prediction error, in order to prolong the lead-time before destructive S-waves arrive.
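Several of the P-wave features listed in [206] can be computed from a short acceleration window by numerical integration. The sketch below assumes evenly spaced samples, zero initial velocity and displacement, the trapezoidal rule, and the usual definitions (cumulative absolute velocity as the time integral of |a|, and the integral of squared velocity); the function names are hypothetical, and the paper's own processing (for example filtering or baseline correction) is not reproduced here.

```python
def cumulative_trapezoid(samples, dt):
    """Running trapezoidal integral of evenly spaced samples, starting at 0."""
    out = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def p_wave_features(acc, dt):
    """Amplitude and integral features from an acceleration window.

    acc: vertical acceleration samples; dt: sampling interval in seconds.
    Velocity and displacement are obtained by integrating from rest
    (an assumption of this sketch).
    """
    vel = cumulative_trapezoid(acc, dt)
    disp = cumulative_trapezoid(vel, dt)
    return {
        "peak_acc": max(abs(a) for a in acc),
        "peak_vel": max(abs(v) for v in vel),
        "peak_disp": max(abs(d) for d in disp),
        # Cumulative absolute velocity: integral of |a(t)| over the window.
        "cav": cumulative_trapezoid([abs(a) for a in acc], dt)[-1],
        # Integral of the squared velocity over the window.
        "iv2": cumulative_trapezoid([v * v for v in vel], dt)[-1],
    }
```

A feature dictionary like this, computed over the first 1 s or 3 s of a record, would form one input row for a regression model; the predominant period is omitted because its definition varies between studies.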

[207] Di-Rong Chen and Han Li. Convergence rates of learning algorithms by random projection. Applied and Computational Harmonic Analysis, 37(1):36 - 51, 2014. [ bib | DOI | http ]
Abstract Random projection allows one to substantially reduce dimensionality of data while still retaining a significant degree of problem structure. In the past few years it has received considerable interest in compressed sensing and learning theory. By using the random projection of the data to low-dimensional space instead of the data themselves, a learning algorithm is implemented with low computational complexity. This paper investigates the accuracy of the algorithm of regularized empirical risk minimization in Hilbert spaces. By letting the dimensionality of the projected data increase suitably as the number of samples increases, we obtain an estimation of the error for least squares regression and support vector machines.

Keywords: Random projection
[208] Wendy Flores-Fuentes, Moises Rivas-Lopez, Oleg Sergiyenko, Felix F. Gonzalez-Navarro, Javier Rivera-Castillo, Daniel Hernandez-Balbuena, and Julio C. Rodríguez-Quiñonez. Combined application of power spectrum centroid and support vector machines for measurement improvement in optical scanning systems. Signal Processing, 98:37 - 51, 2014. [ bib | DOI | http ]
Abstract In this paper Support Vector Machine (SVM) Regression was applied to predict measurement errors for Accuracy Enhancement in Optical Scanning Systems, for position detection in real life application for Structural Health Monitoring (SHM) by a novel method, based on the Power Spectrum Centroid Calculation in determining the energy center of an optoelectronic signal in order to obtain accuracy enhancement in optical scanning system measurements. In the development of an Optical Scanning System based on a 45° – sloping surface cylindrical mirror and an incoherent light emitting source, surged a novel method in optoelectronic scanning, it has been found that in order to find the position of a light source and to reduce errors in position measurements, the best solution is taking the measurement in the energy centre of the signal generated by the Optical Scanning System. The Energy Signal Centre is found in the Power Spectrum Centroid and the SVM Regression Method is used as a digital rectified to increase measurement accuracy for Optical Scanning System.

Keywords: Support Vector Machines
[209] Hongzhi Tong, Di-Rong Chen, and Fenghong Yang. Support vector machines regression with l1-regularizer. Journal of Approximation Theory, 164(10):1331 - 1344, 2012. [ bib | DOI | http ]
The classical support vector machines regression (SVMR) is known as a regularized learning algorithm in reproducing kernel Hilbert spaces (RKHS) with an ε-insensitive loss function and an RKHS norm regularizer. In this paper, we study a new SVMR algorithm where the regularization term is proportional to the l1-norm of the coefficients in the kernel ensembles. We provide an error analysis of this algorithm; an explicit learning rate is then derived under some assumptions.

Keywords: Support vector machines regression
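The two regularization schemes contrasted in [209] can be written out as below. This is a sketch in commonly used notation, not the paper's exact formulation: the sample size m, the regularization parameter λ, the kernel K with RKHS H_K, and the 1/m normalization are all assumptions made for illustration.

```latex
% \varepsilon-insensitive loss
|t|_\varepsilon = \max\{0,\ |t| - \varepsilon\}

% Classical SVMR: RKHS-norm regularizer
f_{\mathbf{z}} = \arg\min_{f \in \mathcal{H}_K}
  \frac{1}{m} \sum_{i=1}^{m} \bigl| y_i - f(x_i) \bigr|_\varepsilon
  + \lambda \, \| f \|_K^2

% l1-regularized variant: penalty on the kernel expansion coefficients
\alpha_{\mathbf{z}} = \arg\min_{\alpha \in \mathbb{R}^m}
  \frac{1}{m} \sum_{i=1}^{m} \bigl| y_i - f_\alpha(x_i) \bigr|_\varepsilon
  + \lambda \sum_{j=1}^{m} |\alpha_j| ,
\qquad
f_\alpha(x) = \sum_{j=1}^{m} \alpha_j \, K(x, x_j)
```

In the second problem the hypothesis space depends on the sample through the points x_j, and the l1 penalty tends to favor sparse coefficient vectors.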
[210] Andreas Rienow and Roland Goetzke. Supporting SLEUTH – enhancing a cellular automaton with support vector machines for urban growth modeling. Computers, Environment and Urban Systems, 49:66 - 81, 2015. [ bib | DOI | http ]
Abstract In recent years, urbanization has been one of the most striking change processes in the socioecological system of Central Europe. Cellular automata (CA) are a popular and robust approach for the spatially explicit simulation of land-use and land-cover changes. The CA SLEUTH simulates urban growth using four simple but effective growth rules. Although the performance of SLEUTH is very high, the modeling process still is strongly influenced by stochastic decisions resulting in a variable pattern. Besides, it gives no information about the human and ecological forces driving the local suitability of urban growth. Hence, the objective of this research is to combine the simulation skills of CA with the machine learning approach called support vector machines (SVM). SVM has the basic idea to project input vectors on a higher-dimensional feature space, in which an optimal hyperplane can be constructed for separating the data into two or more classes. By using a forward feature selection, important features can be identified and separated from unimportant ones. The anchor point of coupling both methods is the exclusion layer of SLEUTH. It will be replaced by an SVM-based probability map of urban growth. As a kind of litmus test, we compare the approach with the combination of CA and binomial logistic regression (BLR), a frequently used technique in urban growth studies. The integrated models are applied to an area in the federal state of North Rhine-Westphalia involving a highly urbanized region along the Rhine valley (Cologne, Düsseldorf) and a rural, hilly region (Bergisches Land) with a dispersed settlement pattern. Various geophysical and socio-economic driving forces are included, and comparatively evaluated. The validation shows that the quantity and the allocation performance of SLEUTH are augmented clearly when coupling SLEUTH with a BLR- or SVM-based probability map. The combination enables the dynamical simulation of different growth types on the one hand as well as the analyses of various geophysical and socio-economic driving forces on the other hand. The SVM approach needs fewer variables than the BLR model and SVM-based probabilities exhibit a higher certainty compared to those derived by BLR.

Keywords: Urban growth model
[211] S. Salcedo-Sanz, J.C. Nieto Borge, L. Carro-Calvo, L. Cuadra, K. Hessner, and E. Alexandre. Significant wave height estimation using SVR algorithms and shadowing information from simulated and real measured X-band radar images of the sea surface. Ocean Engineering, 101:244 - 253, 2015. [ bib | DOI | http ]
Abstract In this paper we propose to apply the Support Vector Regression (SVR) methodology to significant wave height estimation using the shadowing effect, that is visible on the X-band marine radar images of the sea surface due to the presence of high waves. One of the main problems of using sea clutter images is that, for a given sea state conditions, the shadowing effect depends on the radar antenna installation features, such as the angle of incidence. On the other hand, for a given radar antenna location, the shadowing properties depend on the different sea state parameters, like wave periods, and wave lengths. Thus, in this paper we show that SVR can be successfully trained from simulation-based data. We propose a simulation process for X-band marine radar images derived from simulated wave elevation fields using the stochastic wave theory. We show the performance of the SVR in simulation data and how SVR outperforms alternative algorithms such as neural networks. Finally, we show that the simulation process is reliable by applying the SVR methodology trained in the simulation-based data to real measured data, obtaining good prediction results in wave height, which indicates the goodness of our proposal.

Keywords: Significant wave height prediction
[212] Wei-Chiang Hong, Yucheng Dong, Feifeng Zheng, and Chien-Yuan Lai. Forecasting urban traffic flow by SVR with continuous ACO. Applied Mathematical Modelling, 35(3):1282 - 1291, 2011. [ bib | DOI | http ]
Accurate forecasting of inter-urban traffic flow has been one of the most important issues globally in the research on road traffic congestion. Because the information of inter-urban traffic presents a challenging situation, the traffic flow forecasting involves a rather complex nonlinear data pattern. In recent years, the support vector regression model (SVR) has been widely used to solve nonlinear regression and time series problems. This investigation presents a short-term traffic forecasting model which combines the support vector regression model with continuous ant colony optimization algorithms (SVRCACO) to forecast inter-urban traffic flow. Additionally, a numerical example of traffic flow values from northern Taiwan is employed to elucidate the forecasting performance of the proposed SVRCACO model. The forecasting results indicate that the proposed model yields more accurate forecasting results than the seasonal autoregressive integrated moving average (SARIMA) time series model. Therefore, the SVRCACO model is a promising alternative for forecasting traffic flow.

Keywords: Traffic flow forecasting
[213] Yong-Ping Zhao and Jian-Guo Sun. Multikernel semiparametric linear programming support vector regression. Expert Systems with Applications, 38(3):1611 - 1618, 2011. [ bib | DOI | http ]
In many real-life domains, unknown systems exhibit different data trends in different regions, i.e., some parts show steep variations while other parts show smooth variations. If we utilize the conventional kernel learning algorithm, viz. the single kernel linear programming support vector regression, to identify these systems, the identification results are usually not very good. Hence, we exploit the nonlinear mappings induced from the kernel functions as the admissible functions to construct a novel multikernel semiparametric predictor, called MSLP-SVR, to improve the regression effectiveness. The experimental results on the synthetic and the real-world data sets corroborate the efficacy and validity of our proposed MSLP-SVR. Moreover, it compares favorably with other multikernel linear programming support vector algorithms. In addition, although the MSLP-SVR is proposed in the regression domain, it can also be extended to classification problems.

Keywords: Linear programming support vector regression
[214] Hung-Hsu Tsai, Bae-Muu Chang, and Xuan-Ping Lin. Using decision tree, particle swarm optimization, and support vector regression to design a median-type filter with a 2-level impulse detector for image enhancement. Information Sciences, 195:103 - 123, 2012. [ bib | DOI | http ]
The paper presents a system using Decision tree, Particle swarm optimization, and Support vector regression to design a Median-type filter with a 2-level impulse detector for image enhancement, called DPSM filter. First, it employs a varying 2-level hybrid impulse noise detector (IND) to determine whether a pixel is contaminated by impulse noises or not. The 2-level IND is constructed by a decision tree (DT) which is built via combining 10 impulse noise detectors. Also, the particle swarm optimization (PSO) algorithm is exploited to optimize the DT. Subsequently, the DPSM filter utilizes the median-type filter with the support vector regression (MTSVR) to restore the corrupted pixels. Experimental results demonstrate that the DPSM filter achieves high performance for detecting and restoring impulse noises, and also outperforms the existing well-known methods under consideration in the paper.

Keywords: Impulse noise detector
[215] Pilar Campoy-Muñoz, Pedro Antonio Gutiérrez, and César Hervás-Martínez. Addressing remitting behavior using an ordinal classification approach. Expert Systems with Applications, 41(10):4752 - 4761, 2014. [ bib | DOI | http ]
Abstract The remittance market represents a great business opportunity for financial institutions given the increasing volume of these capital flows throughout the world. However, the corresponding business strategy could be costly and time consuming because immigrants do not respond to general media campaigns. In this paper, the remitting behavior of immigrants has been addressed by a classification approach that predicts the remittance levels sent by immigrants according to their individual characteristics, thereby identifying the most profitable customers within this group. To do so, five nominal and two ordinal classifiers were applied to an immigrant sample and their resulting performances were compared. The ordinal classifiers achieved the best results; the Support Vector Machine with Ordered Partitions (SVMOP) yielded the best model, providing information needed to draw remitting profiles that are useful for financial institutions. The Support Vector Machine with Explicit Constraints (SVOREX), however, achieved the second best results, and these results are presented graphically to study misclassified patterns in a natural and simple way. Thus, financial institutions can use this ordinal SVM-based approach as a tool to generate valuable information to develop their remittance business strategy.

Keywords: Nominal classification
[216] Helena G. Ramos, Tiago Rocha, Jakub Král, Dário Pasadas, and Artur L. Ribeiro. An SVM approach with electromagnetic methods to assess metal plate thickness. Measurement, 54:201 - 206, 2014. [ bib | DOI | http ]
Abstract Eddy current testing (ECT) is a non-destructive technique that can be used in the measurement of conductive material thickness. In this work ECT and a machine learning algorithm (support vector machine – SVM) are used to determine accurately the thickness of metallic plates. The study has been made with ECT measurements on real specimens. At a first stage, a small number of plates is considered and SVM is used for a multi-class classification of the conductive plate thicknesses within a finite number of categories. Several figures of merit were tested to investigate the features that lead to “good” separating hyperplanes. Then, based on an SVM regressor, a reliable estimation of the thickness of a large quantity of plates is tested. Eddy currents are induced by imposing a voltage step in an excitation coil (transient eddy currents – TEC), while a giant magnetoresistance (GMR) is the magnetic sensor that measures the transient magnetic field intensity in the sample vicinity. An experimental validation procedure, including machine training with linear and exponential kernels and classification errors, is presented with sets of samples with thicknesses up to 7.5 mm.

Keywords: Eddy current testing
[217] Sadegh Baziar, Mehdi Tadayoni, Majid Nabi-Bidhendi, and Mohsen Khalili. Prediction of permeability in a tight gas reservoir by using three soft computing approaches: A comparative study. Journal of Natural Gas Science and Engineering, 21:718 - 724, 2014. [ bib | DOI | http ]
Abstract Permeability is the most important petrophysical property in tight gas reservoirs. Many researchers have worked on permeability measurement methods, but there is no universal method yet which can predict permeability in the whole field and in all intervals of the wells. Artificial intelligence methods have therefore been used to predict permeability from well log data in all field areas. In this research, Multilayer Perceptron Neural Network, Co-Active Neuro-Fuzzy Inference System and Support Vector Machine techniques have been employed to predict permeability of Mesaverde tight gas sandstones located in the Washakie basin in the USA. Multilayer Perceptrons are the most used neural networks in regression tasks. Co-Active Neuro-Fuzzy Inference System is a method which combines a fuzzy model and a neural network in a manner that produces accurate results. Support Vector Machine is a relatively new intelligence method with great capabilities in regression and classification tasks. Each method has advantages and disadvantages, and here their capability in predicting permeability has been evaluated. In this study, data from three wells were used and two different dataset patterns were constructed to evaluate the performance of the models in predicting permeability by using either previously seen data or unseen data. The most important aspect of this research is the investigation of the capability of these methods to generalize the training patterns to previously unseen data. Results showed that all methods have acceptable performance in predicting permeability, but Co-Active Neuro-Fuzzy Inference System and Support Vector Machine perform better than Multilayer Perceptron and predict permeability more accurately.

Keywords: Permeability
[218] Armin Walter, Georgios Naros, Martin Spüler, Alireza Gharabaghi, Wolfgang Rosenstiel, and Martin Bogdan. Decoding stimulation intensity from evoked ECoG activity. Neurocomputing, 141:46 - 53, 2014. [ bib | DOI | http ]
Abstract Cortical stimulation is used for therapeutic applications and research into neural processes. Cortical evoked responses to stimulation yield important information about neural connectivity and cortical excitability but are sensitive to changes in stimulation parameters. So far, the relationship between the stimulation parameters and the evoked responses has been reported only descriptively. In this paper we propose the use of regression analysis to train models that infer the stimulation intensity from the shape of the evoked activity. Using Support Vector Regression and electrocorticogram (ECoG) responses to electrical stimulation via epidural electrodes collected from two stroke patients, we show that the models can capture this relationship and generalize to intensities not used during the training process.

Keywords: Cortical stimulation
[219] Jiahuan Wu, Jianlin Wang, Tao Yu, and Liqiang Zhao. An approach to continuous approximation of pareto front using geometric support vector regression for multi-objective optimization of fermentation process. Chinese Journal of Chemical Engineering, 22(10):1131 - 1140, 2014. [ bib | DOI | http ]
Abstract The approaches to discrete approximation of Pareto front using multi-objective evolutionary algorithms have the problems of heavy computation burden, long running time and missing Pareto optimal points. In order to overcome these problems, an approach to continuous approximation of Pareto front using geometric support vector regression is presented. The regression model of the small size approximate discrete Pareto front is constructed by geometric support vector regression modeling and is described as the approximate continuous Pareto front. In the process of geometric support vector regression modeling, considering the distribution characteristic of Pareto optimal points, the separable augmented training sample sets are constructed by shifting original training sample points along multiple coordinated axes. Besides, an interactive decision-making (DM) procedure, in which the continuous approximation of Pareto front and decision-making is performed interactively, is designed for improving the accuracy of the preferred Pareto optimal point. The correctness of the continuous approximation of Pareto front is demonstrated with a typical multi-objective optimization problem. In addition, combined with the interactive decision-making procedure, the continuous approximation of Pareto front is applied in the multi-objective optimization for an industrial fed-batch yeast fermentation process. The experimental results show that the generated approximate continuous Pareto front has good accuracy and completeness. Compared with the multi-objective evolutionary algorithm with large size population, a more accurate preferred Pareto optimal point can be obtained from the approximate continuous Pareto front with less computation and shorter running time. The operation strategy corresponding to the final preferred Pareto optimal point generated by the interactive DM procedure can improve the production indexes of the fermentation process effectively.

Keywords: Continuous approximation of Pareto front
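As a reading aid for entry [219]: the abstract starts from a small discrete Pareto front and fits a regression model to it. The sketch below shows only that first ingredient, a plain non-dominated filter, and is not the authors' geometric support vector regression. The function names and the convention that every objective is minimized are assumptions made for illustration.

```python
def dominates(a, b):
    """True if point a is at least as good as b in every objective and
    strictly better in at least one (assumption: all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective sample: (3, 3) is dominated by (2, 2).
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(candidates)
```

The surviving points would be the training samples for whatever regression model is used to interpolate the front continuously.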
[220] JinXing Che and JianZhou Wang. Short-term load forecasting using a kernel-based support vector regression combination model. Applied Energy, 132:602 - 609, 2014. [ bib | DOI | http ]
Abstract Kernel-based methods, such as support vector regression (SVR), have demonstrated satisfactory performance in short-term load forecasting (STLF) applications. However, the good performance of a kernel-based method depends on the selection of an appropriate kernel function that fits the learning target; an unsuitable kernel function or hyper-parameter setting may lead to significantly poor performance. To get the optimal kernel function for the STLF problem, this paper proposes a kernel-based SVR combination model by using a novel individual model selection algorithm. Moreover, the proposed combination model provides a new approach to kernel function selection for the SVR model. The performance and electric load forecast accuracy of the proposed model are assessed by means of real data from the Australia and California Power Grid, respectively. The simulation results from numerical tables and figures show that the proposed combination model increases electric load forecasting accuracy compared to the best individual kernel-based SVR model.

Keywords: Short-term load forecasting
[221] N. Garijo, J. Martínez, J.M. García-Aznar, and M.A. Pérez. Computational evaluation of different numerical tools for the prediction of proximal femur loads from bone morphology. Computer Methods in Applied Mechanics and Engineering, 268:437 - 450, 2014. [ bib | DOI | http ]
Abstract Patient-specific modeling is becoming increasingly important. One of the most challenging difficulties in creating patient-specific models is the determination of the specific load that the bone is really supporting. Real information relating to specific patients, such as bone geometry and bone density distribution, can be used to determine these loads. The main goal of this study is to theoretically estimate patient-specific loads from bone geometry and density measurements, comparing different mathematical techniques: linear regression, artificial neural networks with individual or multiple outputs and support vector machines. This methodology has been applied to 2D/3D finite element models of a proximal femur with different results. Linear regression and artificial neural networks demonstrated a good load prediction with relative error less than 2%. However, the support vector machine technique predicted higher relative errors. Using artificial neural networks with multiple outputs we obtained a high degree of accuracy in the prediction of the load conditions that produce a known bone density distribution. Therefore, it is shown that the proposed method is capable of predicting the loading that induces a specific bone density distribution.

Keywords: Artificial neuronal network
[222] Yvonne Gala, Ángela Fernández, Julia Díaz, and José R. Dorronsoro. Hybrid machine learning forecasting of solar radiation values. Neurocomputing, pages -, 2015. [ bib | DOI | http ]
Abstract The constant expansion of solar energy has made the accurate forecasting of radiation an important issue. In this work we apply Support Vector Regression (SVR), Gradient Boosted Regression (GBR), Random Forest Regression (RFR) as well as a hybrid method to combine them to downscale and improve 3-h accumulated radiation forecasts provided by Numerical Weather Prediction (NWP) systems for seven locations in Spain. We use either direct 3-h aggregated radiation forecasts or we build first global accumulated daily predictions and disaggregate them into 3-h values, with both approaches outperforming the base NWP forecasts. We also show how to disaggregate the 3-h forecasts into hourly values using interpolation based on clear sky (CS) theoretical and experimental radiation models, with the disaggregated forecasts again being better than the base NWP ones and where empirical CS interpolation yields the best results. Besides providing ample background on a problem that offers many opportunities to the Machine Learning (ML) community, our study shows that ML methods or, more generally, hybrid artificial intelligence systems are quite effective and, hence, relevant for solar radiation prediction.

Keywords: Solar radiation
[223] Y.F. Li, S.H. Ng, M. Xie, and T.N. Goh. A systematic comparison of metamodeling techniques for simulation optimization in decision support systems. Applied Soft Computing, 10(4):1257 - 1273, 2010. Optimisation Methods & Applications in Decision-Making Processes. [ bib | DOI | http ]
Abstract Simulation is a widely applied tool to study and evaluate complex systems. Due to the stochastic and complex nature of real world systems, simulation models for these systems are often difficult to build and time consuming to run. Metamodels are mathematical approximations of simulation models, and have been frequently used to reduce the computational burden associated with running such simulation models. In this paper, we propose to incorporate metamodels into Decision Support Systems to improve their efficiency and enable larger and more complex models to be effectively analyzed with Decision Support Systems. To evaluate the different metamodel types, a systematic comparison is first conducted to analyze the strengths and weaknesses of five popular metamodeling techniques (Artificial Neural Network, Radial Basis Function, Support Vector Regression, Kriging, and Multivariate Adaptive Regression Splines) for stochastic simulation problems. The results show that Support Vector Regression achieves the best performance in terms of accuracy and robustness. We further propose a general optimization framework GA-META, which integrates metamodels into the Genetic Algorithm, to improve the efficiency and reliability of the decision making process. This approach is illustrated with a job shop design problem. The results indicate that GA-Support Vector Regression achieves the best solution among the metamodels.

Keywords: Decision Support System
[224] Hao Zhou, Kang Zhou, Qi Tang, Shangbin Chen, and Kefa Cen. Using a core-vector machine to correct the steam-separator temperature deviations of a 1000 MW boiler. Fuel, 130:142 - 148, 2014. [ bib | DOI | http ]
Abstract Steam-separator temperature is an important parameter of ultra-supercritical boilers, where temperature deviations result in an increase in feed-water and a fast decline in steam temperature. Optimizing temperature deviations through manual operating-variables adjustments is difficult because of the complex relationships among influencing factors, as well as unacceptable increases in combustion air from opening baffles. Therefore, this research has used a core-vector regression (CVR) algorithm to model steam-separator temperature deviations. CVR is an extremely fast way to model the process and gives more accurate predictions than a support vector machine (SVM). Seventy-seven operating parameters were used as inputs, the objective was set as the temperature deviation factor of all the steam separators, and in total 17,338 experimental cases from the DCS were used in this study. Secondary-air volume adjustments at the C and D levels in #4 corner were carried out at different boiler loads in field tests, and the temperature deviation after each test was compared with the original value. Results showed that steam-temperature deviation was decreased by 29.6% and 36.3% respectively at 700 MW and 530 MW after secondary-air volume was adjusted to the target value.

Keywords: CVR
[225] Qi Wu and Rob Law. The complex fuzzy system forecasting model based on fuzzy SVM with triangular fuzzy number input and output. Expert Systems with Applications, 38(10):12085 - 12093, 2011. [ bib | DOI | http ]
Abstract This paper presents a new version of fuzzy support vector machine to forecast the nonlinear fuzzy system with multi-dimensional input variables. The input and output variables of the proposed model are described as triangular fuzzy numbers. Then by integrating the triangular fuzzy theory and v-support vector regression machine, the triangular fuzzy v-support vector machine (TFv-SVM) is proposed. Particle swarm optimization (PSO) is applied to seek the optimal parameters of TFv-SVM. A forecasting method based on TFv-SVRM and PSO is put forward. The results of the application in sale system forecasts confirm the feasibility and the validity of the forecasting method. Compared with the traditional model, the TFv-SVM method requires fewer samples and has better forecasting precision.

Keywords: Fuzzy v-support vector machine
[226] Qi Wu and Rob Law. The forecasting model based on fuzzy novel ν-support vector machine. Expert Systems with Applications, 38(10):12028 - 12034, 2011. [ bib | DOI | http ]
Abstract This paper presents a new version of fuzzy support vector machine to forecast multi-dimensional fuzzy samples. By combining the triangular fuzzy theory with the modified ν-support vector machine, the fuzzy novel ν-support vector machine (FNν-SVM) is proposed. It has one fewer constraint condition than the standard Fν-SVM and is proved to satisfy the structural risk minimization rule under the condition of probability. Moreover, there is no parameter b in the regression function of the FNν-SVM. Particle swarm optimization is also proposed to seek the optimal values of the unknown parameters of the FNν-SVM. The results of the application in sale forecasts confirm the feasibility and the validity of the FNν-SVM model. Compared with the traditional model, the FNν-SVM method requires fewer samples and has better forecasting precision.

Keywords: Fuzzy ν-support vector machine
[227] Sunil K. Jha and Kenshi Hayashi. A novel odor filtering and sensing system combined with regression analysis for chemical vapor quantification. Sensors and Actuators B: Chemical, 200:269 - 287, 2014. [ bib | DOI | http ]
Abstract An advanced odor filtering and sensing system based on polymers, carbon molecular sieves, micro-ceramic heaters and a metal oxide semiconductor (MOS) gas sensor array has been designed for quantitative identification of volatile organic chemicals (VOCs). MOS sensor resistance due to chemical vapor adsorption in the filtering material and after desorption is measured for five target VOCs, including acetone, benzene, ethanol, pentanal, and propenoic acid, at distinct concentrations between 3 and 500 parts per million (ppm). Two kinds of regression methods, specifically linear regression analysis based on the least square criterion and kernel function based support vector regression (SVR), have been employed to model sensor resistance with VOC concentration. Scatter plots and Spearman's rank correlation coefficient (ρ) are used to investigate the strength of dependence of sensor resistance on vapor concentration and to search for the optimal filtering material for VOC quantification prior to the regression analysis. Quantitative recognition efficiency of the regression methods has been evaluated on the basis of the coefficient of determination R² (R-squared) and correlation values. MOS sensor resistance after vapor desorption with carbon molecular sieve (carboxen–1012) as filtering material results in the maximum values of R-squared (R² = 0.9957) and correlation (ρ = 1.00) between the actual and estimated concentration for propenoic acid using the radial basis kernel based SVR method.

Keywords: Odor filter
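As a reading aid for entry [227]: the abstract evaluates regression quality with Spearman's rank correlation coefficient (ρ) and the coefficient of determination (R-squared). The following is a minimal pure-Python sketch of both statistics using their standard definitions. The function names and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def r_squared(actual, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - m) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical monotone but nonlinear relation: rho is 1 even though
# the relation is not linear.
rho = spearman_rho([1, 2, 3, 4], [1, 4, 9, 16])
```

Because ρ depends only on ranks, it reaches 1 for any strictly increasing relation, which is why it can be used to screen filtering materials before a regression model is fitted.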
[228] Hiromasa Kaneko and Kimito Funatsu. Adaptive soft sensor based on online support vector regression and bayesian ensemble learning for various states in chemical plants. Chemometrics and Intelligent Laboratory Systems, 137:57 - 66, 2014. [ bib | DOI | http ]
Abstract A soft sensor predicts the values of some process variable y that is difficult to measure. To maintain the predictive ability of a soft sensor model, adaptation mechanisms are applied to soft sensors. However, even these adaptive soft sensors cannot predict the y-values of various process states in chemical plants, and it is difficult to ensure the predictive ability of such models on a long-term basis. Therefore, we propose a method that combines online support vector regression (OSVR) with an ensemble learning system to adapt to nonlinear and time-varying changes in process characteristics and various process states in a plant. Several OSVR models, each of which has an adaptation mechanism and is updated with new data, predict y-values. A final predicted y-value is calculated based on those predicted y-values and Bayes' rule. We analyze a numerical dataset and two real industrial datasets, and demonstrate the superiority of the proposed method.

Keywords: Process control
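As a reading aid for entry [228]: the abstract says the final y-value is computed from the individual model predictions and Bayes' rule, but does not give the formula. The sketch below shows one common way such a combination can be set up, and every detail of it is an assumption: each model's posterior weight is taken proportional to its prior times a Gaussian likelihood of its most recent prediction error, and the final prediction is the weighted average. The names `posterior_weights`, `combine`, and the parameter `sigma` are hypothetical.

```python
import math


def gaussian_likelihood(error, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma at error."""
    return math.exp(-0.5 * (error / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def posterior_weights(recent_errors, priors, sigma=1.0):
    """Bayes' rule: posterior is proportional to prior times likelihood.

    Assumption: sigma is a user-chosen error scale shared by all models.
    """
    unnormalized = [p * gaussian_likelihood(e, sigma)
                    for e, p in zip(recent_errors, priors)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def combine(predictions, weights):
    """Final prediction as the posterior-weighted average of model outputs."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical example: two models, equal priors; the first model was
# more accurate on the latest measured sample, so it gets more weight.
weights = posterior_weights(recent_errors=[0.1, 2.0], priors=[0.5, 0.5])
y_hat = combine([10.0, 14.0], weights)
```

With equal priors and equal recent errors the weights reduce to a plain average; very large errors relative to `sigma` can make all likelihoods underflow to zero, so a practical version would need to guard the normalization.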
[229] Chunhua Zhang, Dewei Li, and Junyan Tan. The support vector regression with adaptive norms. Procedia Computer Science, 18:1730 - 1736, 2013. 2013 International Conference on Computational Science. [ bib | DOI | http ]
Abstract This study proposes a new method for regression – lp-norm support vector regression (lp SVR). Some classical SVRs minimize the hinge loss function subject to the l2-norm or l1-norm penalty. These methods are non-adaptive since their penalty forms are fixed and pre-determined for any type of data. Our new model is an adaptive learning procedure with lp-norm (0 < p < 1), where the best p is automatically chosen by data. By adjusting the parameter p, lp SVR can not only select relevant features but also improve the regression accuracy. An iterative algorithm is suggested to solve the lp SVR efficiently. Simulations and real data applications support the effectiveness of the proposed procedure.

Keywords: Regression
[230] JinXing Che. A novel hybrid model for bi-objective short-term electric load forecasting. International Journal of Electrical Power & Energy Systems, 61:259 - 266, 2014. [ bib | DOI | http ]
Abstract Context: Current decision development in the electricity market needs a variety of forecasting techniques to analyze the nature of electric load series. The interpretability and forecasting accuracy of the electric load series are two main objectives when establishing the load forecasting model. Objective: Considering that electric load series exhibit repeating seasonal cycles at different levels (daily, weekly and annual seasonality), this paper concerns the interpretability of these seasonal cycles and the forecasting accuracy. Method: For the above purposes, the author first introduces a multiple linear regression model that treats all the seasonal cycles as the input attributes. The result helps the managers to interpret the series structure with multiple seasonal cycles. To improve the forecasting accuracy, a support vector regression model based on optimal training subset (OTS) and adaptive particle swarm optimization (APSO) algorithm is established to forecast the residual series. Thus, a novel hybrid model combining the proposed linear regression model and support vector regression model is built to achieve the above bi-objective short-term load forecasting. Results: The effectiveness of the hybrid model is evaluated by electrical load forecasting in the California electricity market. The proposed modeling algorithm generates not only the decomposition of the time series into seasonal cycles, but also more accurate predictions. Conclusion: It is concluded that the hybrid model provides a very powerful and easily implemented tool for bi-objective short-term electric load forecasting.

Keywords: Bi-objective short-term electric load forecasting
[231] Wangdong Ni, Lars Nørgaard, and Morten Mørup. Non-linear calibration models for near infrared spectroscopy. Analytica Chimica Acta, 813:1 - 14, 2014. [ bib | DOI | http ]
Abstract Different calibration techniques are available for spectroscopic applications that show nonlinear behavior. This comprehensive comparative study presents a comparison of different nonlinear calibration techniques: kernel PLS (KPLS), support vector machines (SVM), least-squares SVM (LS-SVM), relevance vector machines (RVM), Gaussian process regression (GPR), artificial neural network (ANN), and Bayesian ANN (BANN). In this comparison, partial least squares (PLS) regression is used as a linear benchmark, while the relationship of the methods is considered in terms of traditional calibration by ridge regression (RR). The performance of the different methods is demonstrated by their practical applications using three real-life near infrared (NIR) data sets. Different aspects of the various approaches including computational time, model interpretability, potential over-fitting using the non-linear models on linear problems, robustness to small or medium sample sets, and robustness to pre-processing, are discussed. The results suggest that GPR and BANN are powerful and promising methods for handling linear as well as nonlinear systems, even when the data sets are moderately small. The LS-SVM is also attractive due to its good predictive performance for both linear and nonlinear calibrations.

Keywords: NIR
[232] Xiaolin Huang, Lei Shi, Kristiaan Pelckmans, and Johan A.K. Suykens. Asymmetric ν-tube support vector regression. Computational Statistics & Data Analysis, 77:371 - 382, 2014. [ bib | DOI | http ]
Abstract Finding a tube of small width that covers a certain percentage of the training data samples is a robust way to estimate a location: the values of the data samples falling outside the tube have no direct influence on the estimate. The well-known ν-tube Support Vector Regression (ν-SVR) is an effective method for implementing this idea in the context of covariates. However, the ν-SVR considers only one possible location of this tube: it imposes that the numbers of data samples above and below the tube are equal. The method is generalized such that those outliers can be divided asymmetrically over both regions. This extension gives an effective way to deal with skewed noise in regression problems. Numerical experiments illustrate the computational efficacy of this extension to the ν-SVR.

Keywords: Robust regression
[233] Akiko Takeda and Takafumi Kanamori. Using financial risk measures for analyzing generalization performance of machine learning models. Neural Networks, 57:29 - 38, 2014. [ bib | DOI | http ]
Abstract We propose a unified machine learning model (UMLM) for two-class classification, regression and outlier (or novelty) detection via a robust optimization approach. The model embraces various machine learning models such as support vector machine-based and minimax probability machine-based classification and regression models. The unified framework makes it possible to compare and contrast existing learning models and to explain their differences and similarities. In this paper, after relating existing learning models to UMLM, we show some theoretical properties for UMLM. Concretely, we show an interpretation of UMLM as minimizing a well-known financial risk measure (worst-case value-at-risk (VaR) or conditional VaR), derive generalization bounds for UMLM using such a risk measure, and prove that solving problems of UMLM leads to estimators with the minimized generalization bounds. Those theoretical properties are applicable to related existing learning models.

Keywords: Support vector machine
[234] Mounika Lingala, R. Joe Stanley, Ryan K. Rader, Jason Hagerty, Harold S. Rabinovitz, Margaret Oliviero, Iqra Choudhry, and William V. Stoecker. Fuzzy logic color detection: Blue areas in melanoma dermoscopy images. Computerized Medical Imaging and Graphics, 38(5):403 - 410, 2014. [ bib | DOI | http ]
Abstract Fuzzy logic image analysis techniques were used to analyze three shades of blue (lavender blue, light blue, and dark blue) in dermoscopic images for melanoma detection. A logistic regression model provided up to 82.7% accuracy for melanoma discrimination for 866 images. With a support vector machines (SVM) classifier, lower accuracy was obtained for individual shades (79.9–80.1%) compared with up to 81.4% accuracy with multiple shades. All fuzzy blue logic alpha cuts scored higher than the crisp case. Fuzzy logic techniques applied to multiple shades of blue can assist in melanoma detection. These vector-based fuzzy logic techniques can be extended to other image analysis problems involving multiple colors or color shades.

Keywords: Fuzzy logic
[235] Pan Xiong, Xiaobo Ji, Xin Zhao, Wei Lv, Taiang Liu, and Wencong Lu. Materials design and control synthesis of the layered double hydroxide with the desired basal spacing. Chemometrics and Intelligent Laboratory Systems, 144:11 - 16, 2015. [ bib | DOI | http ]
Abstract Efficient and effective prediction of the basal spacing is of great importance to materials design of layered double hydroxides (LDHs). In this work, a QSPR model was constructed to predict the basal spacing of LDHs from 7.5 to 8.0 Å by using the support vector regression (SVR) algorithm. The genetic algorithm (GA)–support vector regression (SVR) method was used to filter the main molecular descriptors in modeling. The resulting QSPR model was tested on an external test set consisting of 8 compounds. As a case study of controllable synthesis based on the QSPR model, a new LDH of the Mg–Al–CO3 system with the desired basal spacing of 7.6 Å, which was screened out from an LDH dataset consisting of 30 different kinds of samples, was verified by our experiment with a relative error of 0.93%. The method outlined here, reported for the first time, can serve as a new computational template for the materials design and control synthesis of LDHs with the desired basal spacing based on a QSPR model.

Keywords: QSPR
[236] Paulius Danenas and Gintautas Garsva. Selection of support vector machines based classifiers for credit risk domain. Expert Systems with Applications, 42(6):3194 - 3204, 2015. [ bib | DOI | http ]
Abstract This paper describes an approach for credit risk evaluation based on linear Support Vector Machines classifiers, combined with external evaluation and sliding window testing, with a focus on application to larger datasets. It presents a technique for optimal linear SVM classifier selection based on particle swarm optimization, with significant attention paid to the imbalanced learning issue. It is compared to other classifiers in terms of accuracy and identification of each class. Experimental classification performance results, obtained using a real world financial dataset from the SEC EDGAR database, lead to the conclusion that the proposed technique is capable of producing results comparable to other classifiers, such as logistic regression and RBF network, and thus can be an appealing option for future development of real credit risk evaluation models.

Keywords: Support Vector Machines
[237] Pei-Yi Hao. Interval regression analysis using support vector networks. Fuzzy Sets and Systems, 160(17):2466 - 2485, 2009. Theme: Learning. [ bib | DOI | http ]
Abstract Support vector machines (SVMs) have been very successful in pattern classification and function estimation problems for crisp data. In this paper, the ν-support vector interval regression network (ν-SVIRN) is proposed to evaluate interval linear and nonlinear regression models for crisp input and output data. As it is difficult to select an appropriate value of the insensitive tube width in the ε-support vector regression network, the proposed ν-SVIRN alleviates this problem by utilizing a new parametric-insensitive loss function. The proposed ν-SVIRN automatically adjusts a flexible parametric-insensitive zone of arbitrary shape and minimal size to include the given data. Besides, the proposed method can achieve automatic accuracy control in the interval regression analysis task. For an a priori chosen ν, at most a fraction ν of the data points lie outside the interval model constructed by the proposed ν-SVIRN. To be more precise, ν is an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors. Hence, the selection of ν is more intuitive. Moreover, the proposed algorithm is a model-free method in the sense that we do not have to assume the underlying model function. Experimental results are then presented which show that the proposed ν-SVIRN is useful in practice, especially when the noise is heteroscedastic, that is, when the noise strongly depends on the input value x.

Keywords: Support vector machines (SVMs)
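As a reading aid for entry [237]: the abstract contrasts a fixed insensitive tube of width ε with a zone whose size is controlled through the fraction of points allowed outside it. The sketch below illustrates only the standard fixed-width ε-insensitive loss and the "fraction outside the tube" quantity that the parameter in the abstract bounds. It is a simplification and does not implement the paper's parametric-insensitive zone; the function names and sample values are assumptions.

```python
def eps_insensitive_loss(y, f, eps):
    """Standard epsilon-insensitive loss: zero inside the tube |y - f| <= eps,
    linear in the excess deviation outside it."""
    return max(0.0, abs(y - f) - eps)


def fraction_outside_tube(targets, predictions, eps):
    """Share of samples whose residual exceeds eps, i.e. the training errors
    in the sense used by nu-type formulations."""
    outside = sum(1 for y, f in zip(targets, predictions) if abs(y - f) > eps)
    return outside / len(targets)


# Hypothetical residuals: two of four samples fall outside a tube of
# half-width 0.5 around the targets.
targets = [0.0, 0.0, 0.0, 0.0]
predictions = [0.1, 0.2, 1.0, -2.0]
share = fraction_outside_tube(targets, predictions, eps=0.5)
```

With a fixed ε the share of outside points is only known after fitting; a ν-type formulation reverses this by letting the user bound that share in advance and having the zone size adapt.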
[238] Mohamed Cheriet, Reza Farrahi Moghaddam, and Rachid Hedjam. A learning framework for the optimization and automation of document binarization methods. Computer Vision and Image Understanding, 117(3):269 - 280, 2013. [ bib | DOI | http ]
Abstract Almost all binarization methods have a few parameters that require setting. However, they do not usually achieve their upper-bound performance unless the parameters are individually set and optimized for each input document image. In this work, a learning framework for the optimization of the binarization methods is introduced, which is designed to determine the optimal parameter values for a document image. The framework, which works with any binarization method, has a standard structure, and performs three main steps: (i) extracts features, (ii) estimates optimal parameters, and (iii) learns the relationship between features and optimal parameters. First, an approach is proposed to generate numerical feature vectors from 2D data. The statistics of various maps are extracted and then combined into a final feature vector, in a nonlinear way. The optimal behavior is learned using support vector regression (SVR). Although the framework works with any binarization method, two methods are considered as typical examples in this work: the grid-based Sauvola method, and Lu’s method, which placed first in the DIBCO’09 contest. The experiments are performed on the DIBCO’09 and H-DIBCO’10 datasets, and combinations of these datasets with promising results.

Keywords: Document image processing
[239] Chunlei Zeng, Changchun Wu, Lili Zuo, Bin Zhang, and Xingqiao Hu. Predicting energy consumption of multiproduct pipeline using artificial neural networks. Energy, 66:791 - 798, 2014. [ bib | DOI | http ]
Abstract In this paper an artificial neural network is introduced to forecast the daily electricity consumption of a multiproduct pipeline, where the electricity is used to drive oil pumps. Forecasting electricity energy consumption is complicated since there are so many parameters affecting the energy consumption. Two different sets of input vectors are selected from these parameters by detailed analysis of energy consumption in this study, and two corresponding multilayer perceptron artificial neural network (MLP ANN) models are developed. To enhance the generalization ability, the numbers of hidden layers and neurons, activation functions and training algorithm of each model are optimized step by step through a trial-and-error process. The performances of the two proposed MLP ANN models are evaluated on real data of a Chinese multiproduct pipeline, and compared with two linear regression and two support vector machine (SVM) models which are produced using different inputs. Results show that the two MLP ANN models have very high accuracy for prediction and better forecasting performance than the other models. The proposed input vectors and MLP ANN models are useful not only in the effective evaluation of batch scheduling and pumping operation, but also in the energy consumption target setting.

Keywords: Multiproduct pipeline
[240] Daehyun Kang, Jungho Im, Myong-In Lee, and Lindi J. Quackenbush. The MODIS ice surface temperature product as an indicator of sea ice minimum over the arctic ocean. Remote Sensing of Environment, 152:99 - 108, 2014. [ bib | DOI | http ]
Abstract This study examines the relationship between sea ice extent and ice surface temperature (IST) between 2000 and 2013 using daily IST products from the Terra Moderate Resolution Imaging Spectroradiometer (MODIS) sensor. The empirical prediction of September sea ice extent using its trend and two climate variables (IST and wind vorticity) exhibits a statistically significant relationship (R = 0.97) with a time lag, where IST maximum in summer (June–July) corresponds to the sea ice extent minimum in September. This suggests that IST may serve as an indicator of the basin-wide heat energy accumulated in the Arctic by solar radiation and large-scale atmospheric heat transport from lower latitudes. The process of inducing higher IST is related to the change of atmospheric circulation over the Arctic. Averaged IST and 850 hPa relative vorticity of the polar region show a significant negative correlation (−0.57) in boreal summer (June–August), suggesting a weakening of the polar vortex in the case of warmer-than-normal IST conditions. Weakening of the polar vortex is accompanied by above-normal surface pressure. Minimum sea ice extent in September was successfully predicted by both multiple linear regression and machine learning support vector regression using preceding summer IST and wind vorticity along with the trend of sea ice extent (R² ≈ 0.95, cross validation RMSE of 3–4 × 10⁵ km², and relative cross validation RMSE of 5–8%).

Keywords: MODIS
[241] Adriano L.I. Oliveira. Estimation of software project effort with support vector regression. Neurocomputing, 69(13–15):1749 - 1753, 2006. Blind Source Separation and Independent Component Analysis: Selected papers from the ICA 2004 meeting, Granada, Spain. [ bib | DOI | http ]
This paper provides a comparative study on support vector regression (SVR), radial basis function neural networks (RBFNs) and linear regression for estimation of software project effort. We have considered SVR with linear as well as RBF kernels. The experiments were carried out using a dataset of software projects from NASA and the results have shown that SVR significantly outperforms RBFNs and linear regression in this task.

Keywords: Support vector regression
[242] Jie Yu. A bayesian inference based two-stage support vector regression framework for soft sensor development in batch bioprocesses. Computers & Chemical Engineering, 41:134 - 144, 2012. [ bib | DOI | http ]
Inherent process and measurement uncertainty poses a challenging issue for soft sensor development in batch bioprocesses. In this paper, a new soft sensor modeling framework is proposed by integrating a Bayesian inference strategy with a two-stage support vector regression (SVR) method. The Bayesian inference procedure is first designed to identify measurement biases and misalignments via posterior probabilities. Then the biased input measurements are calibrated through Bayesian estimation and the first-stage SVR model is thus built for output measurement reconciliation. The inferentially calibrated input and output data can be further used to construct the second-stage SVR model, which serves as the main model of the soft sensor to predict new output measurements. The Bayesian inference based two-stage support vector regression (BI-SVR) approach is applied to a fed-batch penicillin cultivation process and the obtained soft sensor performance is compared to that of the conventional SVR method. The results from two test cases with different levels of measurement uncertainty show significant improvement of the BI-SVR approach over the regular SVR method in predicting various output measurements.

Keywords: Soft sensor
[243] Ying-Chao Hung, Wen-Chi Tsai, Su-Fen Yang, Shih-Chung Chuang, and Yi-Kuan Tseng. Nonparametric profile monitoring in multi-dimensional data spaces. Journal of Process Control, 22(2):397 - 403, 2012. [ bib | DOI | http ]
Profile monitoring has received increasing attention in a wide range of applications in statistical process control (SPC). In this work, we propose a framework for monitoring nonparametric profiles in multi-dimensional data spaces. The framework has the following important features: (i) a flexible and computationally efficient smoothing technique, called Support Vector Regression, is employed to describe the relationship between the response variable and the explanatory variables; (ii) the usual structural assumptions on the residuals are not required; and (iii) the dependence structure for the within-profile observations is appropriately accommodated. Finally, real AIDS data collected from hospitals in Taiwan are used to illustrate and evaluate our proposed framework.

Keywords: Nonparametric profile monitoring
[244] Chih-Chia Yao and Pao-Ta Yu. Fuzzy regression based on asymmetric support vector machines. Applied Mathematics and Computation, 182(1):175 - 193, 2006. [ bib | DOI | http ]
This paper presents a modified framework of support vector machines, called asymmetric support vector machines (ASVMs), which is designed to evaluate the functional relationship for fuzzy linear and nonlinear regression models. In earlier works, in order to cope with different types of input–output patterns, strong assumptions were made regarding linear fuzzy regression models with symmetric and asymmetric triangular fuzzy coefficients. Excellent performance is achieved on some linear fuzzy regression models. However, the nonlinear fuzzy regression model has received relatively little attention, because such nonlinear fuzzy regression models have certain limitations. This study modifies the framework of support vector machines in order to overcome these limitations. The principle of ASVMs is to apply an orthogonal vector to the weight vector in order to rotate the support hyperplanes. The prime merits of the proposed model are its simplicity, understandability and effectiveness. Experimental results and comparisons are given to demonstrate that the basic idea underlying ASVMs can be effectively used for parameter estimation.

Keywords: SVMs
[245] Torki A. Altameem, Vlastimir Nikolić, Shahaboddin Shamshirband, Dalibor Petković, Hossein Javidnia, Miss Laiha Mat Kiah, and Abdullah Gani. Potential of support vector regression for optimization of lens system. Computer-Aided Design, 62:57 - 63, 2015. [ bib | DOI | http ]
Abstract Lens system design is an important factor in image quality. The main aspect of the lens system design methodology is the optimization procedure. Since optimization is a complex, non-linear task, soft computing optimization algorithms can be used. There are many tools that can be employed to measure optical performance, but the spot diagram is the most useful. The spot diagram gives an indication of the image of a point object. In this paper, the spot size radius is considered an optimization criterion. The intelligent soft computing scheme Support Vector Regression (SVR) is implemented. In this study, the polynomial and radial basis functions (RBF) are applied as the SVR kernel function to estimate the optimal lens system parameters. The performance of the proposed estimators is confirmed with the simulation results. The SVR results are then compared with other soft computing techniques. According to the results, a greater improvement in estimation accuracy can be achieved through the SVR with polynomial basis function compared to other soft computing methodologies. The SVR coefficient of determination R² with the polynomial function was 0.9975 and with the radial basis function the R² was 0.964. The new optimization methods benefit from the soft computing capabilities of global optimization and multi-objective optimization rather than choosing a starting point by trial and error and combining multiple criteria into a single criterion in conventional lens design techniques.

Keywords: Lens system
[246] Rozalina Zakaria, Siti Munirah Che Noh, Dalibor Petković, Shahaboddin Shamshirband, and Richard Penny. Investigation of plasmonic studies on morphology of deposited silver thin films having different thicknesses by soft computing methodologies—a comparative study. Physica E: Low-dimensional Systems and Nanostructures, 63:317 - 323, 2014. [ bib | DOI | http ]
Abstract This work presents an experimental analysis on the tunable localized surface plasmon resonance (LSPR), obtained from deposited silver (Ag) thin films of various thicknesses. Silver thin films are prepared using electron-beam deposition and undergo an annealing process at different temperatures to produce distinctive sizes of Ag metal nanoparticles (MNPs). The variability of structure sizes and shapes provides an effective means of tuning the position of the LSPR within a wide wavelength range. In this study, the polynomial and radial basis function (RBF) are applied as the kernel function of Support Vector Regression (SVR) to estimate and predict the LSPR over a broad wavelength range from the resonance spectra of silver nanoparticles differing in thickness. Instead of minimizing the observed training error, SVR_poly, SVR_rbf and SVR_lin attempt to minimize the generalization error bound to achieve generalized performance. The experimental results show that the SVR_poly approach achieves an improvement in predictive accuracy and capability of generalization compared to the SVR_rbf and SVR_lin methodologies. The best testing errors were found for the SVR_poly approach.

Keywords: Ag
[247] Indrajit Mandal and N. Sairam. Accurate telemonitoring of parkinson's disease diagnosis using robust inference system. International Journal of Medical Informatics, 82(5):359 - 377, 2013. [ bib | DOI | http ]
This work presents more precise computational methods for improving the diagnosis of Parkinson's disease based on the detection of dysphonia. New methods are presented for enhanced evaluation and for recognizing patients affected by Parkinson's disease at an early stage. The analysis is performed with a significant level of error tolerance rate, and the results are established with the corrected T-test. New ensembles and other machine learning methods are used, including a multinomial logistic regression classifier with a Haar wavelet transformation as projection filter, which outperforms logistic regression. Finally, a novel and reliable inference system is presented for early recognition of people affected by this disease, together with a new measure of the severity of the disease. The feature selection method is based on Support Vector Machines and the ranker search method. The performance of each model is compared to that of existing methods; the main advancements are examined, and the results are promising. Reliable methods are proposed for treating Parkinson's disease, including sparse multinomial logistic regression, Bayesian networks, Support Vector Machines, Artificial Neural Networks, Boosting methods and their ensembles. The study aims at improving the quality of Parkinson's disease treatment by tracking patients, and reinforces the viability of a cost-effective, regular and precise telemonitoring application.

Keywords: Parkinson's disease; Corrected T-tests
[248] Faming Tang, Mianyun Chen, and Zhongdong Wang. New approach to training support vector machine. Journal of Systems Engineering and Electronics, 17(1):200 - 219, 2006. [ bib | DOI | http ]
The support vector machine has become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. Training a support vector machine requires the solution of a very large quadratic programming problem. Traditional optimization methods cannot be directly applied due to memory restrictions. Several approaches now exist that circumvent these shortcomings and work well. Another learning algorithm, particle swarm optimization, is introduced for training SVMs. The method is tested on UCI datasets.

Keywords: support vector machine
[249] Ye Wang, Bo Wang, and Xinyang Zhang. A new application of the support vector regression on the construction of financial conditions index to CPI prediction. Procedia Computer Science, 9:1263 - 1272, 2012. Proceedings of the International Conference on Computational Science, ICCS 2012. [ bib | DOI | http ]
A regression model based on the Support Vector Machine is used to construct a Financial Conditions Index (FCI) in order to explore the link between a composite index of financial indicators and future inflation. Compared with the traditional econometric method, our model takes advantage of the machine learning method to give a more accurate forecast of future CPI on a small dataset. In addition, we add more financial indicators to our model, including the M2 growth rate, the growth rate of housing sales and lagged CPI, which is more in line with the economy. Monthly data on Chinese CPI and other financial indicators are used to construct FCI (SVRs) with different lag terms. The experimental results show that FCI (SVRs) performs better than VAR impulse response analysis. As a result, our model based on support vector regression is appropriate for the construction of the FCI.

Keywords: Financial conditions index
[250] Man Gyun Na, Jin Weon Kim, and In Joon Hwang. Collapse moment estimation by support vector machines for wall-thinned pipe bends and elbows. Nuclear Engineering and Design, 237(5):451 - 459, 2007. [ bib | DOI | http ]
The collapse moment due to wall-thinned defects is estimated through support vector machines with parameters optimized by a genetic algorithm. The support vector regression models are developed and applied to numerical data obtained from the finite element analysis for wall-thinned defects in piping systems. The support vector regression models are optimized by using both the data sets (training data and optimization data) prepared for training and optimization, and their performance is verified by using another data set (test data) different from the training data and the optimization data. In this work, three support vector regression models are developed, one for each of three data sets divided into the classes of extrados, intrados, and crown defects, because these classes have different characteristics. The relative root mean square (RMS) errors of the estimated collapse moment are 0.2333% for the training data, 0.5229% for the optimization data and 0.5011% for the test data. This result shows that the support vector regression models are sufficiently accurate to be used in the integrity evaluation of wall-thinned pipe bends and elbows.

[251] Okba Taouali, Ilyes Elaissi, and Hassani Messaoud. Dimensionality reduction of RKHS model parameters. ISA Transactions, 57:205 - 210, 2015. [ bib | DOI | http ]
Abstract This paper proposes a new method to reduce the number of parameters of models developed in the Reproducing Kernel Hilbert Space (RKHS). This number is equal to the number of observations used in the learning phase, which is assumed to be high. The proposed method, entitled Reduced Kernel Partial Least Square (RKPLS), consists of approximating the retained latent components determined using the Kernel Partial Least Square (KPLS) method by their closest observation vectors. The paper proposes the design and the comparative study of the proposed RKPLS method and the Support Vector Machines on Regression (SVR) technique. The proposed method is applied to identify a nonlinear Process Trainer PT326, which is a physical process available in our laboratory. Moreover, as a thermal process with a long response time, it makes it easy to record effective observations that contribute to model identification. Compared to the SVR technique, the results from the proposed RKPLS method are satisfactory.

Keywords: RKHS
[252] Chunjian Pan, Yaming Dong, Xuefeng Yan, and Weixiang Zhao. Hybrid model for main and side reactions of p-xylene oxidation with factor influence based monotone additive SVR. Chemometrics and Intelligent Laboratory Systems, 136:36 - 46, 2014. [ bib | DOI | http ]
Abstract Due to the complex mechanism of the main and burning side reactions in industrial p-xylene oxidation, a first-principle based kinetic mechanism model is hard to establish. Meanwhile, building a data-driven model may also be a big challenge because of various industrial sample data issues such as incompleteness and noise. A hybrid model of industrial p-xylene oxidation, which is based on monotone additive support vector regression, is proposed and established by employing industrial sample data and factor influence information. In the hybrid model, the influence of reaction factors on the main and burning side reactions is investigated with two additive support vector regression (AddSVR) models and the factor influence information is integrated into the modeling process by adding extra constraints to the AddSVR models. The hybrid model achieves better prediction accuracy.

Keywords: Hybrid model
[253] Radu Ioan Boţ and Nicole Lorenz. Optimization problems in statistical learning: Duality and optimality conditions. European Journal of Operational Research, 213(2):395 - 404, 2011. [ bib | DOI | http ]
Regularization methods are techniques for learning functions from given data. We consider regularization problems whose objective function consists of a cost function and a regularization term, with the aim of selecting a prediction function f with a finite representation $f(\cdot) = \sum_{i=1}^{n} c_i k(\cdot, X_i)$ which minimizes the error of prediction. Here the role of the regularizer is to avoid overfitting. In general these are convex optimization problems with not necessarily differentiable objective functions. Thus, in order to provide optimality conditions for this class of problems, one needs to appeal to some specific techniques from convex analysis. In this paper we provide a general approach for deriving necessary and sufficient optimality conditions for the regularized problem via the so-called conjugate duality theory. Afterwards we apply the obtained results to the Support Vector Machines problem and the Support Vector Regression problem formulated for different cost functions.

Keywords: Machine learning
[254] K. Van Hoorde, S. Van Huffel, D. Timmerman, T. Bourne, and B. Van Calster. A spline-based tool to assess and visualize the calibration of multiclass risk predictions. Journal of Biomedical Informatics, 54:283 - 293, 2015. [ bib | DOI | http ]
Abstract When validating risk models (or probabilistic classifiers), calibration is often overlooked. Calibration refers to the reliability of the predicted risks, i.e. whether the predicted risks correspond to observed probabilities. In medical applications this is important because treatment decisions often rely on the estimated risk of disease. The aim of this paper is to present generic tools to assess the calibration of multiclass risk models. We describe a calibration framework based on a vector spline multinomial logistic regression model. This framework can be used to generate calibration plots and calculate the estimated calibration index (ECI) to quantify lack of calibration. We illustrate these tools in relation to risk models used to characterize ovarian tumors. The outcome of the study is the surgical stage of the tumor when relevant and the final histological outcome, which is divided into five classes: benign, borderline malignant, stage I, stage II–IV, and secondary metastatic cancer. The 5909 patients included in the study are randomly split into equally large training and test sets. We developed and tested models using the following algorithms: logistic regression, support vector machines, k nearest neighbors, random forest, naive Bayes and nearest shrunken centroids. Multiclass calibration plots are interesting as an approach to visualizing the reliability of predicted risks. The ECI is a convenient tool for comparing models, but is less informative and interpretable than calibration plots. In our case study, logistic regression and random forest showed the highest degree of calibration, and the naive Bayes the lowest.

Keywords: Risk models
[255] Kuilin Chen and Jie Yu. Short-term wind speed prediction using an unscented kalman filter based state-space support vector regression approach. Applied Energy, 113:690 - 705, 2014. [ bib | DOI | http ]
Abstract Accurate wind speed forecasting is becoming increasingly important to improve and optimize renewable wind power generation. Particularly, reliable short-term wind speed prediction can enable model predictive control of wind turbines and real-time optimization of wind farm operation. However, this task remains challenging due to the strong stochastic nature and dynamic uncertainty of wind speed. In this study, unscented Kalman filter (UKF) is integrated with support vector regression (SVR) based state-space model in order to precisely update the short-term estimation of wind speed sequence. In the proposed SVR–UKF approach, support vector regression is first employed to formulate a nonlinear state-space model and then unscented Kalman filter is adopted to perform dynamic state estimation recursively on wind sequence with stochastic uncertainty. The novel SVR–UKF method is compared with artificial neural networks (ANNs), SVR, autoregressive (AR) and autoregressive integrated with Kalman filter (AR-Kalman) approaches for predicting short-term wind speed sequences collected from three sites in Massachusetts, USA. The forecasting results indicate that the proposed method has much better performance in both one-step-ahead and multi-step-ahead wind speed predictions than the other approaches across all the locations.

Keywords: Wind speed prediction
[256] Kristof Coussement and Dirk Van den Poel. Churn prediction in subscription services: An application of support vector machines while comparing two parameter-selection techniques. Expert Systems with Applications, 34(1):313 - 327, 2008. [ bib | DOI | http ]
CRM gains increasing importance due to intensive competition and saturated markets. With the purpose of retaining customers, academics as well as practitioners find it crucial to build a churn prediction model that is as accurate as possible. This study applies support vector machines in a newspaper subscription context in order to construct a churn model with a higher predictive performance. Moreover, a comparison is made between two parameter-selection techniques, needed to implement support vector machines. Both techniques are based on grid search and cross-validation. Afterwards, the predictive performance of both kinds of support vector machine models is benchmarked to logistic regression and random forests. Our study shows that support vector machines show good generalization performance when applied to noisy marketing data. Nevertheless, the parameter optimization procedure plays an important role in the predictive performance. We show that only when the optimal parameter-selection procedure is applied, support vector machines outperform traditional logistic regression, whereas random forests outperform both kinds of support vector machines. As a substantive contribution, an overview of the most important churn drivers is given. Unlike ample research, monetary value and frequency do not play an important role in explaining churn in this subscription-services application. Even though most important churn predictors belong to the category of variables describing the subscription, the influence of several client/company-interaction variables cannot be neglected.

Keywords: Data mining
[257] Mohammad Goodarzi, Matheus P. Freitas, Chih H. Wu, and Pablo R. Duchowicz. pKa modeling and prediction of a series of pH indicators through genetic algorithm-least square support vector regression. Chemometrics and Intelligent Laboratory Systems, 101(2):102 - 109, 2010. [ bib | DOI | http ]
The pKa values of a series of 107 indicators have been modeled by means of a quantitative structure–property relationship (QSPR) approach based on physicochemical descriptors and different variable selection and regression methods. A genetic algorithm/least square support vector regression (GA-LSSVR) model gave the most accurate estimations/predictions, with squared correlation coefficients of 0.90 and 0.89 for the training and test set compounds, respectively. The prediction ability of this model was found to be superior to that based on support vector machine regression alone, revealing the important effect of selecting suitable descriptors during QSPR modeling. Moreover, the GA-LSSVR model showed higher predictive capability than linear methods, demonstrating the influence of nonlinearity on the modeling of pKa values, an extremely useful parameter in the analytical sciences.

Keywords: pKa
[258] Hsun-Jung Cho and Ming-Te Tseng. A support vector machine approach to CMOS-based radar signal processing for vehicle classification and speed estimation. Mathematical and Computer Modelling, 58(1–2):438 - 448, 2013. Financial IT & Security and 2010 International Symposium on Computational Electronics. [ bib | DOI | http ]
In this work, a complementary metal-oxide semiconductor (CMOS) based transceiver with a sensitivity time control antenna is successfully implemented for advanced traffic signal processing. The collected signals from the CMOS radar system are processed with optimization algorithms for vehicle-type classification and speed determination. The high recognition rate optimization algorithms are mainly based upon the information of short setup time and different environmental installation of each sensor. In the course of optimization, a video recognition module is further adopted as a supervisor of support vector machine and support vector regression. Compared with conventional circuit-based detector systems, the developed CMOS radar integrates submicron semiconductor devices and thus not only possesses low stand-by power but also is ready for production. In the meantime, the developed algorithm of this study simultaneously optimizes the vehicle-type classification and speed determination in a computationally cost-effective manner, which benefits real-time intelligent transportation systems.

Keywords: Vehicle detector
[259] Dalibor Petković, Shahaboddin Shamshirband, Nor Badrul Anuar, Hadi Saboohi, Ainuddin Wahid Abdul Wahab, Milan Protić, Erfan Zalnezhad, and Seyed Mohammad Amin Mirhashemi. An appraisal of wind speed distribution prediction by soft computing methodologies: A comparative study. Energy Conversion and Management, 84:133 - 139, 2014. [ bib | DOI | http ]
Abstract The probabilistic distribution of wind speed is among the more significant wind characteristics in examining wind energy potential and the performance of wind energy conversion systems. When the wind speed probability distribution is known, the wind energy distribution can be easily obtained. Therefore, the probability distribution of wind speed is a very important piece of information required in assessing wind energy potential. For this reason, a large number of studies have been established concerning the use of a variety of probability density functions to describe wind speed frequency distributions. Although the two-parameter Weibull distribution comprises a widely used and accepted method, solving the function is very challenging. In this study, the polynomial and radial basis functions (RBF) are applied as the kernel function of support vector regression (SVR) to estimate two parameters of the Weibull distribution function according to previously established analytical methods. Rather than minimizing the observed training error, SVR_poly and SVR_rbf attempt to minimize the generalization error bound, so as to achieve generalized performance. According to the experimental results, enhanced predictive accuracy and capability of generalization can be achieved using the SVR approach compared to other soft computing methodologies.

Keywords: Wind turbine
[260] M.A.H. Farquad, V. Ravi, and S. Bapi Raju. Support vector regression based hybrid rule extraction methods for forecasting. Expert Systems with Applications, 37(8):5577 - 5589, 2010. [ bib | DOI | http ]
Support Vector Regression (SVR) solves regression problems based on the concept of Support Vector Machine (SVM) introduced by Vapnik (1995). The main drawback of these newer techniques is their lack of interpretability. In other words, it is difficult for the human analyst to understand the knowledge learnt by these models during training. The most popular way to overcome this difficulty is to extract if–then rules from SVM and SVR. Rules provide explanation capability to these models and improve the comprehensibility of the system. Over the last decade, different algorithms for extracting rules from SVM have been developed. However, rule extraction from SVR is not widely available yet. In this paper a novel hybrid approach for extracting rules from SVR is presented. The proposed hybrid rule extraction procedure has two phases: (1) obtain the reduced training set in the form of support vectors using SVR; (2) train the machine learning techniques (with explanation capability) using the reduced training set. Machine learning techniques viz., Classification And Regression Tree (CART), Adaptive Network based Fuzzy Inference System (ANFIS) and Dynamic Evolving Fuzzy Inference System (DENFIS) are used in phase 2. The proposed hybrid rule extraction procedure is compared to stand-alone CART, ANFIS and DENFIS. Extensive experiments are conducted on five benchmark data sets viz. Auto MPG, Body Fat, Boston Housing, Forest Fires and Pollution, to demonstrate the effectiveness of the proposed approach in generating accurate regression rules. The efficiency of these techniques is measured using Root Mean Squared Error (RMSE). From the results obtained, it is concluded that when the support vectors with the corresponding predicted target values are used, the SVR based hybrids outperform the stand-alone intelligent techniques and also the case when the support vectors with the corresponding actual target values are used.

Keywords: Rule extraction
[261] Wang Guanghui. Demand forecasting of supply chain based on support vector regression method. Procedia Engineering, 29:280 - 284, 2012. 2012 International Workshop on Information and Electronics Engineering. [ bib | DOI | http ]
This paper introduces in detail the basic theory and computing process of time series forecasting based on Support Vector Regression (SVR), and optimizes the parameters of SVR with a Genetic Algorithm (GA). SVR is applied to forecast supply chain demand on real data and is compared to the RBF neural network method. The results show that SVR is superior to RBF in prediction performance, and that SVR is a suitable and effective method for demand forecasting in the supply chain.

Keywords: Support vector regression; Supply chain
[262] Zhenhai Guo, Jing Zhao, Wenyu Zhang, and Jianzhou Wang. A corrected hybrid approach for wind speed prediction in hexi corridor of china. Energy, 36(3):1668 - 1679, 2011. [ bib | DOI | http ]
Wind energy has been well recognized as a renewable resource in electricity generation, which is environmentally friendly, socially beneficial and economically competitive. For proper and efficient evaluation of wind energy, a hybrid Seasonal Auto-Regression Integrated Moving Average and Least Square Support Vector Machine (SARIMA–LSSVM) model is developed to predict the mean monthly wind speed in Hexi Corridor. The design concept of combining the Seasonal Auto-Regression Integrated Moving Average (SARIMA) method with the Least Square Support Vector Machine (LSSVM) algorithm shows more powerful forecasting capacity for monthly wind speed prediction at wind parks, when compared with the single Auto-Regression Integrated Moving Average (ARIMA), SARIMA, LSSVM models and the hybrid Auto-Regression Integrated Moving Average and Support Vector Machine (ARIMA–SVM) model. To verify the developed approach, the monthly data from January 2001 to December 2006 in Mazong Mountain and Jiuquan are used for model construction and model testing. The simulation and hypothesis test results show that the developed method is simple and quite efficient.

Keywords: Wind speed
[263] Haydn Hoffman, Sunghoon I. Lee, Jordan H. Garst, Derek S. Lu, Charles H. Li, Daniel T. Nagasawa, Nima Ghalehsari, Nima Jahanforouz, Mehrdad Razaghy, Marie Espinal, Amir Ghavamrezaii, Brian H. Paak, Irene Wu, Majid Sarrafzadeh, and Daniel C. Lu. Use of multivariate linear regression and support vector regression to predict functional outcome after surgery for cervical spondylotic myelopathy. Journal of Clinical Neuroscience, pages -, 2015. [ bib | DOI | http ]
Abstract This study introduces the use of multivariate linear regression (MLR) and support vector regression (SVR) models to predict postoperative outcomes in a cohort of patients who underwent surgery for cervical spondylotic myelopathy (CSM). Currently, predicting outcomes after surgery for CSM remains a challenge. We recruited patients who had a diagnosis of CSM and required decompressive surgery with or without fusion. Fine motor function was tested preoperatively and postoperatively with a handgrip-based tracking device that has been previously validated, yielding mean absolute accuracy (MAA) results for two tracking tasks (sinusoidal and step). All patients completed Oswestry disability index (ODI) and modified Japanese Orthopaedic Association questionnaires preoperatively and postoperatively. Preoperative data were used in MLR and SVR models to predict postoperative ODI. Predictions were compared to the actual ODI scores with the coefficient of determination (R²) and mean absolute difference (MAD). Of the recruited patients, 20 met the inclusion criteria and completed follow-up at least 3 months after surgery. With the MLR model, a combination of the preoperative ODI score, preoperative MAA (step function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.452; MAD = 0.0887; p = 1.17 × 10⁻³). With the SVR model, a combination of preoperative ODI score, preoperative MAA (sinusoidal function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.932; MAD = 0.0283; p = 5.73 × 10⁻¹²). The SVR model was more accurate than the MLR model. The SVR can be used preoperatively in risk/benefit analysis and the decision to operate.

Keywords: Cervical spondylotic myelopathy
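Note on entry [263]: the two agreement measures named in the abstract, the coefficient of determination (R²) and the mean absolute difference (MAD), can be sketched as below. This is a minimal illustration using the standard textbook definitions; the function names are hypothetical, and the paper may normalize its scores differently.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_difference(y_true, y_pred):
    """Average absolute gap between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only (not data from the study).
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 3.0]
print(r_squared(observed, predicted))                 # perfect fit
print(mean_absolute_difference(observed, predicted))  # no error
```

A predictor that always returns the mean of the observed values has R² = 0, which is why R² is read as the improvement over that baseline.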
[264] Hiromasa Kaneko and Kimito Funatsu. Nonlinear regression method with variable region selection and application to soft sensors. Chemometrics and Intelligent Laboratory Systems, 121:26 - 32, 2013. [ bib | DOI | http ]
Abstract Selecting regions of explanatory variables, X, is attempted in many fields such as spectral analysis and process control. A genetic algorithm-based wavelength selection (GAWLS) method is one of the methods used to select combinations of important variables from X-variables using regions as a unit of measurement. However, a partial least squares method is used as the regression method, and hence a GAWLS method cannot handle a nonlinear relationship between X and an objective variable, y. We therefore proposed a region selection method based on GAWLS and support vector regression (SVR), one of the nonlinear regression methods. The proposed method is named GAWLS–SVR. We applied GAWLS–SVR to simulation data and industrial polymer process data, and confirmed that predictive, easy-to-interpret, and appropriate models were constructed using the proposed method.

Keywords: Variable selection
[265] Xuchan Ju, Manjin Cheng, Yuhong Xia, Fuqiang Quo, and Yingjie Tian. Support vector regression and time series analysis for the forecasting of Bayannur's total water requirement. Procedia Computer Science, 31:523 - 531, 2014. 2nd International Conference on Information Technology and Quantitative Management, ITQM 2014. [ bib | DOI | http ]
Abstract Bayannur is one of the districts lying in the western area of Inner Mongolia whose water resources are extremely deficient. The lack of water resources has become the bottleneck of the area's sustainable economic development. Bayannur therefore diverts water from the Yellow River every year to make up the shortage. How to allocate this water reasonably has become the key to improving the current situation. However, before reasonable allocation, the total water requirement should be forecast accurately. In this paper, we propose two solutions to the forecasting of Bayannur's total water requirement via support vector regression and time series analysis.

Keywords: support vector regression
[266] Peng-Cheng Zou, Jiandong Wang, Songcan Chen, and Haiyan Chen. Bagging-like metric learning for support vector regression. Knowledge-Based Systems, 65:21 - 30, 2014. [ bib | DOI | http ]
Abstract The metric plays an important role in machine learning and pattern recognition. Though many available off-the-shelf metrics can be selected for learning tasks at hand, such as k-nearest neighbor classification and k-means clustering, such a selection is not necessarily always appropriate because it is independent of the data itself. It has been proved that a task-dependent metric learned from the given data can yield more beneficial learning performance. Inspired by such success, we focus on learning an embedded metric specifically for support vector regression and present a corresponding learning algorithm termed SVRML, which minimizes the error on the validation dataset and simultaneously enforces sparsity on the learned metric matrix. Further taking the learned metric (positive semi-definite matrix) as a base learner, we develop a bagging-like effective ensemble metric learning framework in which the resampling mechanism of original bagging is specially modified for SVRML. Experiments on various datasets demonstrate that our method outperforms the single and bagging-based ensemble metric learnings for support vector regression.

Keywords: Distance metric learning
[267] Dalibor Petković, Shahaboddin Shamshirband, Hadi Saboohi, Tan Fong Ang, Nor Badrul Anuar, Zulkanain Abdul Rahman, and Nenad T. Pavlović. Evaluation of modulation transfer function of optical lens system by support vector regression methodologies – a comparative study. Infrared Physics & Technology, 65:94 - 102, 2014. [ bib | DOI | http ]
Abstract The quantitative assessment of image quality is an important consideration in any type of imaging system. The modulation transfer function (MTF) is a graphical description of the sharpness and contrast of an imaging system or of its individual components. The MTF is also known as the spatial frequency response. The MTF curve has different meanings according to the corresponding frequency. The MTF of an optical system specifies the contrast transmitted by the system as a function of image size, and is determined by the inherent optical properties of the system. In this study, the polynomial and radial basis function (RBF) are applied as the kernel function of Support Vector Regression (SVR) to estimate and predict the MTF value of the actual optical system according to experimental tests. Instead of minimizing the observed training error, SVR_poly and SVR_rbf attempt to minimize the generalization error bound so as to achieve generalized performance. The experimental results show that an improvement in predictive accuracy and capability of generalization can be achieved by the SVR_rbf approach compared to the SVR_poly soft computing methodology.

Keywords: Modulation transfer function
[268] Halil Ibrahim Erdal and Onur Karakurt. Advancing monthly streamflow prediction accuracy of CART models using ensemble learning paradigms. Journal of Hydrology, 477:119 - 128, 2013. [ bib | DOI | http ]
Summary Streamflow forecasting is one of the most important steps in water resources planning and management. Ensemble techniques such as bagging, boosting and stacking have gained popularity in hydrological forecasting in recent years. The study investigates the potential usage of two ensemble learning paradigms (i.e., bagging; stochastic gradient boosting) in building classification and regression trees (CARTs) ensembles to advance the streamflow prediction accuracy. The study initially investigates the use of classification and regression trees for monthly streamflow forecasting and employs a support vector regression (SVR) model as the benchmark model. The analytic results indicate that CART outperforms SVR in both training and testing phases. Although the results of the CART model are considerable in the training phase, they are not in the testing phase. Thus, to optimize the prediction accuracy of CART for monthly streamflow forecasting, we incorporate bagging and stochastic gradient boosting, which are rooted in the same philosophy of advancing the prediction accuracy of weak learners. The comparison shows that the bagged regression trees (BRTs) and stochastic gradient boosted regression trees (GBRTs) models achieve better monthly streamflow forecasting performance than the CART and SVR models. Overall, it is found that ensemble learning paradigms can remarkably advance the prediction accuracy of CART models in monthly streamflow forecasting.

Keywords: Bagging (bootstrap aggregating)
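Note on entry [268]: bagging (bootstrap aggregating) trains several copies of a base learner on resampled versions of the training set and averages their predictions. The sketch below is only an illustration of that idea. It swaps in a straight-line least-squares fit as the base learner instead of the regression trees used in the paper, and all function names and default values are hypothetical.

```python
import random


def fit_line(points):
    """Least-squares line through (x, y) points, returned as (slope, intercept).
    Falls back to a constant model when all x values in the sample coincide."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0.0:
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return slope, my - slope * mx


def bagged_fit(points, n_models=25, seed=0):
    """Fit one base learner per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]
        models.append(fit_line(sample))
    return models


def bagged_predict(models, x):
    """Average the predictions of all base learners."""
    return sum(a * x + b for a, b in models) / len(models)


# Illustrative synthetic data only (not streamflow records).
data = [(float(i), 2.0 * i + 1.0) for i in range(12)]
ensemble = bagged_fit(data, n_models=10, seed=42)
print(bagged_predict(ensemble, 5.0))
```

Averaging mainly reduces the variance of unstable base learners, which is why the technique is usually paired with trees and not with a rigid model like the line used here.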
[269] Ramon Granell, Colin J. Axon, and David C.H. Wallom. Predicting winning and losing businesses when changing electricity tariffs. Applied Energy, 133:298 - 307, 2014. [ bib | DOI | http ]
Abstract By using smart meters, more data about how businesses use energy is becoming available to energy retailers (providers). This is enabling innovation in the structure and type of tariffs on offer in the energy market. We have applied Artificial Neural Networks, Support Vector Machines, and Naive Bayesian Classifiers to a data set of the electrical power use by 12,000 businesses (in 44 sectors) to investigate predicting which businesses will gain or lose by switching between tariffs (a two-classes problem). We have used only three features of each company: their business sector, load profile category, and mean power use. We are particularly interested in the switch between a static tariff (fixed price or time-of-use) and a dynamic tariff (half-hourly pricing). We have extended the two-classes problem to include a price elasticity factor (a three-classes problem). We show how the classification error for the two- and three-classes problems varies with the amount of available data. Furthermore, we used Ordinary Least Squares and Support Vector Regression models to compute the exact values of the amount gained or lost by a business if it switched tariff types. Our analysis suggests that the machine learning classifiers required less data to reach useful performance levels than the regression models.

Keywords: Energy
[270] Danian Zheng, Jiaxin Wang, and Yannan Zhao. Non-flat function estimation with a multi-scale support vector regression. Neurocomputing, 70(1–3):420 - 429, 2006. Neural Networks: Selected Papers from the 7th Brazilian Symposium on Neural Networks (SBRN '04). [ bib | DOI | http ]
Estimating a non-flat function that comprises both steep and smooth variations is a hard problem. The results achieved by the common support vector methods like SVR, LPR and LS-SVM are often unsatisfactory, because they cannot avoid underfitting and overfitting simultaneously. This paper treats the problem as a linear regression in a combined feature space which is implicitly defined by a set of translation invariant kernels with different scales, and proposes a multi-scale support vector regression (MS-SVR) method. MS-SVR performs better than SVR, LPR and LS-SVM in the experiments tried.

Keywords: Non-flat function
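Note on entry [270]: one common way to obtain a combined feature space from several translation-invariant kernels is to add them, since a sum of kernels corresponds to concatenating their feature maps. The sketch below assumes Gaussian kernels and an unweighted sum. Both choices, and the function names, are assumptions made for illustration and are not taken from the paper.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def multi_scale_kernel(x, z, sigmas):
    """Unweighted sum of Gaussian kernels, one per scale in `sigmas`."""
    return sum(rbf_kernel(x, z, s) for s in sigmas)


# Narrow scales respond to steep local variation, wide scales to smooth trends.
scales = [0.1, 1.0, 10.0]
print(multi_scale_kernel([0.0], [0.5], scales))
```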
[271] Zoran Bosnić and Igor Kononenko. Comparison of approaches for estimating reliability of individual regression predictions. Data & Knowledge Engineering, 67(3):504 - 516, 2008. [ bib | DOI | http ]
The paper compares different approaches to estimate the reliability of individual predictions in regression. We compare the sensitivity-based reliability estimates developed in our previous work with four approaches found in the literature: variance of bagged models, local cross-validation, density estimation, and local modeling. By combining pairs of individual estimates, we compose a combined estimate that performs better than the individual estimates. We tested the estimates by running data from 28 domains through eight regression models: regression trees, linear regression, neural networks, bagging, support vector machines, locally weighted regression, random forests, and generalized additive model. The results demonstrate the potential of a sensitivity-based estimate, as well as the local modeling of prediction error with regression trees. Among the tested approaches, the best average performance was achieved by estimation using the bagging variance approach, which achieved the best performance with neural networks, bagging and locally weighted regression.

Keywords: Reliability estimate
[272] Dragan Stević, Igor Hut, Nikola Dojčinović, and Jugoslav Joković. Automated identification of land cover type using multispectral satellite images. Energy and Buildings, pages -, 2015. [ bib | DOI | http ]
Abstract Detection of specific terrain features and vegetation, referred to as landscape classification, is an important component in the management and planning of natural resources. The different land types, man-made materials in natural backgrounds and vegetation cultures can be distinguished by their reflectance. Although remote sensing technology has great potential for acquisition of detailed and accurate information of landscape regions, the determination of land-use data with high accuracy is generally limited by the availability of adequate remote sensing data, in terms of spatial and temporal resolution, and digital image analysis techniques. Therefore, remote sensing with multi-spectral and/or hyper-spectral data derived from various satellites in combination with topographic variables is a valuable tool in landscape type classification. Different methods based on reflectance data from multi-spectral Landsat satellite image sets are used for automatic landscape type recognition. In order to characterize reflectance of landscape types represented in an image, construction of a multi-spectral descriptor, as a vector of acquired reflectance values by wavelength bands, is proposed. The applied algorithms for landscape type classification (artificial neural network, support vector machines and logistic regression) have been analysed and results are compared and discussed in terms of accuracy and time of execution.

Keywords: Landscape classification
[273] Chia-Hui Huang. A reduced support vector machine approach for interval regression analysis. Information Sciences, 217:56 - 64, 2012. [ bib | DOI | http ]
The support vector machine (SVM) has been shown to be an efficient approach for a variety of classification problems. It has also been widely used in pattern recognition, regression and distribution estimation for separable data. However, there are two problems with using the SVM model: (1) Large-scale: when dealing with large-scale data sets, the solution may be difficult to find when using SVM with nonlinear kernels; (2) Unbalance: the number of samples from one class is much larger than the number of samples from the other classes, which causes an excursion of the separation margin. Under these circumstances, developing an efficient method is necessary. Recently, the use of the reduced support vector machine (RSVM) was proposed as an alternative to the standard SVM. It has been proven more efficient than the traditional SVM in processing large-scale data. In this paper, we introduce the principle of RSVM to evaluate interval regression analysis. The main idea of the proposed method is to reduce the number of support vectors by randomly selecting a subset of samples.

Keywords: Interval regression analysis
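Note on entry [273]: the random-subset idea described in the abstract can be illustrated by building a rectangular kernel matrix whose columns come only from a randomly chosen subset of the samples, so the model has one coefficient per selected sample. This is a minimal sketch assuming a Gaussian kernel; the function name, the `gamma` parameter and the seed are hypothetical, and the interval regression formulation itself is not shown.

```python
import math
import random


def reduced_kernel_matrix(X, m, gamma=1.0, seed=0):
    """Return (subset, K) where subset holds m randomly chosen row indices
    of X and K[i][j] = exp(-gamma * ||X[i] - X[subset[j]]||^2).
    K has len(X) rows but only m columns."""
    rng = random.Random(seed)
    subset = rng.sample(range(len(X)), m)
    K = [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, X[j])))
         for j in subset]
        for x in X
    ]
    return subset, K


# Illustrative toy data only.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
subset, K = reduced_kernel_matrix(X, m=2, gamma=0.5, seed=0)
print(subset, len(K), len(K[0]))
```

With n samples and m selected columns, storage drops from n × n to n × m entries, which is the source of the efficiency gain on large data sets.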
[274] Chunxiao Zhang and Nan Wang. Aero-engine condition monitoring based on support vector machine. Physics Procedia, 24, Part B:1546 - 1552, 2012. International Conference on Applied Physics and Industrial Engineering 2012. [ bib | DOI | http ]
The maintenance and management of civil aero-engines require advanced monitoring approaches to estimate aero-engine performance and health in order to increase aero-engine life and reduce maintenance costs. In this paper, we adopt a support vector machine (SVM) regression approach to monitor aero-engine health and condition by building monitoring models of the main aero-engine performance parameters (EGT, N1, N2 and FF). The accuracy of the nonlinear baseline models of the performance parameters is tested, and the maximum relative error does not exceed ±0.3%, which meets the engineering requirements. The results show that SVM nonlinear regression is an effective method in aero-engine monitoring.

Keywords: Aero-engine condition monitoring
[275] Lin Hua, Ping Zhou, Hong Liu, Lin Li, Zheng Yang, and Zhi cheng Liu. Mining susceptibility gene modules and disease risk genes from SNP data by combining network topological properties with support vector regression. Journal of Theoretical Biology, 289:225 - 236, 2011. [ bib | DOI | http ]
Genome-wide association study is a powerful approach to identify disease risk loci. However, the molecular regulatory mechanisms for most complex diseases are still not well understood. Therefore, further investigating the interplay between genetic factors and biological networks is important for elucidating the molecular mechanisms of complex diseases. Here, we proposed a novel framework to identify susceptibility gene modules and disease risk genes by combining network topological properties with support vector regression from single nucleotide polymorphism (SNP) level. We assigned risk SNPs to genes using the University of California at Santa Cruz (UCSC) genome database, and then mapped these genes to protein–protein interaction (PPI) networks. The gene modules implicated by hub genes were extracted using the PPI networks and the topological property was analyzed for these gene modules. For each gene module, risk feature genes were determined by topological property analysis and support vector regression. As a result, five shared risk feature genes, CD80, EGFR, FN1, GSK3B and TRAF6 were found and proven to be associated with rheumatoid arthritis by previous reports. Our approach showed a good performance in comparison with other approaches and can be used for prioritizing candidate genes associated with complex diseases.

Keywords: Complex diseases
[276] Thomas F. Boucher, Marie V. Ozanne, Marco L. Carmosino, M. Darby Dyar, Sridhar Mahadevan, Elly A. Breves, Kate H. Lepore, and Samuel M. Clegg. A study of machine learning regression methods for major elemental analysis of rocks using laser-induced breakdown spectroscopy. Spectrochimica Acta Part B: Atomic Spectroscopy, 107:1 - 10, 2015. [ bib | DOI | http ]
Abstract The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05 and all techniques except kNN produced statistically-indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.

Keywords: Laser-induced breakdown spectroscopy (LIBS)
[277] Mohammad H. Fatemi, Afsane Heidari, and Sajjad Gharaghani. QSAR prediction of HIV-1 protease inhibitory activities using docking derived molecular descriptors. Journal of Theoretical Biology, 369:13 - 22, 2015. [ bib | DOI | http ]
Abstract In this study, application of a new hybrid docking-quantitative structure activity relationship (QSAR) methodology to model and predict the HIV-1 protease inhibitory activities of a series of newly synthesized chemicals is reported. This hybrid docking-QSAR approach can provide valuable information about the most important chemical and structural features of the ligands that affect their inhibitory activities. Docking studies were used to find the actual conformations of chemicals in the active site of HIV-1 protease. Then the molecular descriptors were calculated from these conformations. Multiple linear regression (MLR) and least square support vector machine (LS-SVM) were used as QSAR models. The obtained results reveal that the statistical parameters of the LS-SVM model are better than those of the MLR model, which indicates that there are some non-linear relations between the selected molecular descriptors and the anti-HIV activities of the chemicals of interest. The correlation coefficient (R), root mean square error (RMSE) and average absolute error (AAE) for LS-SVM are: R=0.988, RMSE=0.207 and AAE=0.145 for the training set, and R=0.965, RMSE=0.403 and AAE=0.338 for the test set. A leave-one-out cross-validation test was used to assess the predictive power and validity of the models, which led to cross-validation correlation coefficients of 0.864 and 0.850 and standardized predicted relative error sums of squares (SPRESS) of 0.553 and 0.581 for the LS-SVM and MLR models, respectively.

Keywords: Hybrid docking
[278] Jui-Sheng Chou and Dac-Khuong Bui. Modeling heating and cooling loads by artificial intelligence for energy-efficient building design. Energy and Buildings, 82:437 - 446, 2014. [ bib | DOI | http ]
Abstract The energy performance of buildings was estimated using various data mining techniques, including support vector regression (SVR), artificial neural network (ANN), classification and regression tree, chi-squared automatic interaction detector, general linear regression, and ensemble inference model. The prediction models were constructed using 768 experimental datasets from the literature with 8 input parameters and 2 output parameters (cooling load (CL) and heating load (HL)). Comparison results showed that the ensemble approach (SVR + ANN) and SVR were the best models for predicting CL and HL, respectively, with mean absolute percentage errors below 4%. Compared to previous works, the ensemble model and SVR model further obtained at least 39.0% to 65.9% lower root mean square errors, respectively, for CL and HL prediction. This study confirms the efficiency, effectiveness, and accuracy of the proposed approach when predicting CL and HL in building design stage. The analytical results support the feasibility of using the proposed techniques to facilitate early designs of energy conserving buildings.

Keywords: Cooling load
[279] Dug Hun Hong and Changha Hwang. Support vector fuzzy regression machines. Fuzzy Sets and Systems, 138(2):271 - 281, 2003. [ bib | DOI | http ]
Support vector machine (SVM) has been very successful in pattern recognition and function estimation problems. In this paper, we introduce the use of SVM for multivariate fuzzy linear and nonlinear regression models. Using the basic idea underlying SVM for multivariate fuzzy regressions yields computational efficiency in obtaining solutions.

Keywords: Fuzzy inference systems
[280] G.J. Postma, P.W.T. Krooshof, and L.M.C. Buydens. Opening the kernel of kernel partial least squares and support vector machines. Analytica Chimica Acta, 705(1–2):123 - 134, 2011. A selection of papers presented at the 12th International Conference on Chemometrics in Analytical Chemistry. [ bib | DOI | http ]
Kernel partial least squares (KPLS) and support vector regression (SVR) have become popular techniques for regression of complex non-linear data sets. The modeling is performed by mapping the data in a higher dimensional feature space through the kernel transformation. The disadvantage of such a transformation is, however, that information about the contribution of the original variables in the regression is lost. In this paper we introduce a method which can retrieve and visualize the contribution of the variables to the regression model and the way the variables contribute to the regression of complex data sets. The method is based on the visualization of trajectories using so-called pseudo samples representing the original variables in the data. We test and illustrate the proposed method on several synthetic and real benchmark data sets. The results show that for linear and non-linear regression models the important variables were identified with corresponding linear or non-linear trajectories. The results were verified by comparing with ordinary PLS regression and by selecting those variables which were indicated as important and rebuilding a model with only those variables.

Keywords: Kernel partial least squares
[281] K. De Brabanter, P. Karsmakers, J. De Brabanter, J.A.K. Suykens, and B. De Moor. Confidence bands for least squares support vector machine classifiers: A regression approach. Pattern Recognition, 45(6):2280 - 2287, 2012. Brain Decoding. [ bib | DOI | http ]
This paper presents bias-corrected 100(1 − α)% simultaneous confidence bands for least squares support vector machine classifiers based on a regression framework. The bias, which is inherently present in every nonparametric method, is estimated using double smoothing. In order to obtain simultaneous confidence bands we make use of the volume-of-tube formula. We also provide extensions of this formula in higher dimensions and show that the width of the bands expands with increasing dimensionality. Simulations and data analysis support its usefulness in practical real life classification problems.

Keywords: Kernel based classification
[282] Dao-Hong Xiang, Ting Hu, and Ding-Xuan Zhou. Learning with varying insensitive loss. Applied Mathematics Letters, 24(12):2107 - 2109, 2011. [ bib | DOI | http ]
Support vector machines for regression are implemented based on regularization schemes in reproducing kernel Hilbert spaces associated with an ϵ-insensitive loss. The insensitive parameter ϵ > 0 changes with the sample size and plays a crucial role in the learning algorithm. The purpose of this paper is to present a perturbation theorem to show how the medium function of the probability measure for regression (with ϵ = 0) can be approximated by learning the minimizer of the generalization error with sufficiently small parameter ϵ > 0. A concrete learning rate is provided under a regularity condition of the medium function and a noise condition of the probability measure.

Keywords: Support vector machine
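Note on entry [282]: the ϵ-insensitive loss referred to in the abstract is conventionally defined as max(0, |y − f(x)| − ϵ), so residuals inside a tube of half-width ϵ cost nothing. The sketch below shows that standard definition only; the function name is hypothetical, and the sample-size-dependent choice of ϵ studied in the paper is not modeled.

```python
def eps_insensitive_loss(y, f, eps):
    """Standard epsilon-insensitive loss: zero inside the tube |y - f| <= eps,
    linear in the excess residual outside it."""
    return max(0.0, abs(y - f) - eps)


# Illustrative values only.
print(eps_insensitive_loss(1.0, 1.05, 0.1))  # inside the tube
print(eps_insensitive_loss(1.0, 1.50, 0.1))  # outside the tube
print(eps_insensitive_loss(1.0, 1.50, 0.0))  # eps = 0 reduces to |y - f|
```

Setting ϵ = 0 recovers the absolute-error loss, which is the limiting case the abstract mentions.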
[283] Min Han and ZhanJi Cao. An improved case-based reasoning method and its application in endpoint prediction of basic oxygen furnace. Neurocomputing, 149, Part C:1245 - 1252, 2015. [ bib | DOI | http ]
Abstract Case retrieval and case revise (reuse) are core parts of case-based reasoning (CBR). To address the problems that weights of condition attributes are difficult to evaluate in case retrieval, and that there are few effective strategies for case revise, this paper introduces an improved case-based reasoning method based on fuzzy c-means clustering (FCM), mutual information and support vector machine (SVM). Fuzzy c-means clustering is used to divide the case base to improve the efficiency of the algorithm. In the case retrieval process, mutual information is introduced to calculate weights of each condition attribute and evaluate their contributions to reasoning results accurately. Considering the good ability of the support vector machine for dealing with limited samples, it is adopted to build an optical regression model for case revise. The proposed method is applied in endpoint prediction of Basic Oxygen Furnace (BOF), and simulation experiments based on a set of actual production data from a 180 t steelmaking furnace show that the model based on improved CBR achieves high prediction accuracy and good robustness.

Keywords: Case-based reasoning
[284] Changha Hwang, Dug Hun Hong, and Kyung Ha Seok. Support vector interval regression machine for crisp input and output data. Fuzzy Sets and Systems, 157(8):1114 - 1125, 2006. [ bib | DOI | http ]
Support vector regression (SVR) has been very successful in function estimation problems for crisp data. In this paper, we propose a robust method to evaluate interval regression models for crisp input and output data, combining the possibility estimation formulation integrating the property of central tendency with the principle of standard SVR. The proposed method is robust in the sense that outliers do not affect the resulting interval regression. Furthermore, the proposed method is a model-free method, since we do not have to assume the underlying model function for the interval nonlinear regression model with crisp input and output. In particular, this method performs better and is conceptually simpler than support vector interval regression networks (SVIRNs), which utilize two radial basis function networks to identify the upper and lower sides of the data interval. Five examples are provided to show the validity and applicability of the proposed method.

Keywords: Interval regression analysis
[285] JinXing Che, JianZhou Wang, and YuJuan Tang. Optimal training subset in a support vector regression electric load forecasting model. Applied Soft Computing, 12(5):1523 - 1531, 2012. [ bib | DOI | http ]
This paper presents an optimal training subset for support vector regression (SVR) under deregulated power, which has a distinct advantage over SVR based on the full training set, since it solves the problem of large sample memory complexity O(N²) and prevents over-fitting during unbalanced data regression. To compute the proposed optimal training subset, an approximation convexity optimization framework is constructed through coupling a penalty term for the size of the optimal training subset to the mean absolute percentage error (MAPE) for the full training set prediction. Furthermore, a special method for finding the approximate solution of the optimization goal function is introduced, which enables us to extract maximum information from the full training set and increases the overall prediction accuracy. The applicability and superiority of the presented algorithm are shown by the half-hourly electric load data (48 data points per day) experiments in New South Wales under three different sample sizes. In particular, the benefit of the developed methods for large data sets is demonstrated by the significantly lower CPU running time.

Keywords: Support vector regression
[286] Shuangyin Liu, Haijiang Tai, Qisheng Ding, Daoliang Li, Longqin Xu, and Yaoguang Wei. A hybrid approach of support vector regression with genetic algorithm optimization for aquaculture water quality prediction. Mathematical and Computer Modelling, 58(3–4):458 - 465, 2013. Computer and Computing Technologies in Agriculture 2011 and Computer and Computing Technologies in Agriculture 2012. [ bib | DOI | http ]
Water quality prediction plays an important role in modern intensive river crab aquaculture management. Due to the nonlinearity and non-stationarity of water quality indicator series, the accuracy of the commonly used conventional methods, including regression analyses and neural networks, has been limited. A prediction model based on support vector regression (SVR) is proposed in this paper to solve the aquaculture water quality prediction problem. To build an effective SVR model, the SVR parameters must be set carefully. This study presents a hybrid approach, known as real-value genetic algorithm support vector regression (RGA–SVR), which searches for the optimal SVR parameters using real-value genetic algorithms, and then adopts the optimal parameters to construct the SVR models. The approach is applied to predict the aquaculture water quality data collected from the aquatic factories of YiXing, in China. The experimental results demonstrate that RGA–SVR outperforms the traditional SVR and back-propagation (BP) neural network models based on the root mean square error (RMSE) and mean absolute percentage error (MAPE). This RGA–SVR model is proven to be an effective approach to predict aquaculture water quality.

Keywords: Water quality prediction
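Note on the error measures named in entry [286]: the abstract compares models by RMSE and MAPE but does not spell them out. The following is a minimal sketch of the standard textbook definitions, not the authors' code; the function names `rmse` and `mape` are chosen here for illustration, and MAPE is reported in percent under the assumption that no actual value is zero.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is nonzero (the ratio is undefined otherwise).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)
```

Both measures are lower-is-better; RMSE is in the units of the predicted quantity, while MAPE is scale-free, which is one reason the two are often reported together.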
[287] João Mendes-Moreira, Alípio Mário Jorge, Jorge Freire de Sousa, and Carlos Soares. Improving the accuracy of long-term travel time prediction using heterogeneous ensembles. Neurocomputing, 150, Part B:428 - 439, 2015. Special Issue on Information Processing and Machine Learning for Applications of Engineering; Solving Complex Machine Learning Problems with Ensemble Methods; Visual Analytics using Multidimensional Projections; Selected papers from the IEEE 17th International Conference on Intelligent Engineering Systems (INES’13); Selected papers from the Workshop on Visual Analytics using Multidimensional Projections, held at EuroVis 2013. [ bib | DOI | http ]
Abstract This paper is about long-term travel time prediction in public transportation. However, it can be useful for a wider area of applications. It follows a heterogeneous ensemble approach with dynamic selection. A vast set of experiments with a pool of 128 tuples of algorithms and parameter sets (a&ps) has been conducted for each of the six studied routes. Three different algorithms, namely, random forest, projection pursuit regression and support vector machines, were used. Then, ensembles of different sizes were obtained after a pruning step. The best approach to combine the outputs is also addressed. Finally, the best ensemble approach for each of the six routes is compared with the best individual a&ps. The results confirm that heterogeneous ensembles are adequate for long-term travel time prediction. Namely, they achieve both higher accuracy and robustness along time than state-of-the-art learners.

Keywords: Travel time prediction
[288] M.V. Suganyadevi and C.K. Babulal. Support vector regression model for the prediction of loadability margin of a power system. Applied Soft Computing, 24:304 - 315, 2014. [ bib | DOI | http ]
Abstract Loadability limits are critical points of particular interest in voltage stability assessment, indicating how much a system can be stressed from a given state before reaching instability. Thus estimating the loadability margin of a power system is essential in the real time voltage stability assessment. A new methodology is developed based on Support Vector Regression (SVR) which is the most common application form of Support Vector Machines (SVM). The proposed SVR methodology can successfully estimate the loadability margin under normal operating conditions and different loading directions. SVR has the feature of minimizing the generalization error in achieving the generalized network over the other mapping methods. In this paper, the SVR input vector is in the form of real and reactive power load, while the target vector is lambda (loading margin). To reduce both mean square error and prediction time in SVR, the kernel type and SVR parameters are chosen using grid search based on the 10-fold cross-validation method for the best SVR network. The results of SVRs (nu-SVR and epsilon-SVR) are compared with RBF neural networks and validated in the IEEE 30 bus system and IEEE 118 bus system at different operating scenarios. The results demonstrate the effectiveness of the proposed method for on-line prediction of loadability margins of a power system.

Keywords: Loadability margin
[289] Aslı Çelikyılmaz and I. Burhan Türkşen. Fuzzy functions with support vector machines. Information Sciences, 177(23):5163 - 5177, 2007. Including: Mathematics of Uncertainty; A selection of the very best extended papers of the IMS-2004 held at Sarkaya University in Turkey. [ bib | DOI | http ]
A new fuzzy system modeling (FSM) approach that identifies the fuzzy functions using support vector machines (SVM) is proposed. This new approach is structurally different from the fuzzy rule base approaches and fuzzy regression methods. It is a new alternate version of the earlier FSM with fuzzy functions approaches. SVM is applied to determine the support vectors for each fuzzy cluster obtained by fuzzy c-means (FCM) clustering algorithm. Original input variables, the membership values obtained from the FCM together with their transformations form a new augmented set of input variables. The performance of the proposed system modeling approach is compared to previous fuzzy functions approaches, standard SVM, and LSE methods using an artificial sparse dataset and a real-life non-sparse dataset. The results indicate that the proposed fuzzy functions with support vector machines approach is a feasible and stable method for regression problems and results in higher performances than the classical statistical methods.

Keywords: Fuzzy system modeling
[290] Aixia Yan, Yang Chong, Liyu Wang, Xiaoying Hu, and Kai Wang. Prediction of biological activity of aurora-a kinase inhibitors by multilinear regression analysis and support vector machine. Bioorganic & Medicinal Chemistry Letters, 21(8):2238 - 2243, 2011. [ bib | DOI | http ]
Several QSAR (quantitative structure–activity relationships) models for predicting the inhibitory activity of 117 Aurora-A kinase inhibitors were developed. The whole dataset was split into a training set and a test set based on two different methods, (1) by a random selection; and (2) on the basis of a Kohonen’s self-organizing map (SOM). Then the inhibitory activity of 117 Aurora-A kinase inhibitors was predicted using multilinear regression (MLR) analysis and support vector machine (SVM) methods, respectively. For the two MLR models and the two SVM models, for the test sets, the correlation coefficients of over 0.92 were achieved.

Keywords: Aurora-A kinase inhibitors
[291] Hannes Feilhauer, Gregory P. Asner, and Roberta E. Martin. Multi-method ensemble selection of spectral bands related to leaf biochemistry. Remote Sensing of Environment, 164:57 - 65, 2015. [ bib | DOI | http ]
Abstract Multi-method ensembles are generally believed to return more reliable results than the application of one method alone. Here, we test if for the quantification of leaf traits an ensemble of regression models, consisting of Partial Least Squares (PLSR), Random Forest (RFR), and Support Vector Machine regression (SVMR) models, is able to improve the robustness of the spectral band selection process compared to the outcome of a single technique alone. The ensemble approach was tested using one artificial and five measured data sets of leaf level spectra and corresponding information on leaf chlorophyll, dry matter, and water content. PLSR models optimized for the goodness of fit, an established approach for band selection, were used to evaluate the performance of the ensemble. Although the fits of the models within the ensemble were poorer than the fits achieved with the reference approach, the ensemble was able to provide a band selection with higher consistency across all data sets. Due to the selection characteristics of the methods within the ensemble, the ensemble selection is moderately narrow and restrictive but in good agreement with known absorption features published in literature. We conclude that analyzing the range of agreement of different model types is an efficient way to select a robust set of spectral bands related to the foliar properties under investigation. This may help to deepen our understanding of the spectral response of biochemical and biophysical traits in foliage and canopies.

Keywords: Hyperspectral
[292] K. De Brabanter, J. De Brabanter, J.A.K. Suykens, and B. De Moor. Optimized fixed-size kernel models for large data sets. Computational Statistics & Data Analysis, 54(6):1484 - 1504, 2010. [ bib | DOI | http ]
A modified active subset selection method based on quadratic Rényi entropy and a fast cross-validation for fixed-size least squares support vector machines is proposed for classification and regression with optimized tuning process. The kernel bandwidth of the entropy based selection criterion is optimally determined according to the solve-the-equation plug-in method. Also a fast cross-validation method based on a simple updating scheme is developed. The combination of these two techniques is suitable for handling large scale data sets on standard personal computers. Finally, the performance on test data and computational time of this fixed-size method are compared to those for standard support vector machines and ν-support vector machines resulting in sparser models with lower computational cost and comparable accuracy.

Keywords: Kernel methods
[293] Shahaboddin Shamshirband, Dalibor Petković, Hadi Saboohi, Nor Badrul Anuar, Irum Inayat, Shatirah Akib, Žarko Ćojbašić, Vlastimir Nikolić, Miss Laiha Mat Kiah, and Abdullah Gani. Wind turbine power coefficient estimation by soft computing methodologies: Comparative study. Energy Conversion and Management, 81:520 - 526, 2014. [ bib | DOI | http ]
Abstract Wind energy has become a large contender of traditional fossil fuel energy, particularly with the successful operation of multi-megawatt sized wind turbines. However, reasonable wind speed is not adequately sustainable everywhere to build an economical wind farm. In wind energy conversion systems, one of the operational problems is the changeability and fluctuation of wind. In most cases, wind speed can vacillate rapidly. Hence, quality of produced energy becomes an important problem in wind energy conversion plants. Several control techniques have been applied to improve the quality of power generated from wind turbines. In this study, the polynomial and radial basis function (RBF) are applied as the kernel function of support vector regression (SVR) to estimate optimal power coefficient value of the wind turbines. Instead of minimizing the observed training error, SVR_poly and SVR_rbf attempt to minimize the generalization error bound so as to achieve generalized performance. The experimental results show that an improvement in predictive accuracy and capability of generalization can be achieved by the SVR approach in comparison to other soft computing methodologies.

Keywords: Wind turbine
[294] Wen Zhang, Ye Yang, and Qing Wang. Using bayesian regression and EM algorithm with missing handling for software effort prediction. Information and Software Technology, 58:58 - 70, 2015. [ bib | DOI | http ]
Abstract Context Although independent imputation techniques are comprehensively studied in software effort prediction, there are few studies on embedded methods in dealing with missing data in software effort prediction. Objective We propose BREM (Bayesian Regression and Expectation Maximization) algorithm for software effort prediction and two embedded strategies to handle missing data. Method The MDT (Missing Data Toleration) strategy ignores the missing data when using BREM for software effort prediction and the MDI (Missing Data Imputation) strategy uses observed data to impute missing data in an iterative manner while elaborating the predictive model. Results Experiments on the ISBSG and CSBSG datasets demonstrate that when there are no missing values in historical dataset, BREM outperforms LR (Linear Regression), BR (Bayesian Regression), SVR (Support Vector Regression) and M5′ regression tree in software effort prediction on the condition that the test set is not greater than 30% of the whole historical dataset for ISBSG dataset and 25% of the whole historical dataset for CSBSG dataset. When there are missing values in historical datasets, BREM with the MDT and MDI strategies significantly outperforms those independent imputation techniques, including MI, BMI, CMI, MINI and M5′. Moreover, the MDI strategy provides BREM with more accurate imputation for the missing values than those given by the independent missing imputation techniques on the condition that the level of missing data in training set is not larger than 10% for both ISBSG and CSBSG datasets. Conclusion The experimental results suggest that BREM is promising in software effort prediction. When there are missing values, the MDI strategy is preferred to be embedded with BREM.

Keywords: Bayesian regression
[295] Christophe Crambes, Ali Gannoun, and Yousri Henchiri. Support vector machine quantile regression approach for functional data: Simulation and application studies. Journal of Multivariate Analysis, 121:50 - 68, 2013. [ bib | DOI | http ]
Abstract The topic of this paper is related to quantile regression when the covariate is a function. The estimator we are interested in, based on the Support Vector Machine method, was introduced in Crambes et al. (2011) [11]. We improve the results obtained in this former paper, giving a rate of convergence in probability of the estimator. In addition, we give a practical method to construct the estimator, solution of a penalized L1-type minimization problem, using an Iterative Reweighted Least Squares procedure. We evaluate the performance of the estimator in practice through simulations and a real data set study.

Keywords: Conditional quantile regression
[296] Baixi Xing, Kejun Zhang, Shouqian Sun, Lekai Zhang, Zenggui Gao, Jiaxi Wang, and Shi Chen. Emotion-driven chinese folk music-image retrieval based on de-svm. Neurocomputing, 148:619 - 627, 2015. [ bib | DOI | http ]
Abstract In this study, we attempt to explore cross-media retrieval between music and image data based on the emotional correlation. Emotion feature analytic could be the bridge of cross-media retrieval, since emotion represents the user's perspective and effectively meets the user's retrieval need. Currently, there is little research about the emotion correlation of different multimedia data (e.g. image or music). We propose a promising model based on Differential Evolutionary-Support Vector Machine (DE-SVM) to build up the emotion-driven cross-media retrieval system between Chinese folk image and Chinese folk music. In this work, we first build up the Chinese Folk Music Library and Chinese Folk Image Library. Second, we compare Back Propagation (BP), Linear Regression (LR) and Differential Evolutionary-Support Vector Machine (DE-SVM), and find that DE-SVM has the best performance. Then we conduct DE-SVM to build the optimal model for music/image emotion recognition. Finally, an Emotion-driven Chinese Folk Music-Image Exploring System based on DE-SVM is developed and experiment results show our method is effective in terms of retrieval performance.

Keywords: Music emotion recognition
[297] Kyuho Hwang and Sooyong Choi. Blind equalizer for constant-modulus signals based on gaussian process regression. Signal Processing, 92(6):1397 - 1403, 2012. [ bib | DOI | http ]
A new blind equalization method for constant modulus (CM) signals based on Gaussian process for regression (GPR) by incorporating a constant modulus algorithm (CMA)-like error function into the conventional GPR framework is proposed. The GPR framework formulates the posterior density function for weights using Bayes' rule under the assumption of Gaussian prior for weights. The proposed blind GPR equalizer is based on linear-in-weights regression model, which has a form of nonlinear minimum mean-square error solution. Simulation results in linear and nonlinear channels are presented in comparison with the state-of-the-art support vector machine (SVM) and relevance vector machine (RVM) based blind equalizers. The simulation results show that the proposed blind GPR equalizer without cumbersome cross-validation procedures shows the similar performances to the blind SVM and RVM equalizers in terms of intersymbol interference and bit error rate.

Keywords: Gaussian process regression
[298] M.H. Fatemi, E. Mousa Shahroudi, and Z. Amini. Development of quantitative interspecies toxicity relationship modeling of chemicals to fish. Journal of Theoretical Biology, 380:16 - 23, 2015. [ bib | DOI | http ]
Abstract In this work, quantitative interspecies-toxicity relationship methodologies were used to improve the prediction power of interspecies toxicity model. The most relevant descriptors selected by stepwise multiple linear regressions and toxicity of chemical to Daphnia magna were used to predict the toxicities of chemicals to fish. Modeling methods that were used for developing linear and nonlinear models were multiple linear regression (MLR), random forest (RF), artificial neural network (ANN) and support vector machine (SVM). The obtained results indicate the superiority of the SVM model over other models. Robustness and reliability of the constructed SVM model were evaluated by using the leave-one-out cross-validation method (Q^2 = 0.69, SPRESS = 0.822) and Y-randomization test (R^2 = 0.268 for 30 trials). Furthermore, the chemical applicability domains of these models were determined via leverage approach. The developed SVM model was used to predict, from their toxicities to D. magna and relevant molecular descriptors, the toxicity of 46 compounds whose experimental toxicities to fish had not been reported earlier.

Keywords: Toxicity
[299] Feng Gao, Peng Kou, Lin Gao, and Xiaohong Guan. Boosting regression methods based on a geometric conversion approach: Using SVMs base learners. Neurocomputing, 113:67 - 87, 2013. [ bib | DOI | http ]
Boosting is one of the most important developments in ensemble learning during the past decade. Among different types of boosting methods, AdaBoost is the earliest and the most prevailing one that receives lots of attention for its effectiveness and practicality. Hitherto the research on boosting is dominated by classification problems. Conversely, the extension of boosting to regression is not as successful as that on classification. In this paper, we propose a new approach to extending boosting to regression. This approach first converts a regression sample to a binary classification sample from a geometric point of view, and performs AdaBoost with support vector machines base learner on the converted classification sample. Then the separating hypersurface ensemble obtained from AdaBoost is equivalent to a regression function for the original regression sample. Based on this approach, two new boosting regression methods are presented. The first method adopts the explicit geometric conversion while the second method adopts the implicit geometric conversion. Since both these methods essentially run on the binary classification samples, the convergence property of the standard AdaBoost still holds for them. Experimental results validate the effectiveness of the proposed methods.

Keywords: Boosting
[300] Aihua Zhang, Yongchao Wang, and Zhiqiang Zhang. A novel online performance evaluation strategy to analog circuit. Neurocomputing, pages -, 2015. [ bib | DOI | http ]
Abstract An analog circuit performance online evaluation approach is presented subject to the inevitable actualities of the fault value caused during the data collection process. The multi-model with the corresponding features is modeled via fuzzy clustering based data features firstly. And then the developed scheme relies on a weighted combination of normal least square support vector regression (LSSVR) and particle swarm optimization (PSO) to realize the active suppression for the wrong value and disturbance parameters. Furthermore, another problem should be considered; namely, the traditional offline evaluation approach could not realize the model's timely adjustment with the sample increasing or decreasing. Focusing on this issue, the increase and decrease interaction update idea is imported to the modified performance evaluation scheme. The developed model can be updated quickly online. Numerical testing data information supported by the college analog circuit experiments adopted eight performance indexes of the traditional OTL amplifier to establish training set. This data information had been obtained via precision instrument evaluation in two years. Numerical simulations are performed to verify the performance of the proposed approach.

Keywords: PSO–LSSVR
[301] Jing Geng, Ming-Wei Li, Zhi-Hui Dong, and Yu-Sheng Liao. Port throughput forecasting by mars-rsvr with chaotic simulated annealing particle swarm optimization algorithm. Neurocomputing, 147:239 - 250, 2015. Advances in Self-Organizing Maps Subtitle of the special issue: Selected Papers from the Workshop on Self-Organizing Maps 2012 (WSOM 2012). [ bib | DOI | http ]
Abstract Port throughput forecasting is a very complex nonlinear dynamic process, prediction accuracy is influenced by uncertainty of socio-economic factors, especially by the mixed noise (singular point) produced in the collection, transfer and calculation of statistical data; consequently, it is difficult to obtain a satisfactory port throughput forecasting result. Thus, establishing an effective port throughput forecasting scheme is still a significant research issue. Since the robust v-support vector regression model (RSVR) has the ability to solve the nonlinear and mixed noise in the port throughput history data and its related socio-economic factors, this paper introduces the RSVR model to forecast port throughput. In order to search the more appropriate parameters combination for the RSVR model, considering the proposed simulated annealing particle swarm optimization (SAPSO) algorithm and the original PSO algorithm still have the drawbacks of immature convergence and is time consuming, this study presents chaotic simulated annealing particle swarm optimization (CSAPSO) algorithm to determine the parameter combination. Aiming to identify the final input vectors for RSVR model, the multivariable adaptive regression splines (MARS) is adopted to select the final input vectors from the candidate input variables. This study eventually proposes a port throughput forecasting scheme that hybridizes the RSVR, CSAPSO and MARS to obtain a more accurate forecasting result. Subsequently, this study compiles the port throughput data and the corresponding socio-economic indicators data of Shanghai as the illustrative example to evaluate the feasibility and performance of the proposed scheme. The experimental results indicate that the proposed port throughput forecasting scheme obtains better forecasting result than the six competing models in terms of forecasting error.

Keywords: Port throughput
[302] Sounak Chakraborty. Bayesian multiple response kernel regression model for high dimensional data and its practical applications in near infrared spectroscopy. Computational Statistics & Data Analysis, 56(9):2742 - 2755, 2012. [ bib | DOI | http ]
Non-linear regression based on reproducing kernel Hilbert space (RKHS) has recently become very popular in fitting high-dimensional data. The RKHS formulation provides an automatic dimension reduction of the covariates. This is particularly helpful when the number of covariates (p) far exceeds the number of data points. In this paper, we introduce a Bayesian nonlinear multivariate regression model for high-dimensional problems. Our model is suitable when we have multiple correlated observed responses corresponding to the same set of covariates. We introduce a robust Bayesian support vector regression model based on a multivariate version of Vapnik’s ϵ-insensitive loss function. The likelihood corresponding to the multivariate Vapnik’s ϵ-insensitive loss function is constructed as a scale mixture of truncated normal and gamma distribution. The regression function is constructed using the finite representation of a function in the reproducing kernel Hilbert space (RKHS). The kernel parameter is estimated adaptively by assigning a prior on it and using the Markov chain Monte Carlo (MCMC) techniques for computation. Practical applications of our model are demonstrated via applications in near-infrared (NIR) spectroscopy and simulation studies. Our Bayesian kernel models are highly accurate in predicting composition of materials based on its near infrared (NIR) spectroscopy signature. We have compared our method with popularly used methodologies in NIR spectroscopy, like partial least square (PLS), principal component regression (PCA), support vector machine (SVM), Gaussian process regression (GPR), and random forest (RF). In all the simulation and real case studies, our multivariate Bayesian RKHS regression model outperforms the standard methods by a substantially large margin. The implementation of our models based on MCMC is fairly fast and straightforward.

Keywords: Bayesian prediction
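Note on the loss function named in entry [302]: the abstract builds on a multivariate version of Vapnik's ϵ-insensitive loss, which it does not write out. As a point of reference only, the following minimal sketch shows the standard univariate form, L(y, f) = max(0, |y - f| - ϵ); it is not the multivariate likelihood construction of the paper. The function name and the default `epsilon=0.1` are illustrative assumptions.

```python
def epsilon_insensitive_loss(y, f, epsilon=0.1):
    """Standard univariate epsilon-insensitive loss.

    Residuals inside the tube |y - f| <= epsilon cost nothing;
    outside it the loss grows linearly with the excess.
    """
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    return max(0.0, abs(y - f) - epsilon)
```

The flat region around zero is what makes SVR solutions sparse: observations whose residuals fall inside the tube contribute nothing to the loss.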
[303] Jin-Tsong Jeng, Chen-Chia Chuang, and Chin-Wang Tao. Hybrid svmr-gpr for modeling of chaotic time series systems with noise and outliers. Neurocomputing, 73(10–12):1686 - 1693, 2010. Subspace Learning / Selected papers from the European Symposium on Time Series Prediction. [ bib | DOI | http ]
In this paper, the hybrid support vector machines for regression (SVMR) and Gaussian processes for regression (GPR) are proposed to deal with training data set with noise and outliers for the chaotic time series systems. In the proposed approach, there are two-stage strategies and can be a sparse approximation. In stage I, the SVMR approach is used to filter out some large noise and outliers in the training data set. Because the large noises and outliers in the training data set are almost removed, the affection of large noises and outliers is also reduced. That is, the proposed approach can be against the large noise and outliers. Hence, the proposed approach is also a robust approach. After stage I, the rest of the training data set is directly used to train the GPR in stage II. From the simulation results, the performance of the proposed approach is superior to least squares support vector machines regression (LS-SVMR), GPR, weighted LS-SVM and robust support vector regression networks when there are noise and outliers on the chaotic time-series systems.

Keywords: Support vector machine regression
[304] Jun-Hu Cheng, Da-Wen Sun, Hongbin Pu, and Zhiwei Zhu. Development of hyperspectral imaging coupled with chemometric analysis to monitor k value for evaluation of chemical spoilage in fish fillets. Food Chemistry, 185:245 - 253, 2015. [ bib | DOI | http ]
Abstract K value is an important freshness index widely used for indication of nucleotide degradation and assessment of chemical spoilage. The feasibility of hyperspectral imaging (400–1000 nm) for determination of K value in grass carp and silver carp fillets was investigated. Partial least square (PLS) regression and least square support vector machines (LS-SVM) models established using full wavelengths showed excellent performances and the PLS model was better with higher determination coefficients of prediction (R^2_P = 0.936) and lower root mean square errors of prediction (RMSEP = 5.21%). The simplified PLS and LS-SVM models using the seven optimal wavelengths selected by successive projections algorithm (SPA) also presented good performances. The spatial distribution map of K value was generated by transferring the SPA-PLS model to each pixel of the images. The current study showed the suitability of using hyperspectral imaging to determine K value for evaluation of chemical spoilage and freshness of fish fillets.

Keywords: Hyperspectral imaging
[305] Soheil Sarhadi and Turaj Amraee. Robust dynamic network expansion planning considering load uncertainty. International Journal of Electrical Power & Energy Systems, 71:140 - 150, 2015. [ bib | DOI | http ]
Abstract This paper presents a dynamic transmission expansion planning framework with considering load uncertainty based on Information-Gap Decision Theory. Dynamic transmission planning process is carried out to obtain the minimum total social cost over the planning horizon. Robustness of the decisions against under-estimated load predictions is modeled using a robustness function. Furthermore, an opportunistic model is proposed for risk-seeker decision making. The proposed IGDT-based dynamic network expansion planning is formulated as a stochastic mixed integer non-linear problem and is solved using an improved standard branch and bound technique. The performance of the proposed scheme is verified over two test cases including the 24-bus IEEE RTS system and Iran national 400-kV transmission network.

Keywords: Information-Gap Decision Theory
[306] Zhengzong Wu, Enbo Xu, Jie Long, Fang Wang, Xueming Xu, Zhengyu Jin, and Aiquan Jiao. Measurement of fermentation parameters of chinese rice wine using raman spectroscopy combined with linear and non-linear regression methods. Food Control, 56:95 - 102, 2015. [ bib | DOI | http ]
Abstract Effective fermentation monitoring is a growing need during the manufacture of wine due to the rapid pace of change in the wine industry. Ethanol and reducing sugar are two most important process variables indicating the status of Chinese rice wine (CRW) fermentation process. In this study, the potentials of Raman spectroscopy (RS) as a rapid process analytical technique to monitor the evolution of these two chemical parameters involved in CRW fermentation process and to group samples according to different fermentation stages were investigated. The results demonstrated that compared with the PLS model using all wavelengths of Raman spectra, the prediction precision of model based on the spectral variables selected by competitive adaptive reweighted sampling (Cars) was significantly improved. In addition, nonlinear models outperformed linear models in predicting fermentation parameters. After systematic comparison and discussion, it was found that for both ethanol and glucose, Cars-support vector machine (Cars-SVM) models gave the best results with the highest prediction precisions. Moreover, the results obtained from discriminant partial least squares analysis (DPLS) showed that good performances were obtained with an average correct classification rate of 94.9% for different fermentation stages. The overall results indicated that RS combined with efficient variable selection algorithm and nonlinear regression tool could be utilized as a rapid method to monitor CRW fermentation process.

Keywords: Chinese rice wine
[307] R. Taghizadeh-Mehrjardi, K. Nabiollahi, B. Minasny, and J. Triantafilis. Comparing data mining classifiers to predict spatial distribution of USDA-family soil groups in Baneh region, Iran. Geoderma, 253–254:67 - 77, 2015. [ bib | DOI | http ]
Abstract Digital soil mapping involves the use of auxiliary data to assist in the mapping of soil classes. In this research, we investigate the predictive power of 6 data mining classifiers, namely Logistic regression (LR), artificial neural network (ANN), support vector machine (SVM), K-nearest neighbour (KNN), random forest (RF), and decision tree model (DTM) to create a DSM across an area covering 3000 ha in Kurdistan Province, north-west Iran. In this area, using the conditioned Latin hypercube sampling method, 217 soil profiles were selected, sampled, analysed and allocated to taxonomic classes according to Soil Taxonomy up to family level. To test the user accuracy (UA) we established a calibration and validation set (70:30%). Of the 5 soil family classes we map, the highest overall accuracy (0.71) and kappa index (0.69) are achieved using the DTA and ANN method. More specifically, the UA of prediction was up to 18.33% better in comparison to LR. Moreover, our results showed that no improvement was obtained in prediction accuracy of the DTA algorithm with minimizing taxonomic distance compared to minimizing misclassification error (0.71). Overall, our results suggest that the developed methodology could be used to predict soil classes in the other regions of Iran.

Keywords: Digital soil mapping
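The overall accuracy, kappa index and user accuracy quoted in entry [307] are standard confusion-matrix statistics. The following is a minimal sketch of how they are commonly computed, not the authors' code; the function name `accuracy_report`, the 2-class matrix and the convention that rows hold predicted (mapped) classes and columns hold reference classes are assumptions made for illustration.

```python
def accuracy_report(confusion):
    """Return (overall accuracy, Cohen's kappa, per-class user accuracy).

    Assumed layout: confusion[i][j] counts samples predicted as class i
    whose reference class is j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    # User accuracy: correct predictions divided by all predictions of that class.
    user_accuracy = [confusion[i][i] / row_totals[i] for i in range(k)]
    return observed, kappa, user_accuracy


# Hypothetical 2-class example (made-up counts, not data from the paper).
oa, kappa, ua = accuracy_report([[20, 5], [10, 15]])
print(round(oa, 3), round(kappa, 3), [round(u, 3) for u in ua])
```

Kappa discounts the agreement a classifier would reach by chance, which is why it is reported alongside overall accuracy when class frequencies are uneven.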
[308] Yunfeng Xu, Chunzi Ma, Qiang Liu, Beidou Xi, Guangren Qian, Dayi Zhang, and Shouliang Huo. Method to predict key factors affecting lake eutrophication – a new approach based on support vector regression model. International Biodeterioration & Biodegradation, 102:308 - 315, 2015. CESE-2014 – Challenges in Environmental Science and Engineering Series Conference. [ bib | DOI | http ]
Abstract Developing a quantitative relationship between environmental factors and eutrophic indices, namely chlorophyll-a (Chl-a), total nitrogen (TN) and total phosphorus (TP), is highly desired for lake management to prevent eutrophication. In this paper, a Support Vector Regression model (SVR) was introduced to fulfill this purpose and the obtained result was compared with a previously developed model, the back propagation artificial neural network (BP-ANN). Results indicate SVR is more effective for the prediction of Chl-a, TN and TP concentrations, with lower mean relative error (MRE) compared with BP-ANN. The optimal kernel function of the SVR model was identified as the RBF function. With optimized C and ε obtained in the training process, SVR could successfully predict Chl-a, TN and TP concentrations in Chaohu lake based on observations of other environmental factors.

Keywords: Support vector regression
[309] Bouhouche Salah, Mentouri Zoheir, Ziani Slimane, and Bast Jurgen. Inferential sensor-based adaptive principal components analysis of mould bath level for breakout defect detection and evaluation in continuous casting. Applied Soft Computing, 34:120 - 128, 2015. [ bib | DOI | http ]
Abstract This paper is concerned with a method for breakout defect detection and evaluation in a continuous casting process. This method uses adaptive principal component analysis (APCA) as a predictor of inputs–outputs model, which are defined by the mould bath level and casting speed. The main difficulties that cause breakout in continuous casting are, generally, phenomena related to the non-linear and unsteady state of the metal solidification process. PCA is a modelling method based on linear projection of the principal components; the adaptive version developed in this work uses the sliding window technique for the estimation of the model parameters. This recursive form updates the new model parameters; it gives a reliable and accurate prediction. Simulation results compare PCA, APCA, non-linear system identification using neural network (NN) and support vector regression (SVR) methods, showing that the APCA gives the best Mean Squared Error (MSE). Based on the MSE, the proposed approach is analyzed, tested and improved to give an accurate breakout detection and evaluation system.

Keywords: Soft sensor
[310] Shervin Motamedi, Shahaboddin Shamshirband, Roslan Hashim, Dalibor Petković, and Chandrabhushan Roy. Estimating unconfined compressive strength of cockle shell–cement–sand mixtures using soft computing methodologies. Engineering Structures, 98:49 - 58, 2015. [ bib | DOI | http ]
Abstract The accuracy of soft computing techniques was used in this research to estimate the unconfined compressive strength according to a series of unconfined compressive tests for multiple mixtures of cockle shell, cement and sand under different curing periods. We developed a process for simulating the unconfined compressive strength through two techniques of soft computing, the support vector regression (SVR) and the adaptive neuro-fuzzy inference (ANFIS). The developed SVR and ANFIS networks have one neuron (UCS) in the output layer and four neurons in the input layer. The inputs were percentage of cockle shell, cement and sand content in the mixtures, and age (in days). First, the ANFIS network was used to select the most effective parameters on the UCS. The linear, polynomial, and radial basis functions were employed as the SVR’s kernel function. The simulation results proved the performance of proposed optimizers. Additionally, the results of SVR and ANFIS were compared through the Pearson correlation coefficient and the root-mean-square error. The findings show that the predictive accuracy and capability of generalization can be improved by the ANFIS approach in comparison to the SVR estimation. The simulation results confirmed the effectiveness of the proposed optimization strategies.

Keywords: Cockle shell
[311] Ion Marques, Manuel Graña, Anna Kamińska-Chuchmała, and Bruno Apolloni. An experiment of subconscious intelligent social computing on household appliances. Neurocomputing, 167:32 - 43, 2015. [ bib | DOI | http ]
Abstract Subconscious Social Intelligence refers to the design of social services oriented towards user problem solving, providing an underlying innovation layer that is able to generate new solutions to yet unknown problems. The innovation layer is achieved by Computational Intelligence techniques, encompassing machine learning to build models of user satisfaction over solution quality, and stochastic search as the means for innovation generation. The SandS project provides an instance of such paradigm, where household appliances are the subject of the social service. This paper proposes a specific architecture, reporting results on a synthetic database built according to SandS project current designs. Database synthesis for system tuning and validation is a critical issue, hence the paper details the considerations guiding its design and generation, as well as the validation procedure ensuring the ecological validity of the innovation process simulation. The architecture is composed of a Support Vector Regression (SVR) module for user satisfaction modeling, and an Evolution Strategy (ES) achieving recipe innovation. The paper reports some computational experiments that may guide the real life implementation. The reported results are methodologically sound as far as they are independent of the generation process.

Keywords: Subconscious social intelligence
[312] Kiyoumars Roushangar and Ali Koosheh. Evaluation of GA-SVR method for modeling bed load transport in gravel-bed rivers. Journal of Hydrology, 527:1142 - 1152, 2015. [ bib | DOI | http ]
Summary The aim of the present study is to apply Support Vector Regression (SVR) method to predict bed load transport rates for three gravel-bed rivers. Different combinations of hydraulic parameters are used as inputs for modeling bed load transport using four kernel functions of SVR models. Genetic Algorithm (GA) method is applicably administered to determine optimal SVR parameters. The GA-SVR models are developed and tested using the available data sets, and consecutive predicted results are compared in terms of Efficiency Coefficient and Correlation Coefficient. Obtained results show that the GA-SVR models with Exponential Radial Basis Function (ERBF) kernel present higher accuracy than the other applied GA-SVR models. Furthermore, testing data sets are predicted by Einstein and Meyer-Peter and Müller (MPM) formulas. The GA-SVR models demonstrate a better performance compared to the traditional bed load formulas. Finally, high bed load transport values were eliminated from data sets and the models are re-analyzed. The elimination of high bed load transport rates improves prediction accuracy using GA-SVR method.

Keywords: Bed load transport
[313] Peng Tan, Cheng Zhang, Ji Xia, Qing-Yan Fang, and Gang Chen. Estimation of higher heating value of coal based on proximate analysis using support vector regression. Fuel Processing Technology, 138:298 - 304, 2015. [ bib | DOI | http ]
Abstract To estimate the higher heating value (HHV) of coals based on proximate analysis, a nonlinear model termed support vector regression (SVR) is introduced in this work. A total of 167 Chinese coal samples and 4540 U.S. coal samples were employed to develop and verify the SVR-based correlations. The estimation results indicated that the average absolute errors from estimating the HHV of Chinese and U.S. coals were only 2.16% and 2.42%, respectively. Some published correlations were also employed and redeveloped with the Chinese and U.S. coals to obtain a comparison with the SVR-based correlations developed in the present work. The results indicate that the SVR-based correlations can be more accurate than the published correlations. Attempts were also made to develop a universal correlation for coals from different regions. The simulation results indicate that the correlation between the proximate analysis and HHV of coals from different geographical regions is varied. For coals from different regions, developing and using different correlations can obtain much higher accuracy in estimating the HHV from proximate analysis.

Keywords: Higher heating value
[314] Yuthana Sethapramote. Synchronization of business cycles and economic policy linkages in ASEAN. Journal of Asian Economics, 39:126 - 136, 2015. [ bib | DOI | http ]
Abstract We investigate business cycle synchronization and economic policy linkage in the Association of Southeast Asian Nations (ASEAN). Two important findings are addressed. First, we measure static and dynamic correlations in both macroeconomic variables and policy variables. The vector autoregression and the dynamic conditional correlation model are applied to capture the dynamics of the co-movement pattern in particular. The empirical results show evidence of synchronization in key macroeconomic variables such as gross domestic product, inflation, export, and exchange rates within ASEAN. However, supporting evidence of economic policy linkages is found in only a few cases. Second, the panel regressions show that trade integration is the main factor in the synchronization of the business cycles within ASEAN. Moreover, monetary policy linkage contributes to this co-movement pattern. Financial integration is an important factor only in the correlation between ASEAN and the United States, while the role of fiscal policy linkage is not significant in every case.

Keywords: Business cycle synchronization
[315] Fei Ma, Hao Qin, Kefu Shi, Cunliu Zhou, Conggui Chen, Xiaohua Hu, and Lei Zheng. Feasibility of combining spectra with texture data of multispectral imaging to predict heme and non-heme iron contents in pork sausages. Food Chemistry, 190:142 - 149, 2016. [ bib | DOI | http ]
Abstract To precisely determine heme and non-heme iron contents in meat products, the feasibility of combining spectral with texture features extracted from multispectral imaging data (405–970 nm) was assessed. In our study, spectra and textures of 120 pork sausages (PSs) treated at different temperatures (30–80 °C) were analyzed using different calibration models including partial least squares regression (PLSR) and LIB support vector machine (Lib-SVM) for predicting heme and non-heme iron contents in PSs. Based on a combination of spectral and textural features, optimized PLSR models were obtained with determination coefficient (R2) of 0.912 for heme and of 0.901 for non-heme iron prediction, which demonstrated the superiority of combining spectra with texture data. Results of satisfactory determination and visualization of heme and non-heme iron contents indicated that multispectral imaging could serve as a feasible approach for online industrial applications in the future.

Keywords: Multispectral imaging
[316] Adam Vaughan and Stanislav V. Bohac. Real-time, adaptive machine learning for non-stationary, near chaotic gasoline engine combustion time series. Neural Networks, 70:18 - 26, 2015. [ bib | DOI | http ]
Abstract Fuel efficient Homogeneous Charge Compression Ignition (HCCI) engine combustion timing predictions must contend with non-linear chemistry, non-linear physics, period doubling bifurcation(s), turbulent mixing, model parameters that can drift day-to-day, and air–fuel mixture state information that cannot typically be resolved on a cycle-to-cycle basis, especially during transients. In previous work, an abstract cycle-to-cycle mapping function coupled with ϵ-Support Vector Regression was shown to predict experimentally observed cycle-to-cycle combustion timing over a wide range of engine conditions, despite some of the aforementioned difficulties. The main limitation of the previous approach was that a partially acausal randomly sampled training dataset was used to train proof of concept offline predictions. The objective of this paper is to address this limitation by proposing a new online adaptive Extreme Learning Machine (ELM) extension named Weighted Ring-ELM. This extension enables fully causal combustion timing predictions at randomly chosen engine set points, and is shown to achieve results that are as good as or better than the previous offline method. The broader objective of this approach is to enable a new class of real-time model predictive control strategies for high variability HCCI and, ultimately, to bring HCCI’s low engine-out NOx and reduced CO2 emissions to production engines.

Keywords: Non-linear
[317] Heikki Huttunen and Jussi Tohka. Model selection for linear classifiers using Bayesian error estimation. Pattern Recognition, 48(11):3739 - 3748, 2015. [ bib | DOI | http ]
Abstract Regularized linear models are important classification methods for high dimensional problems, where regularized linear classifiers are often preferred due to their ability to avoid overfitting. The degree of freedom of the model is determined by a regularization parameter, which is typically selected using counting based approaches, such as K-fold cross-validation. For large data, this can be very time consuming, and, for small sample sizes, the accuracy of the model selection is limited by the large variance of CV error estimates. In this paper, we study the applicability of a recently proposed Bayesian error estimator for the selection of the best model along the regularization path. We also propose an extension of the estimator that allows model selection in multiclass cases and study its efficiency with L1 regularized logistic regression and L2 regularized linear support vector machine. The model selection by the new Bayesian error estimator is experimentally shown to improve the classification accuracy, especially in small sample-size situations, and is able to avoid the excess variability inherent to traditional cross-validation approaches. Moreover, the method has significantly smaller computational complexity than cross-validation.

Keywords: Logistic regression
[318] Hanchen Xiong, Sandor Szedmak, and Justus Piater. Scalable, accurate image annotation with joint SVMs and output kernels. Neurocomputing, 169:205 - 214, 2015. Learning for Visual Semantic Understanding in Big Data; ESANN 2014; Industrial Data Processing and Analysis; Selected papers from the 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2014); Selected papers from the 11th World Congress on Intelligent Control and Automation (WCICA2014). [ bib | DOI | http ]
Abstract This paper studies how joint training of multiple support vector machines (SVMs) can improve the effectiveness and efficiency of automatic image annotation. We cast image annotation as an output-related multi-task learning framework, with the prediction of each tag's presence as one individual task. Evidently, these tasks are related via dependencies between tags. The proposed joint learning framework, which we call joint SVM, is superior to other related models in its impressive and flexible mechanisms in exploiting the dependencies between tags: first, a linear output kernel can be implicitly learned when we train a joint SVM; or, a pre-designed kernel can be explicitly applied by users when prior knowledge is available. Also, a practical merit of joint SVM is that it shares the same computational complexity as one single conventional SVM, although multiple tasks are solved simultaneously. Although derived from the perspective of multi-task learning, the proposed joint SVM is highly related to structured-output learning techniques, e.g. max-margin regression (Szedmak and Shawe-Taylor [1]), structural SVM (Tsochantaridis [2]). According to our empirical results on several image-annotation benchmark databases, our joint training strategy of SVMs can yield substantial improvements, in terms of both accuracy and efficiency, over training them independently. In particular, it compares favorably with many other state-of-the-art algorithms. We also develop a “perceptron-like” online learning scheme for joint SVM to enable it to scale up better to huge data in real-world practice.

Keywords: Image annotation
[319] A. Sanz-Garcia, J. Fernandez-Ceniceros, F. Antonanzas-Torres, A.V. Pernia-Espinoza, and F.J. Martinez de Pison. GA-PARSIMONY: A GA-SVR approach with feature selection and parameter optimization to obtain parsimonious solutions for predicting temperature settings in a continuous annealing furnace. Applied Soft Computing, 35:13 - 28, 2015. [ bib | DOI | http ]
Abstract This article proposes a new genetic algorithm (GA) methodology to obtain parsimonious support vector regression (SVR) models capable of predicting highly precise setpoints in a continuous annealing furnace (GA-PARSIMONY). The proposal combines feature selection, model tuning, and parsimonious model selection in order to achieve robust SVR models. To this end, a novel GA selection procedure is introduced based on separate cost and complexity evaluations. The best individuals are initially sorted by an error fitness function, and afterwards, models with similar costs are rearranged according to model complexity measurement so as to foster models of lesser complexity. Therefore, the user-supplied penalty parameter, utilized to balance cost and complexity in other fitness functions, is rendered unnecessary. GA-PARSIMONY performed similarly to classical GA on twenty benchmark datasets from public repositories, but used a lower number of features in a striking 65% of models. Moreover, the performance of our proposal also proved useful in a real industrial process for predicting three temperature setpoints for a continuous annealing furnace. The results demonstrated that GA-PARSIMONY was able to generate more robust SVR models with fewer input features, as compared to classical GA.

Keywords: Genetic algorithms
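The two-stage selection described in entry [319] (sort by error first, then reorder models with similar costs by complexity) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation; the function name `parsimonious_rank`, the dictionary keys, the tolerance `tol` and the adjacent-swap strategy are all assumptions.

```python
def parsimonious_rank(models, tol):
    """Rank candidate models by error, promoting simpler models on near-ties.

    models: list of dicts with hypothetical keys "name", "error", "complexity".
    tol:    errors closer than this are treated as "similar costs" (assumed rule).
    """
    # Stage 1: plain ranking by the error fitness, lowest error first.
    ranked = sorted(models, key=lambda m: m["error"])

    # Stage 2: repeatedly swap adjacent models whose errors are within tol
    # when the lower-ranked one is strictly less complex.
    changed = True
    while changed:
        changed = False
        for i in range(len(ranked) - 1):
            a, b = ranked[i], ranked[i + 1]
            similar = abs(a["error"] - b["error"]) < tol
            if similar and b["complexity"] < a["complexity"]:
                ranked[i], ranked[i + 1] = b, a
                changed = True
    return ranked


# Made-up candidates: complexity could be, e.g., the number of selected features.
candidates = [
    {"name": "A", "error": 0.100, "complexity": 12},
    {"name": "B", "error": 0.101, "complexity": 5},
    {"name": "C", "error": 0.150, "complexity": 2},
    {"name": "D", "error": 0.102, "complexity": 8},
]
order = [m["name"] for m in parsimonious_rank(candidates, tol=0.005)]
print(order)  # ['B', 'D', 'A', 'C']
```

Because complexity only breaks near-ties in error, no penalty weight trading cost against complexity is needed, which matches the abstract's remark that the user-supplied penalty parameter becomes unnecessary.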
[320] Xiao Han, Miao Ge, Jie Dong, Ranying Xue, Zixuan Wang, and Jinwei He. Geographical distribution of reference value of aging people's left ventricular end systolic diameter based on the support vector regression. Experimental Gerontology, 57:250 - 255, 2014. [ bib | DOI | http ]
Abstract Aim: The aim of this paper is to analyze the geographical distribution of the reference value of aging people's left ventricular end systolic diameter (LVDs), and to provide a scientific basis for clinical examination. Methods: The study focuses on the relationship between the reference value of left ventricular end systolic diameter of aging people and 14 geographical factors, using 2495 samples of LVDs of aging people from 71 units in China, including 1620 men and 875 women. Moran's I index is used to establish the relationship between the reference values and spatial geographical factors; 5 geographical factors that correlate significantly with left ventricular end systolic diameter are extracted to build the support vector regression model; a paired-sample t test checks the consistency between predicted and measured values; finally, a distribution map is produced through the disjunctive kriging interpolation method and the three-dimensional trend of the normal reference value is fitted. Results: The correlation between the extracted geographical factors and the reference value of left ventricular end systolic diameter is quite significant. The 5 indexes are latitude, annual mean air temperature, annual mean relative humidity, annual precipitation amount, and annual range of air temperature. The predicted values and the observed ones are in good conformity; there is no significant difference at the 95% degree of confidence. The overall trend of predicted values increases from west to east, and increases first and then decreases from north to south. Conclusion: If geographical values are obtained in one region, the reference value of left ventricular end systolic diameter of aging people in this region can be obtained by using the support vector regression model. It could be more scientific to formulate the different distributions on the basis of synthesizing the physiological and the geographical factors.
Highlights: - Use Moran's index to analyze the spatial correlation. - Choose support vector machine to build a model that overcomes the complexity of variables. - Test normal distribution of predicted data to guarantee the interpolation results. - Use trend analysis to explain the changes of reference value clearly.

Keywords: Left ventricular end systolic diameter
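Entry [320] relies on Moran's I to measure spatial autocorrelation. As a minimal sketch of the textbook global statistic, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights, the code below uses a hypothetical function name `morans_i` and a made-up chain of four locations with binary neighbour weights; the paper's actual weighting scheme is not stated in the abstract.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j;
    diagonal entries are ignored.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    weight_sum = sum(weights[i][j] for i, j in pairs)
    # Weighted cross-products of deviations between pairs of locations.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    variance_term = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_term)


# Four locations on a line; each is a neighbour of the adjacent ones (assumed).
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], chain))    # smooth gradient: positive autocorrelation
print(morans_i([1, -1, 1, -1], chain))  # alternating values: negative autocorrelation
```

Values near +1 indicate that neighbouring locations carry similar values, values near -1 indicate alternation, and values near 0 suggest no spatial structure.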
[321] Zhiwei Guo and Guangchen Bai. Application of least squares support vector machine for regression to reliability analysis. Chinese Journal of Aeronautics, 22(2):160 - 166, 2009. [ bib | DOI | http ]
In order to deal with the issue of huge computational cost very well in direct numerical simulation, the traditional response surface method (RSM) as a classical regression algorithm is used to approximate a functional relationship between the state variable and basic variables in reliability design. The algorithm has treated successfully some problems of implicit performance function in reliability analysis. However, its theoretical basis of empirical risk minimization narrows its range of applications for the regression model. In contrast to classical algorithms, the support vector machine for regression (SVR) based on structural risk minimization has the excellent abilities of small sample learning and generalization, and superiority over the traditional regression method. Nevertheless, SVR is time consuming and huge space demanding for the reliability analysis of large samples. This article introduces the least squares support vector machine for regression (LSSVR) into reliability analysis to overcome these shortcomings. Numerical results show that the reliability method based on the LSSVR has excellent accuracy and smaller computational cost than the reliability method based on support vector machine (SVM). Thus, it is valuable for the engineering application.

Keywords: mechanism design of spacecraft
[322] Shien-Tsung Chen, Pao-Shan Yu, and Yi-Hsuan Tang. Statistical downscaling of daily precipitation using support vector machines and multivariate analysis. Journal of Hydrology, 385(1–4):13 - 22, 2010. [ bib | DOI | http ]
Summary Downscaling local daily precipitation from large-scale weather variables is often necessary when studying how climate change impacts hydrology. This study proposes a two-step statistical downscaling method for projection of daily precipitation. The first step is classification to determine whether the day is dry or wet, and the second is regression to estimate the amount of precipitation conditional on the occurrence of a wet day. Predictors of classification and regression models are selected from large-scale weather variables in NCEP reanalysis data based on statistical tests. The proposed statistical downscaling method is developed according to two methodologies. One methodology is support vector machine (SVM), including support vector classification (SVC) and support vector regression (SVR), and the other is multivariate analysis, including discriminant analysis (for classification) and multiple regression. The popular statistical downscaling model (SDSM) is analyzed for comparison. A comparison of downscaling results in the Shih-Men Reservoir basin in Taiwan reveals that overall, the SVM reproduces the most reasonable daily precipitation properties, although the SDSM performs better than other models in small daily precipitation (less than about 10 mm). Finally, projection of local daily precipitation is performed, and future work to advance the downscaling method is proposed.

Keywords: Statistical downscaling
[323] Chin-Sheng Yang, Chih-Ping Wei, Chi-Chuan Yuan, and Jen-Yu Schoung. Predicting the length of hospital stay of burn patients: Comparisons of prediction accuracy among different clinical stages. Decision Support Systems, 50(1):325 - 335, 2010. [ bib | DOI | http ]
A burn injury is a disastrous trauma and can have wide-ranging impacts on burn patients, their families, and society. Burn patients generally experience long hospital stays, and the accurate prediction of the length of those stays has strong implications for healthcare resource management and service delivery. In addition to prediction accuracy, the timing of length of hospital stay (LOS) predictions is also relevant, because LOS predictions during earlier clinical stages (e.g., admission) can provide an important component for service and resource planning as well as patient and family counseling, whereas LOS predictions at later clinical stages (e.g., post-treatment) can support resource utilization reviews and cost controls. This study evaluates the effectiveness of LOS predictions for burn patients during three different clinical stages: admission, acute, and post-treatment. In addition, we compare the prediction effectiveness of two artificial intelligence (AI)-based prediction techniques (i.e., model-tree-based regression and support vector machine regression), using linear regression analysis as our performance benchmark. On the basis of 1080 burn cases collected in Taiwan, the empirical evaluation suggests that the accuracy of LOS predictions at the acute stage does not improve compared with those during the admission stage, but LOS predictions at the post-treatment stage are significantly more accurate. Moreover, the AI-based prediction techniques, especially support vector machine regression, appear more effective than the regression technique for LOS predictions for burn patients across stages.

Keywords: Length of hospital stay (LOS)
[324] Ying Wang, Yong Fan, Priyanka Bhatt, and Christos Davatzikos. High-dimensional pattern regression using machine learning: From medical images to continuous clinical variables. NeuroImage, 50(4):1519 - 1535, 2010. [ bib | DOI | http ]
This paper presents a general methodology for high-dimensional pattern regression on medical images via machine learning techniques. Compared with pattern classification studies, pattern regression considers the problem of estimating continuous rather than categorical variables, and can be more challenging. It is also clinically important, since it can be used to estimate disease stage and predict clinical progression from images. In this work, adaptive regional feature extraction approach is used along with other common feature extraction methods, and feature selection technique is adopted to produce a small number of discriminative features for optimal regression performance. Then the Relevance Vector Machine (RVM) is used to build regression models based on selected features. To get stable regression models from limited training samples, a bagging framework is adopted to build ensemble basis regressors derived from multiple bootstrap training samples, and thus to alleviate the effects of outliers as well as facilitate the optimal model parameter selection. Finally, this regression scheme is tested on simulated data and real data via cross-validation. Experimental results demonstrate that this regression scheme achieves higher estimation accuracy and better generalizing ability than Support Vector Regression (SVR).

Keywords: High-dimensionality pattern regression
[325] Fang Wang, Warawut Suphamitmongkol, and Bo Wang. Advertisement click-through rate prediction using multiple criteria linear programming regression model. Procedia Computer Science, 17:803 - 811, 2013. First International Conference on Information Technology and Quantitative Management. [ bib | DOI | http ]
Abstract In the advertisement industry, it is important to predict potentially profitable users who will click target ads (i.e., Behavioral Targeting). The task selects the potential users that are likely to click the ads by analyzing users' clicking/web browsing information and displaying the most relevant ads to them. In this paper, we present a Multiple Criteria Linear Programming Regression (MCLPR) prediction model as the solution. The experiment datasets are provided by a leading Internet company in China, and can be downloaded from track2 of the KDD Cup 2012 datasets. In this paper, Support Vector Regression (SVR) and Logistic Regression (LR) are used as two benchmark models for comparison. The results indicate that MCLPR is a promising model in behavioral targeting tasks.

Keywords: Behavior Targeting
[326] G. Ganesh Sundarkumar and Vadlamani Ravi. A novel hybrid undersampling method for mining unbalanced datasets in banking and insurance. Engineering Applications of Artificial Intelligence, 37:368 - 377, 2015. [ bib | DOI | http ]
Abstract In this paper, we propose a novel hybrid approach for rectifying the data imbalance problem by employing k Reverse Nearest Neighborhood and One Class support vector machine (OCSVM) in tandem. We mined an Automobile Insurance Fraud detection dataset and customer Credit Card Churn prediction dataset to demonstrate the effectiveness of the proposed model. Throughout the paper, we followed 10 fold cross validation method of testing using Decision Tree (DT), Support Vector Machine (SVM), Logistic Regression (LR), Probabilistic Neural Network (PNN), Group Method of Data Handling (GMDH), Multi-Layer Perceptron (MLP). We observed that DT and SVM respectively yielded high sensitivity of 90.74% and 91.89% on Insurance dataset and DT, SVM and GMDH respectively produced high sensitivity of 91.2%, 87.7%, and 83.1% on Credit Card Churn Prediction dataset. In the case of Insurance Fraud detection dataset, we found that statistically there is no significant difference between DT (J48) and SVM. As DT yields “if then” rules, we prefer DT over SVM. Further, in the case of churn prediction dataset, it turned out that GMDH, SVM and LR are not statistically different and GMDH yielded very high Area Under Curve at ROC. Further, DT yielded just 4 ‘if–then’ rules on Insurance and 10 rules on churn prediction datasets, which is the significant outcome of the study.

Keywords: Insurance fraud detection
[327] Claudio Ciancio, Teresa Citrea, Giuseppina Ambrogio, Luigi Filice, and Roberto Musmanno. Design of a high performance predictive tool for forging operation. Procedia CIRP, 33:173 - 178, 2015. 9th CIRP Conference on Intelligent Computation in Manufacturing Engineering - CIRP ICME '14. [ bib | DOI | http ]
Abstract This paper presents a comparative study of different artificial intelligence techniques to model and optimize a particular manufacturing process known as forging. The present work aims to reduce energy, load and material consumption satisfying at the same time constraints on product quality. A flywheel is considered as specific case study for the investigation. The size of the billet used in the forging process will be optimized so that the molds are correctly filled, and waste, forging load and energy absorbed by the process are minimized. More in particular, the shape of the initial billet is a hollow cylinder and the parameters to be optimized are the billet dimensions (inner diameter, outer diameter and height) and the friction coefficient. The analytical relationship between input and output values will be identified in order to choose the optimal process configuration to obtain the desired output. The input-output relation was mapped with different techniques. First of all a Genetic Algorithm-Neural Network and a Taguchi-Neural Network approach are described where genetic algorithm and Taguchi are used to optimize the neural network architecture. The other techniques are support vector regression, fuzzy logic and response surface. In addition a support vector machine approach was used to check the final product quality.

Keywords: Forging
[328] Kadir Kavaklioglu. Modeling and prediction of Turkey’s electricity consumption using support vector regression. Applied Energy, 88(1):368 - 375, 2011. [ bib | DOI | http ]
Support Vector Regression (SVR) methodology is used to model and predict Turkey’s electricity consumption. Among various SVR formalisms, the ε-SVR method was used since the training pattern set was relatively small. Electricity consumption is modeled as a function of socio-economic indicators such as population, Gross National Product, imports and exports. In order to facilitate future predictions of electricity consumption, a separate SVR model was created for each of the input variables using their current and past values; and these models were combined to yield consumption prediction values. A grid search for the model parameters was performed to find the best ε-SVR model for each variable based on Root Mean Square Error. Electricity consumption of Turkey is predicted until 2026 using data from 1975 to 2006. The results show that electricity consumption can be modeled using Support Vector Regression and the models can be used to predict future electricity consumption.

Keywords: Electricity consumption
[329] Jongho Shin, H. Jin Kim, and Youdan Kim. Adaptive support vector regression for UAV flight control. Neural Networks, 24(1):109 - 120, 2011. [ bib | DOI | http ]
This paper explores an application of support vector regression for adaptive control of an unmanned aerial vehicle (UAV). Unlike neural networks, support vector regression (SVR) generates global solutions, because SVR basically solves quadratic programming (QP) problems. With this advantage, the input–output feedback-linearized inverse dynamic model and the compensation term for the inversion error are identified off-line, which we call I-SVR (inversion SVR) and C-SVR (compensation SVR), respectively. In order to compensate for the inversion error and the unexpected uncertainty, an online adaptation algorithm for the C-SVR is proposed. Then, the stability of the overall error dynamics is analyzed by the uniformly ultimately bounded property in the nonlinear system theory. In order to validate the effectiveness of the proposed adaptive controller, numerical simulations are performed on the UAV model.

Keywords: Support vector regression
[330] Min-Yuan Cheng and Minh-Tu Cao. Evolutionary multivariate adaptive regression splines for estimating shear strength in reinforced-concrete deep beams. Engineering Applications of Artificial Intelligence, 28:86 - 96, 2014. [ bib | DOI | http ]
Abstract This study proposes a novel artificial intelligence (AI) model to estimate the shear strength of reinforced-concrete (RC) deep beams. The proposed evolutionary multivariate adaptive regression splines (EMARS) model is a hybrid of multivariate adaptive regression splines (MARS) and artificial bee colony (ABC). In EMARS, MARS addresses learning and curve fitting and ABC implements optimization to determine the optimal parameter settings with minimal estimation errors. The proposed model was constructed using 106 experimental datasets from the literature. EMARS performance was compared with three other data-mining techniques, including back-propagation neural network (BPNN), radial basis function neural network (RBFNN), and support vector machine (SVM). EMARS estimation accuracy was benchmarked against four prevalent mathematical methods, including ACI-318 (2011), CSA, CEB-FIP MC90, and Tang’s Method. Benchmark results identified EMARS as the best model and, thus, an efficient alternative approach to estimating RC deep beam shear strength.

Keywords: Multivariate adaptive regression splines
[331] Mohammad Hossein Zangooei, Jafar Habibi, and Roohallah Alizadehsani. Disease diagnosis with a hybrid method SVR using NSGA-II. Neurocomputing, 136:14 - 29, 2014. [ bib | DOI | http ]
Abstract Early diagnosis of any disease at a lower cost is preferable. Automatic medical diagnosis classification tools reduce financial burden on health care systems. In medical diagnosis, patterns consist of observable symptoms and the results of diagnostic tests, which have various associated costs and risks. In this paper, we have experimented and suggested an automated pattern classification method for classifying four diseases into two classes. In the literature on machine learning or data mining, regression and classification problems are typically viewed as two distinct problems differentiated by continuous or categorical dependent variables. There are endeavors to use regression methods to solve classification problems and vice versa. To regard a classification problem as a regression one, we propose a method based on the Support Vector Regression (SVR) classification model as one of the powerful methods in intelligent field management. We apply the Non-dominated Sorting Genetic Algorithm-II (NSGA-II), a kind of multi-objective evolutionary algorithm, to find mapping points (MPs) for rounding a real-value to an integer one. Also, we employ the NSGA-II to find out and tune the SVR kernel parameters optimally so as to enhance the performance of our model and achieve better results. The results of the study are compared with the results of some previous studies focusing on the diagnoses of four diseases using the same UCI machine learning database. The experimental results show that the proposed method yields a superior and competitive performance in these four real-world datasets.

Keywords: Support Vector Regression
[332] Zhao Lu, Jing Sun, and Kenneth R. Butts. Linear programming support vector regression with wavelet kernel: A new approach to nonlinear dynamical systems identification. Mathematics and Computers in Simulation, 79(7):2051 - 2063, 2009. [ bib | DOI | http ]
Wavelet theory has a profound impact on signal processing as it offers a rigorous mathematical framework to the treatment of multiresolution problems. The combination of soft computing and wavelet theory has led to a number of new techniques. On the other hand, as a new generation of learning algorithms, support vector regression (SVR) was developed by Vapnik et al. recently, in which ɛ-insensitive loss function was defined as a trade-off between the robust loss function of Huber and one that enables sparsity within the SVs. The use of support vector kernel expansion also provides us a potential avenue to represent nonlinear dynamical systems and underpin advanced analysis. However, for the support vector regression with the standard quadratic programming technique, the implementation is computationally expensive and sufficient model sparsity cannot be guaranteed. In this article, from the perspective of model sparsity, the linear programming support vector regression (LP-SVR) with wavelet kernel was proposed, and the connection between LP-SVR with wavelet kernel and wavelet networks was analyzed. In particular, the potential of the LP-SVR for nonlinear dynamical system identification was investigated.

Keywords: Support vector regression
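
Note on [332]: the abstract contrasts the ɛ-insensitive loss with Huber's robust loss. The following is a minimal sketch of the two textbook loss functions, not code from the paper; the function names and the default values of `eps` and `delta` are illustrative assumptions.

```python
def eps_insensitive_loss(residual, eps=0.1):
    """Epsilon-insensitive loss: zero inside the tube |r| <= eps, linear outside.

    eps=0.1 is an arbitrary illustrative default, not a value from [332].
    """
    return max(0.0, abs(residual) - eps)


def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it.

    delta=1.0 is an arbitrary illustrative default.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


if __name__ == "__main__":
    # Residuals inside the tube cost nothing under the eps-insensitive loss,
    # which is what allows many training points to drop out of the solution.
    for r in (0.05, 0.5, 2.0):
        print(r, eps_insensitive_loss(r), huber_loss(r))
```

Residuals that fall inside the tube contribute zero loss, which is the source of the sparsity the abstract refers to; the Huber loss penalizes every nonzero residual.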
[333] Julio Cesar L. Alves, Claudete B. Henriques, and Ronei J. Poppi. Determination of diesel quality parameters using support vector regression and near infrared spectroscopy for an in-line blending optimizer system. Fuel, 97:710 - 717, 2012. [ bib | DOI | http ]
This work demonstrates the application of support vector regression (SVR) applied to near infrared spectroscopy (NIR) data to solve regression problems associated with determination of quality parameters of diesel oil for an in-line blending optimizer system in a petroleum refinery. The determination of flash point and cetane number was performed using SVR and the results were compared with those obtained by using the PLS algorithm. A parametric optimization using a genetic algorithm was carried out for choice of the parameters in the SVR regression models. The best models using SVR presented an RBF kernel and spectra preprocessed with baseline correction and mean centered data. The obtained values of RMSEP with the SVR models are 1.98 °C and 0.453 for flash point and cetane number, respectively. The SVR provided significantly better results when compared with PLS and in agreement with the specification of the ASTM reference method for both quality parameter determinations.

Keywords: Diesel
[334] Peng Peng and Ze-Nian Li. General-purpose image quality assessment based on distortion-aware decision fusion. Neurocomputing, 134:117 - 121, 2014. Special issue on the 2011 Sino-foreign-interchange Workshop on Intelligence Science and Intelligent Data Engineering (IScIDE 2011). Learning Algorithms and Applications. Selected papers from the 19th International Conference on Neural Information Processing (ICONIP2012). [ bib | DOI | http ]
Abstract General-purpose image quality metrics aiming for quality prediction across various distortion types exhibit, on the whole, very limited effectiveness. In this paper, we propose a two-stage scheme to alleviate this limitation. At the first stage, probabilistic knowledge about the image distortion types is obtained based on a support-vector classification method. At the second stage, decision fusion of three existing image quality metrics is performed using the k-nearest-neighbor (k-NN) regression where the aforementioned probabilistic knowledge is utilized under an adaptive weighting scheme. We evaluate our method on the TID2008 database that is the largest publicly available image quality database containing 17 distortion types. The results strongly support the effectiveness and robustness of our method.

Keywords: General-purpose image quality assessment
[335] Hui Jiang and Zhizhong Wang. GMRVVm–SVR model for financial time series forecasting. Expert Systems with Applications, 37(12):7813 - 7818, 2010. [ bib | DOI | http ]
The complex model GMRVVm–SVR has been adopted to predict financial time series with such characteristics as small sample size, poor information, non-stationary, high noise and non-linearity. In order to construct GMRVVm–SVR, the m-root grey model with revised verge value (GMRVVm) has been introduced and modified by support vector regression based on the calculation of the residual error sequence between predicted values and original data. Due to the recent data points providing more information than distant data points, more importance has been attached to the punishment parameter C of recent data points in support vector regression. Simultaneously, the parameter ɛ in ɛ-insensitive loss function has been determined according to smoothing overshooting. Pattern search (PS) algorithm has been carried out to tune free parameters. A real experimental result shows that the complex model can achieve comparative accurate prediction as well as smoothing overshooting in financial time series prediction.

Keywords: m-root grey model
[336] Yongping Zhao and Jianguo Sun. A fast method to approximately train hard support vector regression. Neural Networks, 23(10):1276 - 1285, 2010. [ bib | DOI | http ]
The hard support vector regression (HSVR) usually has a risk of suffering from overfitting due to the presence of noise. The main reason is that it does not utilize the regularization technique to set an upper bound on the Lagrange multipliers so they can be magnified infinitely. Hence, we propose a greedy stagewise based algorithm to approximately train HSVR. At each iteration, the sample which has the maximal predicted discrepancy is selected and its weight is updated only once so as to avoid being excessively magnified. Actually, this early stopping rule can implicitly control the capacity of the regression machine, which is equivalent to a regularization technique. In addition, compared with the well-known software LIBSVM2.82, our algorithm to a certain extent has advantages in both the training time and the number of support vectors. Finally, experimental results on the synthetic and real-world benchmark data sets also corroborate the efficacy of the proposed algorithm.

Keywords: Support vector regression
[337] Yoonkyung Lee and Rui Wang. Does modeling lead to more accurate classification?: A study of relative efficiency in linear classification. Journal of Multivariate Analysis, 133:232 - 250, 2015. [ bib | DOI | http ]
Abstract Classification arises in a wide range of applications. A variety of statistical tools have been developed for learning classification rules from data. Understanding of their relative merits and comparisons help users to choose a proper method in practice. This paper focuses on theoretical comparison of model-based classification methods in statistics with algorithmic methods in machine learning in terms of the error rate. Extending Efron’s comparison of logistic regression with linear discriminant analysis (LDA) under the normal setting, we contrast such algorithmic methods as the support vector machine (SVM) and boosting with the LDA and logistic regression and study their relative efficiencies in reducing the error rate based on the limiting behavior of the classification boundary of each method. We show that algorithmic methods are generally less effective than model-based methods in the normal setting. In particular, loss of efficiency in error rate is typically about 33% to 60% for the SVM and 50% to 80% for boosting when compared to the LDA. However, a smooth variant of the SVM is shown to be even more efficient than logistic regression. In addition to the theoretical study, we present results from numerical experiments under various settings for comparisons of finite-sample performance and robustness to mislabeling and model misspecification.

Keywords: Boosting
[338] P.J. García Nieto, J. Martínez Torres, M. Araújo Fernández, and C. Ordóñez Galán. Support vector machines and neural networks used to evaluate paper manufactured using Eucalyptus globulus. Applied Mathematical Modelling, 36(12):6137 - 6145, 2012. [ bib | DOI | http ]
Using advanced machine learning techniques as an alternative to conventional double-entry volume equations, a regression model of the inside-bark volume (dependent variable) for standing Eucalyptus globulus trunks (or main stems) has been built as a function of the following three independent variables: age, height and outside-bark diameter at breast height (DBH). The experimental observed data (age, height, outside-bark DBH and inside-bark volume) for 142 trees (E. globulus) were measured and a nonlinear model was built using a data-mining methodology based on support vector machines (SVM) and multilayer perceptron networks (MLP) for regression problems. Coefficients of determination and Furnival’s indices indicate the superiority of the SVM with a radial kernel over the allometric regression models and the MLP.

Keywords: Eucalyptus globulus
[339] Johan Colliez, Franck Dufrenois, and Denis Hamad. Optic flow estimation by support vector regression. Engineering Applications of Artificial Intelligence, 19(7):761 - 768, 2006. Special issue on Engineering Applications of Neural Networks - Novel Applications of Neural Networks in Engineering. [ bib | DOI | http ]
In this paper, we describe an approach to estimate optic flow from an image sequence based on Support Vector Regression (SVR) machines with an adaptive ɛ-margin. This approach uses affine and constant models for velocity vectors. Synthetic and real image sequences are used in order to compare results of the SVR approach against other well-known optic flow estimation methods. Experimental results on real traffic sequences show that the SVR approach is an appropriate solution for object tracking.

Keywords: Optic flow
[340] Jianyi Liu, Yao Ma, Lixin Duan, Fangfang Wang, and Yuehu Liu. Hybrid constraint SVR for facial age estimation. Signal Processing, 94:576 - 582, 2014. [ bib | DOI | http ]
Abstract In this paper, facial age estimation is discussed in a novel viewpoint – how to jointly exploit the supervised training data and human annotations to improve the age estimation precision. This is motivated by the lacking of data problem in age estimation and the current web booming. To do so, fuzzy age label is firstly defined, and it is then merged into the Support Vector Regression (SVR) framework together with the traditional data labels. The new learning problem is finally formulated into a similar dual form with the standard SVR, which can be easily solved using existing solvers. In experiments, we have compared with the state of the art regression based methods, and the results are very competitive.

Keywords: Facial image
[341] Paulo Roberto Filgueiras, Júlio Cesar L. Alves, and Ronei Jesus Poppi. Quantification of animal fat biodiesel in soybean biodiesel and B20 diesel blends using near infrared spectroscopy and synergy interval support vector regression. Talanta, 119:582 - 589, 2014. [ bib | DOI | http ]
Abstract In this work, multivariate calibration based on partial least squares (PLS) and support vector regression (SVR) using the whole spectrum and variable selection by synergy interval (siPLS and siSVR) were applied to NIR spectra for the determination of animal fat biodiesel content in soybean biodiesel and B20 diesel blends. For all models, prediction errors, bias test for systematic errors and permutation test for trends in the residuals were calculated. The siSVR produced significantly lower prediction errors compared to the full spectrum methods and siPLS, with a root mean squares error (RMSEP) of 0.18%(w/w) (concentration range: 0.00%–69.00%(w/w)) in the soybean biodiesel blend and 0.10%(w/w) in the B20 diesel (concentration range: 0.00%–13.80%(w/w)). Additionally, in the models for the determination of animal fat biodiesel in blends with soybean diesel, PLS and SVR showed evidence of systematic errors, and PLS/siPLS presented trends in residuals based on the permutation test. For the B20 diesel, PLS presented evidence of systematic errors, and siPLS presented trends in the residuals.

Keywords: Biodiesel
[342] P.J. García Nieto, J.R. Alonso Fernández, F.J. de Cos Juez, F. Sánchez Lasheras, and C. Díaz Muñiz. Hybrid modelling based on support vector regression with genetic algorithms in forecasting the cyanotoxins presence in the Trasona reservoir (Northern Spain). Environmental Research, 122:1 - 10, 2013. [ bib | DOI | http ]
Cyanotoxins, a kind of poisonous substance produced by cyanobacteria, are responsible for health risks in drinking and recreational waters. As a result, anticipating their presence is a matter of importance to prevent risks. The aim of this study is to use a hybrid approach based on support vector regression (SVR) in combination with genetic algorithms (GAs), known as a genetic algorithm support vector regression (GA–SVR) model, in forecasting the cyanotoxins presence in the Trasona reservoir (Northern Spain). The GA–SVR approach is aimed at highly nonlinear biological problems with sharp peaks and the tests carried out proved its high performance. Some physical–chemical parameters have been considered along with the biological ones. The results obtained are two-fold. In the first place, the significance of each biological and physical–chemical variable on the cyanotoxins presence in the reservoir is determined with success. Finally, a predictive model able to forecast the possible presence of cyanotoxins in the short term was obtained.

Keywords: Statistical machine learning techniques
[343] Guibing Guo, Jie Zhang, and Neil Yorke-Smith. Leveraging multiviews of trust and similarity to enhance clustering-based recommender systems. Knowledge-Based Systems, 74:14 - 27, 2015. [ bib | DOI | http ]
Abstract Although demonstrated to be efficient and scalable to large-scale data sets, clustering-based recommender systems suffer from relatively low accuracy and coverage. To address these issues, we develop a multiview clustering method through which users are iteratively clustered from the views of both rating patterns and social trust relationships. To accommodate users who appear in two different clusters simultaneously, we employ a support vector regression model to determine a prediction for a given item, based on user-, item- and prediction-related features. To accommodate (cold) users who cannot be clustered due to insufficient data, we propose a probabilistic method to derive a prediction from the views of both ratings and trust relationships. Experimental results on three real-world data sets demonstrate that our approach can effectively improve both the accuracy and coverage of recommendations as well as in the cold start situation, moving clustering-based recommender systems closer towards practical use.

Keywords: Recommender systems
[344] Xiang-Yu Hua, Zhi-Min Yang, Ya-Fen Ye, and Yuan-Hai Shao. A novel dynamic financial conditions index approach based on accurate online support vector regression. Procedia Computer Science, 55:944 - 952, 2015. 3rd International Conference on Information Technology and Quantitative Management, ITQM 2015. [ bib | DOI | http ]
Abstract In this paper, we construct a novel dynamic financial conditions index (DFCI) for China based on accurate online support vector regression (AOSVR), and the constructed DFCI is evaluated on future inflationary pressures. The research results indicate dynamic effect of financial variables on DFCI in time-varying economic and financial environment, verifying the dynamic nature of the weights in our DFCI. On the whole, in our DFCI exchange rate, stock price, and money supply have the push-down effect on DFCI, taking negative dynamic weights. Housing price has the pull-up effect on DFCI, taking positive dynamic weights. The effect of interest rate on DFCI is erratic, taking sign-changed dynamic weights. The Granger causality test results show the superior performance ability of our DFCI compared with the FCI constructed based on SVR.

Keywords: Macroeconomic
[345] Mingfeng Jiang, Yaming Wang, Ling Xia, Feng Liu, Shanshan Jiang, and Wenqing Huang. The combination of self-organizing feature maps and support vector regression for solving the inverse ECG problem. Computers & Mathematics with Applications, 66(10):1981 - 1990, 2013. ICNC-FSKD 2012. [ bib | DOI | http ]
Abstract Noninvasive electrical imaging of the heart aims to quantitatively reconstruct transmembrane potentials (TMPs) from body surface potentials (BSPs), which is a typical inverse problem. Classically, electrocardiography (ECG) inverse problem is solved by regularization techniques. In this study, it is treated as a regression problem with multi-inputs (BSPs) and multi-outputs (TMPs). Then the resultant regression problem is solved by a hybrid method, which combines the support vector regression (SVR) method with self-organizing feature map (SOFM) techniques. The hybrid SOFM–SVR method conducts a two-step process: SOFM algorithm is used to cluster the training samples and the individual SVR method is employed to construct the regression model. For each testing sample, the cluster operation can effectively improve the efficiency of the regression algorithm, and also helps the setup of the corresponding SVR model for the TMPs reconstruction. The performance of the developed SOFM–SVR model is tested using our previously developed realistic heart-torso model. The experiment results show that, compared with traditional single SVR method in solving the inverse ECG problem, the proposed method can reduce the cost of training time and improve the reconstruction accuracy in solving the inverse ECG problem.

Keywords: Support vector regression
[346] Xavier Pascual, Han Gu, Alex R. Bartman, Aihua Zhu, Anditya Rahardianto, Jaume Giralt, Robert Rallo, Panagiotis D. Christofides, and Yoram Cohen. Data-driven models of steady state and transient operations of spiral-wound RO plant. Desalination, 316:154 - 161, 2013. [ bib | DOI | http ]
Abstract The development of data-driven RO plant performance models was demonstrated using the support vector regression model building approach. Models of both steady state and unsteady state plant operation were developed based on a wide range of operational data obtained from a fully automated small spiral-wound RO pilot. Single output variable steady state plant models for flow rates and conductivities of the permeate and retentate streams were of high accuracy, with average absolute relative errors (AARE) of 0.70%–2.46%. Performance of a composite support vector regression (SVR) based model (for both streams) for flow rates and conductivities was of comparable accuracy to the single output variable models (AARE of 0.71%–2.54%). The temporal change in conductivity, as a result of transient system operation (induced by perturbation of either system pressure or flow rate), was described by an SVR model, which utilizes a time forecasting approach, with performance level of less than 1% AARE for forecasting periods of 2 s to 3.5 min. The high level of performance obtained with the present modeling approach suggests that short-term performance forecasting models that are based on plant data could be useful for advanced RO plant control algorithms, fault tolerant control and process optimization.

Keywords: Desalination
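
Note on [346]: the abstract reports accuracy as average absolute relative error (AARE). The sketch below uses the common definition (the mean of |predicted - actual| / |actual|, expressed as a percentage); that the paper uses exactly this form is an assumption, and the function name and sample values are hypothetical.

```python
def aare_percent(actual, predicted):
    """Average absolute relative error, in percent.

    Assumes the common definition mean(|p - a| / |a|) * 100; the exact
    formula used in [346] is not stated in the abstract.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("relative error is undefined when an actual value is zero")
    total = sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical flow-rate readings, not data from the paper.
    measured = [100.0, 200.0, 400.0]
    modelled = [101.0, 198.0, 404.0]
    print(aare_percent(measured, modelled))
```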
[347] Xing Yan and Nurul A. Chowdhury. Mid-term electricity market clearing price forecasting utilizing hybrid support vector machine and auto-regressive moving average with external input. International Journal of Electrical Power & Energy Systems, 63:64 - 70, 2014. [ bib | DOI | http ]
Abstract Currently, there are many techniques available for short-term electricity market clearing price (MCP) forecasting, but very little has been done in the area of mid-term electricity MCP forecasting. Mid-term electricity MCP forecasting has become essential for resources reallocation, maintenance scheduling, bilateral contracting, budgeting and planning purposes. A hybrid mid-term electricity MCP forecasting model combining both support vector machine (SVM) and auto-regressive moving average with external input (ARMAX) modules is presented in this paper. The proposed hybrid model showed improved forecasting accuracy compared to forecasting models using a single SVM, a single least squares support vector machine (LSSVM) and hybrid LSSVM-ARMAX. PJM interconnection data have been utilized to illustrate the proposed model with numerical examples.

Keywords: Auto-regressive moving average with external input (ARMAX)
[348] David Meyer, Friedrich Leisch, and Kurt Hornik. The support vector machine under test. Neurocomputing, 55(1–2):169 - 186, 2003. Support Vector Machines. [ bib | DOI | http ]
Support vector machines (SVMs) are rarely benchmarked against other classification or regression methods. We compare a popular SVM implementation (libsvm) to 16 classification methods and 9 regression methods, all accessible through the software R, by the means of standard performance measures (classification error and mean squared error) which are also analyzed by the means of bias-variance decompositions. SVMs showed mostly good performances both on classification and regression tasks, but other methods proved to be very competitive.

Keywords: Benchmark
[349] Pablo Rivas-Perea and Juan Cota-Ruiz. An algorithm for training a large scale support vector machine for regression based on linear programming and decomposition methods. Pattern Recognition Letters, 34(4):439 - 451, 2013. Advances in Pattern Recognition Methodology and Applications. [ bib | DOI | http ]
This paper presents a method to train a Support Vector Regression (SVR) model for the large-scale case where the number of training samples supersedes the computational resources. The proposed scheme consists of posing the SVR problem entirely as a Linear Programming (LP) problem and on the development of a sequential optimization method based on variables decomposition, constraints decomposition, and the use of primal–dual interior point methods. Experimental results demonstrate that the proposed approach has comparable performance with other SV-based classifiers. Particularly, experiments demonstrate that as the problem size increases, the sparser the solution becomes, and more computational efficiency can be gained in comparison with other methods. This demonstrates that the proposed learning scheme and the LP-SVR model are robust and efficient when compared with other methodologies for large-scale problems.

Keywords: Support vector machines
[350] Hanmin Sheng and Jian Xiao. Electric vehicle state of charge estimation: Nonlinear correlation and fuzzy support vector machine. Journal of Power Sources, 281:131 - 137, 2015. [ bib | DOI | http ]
Abstract The aim of this study is to estimate the state of charge (SOC) of the lithium iron phosphate (LiFePO4) battery pack by applying machine learning strategy. To reduce the noise sensitive issue of common machine learning strategies, a kind of SOC estimation method based on fuzzy least square support vector machine is proposed. By applying fuzzy inference and nonlinear correlation measurement, the effects of the samples with low confidence can be reduced. Further, a new approach for determining the error interval of regression results is proposed to avoid the control system malfunction. Tests are carried out on modified COMS electric vehicles, with two battery packs each consisting of 24 50 Ah LiFePO4 batteries. The effectiveness of the method is proven by the test and the comparison with other popular methods.

Keywords: Lithium battery
[351] Chung-Ho Hsieh, Ruey-Hwa Lu, Nai-Hsin Lee, Wen-Ta Chiu, Min-Huei Hsu, and Yu-Chuan (Jack) Li. Novel solutions for an old disease: Diagnosis of acute appendicitis with random forest, support vector machines, and artificial neural networks. Surgery, 149(1):87 - 93, 2011. [ bib | DOI | http ]
Background: Diagnosing acute appendicitis clinically is still difficult. We developed random forests, support vector machines, and artificial neural network models to diagnose acute appendicitis. Methods: Between January 2006 and December 2008, patients who had a consultation session with surgeons for suspected acute appendicitis were enrolled. Seventy-five percent of the data set was used to construct models including random forest, support vector machines, artificial neural networks, and logistic regression. Twenty-five percent of the data set was withheld to evaluate model performance. The area under the receiver operating characteristic curve (AUC) was used to evaluate performance, which was compared with that of the Alvarado score. Results: Data from a total of 180 patients were collected, 135 used for training and 45 for testing. The mean age of patients was 39.4 years (range, 16–85). Final diagnosis revealed 115 patients with and 65 without appendicitis. The AUC of random forest, support vector machines, artificial neural networks, logistic regression, and Alvarado was 0.98, 0.96, 0.91, 0.87, and 0.77, respectively. The sensitivity, specificity, positive, and negative predictive values of random forest were 94%, 100%, 100%, and 87%, respectively. Random forest performed better than artificial neural networks, logistic regression, and Alvarado. Conclusion: We demonstrated that random forest can predict acute appendicitis with good accuracy and, deployed appropriately, can be an effective tool in clinical decision making.
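
Note on [351]: the abstract reports sensitivity, specificity, and positive and negative predictive values. As a reminder of how these four figures follow from a 2x2 confusion table, here is a minimal sketch using the standard definitions; the function name and the counts in the usage example are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from confusion-table counts.

    tp/fn: diseased patients classified positive/negative.
    tn/fp: non-diseased patients classified negative/positive.
    """
    if min(tp, fn, tn, fp) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0 or tn + fn == 0:
        raise ValueError("a metric denominator is zero")
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


if __name__ == "__main__":
    # Hypothetical counts chosen for illustration only.
    for name, value in diagnostic_metrics(tp=45, fn=5, tn=40, fp=10).items():
        print(f"{name}: {value:.3f}")
```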

[352] Rein Houthooft, Joeri Ruyssinck, Joachim van der Herten, Sean Stijven, Ivo Couckuyt, Bram Gadeyne, Femke Ongenae, Kirsten Colpaert, Johan Decruyenaere, Tom Dhaene, and Filip De Turck. Predictive modelling of survival and length of stay in critically ill patients using sequential organ failure scores. Artificial Intelligence in Medicine, 63(3):191 - 207, 2015. [ bib | DOI | http ]
Abstract Introduction The length of stay of critically ill patients in the intensive care unit (ICU) is an indication of patient ICU resource usage and varies considerably. Planning of postoperative ICU admissions is important as ICUs often have no nonoccupied beds available. Problem statement Estimation of the ICU bed availability for the next coming days is entirely based on clinical judgement by intensivists and therefore too inaccurate. For this reason, predictive models have much potential for improving planning for ICU patient admission. Objective Our goal is to develop and optimize models for patient survival and ICU length of stay (LOS) based on monitored ICU patient data. Furthermore, these models are compared on their use of sequential organ failure (SOFA) scores as well as underlying raw data as input features. Methodology Different machine learning techniques are trained, using a 14,480 patient dataset, both on SOFA scores as well as their underlying raw data values from the first five days after admission, in order to predict (i) the patient LOS, and (ii) the patient mortality. Furthermore, to help physicians in assessing the prediction credibility, a probabilistic model is tailored to the output of our best-performing model, assigning a belief to each patient status prediction. A two-by-two grid is built, using the classification outputs of the mortality and prolonged stay predictors to improve the patient LOS regression models. Results For predicting patient mortality and a prolonged stay, the best performing model is a support vector machine (SVM) with GA,D = 65.9% (area under the curve (AUC) of 0.77) and GS,L = 73.2% (AUC of 0.82). In terms of LOS regression, the best performing model is support vector regression, achieving a mean absolute error of 1.79 days and a median absolute error of 1.22 days for those patients surviving a nonprolonged stay. Conclusion Using a classification grid based on the predicted patient mortality and prolonged stay allows more accurate modeling of the patient LOS. The detailed models can support the decisions made by physicians in an ICU setting.

Keywords: Mortality prediction
[353] Bo Li, Xinjun Li, and Zhiyan Zhao. Novel algorithm for constructing support vector machine regression ensemble. Journal of Systems Engineering and Electronics, 17(3):541 - 545, 2006. [ bib | DOI | http ]
A novel algorithm for constructing a support vector machine regression ensemble is proposed. For regression prediction, a support vector machine regression (SVMR) ensemble is built by repeatedly resampling from the given training data sets and aggregating several independent SVMRs, each of which is trained on a replicated training set. After training, the independently trained SVMRs need to be aggregated in an appropriate combination manner. Linear weighting, such as the expert weighting score in Boosting Regression, is usually used, but it has no optimization capacity. Three combination techniques are proposed, including simple arithmetic mean, linear least square error weighting and nonlinear hierarchical combining that uses another upper-layer SVMR to combine several lower-layer SVMRs. Finally, simulation experiments demonstrate the accuracy and validity of the presented algorithm.

Keywords: SVMR ensemble
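
The resample-and-aggregate scheme described in [353] can be sketched as below, using the simplest of the three combiners the abstract lists (the arithmetic mean). This is an illustrative sketch only: `fit_ensemble`, `predict_mean` and `fit_nearest` are hypothetical names, and `fit_nearest` is a stand-in 1-nearest-neighbour learner used so the example runs with the standard library alone. It is not an SVMR; any callable that takes `(X, y)` and returns a predictor could be substituted.

```python
import random
from statistics import fmean


def bootstrap_sample(X, y, rng):
    """Draw len(X) training pairs with replacement (one replicated training set)."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_ensemble(fit, X, y, n_models=10, seed=0):
    """Train n_models base learners, each on its own bootstrap replicate."""
    rng = random.Random(seed)  # arbitrary seed, assumed for reproducibility
    return [fit(*bootstrap_sample(X, y, rng)) for _ in range(n_models)]


def predict_mean(models, x):
    """Combine the members' predictions by simple arithmetic mean."""
    return fmean([m(x) for m in models])


def fit_nearest(X, y):
    """Stand-in base learner (NOT an SVMR): 1-nearest neighbour on scalar inputs."""
    pairs = list(zip(X, y))

    def predict(x):
        return min(pairs, key=lambda p: abs(p[0] - x))[1]

    return predict


# Toy data: a constant target, so every member must predict 5.0.
models = fit_ensemble(fit_nearest, [0.0, 1.0, 2.0, 3.0], [5.0, 5.0, 5.0, 5.0])
```

The least-squares weighting and the upper-layer SVMR combiner mentioned in the abstract would replace `predict_mean` while leaving the resampling step unchanged.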
[354] Chen-Chung Liu and Kai-Wen Chuang. An outdoor time scenes simulation scheme based on support vector regression with radial basis function on DCT domain. Image and Vision Computing, 27(10):1626 - 1636, 2009. Special Section: Computer Vision Methods for Ambient Intelligence. [ bib | DOI | http ]
In this paper, a novel strategy for forecasting outdoor scenes is introduced. This new approach combines the support vector regression in neural network computation and the discrete cosine transform (DCT). In 1995, Vapnik introduced a neural-network algorithm called support vector machine (SVM). During the recent years, due to SVM’s high generalization performance and attractive modeling features, it has received increasing attention in the application of regression estimation – which is called support vector regression (SVR). In SVR, a set of color-block images were transformed by the discrete cosine transformation to be the training data. We also used the radial basis function (RBF) of the training data as SVR’s kernel to establish the RBF neural network. Finally, the time scenes simulation algorithm (TSSA) is able to synthesize the corresponding scene of any assigned time of the original outdoor scene image. To explore the utility and demonstrate the efficiency of the proposed algorithm, simulations under various input images were conducted. The experiment results showed that our proposed algorithm can precisely simulate the desired scenes at an assigned time and has two advantages: (a) Using the color-block images instead of using the scene images of a place to create the reference database, the database can be used for any outdoor scene image taken at anywhere at anytime. (b) Taking the support vector regression on the DCT coefficients of scene images instead of taking the SVR on the spatial pixels of scene images, it simplifies the regression procedure and saves the processing time.

Keywords: Discrete cosine transform
[355] Ping-Feng Pai, Kuo-Ping Lin, Chi-Shen Lin, and Ping-Teng Chang. Time series forecasting by a seasonal support vector regression model. Expert Systems with Applications, 37(6):4261 - 4265, 2010. [ bib | DOI | http ]
The support vector regression (SVR) model is a novel forecasting approach and has been successfully used to solve time series problems. However, the applications of SVR models in seasonal time series forecasting have not been widely investigated. This study aims at developing a seasonal support vector regression (SSVR) model to forecast seasonal time series data. Seasonal factors and trends are utilized in the SSVR model to perform forecasts. Furthermore, hybrid genetic algorithms and tabu search (GA/TS) algorithms are applied in order to select three parameters of SSVR models. In this study, two other forecasting models, seasonal autoregressive integrated moving average (SARIMA) and SVR, are employed for forecasting the same data sets. Empirical results indicate that the SSVR outperforms both SVR and SARIMA models in terms of forecasting accuracy. Thus, the SSVR model is an effective method for seasonal time series forecasting.

Keywords: Seasonal time series
[356] Yan-Ping Zhou, Jian-Hui Jiang, Wei-Qi Lin, Hong-Yan Zou, Hai-Long Wu, Guo-Li Shen, and Ru-Qin Yu. Boosting support vector regression in QSAR studies of bioactivities of chemical compounds. European Journal of Pharmaceutical Sciences, 28(4):344 - 353, 2006. [ bib | DOI | http ]
In this paper, boosting has been coupled with SVR to develop a new method, boosting support vector regression (BSVR). BSVR is implemented by firstly constructing a series of SVR models on the various weighted versions of the original training set and then combining the predictions from the constructed SVR models to obtain integrative results by weighted median. The proposed BSVR algorithm has been used to predict toxicities of nitrobenzenes and inhibitory potency of 1-phenyl[2H]-tetrahydro-triazine-3-one analogues as inhibitors of 5-lipoxygenase. As comparisons to this method, the multiple linear regression (MLR) and conventional support vector regression (SVR) have also been investigated. Experimental results have shown that the introduction of boosting drastically enhances the generalization performance of individual SVR model and BSVR is a well-performing technique in QSAR studies superior to multiple linear regression.

Keywords: Quantitative structure–activity relationship
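
The combination step named in [356], a weighted median over the individual SVR predictions, can be sketched as follows. This is a minimal sketch under an assumption: the abstract does not say how the weights are derived or how ties are resolved, so the function below uses one common definition (the smallest prediction at which the cumulative weight reaches half the total), and `weighted_median` is a hypothetical name.

```python
def weighted_median(values, weights):
    """Return the smallest value whose cumulative weight reaches half the total.

    values  : predictions from the individual models for one sample
    weights : non-negative confidence weights, one per model (assumed given)
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equally long")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= total / 2:
            return value
    return max(values)  # unreachable for valid input; kept as a safe fallback


# Equal weights reduce to the ordinary median; a heavy weight pulls the result.
equal = weighted_median([1.0, 2.0, 10.0], [1, 1, 1])
skewed = weighted_median([1.0, 2.0, 10.0], [1, 1, 5])
```

Unlike a weighted mean, this combiner returns one of the member predictions, so a single badly wrong model with low weight cannot shift the result.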
[357] Bhusana Premanode and Chris Toumazou. Improving prediction of exchange rates using differential EMD. Expert Systems with Applications, 40(1):377 - 384, 2013. [ bib | DOI | http ]
Volatility is a key parameter when measuring the size of errors made in modelling returns and other financial variables such as exchange rates. The autoregressive moving-average (ARMA) model is a linear process in time series; whilst in the nonlinear system, the generalised autoregressive conditional heteroskedasticity (GARCH) and Markov switching GARCH (MS-GARCH) have been widely applied. In statistical learning theory, support vector regression (SVR) plays an important role in predicting nonlinear and nonstationary time series variables. In this paper, we propose a new algorithm, differential Empirical Mode Decomposition (EMD), for improving prediction of exchange rates under support vector regression (SVR). The new algorithm of Differential EMD has the capability of smoothing and reducing the noise, whereas the SVR model with the filtered dataset improves predicting the exchange rates. Simulation results consisting of the Differential EMD and SVR model show that our model outperforms simulations by state-of-the-art MS-GARCH and Markov switching regression (MSR) models.

Keywords: Prediction
[358] Muhammad Nizam, Azah Mohamed, and Aini Hussain. Dynamic voltage collapse prediction in power systems using support vector regression. Expert Systems with Applications, 37(5):3730 - 3736, 2010. [ bib | DOI | http ]
This paper presents dynamic voltage collapse prediction on an actual power system using support vector regression. Dynamic voltage collapse prediction is first determined based on the PTSI calculated from information in dynamic simulation output. Simulations were carried out on a practical 87 bus test system by considering load increase as the contingency. The data collected from the time domain simulation is then used as input to the SVR in which support vector regression is used as a predictor to determine the dynamic voltage collapse indices of the power system. To reduce training time and improve accuracy of the SVR, the Kernel function type and Kernel parameter are considered. To verify the effectiveness of the proposed SVR method, its performance is compared with the multilayer perceptron neural network (MLPNN). Studies show that the SVM gives faster and more accurate results for dynamic voltage collapse prediction compared with the MLPNN.

Keywords: Dynamic voltage collapse
[359] Insuk Sohn, Sujong Kim, Changha Hwang, and Jae Won Lee. New normalization methods using support vector machine quantile regression approach in microarray analysis. Computational Statistics & Data Analysis, 52(8):4104 - 4115, 2008. [ bib | DOI | http ]
There are many sources of systematic variations in cDNA microarray experiments which affect the measured gene expression levels. Print-tip lowess normalization is widely used in situations where dye biases can depend on spot overall intensity and/or spatial location within the array. However, print-tip lowess normalization performs poorly in situations where error variability for each gene is heterogeneous over intensity ranges. We first develop support vector machine quantile regression (SVMQR) by extending support vector machine regression (SVMR) for the estimation of linear and nonlinear quantile regressions, and then propose some new print-tip normalization methods based on SVMR and SVMQR. We apply our proposed normalization methods to previous cDNA microarray data of apolipoprotein AI-knockout (apoAI-KO) mice, diet-induced obese mice, and genistein-fed obese mice. From our comparative analyses, we find that our proposed methods perform better than the existing print-tip lowess normalization method.

[360] You Ouyang, Wenjie Li, Sujian Li, and Qin Lu. Applying regression models to query-focused multi-document summarization. Information Processing & Management, 47(2):227 - 237, 2011. [ bib | DOI | http ]
Most existing research on applying machine learning techniques to document summarization explores either classification models or learning-to-rank models. This paper presents our recent study on how to apply a different kind of learning models, namely regression models, to query-focused multi-document summarization. We choose to use Support Vector Regression (SVR) to estimate the importance of a sentence in a document set to be summarized through a set of pre-defined features. In order to learn the regression models, we propose several methods to construct the “pseudo” training data by assigning each sentence with a “nearly true” importance score calculated with the human summaries that have been provided for the corresponding document set. A series of evaluations on the DUC data sets are conducted to examine the efficiency and the robustness of the proposed approaches. When compared with classification models and ranking models, regression models are consistently preferable.

Keywords: Query-focused summarization
[361] I.M. Horta and A.S. Camanho. Company failure prediction in the construction industry. Expert Systems with Applications, 40(16):6253 - 6257, 2013. [ bib | DOI | http ]
Abstract This paper proposes a new model to predict company failure in the construction industry. The model includes three major innovative aspects. The use of strategic variables reflecting the key specificities of construction companies, which are critical to explain company failure. The use of data mining techniques, i.e. support vector machine to predict company failure. The use of two different sampling methods (random undersampling and random oversampling with replacement) to balance class distributions. The model proposed was empirically tested using all Portuguese contractors that operated in 2009. It is concluded that support vector machine, with random oversampling and including strategic variables, is a very robust tool to predict company failure in the context of the construction industry. In particular, this model outperforms the results obtained with logistic regression.

Keywords: Construction industry
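
The two class-balancing schemes named in [361], random undersampling and random oversampling with replacement, can be sketched as below. This is a generic illustration rather than the authors' procedure: the function names, the seed, and the choice to balance every class to the size of the smallest or largest class are assumptions made for the example.

```python
import random


def _group_by_label(rows, labels):
    """Collect rows per class label, preserving first-seen label order."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(rows, labels, seed=0):
    """Grow every class to the largest class size by sampling WITH replacement."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Shrink every class to the smallest class size by sampling WITHOUT replacement."""
    rng = random.Random(seed)
    groups = _group_by_label(rows, labels)
    target = min(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy imbalance: 8 surviving companies (label 0) versus 2 failures (label 1).
rows = list(range(10))
labels = [0] * 8 + [1] * 2
over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
```

Balancing would normally be applied to the training portion only, so that the evaluation data keep the original class distribution.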
[362] Hiromasa Kaneko and Kimito Funatsu. Adaptive soft sensor model using online support vector regression with time variable and discussion of appropriate hyperparameter settings and window size. Computers & Chemical Engineering, 58:288 - 297, 2013. [ bib | DOI | http ]
Abstract Soft sensors have been widely used in chemical plants to estimate process variables that are difficult to measure online. One crucial difficulty of soft sensors is that predictive accuracy drops due to changes in state of chemical plants. The predictive accuracy of traditional soft sensor models decreases when sudden process changes occur. However, an online support vector regression (OSVR) model with the time variable can adapt to rapid changes among process variables. One crucial problem is finding appropriate hyperparameters and window size, which means the numbers of data for the model construction, and thus, we discussed three methods to select hyperparameters based on predictive accuracy and computation time. The window size of the proposed method was discussed through simulation data and real industrial data analyses and the proposed method achieved high predictive accuracy when time-varying changes in process characteristics occurred.

Keywords: Process control
[363] Z. Yang, X.S. Gu, X.Y. Liang, and L.C. Ling. Genetic algorithm-least squares support vector regression based predicting and optimizing model on carbon fiber composite integrated conductivity. Materials & Design, 31(3):1042 - 1049, 2010. [ bib | DOI | http ]
Support vector machine (SVM), which is a new technology solving classification and regression, has been widely used in many fields. In this study, based on the integrated conductivity (including conductivity and tensile strength) data obtained by carbon fiber/ABS resin matrix composites experiment, a predicting and optimizing model using genetic algorithm-least squares support vector regression (GA-LSSVR) was developed. In this model, genetic algorithm (GA) was used to select and optimize parameters. The predicting results agreed with the experimental data well. By comparing with principal component analysis-genetic back propagation neural network (PCA-GABPNN) predicting model, it is found that GA-LSSVR model has demonstrated superior prediction and generalization performance in view of small sample size problem. Finally, an optimized district of performance parameters was obtained and verified by experiments. It concludes that GA-LSSVR modeling method provides a new promising theoretical method for material design.

Keywords: Carbon fiber composite
[364] Hien D. Nguyen and Geoffrey J. McLachlan. Laplace mixture of linear experts. Computational Statistics & Data Analysis, pages -, 2014. [ bib | DOI | http ]
Abstract Mixture of Linear Experts (MoLE) models provide a popular framework for modeling nonlinear regression data. The majority of applications of MoLE models utilizes a Gaussian distribution for regression error. Such assumptions are known to be sensitive to outliers. The use of a Laplace distributed error is investigated. This model is named the Laplace MoLE (LMoLE). Links are drawn between the Laplace error model and the least absolute deviations regression criterion, which is known to be robust among a wide class of criteria. Through application of the minorization–maximization algorithm framework, an algorithm is derived that monotonically increases the likelihood in the estimation of the LMoLE model parameters. It is proven that the maximum likelihood estimator (MLE) for the parameter vector of the LMoLE is consistent. Through simulation studies, the robustness of the LMoLE model over the Gaussian MoLE model is demonstrated, and support for the consistency of the MLE is provided. An application of the LMoLE model to the analysis of a climate science data set is described.

Keywords: Laplace distribution
[365] Rok Martinčič, Igor Kuzmanovski, Alain Wagner, and Marjana Novič. Development of models for prediction of the antioxidant activity of derivatives of natural compounds. Analytica Chimica Acta, 868:23 - 35, 2015. [ bib | DOI | http ]
Abstract Antioxidants are important for maintaining the appropriate balance between oxidizing and reducing species in the body and thus preventing oxidative stress. Many natural compounds are being screened for their possible antioxidant activity. It was found that a mushroom pigment, Norbadione A, which is a pulvinic acid derivative, shows an antioxidant activity; the same was found for other pulvinic acid derivatives and structurally related coumarines. Based on the results of in vitro studies performed on these compounds as a part of this study, quantitative structure–activity relationship (QSAR) predictive models were constructed using multiple linear regression, counter-propagation artificial neural networks and support vector regression (SVR). The models have been developed in accordance with current QSAR guidelines, including the assessment of the models' applicability domains. A new approach for the graphical evaluation of the applicability domain for SVR models is suggested. The developed models show sufficient predictive abilities for the screening of virtual libraries for new potential antioxidants.

Keywords: Quantitative structure–activity relationship
[366] Seda Cavdaroglu, Curren Katz, and André Knops. Dissociating estimation from comparison and response eliminates parietal involvement in sequential numerosity perception. NeuroImage, 116:135 - 148, 2015. [ bib | DOI | http ]
Abstract It has been widely debated whether the parietal cortex stores an abstract representation of numerosity that is activated for Arabic digits as well as for non-symbolic stimuli in a sensory modality independent fashion. Some studies suggest that numerical information in time-invariant (simultaneous) symbolic and non-symbolic visual stimuli is represented in the parietal cortex. In humans, whether the same representation is activated for time-variant (sequential) stimuli and for stimuli coming from different modalities has not been determined. To investigate this idea, we measured the brain activation of healthy adults performing estimation and/or comparison of sequential visual (series of dots) and auditory (series of beeps) numerosities. Our experimental design allowed us to separate numerosity estimation from comparison and response related factors. The BOLD response in the parietal cortex increased only when participants were engaged in the comparison of two consecutive numerosities that required a response. Using multivariate pattern analysis we trained a classifier to decode numerosity in various regions of interest (ROI). We failed to find any parietal ROI where the classifier could decode numerosities during the estimation phase. Rather, when participants were not engaged in comparison we were able to decode numerosity in an auditory cortex ROI for auditory stimuli and in a visual cortex ROI for visual stimuli. On the other hand, during the response period the classifier successfully decoded numerosity information in a parietal ROI for both visual and auditory numerosities. These results were further confirmed by support vector regression. In sum, our study does not support the involvement of the parietal cortex during estimation of sequential numerosity in the absence of an active task with a response requirement.

Keywords: Numerical cognition
[367] M. Asadollahi-Baboli and A. Mani-Varnosfaderani. Therapeutic index modeling and predictive QSAR of novel thiazolidin-4-one analogs against toxoplasma gondii. European Journal of Pharmaceutical Sciences, 70:117 - 124, 2015. [ bib | DOI | http ]
Abstract The main idea of this study was to find predictive quantitative structure–activity relationships (QSAR) for the therapeutic index of 68 thiazolidin-4-one analogs against Toxoplasma gondii. Multivariate adaptive regression spline (MARS) together with Monte-Carlo (MC) sampling was proposed as a reliable descriptor subset selection strategy. Basis functions and knot points are also determined for each selected descriptor using generalized cross validation after frequency analysis. Least squares-support vector regression (LS-SVR) with optimized hyper-parameters was employed as mapping tool due to its promising empirical performance. The models were validated and tested through the use of the external prediction set of compounds, leave-one-out and leave-many-out cross validation methods, applicability domain analysis and Y-randomization. The robustness and accuracy of the QSAR models were confirmed by the satisfactory statistical parameters for the experimentally reported dataset (R2p = 0.853, Q2LOO = 0.785, R2L20%O = 0.742 and r2m = 0.715) and low standard error values (RMSEp = 0.208, RMSELOO = 0.321 and RMSEL20%O = 0.376). The comprehensive analysis carried out in the present contribution using the proposed strategy can provide a considerable basis for the design and development of novel drug-like molecules against T. gondii.

Keywords: Toxoplasma gondii
[368] Jui-Sheng Chou, Yu-Chien Hsu, and Liang-Tse Lin. Smart meter monitoring and data mining techniques for predicting refrigeration system performance. Expert Systems with Applications, 41(5):2144 - 2156, 2014. [ bib | DOI | http ]
Abstract A major challenge in many countries is providing sufficient energy for human beings and for supporting economic activities while minimizing social and environmental harm. This study predicted coefficient of performance (COP) for refrigeration equipment under varying amounts of refrigerant (R404A) with the aid of data mining (DM) techniques. Artificial neural networks (ANNs), support vector machines (SVMs), classification and regression tree (CART), multiple regression (MR), generalized linear regression (GLR), and chi-squared automatic interaction detector (CHAID) were applied within the DM process. After obtaining the COP value, abnormal equipment conditions can be evaluated for refrigerant leakage. Analytical results from cross-fold validation method are compared to determine the best models. The study shows that DM techniques can be used for accurately and efficiently predicting COP. In the liquid leakage phase, ANNs provide the best performance. In the vapor leakage phase, the best model is the GLR model. Experimental results confirm that systematic analyses of model construction processes are effective for evaluating and optimizing refrigeration equipment performance.

Keywords: Refrigeration management
[369] Tao Xiong, Chongguang Li, Yukun Bao, Zhongyi Hu, and Lu Zhang. A combination method for interval forecasting of agricultural commodity futures prices. Knowledge-Based Systems, 77:92 - 102, 2015. [ bib | DOI | http ]
Abstract Accurate interval forecasting of agricultural commodity futures prices over future horizons is challenging and of great interest to governments and investors, because it provides a range of values rather than a point estimate. Following the well-established “linear and nonlinear” modeling framework, this study extends it to forecast interval-valued agricultural commodity futures prices with vector error correction model (VECM) and multi-output support vector regression (MSVR) (abbreviated as VECM–MSVR), which is capable of capturing the linear and nonlinear patterns exhibited in agricultural commodity futures prices. Two agricultural commodity futures prices from Chinese futures market are used to justify the performance of the proposed VECM–MSVR method against selected competitors. The quantitative and comprehensive assessments are performed and the results indicate that the proposed VECM–MSVR method is a promising alternative for forecasting interval-valued agricultural commodity futures prices.

Keywords: Interval-valued data
[370] Lin Lin, Feng Guo, and Xiaolong Xie. Novel informative feature samples extraction model using cell nuclear pore optimization. Engineering Applications of Artificial Intelligence, 39:168 - 180, 2015. [ bib | DOI | http ]
Abstract A novel informative feature samples extraction model is proposed to approximate massive original samples (OSs) by using a small number of informative feature samples (IFSs). In this model, (1) the feature samples (FSs) are identified using Support Vector Regression and Quantum-behaved Particle Swarm Optimization and (2) the IFSs space is established based on the Cell Nuclear Pore Optimization (CNPO) algorithm. CNPO uses a pore vector containing 0 or 1 to extract the essential FSs with high contribution based on the idea of the cell nuclear pore selection mechanism. This model can be used to identify the continuous parameter based on the IFSs without massive OSs and time-consuming work. Two experiments are used to validate the proposed model, and one case is used to illustrate the practical value in the real engineering field. The experiments show that the IFSs could approximately represent the massive OSs, and the case shows that the model is helpful to identify the continuous parameters for the hydraulic turbine type design.

Keywords: Informative feature samples extraction
[371] Jian-Hao Hong, Manish Kumar Goyal, Yee-Meng Chiew, and Lloyd H.C. Chua. Predicting time-dependent pier scour depth with support vector regression. Journal of Hydrology, 468–469:241 - 248, 2012. [ bib | DOI | http ]
Summary The temporal variation of local pier scour depth is very complex, especially for cases where the bed comprises a sediment mixture. Many semi-empirical models have been proposed to predict the time-dependent local pier scour depth. In this paper, an alternative approach, the support vector regression method (SVR) is used to estimate the temporal variation of pier-scour depth with non-uniform sediments under clear-water conditions. Based on dimensional analyses, the temporal variation of scour depth was modeled as a function of seven dimensionless input parameters, namely flow shallowness (y/Dp), sediment coarseness (Dp/d50), densimetric Froude number (Fd), the difference between the actual and critical densimetric Froude number (Fd − Fdβ), geometric standard deviation of the sediment particle size distribution (σg), pier Froude number (U/√(gDp)) and one of the following three dimensionless time scales (T1 = t/tR1, T2 = t/tR2 and T3 = t/tR3). The SVR model not only estimates the time-dependent scour depth more accurately than conventional regression models, but also provides results that are consistent with the physics of the scouring process.

Keywords: Bridge piers
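
Three of the dimensionless inputs listed in [371] have closed forms that follow directly from the abstract's notation: flow shallowness y/Dp, sediment coarseness Dp/d50, and the pier Froude number U/√(gDp). The sketch below computes only those three. The physical meanings assigned to the symbols (y as flow depth, Dp as pier diameter, d50 as median grain size, U as approach velocity), the SI units, and the function name are assumptions for illustration; the densimetric Froude number and the time scales are omitted because the abstract does not define them.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed SI units)


def pier_scour_inputs(y, Dp, d50, U):
    """Compute three dimensionless SVR inputs from dimensional quantities.

    y   : flow depth in m (assumed meaning)
    Dp  : pier diameter in m (assumed meaning)
    d50 : median sediment grain size in m (assumed meaning)
    U   : approach flow velocity in m/s (assumed meaning)
    """
    if min(y, Dp, d50) <= 0:
        raise ValueError("lengths must be positive")
    return {
        "flow_shallowness": y / Dp,
        "sediment_coarseness": Dp / d50,
        "pier_froude": U / math.sqrt(G * Dp),
    }


# Hypothetical flume-scale values, chosen only to exercise the function.
features = pier_scour_inputs(y=0.2, Dp=0.1, d50=0.001, U=0.5)
```

Working in dimensionless groups of this kind lets a regression model trained on laboratory data be applied at other scales, which is the usual motivation for the dimensional analysis the abstract mentions.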
[372] Kuo-Ping Lin and Ping-Feng Pai. A fuzzy support vector regression model for business cycle predictions. Expert Systems with Applications, 37(7):5430 - 5435, 2010. [ bib | DOI | http ]
Business cycle predictions face various sources of uncertainty and imprecision. The uncertainty is usually linguistically determined by the beliefs of decision makers. Thus, the fuzzy set theory is ideally suited to depict vague and uncertain features of business cycle predictions. Consequently, the estimation of fuzzy upper and lower bounds becomes an essential issue in predicting business cycles in an uncertain environment. The support vector regression (SVR) model is a novel forecasting approach that has been successfully used to solve time series problems. However, the SVR approach has not been widely applied in fuzzy forecasting problems. This study employs support vector regressions to calculate fuzzy upper and lower bounds, and presents a fuzzy support vector regression (FSVR) model for forecasting indices of business cycles. A numerical example of a business cycle prediction in Taiwan was used to demonstrate the forecasting performance of the FSVR model. The empirical results are satisfactory. Therefore, the FSVR model is an effective alternative in forecasting business cycles under uncertain circumstances.

Keywords: Business cycle
[373] Haihua Yao and Jizheng Chu. Operational optimization of a simulated atmospheric distillation column using support vector regression models and information analysis. Chemical Engineering Research and Design, 90(12):2247 - 2261, 2012. [ bib | DOI | http ]
Like any other production processes, atmospheric distillation of crude oil is too complex to be accurately described with first principle models, and on-site experiments guided by some statistical optimization method are often necessary to achieve the optimum operating conditions. In this study, the design of experiment (DOE) optimization procedure proposed originally by Chen et al. (1998) and extended later by Chu et al. (2003) has been revised by using support vector regression (SVR) to build models for target processes. The location of future experiments is suggested through information analysis which is based on SVR models for the performance index and observed variables and reduces significantly the number of experiments needed. A simulated atmospheric distillation column (ADC) is built with Aspen Plus (version 11.1) for a real operating ADC. Kernel functions and parameters are investigated for SVR models to represent suitably the behavior of the simulated ADC. To verify the effectiveness of the revised DOE optimization procedure, three case studies are carried out: (1) The modified Himmelblau function is minimized under a circle constraint; (2) the net profit of the simulated ADC is maximized with all the 15 controlled variables free for adjusting in their operational ranges; (3) the net profit of the simulated ADC is maximized with fixed production rates for the three side-draws.

Keywords: Atmospheric distillation column
[374] Andrew W. Dougherty, Elvin Beach, Patricia A. Morris, and Bruce R. Patton. Efficient orthogonalization in gas sensor arrays using reciprocal kernel support vector regression. Sensors and Actuators B: Chemical, 149(1):264 - 271, 2010.
In this paper support vector regression is presented, and it is used to model the responses of metal oxide gas sensors to combustion byproducts. A new version of the reciprocal kernel is presented for use in the regression, and it is tested in multiple dimensions. The orthogonality of the sensors is also calculated to determine if the sensors are suitable for use in arrays. A fast numerical approximation of the sensor orthogonality, which takes advantage of the reciprocal kernel, is presented as a way of quickly optimizing the effective response of large arrays. Comparison reveals advantages over standard approaches like principal component analysis.

Keywords: Metal oxide sensors
[375] Turker Tekin Erguzel, Cumhur Tas, and Merve Cebi. A wrapper-based approach for feature selection and classification of major depressive disorder–bipolar disorders. Computers in Biology and Medicine, 64:127 - 137, 2015.
Feature selection (FS) and classification are consecutive artificial intelligence (AI) methods used in data analysis, pattern classification, data mining and medical informatics. Besides promising studies in the application of AI methods to health informatics, working with more informative features is crucial in order to contribute to early diagnosis. Depressive episodes of bipolar disorder (BD), one of the prevalent psychiatric disorders, are often misdiagnosed as major depressive disorder (MDD), leading to suboptimal therapy and poor outcomes. Therefore, discriminating MDD and BD at earlier stages of illness could help to facilitate efficient and specific treatment. In this study, a novel nature-inspired FS algorithm based on standard Ant Colony Optimization (ACO), called improved ACO (IACO), was used to reduce the number of features by removing irrelevant and redundant data. The selected features were then fed into a support vector machine (SVM), a powerful mathematical tool for data classification, regression, function estimation and modeling processes, in order to classify MDD and BD subjects. The proposed method used values of coherence, a promising quantitative electroencephalography (EEG) biomarker, calculated from alpha, theta and delta frequency bands. The noteworthy performance of the novel IACO–SVM approach showed that it is possible to discriminate 46 BD and 55 MDD subjects using 22 of 48 features with 80.19% overall classification accuracy. The performance of the IACO algorithm was also compared to the performance of standard ACO, genetic algorithm (GA) and particle swarm optimization (PSO) algorithms in terms of their classification accuracy and number of selected features. In order to provide an almost unbiased estimate of classification error, the validation process was performed using a nested cross-validation (CV) procedure.

Keywords: Artificial intelligence
[376] Nasser Goudarzi, Mohammad Goodarzi, M. Arab Chamjangali, and M.H. Fatemi. Application of a new SPA-SVM coupling method for QSPR study of electrophoretic mobilities of some organic and inorganic compounds. Chinese Chemical Letters, 24(10):904 - 908, 2013.
In this work, two chemometrics methods are applied for the modeling and prediction of electrophoretic mobilities of some organic and inorganic compounds. The successive projection algorithm (SPA), a feature selection strategy, is used as the descriptor selection and model development method. Then, the support vector machine (SVM) and multiple linear regression (MLR) models are utilized to construct the non-linear and linear quantitative structure–property relationship models. A comparison of the results obtained using the SVM model with those obtained using MLR reveals that the SVM model is of much better predictive value than the MLR one. The root-mean-square errors for the training set and the test set for the SVM model were 0.1911 and 0.2569, respectively, while for the MLR model they were 0.4908 and 0.6494, respectively. The results show that the SVM model drastically enhances the ability of prediction in QSPR studies and is superior to the MLR model.

Keywords: Quantitative structure–mobility relationship
[377] Zhenbo Wei and Jun Wang. Tracing floral and geographical origins of honeys by potentiometric and voltammetric electronic tongue. Computers and Electronics in Agriculture, 108:112 - 122, 2014.
A potentiometric electronic tongue (PE-tongue) and a voltammetric electronic tongue (VE-tongue) were used as rapid techniques to classify and predict honey samples from different floral and geographical origins. The PE-tongue, which was named α-ASTREE, was developed by Alpha M.O.S. (Toulouse, France), and it comprises seven potentiometric chemical sensors. The VE-tongue was self-developed at Zhejiang University and comprises six metallic working sensors. Four types of honey of different floral origins (acacia, buckwheat, data, and motherwort) and four types of acacia honey of different geographical origins were classified by both multisensor systems. Multivariate statistical data analysis techniques such as principal component analysis (PCA) and discriminant function analysis (DFA) were used to classify the honey samples. Both types of electronic tongue have good potential to classify the honey samples, and the positions of the data points for the samples in the PCA score plots based on the VE-tongue were much more closely grouped. Three regression models, principal component regression (PCR), partial least squares regression (PLSR), and least squares support vector machines (LS-SVM), were applied for category forecasting. These regression models exhibited a clear indication of the prediction ability of the two types of electronic tongue, and a positive trend in the prediction of the floral and geographical origin of honey was found. Moreover, the performance of these regression models for predicting the four types of honey of different geographical origins by the VE-tongue is very stable.

Keywords: Potentiometric electronic tongue
[378] Hiromasa Kaneko and Kimito Funatsu. Adaptive soft sensor model using online support vector regression with time variable and discussion of appropriate parameter settings. Procedia Computer Science, 22:580 - 589, 2013. 17th International Conference in Knowledge Based and Intelligent Information and Engineering Systems - KES2013.
Soft sensors are used in chemical plants to estimate process variables that are difficult to measure online. However, the predictive accuracy of adaptive soft sensor models decreases when sudden process changes occur. An online support vector regression (OSVR) model with a time variable can adapt to rapid changes among process variables. One problem faced by the proposed model is finding appropriate hyperparameters for the OSVR model; we discussed three methods to select parameters based on predictive accuracy and computation time. The proposed method was applied to simulation data and industrial data, and achieved high predictive accuracy when time-varying changes occurred.

Keywords: Process control
[379] M. Bassbasi, S. Platikanov, R. Tauler, and A. Oussama. FTIR-ATR determination of solid non fat (SNF) in raw milk using PLS and SVM chemometric methods. Food Chemistry, 146:250 - 254, 2014.
Fourier transform infrared (FTIR) attenuated total reflectance (ATR) spectroscopy, coupled with chemometrics methods, has been applied to the fast and non-destructive quantitative determination of solid non fat (SNF) content in raw milk. Partial least squares regression (PLS) and support vector machine (SVM) regression methods were used to model and predict SNF contents in raw milk based on FTIR spectral transmission measurements. Both methods, PLS and SVM, showed good performance in SNF prediction, with relative prediction errors in the external validation of between 0.2% and 0.3% depending on the spectral range and regression method. The coefficient of determination of the global fit was always above 0.99. Since the relative prediction errors were low, it can be concluded that FTIR-ATR with chemometrics can be used for accurate quantitative determinations of SNF contents in raw milk within the investigated calibration range of 79–100 g/L. The proposed procedure is fast, non-destructive, simple and easy to implement.

Keywords: Raw milk
[380] Phuong Minh Nguyen, Jan De Pue, Khoa Van Le, and Wim Cornelis. Impact of regression methods on improved effects of soil structure on soil water retention estimates. Journal of Hydrology, 525:598 - 606, 2015.
Increasing the accuracy of pedotransfer functions (PTFs), an indirect method for predicting non-readily available soil features such as soil water retention characteristics (SWRC), is of crucial importance for large scale agro-hydrological modeling. Adding significant predictors (i.e., soil structure) and implementing more flexible regression algorithms are among the main strategies for PTF improvement. The aim of this study was to investigate whether the improved effect of categorical soil structure information on estimating soil-water content at various matric potentials, which has been reported in the literature, could be enduringly captured by regression techniques other than the usually applied linear regression. Two data mining techniques, i.e., Support Vector Machines (SVM) and k-Nearest Neighbors (kNN), which have been recently introduced as promising tools for PTF development, were utilized to test whether the incorporation of soil structure improves PTF accuracy in a context of rather limited training data. The results show that incorporating descriptive soil structure information, i.e., massive, structured and structureless, as a grouping criterion can improve the accuracy of PTFs derived by the SVM approach in the range of matric potential of −6 to −33 kPa (average RMSE decreased up to 0.005 m³ m⁻³ after grouping, depending on matric potentials). The improvement was primarily attributed to the outperformance of SVM-PTFs calibrated on structureless soils. No improvement was obtained with the kNN technique, at least not in our study, in which the data set became limited in size after grouping. Since there is an impact of regression techniques on the improved effect of incorporating qualitative soil structure information, selecting a proper technique will help to maximize the combined influence of flexible regression algorithms and soil structure information on PTF accuracy.

Keywords: Pedotransfer function
[381] Lisa Michielan, Chiara Bolcato, Stephanie Federico, Barbara Cacciari, Magdalena Bacilieri, Karl-Norbert Klotz, Sonja Kachler, Giorgia Pastorin, Riccardo Cardin, Alessandro Sperduti, Giampiero Spalluto, and Stefano Moro. Combining selectivity and affinity predictions using an integrated support vector machine (SVM) approach: An alternative tool to discriminate between the human adenosine A2A and A3 receptor pyrazolo-triazolo-pyrimidine antagonists binding sites. Bioorganic & Medicinal Chemistry, 17(14):5259 - 5274, 2009.
G Protein-coupled receptor (GPCR) selectivity is an important aspect of the drug discovery process, and distinguishing between related receptor subtypes is often the key to therapeutic success. Nowadays, very few valuable computational tools are available for the prediction of receptor subtype selectivity. In the present study, we present an alternative application of the Support Vector Machine (SVM) and Support Vector Regression (SVR) methodologies to simultaneously describe both the A2AR versus A3R subtype selectivity profile and the corresponding receptor binding affinities. We have implemented an integrated application of the SVM–SVR approach, based on the use of our recently reported autocorrelated molecular descriptors encoding for the Molecular Electrostatic Potential (autoMEP), to simultaneously discriminate A2AR versus A3R antagonists and to predict their binding affinity to the corresponding receptor subtype for a large dataset of known pyrazolo-triazolo-pyrimidine analogs. To validate our approach, we have synthesized 51 new pyrazolo-triazolo-pyrimidine derivatives, anticipating both A2AR/A3R subtype selectivity and receptor binding affinity profiles.

Keywords: Adenosine receptors
[382] Pingyan Cheng, Wenlai Fan, and Yan Xu. Quality grade discrimination of Chinese strong aroma type liquors using mass spectrometry and multivariate analysis. Food Research International, 54(2):1753 - 1760, 2013.
Food quality control and grade identification are important for protecting consumer interests. In this paper, taking Yanghe Daqu as an example, we studied quality grade discrimination of Chinese liquor of the strong aroma type. A total of 108 samples were divided into a calibration set (81 samples) and a validation set (27 samples), whose mass spectra were obtained by head space-solid phase microextraction-mass spectrometry (HS-SPME-MS) technology in the range of m/z 55–191. Then, partial least squares (PLS) regression and principal component regression (PCR) models were constructed from the calibration set and used to predict the quality grade of the validation set. Discrimination accuracy of the PLS model was > 96.3% for both the calibration set and the validation set, which was clearly superior to the PCR model. Support vector machine (SVM) models were built using different ion selection methods: PLS regression coefficients, PLS X-loading, PCR regression coefficients, and PCR X-loading. Of these, the optimal SVM model was achieved with ions (m/z 112, 134, 140, 162, 167, 168, 175, 187, and 191) selected by PLS regression coefficients, whose prediction accuracy for the validation set was up to 92.6%. The overall results indicated that PLS regression coefficients were a powerful way of selecting effective ion variables and that mass spectrometry combined with SVM could discriminate the quality grade of liquor well.

Keywords: Quality grade discrimination
[383] S.-S. Poil, S. Bollmann, C. Ghisleni, R.L. O’Gorman, P. Klaver, J. Ball, D. Eich-Höchli, D. Brandeis, and L. Michels. Age dependent electroencephalographic changes in attention-deficit/hyperactivity disorder (ADHD). Clinical Neurophysiology, 125(8):1626 - 1638, 2014.
Objective: Objective biomarkers for attention-deficit/hyperactivity disorder (ADHD) could improve diagnostics or treatment monitoring of this psychiatric disorder. The resting electroencephalogram (EEG) provides non-invasive spectral markers of brain function and development. Their accuracy as ADHD markers is increasingly questioned but may improve with pattern classification. Methods: This study provides an integrated analysis of ADHD and developmental effects in children and adults using regression analysis and support vector machine classification of spectral resting (eyes-closed) EEG biomarkers in order to clarify their diagnostic value. Results: ADHD effects on EEG strongly depend on age and frequency. We observed typical non-linear developmental decreases in delta and theta power for both ADHD and control groups. However, for ADHD adults we found a slowing in alpha frequency combined with a higher power in alpha-1 (8–10 Hz) and beta (13–30 Hz). Support vector machine classification of ADHD adults versus controls yielded a notable cross-validated sensitivity of 67% and specificity of 83% using power and central frequency from all frequency bands. ADHD children were not classified convincingly with these markers. Conclusions: Resting state electrophysiology is altered in ADHD, and these electrophysiological impairments persist into adulthood. Significance: Spectral biomarkers may have both diagnostic and prognostic value.

Keywords: Attention-deficit/hyperactivity disorder
[384] Rui min Shen, Yong gang Fu, and Hong tao Lu. A novel image watermarking scheme based on support vector regression. Journal of Systems and Software, 78(1):1 - 8, 2005.
In this paper, a novel support vector regression based color image watermarking scheme is proposed. Using the information provided by the reference positions, the support vector regression can be trained during the embedding procedure, and the watermark is adaptively embedded into the blue channel of the host image by considering the human visual system. Thanks to the good learning ability of the support vector machine, the watermark can be correctly extracted under several different attacks. Experimental results show that the proposed scheme outperforms Kutter’s method and Yu’s method against different attacks including noise addition, shearing, luminance and contrast enhancement, distortion, etc. In particular, when the watermarked image is enhanced in luminance and contrast at a rate of 70%, our method can extract the watermark with few bit errors.

Keywords: Digital watermarking
[385] Zengchang Qin and Jonathan Lawry. Prediction and query evaluation using linguistic decision trees. Applied Soft Computing, 11(5):3916 - 3928, 2011.
The linguistic decision tree (LDT) is a tree-structured model based on a framework for “Modelling with Words”. In previous research [15,17], an algorithm for learning LDTs was proposed and its performance on some benchmark classification problems was investigated and compared with a number of well known classifiers. In this paper, a methodology for extending LDTs to prediction problems is proposed and the performance of LDTs is compared with other state-of-the-art prediction algorithms such as a Support Vector Regression (SVR) system and Fuzzy Semi-Naive Bayes [13] on a variety of data sets. Finally, a method for linguistic query evaluation is discussed and supported with an example.

Keywords: Label semantics
[386] Sounak Chakraborty, Malay Ghosh, and Bani K. Mallick. Bayesian nonlinear regression for large p small n problems. Journal of Multivariate Analysis, 108:28 - 40, 2012.
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik’s ϵ-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM, which relies on type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models.

Keywords: Bayesian hierarchical model
[387] Xianlun Tang, Ling Zhuang, and Changjiang Jiang. Prediction of silicon content in hot metal using support vector regression based on chaos particle swarm optimization. Expert Systems with Applications, 36(9):11853 - 11857, 2009.
The prediction of silicon content in hot metal has been a major subject of study as one of the most important means of state monitoring in the ferrous metallurgy industry. A prediction model of silicon content is established based on support vector regression (SVR) whose optimal parameters are selected by chaos particle swarm optimization. The data for the model are collected from the No. 3 BF at Panzhihua Iron and Steel Group Co. of China. The results show that the proposed prediction model gives better prediction results than a neural network trained by chaos particle swarm optimization and than least squares support vector regression. The percentage of samples whose absolute prediction errors are less than 0.03 when predicting silicon content with the proposed model is higher than 90%, which indicates that the prediction precision can meet the requirements of practical production.

Keywords: Support vector regression
[388] H. Ping Tserng, Gwo-Fong Lin, L. Ken Tsai, and Po-Cheng Chen. An enforced support vector machine model for construction contractor default prediction. Automation in Construction, 20(8):1242 - 1249, 2011.
The financial health of construction contractors is critical in successfully completing a project, and thus default prediction is of great concern to owners and other stakeholders. In other industries, many previous studies employ support vector machine (SVM) or Artificial Neural Network (ANN) methods for corporate default prediction using the sample-matching method, which produces sample selection biases. In order to avoid the sample selection biases, this paper used all available firm-year samples during the sample period. Yet this brings a new challenge: the number of non-defaulted samples greatly exceeds the number of defaulted samples, which is referred to as between-class imbalance. Although the SVM algorithm is a powerful learning process, it cannot always be applied to data with extreme distribution characteristics. This paper proposes an enforced support vector machine-based model (ESVM model) for default prediction in the construction industry, using all available firm-year data in our sample period to solve the between-class imbalance. The traditional logistic regression model is provided as a benchmark to evaluate the forecasting ability of the ESVM model. All financial variables related to the prediction of contractor default risk, as well as 7 variables selected by the Multivariate Discriminant Analysis (MDA) stepwise method, are put in the models for comparison. The empirical results of this paper show that the ESVM model always outperforms the logistic regression model, and is more convenient to use because it is relatively independent of the selection of variables. Thus, we recommend the proposed ESVM model as an alternative to the traditionally used logistic model.

Keywords: Contractor analysis
[389] Jan Luts, Geert Molenberghs, Geert Verbeke, Sabine Van Huffel, and Johan A.K. Suykens. A mixed effects least squares support vector machine model for classification of longitudinal data. Computational Statistics & Data Analysis, 56(3):611 - 628, 2012.
A mixed effects least squares support vector machine (LS-SVM) classifier is introduced to extend the standard LS-SVM classifier for handling longitudinal data. The mixed effects LS-SVM model contains a random intercept and allows classification of highly unbalanced data, in the sense that there is an unequal number of observations for each case at non-fixed time points. The methodology consists of a regression modeling step and a classification step based on the obtained regression estimates. Regression and classification of new cases are performed in a straightforward manner by solving a linear system. It is demonstrated that the methodology can be generalized to deal with multi-class problems and can be extended to incorporate multiple random effects. The technique is illustrated on simulated data sets and real-life problems concerning human growth.

Keywords: Classification
[390] Juan F. Ramirez-Villegas and David F. Ramirez-Moreno. Wavelet packet energy, Tsallis entropy and statistical parameterization for support vector-based and neural-based classification of mammographic regions. Neurocomputing, 77(1):82 - 100, 2012.
This work develops a support vector and neural-based classification of mammographic regions by applying statistical, wavelet packet energy and Tsallis entropy parameterization. From the first four wavelet packet decomposition levels, four different feature sets were evaluated using the two-sample Kolmogorov–Smirnov test (KS-test) and, in one case, principal component analysis (PCA). Feature selection was performed by applying a hybrid scheme integrating the non-parametric KS-test, correlation analysis, a logistic regression (LR) model and sequential forward selection (SFS). The top selected features (depending on the selected wavelet decomposition level) produced the best classification performance in comparison to other well-known feature selection methods. The classification of the data was carried out using several support vector machine (SVM) schemes and multi-layer perceptron (MLP) neural networks. The new set of features significantly improved the classification performance of mammographic regions using conventional SVMs and MLPs.

Keywords: Mammographic regions
[391] Albert Samà, Cecilio Angulo, Diego Pardo, Andreu Català, and Joan Cabestany. Analyzing human gait and posture by combining feature selection and kernel methods. Neurocomputing, 74(16):2665 - 2674, 2011. Advances in Extreme Learning Machine: Theory and Applications. Biological Inspired Systems. Computational and Ambient Intelligence. Selected papers of the 10th International Work-Conference on Artificial Neural Networks (IWANN2009).
This paper evaluates a set of computational algorithms for the automatic estimation of human postures and gait properties from signals provided by an inertial body sensor. The use of a single sensor device imposes limitations on the automatic estimation of relevant properties, like step length and gait velocity, as well as on the detection of standard postures like sitting or standing. Moreover, the exact location and orientation of the sensor are also a common restriction that is relaxed in this study. Based on accelerations provided by a sensor, known as the ‘9×2’, three approaches are presented for extracting kinematic information from the user's motion and posture. First, a two-phase procedure implementing feature extraction and support vector machine based classification for daily living activity monitoring is presented. Second, support vector regression is applied to heuristically extracted features for the automatic computation of spatiotemporal properties during gait. Finally, sensor information is interpreted as an observation of a particular trajectory of the human gait dynamical system, from which a reconstruction space is obtained and then transformed using standard principal components analysis; support vector regression is then used for prediction. Daily living activities are detected and spatiotemporal parameters of human gait are estimated using methods sharing a common structure based on feature extraction and kernel methods. The approaches presented could be used for medical purposes.

Keywords: Human gait and posture detection
[392] Elham Omrani, Benyamin Khoshnevisan, Shahaboddin Shamshirband, Hadi Saboohi, Nor Badrul Anuar, and Mohd Hairul Nizam Md Nasir. Potential of radial basis function-based support vector regression for apple disease detection. Measurement, 55:512 - 519, 2014.
Plant pathologists detect diseases directly with the naked eye. However, such detection usually requires continuous monitoring, which is time consuming and very expensive on large farms. Therefore, seeking rapid, automated, economical, and accurate methods of plant disease detection is very important. In this study, three different apple diseases appearing on leaves, namely Alternaria, apple black spot, and apple leaf miner pest, were selected for detection via an image processing technique. This paper presents three soft-computing approaches for disease classification, based on artificial neural networks (ANNs) and support vector machines (SVMs). Following sampling, the infected leaves were transferred to the laboratory and then leaf images were captured under controlled light. Next, K-means clustering was employed to detect infected regions. The images were then processed and features were extracted. The SVM approach provided better results than the ANNs for disease classification.

Keywords: Plant disease
[393] Hiromasa Kaneko and Kimito Funatsu. Fast optimization of hyperparameters for support vector regression models with highly predictive ability. Chemometrics and Intelligent Laboratory Systems, 142:64 - 69, 2015.
Support vector regression (SVR) attracts much attention in chemometrics as a nonlinear regression method due to its theoretical background. In SVR modeling, three hyperparameters must be set beforehand. An optimization method based on grid search (GS) and cross-validation (CV) is normally employed to select the SVR hyperparameters. However, this takes an enormous amount of time. Although theoretical techniques exist to decide the values of the SVR hyperparameters, the predictive ability of SVR models is not considered in the decision. We therefore proposed a method based on the GS and CV method and theoretical techniques for fast optimization of the SVR hyperparameters, considering the predictive ability of SVR models. After values of two hyperparameters are decided theoretically, each hyperparameter is optimized independently with GS and CV. The high predictive ability of SVR models and the small computational time of the proposed method are confirmed through case studies using real data sets.

Keywords: Support vector regression
[394] Xianlong Wang and Annie Qu. Efficient classification for longitudinal data. Computational Statistics & Data Analysis, 78:119 - 134, 2014.
A new classifier, QIFC, is proposed based on the quadratic inference function for longitudinal data. Our approach builds a classifier by taking advantage of modeling information between the longitudinal responses and covariates for each class, and assigns a new subject to the class with the shortest newly defined distance to the subject. For finite sample applications, this enables one to overcome the difficulty in estimating covariance matrices while still incorporating correlation into the classifier. The proposed classifier only requires the first moment condition of the model distribution, and hence is able to handle both continuous and discrete responses. Simulation studies show that QIFC outperforms competing classifiers, such as the functional data classifier, support vector machine, logistic regression, linear discriminant analysis, the naive Bayes classifier and the decision tree in various practical settings. Two time-course gene expression data sets are used to assess the performance of QIFC in applications.

Keywords: QIFC
[395] Katherine Holshausen, Philip D. Harvey, Brita Elvevåg, Peter W. Foltz, and Christopher R. Bowie. Latent semantic variables are associated with formal thought disorder and adaptive behavior in older inpatients with schizophrenia. Cortex, 55:88 - 96, 2014. Language, Computers and Cognitive Neuroscience.
Introduction: Formal thought disorder is a hallmark feature of schizophrenia in which disorganized thoughts manifest as disordered speech. A dysfunctional semantic system and a disruption in executive functioning have been proposed as possible mechanisms for formal thought disorder and verbal fluency impairment. Traditional rating scales and neuropsychological test scores might not be sensitive enough to distinguish among types of semantic impairments. This has led to the proposed use of a natural language processing technique, Latent Semantic Analysis (LSA), which offers improved semantic sensitivity. Method: In this study, LSA, a computational, vector-based text analysis technique, was used to examine the contribution of vector length, an LSA measure related to word unusualness, and cosines between word vectors, an LSA measure of semantic coherence, to semantic and phonological fluency, disconnectedness of speech, and adaptive functioning in 165 older inpatients with schizophrenia. Results: In stepwise regressions, word unusualness was significantly associated with semantic fluency and phonological fluency, disconnectedness in speech, and impaired functioning, even after considering the contribution of premorbid cognition, positive and negative symptoms, and demographic variables. Conclusions: These findings support the utility of LSA in examining the contribution of coherence to thought disorder and its relationship with daily functioning. Deficits in verbal fluency may be an expression of underlying disorganization in thought processes.

Keywords: Schizophrenia
[396] Rachid Darnag, Brahim Minaoui, and Mohamed Fakir. QSAR models for prediction study of HIV protease inhibitors using support vector machines, neural networks and multiple linear regression. Arabian Journal of Chemistry, pages -, 2012. [ bib | DOI | http ]
Support vector machines (SVM) represent one of the most promising Machine Learning (ML) tools that can be applied to develop predictive quantitative structure–activity relationship (QSAR) models using molecular descriptors. Multiple linear regression (MLR) and artificial neural networks (ANNs) were also utilized to construct quantitative linear and nonlinear models to compare with the results obtained by SVM. The prediction results are in good agreement with the experimental values of HIV activity; also, the results reveal the superiority of the SVM over the MLR and ANN models. The contribution of each descriptor to the structure–activity relationships was evaluated.

Keywords: QSAR
[397] C. Ordóñez, J.M. Matías, J.F. de Cos Juez, and P.J. García. Machine learning techniques applied to the determination of osteoporosis incidence in post-menopausal women. Mathematical and Computer Modelling, 50(5–6):673 - 679, 2009. Mathematical Models in Medicine & Engineering. [ bib | DOI | http ]
Osteoporosis is a disease that mostly affects women in developed countries. It is characterised by reduced bone mineral density (BMD) and results in a higher incidence of fractured or broken bones. In this research we studied the relationship between BMD and diet and lifestyle habits for a sample of 305 post-menopausal women by constructing a non-linear model using the regression support vector machines technique. One aim of this model was to make an initial preliminary estimate of BMD in the studied women (on the basis of a questionnaire with questions mostly on dietary habits) so as to determine whether they needed densitometry testing. A second aim was to determine the factors with the greatest bearing on BMD with a view to proposing dietary and lifestyle improvements. These factors were determined using regression trees applied to the support vector machines predictions.

Keywords: Osteoporosis
[398] Sun Lingfang and Wang Yechi. Soft-sensing of oxygen content of flue gas based on mixed model. Energy Procedia, 17, Part A:221 - 226, 2012. 2012 International Conference on Future Electrical Power and Energy System. [ bib | DOI | http ]
In order to increase the measuring accuracy of the oxygen content of flue gas, a new soft-sensing method for oxygen content in flue gas based on a mixed model is presented. The main body of the model was set up with support vector regression (SVR), the input set was pretreated with the principal component analysis (PCA) method to reduce the number of input dimensions, the training output set was pretreated with the empirical mode decomposition (EMD) method to eliminate the influences caused by high-frequency interference, and model calibration was carried out with the K-fold cross validation (K-CV) method. The simulation result shows that this mixed model method has better accuracy and generalization ability than single models based on a support vector machine or neural network.

Keywords: Soft sensing
[399] Bartosz Swiderski, Jarosław Kurek, and Stanislaw Osowski. Multistage classification by using logistic regression and neural networks for assessment of financial condition of company. Decision Support Systems, 52(2):539 - 547, 2012. [ bib | DOI | http ]
The paper presents a new approach to the automatic assessment of the financial condition of a company. We develop a computerized classification system applying WOE representation of data, logistic regression and a Support Vector Machine (SVM) used as the final classifier. The applied method is a combination of a classical binary scoring approach and Support Vector Machine classification. The application of this method to the assessment of the financial condition of companies, classified into five classes, has shown its superiority with respect to classical approaches.

Keywords: Multinomial ordinary regression
[400] Hao CHEN, Yu chao MA, Mu zi CHEN, Yue TANG, Bo WANG, Min CHEN, and Xiao guang YANG. Recovery discrimination based on optimized-variables support vector machine for nonperforming loan. Systems Engineering - Theory & Practice, 29(12):23 - 30, 2009. [ bib | DOI | http ]
This article modifies the Support Vector Machine (SVM) algorithm to address the issue of a large number of explanatory variables in the analysis of nonperforming loan recovery. First, the stepwise SVM is employed in the selection of model structure. Secondly, the results of linear stepwise regression are used as the initial states of the model selection. Empirical results show that the method achieves not only highly accurate out-of-sample prediction, but also stable performance across in-sample and out-of-sample data.

Keywords: variables optimization
[401] Ajaya Kumar Pani and Hare Krishna Mohanta. Online monitoring and control of particle size in the grinding process using least square support vector regression and resilient back propagation neural network. ISA Transactions, 56:206 - 221, 2015. [ bib | DOI | http ]
Abstract Particle size soft sensing in cement mills will be largely helpful in maintaining desired cement fineness or Blaine. Despite the growing use of vertical roller mills (VRM) for clinker grinding, very little research is available on VRM modeling. This article reports the design of three types of feed forward neural network models and a least square support vector regression (LS-SVR) model of a VRM for online monitoring of cement fineness based on mill data collected from a cement plant. In the data pre-processing step, a comparative study of the various outlier detection algorithms has been performed. Subsequently, for model development, the advantage of algorithm based data splitting over random selection is presented. The training data set obtained by use of the Kennard–Stone maximal intra distance criterion (CADEX algorithm) was used for development of LS-SVR, back propagation neural network, radial basis function neural network and generalized regression neural network models. Simulation results show that the resilient back propagation model performs better than the RBF network, regression network and LS-SVR model. Model implementation has been done in the SIMULINK platform, showing the online detection of abnormal data and real time estimation of cement Blaine from the knowledge of the input variables. Finally, a closed loop study shows how the model can be effectively utilized for maintaining cement fineness at the desired value.

Keywords: Cement fineness
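
Entry [401] contrasts algorithm-based data splitting with random selection and names the Kennard–Stone criterion. The sketch below illustrates the usual textbook form of that rule (start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbour is farthest away). It is not taken from the paper: the function name `kennard_stone`, the Euclidean distance, and the toy `samples` are all assumptions made for illustration.

```python
import math


def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by the Kennard-Stone rule."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")
    # Pairwise Euclidean distances (assumed metric).
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Start with the two points that are farthest apart.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    while len(selected) < n_select:
        # Pick the candidate whose nearest selected point is farthest away.
        best = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical 2-D samples; the selected indices would form the training set.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5), (0.9, 1.0)]
train_idx = kennard_stone(samples, 3)
```

With these toy samples the two extreme points are picked first and the central point third, so the near-duplicates of the extremes are left for validation. That is the intended effect of a space-filling split.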
[402] Vasilios Plakandaras, Rangan Gupta, Periklis Gogas, and Theophilos Papadimitriou. Forecasting the U.S. real house price index. Economic Modelling, 45:259 - 267, 2015. [ bib | DOI | http ]
Abstract The 2006 sudden and immense downturn in U.S. house prices sparked the 2007 global financial crisis and revived interest in forecasting such imminent threats to economic stability. In this paper we propose a novel hybrid forecasting methodology that combines the Ensemble Empirical Mode Decomposition (EEMD) from the field of signal processing with the Support Vector Regression (SVR) methodology that originates from machine learning. We test the forecasting ability of the proposed model against a Random Walk (RW), a Bayesian Autoregressive and a Bayesian Vector Autoregressive model. The proposed methodology outperforms all the competing models with half the error of the RW model with and without drift in out-of-sample forecasting. Finally, we argue that this new methodology can be used as an early warning system for forecasting sudden house price drops with direct policy implications.

Keywords: House prices
[403] Ping-Feng Pai and Wei-Chiang Hong. Forecasting regional electricity load based on recurrent support vector machines with genetic algorithms. Electric Power Systems Research, 74(3):417 - 425, 2005. [ bib | DOI | http ]
With the deregulation of the electricity industry, accurate forecasting of future electricity demand has come to play the most important role in regional or national power system strategy management. Electricity load forecasting is complex to conduct due to the nonlinearity of its influencing factors. Support vector machines (SVMs) have been successfully employed to solve nonlinear regression and time series problems. However, their application to load forecasting is rare. In this study, a recurrent support vector machine with genetic algorithms (RSVMG) model is proposed to forecast electricity load. In addition, genetic algorithms (GAs) are used to determine the free parameters of support vector machines. Subsequently, examples of electricity load data from Taiwan are used to illustrate the performance of the proposed RSVMG model. The empirical results reveal that the proposed model outperforms the SVM model, artificial neural network (ANN) model and regression model. Consequently, the RSVMG model provides a promising alternative for forecasting electricity load in the power industry.

Keywords: Recurrent neural networks (RNNs)
[404] Anna M.C. Prakash, Christopher M. Stellman, and Karl S. Booksh. Optical regression: a method for improving quantitative precision of multivariate prediction with single channel spectrometers. Chemometrics and Intelligent Laboratory Systems, 46(2):265 - 274, 1999. [ bib | DOI | http ]
'Optical regression' (OR) is presented as a method for improving the quantitative precision of scanning and filter wheel process analyzers. OR combines analog variable selection and optimization of signal to noise measurements under constrained total measurement time to maximize the precision of prediction in multivariate analysis. With optical regression, the regression vector is employed as a template to optimize the data collection time at each wavelength of the unknown spectra. Implicitly, this performs the dot product of the spectrum and regression vector by electronically integrating the signal of the detector instead of performing the mathematical operations in the computer following digitization of the spectrum. The theory of optical regression is developed and the expected precision of optical regression is shown to be superior to the expected precision of digital regression. This conclusion is supported by Monte Carlo simulations with three types of random errors. Further support is supplied by quantitation of three fluorescent dyes with a fiber optic fluorescence spectrometer.

Keywords: Multivariate calibration
[405] Yongqiao Wang, He Ni, and Shouyang Wang. Nonparametric bivariate copula estimation based on shape-restricted support vector regression. Knowledge-Based Systems, 35:235 - 244, 2012. [ bib | DOI | http ]
The copula has become a standard tool for describing dependence relations between random variables. This paper proposes a nonparametric bivariate copula estimation method based on shape-restricted ϵ-support vector regression (ϵ-SVR). This method explicitly supplements the classical ϵ-SVR with constraints related to three shape restrictions: grounded, marginal and 2-increasing, which are the necessary and sufficient conditions for a bivariate function to be a copula. This nonparametric method can be reformulated as a convex quadratic program, which is computationally tractable. Experiments on five artificial data sets and three international stock indexes clearly showed that it could achieve significantly better performance than common parametric models and kernel smoothers.

Keywords: Support vector regression
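
The three shape restrictions named in entry [405] have standard definitions for a function C on the unit square: grounded means C(u, 0) = C(0, v) = 0, the marginal condition means C(u, 1) = u and C(1, v) = v, and 2-increasing means C(u2, v2) - C(u2, v1) - C(u1, v2) + C(u1, v1) >= 0 whenever u1 <= u2 and v1 <= v2. The sketch below only checks these conditions on a finite grid. It does not reproduce the paper's constrained ϵ-SVR, and the helper name `is_copula_on_grid`, the grid size, and the tolerance are illustrative assumptions.

```python
def is_copula_on_grid(C, m=10, tol=1e-12):
    """Check the three copula shape restrictions for C on an (m+1) x (m+1) grid."""
    grid = [k / m for k in range(m + 1)]
    # Grounded: C vanishes when either argument is 0.
    grounded = all(
        abs(C(t, 0.0)) <= tol and abs(C(0.0, t)) <= tol for t in grid
    )
    # Marginal: C reduces to the identity when the other argument is 1.
    marginal = all(
        abs(C(t, 1.0) - t) <= tol and abs(C(1.0, t) - t) <= tol for t in grid
    )
    # 2-increasing: every grid cell has non-negative C-volume. Volumes of larger
    # grid rectangles are sums of cell volumes, so checking cells is enough.
    two_increasing = all(
        C(grid[i + 1], grid[j + 1]) - C(grid[i + 1], grid[j])
        - C(grid[i], grid[j + 1]) + C(grid[i], grid[j]) >= -tol
        for i in range(m)
        for j in range(m)
    )
    return grounded and marginal and two_increasing


# The independence copula C(u, v) = u * v satisfies all three restrictions.
independence_ok = is_copula_on_grid(lambda u, v: u * v)
# Hypothetical counterexample: u * v**2 breaks the marginal condition C(1, v) = v.
not_a_copula = is_copula_on_grid(lambda u, v: u * v * v)
```

A grid check of this kind can only reject a candidate; passing it does not prove the conditions hold everywhere on the unit square.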
[406] Jirong Gu, Mingcang Zhu, and Liuguangyan Jiang. Housing price forecasting based on genetic algorithm and support vector machine. Expert Systems with Applications, 38(4):3383 - 3386, 2011. [ bib | DOI | http ]
Accurate forecasting of future housing prices is very significant for socioeconomic development and national lives. In this study, a hybrid genetic algorithm and support vector machine (G-SVM) approach is presented for housing price forecasting. The support vector machine (SVM) has been proven to be a robust and competent algorithm for both classification and regression in many applications. However, how to select the most appropriate training parameter values is an important problem in using SVM. Compared to the Grid algorithm, the genetic algorithm (GA) method consumes less time and performs well. Thus, GA is applied to optimize the parameters of SVM simultaneously. Cases in China are used to test the housing price forecasting ability of the G-SVM method. The experimental results indicate that the forecasting accuracy of this G-SVM approach is superior to that of GM.

Keywords: Housing price
[407] Caihao Weng, Yujia Cui, Jing Sun, and Huei Peng. On-board state of health monitoring of lithium-ion batteries using incremental capacity analysis with support vector regression. Journal of Power Sources, 235:36 - 44, 2013. [ bib | DOI | http ]
Battery state of health (SOH) monitoring has become a crucial challenge in hybrid electric vehicles (HEVs) and all electric vehicles (EVs) research, as SOH significantly affects the overall vehicle performance and life cycle. In this paper, we focus on the identification of Li-ion battery capacity fading, as the loss of capacity and therefore the driving range is a primary concern for EV and plug-in HEV (PHEV). While most studies on battery capacity fading are based on laboratory measurement such as open circuit voltage (OCV) curve, few publications have focused on capacity loss monitoring during on-board operations. We propose a battery SOH monitoring scheme based on partial charging data. Through analysis of battery aging cycle data, a robust signature associated with battery aging is identified through incremental capacity analysis (ICA). Several algorithms to extract this signature are developed and evaluated for on-board SOH monitoring. The use of support vector regression (SVR) is shown to provide the most consistent identification results with moderate computational load. For battery cells tested, we show that the SVR model built upon the data from one single cell is able to predict the capacity fading of 7 other cells within 1% error bound.

Keywords: Electric vehicles
[408] Antonio Morell, Mahmoud Tarokh, and Leopoldo Acosta. Solving the forward kinematics problem in parallel robots using support vector regression. Engineering Applications of Artificial Intelligence, 26(7):1698 - 1706, 2013. [ bib | DOI | http ]
Abstract The Stewart platform, a representative of the class of parallel manipulators, has been successfully used in a wide variety of fields and industries, from medicine to automotive. Parallel robots have key benefits over serial structures regarding stability and positioning capability. At the same time, they present challenges and open problems which need to be addressed in order to take full advantage of their utility. In this paper, we propose a new approach for solving one of these key aspects: the solution to the forward kinematics in real-time, an under-defined problem with a high-degree nonlinear formulation, using a popular machine learning method for classification and regression, the Support Vector Machines. Instead of solving a numerical problem, the proposed method involves applying Support Vector Regression to model the behavior of a platform in a given region or partition of the pose space. It consists of two phases, an off-line preprocessing step and a fast on-line evaluation phase. The experiments made have yielded a good approximation to the analytical solution, and have shown its suitability for real-time application.

Keywords: Parallel robots
[409] Adem Ukte, Aydin Kizilkaya, and M. Dogan Elbi. Two empirical methods for improving the performance of statistical multirate high-resolution signal reconstruction. Digital Signal Processing, 26:36 - 49, 2014. [ bib | DOI | http ]
Abstract The problem of reconstructing a known high-resolution signal from a set of its low-resolution parts exposed to additive white Gaussian noise is addressed in this paper from the perspective of statistical multirate signal processing. To enhance the performance of the existing high-resolution signal reconstruction procedure that is based on using a set of linear periodically time-varying (LPTV) Wiener filter structures, we propose two empirical methods combining empirical mode decomposition- and least squares support vector machine regression-based noise reduction schemes with these filter structures. The methods originate from the idea of reducing the effects of white Gaussian noise present in the low-resolution observations before applying them directly to the LPTV Wiener filters. Performances of the proposed methods are evaluated over one-dimensional simulated signals and two-dimensional images. Simulation results show that, under certain conditions, considerable improvements have been achieved by the proposed methods when compared with the previous study that only uses a set of LPTV Wiener filter structures for the signal reconstruction process.

Keywords: Multirate signal processing
[410] Rosario Capparuccia, Renato De Leone, and Emilia Marchitto. Integrating support vector machines and neural networks. Neural Networks, 20(5):590 - 597, 2007. [ bib | DOI | http ]
Support vector machines (SVMs) are a powerful technique developed in the last decade to effectively tackle classification and regression problems. In this paper we describe how support vector machines and artificial neural networks can be integrated in order to classify objects correctly. This technique has been successfully applied to the problem of determining the quality of tiles. Using an optical reader system, some features are automatically extracted, then a subset of the features is determined and the tiles are classified based on this subset.

Keywords: Support vector machines
[411] Jui-Sheng Chou and Chih-Fong Tsai. Concrete compressive strength analysis using a combined classification and regression technique. Automation in Construction, 24:52 - 60, 2012. [ bib | DOI | http ]
High performance concrete (HPC) is a complex composite material, and a model of its compressive strength must be highly nonlinear. Many studies have tried to develop accurate and effective predictive models for HPC compressive strength, including linear regression (LR), artificial neural networks (ANNs), and support vector regression (SVR). Nevertheless, in accordance with recent reports that a hierarchical structure outperforms a flat one, this study proposes a hierarchical classification and regression (HCR) approach for improving performance in predicting HPC compressive strength. Specifically, the first-level analyses of the HCR find exact classes for new unknown cases. The cases are then entered into the corresponding prediction model to obtain the final output. The analytical results for a laboratory dataset show that the HCR approach outperforms conventional flat prediction models (LR, ANNs, and SVR). Notably, the HCR with a 4-class support vector machine in the first level combined with a single ANN obtains the lowest mean absolute percentage error.

Keywords: High performance concrete
[412] Jiu sheng Li and Xiang jun Li. Determination principal component content of seed oils by THz-TDS. Chemical Physics Letters, 476(1–3):92 - 96, 2009. [ bib | DOI | http ]
The terahertz transmission spectra of seed oils are measured in the frequency range extending from 0.2 to 1.4 THz using terahertz time-domain spectroscopy (THz-TDS). The absorption spectra of three acid compounds (octadecanoic acid, octadecenoic acid and octadecadienoic acid) in seed oils are recorded and simulated using both THz-TDS and density functional theory (DFT) methods. A support vector regression (SVR) model using the raw measured terahertz spectral data directly as input of the principal component is established and is employed to determine the content of the three acid compounds from the terahertz time-domain spectroscopy. Comparison of the experimental data obtained using liquid chromatography with predictions based on support vector regression exhibits excellent agreement.

[413] Taichun Qin, Shengkui Zeng, and Jianbin Guo. Robust prognostics for state of health estimation of lithium-ion batteries based on an improved PSO–SVR model. Microelectronics Reliability, pages -, 2015. [ bib | DOI | http ]
Abstract State of health (SOH) estimation of lithium-ion batteries is significant for safe and lifetime-optimized operation. In this study, support vector regression (SVR) is employed in battery SOH prognostics, and particle swarm optimization (PSO) is employed in obtaining the SVR kernel parameter. Through a new validation method, the proposed PSO–SVR model in this paper can well grasp the global degradation trend of SOH and is little affected by local regeneration and fluctuations. The case study shows that compared with the eight published methods, the proposed model can obtain more accurate SOH prediction results. Even when SOH prediction starts from the cycle near capacity regeneration, the proposed model can still grasp the global degradation trend. Furthermore, the improved PSO–SVR model has great robustness when the training data contain noise and measurement outliers, which makes it possible to get satisfactory prediction performance without pre-processing the data manually.

Keywords: Lithium-ion battery
[414] Arantza Gorostiaga and José Luis Rojo-Álvarez. On the use of conventional and statistical-learning techniques for the analysis of PISA results in Spain. Neurocomputing, pages -, 2015. [ bib | DOI | http ]
Abstract A simple and general feature extraction procedure is presented which provides robust nonparametric estimates on the statistical relevance of data features, by computing the confidence intervals for the model weights in the case of linear models, and for the change in the error rate when removing each feature in the case of nonlinear models. The method performance is specially scrutinized for the prediction of the 2009 PISA scores of the Spanish students. We compare the ability of logistic regression, Fisher linear discriminant analysis, and Support Vector Machine (SVM, both with linear and with nonlinear kernel), to classify top performers in the mathematics exam. All the methods yield similar accuracy, with linear and nonlinear SVM providing improved feature reduction capabilities, at the expense of computational complexity. The results show relevant relationships of the success rate with regional variables, computer availability, gender, immigration status, learning strategies, and some others. The proposed feature selection procedure for machine learning classification can be readily used in other fields, and it can be improved with further theoretical and probabilistic development.

Keywords: Large Surveys Analysis
[415] Ma Liyong, Shen Yi, and Ma Jiachen. Local spatial properties based image interpolation scheme using SVMs. Journal of Systems Engineering and Electronics, 19(3):618 - 623, 2008. [ bib | DOI | http ]
Image interpolation plays an important role in image processing applications. A novel support vector machines (SVMs) based interpolation scheme is proposed that adds the local spatial properties of the source image to the SVMs input patterns. After the proper neighbor pixels region is selected, trained support vectors are obtained by training SVMs with local spatial properties that include the average of the neighbor pixels gray values and the gray value variations between neighbor pixels in the selected region. The support vector regression machines are employed to estimate the gray values of unknown pixels with the neighbor pixels and local spatial properties information. Some interpolation experiments show that the proposed scheme is superior to the linear, cubic, neural network and other SVMs based interpolation approaches.

Keywords: image processing
[416] William Ford and Walker Land. A latent space support vector machine (LSSVM) model for cancer prognosis. Procedia Computer Science, 36:470 - 475, 2014. Complex Adaptive Systems Philadelphia, PA November 3-5, 2014. [ bib | DOI | http ]
Abstract Gene expression microarray analysis is a rapid, low cost method of analyzing gene expression profiles for cancer prognosis/diagnosis. Microarray data generated from oncological studies typically contain thousands of expression values with few cases. Traditional regression and classification methods require first reducing the number of dimensions via statistical or heuristic methods. Partial Least Squares (PLS) is a dimensionality reduction method that builds a least squares regression model in a reduced dimensional space. It is well known that Support Vector Machines (SVM) outperform least squares regression models. In this study, we replace the PLS least squares model with an SVM model in the PLS reduced dimensional space. To verify our method, we build upon our previous work with a publicly available data set from the Gene Expression Omnibus database containing gene expression levels, clinical data, and survival times for patients with non-small cell lung carcinoma. Using 5-fold cross validation, and Receiver Operating Characteristic (ROC) analysis, we show a comparison of classifier performance between the traditional PLS model and the PLS/SVM hybrid. Our results show that by replacing least squares regression with SVM, we increase the quality of the model as measured by the area under the ROC curve.

Keywords: Machine Learning
[417] M. Mohammadi, M. Raoofat, H. Marzooghi, and G.B. Gharehpetian. Nonlinear multivariable modeling of solid oxide fuel cells using core vector regression. International Journal of Hydrogen Energy, 36(19):12538 - 12548, 2011. [ bib | DOI | http ]
This paper presents new steady-state and dynamic models for solid oxide fuel cells (SOFCs) using core vector regression (CVR). So far, most conventional SOFC models have been presented based on conversion laws. Due to the complex mathematical equations used in these models, they are time-consuming and need a large amount of memory to be applied for controller design, especially power electronic interface controller design, generation and load predictions, optimization and other studies. To overcome these problems, some black-box models, such as support vector machine (SVM) and artificial neural network (ANN)-based models, have also been proposed for SOFCs. In this paper, in order to model the nonlinear multivariable behavior of SOFCs, two CVR-based black-box models are proposed for each operation mode, one for steady-state and the other for dynamic modeling. The proposed models are trained in very little time and need a small amount of memory in comparison with existing black-box models. This is due to the use of fewer support vectors (SVs). In order to demonstrate the efficacy of the proposed models, they are applied to a 5-kW SOFC stack. Simulation results illustrate the effectiveness of the proposed models for both steady-state and dynamic studies.

Keywords: Solid oxide fuel cell
[418] Mingming Zhang and Xinggao Liu. A soft sensor based on adaptive fuzzy neural network and support vector regression for industrial melt index prediction. Chemometrics and Intelligent Laboratory Systems, 126:83 - 90, 2013. [ bib | DOI | http ]
Abstract An adaptive soft sensor for online monitoring of melt index (MI), an important variable determining the product quality in the industrial propylene polymerization (PP) process, is proposed, where a fuzzy neural network (FNN) serves as the basic model for its powerful nonlinear approximation ability as a machine learning method. However, considering the difficulty of structure determination of the FNN, an adaptive fuzzy neural network (A-FNN) is subsequently developed to determine the number of fuzzy rules, where a novel adaptive method dynamically changes the structure of the model by the predefined thresholds. Furthermore, in order to get better generalization ability of the soft sensor, support vector regression (SVR) is introduced for parameter tuning, where the output function is transformed into an SVR based optimization problem. The online soft sensor is also carried out on a real industrial PP plant as illustration, where the soft sensors including the SVR, FNN–SVR and A-FNN–SVR models are compared in detail. The research results show that the proposed soft sensor achieves a good performance in the industrial MI prediction process.

Keywords: Soft sensor
[419] Elina Kontio, Antti Airola, Tapio Pahikkala, Heljä Lundgren-Laine, Kristiina Junttila, Heikki Korvenranta, Tapio Salakoski, and Sanna Salanterä. Predicting patient acuity from electronic patient records. Journal of Biomedical Informatics, 51:35 - 40, 2014. [ bib | DOI | http ]
Abstract Background The ability to predict acuity (patients’ care needs) would provide a powerful tool for health care managers to allocate resources. Such estimations and predictions for the care process can be produced from the vast amounts of healthcare data using information technology and computational intelligence techniques. Tactical decision-making and resource allocation may also be supported with different mathematical optimization models. Methods This study was conducted with a data set comprising electronic nursing narratives and the associated Oulu Patient Classification (OPCq) acuity. A mathematical model for the automated assignment of patient acuity scores was utilized and evaluated with the pre-processed data from 23,528 electronic patient records. The methods to predict patient’s acuity were based on linguistic pre-processing, vector-space text modeling, and regularized least-squares regression. Results The experimental results show that it is possible to obtain accurate predictions about patient acuity scores for the coming day based on the assigned scores and nursing notes from the previous day. Making same-day predictions leads to even better results, as access to the nursing notes for the same day boosts the predictive performance. Furthermore, textual nursing notes allow for more accurate predictions than previous acuity scores. The best results are achieved by combining both of these information sources. The developed model achieves a concordance index of 0.821 when predicting the patient acuity scores for the following day, given the scores and text recorded on the previous day. Conclusions By applying language technology to electronic patient documents it is possible to accurately predict the value of the acuity scores of the coming day based on the previous day’s assigned scores and nursing notes.

Keywords: Patient acuity
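
Entry [419] reports its result as a concordance index. The abstract does not spell out the exact variant used, so the sketch below assumes the common pairwise definition: among all pairs of records with different true scores, the fraction whose predictions are ordered the same way, with tied predictions counted as one half. The function name and the toy scores are hypothetical.

```python
from itertools import combinations


def concordance_index(y_true, y_pred):
    """Pairwise concordance between true scores and predicted scores."""
    concordant = 0.0
    comparable = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        if t1 == t2:
            continue  # equal true scores carry no ordering information
        comparable += 1
        if p1 == p2:
            concordant += 0.5  # tied predictions count as half (assumed convention)
        elif (p1 < p2) == (t1 < t2):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("need at least one pair with different true values")
    return concordant / comparable


# Hypothetical acuity scores and predictions, not data from the study.
true_scores = [1, 2, 3, 4]
predicted = [1.2, 1.9, 3.5, 3.0]
ci = concordance_index(true_scores, predicted)
```

Under this definition a value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking, which gives some context for the 0.821 reported in the abstract.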
[420] Najeebullah, Aneela Zameer, Asifullah Khan, and Syed Gibran Javed. Machine learning based short term wind power prediction using a hybrid learning model. Computers & Electrical Engineering, pages -, 2014. [ bib | DOI | http ]
Abstract Depletion of conventional resources has led to the exploration of renewable energy resources. In this regard, wind power is taking on significant importance worldwide. However, to acquire consistent power generation from wind, the expected wind power is required in advance. Consequently, various prediction models have been reported for wind power prediction. However, we observe that Support Vector Regression (SVR), and especially a hybrid learning model based on SVR, offers better performance and generalization compared to multiple linear regression (MLR) and is thus quite suitable for the development of a short-term wind power prediction system. To this end, a new methodology, ML-STWP (Machine Learning based Short Term Wind Power Prediction), is proposed for short-term wind power prediction. This approach utilizes a combination of machine learning (ML) techniques for feature selection and regression. The proposed methodology is thus a hybrid ML model, which makes use of feature selection through irrelevancy and redundancy filters, and then employs SVR for auxiliary prediction. Finally, the wind power is predicted using enhanced particle swarm optimization and a hybrid neural network. The wind power dataset on which the model is tuned and tested consists of real-time daily values of wind speed, relative humidity, temperature, and wind power. The obtained results demonstrate that the proposed prediction model performs better than the existing methods and demonstrate the efficacy of the proposed intelligent system in accurately predicting wind power on a daily basis.

[421] Emad A. El-Sebakhy. Forecasting PVT properties of crude oil systems based on support vector machines modeling scheme. Journal of Petroleum Science and Engineering, 64(1–4):25 - 34, 2009. [ bib | DOI | http ]
PVT properties are very important in reservoir engineering computations. There are numerous approaches for predicting various PVT properties, namely, empirical correlations and computational intelligence schemes. The achievements of neural networks open the door for data mining modeling techniques to play a major role in the petroleum industry. Unfortunately, the developed neural network modeling schemes have many drawbacks and limitations, as they were originally developed for certain ranges of reservoir fluid characteristics. This article proposes support vector machines as a new intelligence framework for predicting the PVT properties of crude oil systems and solving most of the existing neural network drawbacks. Both steps and training algorithms are briefly illustrated. A comparative study is carried out to compare the performance of support vector machine regression with that of neural networks, nonlinear regression, and different empirical correlation techniques. Results show that the performance of support vector machines is accurate, reliable, and better than that of most of the published correlations. This suggests a bright future for support vector machine modeling, and we recommend it for solving other oil and gas industry problems, such as permeability and porosity prediction, identification of liquid-holdup flow regimes, and other reservoir characterization.

Keywords: Support Vector Machines Regression
[422] Wang Xiufeng, Zhang Lei, Huang Rongbo, Wu Qinghua, Min Jianxin, Ma Na, and Luo Laicheng. Regulatory mechanism of hormones of the pituitary-target gland axes in kidney-yang deficiency based on a support vector machine model. Journal of Traditional Chinese Medicine, 35(2):238 - 243, 2015. [ bib | DOI | http ]
Abstract Objective To study the development mechanism of kidney-Yang deficiency through the establishment of support vector machine models of relevant hormones of the pituitary-target gland axes in rats with kidney-Yang deficiency syndrome. Methods The kidney-Yang deficiency rat model was created by intramuscular injection of hydrocortisone, and contents of the hormones of the pituitary-thyroid axis: thyroid stimulating hormone (TSH), 3,3',5-triiodothyronine (T3) and thyroxine (T4); hormones of the pituitary-adrenal gland axis: adrenocorticotropic hormone (ACTH) and cortisol (CORT); and hormones of the pituitary-gonadal axis: luteinizing hormone (LH), follicle-stimulating hormone (FSH), and testosterone (T), were determined in the early, middle, and advanced stages. Ten support vector regression (SVR) models of the hormones were established to analyze the mutual relationships among the hormones of the three axes. Results The feedback control action of the pituitary-adrenal axis began to lose efficacy from the middle stage of kidney-Yang deficiency. The contents of all hormones of the three pituitary-target gland axes decreased in the advanced stage. Relative errors of the jackknife test of the SVR models were all less than 10%. Conclusion Imbalances in mutual regulation among the hormones of the pituitary-target gland axes, especially loss of effectiveness of the pituitary-adrenal axis, are one pathogenesis of kidney-Yang deficiency. The SVR model can accurately reflect the complicated non-linear relationships among pituitary-target gland axes in rats with kidney-Yang deficiency.

Keywords: Kidney Yang deficiency
[423] Isis Didier Lins, Enrique López Droguett, Márcio das Chagas Moura, Enrico Zio, and Carlos Magno Jacinto. Computing confidence and prediction intervals of industrial equipment degradation by bootstrapped support vector regression. Reliability Engineering & System Safety, 137:120 - 128, 2015. [ bib | DOI | http ]
Abstract Data-driven learning methods for predicting the evolution of the degradation processes affecting equipment are becoming increasingly attractive in reliability and prognostics applications. Among these, we consider here Support Vector Regression (SVR), which has provided promising results in various applications. Nevertheless, the predictions provided by SVR are point estimates whereas in order to take better informed decisions, an uncertainty assessment should also be carried out. For this, we apply bootstrap to SVR so as to obtain confidence and prediction intervals, without having to make any assumption about probability distributions and with good performance even when only a small data set is available. The bootstrapped SVR is first verified on Monte Carlo experiments and then is applied to a real case study concerning the prediction of degradation of a component from the offshore oil industry. The results obtained indicate that the bootstrapped SVR is a promising tool for providing reliable point and interval estimates, which can inform maintenance-related decisions on degrading components.
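
As a rough illustration of the bootstrap idea summarized in this abstract, the sketch below resamples training pairs with replacement, refits a model on each resample, and takes percentiles of the resulting predictions. It is not the authors' procedure: an ordinary least-squares line stands in for SVR so that the example needs only the Python standard library, the function names are hypothetical, and the result is a percentile confidence interval for the fitted value only (a prediction interval would also need the residual noise, which is omitted here).

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares line; an assumed stand-in for the SVR learner."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None  # degenerate resample: every x is identical
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_interval(xs, ys, x_new, n_boot=200, alpha=0.10, seed=0):
    """Percentile bootstrap confidence interval for the fitted value at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        # Resample (x, y) pairs with replacement and refit the model.
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        if model is None:
            continue  # skip resamples on which the line is undefined
        slope, intercept = model
        preds.append(slope * x_new + intercept)
    preds.sort()
    lower = preds[int((alpha / 2) * len(preds))]
    upper = preds[min(len(preds) - 1, int((1 - alpha / 2) * len(preds)))]
    return lower, upper


# Hypothetical degradation-like data: a linear trend plus Gaussian noise.
noise = random.Random(1)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 0.5) for x in xs]
low, high = bootstrap_interval(xs, ys, x_new=10.0)
print(f"90% bootstrap interval at x=10: [{low:.2f}, {high:.2f}]")
```

Swapping fit_line for an actual SVR fit would leave the resampling loop unchanged; only the model-fitting call differs.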

Keywords: Degradation
[424] Zhenbo Wei and Jun Wang. The evaluation of sugar content and firmness of non-climacteric pears based on voltammetric electronic tongue. Journal of Food Engineering, 117(1):158 - 164, 2013. [ bib | DOI | http ]
The sugar content and firmness of non-climacteric pears of different cultivars were studied by a voltammetric electronic tongue (VE-tongue). The VE-tongue, self-developed in this study, comprised six working electrodes (gold, silver, platinum, palladium, tungsten, and titanium electrodes), an Ag/AgCl reference electrode, and a platinum auxiliary electrode. Multi-frequency large amplitude pulse voltammetry (MLAPV) was applied to the working electrodes as the scanning potential waveform, and it consisted of four frequency segments of 1 Hz, 10 Hz, 100 Hz, and 1000 Hz. In this study, five cultivars of pear from different geographical origins were tested by the VE-tongue, and the firmness and sugar content of the pears were tested by the traditional methods. The characteristic data (the maximum and minimum values) obtained by the VE-tongue were compressed by principal component analysis (PCA), and the principal components (PCs) were taken as the input variables of principal component regression (PCR), partial least squares regression (PLSR), and least squares support vector machines (LS-SVMs) to predict sugar content and firmness. All the models showed good results, and LS-SVM performed best in the prediction.

Keywords: Magness-Taylor technique
[425] Xiaowei Yang, Liangjun Tan, and Lifang He. A robust least squares support vector machine for regression and classification with noise. Neurocomputing, 140:41 - 52, 2014. [ bib | DOI | http ]
Abstract Least squares support vector machines (LS-SVMs) are sensitive to outliers or noise in the training dataset. Weighted least squares support vector machines (WLS-SVMs) can partly overcome this shortcoming by assigning different weights to different training samples. However, it is a difficult task for WLS-SVMs to set the weights of the training samples, which greatly influences the robustness of WLS-SVMs. In order to avoid setting weights, in this paper, a novel robust LS-SVM (RLS-SVM) is presented based on the truncated least squares loss function for regression and classification with noise. Based on its equivalent model, we theoretically analyze the reason why the robustness of RLS-SVM is higher than that of LS-SVMs and WLS-SVMs. In order to solve the proposed RLS-SVM, we propose an iterative algorithm based on the concave–convex procedure (CCCP) and the Newton algorithm. The statistical tests of the experimental results conducted on fourteen benchmark regression datasets and ten benchmark classification datasets show that compared with LS-SVMs, WLS-SVMs and iteratively reweighted LS-SVM (IRLS-SVM), the proposed RLS-SVM significantly reduces the effect of the noise in the training dataset and provides superior robustness.
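
To make the robustness argument in this abstract concrete, here is a minimal sketch of a truncated least squares loss next to the ordinary squared loss. The abstract does not state the exact parameterization, so capping the loss at tau**2 is an assumption, and the function names and the threshold value are hypothetical.

```python
def squared_loss(residual):
    """Ordinary least squares loss: grows without bound with the residual."""
    return residual ** 2


def truncated_squared_loss(residual, tau):
    """Squared loss capped at tau**2 (assumed form of the truncation)."""
    return min(residual ** 2, tau ** 2)


# A single gross outlier dominates the plain squared loss,
# while its contribution to the truncated loss is bounded.
residuals = [0.1, -0.3, 0.2, 10.0]
print(sum(squared_loss(r) for r in residuals))
print(sum(truncated_squared_loss(r, tau=1.0) for r in residuals))
```

Because the capped loss is no longer convex, it cannot be minimized by a single linear solve, which is consistent with the abstract's use of an iterative concave–convex procedure.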

Keywords: Least squares support vector machines
[426] Peter C. Austin, Jack V. Tu, Jennifer E. Ho, Daniel Levy, and Douglas S. Lee. Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes. Journal of Clinical Epidemiology, 66(4):398 - 407, 2013. [ bib | DOI | http ]
Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine-learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study Design and Setting We compared the performance of these classification methods with that of conventional classification trees to classify patients with heart failure (HF) according to the following subtypes: HF with preserved ejection fraction (HFPEF) and HF with reduced ejection fraction. We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data-mining literature offer substantial improvement in prediction and classification of HF subtype compared with conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared with the methods proposed in the data-mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying HF subtypes in a population-based sample of patients from Ontario, Canada. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF.

Keywords: Boosting
[427] F. Sánchez Lasheras, P.J. García Nieto, F.J. de Cos Juez, and J.A. Vilán Vilán. Evolutionary support vector regression algorithm applied to the prediction of the thickness of the chromium layer in a hard chromium plating process. Applied Mathematics and Computation, 227:164 - 170, 2014. [ bib | DOI | http ]
Abstract The hard chromium plating process aims at creating a coating of hard and wear-resistant chromium with a thickness of some microns directly on the metal part, without the insertion of copper or nickel layers. It is one of the most difficult electroplating processes due to the influence of the hydrogen evolution that occurs on the cathode surface simultaneously with the chromium deposition. Chromium plating is characterized by high levels of hardness and resistance to wear, and it is thanks to these properties that it can be applied in a huge range of sectors. Resistance to corrosion of a hard chromium plate depends on the thickness of the coating, adherence and micro-fissures of the latter. This micro-fissured structure is what provides the optimal hardness of the layers. The electro-deposited chromium layer is not uniformly distributed: there are zones such as sharp edges or points where deposits are highly accentuated, while deposits are virtually nonexistent in holes or in the undercuts. The hard chromium plating process is one of the most effective ways of protecting the base material in a hostile environment or improving surface properties of the base material. However, in the electroplating industry, electro-platers are faced with many problems and often achieve undesirable results on chromium-plated materials. Problems such as matt deposition, milky white chromium deposition, rough or sandy chromium deposition and insufficient thickness or hardness are the most common problems faced in the electroplating industry. Finally, it must be remarked that defects in the coating locally lower the corrosion resistance of the layer and that the decomposition of chromium hydrides causes the formation of a network of cracks in the coating. This innovative research work uses an evolutionary support vector regression algorithm for the prediction of the thickness of the chromium layer in a hard chromium plating process.
Evolutionary support vector machines (ESVMs) are a novel technique that assimilates the learning engine of the state-of-the-art support vector machines (SVMs) but evolves the coefficients of the decision function by means of evolutionary algorithms (EAs). In this sense, the current research is focused on the estimation of the hyper-parameters required for the support vector machines technique for regression (SVR), by means of evolutionary strategies. The results are briefly compared with those obtained by the authors in a previous paper, where a model based on an artificial neural network was tuned using the design of experiments (DOE).

Keywords: Hard chromium plating process
[428] Changyi Park. Convergence rates of generalization errors for margin-based classification. Journal of Statistical Planning and Inference, 139(8):2543 - 2551, 2009. [ bib | DOI | http ]
This paper develops a general approach to quantifying the size of generalization errors for margin-based classification. A trade-off between geometric margins and training errors is exhibited along with the complexity of a binary classification problem. Consequently, this results in dealing with learning theory in a broader framework, in particular, of handling both convex and non-convex margin classifiers, including support vector machines, kernel logistic regression, and ψ-learning. Examples for both linear and nonlinear classifications are provided.

Keywords: Classification
[429] Gang Dong, Kin Keung Lai, and Jerome Yen. Credit scorecard based on logistic regression with random coefficients. Procedia Computer Science, 1(1):2463 - 2468, 2010. ICCS 2010. [ bib | DOI | http ]
Many credit scoring techniques have been used to build credit scorecards. Among them, the logistic regression model is the most commonly used in the banking industry due to its desirable features (e.g., robustness and transparency). Although some new techniques (e.g., support vector machine) have been applied to credit scoring and shown superior prediction accuracy, they have problems with the interpretability of their results. Therefore, these advanced techniques have not been widely applied in practice. To improve the prediction accuracy of logistic regression, logistic regression with random coefficients is proposed. The proposed model can improve the prediction accuracy of logistic regression without sacrificing its desirable features. It is expected that the proposed credit scorecard building method can contribute to effective management of credit risk in practice.

Keywords: Credit scorecard
[430] C. Dai, Y.P. Li, and G.H. Huang. A two-stage support-vector-regression optimization model for municipal solid waste management – a case study of Beijing, China. Journal of Environmental Management, 92(12):3023 - 3037, 2011. [ bib | DOI | http ]
In this study, a two-stage support-vector-regression optimization model (TSOM) is developed for the planning of municipal solid waste (MSW) management in the urban districts of Beijing, China. It represents a new effort to enhance the analysis accuracy in optimizing the MSW management system through coupling the support-vector-regression (SVR) model with an interval-parameter mixed integer linear programming (IMILP). The developed TSOM can not only predict the city’s future waste generation amount, but also reflect dynamic, interactive, and uncertain characteristics of the MSW management system. Four kernel functions, namely the linear kernel, polynomial kernel, radial basis function, and multi-layer perceptron kernel, are chosen based on three quantitative simulation performance criteria [i.e. prediction accuracy (PA), fitting accuracy (FA) and overall accuracy (OA)]. The SVR with polynomial kernel has accurate prediction performance for MSW generation rate, with all of the three quantitative simulation performance criteria being over 96%. Two cases are considered based on different waste management policies. The results are valuable for supporting the adjustment of the existing waste-allocation patterns to raise the city’s waste diversion rate, as well as the capacity planning of waste management system to satisfy the city’s increasing waste treatment/disposal demands.
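
For readers unfamiliar with the four kernel families named in this abstract, the sketch below writes them out in their common textbook forms. The parameter names and default values (degree, coef0, gamma, scale, offset) are illustrative assumptions and are not taken from the paper; the multi-layer perceptron kernel is assumed to be the usual tanh form.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    return (dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    # Radial basis function: decays with squared Euclidean distance.
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def mlp_kernel(u, v, scale=0.1, offset=0.0):
    # Multi-layer perceptron (sigmoid) kernel in its common tanh form.
    return math.tanh(scale * dot(u, v) + offset)


u, v = [1.0, 2.0], [3.0, 4.0]
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, mlp_kernel):
    print(kernel.__name__, kernel(u, v))
```

In a study like the one summarized above, each kernel would be plugged into the same SVR and compared on held-out accuracy; the comparison criteria themselves are specific to the paper and are not reproduced here.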

Keywords: Support-vector-regression
[431] Abdul Majid, Asifullah Khan, and Tae-Sun Choi. Predicting lattice constant of complex cubic perovskites using computational intelligence. Computational Materials Science, 50(6):1879 - 1888, 2011. [ bib | DOI | http ]
Recently in the field of materials science, advanced computational intelligence (CI) based approaches are gaining substantial importance for modeling the quantitative structure to properties relationship. In this study, we have used support vector regression, random forest, generalized regression neural network, and multiple linear regression based CI approaches to predict lattice constants (LCs) of complex cubic perovskites. We have collected a reasonable number of perovskite compounds from the recent literature of materials science. The CI models are developed using 100 training compounds and the generalized performance is estimated for the novel 97 compounds. Our analysis highlights the improved prediction performance of CI approaches compared with the well-known SPuDS software, which is extensively used in crystallography. We further observed that, for some of the compounds, the larger prediction error provided by the CI models is correlated with the structure deviation of the compounds from their ideal cubic symmetry.

Keywords: Support vector regression
[432] X.X. Wang, S. Chen, D. Lowe, and C.J. Harris. Sparse support vector regression based on orthogonal forward selection for the generalised kernel model. Neurocomputing, 70(1–3):462 - 474, 2006. Neural Networks: Selected Papers from the 7th Brazilian Symposium on Neural Networks (SBRN '04). [ bib | DOI | http ]
This paper considers sparse regression modelling using a generalised kernel model in which each kernel regressor has its individually tuned centre vector and diagonal covariance matrix. An orthogonal least squares forward selection procedure is employed to select the regressors one by one, so as to determine the model structure. After the regressor selection, the corresponding model weight parameters are calculated from the Lagrange dual problem of the original regression problem with the regularised ε-insensitive loss function. Unlike the support vector regression, this stage of the procedure involves neither reproducing kernel Hilbert space nor Mercer decomposition concepts. As the regressors used are not restricted to be positioned at training input points and each regressor has its own diagonal covariance matrix, sparser representation can be obtained. Experiments involving one simulated example and three real data sets are used to demonstrate the effectiveness of the proposed novel regression modelling approach.

Keywords: Generalised kernel model
[433] Paulo R. Filgueiras, Cristina M.S. Sad, Alexandre R. Loureiro, Maria F.P. Santos, Eustáquio V.R. Castro, Júlio C.M. Dias, and Ronei J. Poppi. Determination of API gravity, kinematic viscosity and water content in petroleum by ATR-FTIR spectroscopy and multivariate calibration. Fuel, 116:123 - 130, 2014. [ bib | DOI | http ]
Abstract In this work, API gravity, kinematic viscosity and water content were determined in petroleum oil using Fourier transform infrared spectroscopy with attenuated total reflectance (FT-IR/ATR). Support vector regression (SVR) was used as the non-linear multivariate calibration procedure and partial least squares regression (PLS) as the linear procedure. In SVR models, the multiplication of the spectra matrix by support vectors resulted in information about the importance of the original variables. The most important variables in PLS models were attained by regression coefficients. For API gravity and kinematic viscosity these variables correspond to vibrations around 2900 cm−1, 1450 cm−1 and below 720 cm−1 and for water content, between 3200 and 3650 cm−1, around 1650 cm−1 and below 900 cm−1. The SVR model produced a root mean square error of prediction (RMSEP) of 0.25 for API gravity, 22 mm2 s−1 for kinematic viscosity and 0.26% v/v for water content. For PLS models, the RMSEP values were 0.38 for API gravity, 27 mm2 s−1 for kinematic viscosity and 0.34% for water content. Using the F-test at the 95% confidence level it was concluded that the SVR model produced better results than PLS for API gravity determination. For kinematic viscosity and water content the two methods were equivalent. However, a non-linear behavior in the PLS kinematic viscosity model was observed.

Keywords: Crude oil
[434] G.Y. Chen and G. Dudek. Auto-correlation wavelet support vector machine. Image and Vision Computing, 27(8):1040 - 1046, 2009. [ bib | DOI | http ]
A support vector machine (SVM) with the auto-correlation of a compactly supported wavelet as a kernel is proposed in this paper. The authors prove that this kernel is an admissible support vector kernel. The main advantage of the auto-correlation of a compactly supported wavelet is that it satisfies the translation invariance property, which is very important for its use in signal processing. Also, we can choose a better wavelet by selecting from different wavelet families for our auto-correlation wavelet kernel. This is because for different applications we should choose wavelet filters selectively for the autocorrelation kernel. We should not always select the same wavelet filters independent of the application, as we demonstrate. Experiments on signal regression and pattern recognition show that this kernel is a feasible kernel for practical applications.

Keywords: Wavelets
[435] Álvaro Barbero and José R. Dorronsoro. Cycle-breaking acceleration for support vector regression. Neurocomputing, 74(16):2649 - 2656, 2011. Advances in Extreme Learning Machine: Theory and Applications. Biological Inspired Systems. Computational and Ambient Intelligence. Selected papers of the 10th International Work-Conference on Artificial Neural Networks (IWANN2009). [ bib | DOI | http ]
Support vector regression (SVR) is a powerful tool in modeling and prediction tasks with widespread application in many areas. The most representative algorithms to train SVR models are Shevade et al.'s Modification 2 and Lin's WSS1 and WSS2 methods in the LIBSVM library. Both are variants of standard SMO in which the updating pairs selected are those that most violate the Karush–Kuhn–Tucker optimality conditions, to which LIBSVM adds a heuristic to improve the decrease in the objective function. In this paper, and after presenting a simple derivation of the updating procedure based on a greedy maximization of the gain in the objective function, we show how cycle-breaking techniques that accelerate the convergence of support vector machines (SVM) in classification can also be applied under this framework, resulting in significantly improved training times for SVR.

Keywords: Pattern recognition
[436] Leonardo Ramirez-Lopez, Thorsten Behrens, Karsten Schmidt, Antoine Stevens, Jose Alexandre M. Demattê, and Thomas Scholten. The spectrum-based learner: A new local approach for modeling soil vis–nir spectra of complex datasets. Geoderma, 195–196:268 - 279, 2013. [ bib | DOI | http ]
Abstract This paper shows that memory-based learning (MBL) is a very promising approach to deal with complex soil visible and near infrared (vis–NIR) datasets. The main goal of this work was to develop a suitable MBL approach for soil spectroscopy. Here we introduce the spectrum-based learner (SBL), which is equipped with an optimized principal components distance (oPC-M) and a Gaussian process regression. Furthermore, this approach combines local distance matrices and the spectral features as predictor variables. Our SBL was tested in two soil spectral libraries: a regional soil vis–NIR library of the State of São Paulo (Brazil) and a global soil vis–NIR library. We calibrated models of clay content (CC), organic carbon (OC) and exchangeable Ca (Ca++). In order to compare the predictive performance of our SBL with other approaches, the following algorithms were used: partial least squares (PLS) regression, support vector regression machines (SVM), locally weighted PLS regression (LWR) and LOCAL. In all cases our SBL algorithm outperformed the remaining algorithms in accuracy. Here we show that the SBL presents great potential for predicting soil attributes in large and diverse vis–NIR datasets. In addition we also show that soil vis–NIR distance matrices can be used to further improve the prediction performance of spectral models.

Keywords: Soil similarity
[437] Ahmed Chacón Iznaga, Miguel Rodríguez Orozco, Edith Aguila Alcantara, Meilyn Carral Pairol, Yanet Eddith Díaz Sicilia, Josse de Baerdemaeker, and Wouter Saeys. Vis/NIR spectroscopic measurement of selected soil fertility parameters of Cuban agricultural cambisols. Biosystems Engineering, 125:105 - 121, 2014. [ bib | DOI | http ]
The conventional methods frequently used in Cuba to determine some fertility parameters important for sugarcane production, such as organic matter (OM), available phosphorus (P) and potassium (K2O), are difficult, costly, and time-consuming procedures. This study was undertaken to build and validate Visible/Near Infrared Reflectance (Vis/NIR) calibration models of these parameters at landscape level and within a field, by taking into consideration their correlation coefficients with the OM. The parameters P and K2O, which are not spectrally active in the Vis/NIR range, should be better predicted when they are highly correlated with OM. Also, the wavelength intervals to simplify this methodology were selected. Samples were air-dried before scanning using a diode array spectrophotometer covering the wavelength range from 399 to 1697 nm. The regression models were built by using the linear multivariate regression method Partial Least Squares (PLS), and the nonlinear multivariate regression methods Support Vector Machines (SVM) and Locally Weighted Regression (LWR). At landscape level the best correlations between soil spectra and OM (0.90 ≤ R2 ≤ 0.93; 0.12 ≤ RMSEP ≤ 0.14) were obtained with LWR, followed by K2O with LWR (0.77 ≤ R2 ≤ 0.79; 3.47 ≤ RMSEP ≤ 3.62), Olsen P (0.69 ≤ R2 ≤ 0.81; 0.27 ≤ RMSEP ≤ 0.35) and Oniani P (0.64 ≤ R2 ≤ 0.65; 3.31 ≤ RMSEP ≤ 3.61) both with SVM. Also, the nonlinear regression models gave the best results within a field. The higher values for OM (R2 = 0.92; RMSEP = 0.14) and Olsen P (0.68 ≤ R2 ≤ 0.83; 0.27 ≤ RMSEP ≤ 0.34) were observed with SVM, while those for K2O (0.16 ≤ R2 ≤ 0.63; 5.13 ≤ RMSEP ≤ 5.88) and Oniani P (0.70 ≤ R2 ≤ 0.72; 2.32 ≤ RMSEP ≤ 2.52) were obtained with LWR. The soil fertility parameters studied at landscape level and within a field were best estimated by using nonlinear regression models.

Keywords: Soil fertility parameters
[438] Mohammad Ali Ahmadi, Mohammad Ebadi, Payam Soleimani Marghmaleki, and Mohammad Mahboubi Fouladi. Evolving predictive model to determine condensate-to-gas ratio in retrograded condensate gas reservoirs. Fuel, 124:241 - 257, 2014. [ bib | DOI | http ]
Abstract Added value to project economy from condensate sales and gas deliverability loss due to condensate blockage are the distinctive differences between gas condensate and dry gas reservoirs. To estimate the added value, one needs to obtain the condensate-to-gas ratio (CGR); however, this needs a special pressure–volume–temperature (PVT) experimental study and field tests. In the absence of experimental studies during the early period of field exploration, techniques which correlate such a parameter would be of interest for engineers. In this work, a model based on a new intelligent scheme known as the “least square support vector machine (LSSVM)” is developed to monitor the CGR in retrograde condensate gas reservoirs. Laboratory data from Iranian oil fields, together with data reported in the literature, were used to develop and test this approach. The results generated by the LSSVM model were compared with the real data and with the results of conventional correlation and fuzzy logic models. The comparison shows that the least square support vector machine model estimates the condensate-to-gas ratio more accurately than the conventionally applied approaches. It is worth mentioning that the least square support vector machine does not have conceptual problems such as over-fitting, while artificial neural networks suffer from many local minima solutions. Outcomes of this research could be coupled with commercial production software for condensate gas reservoirs for different goals such as production optimization and facility design.

Keywords: Condensate gas
[439] Saeid Shokri, Mohammad Taghi Sadeghi, and Mahdi Ahmadi Marvast. High reliability estimation of product quality using support vector regression and hybrid meta-heuristic algorithms. Journal of the Taiwan Institute of Chemical Engineers, 45(5):2225 - 2232, 2014. [ bib | DOI | http ]
Abstract Online estimation of product quality is a complicated task in refining processes. Data driven soft sensors have been successfully employed as a supplement to the online hardware analyzers that are often expensive and require high maintenance. Support Vector Regression (SVR) is an efficient machine learning technique that can be used for soft sensor design. However, choosing optimal hyper-parameter values for the SVR is a hard optimization problem. In order to determine the parameters as quickly and accurately as possible, some Hybrid Meta-Heuristic (HMH) algorithms have been developed in this study. A comprehensive study has been carried out comparing the meta-heuristic algorithms of GA and PSO to the HMH algorithms of GA–SQP and PSO–SQP for prediction of sulfur quality in treated gas oil using the SVR technique. Experimental data from a hydrodesulfurization (HDS) setup were collected to validate the proposed SVR model. The SVR model yields better performance in both accuracy and computation time (CT) for predicting the sulfur quality with hyper-parameters optimized by HMH algorithms. Applying the PSO–SQP algorithm gives the best performance with AARE = 0.133 and CT = 15.88 s compared to the other methods.

Keywords: Soft sensor
[440] Hua Su, Xiangbai Wu, Xiao-Hai Yan, and Autumn Kidwell. Estimation of subsurface temperature anomaly in the Indian Ocean during recent global surface warming hiatus from satellite measurements: A support vector machine approach. Remote Sensing of Environment, 160:63 - 71, 2015. [ bib | DOI | http ]
Abstract Estimating the thermal information in the subsurface and deeper ocean from satellite measurements over a large basin-wide scale is important but also challenging. This paper proposes a support vector machine (SVM) method to estimate subsurface temperature anomaly (STA) in the Indian Ocean from a suite of satellite remote sensing measurements including sea surface temperature anomaly (SSTA), sea surface height anomaly (SSHA), and sea surface salinity anomaly (SSSA). The SVM estimation of STA features the inclusion of in-situ Argo STA data for training and testing. SVM, one of the most popular machine learning methods, can well estimate the STA in the upper 1000 m of the Indian Ocean from satellite measurements of sea surface parameters (SSTA, SSHA and SSSA as input attributes for SVM). The results, based on the common SVM application of Support Vector Regression (SVR), were validated for accuracy and reliability using the Argo STA data. Both MSE and r2 for performance measures are improved after including SSSA for SVR (MSE decreased by 12% and r2 increased by 11% on average). The results showed that SSSA, in addition to SSTA and SSHA, is a useful parameter that can help detect and describe the deeper ocean thermal structure, as well as improve the STA estimation accuracy. Moreover, our method can provide a useful technique for studying, from satellite measurements at a large basin-wide scale, the subsurface and deeper ocean thermal variability that has played an important role in the recent global surface warming hiatus since 1998.

Keywords: Subsurface temperature anomaly
[441] Andreas Christmann and Robert Hable. Consistency of support vector machines using additive kernels for additive models. Computational Statistics & Data Analysis, 56(4):854 - 873, 2012.
Support vector machines (SVMs) are special kernel based methods and have been among the most successful learning methods for more than a decade. SVMs can informally be described as kinds of regularized M-estimators for functions and have demonstrated their usefulness in many complicated real-life problems. During the last few years a great part of the statistical research on SVMs has concentrated on the question of how to design SVMs such that they are universally consistent and statistically robust for nonparametric classification or nonparametric regression purposes. In many applications, some qualitative prior knowledge of the distribution P or of the unknown function f to be estimated is present or a prediction function with good interpretability is desired, such that a semiparametric model or an additive model is of interest. The question of how to design SVMs by choosing the reproducing kernel Hilbert space (RKHS) or its corresponding kernel to obtain consistent and statistically robust estimators in additive models is addressed. An explicit construction of such RKHSs and their kernels, which will be called additive kernels, is given. SVMs based on additive kernels will be called additive support vector machines. The use of such additive kernels leads, in combination with a Lipschitz continuous loss function, to SVMs with the desired properties for additive models. Examples include quantile regression based on the pinball loss function, regression based on the ϵ-insensitive loss function, and classification based on the hinge loss function.

Keywords: Support vector machine
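
The three Lipschitz continuous loss functions named in the abstract of [441] can be written down directly. The sketch below uses their standard textbook definitions; the function and parameter names (`tau`, `eps`) are illustrative and not taken from the paper.

```python
def pinball_loss(y, f, tau):
    """Pinball (quantile) loss for a quantile level tau in (0, 1)."""
    r = y - f
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * r if r >= 0 else (tau - 1.0) * r


def eps_insensitive_loss(y, f, eps):
    """Epsilon-insensitive loss: residuals smaller than eps cost nothing."""
    return max(0.0, abs(y - f) - eps)


def hinge_loss(y, f):
    """Hinge loss for a label y in {-1, +1} and a real-valued score f."""
    return max(0.0, 1.0 - y * f)


if __name__ == "__main__":
    print(pinball_loss(3.0, 1.0, 0.9))          # under-prediction at tau = 0.9
    print(eps_insensitive_loss(1.0, 2.0, 0.25))  # residual 1.0 outside the tube
    print(hinge_loss(-1, 0.5))                   # misclassified point
```

Each loss is Lipschitz continuous in the prediction f with constant at most 1, which is the property the abstract relies on.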
[442] Qi Wu. Hybrid forecasting model based on support vector machine and particle swarm optimization with adaptive and Cauchy mutation. Expert Systems with Applications, 38(8):9070 - 9075, 2011.
This paper presents a novel hybrid forecasting model based on support vector machine and particle swarm optimization with Cauchy mutation objective and decision-making variables. Because particle swarm optimization (PSO) converges slowly during parameter selection for the support vector machine (SVM), an adaptive mutation operator based on the fitness function value and the iterative variable is also applied to the inertia weight. Then, a hybrid PSO with adaptive and Cauchy mutation operators (ACPSO) is proposed. The results of application in regression estimation show that the proposed hybrid model (ACPSO–SVM) is feasible and effective. A comparison between the proposed method and other methods is also given, which shows that this method performs better than the others.

Keywords: Particle swarm optimization
[443] Theodore B. Trafalis and Robin C. Gilbert. Robust classification and regression using support vector machines. European Journal of Operational Research, 173(3):893 - 909, 2006.
In this paper, we investigate the theoretical aspects of robust classification and robust regression using support vector machines. Given training data $(x_1, y_1), \ldots, (x_l, y_l)$, where $l$ represents the number of samples, $x_i \in \mathbb{R}^n$ and $y_i \in \{-1, 1\}$ (for classification) or $y_i \in \mathbb{R}$ (for regression), we investigate the training of a support vector machine in the case where bounded perturbation is added to the value of the input $x_i \in \mathbb{R}^n$. We consider both cases, where our training data are either linearly separable or nonlinearly separable. We show that we can perform robust classification or regression by using linear or second order cone programming.

Keywords: Robustness
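
To see why bounded input perturbation leads to a cone program in [443], the following is a sketch of the standard worst-case argument for a linear classifier. It assumes the perturbation $\delta_i$ lies in a Euclidean ball of radius $\eta$; the symbols $\delta_i$, $\eta$, $w$, $b$ and $\xi_i$ are notation introduced here, and the paper's own formulation may differ.

```latex
% Worst-case margin of sample i under a perturbation with \|\delta_i\|_2 \le \eta,
% using y_i \in \{-1, 1\} and the Cauchy-Schwarz inequality:
\min_{\|\delta_i\|_2 \le \eta} \; y_i \left( w^\top (x_i + \delta_i) + b \right)
  = y_i \left( w^\top x_i + b \right) - \eta \, \|w\|_2 .

% Requiring the soft-margin constraint to hold for every admissible perturbation
% therefore gives a second order cone constraint in (w, b, \xi_i):
y_i \left( w^\top x_i + b \right) - \eta \, \|w\|_2 \;\ge\; 1 - \xi_i ,
\qquad \xi_i \ge 0 .
```

If the perturbation is instead bounded in a norm whose dual is polyhedral (for example the $\ell_\infty$ norm, with dual $\ell_1$), the same argument yields constraints that can be written linearly, which is consistent with the abstract's mention of linear programming.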
[444] Marcos Rodrigues and Juan de la Riva. An insight into machine-learning algorithms to model human-caused wildfire occurrence. Environmental Modelling & Software, 57:192 - 201, 2014.
Abstract This paper provides insight into the use of Machine Learning (ML) models for the assessment of human-caused wildfire occurrence. It proposes the use of ML within the context of fire risk prediction, and more specifically, in the evaluation of human-induced wildfires in Spain. In this context, three ML algorithms, namely Random Forest (RF), Boosting Regression Trees (BRT), and Support Vector Machines (SVM), are implemented and compared with traditional methods like Logistic Regression (LR). Results suggest that the use of any of these ML algorithms leads to an improvement in the accuracy of the model, in terms of the AUC (area under the curve), when compared to LR outputs. According to the AUC values, RF and BRT seem to be the most adequate methods, reaching AUC values of 0.746 and 0.730 respectively. On the other hand, despite the fact that the SVM yields an AUC value higher than that from LR, the authors consider it inadequate for classifying wildfire occurrences because its calibration is extremely time-consuming.

Keywords: Machine learning
[445] Yi-Chao Yang, Da-Wen Sun, and Nan-Nan Wang. Rapid detection of browning levels of lychee pericarp as affected by moisture contents using hyperspectral imaging. Computers and Electronics in Agriculture, 113:203 - 212, 2015.
Abstract Lychee is an important tropical and subtropical fruit. However, the quality of lychee fruit changes easily after harvest and it is difficult to control the process. One of the most significant factors affecting lychee quality is enzymatic browning, which is commonly affected by moisture loss of the pericarp during storage. As an emerging technique, hyperspectral imaging (HSI) carries many unique advantages compared to conventional detection methods, providing an innovative tool for quality evaluation of many fruits. The current study focused on exploring the relationship between browning levels of lychee and moisture contents (MC) of the pericarp, and developing calibration models for determining the browning degree of lychee based on the MC prediction of the pericarp using the HSI technique. Two sets of optimal wavelengths were selected using regression coefficients (RC) from partial least squares regression (PLSR) and successive projections algorithm (SPA), respectively. Calibration models for determining browning levels of lychee were developed using PLSR, back-propagation neural network (BP-NN) and radial basis function support vector regression (RBF-SVR) algorithms and their performances were compared. The results demonstrated that the RBF-SVR model based on the optimal wavelengths selected by RC had the best performance with coefficients of determination (R²) of 0.946 and 0.948, and root mean square error (RMSE) of 0.80% and 0.83% for training and testing sets, respectively, showing that browning levels of lychee could be determined by this approach. Finally, the visualization map of lychee with different browning levels was created and the distribution of browning degree in a lychee was observed by examining color variation among pixels in the map.

Keywords: Litchi
[446] Michel Ballings and Dirk Van den Poel. CRM in social media: Predicting increases in Facebook usage frequency. European Journal of Operational Research, 244(1):248 - 260, 2015.
Abstract The purpose of this study is to (1) assess the feasibility of predicting increases in Facebook usage frequency, (2) evaluate which algorithms perform best, and (3) determine which predictors are most important. We benchmark the performance of Logistic Regression, Random Forest, Stochastic Adaptive Boosting, Kernel Factory, Neural Networks and Support Vector Machines using five times twofold cross-validation. The results indicate that it is feasible to create models with high predictive performance. The top performing algorithm was Stochastic Adaptive Boosting with a cross-validated AUC of 0.66 and accuracy of 0.74. The most important predictors include deviation from regular usage patterns, frequencies of likes of specific categories and group memberships, average photo album privacy settings, and recency of comments. Facebook and other social networks alike could use predictions of increases in usage frequency to customize their services, such as pacing the rate of advertisements and friend recommendations, or adapting News Feed content altogether. The main contribution of this study is that it is the first to assess the prediction of increases in usage frequency in a social network.

Keywords: Decision support systems
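
The "five times twofold cross-validation" used in [446] can be sketched as an index generator: the data are shuffled five times, each time split into two halves, and each half serves once as the training set and once as the test set, giving ten train/test pairs. The function name and the `seed` parameter below are illustrative assumptions, and stratification by class label, which the study may or may not have used, is omitted.

```python
import random


def five_by_two_splits(n_samples, seed=0):
    """Return 10 (train_indices, test_indices) pairs for 5x2 cross-validation."""
    rng = random.Random(seed)
    splits = []
    for _ in range(5):
        idx = list(range(n_samples))
        rng.shuffle(idx)                 # a fresh random partition per repetition
        half = n_samples // 2
        first, second = idx[:half], idx[half:]
        splits.append((first, second))   # train on first half, test on second
        splits.append((second, first))   # then swap the roles
    return splits


if __name__ == "__main__":
    for train, test in five_by_two_splits(10):
        print(sorted(train), sorted(test))
```

A model would be fitted and scored once per pair, and the ten scores (for example AUC values) averaged to obtain the cross-validated figure.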
[447] Jun-Hu Cheng and Da-Wen Sun. Rapid and non-invasive detection of fish microbial spoilage by visible and near infrared hyperspectral imaging and multivariate analysis. LWT - Food Science and Technology, 62(2):1060 - 1068, 2015.
Abstract The feasibility of visible and near infrared hyperspectral imaging in the range of 400–1000 nm for determining total viable counts (TVC) to evaluate microbial spoilage of fish fillets was investigated. Partial least square regression (PLSR) and least square support vector machines (LS-SVM) models established based on full wavelengths showed excellent performances and the LS-SVM model was better with higher residual predictive deviation (RPD) of 3.89, determination coefficients in prediction ($R^2_P$) of 0.93 and lower root mean square errors in prediction (RMSEP) of 0.49 log10 CFU/g. Seven optimal wavelengths were selected by successive projections algorithm (SPA) and the simplified SPA-PLSR was better than SPA-LS-SVM models with RPD of 3.13, $R^2_P$ of 0.90 and RMSEP of 0.57 log10 CFU/g, and was transferred to each pixel of the hyperspectral images for generating the TVC distribution map. This study showed that hyperspectral imaging is suitable to determine TVC value for evaluating microbial spoilage of grass carp fillets in a rapid and non-invasive manner.

Keywords: Hyperspectral imaging
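
The three figures of merit quoted in [447] (RMSEP, the determination coefficient in prediction, and RPD) can be computed from reference and predicted values as sketched below. The definitions are the common chemometrics ones and are assumptions here: in particular RPD is taken as the sample standard deviation of the reference values divided by RMSEP, whereas some authors use a bias-corrected standard error of prediction instead.

```python
import math
import statistics


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Residual predictive deviation: SD of reference values over RMSEP (assumed form)."""
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up values, for illustration only
    predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
    print(rmsep(reference, predicted), r_squared(reference, predicted), rpd(reference, predicted))
```

The sample values are invented for the illustration and have no connection to the fish fillet data.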
[448] Ahmad Reza Gholami and Mehdi Shahbazian. Soft sensor design based on fuzzy c-means and RFN_SVR for a stripper column. Journal of Natural Gas Science and Engineering, 25:23 - 29, 2015.
Abstract Soft sensors have been extensively employed in the dynamic setting of industrial factories. In general, a soft sensor is a computer program that uses easily accessible process measurements to estimate variables that are impossible or very hard to acquire in real time. In the present research, a soft sensor that incorporates Fuzzy C-Means clustering with the Recursive Finite Newton algorithm for training the Support Vector Regression (FCM_RFN_SVR) is proposed. In this technique, the samples are partitioned into smaller partitions and, with the aid of the RFN_SVR, a local model for each partition is adjusted. The presented method is applied to a stripper column in order to estimate the concentration of H2S in the bottom product. The results obtained were compared with a typical SVR method, and the findings confirmed that the presented technique is stronger and relatively more capable of enhancing the generalizability of the soft sensor.

Keywords: Soft sensor
[449] Yuanning Liu, Fei He, Xiaodong Zhu, Zhen Liu, Ying Chen, Ye Han, and Lijiao Yu. The improved characteristics of bionic Gabor representations by combining with SIFT key-points for iris recognition. Journal of Bionic Engineering, 12(3):504 - 517, 2015.
Abstract Gabor filters are generally regarded as the most bionic filters corresponding to human visual perception. Their filtered coefficients are thus widely utilized to represent the texture information of irises. However, these wavelet-based iris representations are inevitably misaligned in the iris matching stage. In this paper, we try to improve the characteristics of bionic Gabor representations of each iris by combining the local Gabor features and the key-point descriptors of Scale Invariant Feature Transformation (SIFT), which respectively simulate the process of visual object class recognition in the frequency and spatial domains. A localized approach to Gabor features is used to avoid the blocking effect in the process of image division, while a SIFT key-point selection strategy is provided to remove noise and probable misaligned key points. For the combination of these iris features, we propose a support vector regression based fusion rule, which fuses their matching scores into a scalar score to make the classification decision. The experiments on three public and self-developed iris datasets validate the discriminative ability of our multiple bionic iris features, and also demonstrate that the fusion system outperforms some state-of-the-art methods.

Keywords: iris recognition
[450] José I. Muñoz-Barús, María Sol Rodríguez-Calvo, José M. Suárez-Peñaranda, Duarte N. Vieira, Carmen Cadarso-Suárez, and Manuel Febrero-Bande. PMICALC: An R code-based software for estimating post-mortem interval (PMI) compatible with Windows, Mac and Linux operating systems. Forensic Science International, 194(1–3):49 - 52, 2010.
In legal medicine the correct determination of the time of death is of utmost importance. Recent advances in estimating post-mortem interval (PMI) have made use of vitreous humour chemistry in conjunction with Linear Regression, but the results are questionable. In this paper we present PMICALC, an R code-based freeware package which estimates PMI in cadavers of recent death by measuring the concentrations of potassium ([K+]), hypoxanthine ([Hx]) and urea ([U]) in the vitreous humour using two different regression models: Additive Models (AM) and Support Vector Machine (SVM), which offer more flexibility than the previously used Linear Regression. The results from both models are better than those published to date and can give numerical expression of PMI with confidence intervals and graphic support within 20 min. The program also takes into account the cause of death.

Keywords: Post-mortem interval
[451] Jun Zhao, Ying Liu, Xiaoping Zhang, and Wei Wang. A MKL based on-line prediction for gasholder level in steel industry. Control Engineering Practice, 20(6):629 - 641, 2012.
The real-time prediction of gasholder level is significant for gas scheduling in steel enterprises. In this study, we extended the least squares support vector regression (LSSVR) to multiple kernel learning (MKL) based on the reduced gradient method. The MKL based LSSVR, using the optimal linear combination of kernels, improves the generalization of the model and reduces the training time. The experiments using the classical non-flat function and the practical problem show that the proposed method achieves good performance and high computational efficiency. An application system based on the approach has been developed and applied in practice at Shanghai Baosteel Co. Ltd.

Keywords: Gasholder level prediction
[452] Xing Yan and Nurul A. Chowdhury. Mid-term electricity market clearing price forecasting: A multiple SVM approach. International Journal of Electrical Power & Energy Systems, 58:206 - 214, 2014.
Abstract In a deregulated electric market, offering the appropriate amount of electricity at the right time with the right bidding price is of paramount importance for utility companies maximizing their profits. Mid-term electricity market clearing price (MCP) forecasting has become essential for resources reallocation, maintenance scheduling, bilateral contracting, budgeting and planning. Although there are many techniques available for short-term electricity MCP forecasting, very little has been done in the area of mid-term electricity MCP forecasting. A multiple support vector machine (SVM) based mid-term electricity MCP forecasting model is proposed in this paper. Data classification and price forecasting modules are designed to first pre-process the input data into corresponding price zones, and then forecast the electricity price. The proposed model showed improved forecasting accuracy on both peak prices and the overall system compared with the forecasting model using a single SVM. PJM interconnection data are used to test the proposed model.

Keywords: Classification
[453] Xiu-zhi Shi, Jian Zhou, Bang-biao Wu, Dan Huang, and Wei Wei. Support vector machines approach to mean particle size of rock fragmentation due to bench blasting prediction. Transactions of Nonferrous Metals Society of China, 22(2):432 - 441, 2012.
To address the problems of the traditional method of assessing the distribution of particle size in bench blasting, a support vector machines (SVMs) regression methodology based on statistical learning theory was used to predict the mean particle size (X50) resulting from rock blast fragmentation in various mines. The database consisted of blast design parameters, explosive parameters, modulus of elasticity and in-situ block size. The seven independent input variables used for the SVMs model for the prediction of X50 of rock blast fragmentation were the ratio of bench height to drilled burden (H/B), ratio of spacing to burden (S/B), ratio of burden to hole diameter (B/D), ratio of stemming to burden (T/B), powder factor (Pf), modulus of elasticity (E) and in-situ block size (XB). After using the 90 sets of measured data from various mines and rock formations around the world for training and testing, the model was applied to another 12 blast data sets for validation of the trained support vector regression (SVR) model. The prediction results of SVR were compared with those of artificial neural network (ANN), multivariate regression analysis (MVRA) models, the conventional Kuznetsov method and the measured X50 values. The proposed method shows promising results and the prediction accuracy of the SVMs model is acceptable.

Keywords: rock fragmentation
[454] Wei Li, Yuping Song, and Changle Zhou. Computationally evaluating and synthesizing Chinese calligraphy. Neurocomputing, 135:299 - 305, 2014.
Abstract We present an approach for synthesizing Chinese calligraphy with a similar topological style by learning from an author's written works. Our first contribution is an algorithm to match the trajectory. The second contribution is a method to represent Chinese character topology via a WF-histogram. The third contribution is an algorithm that takes topological features as input and feeds them into the evaluation model, which is an AdaBoost ensemble composed of support vector regressions (SVRs). The fourth contribution is a Genetic Algorithm (GA) introduced in the glyph optimization phase. Moreover, we introduce hypothesis testing and a decay function of the transformation amplitude to improve the convergence speed. The experiments demonstrate that our approach can produce Chinese calligraphy with a topological style similar to that of the training samples.

Keywords: Chinese calligraphy style
[455] Alexandros Lazaridis, Todor Ganchev, Iosif Mporas, Evaggelos Dermatas, and Nikos Fakotakis. Two-stage phone duration modelling with feature construction and feature vector extension for the needs of speech synthesis. Computer Speech & Language, 26(4):274 - 292, 2012.
We propose a two-stage phone duration modelling scheme, which can be applied for the improvement of prosody modelling in speech synthesis systems. This scheme builds on a number of independent feature constructors (FCs) employed in the first stage, and a phone duration model (PDM) which operates on an extended feature vector in the second stage. The feature vector, which acts as input to the first stage, consists of numerical and non-numerical linguistic features extracted from text. The extended feature vector is obtained by appending the phone duration predictions estimated by the FCs to the initial feature vector. Experiments on the American-English KED TIMIT and on the Modern Greek WCL-1 databases validated the advantage of the proposed two-stage scheme, improving prediction accuracy over the best individual predictor, and over a two-stage scheme which just fuses the first-stage outputs. Specifically, when compared to the best individual predictor, a relative reduction in the mean absolute error and the root mean square error of 3.9% and 3.9% on the KED TIMIT, and of 4.8% and 4.6% on the WCL-1 database, respectively, is observed.

Keywords: Feature construction
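
The feature vector extension described in [455] amounts to appending each first-stage prediction to the original features before calling the second-stage model. The sketch below shows only that data flow; the callables standing in for the feature constructors and the phone duration model are hypothetical placeholders, not the regressors used in the paper.

```python
def extend_feature_vector(x, feature_constructors):
    """Append each first-stage duration prediction to the initial feature vector."""
    return list(x) + [fc(x) for fc in feature_constructors]


def two_stage_predict(x, feature_constructors, phone_duration_model):
    """Second stage: predict from the extended feature vector."""
    return phone_duration_model(extend_feature_vector(x, feature_constructors))


if __name__ == "__main__":
    # Hypothetical stand-ins for trained models (any callable on a feature list works).
    fcs = [lambda x: 2.0 * x[0], lambda x: x[0] + x[1]]
    n_fcs = len(fcs)
    pdm = lambda z: sum(z[-n_fcs:]) / n_fcs   # toy second stage: average the FC outputs

    print(extend_feature_vector([1.0, 3.0], fcs))
    print(two_stage_predict([1.0, 3.0], fcs, pdm))
```

Unlike the toy second stage here, which only averages the first-stage outputs, the scheme in the abstract lets the second-stage model see both the original features and the first-stage predictions, which is what distinguishes it from plain fusion.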
[456] Aixia Yan and Kai Wang. Quantitative structure and bioactivity relationship study on human acetylcholinesterase inhibitors. Bioorganic & Medicinal Chemistry Letters, 22(9):3336 - 3342, 2012.
Several QSAR (Quantitative Structure–Activity Relationships) models for predicting the inhibitory activity of 404 Acetylcholinesterase inhibitors were developed. The whole dataset was split into a training set and a test set randomly or using a Kohonen self-organizing map. Then the inhibitory activity of the 404 Acetylcholinesterase inhibitors was predicted using Multilinear Regression (MLR) analysis and Support Vector Machine (SVM) methods, respectively. For the test sets, all our models achieved correlation coefficients over 0.90. A Y-randomization test was employed to ensure the robustness of our models and a docking simulation was used to confirm the descriptors we used.

Keywords: Acetylcholinesterase inhibitors
[457] Quansheng Chen, Zhiming Guo, Jiewen Zhao, and Qin Ouyang. Comparisons of different regressions tools in measurement of antioxidant activity in green tea using near infrared spectroscopy. Journal of Pharmaceutical and Biomedical Analysis, 60:92 - 97, 2012.
To rapidly and efficiently measure antioxidant activity (AA) in green tea, near infrared (NIR) spectroscopy was employed with the help of a regression tool in this work. Three different linear and nonlinear regression tools (i.e. partial least squares (PLS), back propagation artificial neural network (BP-ANN), and support vector machine regression (SVMR)) were systematically studied and compared in developing the model. The model was optimized by leave-one-out cross-validation, and its performance was tested according to the root mean square error of prediction (RMSEP) and correlation coefficient (Rp) in the prediction set. Experimental results showed that the performance of the SVMR model was superior to the others, and the optimum results of the SVMR model were achieved as follows: RMSEP = 0.02161 and Rp = 0.9691 in the prediction set. The overall results sufficiently demonstrate that NIR spectroscopy coupled with the SVMR regression tool has the potential to measure AA in green tea.

Keywords: Near infrared (NIR) spectroscopy
[458] Guangcan Liu, Zhouchen Lin, and Yong Yu. Multi-output regression on the output manifold. Pattern Recognition, 42(11):2737 - 2743, 2009.
Multi-output regression aims at learning a mapping from an input feature space to a multivariate output space. Previous algorithms define the loss functions using a fixed global coordinate of the output space, which is equivalent to assuming that the output space is a whole Euclidean space with a dimension equal to the number of the outputs. So the underlying structure of the output space is completely ignored. In this paper, we consider the output space as a Riemannian submanifold to incorporate its geometric structure into the regression process. To this end, we propose a novel mechanism, called locally linear transformation (LLT), to define the loss functions on the output manifold. In this way, currently existing regression algorithms can be improved. In particular, we propose an algorithm under the support vector regression framework. Our experimental results on synthetic and real-life data are satisfactory.

Keywords: Regression analysis
[459] Mohammad Goodarzi, Richard Jensen, and Yvan Vander Heyden. QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions. Journal of Chromatography B, 910:84 - 94, 2012. Chemometrics in Chromatography.
A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere polybutadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization (ACO) is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (log kw). The overall best model was the SVM one built using descriptors selected by ACO.

Keywords: QSRR
[460] Chang Jun Lee, Gibaek Lee, and Jong Min Lee. A fault magnitude based strategy for effective fault classification. Chemical Engineering Research and Design, 91(3):530 - 541, 2013.
A common approach in fault diagnosis is monitoring the deviations of measured variables from the values at normal operations to identify the root causes of faults. When the number of conceivable faults is larger than that of predictive variables, conventional approaches can yield ambiguous diagnosis results including multiple fault candidates. To address the issue, this work proposes a fault magnitude based strategy. Signed digraph is first used to identify qualitative relationships between process variables and faults. Empirical models for predicting process variables under assumed faults are then constructed with support vector regression (SVR). Fault magnitude data are projected onto principal components subspace, and the mapping from scores to fault magnitudes is learned via SVR. This model can estimate fault magnitudes and discriminate a true fault among multiple candidates when different fault magnitudes yield distinguishable responses in the monitored variables. The efficacy of the proposed approach is illustrated on an actuator benchmark problem.

Keywords: DAMADICS
[461] Eslam Pourbasheer, Reza Aalizadeh, and Mohammad Reza Ganjali. QSAR study of CK2 inhibitors by GA-MLR and GA-SVM methods. Arabian Journal of Chemistry, 2015.
Abstract In this work, quantitative structure–activity relationship models were developed for predicting the activity of a series of compounds acting as CK2 inhibitors using multiple linear regression and support vector machine methods. The data set, consisting of 48 compounds, was randomly divided into training and test subsets. The most relevant molecular descriptors were selected using the genetic algorithm as a feature selection tool. The predictive ability of the models was evaluated using a Y-randomization test, cross-validation and an external test set. The genetic algorithm-multiple linear regression model with six selected molecular descriptors was obtained and showed high statistical parameters (R²train = 0.893, R²test = 0.921, Q²LOO = 0.844, F = 43.17, RMSE = 0.287). Comparison of the results between GA-MLR and GA-SVM demonstrates that GA-SVM provided better results for the training set compounds; however, the predictive quality of both models is acceptable. The results suggest that atomic mass and polarizabilities, as well as the number of heteroatoms in the molecules, are the main independent factors contributing to the CK2 inhibition activity. The predicted results of this study can be used to design new and potent CK2 inhibitors.

Keywords: QSAR
[462] Ping-Feng Pai, Kuo-Chen Hung, and Kuo-Ping Lin. Tourism demand forecasting using novel hybrid system. Expert Systems with Applications, 41(8):3691 - 3702, 2014.
Abstract Accurate prediction of tourism demand is a crucial issue for the tourism and service industry because it can efficiently provide basic information for subsequent tourism planning and policy making. To achieve an accurate prediction of tourism demand, this study develops a novel forecasting system. The system combines fuzzy c-means (FCM) with logarithm least-squares support vector regression (LLS-SVR) technologies. Genetic algorithms (GA) were used simultaneously to optimally select the parameters of the LLS-SVR. Data on tourist arrivals to Taiwan and Hong Kong were used. Empirical results indicate that the proposed forecasting system demonstrates superior performance to other methods in terms of forecasting accuracy.

Keywords: Forecasting
[463] Asterios Toutios and Konstantinos Margaritis. Estimating electropalatographic patterns from the speech signal. Computer Speech & Language, 22(4):346 - 359, 2008.
Electropalatography is a well established technique for recording information on the patterns of contact between the tongue and the hard palate during speech, leading to a stream of binary vectors representing contacts or non-contacts between the tongue and certain positions on the hard palate. A data-driven approach to mapping the speech signal onto electropalatographic information is presented. Principal component analysis is used to model the spatial structure of the electropalatographic data and support vector regression is used to map acoustic parameters onto projections of the electropalatographic data on the principal components.

Keywords: Electropalatography
[464] Subhabrata Choudhury, Subhajyoti Ghosh, Arnab Bhattacharya, Kiran Jude Fernandes, and Manoj Kumar Tiwari. A real time clustering and SVM based price-volatility prediction for optimal trading strategy. Neurocomputing, 131:419 - 426, 2014.
Abstract Financial return on investments and movement of market indicators are fraught with uncertainties and a highly volatile environment that exists in the global market. Equity markets are heavily affected by market unpredictability and maintaining a healthy diversified portfolio with minimum risk is undoubtedly crucial for any investment made in such assets. Effective price and volatility prediction can highly influence the course of the investment strategy with regard to such a portfolio of equity instruments. In this paper a novel SOM based hybrid clustering technique is integrated with support vector regression for portfolio selection and accurate price and volatility predictions, which become the basis for the particular trading strategy adopted for the portfolio. The research considers the top 102 stocks of the NSE stock market (India) to identify a set of best portfolios that an investor can maintain for risk reduction and high profitability. A short-term stock trading strategy and performance indicators are developed to assess the validity of the predictions with regard to actual scenarios.

Keywords: Stock market
[465] S. Meysam Mousavi, R. Tavakkoli-Moghaddam, Behnam Vahdani, H. Hashemi, and M.J. Sanjari. A new support vector model-based imperialist competitive algorithm for time estimation in new product development projects. Robotics and Computer-Integrated Manufacturing, 29(1):157 - 168, 2013.
Time estimation in new product development (NPD) projects is often a complex problem due to its nonlinearity and the small quantity of data patterns. Support vector regression (SVR) based on statistical learning theory is introduced as a new neural network technique with maximum generalization ability. The SVR has been utilized to solve nonlinear regression problems successfully. However, the applicability of the SVR is strongly limited by the difficulty of selecting the SVR parameters appropriately. The imperialist competitive algorithm (ICA), a socio-politically inspired optimization strategy, is employed to solve real-world engineering problems. In contrast to evolutionary algorithms, this optimization algorithm is inspired by the competition mechanism among imperialists and colonies. This paper presents a new model integrating the SVR and the ICA for time estimation in NPD projects, in which ICA is used to tune the parameters of the SVR. A real data set from a case study of an NPD project in a manufacturing industry is presented to demonstrate the performance of the proposed model. In addition, a comparison is provided between the proposed model and conventional techniques, namely nonlinear regression, back-propagation neural networks (BPNN), pure SVR and general regression neural networks (GRNN). The experimental results indicate that the presented model achieves high estimation accuracy and leads to effective prediction.

Keywords: Support vector regression
[466] Hicham Laanaya, Arnaud Martin, Driss Aboutajdine, and Ali Khenchaf. Support vector regression of membership functions and belief functions – application for pattern recognition. Information Fusion, 11(4):338 - 350, 2010.
Caused by many applications during the last few years, many models have been proposed to represent imprecise and uncertain data. These models are essentially based on the theory of fuzzy sets, the theory of possibilities and the theory of belief functions. These two first theories are based on the membership functions and the last one on the belief functions. Hence, it could be interesting to learn these membership and belief functions from data and then we can, for example, deduce the class for a classification task. Therefore, we propose in this paper a regression approach based on the statistical learning theory of Vapnik. The membership and belief functions have the same properties; that we take as constraints in the resolution of our convex problem in the support vector regression. The proposed approach is applied in a pattern recognition context to evaluate its efficiency. Hence, the regression of the membership functions and the regression of the belief functions give two kinds of classifiers: a fuzzy SVM and a belief SVM. From the learning data, the membership and belief functions are generated from two classical approaches given respectively by fuzzy and belief k-nearest neighbors. Therefore, we compare the proposed approach, in terms of classification results, with these two k-nearest neighbors and with support vector machines classifier.

Keywords: SVR
[467] Yuxia Fan, Keqiang Lai, Barbara A. Rasco, and Yiqun Huang. Determination of carbaryl pesticide in Fuji apples using surface-enhanced Raman spectroscopy coupled with multivariate analysis. LWT - Food Science and Technology, 60(1):352 - 357, 2015.
Abstract Residual pesticides in fruits and vegetables are one of the major food safety concerns around the world. Surface-enhanced Raman spectroscopy (SERS) coupled with chemometric methods was applied for quantitative analysis of trace levels of carbaryl pesticide in apple. The lowest detectable level for carbaryl in apple was 0.5 μg g−1, which was sensitive enough for identifying apple contaminated with carbaryl above the maximum residue level. Quantification of carbaryl residues (0–10 μg g−1) was conducted using partial least squares regression (PLSR) and support vector regression (SVR) models. Based upon the results of leave-one-out cross-validation, carbaryl levels in apples could be predicted by PLSR (R2 = 0.983) or SVR (R2 = 0.986) with a low root mean square errors (RMSE = 0.48 μg g−1 or 0.44 μg g−1) and a high ratio of performance to deviation (RPD = 7.71 or 8.11) value. This study indicates that SERS has the potential to quantify carbaryl pesticide in complex food matrices reliably.

Keywords: Surface-enhanced Raman spectroscopy
[468] Jie Hu, Jin Qi, Yinghong Peng, and Qiushi Ren. Predicting electrical evoked potential in optic nerve visual prostheses by using support vector regression and case-based prediction. Information Sciences, 290:7 - 21, 2015.
Abstract Electrical evoked potential (EEP) forecasting is an intelligent time series prediction (TSP) activity to explore the temporal properties of electrically elicited responses of the visual cortex triggered by various electrical stimulations. Our previous studies used support vector regression (SVR) as a TSP predictor to forecast temporal EEP values. SVR shows high prediction performance but with high computation time for multivariable stimulation inputs in EEP prediction. To reduce the computational burden of SVR and further improve the performance, this paper utilizes technique of case-based prediction (CBP) to integrate the initial stimulation variables into an integrated stimulation value (ISV), and total four independent CBPs are used to achieve the stimulation feature integration. Then the temporal samples are extracted from transformed data to construct a new SVR regression model to perform the prediction activity. The new hybridizing system is named as CBSVR, which was also empirically tested with data collected from actual EEP electrophysiological experiments. Both 30-fold cross-validation method and adapted point predictive accuracy (PPA) index were used to compare the predictive performances between CBSVR, classical CBP approaches, single SVR model and other common TSP methods. Empirical comparison results show that CBSVR is feasible and validated for EEP prediction in visual prostheses research.

Keywords: Electrical evoked potential
[469] Zengguang Li, Zhenjiang Ye, Rong Wan, and Chi Zhang. Model selection between traditional and popular methods for standardizing catch rates of target species: A case study of Japanese Spanish mackerel in the gillnet fishery. Fisheries Research, 161:312 - 319, 2015.
Abstract Improving existing catch per unit effort (CPUE) models for construction of a fishery abundance index is important to fish stock assessment and management. CPUE standardization research is a rapidly developing field, and many statistical models have been used, including generalized linear models (GLMs), generalized additive models (GAMs), regression trees (RTs) and artificial neural networks (ANNs). However, the popular and influential methods, random forests (RFs) and support vector machines (SVMs) have not been used in this field. We evaluate the performance of six candidate methods (GLMs, GAMs, RTs, RFs, ANNs and SVMs) using gillnet data for Japanese Spanish mackerel (Scomberomorus niphonius) collected by a fishery-dependent survey (National Basic Research Program of China, NBRPC) in the south of the Yellow Sea from 2006 to 2012. Predictive performance metrics and Regression Error Characteristic (REC) curves computed by 10-fold cross-validation results showed that the SVM provided the best performance among the six candidate models and slightly improved the prediction accuracies compared to RF. However, the traditional methods GLM and GAM were inferior to the other four nonlinear statistical models (RTs, ANNs, RFs and SVMs). In general, RFs and SVMs should be considered as potential statistical methods for CPUE standardization. Model performance was affected by several factors, including data structure and model construction. Therefore, further research should focus these factors to improve model functionality.

Keywords: CPUE
[470] Mahdi Kalantari Meybodi, Amin Shokrollahi, Hossein Safari, Moonyong Lee, and Alireza Bahadori. A computational intelligence scheme for prediction of interfacial tension between pure hydrocarbons and water. Chemical Engineering Research and Design, 95:79 - 92, 2015.
Abstract Interfacial tension plays a major role in many disciplines of science and engineering. Complex nature of this property has restricted most of the previous theoretical studies on thermophysical properties to bulk properties measured far from the interface. Considering the drawbacks and deficiencies of preexisting models, there is yet a huge interest in accurate determination of this property using a rather simple and more comprehensive modeling approach. In recent years, inductive machine learning algorithms have widely been applied in solving a variety of engineering problems. This study introduces least-square support vector machines (LS-SVM) approach as a viable and powerful tool for predicting the interfacial tension between pure hydrocarbon and water. Comparing the model to experimental data, an excellent agreement was observed yielding the overall squared correlation coefficient (R2) of 0.993. Proposed model was also found to outperform when compared to some previously presented multiple regression models. An outlier detection method was also introduced to determine the model applicability domain and diagnose the outliers in the gathered dataset. Results of this study indicate that the model can be applied in systems over temperature ranges of 454.40–890 °R and pressure ranges of 0.1–300 MPa.

Keywords: Interfacial tension
[471] Jingfei Yang and Juergen Stenzel. Short-term load forecasting with increment regression tree. Electric Power Systems Research, 76(9–10):880 - 888, 2006.
This paper presents a new regression tree method for short-term load forecasting. Both increment and non-increment tree are built according to the historical data to provide the data space partition and input variable selection. Support vector machine is employed to the samples of regression tree nodes for further fine regression. Results of different tree nodes are integrated through weighted average method to obtain the comprehensive forecasting result. The effectiveness of the proposed method is demonstrated through its application to an actual system.

Keywords: Load forecasting
[472] Mahmoud O. Elish. Improved estimation of software project effort using multiple additive regression trees. Expert Systems with Applications, 36(7):10774 - 10778, 2009.
Accurate estimation of software project effort is crucial for successful management and control of a software project. Recently, multiple additive regression trees (MART) has been proposed as a novel advance in data mining that extends and improves the classification and regression trees (CART) model using stochastic gradient boosting. This paper empirically evaluates the potential of MART as a novel software effort estimation model when compared with recently published models, in terms of accuracy. The comparison is based on a well-known and respected NASA software project dataset. The results indicate that improved estimation accuracy of software project effort has been achieved using MART when compared with linear regression, radial basis function neural networks, and support vector regression models.

Keywords: Software effort estimation
[473] B. Üstün, W.J. Melssen, and L.M.C. Buydens. Visualisation and interpretation of support vector regression models. Analytica Chimica Acta, 595(1–2):299 - 309, 2007. Papers presented at the 10th International Conference on Chemometrics in Analytical Chemistry, CAC 2006.
This paper introduces a technique to visualise the information content of the kernel matrix and a way to interpret the ingredients of the Support Vector Regression (SVR) model. Recently, the use of Support Vector Machines (SVM) for solving classification (SVC) and regression (SVR) problems has increased substantially in the field of chemistry and chemometrics. This is mainly due to its high generalisation performance and its ability to model non-linear relationships in a unique and global manner. Modeling of non-linear relationships will be enabled by applying a kernel function. The kernel function transforms the input data, usually non-linearly related to the associated output property, into a high dimensional feature space where the non-linear relationship can be represented in a linear form. Usually, SVMs are applied as a black box technique. Hence, the model cannot be interpreted like, e.g., Partial Least Squares (PLS). For example, the PLS scores and loadings make it possible to visualise and understand the driving force behind the optimal PLS machinery. In this study, we have investigated the possibilities to visualise and interpret the SVM model. Here, we exclusively have focused on Support Vector Regression to demonstrate these visualisation and interpretation techniques. Our observations show that we are now able to turn a SVR black box model into a transparent and interpretable regression modeling technique.

Keywords: Support Vector Regression
[474] Hailong Yang, Qi Zhao, Zhongzhi Luan, and Depei Qian. iMeter: An integrated VM power model based on performance profiling. Future Generation Computer Systems, 36:267 - 286, 2014. Special Section: Intelligent Big Data Processing; Special Section: Behavior Data Security Issues in Network Information Propagation; Special Section: Energy-efficiency in Large Distributed Computing Architectures; Special Section: eScience Infrastructure and Applications.
Abstract The unprecedented burst in power consumption encountered by contemporary datacenters continually boosts the development of energy efficient techniques from both hardware and software perspectives to alleviate the energy problem. The most widely adopted power saving solutions in datacenters that deliver cloud computing services are power capping and VM consolidation. However, without the capability to track the VM power usage precisely, the combined effect of the above two techniques could cause severe performance degradation to the consolidated VMs, thus violating the user service level agreements. In this paper, we propose an integrated VM power model called iMeter, which overcomes the drawbacks of overpresumption and overapproximation in segregated power models used in previous studies. We leverage the kernel-based performance counters that provide accurate performance statistics as well as high portability across heterogeneous platforms to build the VM power model. Principal component analysis is applied to identify performance counters that show strong impact on the VM power consumption with mathematical confidence. We also present a brief interpretation of the first four selected principal components on their indications of VM power consumption. We demonstrate that our approach is independent of underlying hardware and virtualization configurations with clustering analysis. We utilize the support vector regression to build the VM power model predicting the power consumption of both a single VM and multiple consolidated VMs running various workloads. The experimental results show that our model is able to predict the instantaneous VM power usage with an average error of 5% and 4.7% respectively against the actual power measurement.

Keywords: Virtualization
[475] Jiankang Wang, Haibo Zhang, Changkai Yan, Shujing Duan, and Xianghua Huang. An adaptive turbo-shaft engine modeling method based on PS and MRR-LSSVR algorithms. Chinese Journal of Aeronautics, 26(1):94 - 103, 2013.
In order to establish an adaptive turbo-shaft engine model with high accuracy, a new modeling method based on parameter selection (PS) algorithm and multi-input multi-output recursive reduced least square support vector regression (MRR-LSSVR) machine is proposed. Firstly, the PS algorithm is designed to choose the most reasonable inputs of the adaptive module. During this process, a wrapper criterion based on least square support vector regression (LSSVR) machine is adopted, which can not only reduce computational complexity but also enhance generalization performance. Secondly, with the input variables determined by the PS algorithm, a mapping model of engine parameter estimation is trained off-line using MRR-LSSVR, which has a satisfying accuracy within 5‰. Finally, based on a numerical simulation platform of an integrated helicopter/turbo-shaft engine system, an adaptive turbo-shaft engine model is developed and tested in a certain flight envelope. Under the condition of single or multiple engine components being degraded, many simulation experiments are carried out, and the simulation results show the effectiveness and validity of the proposed adaptive modeling method.

Keywords: Adaptive engine model
[476] Jaime Alonso, Ángel Rodríguez Castañón, and Antonio Bahamonde. Support vector regression to predict carcass weight in beef cattle in advance of the slaughter. Computers and Electronics in Agriculture, 91:116 - 120, 2013.
In this paper we present a function to predict the carcass weight for beef cattle. The function uses a few zoometric measurements of the animals taken days before the slaughter. For this purpose we have used Artificial Intelligence tools based on Support Vector Machines for Regression (SVR). We report a case study done with a set of 390 measurements of 144 animals taken from 2 to 222 days in advance of the slaughter. We used animals of the breed Asturiana de los Valles, a specialized beef breed from the North of Spain. The results obtained show that it is possible to predict carcass weights 150 days before the slaughter day with an average absolute error of 4.27% of the true value. The prediction function is a polynomial of degree 3 that uses five lengths and the estimation of the round profile of the animals.

Keywords: Support Vector Machines (SVMs)
[477] Carlos Serrano-Cinca and Begoña Gutiérrez-Nieto. Partial least square discriminant analysis for bankruptcy prediction. Decision Support Systems, 54(3):1245 - 1255, 2013.
Abstract This paper uses Partial Least Square Discriminant Analysis (PLS-DA) for the prediction of the 2008 USA banking crisis. PLS regression transforms a set of correlated explanatory variables into a new set of uncorrelated variables, which is appropriate in the presence of multicollinearity. PLS-DA performs a PLS regression with a dichotomous dependent variable. The performance of this technique is compared to the performance of 8 algorithms widely used in bankruptcy prediction. In terms of accuracy, precision, F-score, Type I error and Type II error, results are similar; no algorithm outperforms the others. Behind performance, each algorithm assigns a score to each bank and classifies it as solvent or failed. These results have been analyzed by means of contingency tables, correlations, cluster analysis and reduction dimensionality techniques. PLS-DA results are very close to those obtained by Linear Discriminant Analysis and Support Vector Machine.

Keywords: Bankruptcy
[478] Andre Marquand, Matthew Howard, Michael Brammer, Carlton Chu, Steven Coen, and Janaina Mourão-Miranda. Quantitative prediction of subjective pain intensity from whole-brain fMRI data using Gaussian processes. NeuroImage, 49(3):2178 - 2189, 2010.
Supervised machine learning (ML) algorithms are increasingly popular tools for fMRI decoding due to their predictive capability and their ability to capture information encoded by spatially correlated voxels. In addition, an important secondary outcome is a multivariate representation of the pattern underlying the prediction. Despite an impressive array of applications, most fMRI applications are framed as classification problems and predictions are limited to categorical class decisions. For many applications, quantitative predictions are desirable that more accurately represent variability within subject groups and that can be correlated with behavioural variables. We evaluate the predictive capability of Gaussian process (GP) models for two types of quantitative prediction (multivariate regression and probabilistic classification) using whole-brain fMRI volumes. As a proof of concept, we apply GP models to an fMRI experiment investigating subjective responses to thermal pain and show GP models predict subjective pain ratings without requiring anatomical hypotheses about functional localisation of relevant brain processes. Even in the case of pain perception, where strong hypotheses do exist, GP predictions were more accurate than any region previously demonstrated to encode pain intensity. We demonstrate two brain mapping methods suitable for GP models and we show that GP regression models outperform state of the art support vector- and relevance vector regression. For classification, GP models perform categorical prediction as accurately as a support vector machine classifier and furnish probabilistic class predictions.

[479] Abdul Majid, Syed Bilal Ahsan, and Naeem ul Haq Tariq. Modeling glass-forming ability of bulk metallic glasses using computational intelligent techniques. Applied Soft Computing, 28:569 - 578, 2015.
Abstract Modeling the glass-forming ability (GFA) of bulk metallic glasses (BMGs) is one of the hot issues ever since bulk metallic glasses (BMGs) are discovered. It is very useful for the development of new BMGs for various engineering applications, if GFA criterion modeled precisely. In this paper, we have proposed support vector regression (SVR), artificial neural network (ANN), general regression neural network (GRNN), and multiple linear regression (MLR) based computational intelligent (CI) techniques that model the maximum section thickness (Dmax) parameter for glass forming alloys. For this study, a reasonable large number of BMGs alloys are collected from the current literature of material science. CI models are developed using three thermal characteristics of glass forming alloys i.e., glass transition temperature (Tg), the onset crystallization temperature (Tx), and liquidus temperature (Tl). The R2-values of GRNN, SVR, ANN, and MLR models are computed to be 0.5779, 0.5606, 0.4879, and 0.2611 for 349 BMGs alloys, respectively. We have investigated that GRNN model is performing better than SVR, ANN, and MLR models. The performance of proposed models is compared to the existing physical modeling and statistical modeling based techniques. In this study, we have investigated that proposed CI approaches are more accurate in modeling the experimental Dmax than the conventional GFA criteria of BMGs alloys.

Keywords: Glass forming alloys
[480] Peifeng Niu and Weiping Zhang. Model of turbine optimal initial pressure under off-design operation based on SVR and GA. Neurocomputing, 78(1):64 - 71, 2012. Selected papers from the 8th International Symposium on Neural Networks (ISNN 2011).
Ascertaining real time optimal initial pressure has important significance to safeguard the economic, efficient and safe operation of turbine units. In this paper, a new calculation model of the optimal initial pressure under off-design conditions has been put forward. Support Vector Regression (SVR) is used to build the model of heat rate and the optimal selection approach of SVR parameters is discussed. Heat rate is chosen as the fitness function, and then Genetic Algorithm (GA) is applied to seek the optimal initial pressure within the feasible pressure range depend on its global optimal search capability. The obtained optimal initial pressure can effectually guide the economical operation of turbine unit.

Keywords: Steam turbine
[481] Jooyong Shim and Changha Hwang. Support vector censored quantile regression under random censoring. Computational Statistics & Data Analysis, 53(4):912 - 919, 2009.
Censored quantile regression models have received a great deal of attention in both the theoretical and applied statistical literature. In this paper, we propose support vector censored quantile regression (SVCQR) under random censoring using iterative reweighted least squares (IRWLS) procedure based on the Newton method instead of usual quadratic programming algorithms. This procedure makes it possible to derive the generalized approximate cross validation (GACV) method for choosing the hyperparameters which affect the performance of SVCQR. Numerical results are then presented which illustrate the performance of SVCQR using the IRWLS procedure.

[482] Dali Wei and Hongchao Liu. Analysis of asymmetric driving behavior using a self-learning approach. Transportation Research Part B: Methodological, 47:1 - 14, 2013.
This paper presents a self-learning Support Vector Regression (SVR) approach to investigate the asymmetric characteristic in car-following and its impacts on traffic flow evolution. At the microscopic level, we find that the intensity difference between acceleration and deceleration will lead to a ‘neutral line’, which separates the speed-space diagram into acceleration and deceleration dominant areas. This property is then used to discuss the characteristics and magnitudes of microscopic hysteresis in stop-and-go traffic. At the macroscopic level, according to the distribution of neutral lines for heterogeneous drivers, different congestion propagation patterns are reproduced and found to be consistent with Newell’s car following theory. The connection between the asymmetric driving behavior and macroscopic hysteresis in the flow-density diagram is also analyzed and their magnitudes are shown to be positively related.

Keywords: Asymmetric driving behavior
[483] Jian Zhang, Tadanobu Sato, Susumu Iai, and Tara Hutchinson. A pattern recognition technique for structural identification using observed vibration signals: Linear case studies. Engineering Structures, 30(5):1439 - 1446, 2008.
This and the companion article summarize linear and nonlinear structural identification (SI) methods using a pattern recognition technique, support vector regression (SVR). Signal processing plays a key role in the SI field, because observed data are often incomplete and contaminated by noise. Support vector regression (SVR) is a novel data processing technique that is superior in terms of its robustness, thus it has the potential to be applied for accurate and efficient structural identification. Three SVR-based methods employing the autoregression moving average (ARMA) time series, the high-order AR model, and the sub-structuring strategy are presented for linear structural parameter identification using observed vibration data. The SVR coefficient selection and incremental training algorithm have also been presented. Numerical evaluations demonstrate that the SVR-based methods identify structural parameters accurately. A five-floor structure shaking table test has also been conducted, and the observed data are used to verify experimentally the novel SVR technique for linear structural identification.

Keywords: Support vector regression
[484] Bao Rong Chang, Hsiu Fen Tsai, and Chung-Ping Young. Diversity of quantum optimizations for training adaptive support vector regression and its prediction applications. Expert Systems with Applications, 34(4):2612 - 2621, 2008.
Three kinds of quantum optimizations are introduced in this paper as follows: quantum minimization (QM), neuromorphic quantum-based optimization (NQO), and logarithmic search with quantum existence testing (LSQET). In order to compare their optimization ability for training adaptive support vector regression, the performance evaluation is accomplished in the basis of forecasting the complex time series through two real world experiments. The model used for this complex time series prediction comprises both BPNN-Weighted Grey-C3LSP (BWGC) and nonlinear generalized autoregressive conditional heteroscedasticity (NGARCH) that is tuned perfectly by quantum-optimized adaptive support vector regression. Finally, according to the predictive accuracy of time series forecast and the cost of the computational complexity, the concluding remark will be made to illustrate and discuss these quantum optimizations.

Keywords: Quantum minimization
[485] Paulo R. Filgueiras, Júlio Cesar L. Alves, Cristina M.S. Sad, Eustáquio V.R. Castro, Júlio C.M. Dias, and Ronei J. Poppi. Evaluation of trends in residuals of multivariate calibration models by permutation test. Chemometrics and Intelligent Laboratory Systems, 133:33 - 41, 2014.
Abstract This paper proposes the use of a nonparametric permutation test to assess the presence of trends in the residuals of multivariate calibration models. The permutation test was applied to the residuals of models generated by principal component regression (PCR), partial least squares (PLS) regression and support vector regression (SVR). Three datasets of real cases were studied: the first dataset consisted of near-infrared spectra for animal fat biodiesel determination in binary blends, the second one consisted of attenuated total reflectance infrared spectra (ATR-FTIR) for the determination of kinematic viscosity in petroleum and the third one consisted of near infrared spectra for the determination of the flash point in diesel oil from an in-line blending optimizer system of a petroleum refinery. In all datasets, the residuals of the linear models presented trends that have been satisfactorily diagnosed by a permutation test. Additionally, it was verified that 500,000 permutations were enough to produce reliable test results.

Keywords: Permutation test
[486] L. Iliadis, F. Maris, and S. Tachos. Soft computing techniques toward modeling the water supplies of Cyprus. Neural Networks, 24(8):836 - 841, 2011. Artificial Neural Networks: Selected Papers from ICANN 2010.
This research effort aims in the application of soft computing techniques toward water resources management. More specifically, the target is the development of reliable soft computing models capable of estimating the water supply for the case of “Germasogeia” mountainous watersheds in Cyprus. Initially, ε-Regression Support Vector Machines (ε-RSVM) and fuzzy weighted ε-RSVMR models have been developed that accept five input parameters. At the same time, reliable artificial neural networks have been developed to perform the same job. The 5-fold cross validation approach has been employed in order to eliminate bad local behaviors and to produce a more representative training data set. Thus, the fuzzy weighted Support Vector Regression (SVR) combined with the fuzzy partition has been employed in an effort to enhance the quality of the results. Several rational and reliable models have been produced that can enhance the efficiency of water policy designers.

Keywords: Support vector machines
[487] P. Lingras and C.J. Butz. Conservative and aggressive rough SVR modeling. Theoretical Computer Science, 412(42):5885 - 5901, 2011. Rough Sets and Fuzzy Sets in Natural Computing.
Support vector regression provides an alternative to the neural networks in modeling non-linear real-world patterns. Rough values, with a lower and upper bound, are needed whenever the variables under consideration cannot be represented by a single value. This paper describes two approaches for the modeling of rough values with support vector regression (SVR). One approach, by attempting to ensure that the predicted high value is not greater than the upper bound and that the predicted low value is not less than the lower bound, is conservative in nature. On the contrary, we also propose an aggressive approach seeking a predicted high which is not less than the upper bound and a predicted low which is not greater than the lower bound. The proposal is shown to use ϵ-insensitivity to provide a more flexible version of lower and upper possibilistic regression models. The usefulness of our work is realized by modeling the rough pattern of a stock market index, and can be taken advantage of by conservative and aggressive traders.

Keywords: Support vector regression
[488] Tianhong Gu, Wencong Lu, Xinhua Bao, and Nianyi Chen. Using support vector regression for the prediction of the band gap and melting point of binary and ternary compound semiconductors. Solid State Sciences, 8(2):129 - 136, 2006.
In this work, atomic parameters support vector regression (APSVR) was proposed to predict the band gap and melting point of III–V, II–VI binary and I–III–VI2, II–IV–V2 ternary compound semiconductors. The predicted results of APSVR were in good agreement with the experimental ones. The prediction accuracies of different models were discussed on the basis of their mean error functions (MEF) in the leave-one-out cross-validation. It was found that the performance of APSVR model outperformed those of back propagation-artificial neural network (BP-ANN), multiple linear regression (MLR) and partial least squares regression (PLSR) methods.

Keywords: Semiconductor
[489] Jie Zhao and Khee Poh Lam. Influential factors analysis on LEED building markets in U.S. East Coast cities by using support vector regression. Sustainable Cities and Society, 5:37 - 43, 2012. Special Issue on Third Global Conference on Renewable Energy and Energy Efficiency for Desert Region - GCREEDER 2011. [ bib | DOI | http ]
Building industry is closely related to current energy and environmental issues. Several green building codes and rating systems addressing the problems have been developed. Leadership in Energy and Environmental Design (LEED) rating system is recognized as one of the effective and widely adopted commercial building standards. LEED buildings were investigated in several green city and green building studies but only used as instances in static matrices. These studies were not able to answer the question why a particular city favors LEED. However, in this paper, three commonly used machine learning algorithms – Linear Regression, Locally Weighted Regression and Support Vector Regression (SVR) – are compared and SVR is used to investigate, discover and evaluate the variables that could influence LEED building markets in U.S. East Coast cities. Machine learning models are first created and optimized with the features of city geography, demography, economy, higher education and policy. Then SVR model identifies the key factors by dynamic self-training and model-tuning using the dataset. Via optimization, the correlation coefficient between the model's prediction and actual value is 0.79. The result suggests that population and policy can be important factors for developing LEED buildings. It is also interesting that higher education institutions, especially accredited architecture schools could also be driving forces for LEED commercial building markets in East Coast cities.

Keywords: LEED
[490] Michael A. King, Alan S. Abrahams, and Cliff T. Ragsdale. Ensemble learning methods for pay-per-click campaign management. Expert Systems with Applications, 42(10):4818 - 4829, 2015. [ bib | DOI | http ]
Abstract Sponsored search advertising has become a successful channel for advertisers as well as a profitable business model for the leading commercial search engines. There is an extensive sponsored search research stream regarding the classification and prediction of performance metrics such as clickthrough rate, impression rate, average results page position and conversion rate. However, there is limited research on the application of advanced data mining techniques, such as ensemble learning, to pay per click campaign classification. This research presents an in-depth analysis of sponsored search advertising campaigns by comparing the classification results from four base classification models (Naïve Bayes, logistic regression, decision trees, and Support Vector Machines) with four popular ensemble learning techniques (Voting, Boot Strap Aggregation, Stacked Generalization, and MetaCost). The goal of our research is to determine whether ensemble learning techniques can predict profitable pay-per-click campaigns and hence increase the profitability of the overall portfolio of campaigns when compared to standard classifiers. We found that the ensemble learning methods were superior classifiers based on a profit per campaign evaluation criterion. This paper extends the research on applied ensemble methods with respect to sponsored search advertising.

Keywords: Sponsored search
[491] Yaoxiang Li, Yazhao Zhang, and Lichun Jiang. Modeling chlorophyll content of Korean pine needles with NIR and SVM. Procedia Environmental Sciences, 10, Part A:222 - 227, 2011. 2011 3rd International Conference on Environmental Science and Information Application Technology ESIAT 2011. [ bib | DOI | http ]
A model for predicting chlorophyll content of Korean pine needles was developed using near-infrared spectroscopy (NIR) combined with support vector machines (SVM). A hundred and forty-four Korean pine needle samples were collected in the study. Chlorophyll content of needle samples was measured with chlorophyll tester of SPAD502. Support vector machines for regression (SVR) was applied to model building. Radial basis function (RBF) was used as kernel function to establish a model for predicting chlorophyll content of Korean pine needles. For the train set, the coefficient of determination (R2) and the mean square error (MSE) were 0.8342 and 0.3104, respectively. The R2 and MSE were 0.8207 and 0.4618, respectively, for the test set. Results showed that using SVM in near-infrared spectroscopy calibration could significantly improve the model performance for rapid and accurate prediction of chlorophyll content of Korean pine needles.

Keywords: near-infrared spectroscopy
[492] Lü You, Liu Jizhen, and Qu Yaxin. A new robust least squares support vector machine for regression with outliers. Procedia Engineering, 15:1355 - 1360, 2011. CEIS 2011. [ bib | DOI | http ]
The least squares support vector machine (LS-SVM) is sensitive to noises or outliers. To address the drawback, a new robust least squares support vector machine (RLS-SVM) is introduced to solve the regression problem with outliers. A fuzzy membership function, which is determined by heuristic method, is assigned to each training sample as a weight. For each data point, firstly a deleted input neighborhood is found when the high-dimension feature space of input is focused on. Then the new field is reformulated after the output is brought in the neighborhood which we have found. The fuzzy membership function (weight) is set according to the distance from the data point to the center of its neighborhood and the radius of the neighborhood, which implies the probability to be an outlier. Two benchmark simulation experiments and analysis are presented to verify that the performance is improved.

Keywords: Outlier
[493] Yothin Jinjarak. Equity prices and financial globalization. International Review of Financial Analysis, 33:49 - 57, 2014. [ bib | DOI | http ]
Abstract This paper examines the association between equity returns, economic shocks, and economic integration. The empirical findings show that oil prices and U.S. Federal Reserve funds rates are associated with negative responses of international equity returns, of which a simple asset-pricing model is capable of explaining the international differences. Using vector autoregressions, we find that the effects of global economic shocks operate through the current excess returns of equity prices. Empirically, trade integration increases the responses of international equity returns to oil prices, while finance integration increases the responses of equity returns to Federal Reserve funds rates across countries.

Keywords: Asset prices
[494] H. Hang and I. Steinwart. Fast learning from α-mixing observations. Journal of Multivariate Analysis, 127:184 - 199, 2014. [ bib | DOI | http ]
Abstract We present a new oracle inequality for generic regularized empirical risk minimization algorithms learning from stationary α-mixing processes. Our main tool to derive this inequality is a rather involved version of the so-called peeling method. We then use this oracle inequality to derive learning rates for some learning methods such as empirical risk minimization (ERM), least squares support vector machines (SVMs) using given generic kernels, and SVMs using the Gaussian RBF kernels for both least squares and quantile regression. It turns out that for i.i.d. processes our learning rates for ERM and SVMs with Gaussian kernels match, up to some arbitrarily small extra term in the exponent, the optimal rates, while in the remaining cases our rates are at least close to the optimal rates.

Keywords: Alpha-mixing processes
[495] Huadi Xiong, Zhenzhong Chen, Haobo Qiu, Hongyan Hao, and Haoli Xu. Adaptive SVR-HDMR metamodeling technique for high dimensional problems. AASRI Procedia, 3:95 - 100, 2012. Conference on Modelling, Identification and Control. [ bib | DOI | http ]
Modeling or approximating high dimensional, computationally-expensive problems faces an exponentially increasing difficulty, the “curse of dimensionality”. This paper proposes a new form of high dimensional model representation (HDMR) by utilizing the support vector regression (SVR), termed adaptive SVR-HDMR, to conquer this dilemma. The proposed model could reveal explicit correlations among different input variables of the underlying function which is unknown or expensive for computation. Taking advantage of HDMR's hierarchical structure, it could alleviate the exponentially increasing difficulty, and gain satisfying accuracy with a small set of samples by SVR. Numerical examples of different dimensionality are given to illustrate the principle, procedure and performance of SVR-HDMR.

Keywords: Metamodel
[496] Andreia Andrade, José Silvestre Silva, Jaime Santos, and Pedro Belo-Soares. Classifier approaches for liver steatosis using ultrasound images. Procedia Technology, 5:763 - 770, 2012. 4th Conference of ENTERprise Information Systems – aligning technology, organizations and people (CENTERIS 2012). [ bib | DOI | http ]
This paper presents a semi-automatic classification approach to evaluate steatotic liver tissues using B-scan ultrasound images. Several features have been extracted and used in three different classifiers, such as Artificial Neural Networks (ANN), Support Vector Machines (SVM) and k-Nearest Neighbors (kNN). The classifiers were trained using the 10-cross validation method. A feature selection method based on stepwise regression was also exploited resulting in better accuracy predictions. The results showed that the SVM have a slightly higher performance than the kNN and the ANN, appearing as the most relevant one to be applied to the discrimination of pathologic tissues in clinical practice.

Keywords: Classifier
[497] Yasheng Wang, Meng Yang, Gao Wei, Ruifen Hu, Zhiyuan Luo, and Guang Li. Improved PLS regression based on SVM classification for rapid analysis of coal properties by near-infrared reflectance spectroscopy. Sensors and Actuators B: Chemical, 193:723 - 729, 2014. [ bib | DOI | http ]
Abstract Using near infrared reflectance spectra (NIRS) for rapid coal property analysis is convenient, fast, safe and could be used as online analysis method. This study first built Partial Least Square regression (PLS regression) models for six coal properties (total moisture (Mt), inherent moisture (Minh), ash (Ash), volatile matter (VM), fixed carbon (FC), and sulfur (S)) with the NIRS of 199 samples. The 199 samples came from different mines including 4 types of coal (fat coal, coking coal, lean coal and meager lean coal). In comparison, models for the six properties according to different types were built. Results show that models for different types are more effective than that of the entire sample set. A new method for coal classification was then obtained by applying Principal Component Analysis (PCA) and Support Vector Machine (SVM) to the spectra of the coal samples, which was of high classification accuracy and time saving. At last, different PLS regression models were built for different types classified by the new method and got better prediction results than that of full samples. Thus, the predictive ability was improved by fitting the coal samples into corresponding models using the SVM classification.

Keywords: Near infrared reflectance spectra
[498] Rongjie Yu and Mohamed Abdel-Aty. Utilizing support vector machine in real-time crash risk evaluation. Accident Analysis & Prevention, 51:252 - 259, 2013. [ bib | DOI | http ]
Real-time crash risk evaluation models will likely play a key role in Active Traffic Management (ATM). Models have been developed to predict crash occurrence in order to proactively improve traffic safety. Previous real-time crash risk evaluation studies mainly employed logistic regression and neural network models which have a linear functional form and over-fitting drawbacks, respectively. Moreover, these studies mostly focused on estimating the models but barely investigated the models’ predictive abilities. In this study, support vector machine (SVM), a recently proposed statistical learning model was introduced to evaluate real-time crash risk. The data has been split into a training dataset (used for developing the models) and scoring datasets (meant for assessing the models’ predictive power). Classification and regression tree (CART) model has been developed to select the most important explanatory variables and based on the results, three candidate Bayesian logistic regression models have been estimated, accounting for different levels of unobserved heterogeneity. Then SVM models with different kernel functions have been developed and compared to the Bayesian logistic regression model. Model comparisons based on areas under the ROC curve (AUC) demonstrated that the SVM model with Radial-basis kernel function outperformed the others. Moreover, several extension analyses have been conducted to evaluate the effect of sample size on SVM models’ predictive capability; the importance of variable selection before developing SVM models; and the effect of the explanatory variables in the SVM models. Results indicate that (1) smaller sample size would enhance the SVM model's classification accuracy, (2) variable selection procedure is needed prior to the SVM model estimation, and (3) explanatory variables have identical effects on crash occurrence for the SVM models and logistic regression models.

Keywords: Support vector machine model
[499] Ling Wang, Zhichun Mu, and Hui Guo. Application of support vector machine in the prediction of mechanical property of steel materials. Journal of University of Science and Technology Beijing, Mineral, Metallurgy, Material, 13(6):512 - 515, 2006. [ bib | DOI | http ]
The investigation of the influences of important parameters including steel chemical composition and hot rolling parameters on the mechanical properties of steel is a key for the systems that are used to predict mechanical properties. To improve the prediction accuracy, support vector machine was used to predict the mechanical properties of hot-rolled plain carbon steel Q235B. Support vector machine is a novel machine learning method, which is a powerful tool used to solve the problem characterized by small sample, nonlinearity, and high dimension with a good generalization performance. On the basis of the data collected from the supervisor of hot-rolling process, the support vector regression algorithm was used to build prediction models, and the off-line simulation indicates that predicted and measured results are in good agreement.

Keywords: mechanical properties
[500] Rongjing Hu, Jean-Pierre Doucet, Michel Delamar, and Ruisheng Zhang. QSAR models for 2-amino-6-arylsulfonylbenzonitriles and congeners HIV-1 reverse transcriptase inhibitors based on linear and nonlinear regression methods. European Journal of Medicinal Chemistry, 44(5):2158 - 2171, 2009. [ bib | DOI | http ]
A quantitative structure–activity relationship study of a series of HIV-1 reverse transcriptase inhibitors (2-amino-6-arylsulfonylbenzonitriles and their thio and sulfinyl congeners) was performed. Topological and geometrical, as well as quantum mechanical energy-related and charge distribution-related descriptors generated from CODESSA, were selected to describe the molecules. Principal component analysis (PCA) was used to select the training set. Six techniques: multiple linear regression (MLR), multivariate adaptive regression splines (MARS), radial basis function neural networks (RBFNN), general regression neural networks (GRNN), projection pursuit regression (PPR) and support vector machine (SVM) were used to establish QSAR models for two data sets: anti-HIV-1 activity and HIV-1 reverse transcriptase binding affinity. Results showed that PPR and SVM models provided powerful capacity of prediction.

Keywords: QSAR
[501] X. Sun, K.J. Chen, K.R. Maddock-Carlin, V.L. Anderson, A.N. Lepper, C.A. Schwartz, W.L. Keller, B.R. Ilse, J.D. Magolski, and E.P. Berg. Predicting beef tenderness using color and multispectral image texture features. Meat Science, 92(4):386 - 393, 2012. [ bib | DOI | http ]
The objective of this study was to investigate the usefulness of raw meat surface characteristics (texture) in predicting cooked beef tenderness. Color and multispectral texture features, including 4 different wavelengths and 217 image texture features, were extracted from 2 laboratory-based multispectral camera imaging systems. Steaks were segregated into tough and tender classification groups based on Warner–Bratzler shear force. The texture features were submitted to STEPWISE multiple regression and support vector machine (SVM) analyses to establish prediction models for beef tenderness. A subsample (80%) of tender or tough classified steaks were used to train models which were then validated on the remaining (20%) test steaks. For color images, the SVM model correctly identified tender steaks with 100% accuracy while the STEPWISE equation identified 94.9% of the tender steaks correctly. For multispectral images, the SVM model predicted 91% and STEPWISE predicted 87% average accuracy of beef tender.

Keywords: Beef
[502] Real Carbonneau, Kevin Laframboise, and Rustam Vahidov. Application of machine learning techniques for supply chain demand forecasting. European Journal of Operational Research, 184(3):1140 - 1154, 2008. [ bib | DOI | http ]
Full collaboration in supply chains is an ideal that the participant firms should try to achieve. However, a number of factors hamper real progress in this direction. Therefore, there is a need for forecasting demand by the participants in the absence of full information about other participants’ demand. In this paper we investigate the applicability of advanced machine learning techniques, including neural networks, recurrent neural networks, and support vector machines, to forecasting distorted demand at the end of a supply chain (bullwhip effect). We compare these methods with other, more traditional ones, including naïve forecasting, trend, moving average, and linear regression. We use two data sets for our experiments: one obtained from the simulated supply chain, and another one from actual Canadian Foundries orders. Our findings suggest that while recurrent neural networks and support vector machines show the best performance, their forecasting accuracy was not statistically significantly better than that of the regression model.

Keywords: Supply chain management
[503] Yitian Xu and Laisheng Wang. A weighted twin support vector regression. Knowledge-Based Systems, 33:92 - 101, 2012. [ bib | DOI | http ]
Twin support vector regression (TSVR) is a new regression algorithm, which aims at finding ϵ-insensitive up- and down-bound functions for the training points. In order to do so, one needs to resolve a pair of smaller-sized quadratic programming problems (QPPs) rather than a single large one in a classical SVR. However, the same penalties are given to the samples in TSVR. In fact, samples in the different positions have different effects on the bound function. Then, we propose a weighted TSVR in this paper, where samples in the different positions are proposed to give different penalties. The final regressor can avoid the over-fitting problem to a certain extent and yield great generalization ability. Numerical experiments on one artificial dataset and nine benchmark datasets demonstrate the feasibility and validity of our proposed algorithm.

Keywords: SVR
[504] Li-Yueh Chen. Application of SVR with chaotic GASA algorithm to forecast Taiwanese 3G mobile phone demand. Neurocomputing, 127:206 - 213, 2014. Advances in Intelligent Systems: Selected papers from the 2012 Brazilian Symposium on Neural Networks (SBRN 2012). [ bib | DOI | http ]
Abstract Along with the increases of 3G relevant products and the updating regulations of 3G phones, 3G phones are gradually replacing 2G phones as the mainstream product in Taiwan. Taiwan will be the country with higher 3G phone penetration rate in the world. Therefore, accurate 3G phones demand forecasting is necessary for those communication related enterprises. Due to complicate market growth tendency and multi-variate competitions, different subscribers with different demand types, 3G phones demand forecasting is with highly nonlinear characteristics. Recently, support vector regression (SVR) has been successfully applied to solve nonlinear regression and time series problems. This investigation presents a 3G phones demand forecasting model which combines chaotic sequence (mapped by cat function) with genetic algorithm–simulated annealing algorithm (namely CGASA) to improve the forecasting performance. The proposed SVRCGASA employs internal randomness of chaos iterations which is with better performance in function optimization to overcome premature local optimum that is suffered by GA–SA. Subsequently, a numerical example of 3G phones demand data from Taiwan are used to illustrate the proposed SVRCGASA model. The empirical results reveal that the proposed model outperforms the other three models, namely the autoregressive integrated moving average (ARIMA) model, the general regression neural networks (GRNN) model, SVRGA model, and SVRGASA model.

Keywords: Chaotic genetic algorithm–simulated annealing (CGASA)
[505] Junying Gan, Lichen Li, Yikui Zhai, and Yinhua Liu. Deep self-taught learning for facial beauty prediction. Neurocomputing, 144:295 - 303, 2014. [ bib | DOI | http ]
Abstract Most modern research of facial beauty prediction focuses on geometric features by traditional machine learning methods. Geometric features may easily lose much feature information characterizing facial beauty, rely heavily on accurate manual landmark localization of facial features and impose strict restrictions on training samples. Deep architectures have been recently demonstrated to be a promising area of research in statistical machine learning. In this paper, deep self-taught learning is utilized to obtain hierarchical representations, learn the concept of facial beauty and produce human-like predictor. Deep learning is helpful to recognize a broad range of visual concept effectively characterizing facial beauty. Through deep learning, reasonable apparent features of face images are extracted without depending completely on artificial feature selection. Self-taught learning, which has the ability of automatically improving network systems to understand the characteristics of data distribution and making recognition significantly easier and cheaper, is used to relax strict restrictions of training samples. Moreover, in order to choose a more appropriate method for mapping high-level representations into beauty ratings efficiently, we compare the performance of five regression methods and prove that support vector machine (SVM) regression is better. In addition, novel applications of deep self-taught learning on local binary pattern (LBP) and Gabor filters are presented, and the improvements on facial beauty prediction are shown by deep self-taught learning combined with LBP. Finally, human-like performance is obtained with learning features in full-sized and high-resolution images.

Keywords: Deep self-taught learning
[506] Min-Yuan Cheng and Minh-Tu Cao. Accurately predicting building energy performance using evolutionary multivariate adaptive regression splines. Applied Soft Computing, 22:178 - 188, 2014. [ bib | DOI | http ]
Abstract This paper proposes using evolutionary multivariate adaptive regression splines (EMARS), an artificial intelligence (AI) model, to efficiently predict the energy performance of buildings (EPB). EMARS is a hybrid of multivariate adaptive regression splines (MARS) and artificial bee colony (ABC). In EMARS, MARS addresses learning and curve fitting and ABC carries out optimization to determine the fittest parameter settings with minimal prediction error. The proposed model was constructed using 768 experimental datasets from the literature, with eight input parameters and two output parameters (cooling load (CL) and heating load (HL)). EMARS performance was compared against five other AI models, including MARS, back-propagation neural network (BPNN), radial basis function neural network (RBFNN), classification and regression tree (CART), and support vector machine (SVM). A 10-fold cross-validation approach found EMARS to be the best model for predicting CL and HL with 65% and 45% deduction in terms of RMSE, respectively, compared to other methods. Furthermore, EMARS is able to operate autonomously without human intervention or domain knowledge; represent derived relationship between response (HL and CL) with predictor variables associated with their relative importance.

Keywords: Multivariate adaptive regression splines
[507] A. Suárez Sánchez, P.J. García Nieto, P. Riesgo Fernández, J.J. del Coz Díaz, and F.J. Iglesias-Rodríguez. Application of an SVM-based regression model to the air quality study at local scale in the Avilés urban area (Spain). Mathematical and Computer Modelling, 54(5–6):1453 - 1466, 2011. [ bib | DOI | http ]
The objective of this study is to build a regression model of air quality by using the support vector machine (SVM) technique in the Avilés urban area (Spain) at local scale. Hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental data of nitrogen oxides (NOx), carbon monoxide (CO), sulphur dioxide (SO2), ozone (O3) and dust (PM10) for the years 2006–2008 are used to create a highly nonlinear model of the air quality in the Avilés urban nucleus (Spain) based on SVM techniques. One aim of this model is to obtain a preliminary estimate of the dependence between primary and secondary pollutants in the Avilés urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. They are known as criteria pollutants. This support vector regression model captures the main insight of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Avilés urban area. Finally, on the basis of these numerical calculations, using the support vector regression (SVR) technique, conclusions of this work are drawn.

Keywords: Air quality
[508] Zhe Sun, Jingjing Zhao, Zhengang Shi, and Suyuan Yu. Soft sensing of magnetic bearing system based on support vector regression and extended Kalman filter. Mechatronics, 24(3):186 - 197, 2014. [ bib | DOI | http ]
Abstract The rotor displacement measurement plays an important role in an active bearing system, however, in practice this measurement might be quite noisy, so that the control performance might be seriously degraded. In this paper, a soft sensing method for magnetic bearing-rotor system based on Support Vector Regression (SVR) and Extended Kalman Filter (EKF) is proposed. In the proposed method, SVR technique is applied to model the acceleration of the rotor, which is regarded as a nonlinear function of rotor displacement, rotor velocity and bearing currents; then this SVR model is used to construct an EKF estimator of rotor displacement. In the proposed method the bearing current is incorporated to the estimation of displacement, so that displacement can be precisely estimated even if very large observation noise is present. A series of experiments are performed and the results verify the validity of the proposed displacement soft sensing method.

Keywords: Active magnetic bearing
[509] X.C. Guo, C.G. Wu, M. Marchese, and Y.C. Liang. LS-SVR-based solving Volterra integral equations. Applied Mathematics and Computation, 218(23):11404 - 11409, 2012. [ bib | DOI | http ]
In this paper, a novel hybrid method is presented for solving the second kind linear Volterra integral equations. Due to the powerful regression ability of least squares support vector regression (LS-SVR), we approximate the unknown function of integral equations by using LS-SVR in intervals with known numerical solutions. The trapezoid quadrature is used to approximate subsequent integrations in intervals with unknown numerical solutions. The feasibility of the proposed method is examined on some integral equations. Experimental results of comparison with analytic and repeated modified trapezoid quadrature method’s solutions show that the proposed algorithm could reach a very high accuracy. The proposed algorithm could be a good tool for solving the second kind linear Volterra integral equations.

Keywords: Integral equation
[510] Sergio Saludes Rodil and M.J. Fuente. Fault tolerance in the framework of support vector machines based model predictive control. Engineering Applications of Artificial Intelligence, 23(7):1127 - 1139, 2010. [ bib | DOI | http ]
Model based predictive control (MBPC) has been extensively investigated and is widely used in industry. Besides this, interest in non-linear systems has motivated the development of MBPC formulations for non-linear systems. Moreover, the importance of security and reliability in industrial processes is at the origin of the fault tolerant strategies developed in the last two decades. In this paper an MBPC based on support vector machines (SVM) able to cope with faults in the plant itself is presented. The fault tolerant capability is achieved by means of the accurate on-line support vector regression (AOSVR) which is capable of training an SVM in an incremental way. Thanks to AOSVR, it is possible to train a plant model when a fault is detected and to replace the nominal model with the new one, which models the faulty plant. Results obtained under simulation are presented.

Keywords: Accurate online support vector regression
[511] Yen Yee Chia, Lam Hong Lee, Niusha Shafiabady, and Dino Isa. A load predictive energy management system for supercapacitor-battery hybrid energy storage system in solar application using the support vector machine. Applied Energy, 137:588 - 602, 2015. [ bib | DOI | http ]
Abstract This paper presents the use of a Support Vector Machine load predictive energy management system to control the energy flow between a solar energy source, a supercapacitor-battery hybrid energy storage combination and the load. The supercapacitor-battery hybrid energy storage system is deployed in a solar energy system to improve the reliability of delivered power. The combination of batteries and supercapacitors makes use of complementary characteristics that allow the overlapping of a battery’s high energy density with a supercapacitor’s high power density. This hybrid system produces a straightforward benefit over either individual system, by taking advantage of each characteristic. When the supercapacitor caters for the instantaneous peak power which prolongs the battery lifespan, it also minimizes the system cost and ensures a greener system by reducing the number of batteries. The resulting performance is highly dependent on the energy controls implemented in the system to exploit the strengths of the energy storage devices and minimize its weaknesses. It is crucial to use energy from the supercapacitor and therefore minimize jeopardizing the power system reliability especially when there is a sudden peak power demand. This study has been divided into two stages. The first stage is to obtain the optimum SVM load prediction model, and the second stage carries out the performance comparison of the proposed SVM-load predictive energy management system with conventional sequential programming control (if-else condition). An optimized load prediction classification model is investigated and implemented. This C-Support Vector Classification yields classification accuracy of 100% using 17 support vectors in 0.004866 s of training time. The Polynomial kernel is the optimum kernel in our experiments where the C and g values are 2 and 0.25 respectively. However, for the load profile regression model which was implemented in the K-step ahead of load prediction, the radial basis function (RBF) kernel was chosen due to the highest squared correlation coefficient and the lowest mean squared error. Results obtained show that the proposed SVM load predictive energy management system accurately identifies and predicts the load demand. This has been justified by the supercapacitor charging and leading the peak current demand by 200 ms for different load profiles with different optimized regression models. This methodology optimizes the cost of the system by reducing the amount of power electronics within the hybrid energy storage system, and also prolongs the batteries’ lifespan as previously mentioned.

Keywords: Supercapacitor
[512] Enrico Zio and Francesco Di Maio. Fatigue crack growth estimation by relevance vector machine. Expert Systems with Applications, 39(12):10681 - 10692, 2012. [ bib | DOI | http ]
The investigation of damage propagation mechanisms on a selected safety–critical component or structure requires the quantification of its remaining useful life (RUL) to verify until when it can continue performing the required function. In this work, a relevance vector machine (RVM), that is a Bayesian elaboration of support vector machine (SVM), automatically selects a low number of significant basis functions, called relevant vectors (RVs), for degradation model identification, degradation state regression and RUL estimation. In particular, RVM capabilities are exploited to provide estimates of the RUL of a component undergoing crack growth, within an original combination of data-driven and model-based approaches to prognostics. The application to a case study shows that the proposed approach compares well to other methods (the model-based Bayesian approach of particle filtering and the data-driven fuzzy similarity-based approach) with respect to computational demand, data requirements, accuracy and that its Bayesian setting allows representing and propagating the uncertainty in the estimates.

Keywords: Prognostics
[513] Shubh Bansal, Shantanu Roy, and Faical Larachi. Support vector regression models for trickle bed reactors. Chemical Engineering Journal, 207–208:822 - 831, 2012. 22nd International Symposium on Chemical Reaction Engineering (ISCRE 22). [ bib | DOI | http ]
Abstract Transport phenomena in multiphase reactors are poorly understood and first-principles modeling approaches have hitherto met with limited success. Industry continues thus far to depend heavily on engineering correlations for variables like pressure drop, transport coefficients and wetting efficiencies. While immensely useful, engineering correlations typically have wide variations in their predictive capability when venturing outside their instructed domain, and hence universally applicable correlations are rare. In this contribution, we present a machine learning approach for modeling such multiphase systems, specifically using the Support Vector Regression (SVR) algorithm. An application of trickle bed reactors is considered wherein key design variables for which numerous correlations exist in the literature (with a large variation in their predictions), are all correlated using the SVR approach with remarkable accuracy of prediction for all the different literature data sets with wide-ranging databanks.

Keywords: Support Vector Machines (SVMs)
[514] M. Piles, J. Díez, J.J. del Coz, E. Montañés, J.R. Quevedo, J. Ramon, O. Rafel, M. López-Béjar, and L. Tusell. Predicting fertility from seminal traits: Performance of several parametric and non-parametric procedures. Livestock Science, 155(1):137 - 147, 2013. [ bib | DOI | http ]
Abstract This research aimed at assessing the efficacy of non-parametric procedures to improve the classification of the ejaculates in the artificial insemination (AI) centers according to their fertility rank predicted from characteristics of the AI doses. A total of 753 ejaculates from 193 bucks were evaluated at three different times from 5 to 9 months of age for 21 seminal variables (related to ejaculate pH and volume, sperm concentration, viability, morphology and acrosome reaction traits, and dose characteristic) and their corresponding fertility score after AI over crossbred females. Fertility rate was categorized into five classes of equal length. Linear Regression (LR), Ordinal Logistic Regression (OLR), Support Vector Regression (SVR), Support Vector Ordinal Regression (SVOR), and Non-deterministic Ordinal Regression (NDOR) were compared in terms of their predictive ability with two baseline algorithms: MEAN and MODE, which always predict the mean and mode value of the classes observed in the data set, respectively. Predicting ability was measured in terms of rate of erroneous classifications, linear loss (average of the distance between the predicted and the observed classes), the number of predicted classes and the F1 statistic (which allows comparing procedures taking into account that they can predict different numbers of classes). The seminal traits with a bigger influence on fertility were established using stepwise regression and a nondeterministic classifier. MEAN, LR and SVR produced a higher percentage of wrongly classified cases than MODE (taken as reference for this statistic), whereas it was 6%, 13% and 39% smaller for SVOR, OLR and NDOR, respectively. However, NDOR predicted an average of 2.04 classes instead of one class predicted by the other procedures. All the procedures except MODE showed a similar smaller linear loss than the reference one (MEAN), SVOR being the one with the best performance. The NDOR showed the highest value of the F1 statistic. Values of linear loss and F1 statistics were far from their best value, indicating that possibly the variation in fertility explained by this group of semen characteristics is very low. From the total amount of traits included in the full model, 11, 16, 15, 18 and 3 features were kept after performing variable selection with the LR, OLR, SVR, SVOR and NDOR methods, respectively. For all methods, the reduced models showed an almost irrelevant decrease in their predictive abilities compared to the corresponding values obtained with the full models.

Keywords: Fertility
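The evaluation measures named in the abstract of [514] can be illustrated with a minimal sketch. It assumes ordinal classes coded as integers 1 to 5; the function names, the toy labels, and the rounding of the mean to the nearest class are assumptions made here for illustration and are not taken from the paper.

```python
from collections import Counter

def error_rate(observed, predicted):
    """Fraction of cases whose predicted class differs from the observed class."""
    return sum(o != p for o, p in zip(observed, predicted)) / len(observed)

def linear_loss(observed, predicted):
    """Average absolute distance between predicted and observed ordinal classes."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mode_baseline(train_classes, n):
    """Always predict the most frequent class seen in the training data."""
    mode = Counter(train_classes).most_common(1)[0][0]
    return [mode] * n

def mean_baseline(train_classes, n):
    """Always predict the mean class (rounded to an integer: an assumption)."""
    mean = round(sum(train_classes) / len(train_classes))
    return [mean] * n

# Hypothetical fertility classes on a 1..5 scale (not data from the paper).
observed = [1, 2, 2, 3, 5, 2, 4, 2]
mode_pred = mode_baseline(observed, len(observed))
mean_pred = mean_baseline(observed, len(observed))

print("MODE error rate:", error_rate(observed, mode_pred))
print("MODE linear loss:", linear_loss(observed, mode_pred))
print("MEAN linear loss:", linear_loss(observed, mean_pred))
```

Any candidate predictor can be scored with the same two functions and compared against these constant baselines.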
[515] Wei-Chiang Hong. Traffic flow forecasting by seasonal SVR with chaotic simulated annealing algorithm. Neurocomputing, 74(12–13):2096 - 2107, 2011. [ bib | DOI | http ]
Accurate forecasting of inter-urban traffic flow has been one of the most important issues globally in the research on road traffic congestion. However, the information of inter-urban traffic presents a challenging situation: traffic flow forecasting involves a rather complex nonlinear data pattern and, particularly during daily peak periods, traffic flow data reveal a cyclic (seasonal) trend. In recent years, the support vector regression model (SVR) has been widely used to solve nonlinear regression and time series problems. However, the applications of SVR models to deal with cyclic (seasonal) trend time series have not been widely explored. This investigation presents a traffic flow forecasting model that combines the seasonal support vector regression model with chaotic simulated annealing algorithm (SSVRCSA), to forecast inter-urban traffic flow. Additionally, a numerical example of traffic flow values from northern Taiwan is employed to elucidate the forecasting performance of the proposed SSVRCSA model. The forecasting results indicate that the proposed model yields more accurate forecasting results than the seasonal autoregressive integrated moving average (SARIMA), back-propagation neural network (BPNN) and seasonal Holt-Winters (SHW) models. Therefore, the SSVRCSA model is a promising alternative for forecasting traffic flow.

Keywords: Traffic flow forecasting
[516] Mohammad-Bagher Gholivand, Ali R. Jalalvand, Hector C. Goicoechea, and Thomas Skov. Chemometrics-assisted simultaneous voltammetric determination of ascorbic acid, uric acid, dopamine and nitrite: Application of non-bilinear voltammetric data for exploiting first-order advantage. Talanta, 119:553 - 563, 2014. [ bib | DOI | http ]
Abstract For the first time, several multivariate calibration (MVC) models including partial least squares-1 (PLS-1), continuum power regression (CPR), multiple linear regression-successive projections algorithm (MLR-SPA), robust continuum regression (RCR), partial robust M-regression (PRM), polynomial-PLS (PLY-PLS), spline-PLS (SPL-PLS), radial basis function-PLS (RBF-PLS), least squares-support vector machines (LS-SVM), wavelet transform-artificial neural network (WT-ANN), discrete wavelet transform-ANN (DWT-ANN), and back propagation-ANN (BP-ANN) have been constructed on the basis of non-bilinear first order square wave voltammetric (SWV) data for the simultaneous determination of ascorbic acid (AA), uric acid (UA), dopamine (DP) and nitrite (NT) at a glassy carbon electrode (GCE) to identify which technique offers the best predictions. The compositions of the calibration mixtures were selected according to a simplex lattice design (SLD) and validated with an external set of analytes' mixtures. An asymmetric least squares splines regression (AsLSSR) algorithm was applied for correcting the baselines. A correlation optimized warping (COW) algorithm was used for data alignment and lack of bilinearity was tackled by potential shift correction. The effects of several pre-processing techniques such as genetic algorithm (GA), orthogonal signal correction (OSC), mean centering (MC), robust median centering (RMC), wavelet denoising (WD), and Savitzky–Golay smoothing (SGS) on the predictive ability of the mentioned MVC models were examined. The best preprocessing technique was found for each model. According to the results obtained, the RBF-PLS was recommended to simultaneously assay the concentrations of AA, UA, DP and NT in human serum samples.

Keywords: Ascorbic acid
[517] Athanassios Zagouras, Hugo T.C. Pedro, and Carlos F.M. Coimbra. On the role of lagged exogenous variables and spatio–temporal correlations in improving the accuracy of solar forecasting methods. Renewable Energy, 78:203 - 218, 2015. [ bib | DOI | http ]
Abstract We propose and analyze a spatio–temporal correlation method to improve forecast performance of solar irradiance using gridded satellite-derived global horizontal irradiance (GHI) data. Forecast models are developed for seven locations in California to predict 1-h averaged GHI 1, 2 and 3 h ahead of time. The seven locations were chosen to represent a diverse set of maritime, mediterranean, arid and semi-arid micro-climates. Ground stations from the California Irrigation Management Information System were used to obtain solar irradiance time-series from the points of interest. In this method, firstly, we define areas with the highest correlated time-series between the satellite-derived data and the ground data. Secondly, we select satellite-derived data from these regions as exogenous variables to several forecast models (linear models, Artificial Neural Networks, Support Vector Regression) to predict GHI at the seven locations. The results show that using linear forecasting models and a genetic algorithm to optimize the selection of multiple time-lagged exogenous variables results in significant forecasting improvements over other benchmark models.

Keywords: Solar forecasting
[518] Daniel Westreich, Justin Lessler, and Michele Jonsson Funk. Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression. Journal of Clinical Epidemiology, 63(8):826 - 833, 2010. [ bib | DOI | http ]
Objective Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Conclusion Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice.

Keywords: Propensity scores
[519] Yukun Bao, Tao Xiong, and Zhongyi Hu. Multi-step-ahead time series prediction using multiple-output support vector regression. Neurocomputing, 129:482 - 493, 2014. [ bib | DOI | http ]
Abstract Accurate time series prediction over long future horizons is challenging and of great interest to both practitioners and academics. As a well-known intelligent algorithm, the standard formulation of Support Vector Regression (SVR) could be taken for multi-step-ahead time series prediction, only relying either on iterated strategy or direct strategy. This study proposes a novel multiple-step-ahead time series prediction approach which employs multiple-output support vector regression (M-SVR) with multiple-input multiple-output (MIMO) prediction strategy. In addition, the rank of three leading prediction strategies with SVR is comparatively examined, providing practical implications on the selection of the prediction strategy for multi-step-ahead forecasting while taking SVR as modeling technique. The proposed approach is validated with the simulated and real datasets. The quantitative and comprehensive assessments are performed on the basis of the prediction accuracy and computational cost. The results indicate that (1) the M-SVR using MIMO strategy achieves the most accurate forecasts with accredited computational load, (2) the standard SVR using direct strategy achieves the second most accurate forecasts, but with the most expensive computational cost, and (3) the standard SVR using iterated strategy is the worst in terms of prediction accuracy, but with the least computational cost.

Keywords: Multi-step-ahead time series prediction
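The iterated and direct strategies contrasted in the abstract of [519] can be sketched as follows. To stay self-contained, this sketch swaps in a one-lag ordinary least-squares line for SVR; the function names and the toy series are hypothetical, and the MIMO strategy is not shown because it needs a multiple-output model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (a stand-in for an SVR model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def iterated_forecast(series, horizon):
    """Fit a single one-step-ahead model and feed its predictions back in."""
    a, b = fit_line(series[:-1], series[1:])
    preds, last = [], series[-1]
    for _ in range(horizon):
        last = a * last + b
        preds.append(last)
    return preds

def direct_forecast(series, horizon):
    """Fit a separate model for each step h = 1..horizon."""
    preds = []
    for h in range(1, horizon + 1):
        a, b = fit_line(series[:-h], series[h:])
        preds.append(a * series[-1] + b)
    return preds

series = [64.0, 32.0, 16.0, 8.0, 4.0, 2.0, 1.0]  # hypothetical toy series
print(iterated_forecast(series, 3))
print(direct_forecast(series, 3))
```

The iterated strategy trains one model but can accumulate its own errors over the horizon, while the direct strategy trains one model per step, which is consistent with the cost ordering reported in the abstract.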
[520] Chen Lin, Xue Chen, Lei Jian, Chunhai Shi, Xiaoli Jin, and Guoping Zhang. Determination of grain protein content by near-infrared spectrometry and multivariate calibration in barley. Food Chemistry, 162:10 - 15, 2014. [ bib | DOI | http ]
Abstract Grain protein content (GPC) is an important quality determinant in barley. This research aimed to explore the relationship between GPC and diffuse reflectance spectra in barley. The results indicate that normalizing, and taking first-order derivatives can improve the class models by enhancing signal-to-noise ratio, reducing baseline and background shifts. The most accurate and stable models were obtained with derivative spectra for GPC. Three multivariate calibrations including least squares support vector machine regression (LSSVR), partial least squares (PLS), and radial basis function (RBF) neural network were adopted for development of GPC determination models. The Lin_LSSVR and RBF_LSSVR models showed higher accuracy than PLS and RBF_NN models. Thirteen spectral wavelengths were found to possess large spectrum variation and show high contribution to calibration models. From the present study, the calibration models of GPC in barley were successfully developed and could be applied to quality control in malting, feed processing, and breeding selection.

Keywords: Grain protein content (GPC)
[521] Hongying Du, Jie Wang, Xiaoyun Zhang, and Zhide Hu. A novel quantitative structure–activity relationship method to predict the affinities of MT3 melatonin binding site. European Journal of Medicinal Chemistry, 43(12):2861 - 2869, 2008. [ bib | DOI | http ]
The linear regression (LR) and non-linear regression methods – grid search-support vector machine (GS-SVM) and projection pursuit regression (PPR) were used to develop quantitative structure–activity relationship (QSAR) models for a series of derivatives of naphthalene, benzofurane and indole with respect to their affinities to MT3/quinone reductase 2 (QR2) melatonin binding site. Five molecular descriptors selected by genetic algorithm (GA) were used as the input variables for the LR model and two non-linear regression approaches. Comparison of the results of the three methods indicated that PPR was the most accurate approach in predicting the affinities of the MT3/QR2 melatonin binding site. This confirmed the capability of PPR for the prediction of the binding affinities of compounds. Moreover, it should facilitate the design and development of new selective MT3/QR2 ligands.

Keywords: Melatonin
[522] Guo en XIA and Wei dong JIN. Model of customer churn prediction on support vector machine. Systems Engineering - Theory & Practice, 28(1):71 - 77, 2008. [ bib | DOI | http ]
To improve the prediction abilities of machine learning methods, a support vector machine (SVM) based on structural risk minimization was applied to customer churn prediction. Using customer churn prediction cases from both domestic and foreign carriers, the method was compared with artificial neural network, decision tree, logistic regression, and naive Bayesian classifier. It is found that the method enjoys the best accuracy rate, hit rate, covering rate, and lift coefficient, and therefore provides an effective measurement for customer churn prediction.

Keywords: customer churn
[523] Huaiping Jin, Xiangguang Chen, Jianwen Yang, Hua Zhang, Li Wang, and Lei Wu. Multi-model adaptive soft sensor modeling method using local learning and online support vector regression for nonlinear time-variant batch processes. Chemical Engineering Science, 131:282 - 303, 2015. [ bib | DOI | http ]
Abstract Batch processes are often characterized by inherent nonlinearity, multiplicity of operating phases, and batch-to-batch variations, which poses great challenges for accurate and reliable online prediction of soft sensor. Especially, the soft sensor built with old data may encounter performance deterioration due to a failure of capturing the time-variant behaviors of batch processes, thus adaptive strategies are necessary. Unfortunately, conventional adaptive soft sensors cannot efficiently account for the within-batch as well as between-batch time-variant changes in batch process characteristics, which results in poor prediction accuracy. Therefore, a novel multi-model adaptive soft sensor modeling method is proposed based on the local learning framework and online support vector regression (OSVR) for nonlinear time-variant batch processes. First, a batch process is identified with a set of local domains and then the localized OSVR models are built for all isolated domains. Further, the estimation for a query data is obtained by adaptively combining multiple local models that perform best on the similar samples to the query point. The proposed multi-model OSVR (MOSVR) method provides four types of adaptation strategies: (i) adaptive combination based on Bayesian ensemble learning; (ii) online offset compensation; (iii) incremental updating of local models; and (iv) database updating. The effectiveness of the MOSVR approach and its superiority over traditional adaptive soft sensors in dealing with the within-batch and between-batch shifting dynamics is demonstrated through a simulated fed-batch penicillin fermentation process as well as an industrial fed-batch chlortetracycline fermentation process.

Keywords: Adaptive soft sensor
[524] Athina Tzovara, Ricardo Chavarriaga, and Marzia De Lucia. Quantifying the time for accurate EEG decoding of single value-based decisions. Journal of Neuroscience Methods, 250:114 - 125, 2015. Cutting-edge EEG Methods. [ bib | DOI | http ]
Abstract Background Recent neuroimaging studies suggest that value-based decision-making may rely on mechanisms of evidence accumulation. However no studies have explicitly investigated the time when single decisions are taken based on such an accumulation process. New Method Here, we outline a novel electroencephalography (EEG) decoding technique which is based on accumulating the probability of appearance of prototypical voltage topographies and can be used for predicting subjects’ decisions. We use this approach for studying the time-course of single decisions, during a task where subjects were asked to compare reward vs. loss points for accepting or rejecting offers. Results We show that based on this new method, we can accurately decode decisions for the majority of the subjects. The typical time-period for accurate decoding was modulated by task difficulty on a trial-by-trial basis. Typical latencies of when decisions are made were detected at ∼500 ms for ‘easy’ vs. ∼700 ms for ‘hard’ decisions, well before subjects’ response (∼340 ms). Importantly, this decision time correlated with the drift rates of a diffusion model, evaluated independently at the behavioral level. Comparison with Existing Method(s) We compare the performance of our algorithm with logistic regression and support vector machine and show that we obtain significant results for a higher number of subjects than with these two approaches. We also carry out analyses at the average event-related potential level, for comparison with previous studies on decision-making. Conclusions We present a novel approach for studying the timing of value-based decision-making, by accumulating patterns of topographic EEG activity at single-trial level.

Keywords: Decision-making
[525] Jennifer Dumont, Tapani Hirvonen, Ville Heikkinen, Maxime Mistretta, Lars Granlund, Katri Himanen, Laure Fauch, Ilkka Porali, Jouni Hiltunen, Sarita Keski-Saari, Markku Nygren, Elina Oksanen, Markku Hauta-Kasari, and Markku Keinänen. Thermal and hyperspectral imaging for Norway spruce (Picea abies) seeds screening. Computers and Electronics in Agriculture, 116:118 - 124, 2015. [ bib | DOI | http ]
Abstract The quality of seeds used in agriculture and forestry is tightly linked to the plant productivity. Thus, the development of high-throughput nondestructive methods to classify the seeds is of prime interest. Visible and near infrared (VNIR, 400–1000 nm range) and short-wave infrared (SWIR, 1000–2500 nm range) hyperspectral imaging techniques were compared to an infrared lifetime imaging technique to evaluate Norway spruce (Picea abies (L.) Karst.) seed quality. Hyperspectral image and thermal data from 1606 seeds were used to identify viable seeds, empty seeds and seeds infested by Megastigmus sp. larvae. The spectra of seeds obtained from hyperspectral imaging, especially in the SWIR range, and the thermal signal decay of seeds following an exposure to a short light pulse were characteristic of the seed status. Classification of the seeds into three classes was performed with a Support Vector Machine (nu-SVM) and sparse logistic regression based feature selection. Leave-One-Out classification resulted in 99% accuracy using either thermal or spectral measurements compared to radiography classification. In the spectral imaging case, all important features were located in the SWIR range. Furthermore, the classification results showed that accurate (93.8%) seed sorting can be achieved with a simpler method based on information from only three hyperspectral bands at 1310 nm, 1710 nm and 1985 nm locations, suggesting a possibility to build an inexpensive screening device. The results indicate that combined classification methods with hyperspectral imaging technique and infrared lifetime imaging technique constitute practically high performance fast and non-destructive techniques for high-throughput seed screening.

Keywords: Classification
[526] E. Alexandre, L. Cuadra, J.C. Nieto-Borge, G. Candil-García, M. del Pino, and S. Salcedo-Sanz. A hybrid genetic algorithm—extreme learning machine approach for accurate significant wave height reconstruction. Ocean Modelling, 92:115 - 123, 2015. [ bib | DOI | http ]
Abstract Wave parameters computed from time series measured by buoys (significant wave height Hs, mean wave period, etc.) play a key role in coastal engineering and in the design and operation of wave energy converters. Storms or navigation accidents can make measuring buoys break down, leading to missing data gaps. In this paper we tackle the problem of locally reconstructing Hs at out-of-operation buoys by using wave parameters from nearby buoys, based on the spatial correlation among values at neighboring buoy locations. The novelty of our approach for its potential application to problems in coastal engineering is twofold. On one hand, we propose a genetic algorithm hybridized with an extreme learning machine that selects, among the available wave parameters from the nearby buoys, a subset F_nSP with nSP parameters that minimizes the Hs reconstruction error. On the other hand, we evaluate to what extent the selected parameters in subset F_nSP are good enough in assisting other machine learning (ML) regressors (extreme learning machines, support vector machines and Gaussian process regression) to reconstruct Hs. The results show that all the ML methods explored achieve a good Hs reconstruction in the two different locations studied (Caribbean Sea and West Atlantic).

Keywords: Significant wave height local reconstruction
[527] Huaizhi Su, Zhiping Wen, Xiaoran Sun, and Meng Yang. Time-varying identification model for dam behavior considering structural reinforcement. Structural Safety, 57:1 - 7, 2015. [ bib | DOI | http ]
Abstract Mathematical relationship model between structural response and its influence factors is often used to identify and assess dam behavior. Under the action of loads, changing material property, structural reinforcement and so on, dam behavior expresses the uncertain variation characteristics. According to the prototypical observations, objective and subjective uncertain information on dam behavior before and after structural reinforcement, support vector regression (SVR) method is combined with Bayesian approach to build the time-varying identification model for dam behavior after structural reinforcement. Firstly, a static SVR model identifying dam behavior is established. Secondly, Bayesian approach is adopted to adjust dynamically the calculated results of static identification model. A method determining the Bayesian prior distribution and likelihood function is developed to describe the objective and subjective uncertainty on dam behavior. Emphasizing the importance of recent information on dam behavior, an algorithm updating in real time the Bayesian parameters is proposed to reflect the characteristic change of dam behavior after structural reinforcement. Lastly, the displacement behavior of one actual dam undergoing structural reinforcements is taken as an example. The identification capabilities of classical statistical model, static SVR model and time-varying model are compared. It is indicated that the proposed time-varying model can provide more accurate fitted and forecasted results, and is more suitable to be used to evaluate the reinforcement effect of dangerous dam.

Keywords: Dam
[528] Jonghyuck Park, Ick-Hyun Kwon, Sung-Shick Kim, and Jun-Geol Baek. Spline regression based feature extraction for semiconductor process fault detection using support vector machine. Expert Systems with Applications, 38(5):5711 - 5718, 2011. [ bib | DOI | http ]
Quality control is attracting more attention in the semiconductor market due to harsh competition. This paper considers Fault Detection (FD), a well-known philosophy in quality control. Conventional methods, such as non-stationary SPC chart, PCA, PLS, and Hotelling’s T^2, are widely used to detect faults. However, even for identical processes, the process time differs. Missing data may hinder fault detection. Artificial intelligence (AI) techniques are used to deal with these problems. In this paper, a new fault detection method using spline regression and Support Vector Machine (SVM) is proposed. For a given process signal, spline regression is applied regarding step changing points as knot points. The coefficients multiplied to the basis of the spline function are considered as the features for the signal. SVM uses those extracted features as input variables to construct the classifier for fault detection. Numerical experiments are conducted in the case of artificial data that replicates semiconductor manufacturing signals to evaluate the performance of the proposed method.

Keywords: Fault detection
[529] P.J. García Nieto, E. García-Gonzalo, J.R. Alonso Fernández, and C. Díaz Muñiz. A hybrid PSO optimized SVM-based model for predicting a successful growth cycle of the Spirulina platensis from raceway experiments data. Journal of Computational and Applied Mathematics, pages -, 2015. [ bib | DOI | http ]
Abstract In this research work, a practical new hybrid model to predict the successful growth cycle of Spirulina platensis was proposed. The model was based on Particle Swarm Optimization (PSO) in combination with support vector machines (SVMs). This optimization mechanism involved kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy. PSO–SVM-based models, which are based on the statistical learning theory, were successfully used here to predict the Chlorophyll a (Chl-a) concentration (output variable) as a function of the following input variables: pH, optical density, oxygen concentration, nitrate concentration, phosphate concentration, salinity, water temperature and irradiance. Regression with three different kernels (linear, quadratic and RBF) was performed and determination coefficients of 0.94, 0.97, and 0.99, respectively, were obtained. The PSO–SVM-based model goodness of fit to experimental data (Chl-a concentration) confirmed the good performance of this model. Indeed, it is well-known that Chl-a is an extremely important biomolecule, critical in photosynthesis, which allows plants to obtain energy from light, and it is one of the most often used algal biomass estimators. The model also identified the most influential parameters in the growth of S. platensis. Finally, the conclusions of this study are presented.

Keywords: Support vector machines (SVMs)
[530] David J. Bradshaw and Marianna Pensky. SVM-like decision theoretical classification of high-dimensional vectors. Journal of Statistical Planning and Inference, 140(3):705 - 718, 2010. [ bib | DOI | http ]
In this paper, we consider the classification of high-dimensional vectors based on a small number of training samples from each class. The proposed method follows the Bayesian paradigm, and it is based on a small vector which can be viewed as the regression of the new observation on the space spanned by the training samples. The classification method provides posterior probabilities that the new vector belongs to each of the classes, hence it adapts naturally to any number of classes. Furthermore, we show a direct similarity between the proposed method and the multicategory linear support vector machine introduced in Lee et al. [2004. Multicategory support vector machines: theory and applications to the classification of microarray data and satellite radiance data. Journal of the American Statistical Association 99 (465), 67–81]. We compare the performance of the technique proposed in this paper with the SVM classifier using real-life military and microarray datasets. The study shows that the misclassification errors of both methods are very similar, and that the posterior probabilities assigned to each class are fairly accurate.

Keywords: Support vector machine
[531] Weiwei Zong and Guang-Bin Huang. Face recognition based on extreme learning machine. Neurocomputing, 74(16):2541 - 2551, 2011. Advances in Extreme Learning Machine: Theory and Applications / Biological Inspired Systems. Computational and Ambient Intelligence / Selected papers of the 10th International Work-Conference on Artificial Neural Networks (IWANN2009). [ bib | DOI | http ]
Extreme learning machine (ELM) is an efficient learning algorithm for generalized single hidden layer feedforward networks (SLFNs), which performs well in both regression and classification applications. It has recently been shown that, from the optimization point of view, ELM and support vector machine (SVM) are equivalent but ELM has less stringent optimization constraints. Due to the mild optimization constraints, ELM can be easy to implement and usually obtains better generalization performance. In this paper we study the performance of the one-against-all (OAA) and one-against-one (OAO) ELM for classification in multi-label face recognition applications. The performance is verified through four benchmarking face image data sets.

Keywords: Face recognition
[532] Jun Zheng, Xinyu Shao, Liang Gao, Ping Jiang, and Haobo Qiu. A prior-knowledge input LSSVR metamodeling method with tuning based on cellular particle swarm optimization for engineering design. Expert Systems with Applications, 41(5):2111 - 2125, 2014. [ bib | DOI | http ]
Abstract Engineering design is usually a daunting optimization task which often involves time-consuming, even computationally prohibitive, processes. To reduce the computational expense, metamodels are commonly used to replace the actual expensive simulations or experiments. In this paper, a new and efficient metamodeling method named prior-knowledge input least square support vector regression (PKI-LSSVR) is developed, in which samples from different levels of fidelity are incorporated to gain an accurate approximation with a limited number of expensive high-fidelity (HF) simulations. The low-fidelity (LF) output serves as prior knowledge of the real response function, and is then used as the input variable of least square support vector regression (LSSVR). When the corresponding HF response is obtained, a function that maps the LF outputs to HF outputs is constructed via LSSVR. The predictive accuracy of LSSVR models is highly dependent on their learning parameters. Therefore, a novel optimization method, cellular particle swarm optimization (CPSO), is exploited to seek the optimal hyper-parameters for PKI-LSSVR in order to improve its generalization capability. To get a better optimization performance, a new neighborhood function is developed for CPSO where the global and local search is efficiently balanced by an adaptively varied neighbor radius. Several numerical experiments and one engineering case verify the efficiency of the proposed PKI-LSSVR method. Sample quality merits including sample sizes and noise, and metamodel performance evaluation measures incorporating accuracy, robustness, and efficiency are considered.

Keywords: Variable fidelity metamodel
[533] O. Gualdrón, J. Brezmes, E. Llobet, A. Amari, X. Vilanova, B. Bouchikhi, and X. Correig. Variable selection for support vector machine based multisensor systems. Sensors and Actuators B: Chemical, 122(1):259 - 268, 2007. [ bib | DOI | http ]
In this paper, a new variable selection technique inspired by sequential forward selection but specifically designed to work with support vector machines is introduced. The usefulness of the variable selection coupled to support vector machines for solving classification and regression problems is assessed by analysing two different databases. The first database corresponds to different concentrations of vapours and vapour mixtures measured with a metal oxide gas-sensor e-nose and the second database corresponds to different Iberian hams measured with a mass-spectrometry based e-nose. Using a reduced set of important variables (i.e. reducing the dimensionality of input space by the variable selection procedure) results in support vector machines with better performance. For example, the success rate in ham classification (11-class problem) rises from 79.91% (when all the variables available are used) to 90.30% (when a reduced set of input variables is used). Furthermore, a quantitative analysis of ham samples with good accuracy is shown to be possible: when the variable selection process introduced is coupled to support vector machine regression models, the correlation coefficients of actual versus predicted humidity, water activity and salt in ham samples are 0.975, 0.972 and 0.943, respectively. This compares favourably with the correlation coefficients obtained when no variable selection is performed (0.937, 0.924 and 0.894).

Keywords: Support vector machine
[534] Fudi Chen, Hao Li, Zhihan Xu, Shixia Hou, and Dazuo Yang. User-friendly optimization approach of fed-batch fermentation conditions for the production of iturin A using artificial neural networks and support vector machine. Electronic Journal of Biotechnology, pages -, 2015. [ bib | DOI | http ]
Abstract Background In the field of microbial fermentation technology, how to optimize the fermentation conditions is crucial for practical applications. Here, we use artificial neural networks (ANNs) and support vector machine (SVM) to offer a series of effective optimization methods for the production of iturin A. The concentration levels of asparagine (Asn), glutamic acid (Glu) and proline (Pro) (mg/L) were set as independent variables, while the iturin A titer (U/mL) was set as the dependent variable. General regression neural network (GRNN), multilayer feed-forward neural networks (MLFNs) and the SVM were developed. Comparisons were made among different ANNs and the SVM. Results The GRNN has the lowest RMS error (457.88) and the shortest training time (1 s), with a steady fluctuation during repeated experiments, whereas the MLFNs have comparatively higher RMS errors and longer training times, which fluctuate significantly with the change of nodes. The SVM also has a relatively low RMS error (466.13), with a short training time (1 s). Conclusion According to the modeling results, the GRNN is considered the most suitable ANN model for the design of the fed-batch fermentation conditions for the production of iturin A because of its high robustness and precision, and the SVM is also considered a very suitable alternative model. Under the tolerance of 30%, the prediction accuracies of the GRNN and SVM are both 100% in repeated experiments.

Keywords: Artificial neural network
[535] Wentao Mao, Guirong Yan, and Longlei Dong. Weighted solution path algorithm of support vector regression based on heuristic weight-setting optimization. Neurocomputing, 73(1–3):495 - 505, 2009. Timely Developments in Applied Neural Computing (EANN 2007) / Some Novel Analysis and Learning Methods for Neural Networks (ISNN 2008) / Pattern Recognition in Graphical Domains. [ bib | DOI | http ]
In the conventional solution path algorithm of support vector regression, the ε-insensitive error of every training sample is equally penalized, which means every sample affects the generalization ability equally. However, in some cases, e.g. time series prediction or noisy function regression, the ε-insensitive error of a sample that provides more important information should be penalized more heavily. Therefore, the weighted solution path algorithm of support vector regression is proposed in this paper. The error penalty parameter of each training sample is weighted differently, and the whole solution path is modified correspondingly. More importantly, by choosing the arctangent function as the prototype to generate weights with various characteristics, a heuristic weight-setting optimization algorithm is proposed to compute the optimal weights using particle swarm optimization (PSO). This method is applicable to different applications. Experiments on time series prediction and noisy function regression are conducted, demonstrating comparable results of the proposed weighted solution path algorithm and encouraging performance of the heuristic weight-setting optimization.

Keywords: Support vector machines
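The per-sample weighting described in entry [535] can be illustrated with a minimal sketch. It only shows how a weighted ε-insensitive penalty differs from the uniform one; it is not the authors' solution path algorithm. The function names, the `steepness` parameter and the particular arctangent shape are assumptions made for illustration.

```python
import math


def eps_insensitive(residual, eps):
    """Standard epsilon-insensitive error: zero inside the tube, linear outside."""
    return max(0.0, abs(residual) - eps)


def arctan_weights(n, steepness=1.0):
    """Hypothetical weight generator (an assumption, not the paper's formula).

    Returns n weights in (0, 1) that follow an arctangent curve and grow with
    the sample index, so later samples in a time series are penalized more.
    """
    mid = (n - 1) / 2.0
    return [0.5 + math.atan(steepness * (i - mid)) / math.pi for i in range(n)]


def weighted_penalty(y_true, y_pred, eps, C, weights):
    """Weighted error term: C * sum_i w_i * |y_i - f(x_i)|_eps."""
    return C * sum(
        w * eps_insensitive(t - p, eps)
        for w, t, p in zip(weights, y_true, y_pred)
    )
```

Setting every weight to 1.0 recovers the conventional, equally penalized error term.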
[536] Alessio Micheli, Filippo Portera, and Alessandro Sperduti. A preliminary empirical comparison of recursive neural networks and tree kernel methods on regression tasks for tree structured domains. Neurocomputing, 64:73 - 92, 2005. Trends in Neurocomputing: 12th European Symposium on Artificial Neural Networks 2004. [ bib | DOI | http ]
The aim of this paper is to start a comparison between recursive neural networks (RecNN) and kernel methods for structured data, specifically support vector regression (SVR) machine using a tree kernel, in the context of regression tasks for trees. Both the approaches can deal directly with a structured input representation and differ in the construction of the feature space from structured data. We present and discuss preliminary empirical results for specific regression tasks involving well-known quantitative structure-activity and quantitative structure-property relationship (QSAR/QSPR) problems, where both the approaches are able to achieve state-of-the-art results.

Keywords: Kernel methods
[537] Karol Lina López, Christian Gagné, Germán Castellanos-Dominguez, and Mauricio Orozco-Alzate. Training subset selection in hourly Ontario energy price forecasting using time series clustering-based stratification. Neurocomputing, 156:268 - 279, 2015. [ bib | DOI | http ]
Abstract Training a given learning-based forecasting method to a satisfactory level of performance often requires a large dataset. Indeed, to work properly, any data-driven method requires examples that provide a satisfactory representation of what we wish to model. This often implies using large datasets to be sure that the phenomenon of interest is properly sampled. However, learning from time series composed of too many samples can also be a problem, given that the computational requirements of the learning algorithms can easily grow polynomially with the training set size. In order to identify representative examples of a dataset, we propose a methodology using clustering-based stratification of time series to select a training data subset. The principle for constructing a representative sample set using this method consists in selecting heterogeneous instances picked from all the various clusters composing the dataset. Results obtained show that with a small number of training examples, obtained through the proposed clustering-based stratification, we can preserve the performance and improve the stability of models such as artificial neural networks and support vector regression, while training at a much lower computational cost. We illustrate the methodology through forecasting the one-step ahead Hourly Ontario Energy Price (HOEP).

Keywords: Stratification
[538] Jiangtao Peng and Luoqing Li. Support vector regression in sum space for multivariate calibration. Chemometrics and Intelligent Laboratory Systems, 130:14 - 19, 2014. [ bib | DOI | http ]
Abstract In this paper, a support vector regression algorithm in the sum of reproducing kernel Hilbert spaces (SVRSS) is proposed for multivariate calibration. In SVRSS, the target regression function is represented as the sum of several single kernel decision functions, where each single kernel function with a specific scale can approximate a certain component of the target function. For sum spaces with two Gaussian kernels, the proposed method is compared, in terms of RMSEP, to traditional chemometric PLS calibration methods and the recent promising SVR, GPR and ELM methods on a simulated data set and four real spectroscopic data sets. Experimental results demonstrate that SVR methods outperform PLS methods for spectroscopy regression problems. Moreover, the SVRSS method with multi-scale kernels improves on the single kernel SVR method and shows superiority over the GPR and ELM methods.

Keywords: Support vector regression
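The representation described in entry [538], a regression function written as the sum of single-kernel decision functions at different scales, can be sketched as follows. This is only an evaluation-time illustration under stated assumptions: the coefficients are placeholders rather than the result of SVRSS training, and the names `gaussian_kernel`, `sum_space_predict`, `sigmas` and `coeffs` are hypothetical.

```python
import math


def gaussian_kernel(x, z, sigma):
    """Gaussian (RBF) kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sum_space_predict(x, train_points, coeffs, sigmas, bias=0.0):
    """Evaluate f(x) = bias + sum_j sum_i coeffs[j][i] * k_j(x, x_i).

    coeffs[j] holds the (assumed, already fitted) coefficients of the j-th
    single-kernel component, whose Gaussian kernel has width sigmas[j].
    """
    total = bias
    for sigma, component_coeffs in zip(sigmas, coeffs):
        total += sum(
            c * gaussian_kernel(x, xi, sigma)
            for c, xi in zip(component_coeffs, train_points)
        )
    return total
```

With two entries in `sigmas`, a narrow kernel can capture sharp local features while a wide kernel captures the smooth trend.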
[539] J. Taboada, J.M. Matías, C. Ordóñez, and P.J. García. Creating a quality map of a slate deposit using support vector machines. Journal of Computational and Applied Mathematics, 204(1):84 - 94, 2007. Special issue dedicated to Professor Shinnosuke Oharu on the occasion of his 65th birthday. [ bib | DOI | http ]
In this work, we create a quality map of a slate deposit, using the results of an investigation based on surface geology and continuous core borehole sampling. Once the quality of the slate and the location of the sampling points have been defined, different kinds of support vector machines (SVMs), specifically SVM classification (multiclass one-against-all), ordinal SVM and SVM regression, are used to draw up the quality map. The results are also compared with those for kriging. The results obtained demonstrate that SVM regression and ordinal SVM are perfectly comparable to kriging and possess some additional advantages, namely, their interpretability and control of outliers in terms of the support vectors. Likewise, the benefits of using the covariogram as the kernel of the SVM are evaluated, with a view to incorporating the problem association structure in the feature space geometry. In our problem, this strategy not only improved our results but also implied substantial computational savings.

Keywords: Kriging
[540] Keun Lee, Sohyung Cho, and Shihab Asfour. Web-based algorithm for cylindricity evaluation using support vector machine learning. Computers & Industrial Engineering, 60(2):228 - 235, 2011. [ bib | DOI | http ]
This paper introduces a cylindricity evaluation algorithm based on support vector machine learning with a specific kernel function, referred to as SVR, as a viable alternative to the traditional least square method (LSQ) and non-linear programming algorithm (NLP). Using the theory of support vector machine regression, the proposed algorithm provides more robust evaluation in terms of CPU time and accuracy than NLP, and this is supported by computational experiments. Interestingly, it has been shown that the SVR significantly outperforms LSQ in terms of accuracy, while it can evaluate the cylindricity in a more robust fashion than NLP when the variance of the data points increases. The robust nature of the proposed algorithm is expected because it converts the original nonlinear problem with nonlinear constraints into another nonlinear problem with linear constraints. In addition, the proposed algorithm is programmed using Java Runtime Environment to provide users with a Web based open source environment. In a real-world setting, this would provide manufacturers with an algorithm that can be trusted to give the correct answer rather than rejecting a good part because of inaccurate computational results.

Keywords: Cylindricity evaluation
[541] Thrimoorthy Potta, Zhuo Zhen, Taraka Sai Pavan Grandhi, Matthew D. Christensen, James Ramos, Curt M. Breneman, and Kaushal Rege. Discovery of antibiotics-derived polymers for gene delivery using combinatorial synthesis and cheminformatics modeling. Biomaterials, 35(6):1977 - 1988, 2014. [ bib | DOI | http ]
Abstract We describe the combinatorial synthesis and cheminformatics modeling of aminoglycoside antibiotics-derived polymers for transgene delivery and expression. Fifty-six polymers were synthesized by polymerizing aminoglycosides with diglycidyl ether cross-linkers. Parallel screening resulted in identification of several lead polymers that resulted in high transgene expression levels in cells. The role of polymer physicochemical properties in determining efficacy of transgene expression was investigated using Quantitative Structure–Activity Relationship (QSAR) cheminformatics models based on Support Vector Regression (SVR) and ‘building block’ polymer structures. The QSAR model exhibited high predictive ability, and investigation of descriptors in the model, using molecular visualization and correlation plots, indicated that physicochemical attributes related to both aminoglycosides and diglycidyl ethers facilitated transgene expression. This work synergistically combines combinatorial synthesis and parallel screening with cheminformatics-based QSAR models for discovery and physicochemical elucidation of effective antibiotics-derived polymers for transgene delivery in medicine and biotechnology.

Keywords: Gene delivery
[542] Nikolaos Mittas, Efi Papatheocharous, Lefteris Angelis, and Andreas S. Andreou. Integrating non-parametric models with linear components for producing software cost estimations. Journal of Systems and Software, 99:120 - 134, 2015. [ bib | DOI | http ]
Abstract A long-lasting endeavor in the area of software project management is minimizing the risks caused by under- or over-estimations of the overall effort required to build new software systems. Deciding which method to use for achieving accurate cost estimations among the many methods proposed in the relevant literature is a significant issue for project managers. This paper investigates whether it is possible to improve the accuracy of estimations produced by popular non-parametric techniques by coupling them with a linear component, thus producing a new set of techniques called semi-parametric models (SPMs). The non-parametric models examined in this work include estimation by analogy (EbA), artificial neural networks (ANN), support vector machines (SVM) and locally weighted regression (LOESS). Our experimentation shows that the estimation ability of SPMs is superior to their non-parametric counterparts, especially in cases where both a linear and non-linear relationship exists between software effort and the related cost drivers. The proposed approach is empirically validated through a statistical framework which uses multiple comparisons to rank and cluster the models examined in non-overlapping groups performing significantly different.

Keywords: Software cost estimation
[543] Wooshik Kim, Jangbom Chai, and Intaek Kim. Development of a majority vote decision module for a self-diagnostic monitoring system for an air-operated valve system. Nuclear Engineering and Technology, pages -, 2015. [ bib | DOI | http ]
Abstract A self-diagnostic monitoring system is a system that has the ability to measure various physical quantities such as temperature, pressure, or acceleration from sensors scattered over a mechanical system such as a power plant, in order to monitor its various states, and to make a decision about its health status. We have developed a self-diagnostic monitoring system for an air-operated valve system to be used in a nuclear power plant. In this study, we have tried to improve the self-diagnostic monitoring system to increase its reliability. We have implemented three different machine learning algorithms, i.e., logistic regression, an artificial neural network, and a support vector machine. After each algorithm performs the decision process independently, the decision-making module collects these individual decisions and makes a final decision using a majority vote scheme. With this, we performed some simulations and presented some of its results. The contribution of this study is that, by employing more robust and stable algorithms, each of the algorithms performs the recognition task more accurately. Moreover, by integrating these results and employing the majority vote scheme, we can make a definite decision, which makes the self-diagnostic monitoring system more reliable.

Keywords: Air-operated valve
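The decision module in entry [543] collects the independent decisions of three classifiers and takes a majority vote. A minimal sketch of that voting step is shown below; the label strings and the tie-handling rule (returning `None`) are assumptions, since the abstract does not specify them.

```python
from collections import Counter


def majority_vote(decisions):
    """Return the label chosen by most classifiers, or None on a tie.

    `decisions` is a list with one predicted label per classifier, e.g. the
    outputs of logistic regression, a neural network and an SVM.
    """
    ranked = Counter(decisions).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no strict majority; the caller decides what to do
    return ranked[0][0]


# Hypothetical labels for illustration only.
final = majority_vote(["healthy", "faulty", "healthy"])
```

With three classifiers and two classes a tie cannot occur, which is one reason an odd number of voters is convenient.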
[544] Ginés Rubio, Héctor Pomares, Ignacio Rojas, and Luis Javier Herrera. A heuristic method for parameter selection in LS-SVM: Application to time series prediction. International Journal of Forecasting, 27(3):725 - 739, 2011. Special Section 1: Forecasting with Artificial Neural Networks and Computational Intelligence / Special Section 2: Tourism Forecasting. [ bib | DOI | http ]
Least Squares Support Vector Machines (LS-SVM) are the state of the art in kernel methods for regression. These models have been successfully applied for time series modelling and prediction. A critical issue for the performance of these models is the choice of the kernel parameters and the hyperparameters which define the function to be minimized. In this paper a heuristic method for setting both the σ parameter of the Gaussian kernel and the regularization hyperparameter based on information extracted from the time series to be modelled is presented and evaluated.

Keywords: Least squares support vector machines
[545] Mahesh Pal and Surinder Deswal. Support vector regression based shear strength modelling of deep beams. Computers & Structures, 89(13–14):1430 - 1439, 2011. [ bib | DOI | http ]
A support vector regression based modelling approach was used to predict the shear strength of reinforced and prestressed concrete deep beams. To compare its performance, a back-propagation neural network and three empirical relations were used with reinforced concrete deep beams. For prestressed deep beams, one empirical relation was used. Results suggest an improved performance by the SVR in terms of prediction capabilities in comparison to the empirical relations and back propagation neural network. Parametric studies with SVR suggest the importance of concrete cylinder strength and ratio of shear span to effective depth of beam on strength prediction of deep beams.

Keywords: Support vector machines
[546] Hamid Taghavifar and Aref Mardani. A comparative trend in forecasting ability of artificial neural networks and regressive support vector machine methodologies for energy dissipation modeling of off-road vehicles. Energy, 66:569 - 576, 2014. [ bib | DOI | http ]
Abstract Machine dynamics and the elastic–plastic characteristics of soil make soil-wheel interaction a very complex problem to estimate. Energy dissipation due to motion resistance, as the most prominent performance index of towed wheels, is associated with soil properties and tire parameters. The objective of this study was to develop, for the first time, a model for prediction of energy loss in soil working machines using the datasets obtained from a soil bin facility and a single-wheel tester. A total of 90 data points were derived from experiments at five levels of wheel load (1, 2, 3, 4, and 5 kN), six tire inflation pressures (50, 100, 150, 200, 250, and 300 kPa) and three forward velocities (0.7, 1.4 and 2 m/s). ANN (artificial neural network) was used for modeling the obtained results and compared with the forecasting ability of the SVR (support vector regression) technique. Several statistical criteria, i.e. MAPE (mean absolute percentage error), MSE (mean square error), MRE (mean relative error) and the coefficient of determination (R2), were incorporated in the investigations. It was observed, on the basis of the statistical criteria, that the SVR-based generalized model outperformed ANN in modeling energy loss and exhibited its applicability as a promising tool in this domain.

Keywords: Artificial neural network
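The statistical criteria named in entry [546] can be written out as a short sketch. MSE, MAPE and R2 follow their usual textbook definitions; the abstract does not define MRE, so the version here (mean of |error| / |actual|, as a fraction) is an assumption, and all function names are illustrative.

```python
def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mre(y_true, y_pred):
    """Mean relative error as a fraction (assumed definition; needs y_true != 0)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * mre(y_true, y_pred)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Lower MSE, MRE and MAPE and an R2 closer to 1 indicate a better fit, which is how two models such as an ANN and an SVR would be compared on the same test data.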
[547] Shuangyin Liu, Longqin Xu, Yu Jiang, Daoliang Li, Yingyi Chen, and Zhenbo Li. A hybrid WA–CPSO-LSSVR model for dissolved oxygen content prediction in crab culture. Engineering Applications of Artificial Intelligence, 29:114 - 124, 2014. [ bib | DOI | http ]
Abstract To increase prediction accuracy, reduce aquaculture risks and optimize water quality management in intensive aquaculture ponds, this paper proposes a hybrid dissolved oxygen content forecasting model based on wavelet analysis (WA) and least squares support vector regression (LSSVR) with an optimal improved Cauchy particle swarm optimization (CPSO) algorithm. In the modeling process, the original dissolved oxygen sequences were de-noised and decomposed into several resolution frequency signal subsets using the wavelet analysis method. Independent prediction models were developed using decomposed signals with wavelet analysis and least squares support vector regression. The independent prediction values were reconstructed to obtain the ultimate prediction results. In addition, because the kernel parameter δ and the regularization parameter γ in the LSSVR training procedure significantly influence forecasting accuracy, the Cauchy particle swarm optimization (CPSO) algorithm was used to select optimum parameter combinations for LSSVR. The proposed hybrid model was applied to predict dissolved oxygen in river crab culture ponds. Compared with traditional models, the test results of the hybrid WA–CPSO-LSSVR model demonstrate that de-noising and capturing non-stationary characteristics of dissolved oxygen signals after WA comprise a very powerful and reliable method for predicting dissolved oxygen content in intensive aquaculture accurately and quickly.

Keywords: Least squares support vector regression
[548] Abdollah Kavousi-Fard, Haidar Samet, and Fatemeh Marzbani. A new hybrid modified firefly algorithm and support vector regression model for accurate short term load forecasting. Expert Systems with Applications, 41(13):6047 - 6056, 2014. [ bib | DOI | http ]
Abstract Precise forecasting of the electrical load plays a highly significant role in the electricity industry and market. It provides economic operations and effective future plans for the utilities and power system operators. Due to the intermittent and uncertain characteristic of the electrical load, many research studies have been directed to nonlinear prediction methods. In this paper, a hybrid prediction algorithm comprised of Support Vector Regression (SVR) and Modified Firefly Algorithm (MFA) is proposed to provide the short term electrical load forecast. The SVR models utilize the nonlinear mapping feature to deal with nonlinear regressions. However, such models suffer from the lack of a methodical algorithm for obtaining the appropriate model parameters. Therefore, in the proposed method the MFA is employed to obtain the SVR parameters accurately and effectively. In order to evaluate the efficiency of the proposed methodology, it is applied to the electrical load demand in Fars, Iran. The obtained results are compared with those obtained from the ARMA model, ANN, SVR-GA, SVR-HBMO, SVR-PSO and SVR-FA. The experimental results affirm that the proposed algorithm outperforms other techniques.

Keywords: Support Vector Regression (SVR)
[549] Tatyana V. Bandos, Gustavo Camps-Valls, and Emilio Soria-Olivas. Statistical criteria for early-stopping of support vector machines. Neurocomputing, 70(13–15):2588 - 2592, 2007. Selected papers from the 3rd International Conference on Development and Learning (ICDL 2004) / Time series prediction competition: the CATS benchmark / 3rd International Conference on Development and Learning. [ bib | DOI | http ]
This paper proposes the use of statistical criteria for early-stopping support vector machines, both for regression and classification problems. The method basically stops the minimization of the primal functional when moments of the error signal (up to fourth order) become stationary, rather than according to a tolerance threshold of primal convergence itself. This simple strategy induces lower computational efforts and no significant differences are observed in terms of performance and sparsity.

Keywords: Support vector machines
[550] S. Deng and Tsung-Han Yeh. Using least squares support vector machines for the airframe structures manufacturing cost estimation. International Journal of Production Economics, 131(2):701 - 708, 2011. [ bib | DOI | http ]
Accurate cost estimation plays a significant role in industrial product development and production. This research applied the least squares support vector machines (LS-SVM) method to the problem of estimating the manufacturing cost for airframe structural projects. For comparison, it also evaluated the estimation performance of back-propagation neural networks and statistical regression analysis. In case studies, this research considered structural weight and manufacturing complexity as the main factors in determining the manufacturing labor hours. The test results verified that the LS-SVM model can provide accurate estimation performance and outperform the other methods. This research provides a feasible solution for the airframe manufacturing industry.

Keywords: Airframe structure
[551] F. Antonanzas-Torres, R. Urraca, J. Antonanzas, J. Fernandez-Ceniceros, and F.J. Martinez de Pison. Generation of daily global solar irradiation with support vector machines for regression. Energy Conversion and Management, 96:277 - 286, 2015. [ bib | DOI | http ]
Abstract Solar global irradiation is barely recorded in isolated rural areas around the world. Traditionally, solar resource estimation has been performed using parametric-empirical models based on the relationship of solar irradiation with other atmospheric and commonly measured variables, such as temperatures, rainfall, and sunshine duration, achieving a relatively high level of certainty. Considerable improvement in soft-computing techniques, which have been applied extensively in many research fields, has led to improvements in solar global irradiation modeling, although most of these techniques lack spatial generalization. This new methodology proposes support vector machines for regression with optimized variable selection via genetic algorithms to generate non-locally dependent and accurate models. A case study in Spain has demonstrated the value of this methodology. It achieved a striking reduction in the mean absolute error (MAE) – 41.4% and 19.9% – as compared to the classic parametric models of Bristow & Campbell and Antonanzas-Torres et al., respectively.

Keywords: Solar resource estimation
[552] Xiaoli Zhang, Peng Wang, Dakai Liang, Chunfeng Fan, and Cailing Li. A soft self-repairing for FBG sensor network in SHM system based on PSO–SVR model reconstruction. Optics Communications, 343:38 - 46, 2015. [ bib | DOI | http ]
Abstract A structural health monitoring (SHM) system takes advantage of an array of sensors to continuously monitor a structure and provide early predictions, such as the damage position and damage degree. Such a system must monitor the structure under any conditions, including adverse ones. Therefore, it must be robust and survivable, and even have self-repairing ability. In this study, a model reconstruction predicting algorithm based on particle swarm optimization-support vector regression (PSO–SVR) is proposed to achieve the self-repairing of the Fiber Bragg Grating (FBG) sensor network in the SHM system. Furthermore, an eight-point FBG sensor SHM system is tested experimentally in an aircraft wing box. For the damage loading position prediction on the aircraft wing box, six kinds of disabled modes are experimentally studied to verify the self-repairing ability of the FBG sensor network in the SHM system, and the predicting performance is compared with that of non-reconstruction based on the PSO–SVR model. The research results indicate that the model reconstruction algorithm performs better than the non-reconstruction model: if some sensors are invalid in the FBG-based SHM system, the predicting performance of the model reconstruction algorithm is almost consistent with that obtained when no sensor is invalid. In this way, the self-repairing ability of the FBG sensor network is achieved for the SHM system, so that the reliability and survivability of the FBG-based SHM system are enhanced if some FBG sensors are invalid.

Keywords: Self-repairing
[553] Athanasios Tsakonas and Bogdan Gabrys. A fuzzy evolutionary framework for combining ensembles. Applied Soft Computing, 13(4):1800 - 1812, 2013. [ bib | DOI | http ]
We propose an evolutionary framework for the production of fuzzy rule bases where each rule executes an ensemble of predictors. The architecture, the rule base and the composition of the ensembles are evolved over time. To achieve this, we employ a context-free grammar within a hybrid genetic programming system using a multi-population model. As base predictors, multilayer perceptron neural networks and support vector machines are available. We apply the system to several function approximation and regression tasks and compare the results with recent research and state-of-the-art models. We conclude that the proposed architecture is competitive and has a number of very desirable features supporting automation of predictive model building and their adaptation over time. Finally, we suggest further potential research directions.

Keywords: Ensemble systems
[554] Rachid Darnag, E.L. Mostapha Mazouz, Andreea Schmitzer, Didier Villemin, Abdellah Jarid, and Driss Cherqaoui. Support vector machines: Development of QSAR models for predicting anti-HIV-1 activity of TIBO derivatives. European Journal of Medicinal Chemistry, 45(4):1590 - 1597, 2010. [ bib | DOI | http ]
The tetrahydroimidazo[4,5,1-jk][1,4]benzodiazepinone (TIBO) derivatives, as non-nucleoside reverse transcriptase inhibitors, hold a significant place in the treatment of HIV infections. In the present paper, support vector machines (SVM) are used to develop quantitative relationships between the anti-HIV activity and four molecular descriptors of 82 TIBO derivatives. The SVM models give good statistical results compared to those given by multiple linear regression and artificial neural networks. The contribution of each descriptor to the structure-activity relationships was evaluated, and it indicates the importance of the hydrophobic parameter. The proposed method can be successfully used to predict the anti-HIV activity of TIBO derivatives with only four molecular descriptors, which can be calculated directly from molecular structure alone.

Keywords: QSAR
[555] Long Yu and Jian Xiao. Trade-off between accuracy and interpretability: Experience-oriented fuzzy modeling via reduced-set vectors. Computers & Mathematics with Applications, 57(6):885 - 895, 2009. Advances in Fuzzy Sets and Knowledge Discovery. [ bib | DOI | http ]
This paper focuses on the accuracy and interpretability of fuzzy modeling approaches. To balance the trade-off between these two aspects, a new fuzzy model based on an experience-oriented learning algorithm is proposed. First, support vector regression (SVR) with the presented Mercer kernels is employed to generate the initial fuzzy model and the available experience on the training data. Second, a bottom-up simplification algorithm is introduced to generate reduced-set vectors that simplify the structure of the initial fuzzy model; at the same time, the parameters of the simplified model are adjusted by a hybrid learning algorithm that combines linear ridge regression and gradient descent, based on a new performance measure. Finally, results from two-dimensional sinc function approximation and fuzzy control of the bar and beam system show that the proposed fuzzy model preserves good accuracy and interpretability.

Keywords: Fuzzy modeling
[556] Qi Wu. A hybrid-forecasting model based on Gaussian support vector machine and chaotic particle swarm optimization. Expert Systems with Applications, 37(3):2388 - 2394, 2010. [ bib | DOI | http ]
Load forecasting is an important subject for power distribution systems and has been studied from different points of view. This paper addresses the Gaussian noise components of load series, which the standard v-support vector regression machine with an ε-insensitive loss function cannot handle effectively. The relation between Gaussian noise and the loss function is established. On this basis, a new v-support vector machine (v-SVM) with a Gaussian loss function, named g-SVM, is proposed. To seek the optimal unknown parameters of g-SVM, a chaotic particle swarm optimization is also proposed. A hybrid load-forecasting model based on g-SVM and embedded chaotic particle swarm optimization (ECPSO) is then put forward. The results of applying it to load forecasting indicate that the hybrid model is effective and feasible.

Keywords: Support vector machine
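Note on [556]: to make the contrast between the two loss functions named in the abstract concrete, the sketch below evaluates the standard ε-insensitive loss and a squared ("Gaussian") loss on a few residuals. The form 0.5 * r^2 is the usual textbook Gaussian loss and is assumed here; the paper's exact formulation may differ. The function names and the value eps=0.1 are illustrative assumptions.

```python
def eps_insensitive_loss(residual, eps=0.1):
    """Zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(residual) - eps)


def gaussian_loss(residual):
    """Squared loss, the maximum-likelihood choice under Gaussian noise (assumed form)."""
    return 0.5 * residual ** 2


# Small residuals are ignored by the eps-insensitive loss
# but still penalized, lightly, by the Gaussian loss.
for r in (0.05, -0.3, 1.0):
    print(r, eps_insensitive_loss(r), gaussian_loss(r))
```

The ε-insensitive loss treats every residual smaller than ε as zero error, while the squared loss penalizes all residuals smoothly, which is the behavior one would expect to suit Gaussian-distributed noise.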
[557] Yuya Suzuki, Hirofumi Ibayashi, Yukimasa Kaneda, and Hiroshi Mineno. Proposal to sliding window-based support vector regression. Procedia Computer Science, 35:1615 - 1624, 2014. Knowledge-Based and Intelligent Information & Engineering Systems 18th Annual Conference, KES-2014 Gdynia, Poland, September 2014 Proceedings. [ bib | DOI | http ]
Abstract This paper proposes a new methodology, Sliding Window-based Support Vector Regression (SW-SVR), for micrometeorological data prediction. SVR is derived from statistical learning theory and can be used to predict a quantity forward in time based on training that uses past data. Although SVR is superior to traditional learning algorithms such as the Artificial Neural Network (ANN), it is difficult to choose a suitable amount of training data to build an optimum SVR model for micrometeorological data prediction. This paper reveals the periodic characteristics of micrometeorological data and shows that SW-SVR can automatically adapt the amount of training data to build an optimum SVR model using parallel distributed processing. The future prediction experiment was conducted on the air temperature of Sapporo, Tokyo, Hamamatsu, and Naha. As a result, SW-SVR improved prediction accuracy in Sapporo and Tokyo. In addition, it reduced calculation time by more than 96% in all regions.

Keywords: Support vector regression (SVR)
[558] Qi Wu. The hybrid forecasting model based on chaotic mapping, genetic algorithm and support vector machine. Expert Systems with Applications, 37(2):1776 - 1783, 2010. [ bib | DOI | http ]
Aiming at complex systems characterized by multi-dimension, small samples, nonlinearity and multi-apex, and combining chaos theory and a genetic algorithm with the support vector machine (SVM), this paper proposes a chaotic SVM named Cv-SVM, short for chaotic v-support vector machine. Cv-SVM, which has one fewer constraint condition than the standard v-SVM, is proved to satisfy the structural risk minimization principle under the condition of probability. Moreover, there is no parameter b in the regression function of Cv-SVM. An intelligent forecasting method is then put forward. The results of its application to car demand forecasting show that the forecasting method based on Cv-SVM is feasible and effective.

Keywords: Support vector machine
[559] Inchio Lou, Zhengchao Xie, Wai Kin Ung, and Kai Meng Mok. Integrating support vector regression with particle swarm optimization for numerical modeling for algal blooms of freshwater. Applied Mathematical Modelling, 2015. [ bib | DOI | http ]
Abstract Cyanotoxins released by algae are carcinogenic and very harmful to humans. It is therefore of great significance to model how the algae population changes dynamically in freshwater reservoirs. Practical modeling is very difficult, however, because the water variables and their internal mechanisms are complicated and non-linear. To alleviate the algal bloom problems in the Macau Main Storage Reservoir (MSR), this work proposes and develops a hybrid intelligent model combining Support Vector Regression (SVR) and Particle Swarm Optimization (PSO) to yield optimal control of the parameters that predict and forecast the phytoplankton dynamics. In this process, collected data for the current month's variables and the previous months' variables are used for model prediction and forecasting, respectively. In the correlation analysis of 23 water variables that were monitored monthly, 15 variables such as alkalinity, bicarbonate (HCO3-), dissolved oxygen (DO), total nitrogen (TN), turbidity, conductivity, nitrate, suspended solid (SS) and total organic carbon (TOC) are selected. Data from 2001 to 2008 for each of these selected variables are used for training, while data from 2009 to 2011, the most recent three years, are used for testing. It can be seen from the numerical results that the prediction and forecast powers are respectively estimated at approximately 0.767 and 0.876, and naturally it can be concluded that the newly proposed PSO–SVR is working well and can be