Applications of soft computing models for predicting sea surface temperature: a comprehensive review and assessment

The application of soft computing (SC) models for predicting environmental variables is widely gaining popularity, because of their capability to describe complex non-linear processes. The sea surface temperature (SST) is a key quantity in the analysis of sea and ocean systems, due to its relation with water quality, organisms, and hydrological events such as droughts and floods. This paper provides a comprehensive review of the SC model applications for estimating SST over the last two decades. Types of model (based on artificial neural networks, fuzzy logic, or other SC techniques), input variables, data sources, and performance indices are discussed. Existing trends of research in this field are identified, and possible directions for future investigation are suggested.


Background on sea surface temperature (SST)
In the last five decades, several studies have been conducted to estimate the sea surface temperature (SST) for assessing thermal exchanges between oceans and atmosphere, behavior patterns of aquatic species, and ocean or sea currents (Anding and Kauth 1970). Identifying SST anomalies (departures from average conditions) has been an active area of research in oceanography and atmospheric studies (Corchado 1995). These anomalies are caused by the dynamic behavior of oceans, which contain many water masses interacting with each other at their boundaries (Corchado and Aiken 2002). These interactions directly affect the SST anomalies (Corchado et al. 2001) and make it difficult to develop mathematical expressions to estimate SST. SST anomalies significantly affect sea surface salinity, precipitation, and ocean circulation (Amouamouha and Badalians Gholikandi 2017; Gupta and Malmgren 2009; Huang et al. 2008a). In addition, SST plays an important role in the occurrence of the El Niño Southern Oscillation (ENSO) phenomenon (Annamalai et al. 2005; Gordon 1986; Nicholls 1984). There is strong evidence that SST anomalies directly influence extreme hydrological events such as droughts (Salles et al. 2016), and multiple studies have indicated a strong correlation between SST anomalies and hurricanes (Jiang et al. 2018a; Kahira et al. 2018; Patil and Deo 2018).
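As a minimal sketch of what "departures from average conditions" means in practice, the Python snippet below computes monthly SST anomalies by subtracting a monthly climatology from a synthetic series. The function name `monthly_anomalies`, the choice of a per-calendar-month mean as the reference, and the synthetic data are illustrative assumptions and are not taken from any of the cited studies.

```python
import numpy as np

def monthly_anomalies(sst, start_month=1):
    """Return SST anomalies relative to the mean of each calendar month.

    sst: 1-D array of monthly mean SST values (degrees C).
    start_month: calendar month (1-12) of the first sample.
    """
    sst = np.asarray(sst, dtype=float)
    months = (np.arange(sst.size) + start_month - 1) % 12  # 0 = January
    # Climatology: average of all samples falling in the same calendar month.
    climatology = np.array([sst[months == m].mean() for m in range(12)])
    return sst - climatology[months]

# Synthetic example: 10 years of a seasonal cycle plus a warm event in year 5.
t = np.arange(120)
sst = 20.0 + 3.0 * np.sin(2 * np.pi * t / 12)
sst[48:60] += 1.0  # hypothetical 1 degree C warm anomaly lasting one year
anom = monthly_anomalies(sst)
```

In this toy series the seasonal cycle is removed entirely, so the warm year stands out as a positive anomaly while the remaining years sit slightly below zero.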
Historically, linear regression and statistical methods, such as the Autoregressive Integrated Moving Average (ARIMA) models, have been extensively applied for estimating SST. Floating buoys and satellite observations are the two main sources of data to evaluate SST in seas and oceans. The statistical methods attempt to identify relations between different parameters and SST. For instance, for satellite-based datasets, statistical models map satellite data, such as thermal infrared radiation, to SST. When data is gathered from buoys, these methods attempt to find appropriate relations between SST and surface heat flux, wind stress and other factors (Anding and Kauth 1970;Corchado 1995;McMillin 1975;Prabhakara et al. 1974).
In general, the complex nature of the SST anomalies, as well as the intrinsic uncertainties in the conditions of sea or ocean systems (Corchado and Fyfe 1999; De Paz et al. 2012), makes the prediction of SST in space and time with mathematical or statistical models very challenging. The progressive increase in computing power has led to the development of techniques for investigating ocean systems, such as case-based reasoning (CBR), which solves a new problem by drawing on the solutions of similar previous problems. However, this kind of approach is limited when applied to complex problems because of its high dependence on human judgment (Corchado and Aiken 2002).

Soft computing (SC) models for SST Prediction
Soft computing (SC) methods, often indicated with the term artificial intelligence (AI), are increasingly being adopted to solve complex problems, due to lower cost of computation and higher flexibility and accuracy in comparison with physically based numerical models (Konar 2018;Yaseen et al. 2019;Sharafati et al. 2020;Tung and Yaseen 2020). SC models are capable of recognizing meaningful patterns in complex problems (Sharafati et al. 2019a) and often adopt nature-inspired techniques (Barzegar et al. 2016;Corchado and Aiken 2002;Konar 2018). There are several categories of SC models, such as artificial neural networks (ANN), adaptive neurofuzzy inference systems (ANFIS), and evolutionary methods inspired by animals or plants.
Generally, two types of SC approaches have been utilized. The first type of approach entails the combination of SC methods and numerical models (e.g., CBR) to enhance the numerical model results. In this case, the SC techniques are applied to eliminate the dependency on human judgment. The second type of approach involves the use of a standalone SC model; from a review of the past studies on SST, this approach is more commonly adopted than the first.
There is strong evidence in the literature that standalone SC models can overcome the common limitations of other predictive models in different fields, such as the prediction of precipitation, scour, and fractional coverage of melt ponds (Rösel et al. 2012; Sharafati et al. 2019b; Yaseen et al. 2019).

Scope of this study
This study aims to produce a comprehensive survey of the previous applications of SC methods for SST prediction, based on the last two decades of research, focusing on the methods employed and the input variables and data sources used. This study also highlights the issues that are still unsolved and the possible future directions of SC application for SST prediction. The goal is to provide an overarching reference for researchers and practitioners in the field. Figure 1 shows the general sequence of steps used for predicting SST using SC methods: (i) the variables for prediction are obtained from field or remote sensing sources, such as buoys, vessels, and satellites; (ii) the SC model's parameters are initialized and tuned; and (iii) the SC model's results are compared with the observed SST data to quantify the prediction performance using appropriate metrics.
Fig. 1 Conceptual workflow of SC models for SST prediction
Haghbin et al. Progress in Earth and Planetary Science (2021)
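Step (iii) of the workflow described above can be sketched in a few lines of Python. The particular metrics shown here (root mean square error, mean absolute error, mean bias, and Pearson correlation) are assumed as common choices for illustration; the function name `evaluate` and the sample values are hypothetical.

```python
import numpy as np

def evaluate(observed, predicted):
    """Compare predicted and observed SST with a few common error metrics."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),  # root mean square error
        "mae": float(np.mean(np.abs(err))),         # mean absolute error
        "bias": float(np.mean(err)),                # mean signed error
        "r": float(np.corrcoef(obs, pred)[0, 1]),   # Pearson correlation
    }

# Hypothetical observed and predicted SST values (degrees C).
obs = [26.1, 26.4, 27.0, 27.3, 26.8]
pred = [26.0, 26.6, 26.9, 27.5, 26.8]
scores = evaluate(obs, pred)
```

Reporting several metrics together is useful because a model can show a high correlation while still carrying a systematic bias.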
2 Literature review
2.1 Input variables and data sources for SST prediction
Several types of input variables have been used to predict SST in seas and oceans using SC models. Information on fossil remains of phytoplankton and zooplankton from surface marine sediment samples has been used to estimate SST in past periods (Pflaumann et al. 1996).
In other models, values of SST itself at previous times (lagged SST values) were used to estimate later values of SST, with an approach that is often adopted in hydrological time series prediction applications.
In a few studies, other variables related to SST, such as net surface heat flux, wind stress, and dynamic wave height, have been used as input variables, although this is not very common because information on these variables is often lacking. Figure 2 shows that lagged SST values are the most frequently adopted type of input variable (55%), while surface sediment samples have only occasionally been employed (8%) to estimate SST.
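To illustrate how lagged SST values are typically arranged as model inputs, the sketch below builds a supervised-learning dataset from a single time series. The helper name `make_lagged` and its `n_lags` and `horizon` parameters are assumptions introduced for this example and do not come from the reviewed studies.

```python
import numpy as np

def make_lagged(series, n_lags, horizon=1):
    """Build (X, y): each row of X holds n_lags consecutive past values and
    y is the value `horizon` steps after the last of those lags."""
    s = np.asarray(series, dtype=float)
    n_rows = s.size - n_lags - horizon + 1
    if n_rows <= 0:
        raise ValueError("series too short for the requested lags/horizon")
    X = np.stack([s[i:i + n_lags] for i in range(n_rows)])
    y = s[n_lags + horizon - 1:]
    return X, y

# Placeholder series standing in for an SST record.
series = np.arange(10.0)
X, y = make_lagged(series, n_lags=3)
```

Each row of `X` can then be fed to any of the SC models discussed in this review, with the matching entry of `y` as the target.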

Data from buoys
Marine buoys are floating devices for in situ observations of marine environments, measuring parameters such as SST, turbidity, conductivity, and sea surface salinity (SSS). Buoys were originally designed for vessel navigation and warning purposes and have been in use since at least 285 BC (Soreide et al. 2001). Modern buoys have been deployed along the US coasts since 1940 and are equipped with sensors to gather data on various hydrodynamic and atmospheric quantities. The first modern buoy of the US Navy, a boat-shaped, 6-meter-long platform, is called the Navy Oceanographic Meteorological Automated Device (NOMAD). NOMAD buoys can transfer marine and ocean information every 3 h (Soreide et al. 2001). A smaller type of buoy, named Autonomous Temperature Line Acquisition System (ATLAS), has been employed to collect data on ocean currents and other parameters such as temperature in the North Pacific Ocean and to investigate El Niño related events.
Several programs led by various institutions around the world investigate ocean behaviors using buoy data. One of the most well-known and successful is the Argo program, which has been operational since the early 2000s. The seminal work by Davis (1991) and Davis et al. (1992) established the basis for this research program, which is conducted as part of the World Climate Research Program (WCRP) to investigate temperature, salinity, and ocean hydrodynamic properties such as circulation in the Atlantic, Indian, Pacific, and Southern oceans. More than 3200 floating buoys are used, and most data are gathered at depths between 1000 and 2000 meters. The Argo program employs the Argos satellite system and the Iridium satellite communication system to transfer the observed data. The outputs of this program are widely used for estimating SST or verifying predicted results in studies using satellite data (Argo 2020).

Data from satellite sensors
Data from satellite sensors have been used for evaluating SST since 1981. Satellite observations are recognized as a tool for indirect measurement that can provide the spatio-temporal SST distribution around the world (Merchant et al. 2019). To estimate SST using SC models, data from different satellite sensors are used, with the most commonly used in this field being the Moderate Resolution Imaging Spectroradiometer (MODIS) and the Advanced Very High Resolution Radiometer (AVHRR).
The MODIS (Dou et al. 2020) is a device that was installed on the Terra satellite in 1999 (Kwon et al. 2020) by NASA (Gao and Kaufman 2003). This instrument was also installed on the Aqua satellite in 2002. It can receive data in 36 bands in the wavelength range of 0.4 to 14.4 microns with variable spatial resolution (two, five, and 29 bands at 250 m, 500 m, and 1 km resolution, respectively) (Barnes et al. 1998; Chen et al. 2020). The MODIS is designed to measure large-scale changes in land cover (Kwon et al. 2020) as well as cloud cover and ground radiation. The sensor includes three tools for onboard calibration, namely a solar diffuser (SD) (Angal et al. 2020) with a solar diffuser stability monitor, a spectral radiation calibration package (Yu et al. 2020a), and a black body. MODIS sensor data are presented in four groups (atmosphere, ocean, earth, and ice), splitting the Earth in 35 sectors from west to east and 17 sectors from north to south.
The AVHRR (Akkermans and Clerbaux 2020) is a radiation detector that can be used to remotely determine cloud cover and ground temperature. The AVHRR collects data in different bands of radiation wavelength (Mouginis-Mark et al. 1994), using six detectors. The first AVHRR was a 4-channel radiometer deployed on the TIROS-N satellite. It was then upgraded to a 5-channel instrument (AVHRR/2) (Zhu et al. 2019) and installed on the NOAA-7 satellite (Wang et al. 2020b). The latest version was the 6-channel radiometer (AVHRR/3) launched on the NOAA-15 satellite (Tao et al. 2020). The AVHRR/3 weighs about 72 pounds (Ilčev 2017; Jyothirmai et al. 2018), measures 11.5 inches by 14.4 inches by 31.4 inches, and has a power of 28.5 watts (Jyothirmai et al. 2018). The satellites carrying the sensor orbit between 833 and 870 kilometers above the Earth's surface (Brown et al. 1985), and the sensor has been continuously collecting data since 1981 (Pinzon and Tucker 2014). Processing the wavelength information collected by the AVHRR enables multi-spectral analysis to estimate hydrological, oceanographic, and meteorological parameters, in support of, among others, climate change and environmental pollution studies.

Frameworks for SST mapping
The Optimum Interpolation Sea Surface Temperature (OISST) framework by the US National Oceanic and Atmospheric Administration (NOAA) uses different data sources, such as satellites, vessels, buoys, and Argo data, for estimating SST at global scale. The use of multiple data sources allows for complementing data and reducing possible errors. This framework includes data from 1 September 1981 until present. The OISST framework is widely accepted by the research community for assessing SST using AI-based models (NOAA 2020). The Hadley Centre Sea Ice and Sea Surface Temperature (HadISST) framework is another valuable data bank (National Center for Atmospheric Research Staff (Eds) 2020) containing monthly SST and Sea Ice Concentration (SIC) data from around the world since 1871. Like the OISST framework, the HadISST framework uses different data sources, including buoys, ships, and AVHRR.

Most commonly investigated regions
Our literature review has revealed the Pacific Ocean to be the most extensively investigated region for SST prediction using SC methods in the last two decades, with more than 10 papers. Another region that has attracted significant attention is the Indian Ocean, with eight different studies. Research groups have also focused on the Atlantic Ocean, Bohai Sea, East China Sea, and Arabian Sea, which rank third, fourth, fifth, and sixth, respectively. A further 13 studies were conducted in other regions. The details are provided in Fig. 3.

Soft computing models for SST prediction
For the purposes of this review, we categorized the SC models in two groups: models based on ANN (characterized by different kinds of training function) and other models based on different SC techniques. The frequency of application of these two SC model categories for SST prediction is shown in Fig. 4. The ANN-based models (88%) are by far the most commonly used in comparison to the other SC models (12%).
The following sections review the various applications in literature of the different types of SC model for SST prediction.
2.4 ANN-based models for SST prediction
2.4.1 Brief background on ANN-based models
ANN methods are inspired by brain processes and solve problems by establishing non-linear relations between multiple inputs and outputs. From our overview of the related literature regarding SST prediction, ANNs can be categorized as classic neural networks, improved neural networks, long short-term memory (LSTM), convolutional neural networks (CNN), and their improved versions, which are developed by combining them with other soft computing approaches.
2.4.1.1 Classic neural networks
Classic neural networks are feed-forward neural networks (FFNNs) (Fig. 5a) and can be categorized in two main types: single-layer perceptron (SLP) and multi-layer perceptron (MLP). The latter is more suitable to solve complex problems, due to its ability to include loops in the computation process. The backpropagation neural network (BPNN) is the most widely used type of MLP algorithm and iterates forward and backward passes to establish relations between inputs and outputs. In general, the relations are described through different activation functions (e.g., hyperbolic tangent sigmoid), and the type of activation function defines the type of ANN algorithm. For instance, the radial basis function (RBF) network is a type of ANN algorithm that adopts different activation functions (thin plate spline, harmonic spline, Gaussian) than MLP.
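The forward and backward passes of a backpropagation-trained MLP can be sketched with plain NumPy as follows. This is a toy illustration under stated assumptions: a single hidden layer with hyperbolic tangent activation, a linear output unit, full-batch gradient descent, and a synthetic sine target standing in for a non-linear input-output relation. The layer size, learning rate, and number of epochs are arbitrary choices, not values from the reviewed studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = sin(x) on [-pi, pi] as a stand-in non-linear relation.
x = np.linspace(-np.pi, np.pi, 64).reshape(-1, 1)
y = np.sin(x)

# One hidden layer with tanh activation and a linear output unit.
n_hidden = 8
W1 = rng.normal(0.0, 0.5, size=(1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
b2 = np.zeros(1)

def forward(inputs):
    """Forward pass: return hidden activations and network output."""
    h = np.tanh(inputs @ W1 + b1)
    return h, h @ W2 + b2

lr = 0.02
losses = []
for epoch in range(2000):
    h, y_hat = forward(x)
    err = y_hat - y
    losses.append(float(np.mean(err ** 2)))
    # Backward pass: propagate the error gradient from output to input layer.
    g_out = 2.0 * err / len(x)
    g_W2 = h.T @ g_out
    g_b2 = g_out.sum(axis=0)
    g_h = (g_out @ W2.T) * (1.0 - h ** 2)  # derivative of tanh
    g_W1 = x.T @ g_h
    g_b1 = g_h.sum(axis=0)
    # Gradient-descent update of all weights and biases.
    W1 -= lr * g_W1
    b1 -= lr * g_b1
    W2 -= lr * g_W2
    b2 -= lr * g_b2
```

The list `losses` records the mean squared error at each epoch, which gives a quick check that the iterations are actually reducing the error.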
2.4.1.2 Improved neural networks
Research on enhancing the capabilities of neural network algorithms is in continuous development. For instance, techniques such as wavelet transforms or Kalman filters are combined with standalone neural network models, which can lead to more accurate results (Grosan and Abraham 2007). In this review, we consider these types of neural networks as a category of their own.
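As an illustration of the kind of preprocessing used in such hybrid models, the sketch below applies one level of a discrete wavelet transform to a short series, separating it into a smooth (approximation) part and a fluctuating (detail) part that could each be passed to a neural network. The Haar wavelet is assumed here only because it is the simplest to write out; the function names `haar_dwt` and `haar_idwt` and the sample values are hypothetical.

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approximation, detail) coefficients; input length must be even.
    """
    s = np.asarray(signal, dtype=float)
    if s.size % 2:
        raise ValueError("signal length must be even")
    even, odd = s[0::2], s[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-frequency (smooth) part
    detail = (even - odd) / np.sqrt(2.0)  # high-frequency part
    return approx, detail

def haar_idwt(approx, detail):
    """Invert haar_dwt, recovering the original signal."""
    out = np.empty(2 * approx.size)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out

# Hypothetical SST samples (degrees C).
sst = np.array([26.0, 26.2, 26.8, 27.0, 27.4, 27.1, 26.5, 26.3])
a, d = haar_dwt(sst)
```

Because the transform is invertible, predictions made separately on the approximation and detail components can be recombined with `haar_idwt` into a forecast on the original scale.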

Long short-term memory (LSTM)
The LSTM is one of the most popular types of recurrent neural network (RNN). RNNs differ from FFNNs: in the latter, the information only passes forward, whereas in the former the output feeds back into the input (Fig. 5b). RNN algorithms also have a different neuron architecture, with self-neuron connections that make RNNs more dynamic than FFNNs. RNNs may suffer from vanishing and exploding gradient issues (Karim and Rivera 1992; Lukoševičius and Jaeger 2009; Sak et al. 2014); to solve them, Hochreiter and Schmidhuber (1997) presented an RNN model called long short-term memory (LSTM), which uses a chain structure for its computations. The cell state is regulated with three different gates (input, output, and forget gates), and the role of these gates is to control the amount of information passed between layers. The LSTM network has been notably applied in speech-to-text transcription, machine translation, process forecasting, and language modeling (Peng et al. 2018; Sherstinsky 2020; Somu et al. 2020). An LSTM algorithm can learn to bridge minimal time lags of more than 1000 discrete time steps. This solution uses constant error carousels (CEC) (Ganesh and Kamarasan 2020), which enforce a constant error flow within specific cells (Staudemeyer and Morris 2019). Unlike a traditional RNN, which calculates only the weighted sum of the input signals and then passes it through an activation function, each LSTM unit maintains a cell memory $C_t$ at time $t$. The output $h_t$, or activation, of the LSTM unit is
$$h_t = \Gamma_o \odot \tanh(C_t)$$
where $\Gamma_o$ is the output gate that controls the amount of memory content that is exposed. The output gate is calculated as
$$\Gamma_o = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right)$$
where $\sigma$ is the sigmoid activation function, $W_o$ is a weight matrix, $b_o$ is a bias vector, and $[h_{t-1}, x_t]$ denotes the concatenation of the previous output and the current input $x_t$.
The cell memory $C_t$ is updated by partially forgetting the current memory and adding new memory content $\tilde{C}_t$:
$$C_t = \Gamma_f \odot C_{t-1} + \Gamma_i \odot \tilde{C}_t$$
where the new memory content is obtained as
$$\tilde{C}_t = \tanh\left(W_c [h_{t-1}, x_t] + b_c\right)$$
The amount of current memory to be forgotten is controlled by the forget gate $\Gamma_f$, expressed by the equation
$$\Gamma_f = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)$$
and the amount of new memory content to be added to the memory cell is controlled by the input gate $\Gamma_i$, expressed by the equation
$$\Gamma_i = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right)$$
(Graves 2013).
2.4.1.5 Convolutional neural networks
The convolutional neural network (CNN) algorithms are one of the best learning options for understanding image content (Zhou 2020) and show good performance in image processing and computer vision (Khan et al. 2020). CNN algorithms reduce the number of learning parameters by exploiting spatial relationships, which improves the training performance (Krizhevsky et al. 2012; Wang et al. 2020a).
In general, a CNN consists of three main types of layer: the convolutional layer, the pooling layer, and the fully connected layer (Shin et al. 2016). The convolutional layer applies a convolution operation on the inputs. A pooling layer is usually placed after a convolutional layer and can be used to reduce the network parameters (Boureau et al. 2010) and the spatial dimension of the feature maps (Singh et al. 2020). Like convolutional layers, pooling layers are stable over translation because they consider neighboring pixels in their calculations. An average pooling layer computes the average value for each neuron cluster in the previous layer, and the fully connected layer connects each neuron in one layer to every neuron in the next layer (Zhou et al. 2018).
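The three layer types can be illustrated on a small gridded field with plain NumPy. This is a sketch under stated assumptions: a single fixed smoothing kernel rather than learned filters, non-overlapping average pooling, no activation functions, and hypothetical fully connected weights. The function names `conv2d_valid` and `avg_pool2d` are introduced for this example only.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """2-D 'valid' cross-correlation, the operation used in CNN conv layers."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def avg_pool2d(fmap, size=2):
    """Non-overlapping average pooling; incomplete edge windows are dropped."""
    h = (fmap.shape[0] // size) * size
    w = (fmap.shape[1] // size) * size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.mean(axis=(1, 3))

field = np.arange(36, dtype=float).reshape(6, 6)  # stand-in for a gridded SST patch
kernel = np.ones((3, 3)) / 9.0                     # simple smoothing filter
fmap = conv2d_valid(field, kernel)                 # convolutional layer, shape (4, 4)
pooled = avg_pool2d(fmap)                          # pooling layer, shape (2, 2)
dense_w = np.full((pooled.size, 1), 0.25)          # hypothetical dense weights
output = pooled.reshape(1, -1) @ dense_w           # fully connected layer, shape (1, 1)
```

Note how the spatial size shrinks from 6 x 6 to 4 x 4 to 2 x 2 before the fully connected layer combines every pooled value into a single output.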
2.4.1.6 Improved convolutional neural networks
As with classic neural network and LSTM algorithms, efforts have been undertaken to modify CNN algorithms and enhance their performance. Improved CNNs are considered as a category of their own in this review.
2.4.1.7 Overview of ANN-based models for SST prediction
The following is a summary of previous investigations that have employed ANN-based models to predict SST. A summary list is also presented in Table 1.
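To close the background on network types, the LSTM gating described earlier in this section (forget, input, and output gates acting on a cell memory) can be sketched as a single forward step in NumPy. The standard formulation without peephole connections is assumed; the parameter names (`W_f`, `b_f`, and so on), the hidden size, the random initialization, and the input values are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step using forget, input, and output gates."""
    z = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    gate_f = sigmoid(params["W_f"] @ z + params["b_f"])   # forget gate
    gate_i = sigmoid(params["W_i"] @ z + params["b_i"])   # input gate
    gate_o = sigmoid(params["W_o"] @ z + params["b_o"])   # output gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate memory
    c_t = gate_f * c_prev + gate_i * c_tilde              # updated cell memory
    h_t = gate_o * np.tanh(c_t)                           # unit activation
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4
params = {}
for name in ("f", "i", "o", "c"):
    params[f"W_{name}"] = rng.normal(0.0, 0.1, size=(n_hidden, n_hidden + n_in))
    params[f"b_{name}"] = np.zeros(n_hidden)

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for sst_value in [26.1, 26.4, 27.0]:  # hypothetical lagged SST inputs
    h, c = lstm_step(np.array([sst_value]), h, c, params)
```

In a practical model the inputs would normally be scaled before being passed to the cell, and the weights would be learned by backpropagation through time rather than drawn at random.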
Pioneering work on this topic has been carried out by Corchado and Fyfe (1999). They compared the capabilities of Finite Impulse Response-Neural Network (FIR-NN), Linear Regression (LR), and ARIMA models for estimating the SST at the Falkland Islands, UK. They used the water temperature at a fixed depth, measured by vessels, as input variable for prediction. Their results showed that the FIR-NN model provided a better performance than the LR and ARIMA models.
Following up the previous study, Corchado et al. (2001) used a different approach and source of data for SST prediction. Specifically, they assessed the performance of an Instance-Based Reasoning-Radial Basis Function (IBR-RBF) model for SST prediction at the Falkland Islands. They used the satellite-based water temperature, provided by the Plymouth Marine Laboratory, as predictive input variable. Their research findings showed that the IBR-RBF model produced the highest prediction performance among several employed predictive models.
Malmgren et al. (2001) applied BPNN, modern analog technique with similarity index (SIMMAX), Revised Analog Method (RAM), Modern Analog Technique (MAT), Imbrie-Kipp Transfer Function (IKTF), and modified Artificial Neural Network (ANND) models to estimate SST in the Caribbean Sea and Atlantic Ocean. A fossil (planktonic) dataset was used as input, and the BPNN model was found to provide the best prediction performance. This study inspired other researchers to adopt similar fossil-based approaches.
Few studies in the literature have used ocean and marine currents to predict SST. One example is the work by Ali et al. (2004), who evaluated the SST in the Arabian Sea using a Multi-Layer Perceptron-Back Propagation (MLP-BP) model, with input variables such as net surface heat flux, net radiation, wind stress, SST at previous times, and dynamic height. They found that MLP-BP offers more accurate estimates than classic regression methods.
A study similar to those by Malmgren et al. (2001) and Peyron and Vernal (2001), forecasting SST from sediment fossil sample (plankton) data, was carried out by Chen et al. (2005), who employed BPNN, IKTF, SIMMAX, RAM, and MAT models for SST prediction in the Western Pacific Ocean. The outcome of their study confirmed the better performance of the BPNN model in estimating SST.
To assess the capability of neural networks, Garcia-Gorriz and Garcia-Sanchez (2007) utilized FFNN to assess SST in the Mediterranean Sea. They used several input parameters associated with climate and marine conditions; their findings reveal that FFNN is a reliable technique in this research area.
Several attempts have been made to find appropriate models to predict SST. To this end, Gupta and Malmgren (2009) applied different models to similar types of input variables. They compared ANN, IKTF, Weighted Averaging Partial Least Squares (WAPLS) regression, MAT, and Maximum Likelihood (ML) models for identifying SST trends in the Pacific and Antarctic Oceans. Surface sediment data were selected as input variable. Again, the ANN model showed the best agreement with the observed field data among all models considered.
Since 2010, a large body of studies has been carried out to compare neural networks' performance with that of statistical models for predicting SST. For instance, Bhaskaran et al. (2010) predicted SST using MLP and LR in the Indian Ocean, using as input variables water depth, longitude, and latitude. Their findings confirmed the higher prediction performance of MLP compared to LR.
One of the open problems has been the selection of the appropriate type(s) of neural network algorithm for SST prediction. In this regard, Mahongo and Deo (2013) set out to identify the best neural network model, by comparing FFNN, RBF, Generalized Regression Neural Network (GRNN), and ARIMAX models for forecasting SST in the western Indian Ocean. They used lagged SST values as input for their predictions, and their results showed the FFNN model to be the superior one.
In line with the previous study, Piotrowski et al. (2015) compared the performance of different soft computing models. They simulated streamwater temperature (not SST) using MLP, ANFIS, K-Nearest Neighbors (KNN), and Wavelet ANN models in two catchments in Poland, using air temperature, river runoff, and declination of the Sun as input variables. They employed a wavelet technique for preprocessing the input data for the neural network model. The Wavelet ANN model provided the best estimates of streamwater temperature. The approach that combines a wavelet technique and neural network models was also adopted by Patil et al. (2016), who estimated SST using a Wavelet ANN model in the Arabian Sea, Bay of Bengal, African Coast, and Indian Ocean, with SST values known at previous times as input variable. Their results showed that the Wavelet ANN model provided more accurate predictions than the standalone ANN model. The impact of SST variation on streamflow is currently an open question, which was investigated in a unique study by Modaresi et al. (2016). They specifically used a GRNN model to forecast the spring streamflow for the Karkheh Basin in Iran, using SST data from the Persian Gulf and the Mediterranean Sea as input, obtaining adequate predictions.
In a study by Liao et al. (2017), a new approach named Reynolds Optimum Interpolation (OI) was developed and applied for the first time in the field of SST prediction, and compared to an ANN RBF model for the case of the Pacific Ocean, using lagged SST values as input. The study confirmed the superior performance of the ANN RBF model. In an attempt to use wavelet models for forecasting SST, Patil and Deo (2017) combined the wavelet approach with an auto-regression model. They analyzed the SST using Wavelet ANN and Wavelet autoregression models for the Indian Ocean, using the SST transformed by wavelet functions as model input. Better results were obtained with the Wavelet ANN model. A similar study evaluating the influence of the wavelet technique was conducted by Patil and Deo (2017), who compared Wavelet ANN and the Regional Ocean Modeling System (ROMS) to estimate SST. They used wavelet transform values of SST as input variable to their model. Also in this case, the Wavelet ANN model provided the better results.
A close look at the literature reveals that deep learning-based models such as LSTM and CNN have attracted progressively more attention in the research community since 2017. In this regard, a large body of investigations has been carried out to employ this type of neural network or to improve existing models. In their pioneering work on this topic, Zhang et al. (2017) examined the LSTM-RNN, MLP, and Support Vector Regression (SVR) models to forecast the SST in the coastal seas of China, observing that the LSTM-RNN model provided the best prediction performance. In line with the previous study, Yang et al. (2017) investigated the capabilities of Combined Fully Connected Convolutional-Long Short-Term Memory-Recurrent Neural Networks (CFCC-LSTM-RNN), Support Vector Machine (SVM), SVR, and Fully Connected-Long Short-Term Memory (FC-LSTM) models to simulate SST in the Bohai Sea on the east coast of China. Spatiotemporal parameters related to SST were selected as inputs for simulation, and their outcomes showed the most accurate predictions to be provided by the CFCC-LSTM-RNN model.
A different direction of investigation was taken by Guo et al. (2017), who assessed, for the first time, the performance of a self-organizing map (SOM) in estimating the SST in the Pacific Ocean. The input variable considered was SST data obtained from different sources, as shown in Table 1. The authors found that SOM provided excellent SST forecast performance. Aparna et al. (2018) studied the capability of an FFNN-Quasi Newton BPNN model to predict the SST in the Northeastern Arabian Sea, using the Sea Surface Temperature Average (SSTA) at previous times as input. They obtained SST predictions in satisfactory agreement with the observed data. In an attempt to find an appropriate approach for estimating SST, Foroozand et al. (2018) compared ANN, Ensemble Entropy (Bagging), Multiple Linear Regression (MLR), and Bayesian Neural Networks as predictive models for SST in the Tropical Pacific Sea, finding all models to provide comparable prediction performance. Foroozand et al.'s (2018) study was the first to use an ensemble model for predicting SST.
Another study evaluating the capabilities of LSTM models for estimating SST was the one by Liu et al. (2018), who applied LSTM, Multi-Layer Perceptron Regression (MLPR), and SVR for modelling the SST in oceans and found LSTM to provide the most accurate estimates among the models considered.
In the first study on SST prediction in the Hawaii region, Nodoushan (2018) estimated SST using FFNN and a Bayesian Network (BN), specifically for Honolulu, Hawaii Coast. The BN model reproduced the observed data better than the FFNN model.
To add to the studies trying to find appropriate prediction models by combining different techniques, Ouala et al. (2018) discussed the application of a Bi-NN-based Kalman filter, an ensemble Kalman filter, and Bi-NN-NNKF-EOF for predicting the SST in South Africa, finding the best prediction performance with Bi-NN-NNKF-EOF. This work shed light on the benefits of combining Kalman filters, ensemble methods, and neural networks. The same approach was undertaken in the Red Sea. Patil and Deo (2018) used an ANN model for forecasting the SST in the Red Sea and Indian Ocean, finding general consistency of the ANN model's results with the field observed data.
Using appropriate input variables for estimating SST is a known challenge in this area. A recent study conducted by Quilodrán Casas (2018) explored the benefits of using new types of input variables for SST prediction. Specifically, he assessed the performance of Dimensional Reduction Analysis Neural Networks (DA-NN) and Ensemble Kalman filter in simulating the SST in the Atlantic Ocean. Sea Surface Height (SSH), SST, and Eastward and Northward horizontal velocities were employed as predictive variables. The DA-NN resulted in the most accurate SST predictions.
To assess and compare the predictive capabilities of neural networks, ensemble, and statistical models, Davies (2018) forecasted SST in the Pacific Ocean using ANN, Bootstrap, and Ordinary Least Squares (OLS) methods, using SST values at previous times as input data. Overall, the Bootstrap model provided the best results among the models considered.
All previous investigations focused on specific regions. Broni-Bedaiko et al. (2019) analyzed the performance of LSTM and Multiple Input-Multiple Output (MIMO) models in predicting the SST, for the first time across the whole world. They found that the LSTM model better reproduces the observed data. In another study, using a similar concept, Wei et al. (2019) applied MLP to simulate the SST in the South China Sea, showing accurate predictions.
In another study focusing on combining different techniques, Wu et al. (2019) compared the performance of Complementary Ensemble Empirical Mode Decomposition-Backpropagation Neural Networks (CEEMD-BPNNs) and Ensemble Empirical Mode Decomposition-Backpropagation Neural Networks (EEMD-BPNN) for forecasting the SST in the northeastern region of the North Pacific Ocean, reporting a better performance for CEEMD-BPNNs.
A further investigation with LSTM models was produced by Xiao et al. (2019b), who compared convolutional LSTM, LSTM, and SVR to estimate the SST in the East China Sea. Their results showed that the convolutional LSTM model provided the best prediction performance among the models considered. Xiao et al. (2019a) combined ensemble approach with LSTM: they simulated SST using LSTM-AdaBoost, SVR, BPNN, and LSTM models and found significant consistency between predicted and observed SST values by using the LSTM-AdaBoost model. An additional study involving LSTM models was carried out by Xie et al. (2019), who employed LSTM, SVR, and GED for SST modeling in the Bohai Sea and South China Sea. Results showed that the LSTM model provided the most accurate SST predictions.
As mentioned earlier, deep learning based models such as "classic" LSTM, CNN and their improved versions have attracted significant attention lately within the research community in this field, with studies comparing classic LSTM and CNN for predicting SST in different regions (Han et al. 2019;Wolff et al. 2020) and other researchers focusing on enhancing the performance of classic CNN and comparing it with other soft computing models (Barth et al. 2020;Saha and Chauhan 2020;Yu et al. 2020b;Zhang et al. 2020b).

Trends in ANN-based model applications for SST prediction
In the last two decades, the use of ANN-based models has significantly advanced the field of SST prediction. A timeline summarizing the various types of models adopted is presented in Fig. 6. The Finite Impulse Response model was the first one used in this area; models such as Backpropagation and Multi-Layer Perceptron then became widely adopted. In the last two years, deep learning-based models such as convolutional neural networks and self-organizing maps, which have excellent visual capabilities, have been used successfully for SST prediction.
Our systematic review revealed that most of the previous studies focused on using classic neural network algorithms such as MLP and RBF. Improved versions of these models, obtained through combination with other approaches such as the wavelet technique, have been progressively attracting attention in this area. Based on the available literature, 20 published papers employed classic types of neural network algorithms for estimating SST, while 9 papers used improved model versions. As discussed earlier, several investigations employed neural network-based deep learning algorithms such as LSTM or CNN (16 studies used standalone or improved versions of LSTM or CNN). Figure 7 summarizes the popularity of each type of ANN-based approach in the last two decades.

ANN-based models for SST prediction compared to other models
The SST prediction performance of ANN-based models has been compared in the literature with that of other methods, such as ARIMA, SVM, and ensemble approaches (e.g., bagging and AdaBoost). The ARIMA model was the first "traditional" model against which the capabilities of ANN-based models were compared. The ARIMA model rests on several assumptions, notably that future values depend linearly on previous observations. In all the reviewed studies, the performance of the ANN-based models was significantly better than that of ARIMA. ANN-based models have also been compared with SVM models, which are suitable for classification and estimation problems. In most of the reviewed studies, especially when deep learning-based models such as LSTM were employed, SVM models showed a lower performance than the ANN-based models.

Other soft computing models for SST prediction
Brief background on the other available soft computing models
There are different types of SC models, other than ANN-based models, such as ANFIS and SVM, for estimating SST or other parameters (Awan and Bae 2016;Sharafati et al. 2020). Fuzzy logic-based models originated from Zadeh (1965), who introduced the fuzzy logic (FL) rules to describe non-linear relations between inputs and outputs. These rules are expressed mathematically through a Fuzzy Inference System (FIS), which includes three main steps: (i) definition of fuzzy If-Then rules, (ii) definition of Membership Functions (MFs), and (iii) tuning of the MF parameters. The fuzzy If-Then rules are expressed using Membership Functions (to map the relations between inputs and outputs) and a set of designed parameters. Jang (1993) presented a new FIS implementing an automatic approach for parameter tuning, named Adaptive Neuro-Fuzzy Inference System (ANFIS): this model applies neural networks for tuning both the designed parameters and the MF parameters. To achieve this aim, ANFIS is linked to several heuristic algorithms such as Genetic Algorithms and Particle Swarm Optimization.
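The three FIS steps described above can be illustrated with a minimal sketch. It assumes a zero-order Sugeno-type system with triangular membership functions and a single input (the previous SST value); the labels, breakpoints, and rule outputs are hypothetical values chosen for illustration and are not taken from any of the reviewed studies. In ANFIS these numbers would be tuned automatically rather than fixed by hand.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical membership functions for one input: previous SST in deg C.
MEMBERSHIP = {
    "cold": (10.0, 15.0, 20.0),
    "mild": (15.0, 20.0, 25.0),
    "warm": (20.0, 25.0, 30.0),
}

# Hypothetical zero-order Sugeno rules:
# IF previous SST is <label> THEN predicted SST = constant.
CONSEQUENT = {"cold": 15.5, "mild": 20.0, "warm": 24.5}


def predict(sst_prev):
    """Fire every rule and return the strength-weighted mean of the rule outputs."""
    strengths = {
        label: triangular(sst_prev, *abc) for label, abc in MEMBERSHIP.items()
    }
    total = sum(strengths.values())
    if total == 0.0:
        raise ValueError("input lies outside the support of all membership functions")
    return sum(strengths[k] * CONSEQUENT[k] for k in strengths) / total
```

An input of 22.5 belongs equally to "mild" and "warm", so the output is the average of the two rule constants. Tuning in the ANFIS sense would adjust the `(a, b, c)` triples and the constants to minimize the prediction error on training data.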
Other SC models, as well as common statistical models, have been applied for SST prediction. Among them are support vector regression (SVR) and support vector machines (SVM): these models describe the relations between variables using different kernel functions such as exponential, rational quadratic, Laplacian, polynomial, Gaussian, and sigmoid.
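For reference, some of the kernel functions named above can be written out directly. The sketch below uses the common textbook forms and treats Gaussian and sigmoid as two separate kernels, which is an assumption; the parameter names `gamma`, `degree`, and `coef0` and their default values are illustrative choices, not settings reported by the reviewed studies.

```python
import math


def _dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2), also known as the RBF kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def laplacian_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||_1), based on the L1 distance."""
    return math.exp(-gamma * sum(abs(a - b) for a, b in zip(x, y)))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(<x, y> + coef0) ** degree."""
    return (_dot(x, y) + coef0) ** degree


def sigmoid_kernel(x, y, gamma=0.5, coef0=0.0):
    """tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * _dot(x, y) + coef0)
```

Each function maps a pair of input vectors (for example, lagged SST values) to a similarity score; an SVR or SVM then builds its prediction from these scores rather than from the raw inputs.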
The Auto Regressive Integrated Moving Average (ARIMA) is another model, based on time series modelling, that has been employed to predict the SST. This model combines autoregressive and moving average terms, applied to a series that has been differenced to make it stationary (Box and Jenkins 1976).
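The structure of such a model can be sketched for the simplest case. The example below assumes an ARIMA(1,1,0) model, that is, one differencing step, one autoregressive term, and no moving average term, and it fits the autoregressive coefficient by ordinary least squares rather than by the maximum likelihood estimation used in standard software. The function names are hypothetical.

```python
def difference(series):
    """First difference: the 'integrated' (d = 1) step of ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(x):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0.0:
        phi = 0.0  # constant differences: the intercept carries the trend
    else:
        cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
        phi = cov / var_prev
    return mean_curr - phi * mean_prev, phi


def forecast_arima_110(series, steps):
    """Forecast by predicting the differences, then summing them back up."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # linear in the previous difference
        level += last_diff               # undo the differencing
        forecasts.append(level)
    return forecasts
```

For a series rising by 0.5 at every step, `forecast_arima_110` continues the same trend. The sketch also makes the linearity assumption explicit: each forecast is a linear function of earlier observations, which is the limitation usually cited when ARIMA is compared with ANN-based models.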

Overview of the other available soft computing models for SST prediction
Below is a summary of previous investigations that have employed SC techniques, other than ANN-based, to predict SST. A summary list is also presented in Table 2.
Regarding fuzzy logic-based models, a pioneering work was conducted by Huang et al. (2007), who assessed the potential of an ANFIS model for prediction of the SST in the Taiwan Sea, using various input variables such as salinity, temperature, time, and the angle and radius that identify the direction and distance to reference points. They obtained predictions in satisfactory agreement with the observed data. In a similar study, Huang et al. (2008b) examined the application of a FIS model to simulate the SST in the Taiwan Sea. Salinity and temperature were used as input parameters, and results showed that the FIS model provided accurate SST predictions. In line with the previous study but with different input variables and case studies, Awan and Bae (2016) employed an ANFIS model to forecast the SST in East Asia (Indian and Pacific Oceans), using values of Standardized Precipitation Index (SPI), SST, and Sea Surface Temperature Anomalies (SSTA) as input data for prediction. Their findings confirmed that ANFIS can provide predictions in significant agreement with the observed field data.
ARIMA models have also been used for SST prediction (Table 2). Shirvani et al. (2015) focused for the first time on the Persian Gulf, using ARIMA and Autoregressive Moving-Average (ARMA) models to forecast SST. The results of the former showed significant agreement with the observed data. In another study, Salles et al. (2016) employed a similar approach to assess SST using ARIMA and Random Walk models in the tropical Atlantic Ocean. They found that the ARIMA model provided sufficiently accurate predictions.
In line with studies focusing on the combination of different techniques, Li et al. (2017) assessed the ability of a Support Vector Machine-Complementary Ensemble Empirical Mode Decomposition (SVM-CEEMD) model for estimating the SST in the northeast Pacific Ocean. This is a seminal work on the combination of SVM with ensemble approaches in this field of research, and the proposed technique returned excellent performance in comparison with classic regression techniques.
In another study, Jiang et al. (2018b) evaluated the performance of SVR and LR for SST prediction in the Canadian Berkley Canyon. Latitude, longitude, and water depth were used as input variables; these variables have seldom been used to assess SST. The results revealed that SVR provided estimates closer to the observed data than LR.
Although the majority of SST prediction studies based on SC techniques have so far adopted neural network algorithms, a few studies employed fuzzy logic-based models and their hybrid versions (for instance ANFIS), or SVM. Figure 8 shows the number of contributions using techniques other than ANN algorithms to predict SST.

Conclusions
In the last two decades, SC models have attracted considerable attention in the SST study field due to their capabilities to solve complex and non-linear problems. More than 50 papers have been reviewed in this study, to assess the trends of SC model application for SST estimation.
The key findings of this review are the following:
i. SC models have been used either to estimate past values of SST (using marine sediment samples) or to predict SST (using data from buoys or satellites).
ii. An increasing trend in utilizing satellite-based information for predicting SST is observed over the last five years, although the measurements obtained from buoys are still the most important data source for SST prediction.
iii. The most widely adopted type of input variable for SST prediction is the SST itself observed at previous times.
iv. The ANN-based models (e.g., MLP) have been widely used to predict SST for the last two decades, with RNNs, especially LSTM, gaining popularity in the last 2 years.
v. Models with high visual capabilities, such as CNN and SOM, are also becoming increasingly adopted. CNN models in particular have shown a better performance than other available numerical models for assessing SST. This technique is also an extremely useful approach for estimating ENSO phenomena, predicting sub-surface temperature, or filling missing Argo data (Ham et al. 2019;Han et al. 2019).
vi. In recent years, the deep learning-based models have gained popularity, although the findings from literature show that classic neural network models such as FFBP or RBF can produce reliable predictions of SST or other marine and climate indices (Ratnam et al. 2020).
vii. To evaluate the performance of the various SC models, different indices were used. Correlation coefficient and root mean square error are the most common metrics adopted (Tables 1 and 2).
viii. The Pacific and Indian oceans are the most common study areas, and the China Sea has been increasingly studied in recent years.
ix. Several studies used SC models alongside numerical methods (e.g., CEEMD) to improve the SST prediction performance.
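The two performance indices named in finding vii can be computed as follows. This is a minimal sketch using the standard definitions of root mean square error and the Pearson correlation coefficient; the function names are hypothetical, and the reviewed studies may have used variants (for example, normalized RMSE).

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least two")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    norm_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    if norm_o == 0.0 or norm_p == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (norm_o * norm_p)
```

The two indices are complementary: a prediction that is offset from the observations by a constant 1 °C has a correlation of 1 but an RMSE of 1 °C, which is one reason studies commonly report both.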
The following are a few considerations about possible future directions in the field:
i. Most of the previous studies on SST prediction have used observed SST values at previous times as input variable for prediction. Use of alternative input variables, such as net surface heat flux, ocean front, and eddy recognition, should be investigated. These variables are essential to describe the thermal interaction between the atmosphere, the ocean, and different water masses, which causes significant uncertainty when assessing SST, especially in the Arctic and Antarctic regions (Ali et al. 2004;Gautam and Panigrahi 2003).
ii. The prediction of SST using SC models could be enhanced in further investigations by involving approaches such as the Gamma Test or Mutual Information Theory to optimize the number of input variables in regions with highly variable conditions. As discussed earlier, for more reliable predictions, it would be useful to consider variables associated with ocean and marine conditions, but predictions could be even more accurate if parameters associated with solar variability and cloudiness were considered in future studies.
iii. Most of the predictive models from previous studies are based on ANN algorithms. The potential of either machine learning (e.g., Decision Trees) or ensemble machine learning (e.g., AdaBoost regression) models should be assessed in future studies. As mentioned earlier, in the last 5 years, deep learning-based neural network models have shown potential due to their visual capabilities and flexibility with large datasets. In particular, LSTM and CNN models provide high-speed calculation and more flexibility for fitting large input datasets to outputs, and require less memory during the prediction process.
iv. Satellite-based information will increasingly be the major source of input data for SST prediction in future studies; bias correction for satellite-based information will be of critical importance.
v. Beyond considering the effects of El Niño and La Niña on SST (Broni-Bedaiko et al. 2019;Foroozand et al. 2018;Li et al. 2017), indices such as the Southern Oscillation Index (SOI) and North Atlantic Oscillation (NAO) should be included in the input variable combinations for SST prediction.
vi. The uncertainty associated with SST prediction due to different factors, such as input data measuring error, data handling, model structure, and combination of input variables for prediction, should be evaluated in future studies.
vii. SC models should be used to address open questions such as the impact of abrupt changes of SST on coral reefs (Wei et al. 2019), and melt ponds and rapid changes in ice thickness in cold regions such as the Arctic and Antarctica (Ressel et al. 2015;Ressel and Singha 2016).
viii. A closer look at the previous studies has revealed that there are a number of regions where models for SST prediction have not been applied yet and would be useful: for instance, the regions affected by the Agulhas current occurring near the southeast coast of Africa, or the regions affected by the Kuroshio-Oyashio extension current along the coast of Japan.
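As an illustration of the input selection idea raised in consideration ii, the sketch below ranks candidate SST lags by their mutual information with the value to be predicted. It assumes a simple equal-width histogram estimator, which is only one of several possible estimators and is not prescribed by the reviewed literature; the function names and the default number of bins are hypothetical.

```python
import math
from collections import Counter


def _bin_indices(values, bins):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of the mutual information I(X; Y) in nats."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    bx, by = _bin_indices(x, bins), _bin_indices(y, bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j])) for (i, j), c in joint.items()
    )


def rank_lags(series, max_lag, bins=4):
    """Return lags 1..max_lag sorted by decreasing mutual information."""
    scores = {
        lag: mutual_information(series[:-lag], series[lag:], bins)
        for lag in range(1, max_lag + 1)
    }
    return sorted(scores, key=scores.get, reverse=True)
```

The same `mutual_information` function could be applied to any candidate input (for example, a climate index) paired with the target SST series. With short records the histogram estimate is biased, so the ranking should be read as indicative only.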