2.1 Overview of Network Traffic Prediction in Cloud Computing
Network traffic prediction remains a major subfield of network management research. As cloud computing has become prevalent, fast and accurate estimates of network traffic have become even more important (Shafiq et al., 2022). Accurate traffic prediction supports intelligent resource allocation, which reduces latency and keeps cloud services efficient (Saeik et al., 2021).
Early studies of network traffic prediction were carried out on traditional computer data networks, where traffic was far more stable and fluctuated less than in contemporary cloud environments (Lohrasbinasab et al., 2022). Most of these studies used time series analysis, autoregressive models, and linear regression to forecast traffic volumes from historical data. The move to cloud computing, however, introduced new issues such as traffic fluctuations, multi-tenancy, and dynamic resource allocation, which earlier models were ill-equipped to handle (Abouelyazid, 2022). In response, recent research has sought more sophisticated models that reflect the characteristics of cloud networks. These include machine learning techniques, stochastic models, and hybrid models that combine more than one method to improve accuracy (Li et al., 2024). Despite these advances, most of these models still fall short in scalability, flexibility, and computational performance, especially in dynamic cloud environments (Balarezo et al., 2022).
2.2 Time Series Models for Network Traffic Prediction
Time series (TS) analysis is one of the most widely used techniques for network traffic prediction, in both traditional and cloud networks (Tuli et al., 2020). Autoregressive integrated moving average (ARIMA) models, seasonal ARIMA (SARIMA) models, and exponential smoothing (ETS) models are used frequently because traffic data exhibits temporal dependency. These models are generally applied when traffic volume follows a cyclic pattern, such as a daily or weekly cycle (Shafiq et al., 2022).
In cloud computing environments, time series models have been used to predict traffic over horizons ranging from seconds and minutes to hours and longer (Hsieh et al., 2020). For instance, ARIMA can be used to predict short-term traffic patterns in virtualized networks, which experience sudden shifts in user behaviour (Gill et al., 2020). Likewise, SARIMA has been used to forecast fluctuations caused by periodic events or user behaviour, capturing the periodicity of cloud traffic (Lohrasbinasab et al., 2022).
Yet classic time series models can be problematic in cloud networks. One issue is prediction inaccuracy, caused by the highly dynamic traffic patterns of cloud environments (Saxena & Singh, 2022). In addition, time series models may need a long data history to build a prediction model, and that history is often unavailable for new, fast-growing cloud networks (Tuli et al., 2020). These drawbacks have motivated the search for alternatives, such as machine learning and hybrid models, to improve prediction accuracy in the cloud (Aburukba et al., 2021).
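As a minimal illustration of the autoregressive idea behind ARIMA-type models, the sketch below fits a first-order autoregressive model, AR(1), by ordinary least squares and rolls it forward. It is not a full ARIMA or SARIMA implementation: differencing, moving-average terms, and seasonality are omitted, and the function names and sample values are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] (assumed AR(1) form)."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var                      # slope: dependence on the previous value
    c = mean_next - phi * mean_prev      # intercept
    return c, phi


def forecast_ar1(series, steps):
    """Forecast `steps` values ahead by feeding each prediction back in."""
    c, phi = fit_ar1(series)
    last = series[-1]
    forecasts = []
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


# Hypothetical traffic volumes (e.g. requests per minute), illustrative only.
traffic = [100.0, 60.0, 40.0, 30.0, 25.0, 22.5]
next_values = forecast_ar1(traffic, steps=3)
```

The fit needs at least three observations with some variation; in practice a library implementation would also handle model order selection and diagnostics.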
2.3 Machine Learning Approaches to Network Traffic Prediction
Network traffic prediction using machine learning (ML) techniques has attracted considerable attention in recent years because of their ability to learn highly complex patterns from data (Aburukba et al., 2021). Compared with conventional statistical models, ML approaches can identify non-linear relationships and interactions between variables, which matters because cloud networks are very dynamic and can be highly heterogeneous (Alghamdi, 2022). Three models are most commonly used for traffic prediction: Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), and Random Forests (Omer et al., 2021). ANNs are especially popular for network traffic prediction because of their versatility and capacity to model complex behaviour (Shafiq et al., 2022). ANNs have been applied to traffic forecasting, traffic anomaly detection, and traffic type classification in cloud settings (Alghamdi, 2022). For instance, enhanced ANN architectures such as Long Short-Term Memory (LSTM) networks have been employed to predict traffic data with temporal dependencies, giving improved predictions in such cases (Tuli et al., 2020).
Support Vector Machines (SVMs) are another method for prediction tasks and can be applied to both classification and regression problems (Tuli et al., 2020). Cuong et al. used SVMs to anticipate the traffic load in cloud data centres as a means of improving cloud service provision (Saeik et al., 2021). The main strengths of SVMs are their capacity to handle high-dimensional data and their good performance when training sets are limited (Lohrasbinasab et al., 2022). Random Forests and other ensemble learning methods have also been applied to network traffic prediction (Alghamdi, 2022). These methods combine many decision trees to increase prediction accuracy and avoid overfitting, which is valuable because cloud traffic can be very unpredictable (Gill et al., 2020).
Although machine learning is promising, these models are not without issues. A major concern is that building them requires large labelled datasets for the initial training stage (Omer et al., 2021). On most cloud platforms, gathering and annotating a sufficient amount of data is time-consuming and expensive (Abouelyazid, 2022). Moreover, some ML models take a long time to train and are too heavy to deploy for real-time prediction, where prompt decisions and actions are required (Tuli et al., 2020).
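Before any of these learners can be trained, the traffic series is usually recast as a supervised learning problem. The sketch below shows one common way to do this, a sliding window of lagged values. It is an assumption about the data preparation, not a step taken from the cited studies, and the function name and sample values are hypothetical.

```python
def make_lag_dataset(series, n_lags):
    """Turn a traffic series into (features, targets) pairs.

    Each feature row holds the previous `n_lags` observations and the
    target is the observation that follows them.
    """
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append(series[t - n_lags:t])
        targets.append(series[t])
    return features, targets


# Hypothetical traffic volumes, illustrative only.
traffic = [1, 2, 3, 4, 5]
X, y = make_lag_dataset(traffic, n_lags=2)
# X and y can then be passed to any regressor (ANN, SVM, random forest).
```

The window length `n_lags` is a tuning choice: a longer window exposes more history to the model but leaves fewer training rows.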
2.4 Stochastic Models in Network Traffic Prediction
Stochastic models describe traffic in probabilistic terms, which distinguishes them from deterministic time series and machine learning models (Saxena & Singh, 2022). Such models are particularly useful for cloud computing architectures, where traffic rates are unpredictable and irregular (Shafiq et al., 2022). Because they incorporate the probabilistic nature of network traffic, stochastic models provide confidence bounds in addition to point estimates, which makes it possible to measure the spread of the predictions (Balarezo et al., 2022).
Markov models are among the most common stochastic techniques used in network traffic modelling (Lohrasbinasab et al., 2022). These models assume that the state of the network traffic at the next moment depends only on the current state and is not influenced by the sequence of events that led up to it. This "memoryless" property simplifies the modelling and makes Markov models well suited to cloud environments, where traffic may vary rapidly (Hsieh et al., 2020). For instance, Markov models have been employed to forecast the traffic load state in cloud networks in order to plan for periods of high traffic intensity (Li et al., 2024).
Queuing theory is another stochastic approach that has been used in network traffic prediction, especially for managing traffic to cloud resources (Balarezo et al., 2022). Queuing models can estimate traffic arrival rates and waiting times for service, which supports system scheduling and the anticipation of bottlenecks in the network (Shafiq et al., 2022). These models are useful when network traffic is variable, because they capture the queuing behaviour that causes congestion (Abouelyazid, 2022).
However, like other stochastic models, the geometric Brownian motion model has certain limitations (Alghamdi, 2022). Such models tend to be less accurate when their assumptions about traffic patterns do not hold (Shafiq et al., 2022). In cloud environments, traffic patterns are dynamic and may depend on several factors that violate those assumptions (Saxena & Singh, 2022). Stochastic models can also be computationally demanding, especially when applied to large-scale cloud networks with many traffic flows and services.
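The memoryless assumption can be made concrete with a small sketch. The code below discretizes traffic volumes into load states, estimates a first-order transition matrix from observed state changes, and returns the most likely next state. The thresholds, state labels, and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def discretize(volume, thresholds):
    """Map a traffic volume to a state index.

    The state is the number of thresholds the volume meets or exceeds,
    e.g. thresholds [100, 200] give states 0 (low), 1 (medium), 2 (high).
    """
    return sum(volume >= th for th in thresholds)


def estimate_transitions(states):
    """Estimate P(next state | current state) from an observed state sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    probs = {}
    for current, row in counts.items():
        total = sum(row.values())
        probs[current] = {nxt: n / total for nxt, n in row.items()}
    return probs


def most_likely_next(probs, current):
    """Return the next state with the highest estimated probability."""
    row = probs[current]
    return max(row, key=row.get)


# Hypothetical traffic volumes, illustrative only.
volumes = [50, 150, 80, 120, 130, 250, 60, 110]
states = [discretize(v, [100, 200]) for v in volumes]
probs = estimate_transitions(states)
predicted = most_likely_next(probs, states[-1])
```

Only the current state enters the prediction, which is exactly the first-order Markov assumption; states never observed as a starting point have no row in `probs`.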
2.5 Hybrid Models and Advanced Techniques
Hybrid models, which combine elements of time series analysis, machine learning, and stochastic techniques, have emerged as a more suitable approach to predicting network traffic in the cloud (Abouelyazid, 2022). These models are designed to exploit the advantages of each technique while minimizing its shortcomings (Hsieh et al., 2020). For instance, short-term dynamics may be captured using time series analysis, while the more complex, non-linear relationships may be analysed using machine learning algorithms (Tuli et al., 2020).
Hybrid ARIMA-ANN models are a typical example of this approach, combining ARIMA and ANN models (Li et al., 2024). The linear predictions of ARIMA are enriched with the non-linear pattern-learning capability of ANNs (Lohrasbinasab et al., 2022). Another hybrid for network traffic prediction is ARIMA-SVM, where the ARIMA model handles the linear components of the traffic pattern and the SVM model handles the non-linear components (Aburukba et al., 2021).
Another hybrid combines a stochastic model with an Artificial Neural Network (ANN), which allows the uncertainty of cloud traffic to be integrated into the ANN architecture (Alghamdi, 2022). Stochastic models such as Markov or Poisson models have also been merged with deep learning techniques, such as LSTM, to capture the temporal dynamics of cloud traffic and make accurate predictions (Saeik et al., 2021). However, these hybrid models come with challenges of their own. They are computationally intensive and require an understanding of several methodologies, which makes them harder to implement than single-method models (Shafiq et al., 2022).
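The split between linear and non-linear components can be sketched as residual modelling: a linear model is fitted first, and a second model learns whatever structure remains in its errors. The sketch below is an assumption about how such a hybrid might be wired together, not the method of the cited studies. An AR(1) model stands in for ARIMA and a k-nearest-neighbour regressor stands in for the ANN or SVM, and all names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] (linear stand-in for ARIMA)."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    return mean_next - phi * mean_prev, phi


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to `query`
    (non-linear stand-in for the ANN or SVM component)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def hybrid_forecast(series, k=3):
    """One-step forecast = linear forecast + predicted residual."""
    c, phi = fit_ar1(series)
    # In-sample linear predictions for series[1:], and their errors.
    linear_fit = [c + phi * x for x in series[:-1]]
    residuals = [actual - pred for actual, pred in zip(series[1:], linear_fit)]
    linear_next = c + phi * series[-1]
    # Residual model: predict the next residual from the previous one.
    residual_next = knn_predict(residuals[:-1], residuals[1:], residuals[-1], k)
    return linear_next + residual_next


# Hypothetical traffic volumes, illustrative only.
forecast = hybrid_forecast([100.0, 60.0, 40.0, 30.0, 25.0, 22.5])
```

If the linear model already explains the series, the residuals are close to zero and the hybrid forecast reduces to the linear one; the second stage only contributes when the errors show a repeatable pattern.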
Cloud Manufacturing Platform and Task Scheduling
In cloud manufacturing, a central cloud platform acts as the intermediary between service consumers and service providers. Consumers submit tasks, while providers supply virtualized capabilities. The platform breaks large tasks into smaller sub-tasks, assigns them to suitable resources, and then plans the schedule for executing these sub-tasks.
Mathematical Model for Task Scheduling
To optimize resource allocation and task execution, a mathematical model is proposed. The model considers the following factors:
- Objective functions: Time (T), Cost (C), Quality (Q), and Utilization (U).
- Decision variables: x_ij, indicating whether task i is assigned to resource j.
- Constraints: Time, cost, quality, and utilization limits.
Mathematical Formulation
Minimize: ∑(a1 * T + a2 * C - a3 * Q - a4 * U) * x_ij
Subject to:
a1, a2, a3, a4 ≥ 0
T ≤ T_max
C ≤ C_max
Q ≥ Q_min
U ≥ U_min
x_ij ∈ {0, 1}
where:
- a1, a2, a3, and a4 are weights representing the relative importance of each objective function.
- T_max, C_max, Q_min, and U_min are predefined thresholds for time, cost, quality, and utilization.
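The formulation can be illustrated with a small sketch under simplifying assumptions that are not stated in the model itself: T, C, Q, and U are taken to be known for every task-resource pair, each task is assigned to exactly one resource, the thresholds apply to each assignment individually, and resources have no shared capacity limit. Under those assumptions the weighted sum is minimized by choosing, for each task, the feasible resource with the lowest score. All names and values are hypothetical.

```python
def weighted_score(option, weights):
    """a1*T + a2*C - a3*Q - a4*U for one task-resource pair."""
    a1, a2, a3, a4 = weights
    return a1 * option["T"] + a2 * option["C"] - a3 * option["Q"] - a4 * option["U"]


def is_feasible(option, limits):
    """Check the time, cost, quality, and utilization thresholds."""
    return (option["T"] <= limits["T_max"] and option["C"] <= limits["C_max"]
            and option["Q"] >= limits["Q_min"] and option["U"] >= limits["U_min"])


def assign_tasks(options, weights, limits):
    """Return {task: chosen resource, or None if no resource is feasible}.

    options[task][resource] is a dict with keys "T", "C", "Q", "U".
    Choosing a resource for a task corresponds to setting x_ij = 1.
    """
    assignment = {}
    for task, per_resource in options.items():
        feasible = {r: o for r, o in per_resource.items() if is_feasible(o, limits)}
        if feasible:
            assignment[task] = min(
                feasible, key=lambda r: weighted_score(feasible[r], weights))
        else:
            assignment[task] = None
    return assignment


# Hypothetical values, illustrative only.
options = {
    "t1": {"r1": {"T": 2, "C": 5, "Q": 0.9, "U": 0.5},
           "r2": {"T": 1, "C": 9, "Q": 0.8, "U": 0.6}},
    "t2": {"r1": {"T": 20, "C": 3, "Q": 0.9, "U": 0.5}},
}
limits = {"T_max": 10, "C_max": 10, "Q_min": 0.7, "U_min": 0.4}
result = assign_tasks(options, (1, 1, 1, 1), limits)
```

Once resources have capacity limits or tasks have precedence constraints, the per-task choice is no longer sufficient and an integer programming solver or a scheduling heuristic would be needed.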
Quality of Service (QoS) and Resource Allocation
The results also reveal that the level of QoS is affected by the number of resources in the cloud manufacturing system. Increasing the number of resources improves QoS but also increases costs. To balance these two effects, the heavy traffic limit approach is suggested. It makes it possible to determine the number of machines required for a given QoS level.
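The trade-off between resource count and QoS can be illustrated with a queuing sketch. The code below does not implement the heavy traffic limit approach; it uses the standard Erlang C formula for an M/M/c queue instead, and it assumes Poisson task arrivals, exponential service times, and QoS measured as the probability that an arriving task has to wait. The function names and rates are hypothetical.

```python
def erlang_c(offered_load, servers):
    """Probability that an arriving task must wait in an M/M/c queue.

    offered_load is the arrival rate divided by the per-machine service rate.
    """
    if offered_load >= servers:
        return 1.0  # unstable system: the queue grows without bound
    # Erlang B blocking probability, built up recursively from 0 servers.
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    utilization = offered_load / servers
    return blocking / (1.0 - utilization * (1.0 - blocking))


def min_servers(arrival_rate, service_rate, max_wait_prob):
    """Smallest number of machines whose waiting probability meets the target."""
    if max_wait_prob <= 0:
        raise ValueError("max_wait_prob must be positive")
    offered_load = arrival_rate / service_rate
    servers = int(offered_load) + 1  # smallest count that keeps the queue stable
    while erlang_c(offered_load, servers) > max_wait_prob:
        servers += 1
    return servers


# Hypothetical rates: 1 task per minute arriving, 2 tasks per minute per machine.
machines = min_servers(arrival_rate=1.0, service_rate=2.0, max_wait_prob=0.2)
```

Tightening the waiting-probability target raises the machine count, which makes the cost side of the trade-off explicit.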
Visual Representation
[Figure: Cloud platform architecture, showing task submission, decomposition, resource discovery, service composition, and scheduling]
[Figure: Diagram illustrating the objective functions, decision variables, and constraints of the mathematical model]
Note: The specific values for a1, a2, a3, a4, T_max, C_max, Q_min, and U_min would depend on the particular cloud manufacturing environment and its requirements.