Estimation of Shear Wave Velocity for Shallow Depth Using Artificial Neural Network Technique: A Case Study in Rumaila oil field

Abstract

In the Rumaila oil field, lost circulation is a challenging problem. The geological and geomechanical properties of the subsurface formations contribute to circulation losses. One of the natural features associated with mud loss in these formations is an unconformity surface. Inferring the unconformity surface requires an understanding of how the mechanical properties of the rocks are distributed across the entire reservoir, so that loss zones can be identified and most loss problems addressed. Using well logs, a geomechanical model can be constructed to identify the unconformity surface. Shear wave velocity (Vs) is the most crucial factor in determining mechanical properties, but it is not frequently recorded during well logging in order to save time and cost. To overcome this challenge, an artificial neural network (ANN) model was developed to estimate the missing Vs data for the Hasa and Aruma groups (extending from the top of Dammam to the bottom of Hartha) in wells of interest from the south and north domes of the Rumaila oil field. The performance of the new model was tested through calibration. The outcomes showed that measured depth (MD), bulk density (RHOB), and compressional velocity (Vp) are the key parameters for creating the ANN model. This study shows that the resulting equations can accurately predict shear velocity (Vs) from conventional well logs. The correlation coefficient (R²) and the root mean square error were 0.956 and 0.118, respectively. The optimum number of hidden neurons was three. The presented model closely resembled the measured Vs data when a dataset from another well was used to check the accuracy of the predictive model. This study presents an effective, simple, and cost-effective technique that can be used in the absence of rock tests and shear sonic (DTs) logs.

Introduction
Lost circulation is perhaps the most expensive mud-related drilling problem. Large amounts of expensive drilling fluid are lost to the formation, in addition to the wasted rig time. Loss of circulation may also cause serious well control problems. Lost circulation has severely hampered the efficient development of oil wells in the Rumaila oil field.
According to the well design in the Rumaila oil field, losses occur in the 12.25″ section, which is frequently exposed to loss problems, especially in the Dammam and Hartha formations. Fig. 1 shows the percentage of non-productive time (NPT) in the Rumaila oil field during the last ten years. More than 30% and 23% of the NPT was observed in the Dammam and Hartha formations, respectively, because of partial and complete losses. The severity of the lost circulation problem can be linked to the geological characteristics of the rocks, such as depositional history, sedimentation environment, and post-depositional mechanical, chemical, physical, hydrological, and thermal weathering, as well as the diagenetic effects and tectonic activity of the relevant regions. An unconformity is by nature the result of the combined and progressive action of all these factors, and it leads to lost circulation, as shown in Fig. 2 (Amanullah et al., 2017). Unconformities can be identified as gaps in the geologic record that may represent episodes of crustal deformation, erosion, or changes in sea level. The depth of an unconformity can contribute to the loss of circulation during drilling operations: an unconformity section located at a shallower depth has a greater chance of triggering circulation loss than one located at a greater depth, owing to the reduced consolidation of shallower formations compared with deeper ones. The main losses in the Rumaila oil field are in the Dammam and Hartha formations, located in the 12 ¼″ section. At the top of these formations a surface unconformity exists, which is related to the loss of circulation in the Rumaila oil field (Arshad et al., 2015). The best way to improve the prevention and remediation of lost circulation is to characterize the weak layers.
It is crucial to establish and understand the mechanical rock properties of the weak formations by constructing a geomechanical model from well logging and seismic data (Akhundi et al., 2014). Detecting a surface unconformity requires a full understanding of the mechanical rock properties, which in turn allows lost circulation zones to be determined and standards for evaluating lost circulation risk to be established. Determining the lost circulation zones can significantly improve the guidance for designing the well trajectory and well construction in the Rumaila oil field (Wenjun et al., 2021). The most important variables for determining the mechanical characteristics of rocks are the compressional and shear wave velocities (Hadi & Nygaard, 2018). However, well log data such as shear and compressional sonic logs are often absent or insufficient in a field of interest because of the high cost and the time taken to acquire them, especially at shallow depths (Abdul Majeed & Alhaleem, 2020).
Many correlations and models have been presented in the literature to predict shear wave velocities from different parameters (Bingham, 1965; Rehm & McClendon, 1971; Zamora & Lord, 1974; Eaton, 1975). These models have their limitations; for example, some of them can only be applied to clean shales (Ahmed et al., 2019). Additionally, some of these models are not applicable in unloading formations because they are based on empirical relations and constants. The result is uncertainty in characterizing the reservoir, which could have an impact on drilling operations and oil recovery factors (Jubair & Hadi, 2021).
To overcome these difficulties and complete the missing Vs data, artificial neural networks (ANNs) are one of the most powerful techniques for supplying the missing information as useful input for geomechanical modeling (Jubair & Hadi, 2021). ANNs are effective tools with nonlinear approximation capabilities that are employed in various computing and engineering fields. They are computational systems that emulate the computing capabilities of biological systems using either hardware or software (Maren, 1990). Previous research demonstrates that reservoir modeling, the study of formation properties, and the generation of synthetic well logs have all been performed using neural networks (Rolon et al., 2005; Rolon et al., 2009). Rogers et al. (1992) created a computer program that uses a back-propagation neural network to identify lithologies from well logs automatically. Additionally, ANNs have been used to produce synthetic geomechanical well logs, such as shear wave logs (DTs or Vs), from conventional well logs (Eshkalak et al., 2013; Mohaghegh et al., 2017). All of the studies mentioned above share one objective: to increase reservoir productivity by using statistical and artificial intelligence methods to make up for missing data or important parameters and to better understand formation characteristics or mechanical rock properties.
This study aims to develop an ANN model to estimate the missing shear wave data (Vs or DTs) for the Hasa and Aruma groups in the Rumaila oil field. The lithological composition of the target layers varies between limestone, dolomite, and very little anhydrite, so the model can be applied to all layers of interest, since carbonate is the predominant composition. The ANN model was developed by processing datasets from well NR-1 (located in the north dome of the Rumaila oil field) through training and validation. The resulting model was then tested and calibrated using datasets from well SR-1 (located in the south dome of the Rumaila oil field).

Area of Study
The super-giant Rumaila oil field is situated in southern Iraq, about 20 miles (32 km) from the Kuwaiti border. It is made up of two anticlines, the north dome and the south dome. The Rumaila reservoirs include layered sandstone and carbonate (the Zubair and Mishrif formations) that extend up to 4 km and date back to the Cretaceous age. The field is estimated to contain 12% of Iraq's oil reserves. It is the world's third-biggest producing field and delivers approximately one-third of Iraq's total oil supply. The giant onshore field, which has been producing since 1954, still has an estimated 17 billion barrels of recoverable oil reserves (Rumaila iq, 2018).

Data Analysis for Constructing ANN Model
Before developing any predictive model, the data must be analyzed to determine the influence of the input parameters on the output function (here, shear wave velocity). A new ANN model is developed to forecast shear wave velocity from well log data of the top and intermediate sections of the Rumaila oil field (extending from the top of Dammam to the bottom of Hartha). Fig. 3 shows histograms with a statistical analysis of the log dataset: measured depth (MD) in m, compressional wave velocity (Vp) in km/s, bulk density (RHOB) in g/cc, and shear wave velocity (Vs) in km/s. The raw datasets consist of 13771 data points each. Vp varies between 2.23 and 7.04 km/s, while Vs ranges between 1.7 and 3.6 km/s. RHOB lies between 1.96 and 2.95 g/cc, while NPHI, GR, and depth range over 0.0071-0.8733, 9.5-120 API, and 577-1954 m, respectively. To establish which input parameters have the greatest impact on the output function, shear wave velocity was plotted against all candidate inputs, namely compressional wave velocity, bulk density, neutron porosity, gamma ray, and measured depth, as shown in Fig. 4. The results indicated that Vp, RHOB, and NPHI have a stronger effect on the shear velocity trend line than MD and GR. Because NPHI data are lacking in most of the target wells, and because depth has a greater effect on Vs prediction than GR, gamma ray and neutron porosity data are not employed in this investigation.
Fig. 4. Analysis of shear wave velocities using well log measurements of compressional wave velocity, bulk density, neutron porosity, gamma ray, and formation depth.
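As an illustration of this kind of input screening, the following minimal sketch ranks candidate logs by the absolute value of their Pearson correlation with Vs. It is only one possible way to quantify what the cross-plots show; the analysis above is based on the plotted trend lines, not on this calculation. The function names and the sample values are hypothetical placeholders, not data from the Rumaila wells.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_inputs(logs, target):
    """Rank candidate logs by |r| against the target log, strongest first."""
    scores = {name: pearson_r(values, target) for name, values in logs.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Illustrative placeholder values only, NOT measurements from the study wells.
example_logs = {
    "Vp": [3.0, 3.5, 4.0, 4.5, 5.0],
    "RHOB": [2.20, 2.35, 2.40, 2.55, 2.60],
    "GR": [40.0, 22.0, 35.0, 18.0, 30.0],
}
example_vs = [1.8, 2.0, 2.3, 2.5, 2.8]
ranking = rank_inputs(example_logs, example_vs)
```

A linear correlation coefficient captures only linear trends, so a log with a strongly nonlinear relation to Vs could be ranked lower than its cross-plot suggests.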

Development of ANN Model for Predicting Shear Wave Velocity
The JMP Pro-2016 program was used to develop a new ANN model for estimating the shear wave velocity in the shallow and intermediate sections of the Rumaila oil field. An ANN is able to mimic a system and provide an output function. A supervised network was used in this study; such a network infers a function for mapping new examples from input-output data patterns, or training examples (Philip, 2001).
Training, validation, and testing are the three general processes used in developing ANN models.

Training
The training phase is the first step in adjusting the network. The training process can be summarized as follows. The output that the network generates from the input data is compared with the desired (true) output. If there is a discrepancy, the connection weights between the input, hidden, and output layers are adjusted until the ANN outputs are very close to the expected true values (i.e., the error between the target and predicted values, ideally zero, is minimized). There are two types of training. In supervised training, both the inputs and the actual outputs are given to the network. Supervised machine learning is so called because at least part of the method requires human supervision, and it is typically the more prevalent type in the oil and gas business. In contrast, unsupervised training, or unsupervised machine learning, is a more hands-off approach: only input values are provided, and the ANN modifies its own weights so that the network produces similar outputs for similar inputs (Mohaghegh, 2000). The error back-propagation learning algorithm, also known as the backpropagation neural network (BPNN) algorithm, was employed in this study. It is a common technique for training neural networks that calculates the gradient of a loss function with respect to each weight in the network. The key characteristics of a BPNN are simple construction, great plasticity, powerful nonlinear approximation, good adaptivity, large-scale parallel processing, self-learning, and fault tolerance (Karlik, 2014). Fig. 5 depicts the three-layer architecture of a BPNN for shear wave prediction. An artificial neural network (ANN) is mainly composed of input, hidden, and output layers.
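The weight-update loop described above can be sketched as follows. This is a minimal illustration, not the JMP Pro implementation used in the study: it assumes a single hidden layer of tanh neurons, a linear output neuron, a squared-error loss, and plain stochastic gradient descent. The learning rate, epoch count, initial weight range, and toy data are all hypothetical choices made for the example.

```python
import math
import random

def forward(x, w1, b1, w2, b2):
    """Return the hidden activations and the network output for one sample."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, output

def train(samples, targets, n_hidden=3, lr=0.05, epochs=500, seed=0):
    """Fit the network by stochastic gradient descent with backpropagation."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []  # mean squared error accumulated during each epoch
    for _ in range(epochs):
        sq_err = 0.0
        for x, t in zip(samples, targets):
            hidden, y = forward(x, w1, b1, w2, b2)
            err = y - t  # derivative of 0.5 * (y - t)**2 with respect to y
            sq_err += err * err
            for j in range(n_hidden):
                # Error propagated back to hidden neuron j, using the
                # output weight before it is updated and the tanh
                # derivative 1 - h**2.
                delta_j = err * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * err * hidden[j]
                for k in range(n_in):
                    w1[j][k] -= lr * delta_j * x[k]
                b1[j] -= lr * delta_j
            b2 -= lr * err
        history.append(sq_err / len(samples))
    return (w1, b1, w2, b2), history

# Toy data with three inputs in [-1, 1]; NOT well log measurements.
toy_rng = random.Random(1)
toy_x = [[toy_rng.uniform(-1, 1) for _ in range(3)] for _ in range(40)]
toy_t = [0.5 * a - 0.3 * b + 0.2 * c for a, b, c in toy_x]
params, history = train(toy_x, toy_t)
```

The `history` list plays the role of the training error curve: each entry is the mean squared error over one pass through the training samples.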
The numbers of input and output layers are generally set by the problem of interest, while the number of hidden layers is determined by trial and error or from experience (Hu & Song, 2014). In this study, one input layer (composed of MD, Vp, and RHOB), one hidden layer with three neurons, and one output layer (composed of Vs) were used to develop the Vs model. Each layer consists of simple interconnected processing neurons. Each link carries a specific weight by which the data traveling over that link are multiplied. As shown in Fig. 6, a bias or threshold value is also added to the weighted sum in order to improve the convergence of the network. The resulting value is known as the net value. The activation function receives the net value as its input and processes it, and the neuron's output from this function serves as the input for the neurons in the next layer. For both the hidden and output layers, the tangent sigmoid function, one of the available activation functions, is the one most frequently used to obtain neuron outputs, as shown in Fig. 7 (Philip, 2001; Zoveidavianpoor et al., 2013). It processes input and output values between -1 and 1. Since raw datasets are sometimes too small or too large to be used directly, the data should be scaled using a normalization formula (Eq. 1) (Saeedi et al., 2017). Finally, the network randomly divided the input data into 70% for training and 30% for validation.
where Xnormalize is the normalized value of the input parameter, X is the input parameter, and Xmax and Xmin are the maximum and minimum values of the input parameter, respectively.
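The scaling and random partitioning steps can be sketched as follows. The sketch assumes that Eq. 1 takes the common min-max form that maps each log onto the range -1 to 1, i.e. Xnormalize = 2 (X - Xmin) / (Xmax - Xmin) - 1, which matches the stated range of the tangent sigmoid function; if a 0 to 1 scaling is intended instead, the `lo` and `hi` arguments can be changed. The function names and the fixed random seed are hypothetical.

```python
import random

def normalize(values, lo=-1.0, hi=1.0):
    """Min-max scale a sequence to [lo, hi] (assumed form of Eq. 1)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("cannot scale a constant log")
    return [lo + (hi - lo) * (x - x_min) / span for x in values]

def split_indices(n, train_fraction=0.7, seed=0):
    """Randomly partition range(n) into training and validation index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * n)
    return indices[:cut], indices[cut:]

# The end points are the reported Vp range; the midpoint is illustrative.
scaled_vp = normalize([2.23, 4.635, 7.04])

# 70/30 partition of the 13771 data points mentioned in the data analysis.
train_idx, valid_idx = split_indices(13771)
```

With the same Xmin and Xmax, the inverse mapping X = Xmin + (Xnormalize + 1) (Xmax - Xmin) / 2 returns a normalized prediction to physical units.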

Validation
The validation process is the first step in testing the performance of the trained network. The network is tested on input-output pairs that were not used during the training stage. If it performs well during this test, it is regarded as a suitable network for use with new data (Cranganu et al., 2015). The training error curve is then displayed along with the validation curve against the number of epochs, as seen in Fig. 8. Up to the point of convergence, the validation curve should exhibit the same downward trend as the training error curve. After convergence, the training error curve continues to drop, indicating too much training, while the validation and testing error curves grow (over-fitting). Therefore, the validation process also serves as a stopping criterion for the training process. This prevents overfitting, which occurs when the network tends to memorize unimportant details of the training data, reducing its capacity for prediction (Stone, 1974; Smith, 1993).
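Using the validation curve as a stopping criterion can be illustrated with a patience-based rule, sketched below. The patience value and the example error curve are hypothetical; the exact stopping rule applied by the software used in this study is not stated here.

```python
def best_epoch(validation_errors, patience=5):
    """Return the index of the epoch with the lowest validation error.

    The scan stops once the validation error has failed to improve for
    `patience` consecutive epochs, so later epochs are never inspected.
    """
    best_idx = 0
    best_err = float("inf")
    for epoch, err in enumerate(validation_errors):
        if err < best_err:
            best_idx, best_err = epoch, err
        elif epoch - best_idx >= patience:
            break
    return best_idx

# Hypothetical validation curve: it falls, then rises as over-fitting sets in.
curve = [0.40, 0.25, 0.18, 0.15, 0.14, 0.16, 0.17, 0.19, 0.22, 0.26, 0.10]
stop_at = best_epoch(curve, patience=5)
```

In this example the scan stops at the tenth entry, so the final value of 0.10 is never reached and the weights from epoch index 4 would be retained.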

Testing
The final stage of the ANN application is the testing process, in which the trained ANN is fed new input data (with unknown outputs) in order to obtain the network outputs. Given the trained network's successful performance in the validation stage, the outputs acquired in the testing stage are regarded as reliable and are presumed to be as close to the real values as possible.

Performance Criteria
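The network's performance is analyzed using several statistical parameters; however, the mean square error (MSE) and the squared Pearson correlation coefficient (R²), taken together, may provide the best indication of the network's performance. They are formulated in Eqs. 2 and 3, respectively.

\[ \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{2} \]

\[ R^2 = \left[\frac{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)}{\sigma_y\,\sigma_{\hat{y}}}\right]^2 \tag{3} \]

where $n$ is the number of data points in the training, validation, or testing subset; $y_i$ and $\hat{y}_i$ are the actual and forecast values at step $i$, respectively; $\bar{y}$ and $\bar{\hat{y}}$ are the means of the actual and forecast values; and $\sigma_y$ and $\sigma_{\hat{y}}$ are the corresponding standard deviations (computed with the same $1/n$ normalization).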
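For reference, both criteria can be computed directly from paired actual and predicted values. The sketch below uses the standard definitions of the mean square error and the squared Pearson correlation coefficient; the function names are illustrative.

```python
def mse(actual, predicted):
    """Mean square error between paired actual and predicted values."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n

def r_squared(actual, predicted):
    """Squared Pearson correlation coefficient of the paired values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # The 1/n factors of the covariance and of the two standard
    # deviations cancel, so they are omitted here.
    return cov * cov / (var_a * var_p)
```

Note that the squared Pearson coefficient measures how well the predictions track the trend of the actual values, whereas the MSE also penalizes any systematic offset, which is why the two are reported together.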

Results and Discussion
The dataset shown in Fig. 4 was chosen for generating the ANN model. The network's performance was evaluated using the R² and MSE values obtained during the training and validation stages. When the actual shear wave velocity (target) values were compared with the shear wave velocity (output) values predicted by the ANN, the R² values were 0.956 and 0.954 and the mean square errors (MSE) were 0.118 and 0.121 for the training and validation datasets, respectively, as shown in Fig. 9.

Fig. 9. Cross-plot of the predicted and actual Vs datasets for training and validation.
These results revealed that the ANN technique can be used to accurately determine missing shear wave velocity values from conventional well log data. Fig. 10 shows the cross-plot of the predicted and residual Vs datasets for training and validation. Table 1 and Eq. 4 display the Vs model that was created using the ANN. As a result, the weights and biases of the neural network (Table 2), combined with conventional well logs including the compressional wave velocity (Vp), bulk density (RHOB), and formation depth (MD), can be used to estimate the Vs values of the target formations. In Eq. 4, $N$ is the number of neurons, $V_s$ is the shear wave velocity (km/s), $V_p$ is the compressional velocity (km/s), RHOB is the bulk density (g/cc), MD is the formation measured depth (m), $i$ is the index for neurons, $w_{1i}$ is the weight between the input layer and the hidden layer for neuron $i$, $w_{2i}$ is the weight between the hidden layer and the output layer for neuron $i$, $b_1$ is the bias between the input layer and the hidden layer of the neural network, and $b_2$ is the bias between the hidden layer and the output layer of the neural network. This study also used the datasets from a single well (SR-1) to check the performance of the new shear wave velocity model; Fig. 11 shows the cross plot of the predicted and actual Vs datasets in SR-1. The results revealed that the ANN model was successfully verified for predicting Vs from conventional well log data, as presented in Figs. 12 and 13.
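To show how a set of weights and biases is turned into a Vs estimate, the sketch below evaluates a network with one hidden layer of tanh neurons and a linear output neuron. This structure is an assumption made for illustration; if the output neuron also applies the tangent sigmoid function, the final sum would additionally be passed through tanh. The numerical weights and biases are placeholders chosen only to show the expected shapes. They are NOT the values of Table 2, and the inputs are assumed to be already normalized.

```python
import math

def predict_vs(vp, rhob, md, w1, b1, w2, b2):
    """Evaluate a one-hidden-layer tanh network for a single depth sample.

    Inputs are assumed to be normalized already; the returned value is on
    the same normalized scale as the training target.
    """
    inputs = (vp, rhob, md)
    total = b2
    for weights_i, bias_i, out_weight_i in zip(w1, b1, w2):
        # Net value of hidden neuron i: weighted sum of the inputs plus bias.
        net = sum(w * x for w, x in zip(weights_i, inputs)) + bias_i
        total += out_weight_i * math.tanh(net)
    return total

# Placeholder parameters for N = 3 hidden neurons (hypothetical values).
W1 = [[0.8, 0.1, -0.2], [-0.3, 0.5, 0.4], [0.2, -0.6, 0.1]]
B1 = [0.0, 0.1, -0.1]
W2 = [0.7, -0.2, 0.3]
B2 = 0.05
vs_normalized = predict_vs(0.2, -0.1, 0.5, W1, B1, W2, B2)
```

Because each tanh term is bounded by 1 in magnitude, the output of such a network can never differ from the output bias by more than the sum of the absolute output weights, which is one reason the target log is scaled before training.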
Table 3 shows a statistical comparison between the predicted and actual Vs datasets. These results demonstrate the validity of the model and the possibility of applying it to calculate shear wave velocity, especially in the intermediate section of offset fields or of fields with lithology similar to that of the target field. In most fields, the shear wave velocity of such a section is not measured because it is not a pay zone; however, it is important for dealing with the drilling problems related to this section, especially lost circulation.
Fig. 11. Cross plot of predicted and actual Vs datasets in SR-1 well.

Conclusions
This study uses ANNs to estimate missing shear wave data, which may subsequently be used to generate synthetic geomechanical well logs in the Rumaila oil field. A BPNN is used to estimate the shear wave velocity in the Hasa and Aruma groups (extending from the top of the Dammam to the bottom of the Hartha) in the field of interest. The accuracy of the generated model was assessed using two performance criteria, R² and MSE. The results show that R² was very high (0.956) while the MSE was very low (0.118). This result demonstrates that shear wave velocity can be estimated efficiently using the weights and biases of an ANN. Although the method described in this study predicts Vs reasonably close to the actual values, no such method can provide a flawless assessment of Vs. This can be attributed to the fact that rocks are naturally heterogeneous, which may affect their elastic moduli and results in a broad velocity range that may exist even at uniform porosity. This work offers an accurate and useful approach for predicting Vs from conventional well logs, which may be used in the absence of rock tests and shear log readings. The results demonstrate the validity of the model and its ability to calculate shear wave velocity, particularly in the intermediate section of offset fields or in lithologies similar to those of the target field. In most fields, the shear wave velocity of such a section is not measured because it is not a pay zone. However, this study shows its significance in dealing with drilling problems in the intermediate section, particularly lost circulation.