Permeability Estimation for Upper Shale Member in Southern Iraqi Oil Field Using Machine Learning and Hydraulic Flow Units Methods

Abstract


Introduction
Shaly sandstone oil reservoirs are heterogeneous in their rock properties. Understanding the form and spatial distribution of these heterogeneities is fundamental to exploiting such reservoirs successfully. The most crucial rock property in the reservoir is permeability (K). Although this property is essential in reservoir management, no conventional well log measures permeability directly (Zhang et al., 2007). The most reliable K values come from core analysis, which takes time and funds to conduct. That is why more practical and less expensive methods for obtaining permeability are needed.

Permeability is an independent property of a reservoir. Nevertheless, permeability is low where porosity is disconnected (as in shale zones) and high where porosity is connected and effective (as in clean sandstone formations). Several correlations have been developed to estimate it indirectly from other logs. The Timur-Coates (Timur, 1968) and Schlumberger Doll Research (SDR) models can also be used to predict permeability if nuclear magnetic resonance (NMR) records are available. These models were established empirically from the correlation between porosity and permeability and are used in homogeneous sandstone cases. The Pickett plot (Pickett, 1973) can also be used to estimate permeability, but only where grain or pore size varies little. These correlations struggle in heterogeneous reservoirs such as the Upper Shale Member (USM). Accordingly, dividing the reservoir into groups that share the same characteristics and properties helps in calculating permeability, because a regression analysis can be performed for each group separately. The method that gives the best division of the reservoir rock will therefore give the best permeability prediction.

Numerous techniques have been created to identify rock types (RT) in conventional reservoirs. Most of these techniques use cross-plots of permeability and porosity derived from core or log data. Pittman (1992) proposed using r35 cutoffs based on mercury injection capillary pressure (MICP) experiments to identify distinct rock types. Another method for classifying rock types is the hydraulic flow unit (HFU), which is based on core porosity and permeability data. This paper employs two methods: machine learning (ML), applied here as a new technique developed by the author, and the hydraulic flow unit.

For the ML method, the self-organizing map (SOM), an unsupervised machine-learning clustering technique (Honkela and Kohonen; Pöllä et al., 2007), was used to perform core-based rock typing as a first step. A SOM has two operating modes: training and mapping. Training uses a set of input data to create a lower-dimensional representation (the map) of that data, and mapping then uses the created map to categorize further input data. Two ML approaches were taken: a supervised one, in which the classification of the porosity and permeability data was learned from the classification of the special core analysis (SCAL) data, which comprised only 12 points, and an unsupervised one, in which clustering was applied to the core porosity and permeability data without the SCAL learning step.

For the second method (HFU), Amaefule et al. (1993) proposed two established quantities for determining permeability and identifying hydraulic units in uncored wells: the reservoir quality index (RQI) and the flow zone indicator (FZI). FZI is strongly related to pore-throat size, yet it is obtained from porosity and permeability alone. They introduced the hydraulic unit as a unit of reservoir rock that has a unique relationship between porosity and permeability, derived by modifying the Kozeny-Carman equation. They also suggested using FZI cutoffs to categorize the various rock types. Traditional regression methods predict permeability less effectively than HFU, which divides the reservoir into several distinct flow units using the features that control fluid flow in a reservoir, and thus provides more accurate models for the entire reservoir.

After the K equations are derived for each core rock type, a core-to-log correlation is performed for the cored well to train the logs to predict RT, using principal component analysis (PCA) and the decision tree algorithm (Dtree); this builds a training data set that is applied to the other wells, which have logs but no core. The same classification method, Dtree, is used both to build the training data set and to apply it to the uncored, logged wells. The results of the two methods differed; the HFU results were better suited to the core data than the ML results.
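For reference, the RQI and FZI quantities mentioned above are usually written as follows. This is the commonly quoted form of the Amaefule et al. (1993) definitions, not a reproduction of this study's Table 1; it assumes K in millidarcies and effective porosity as a fraction, which gives RQI and FZI in micrometres.

```latex
\begin{align}
\mathrm{RQI} &= 0.0314\,\sqrt{\frac{K}{\phi_e}} \\
\phi_z &= \frac{\phi_e}{1-\phi_e} \\
\mathrm{FZI} &= \frac{\mathrm{RQI}}{\phi_z} \\
\log \mathrm{RQI} &= \log \phi_z + \log \mathrm{FZI}
\end{align}
```

On a log-log plot of RQI against the normalized porosity, samples with the same FZI fall on a unit-slope line, which is why FZI cutoffs can separate hydraulic units.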

Data Preparation
The database, including the core and log data for the wells, was examined to ensure that it covered the relevant interval of the Upper Shale Member. The core data were evaluated thoroughly, geologically and petrographically, to confirm the core-to-log depth shifts. A review of the log data showed that the log responses from different runs were consistent, except for differences that could be explained by changes in lithology or fluids. Furthermore, the logging data for all wells should be checked for corrections and depth matching, especially where the borehole is unstable; many washouts occurred in the USM in this study and affected the log responses, particularly the density, sonic, and neutron logs. On the petrophysical side, shale volume (Vsh) was calculated from both gamma-ray and neutron-density log data. Porosity was calculated from the density log (PHI-density) and the sonic log (PHI-sonic); the two estimates were then averaged and compared with core porosity. This approach gave a good match with the core porosity, which is why it was chosen. A Vsh cutoff of 0.5 v/v was applied: intervals with Vsh above 50% were excluded and not considered reservoir rock or net pay. This value was selected from experience with the production history of the study area, where some subzone intervals can produce oil at that Vsh percentage.
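The log-derived quantities described above can be sketched in a few lines. This is a minimal illustration, not the study's actual workflow: it assumes a linear gamma-ray index for Vsh (the neutron-density Vsh is omitted), textbook sandstone matrix and fluid constants, and the Wyllie time-average equation for sonic porosity. All function names and default values are hypothetical.

```python
def vsh_gamma_ray(gr, gr_clean, gr_shale):
    """Linear gamma-ray index used as shale volume, clipped to [0, 1]."""
    igr = (gr - gr_clean) / (gr_shale - gr_clean)
    return min(1.0, max(0.0, igr))


def phi_density(rhob, rho_matrix=2.65, rho_fluid=1.0):
    """Density porosity (v/v); densities in g/cc (assumed quartz matrix)."""
    return (rho_matrix - rhob) / (rho_matrix - rho_fluid)


def phi_sonic(dt, dt_matrix=55.5, dt_fluid=189.0):
    """Wyllie time-average sonic porosity (v/v); transit times in us/ft."""
    return (dt - dt_matrix) / (dt_fluid - dt_matrix)


def average_porosity(rhob, dt):
    """Arithmetic mean of the density and sonic porosity estimates."""
    return 0.5 * (phi_density(rhob) + phi_sonic(dt))


def is_reservoir(vsh, cutoff=0.5):
    """Keep a sample only if Vsh does not exceed the 0.5 v/v cutoff."""
    return vsh <= cutoff


# Illustrative readings (made up, not data from the study).
vsh = vsh_gamma_ray(gr=60.0, gr_clean=20.0, gr_shale=120.0)
phi = average_porosity(rhob=2.32, dt=82.2)
print(round(vsh, 3), round(phi, 3), is_reservoir(vsh))
```

In practice the averaged porosity would be compared against core porosity depth by depth before being accepted, as the text describes.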

Methodology
The flow chart in Fig. 1 shows the workflow of the two methods for predicting RT and estimating permeability. It starts from the core data, separates into two branches representing the two methods, and ends with a K prediction derived from logs alone in the uncored wells.

The special core analysis (SCAL) data contained only 12 plug tests with capillary pressure (Pc) and injected brine saturation (Sw), in addition to porosity (Ф) and permeability (K). These 12 points were clustered into four petrophysical groups (PG4), as shown in Fig. 2. This was the learning step for the remaining core data, which comprised 94 samples with only K-core and Ф-core. The classification method used is the self-organizing map (SOM). The author proposed four groups (rock types): very clean sand, clean sand, shaly sand, and shale. The reference variable is K-core. The Ward algorithm was used to propagate this model (the groups) to the entire core data set, based on what was learned from PG4. Regression analysis was then applied to each group to obtain its K equation. Fig. 3a depicts the supervised classification of the Ф-core and K-core group distributions, Fig. 3b shows the percentage of each group in the entire sample set as a pie chart, and Fig. 3c gives the regression best-fit statistics for estimating permeability.

After the equations for estimating K from porosity were derived for the cored well and the results were checked against K-core, the next step was to link the conventional logs (density, gamma-ray, neutron, resistivity, and sonic) with the core-derived results for the cored well (training the logs). This training data set was created using the decision tree classification method. The training data set was then used to build a model that was applied to the remaining uncored wells, which have only logs, again using the decision tree (Dtree) technique.

The unsupervised method is identical to the supervised one; the only significant change is that, instead of learning from the 12 SCAL test points, it uses only K-core and Ф-core for the 94 samples. The SOM was used for the clustering process. The R-squared value in the statistics tables indicates how strongly the dependent variable, permeability, is related to the independent variable, porosity: an R-squared of 1, or 100%, means that permeability is completely explained by porosity, whereas a low R-squared, 60% or less, indicates that permeability does not generally follow porosity. The group distributions, pie chart, and statistics table for predicting permeability are presented in Fig. 4a-c.
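The per-group regression and its R-squared can be sketched as follows. This is a minimal illustration under stated assumptions: the form log10(K) = a + b·Ф is assumed here because the text does not give the exact form of the fitted equations, the group labels are taken as already assigned by the clustering step, and the sample values are made up.

```python
import math
from collections import defaultdict


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


def fit_per_group(phi, k, group):
    """Fit log10(k) = a + b*phi separately for each rock-type label."""
    buckets = defaultdict(lambda: ([], []))
    for p, kk, g in zip(phi, k, group):
        buckets[g][0].append(p)
        buckets[g][1].append(math.log10(kk))
    return {g: fit_line(xs, ys) for g, (xs, ys) in buckets.items()}


def predict_k(phi, coeffs):
    """Apply one group's fitted equation to a porosity value."""
    a, b, _ = coeffs
    return 10 ** (a + b * phi)


# Made-up illustrative samples (not data from the study).
phi = [0.10, 0.15, 0.20, 0.08, 0.12, 0.16]
k = [10.0, 100.0, 1000.0, 0.1, 1.0, 10.0]
group = ["RT1", "RT1", "RT1", "RT3", "RT3", "RT3"]
models = fit_per_group(phi, k, group)
for name, (a, b, r2) in sorted(models.items()):
    print(name, round(a, 3), round(b, 3), round(r2, 3))
```

Fitting each group separately is what allows a heterogeneous reservoir to be described by several simple equations instead of one poorly fitting one; a group whose R-squared falls at or below about 0.6 would be flagged as not following porosity.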

Hydraulic Flow Unit (HFU)
This section aims to identify the flow zone indicator (FZI) in uncored wells from log records and then estimate K. For the cored well, four RTs were identified from the four distinct line slopes on the FZI cumulative distribution function (CDF, or probability plot), and an FZI constant was determined for each rock type (RT). Cross-plots of core permeability versus core porosity were then used to define a K/Φ relationship for each reservoir rock type (Fig. 5). The HFU method is described in detail in several reference papers (Alobaidi, 2016; Abdulmajeed et al., 2022; Baker et al., 2013; Alsinbili et al., 2013; Al-Qattan & Al Mohammed, 2017; Muzaal et al., 2022; Salman et al., 2022). Table 1 gives the RQI regression equation and the permeability regression equations for the hydraulic flow units. It is important to define permeability as accurately as possible, since the K/Φ relationship affects irreducible water saturation (Swirr) versus depth and thus the reservoir model and its predictions. A correlation between the rock types and the log curves must be established to calibrate the log-derived rock types and produce a continuous rock-type curve through all zones. Many correlation options exist; decision trees were used for data classification in this study. The method generates a porosity-to-permeability transform for each RT, which is applied to the log-derived porosity. Permeability is then computed from the rock types using the porosity-permeability relationships, and the core and predicted permeabilities are compared to finalize the result. The cored well thus serves as the training data set for predicting RT and K in the uncored wells, which have only logs.
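The FZI-based rock typing and the resulting porosity-to-permeability transform can be sketched as follows. This is a minimal illustration, assuming the standard RQI/FZI definitions with K in mD and porosity as a fraction. The cutoff values are hypothetical (the study reads them from slope changes on the FZI probability plot), and back-calculating K from one representative FZI per rock type is one common variant, not necessarily the regression equations of Table 1.

```python
import math
from bisect import bisect_right


def rqi(k_md, phi):
    """Reservoir quality index in micrometres (k in mD, phi as a fraction)."""
    return 0.0314 * math.sqrt(k_md / phi)


def fzi(k_md, phi):
    """Flow zone indicator: RQI divided by normalized porosity phi/(1-phi)."""
    phi_z = phi / (1.0 - phi)
    return rqi(k_md, phi) / phi_z


def rock_type(fzi_value, cutoffs):
    """Map FZI to RT1 (best) .. RTn (poorest) using ascending FZI cutoffs."""
    n_types = len(cutoffs) + 1
    return n_types - bisect_right(cutoffs, fzi_value)


def k_from_fzi(phi, fzi_rt):
    """Invert the RQI/FZI definitions to get permeability in mD."""
    # 1 / 0.0314**2 is approximately 1014, the constant usually quoted.
    return (1.0 / 0.0314 ** 2) * fzi_rt ** 2 * phi ** 3 / (1.0 - phi) ** 2


# Hypothetical cutoffs separating four rock types (not from the study).
CUTOFFS = [0.5, 1.5, 4.0]

sample_fzi = fzi(k_md=100.0, phi=0.2)
print(round(sample_fzi, 3), rock_type(sample_fzi, CUTOFFS))
print(round(k_from_fzi(0.2, sample_fzi), 3))
```

Once a rock type has been predicted from logs in an uncored well, the transform for that rock type is applied to the log-derived porosity to obtain a continuous permeability curve.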

Results
Both approaches estimate RT and K. The ML method is easy and fast. Although the unsupervised classification was completed, it was not satisfactory for predicting permeability: its RTs differed greatly from those of the supervised and HFU methods, and its R² is low for most of the RTs (Fig. 4c). For the supervised method, the permeability estimated from logs agrees closely with the core data in most locations, especially where the borehole is stable and the logs are unaffected. The HFU method performs better than ML and provides permeability results that are better suited to the K-core data. Fig. 6 shows the RT from the core data and the RT from the logs for the same well. Fig. 7 shows the permeability results for the cored well for both approaches. For estimating the permeability of an uncored well, the HFU method is the more precise choice: the core permeability and the permeability estimated with the HFU approach agree with one another to a reasonable degree. Finally, the predicted RT and estimated K from the two models were applied to three uncored wells (Fig. 8).

Fig. 6. Correlation log-core for predicting rock type. The result for the cored well of both approaches: the left side shows the core data rock types, while the right side shows the log RT learned from the core data.

Fig. 7. K-core versus K estimated for both models. The first column shows the porosity from logs, with core porosity as red dots; the shaded area indicates high shale content. The red dots after the first column represent the core permeability, and each curve represents the estimated permeability of each method together with its rock types: FZI, supervised, and unsupervised, respectively.

Discussion
The machine-learning petrophysical groups (PG), both supervised and unsupervised, and the hydraulic flow units (HFU) were used to divide the reservoir into four rock types, ranging from the best-quality rock to non-reservoir rock. RT1 represents the best-quality fluvial sandstone, which has high permeability and is the best zone for oil production. RT2 represents clean sandstone, which has lower permeability than RT1 but still has good rock quality for production. RT3 represents shaly sandstone, a formation that is not clean and has low potential to produce oil. RT4 represents non-reservoir rock, shale, which acts as a cap rock separating the formations from one another.
For the PG, clustering separates the core data into groups using the self-organizing map technique; the resulting groups differ between the supervised and unsupervised approaches. This is probably because the supervised approach first learns from SCAL data such as capillary pressure and brine core injection, which makes its grouping more appropriate than that of the unsupervised approach. Although the training data for the supervised classification comprised only 12 SCAL points, this seems to be enough to make the difference and to give groups that are more compatible with the core data. First-degree (linear) regression equations were used to derive permeability. Increasing the degree of the equation made little difference to R², so first degree was chosen for simplicity.
The correlation between core and log data using principal component analysis (PCA) and the decision tree technique yields compelling rock-typing results. A comparative tool should be used for qualitative variables and verified against core facies (electro-facies) variables. Ancore was the tool used for the comparative analysis (Schlumberger, 2016); its output is a contingency table that shows the percentage of matching between core rock type and log rock type (Pearson, 1904). The table also displays, for varying depths, the sum of sample values of one qualitative variable that corresponds to the sum of sample values of another qualitative variable (C). Cramér's V is a statistic that evaluates the degree of dependence, or correlation, between two nominal categorical variables. Table 2 shows the contingency percentage between the core clustering rock type and the log-modelled rock type of the cored well. For the supervised method, Cramér's V = 0.83 and C = 89.36%, as can be seen in Fig. 6. It works well for both RT and K prediction and behaves almost like the hydraulic flow unit method. For the unsupervised method, Cramér's V is 0.86 and C is 89.94%, but its results differ in character from those of the supervised and HFU methods.
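The comparison statistics discussed above can be reproduced from two label lists. This is a minimal sketch, not the Ancore implementation: Cramér's V is computed from the Pearson chi-squared statistic in the standard way, while the matching percentage is taken here as the share of samples on the main diagonal of the contingency table, which is an assumption about how C is defined. The labels are made up.

```python
import math
from collections import Counter


def contingency(core_rt, log_rt):
    """Count how often each (core label, log label) pair occurs."""
    return Counter(zip(core_rt, log_rt))


def match_percentage(core_rt, log_rt):
    """Share of samples on the main diagonal, in percent."""
    hits = sum(1 for c, l in zip(core_rt, log_rt) if c == l)
    return 100.0 * hits / len(core_rt)


def cramers_v(core_rt, log_rt):
    """Cramér's V; needs at least two distinct labels in each list."""
    n = len(core_rt)
    table = contingency(core_rt, log_rt)
    row_tot = Counter(core_rt)
    col_tot = Counter(log_rt)
    chi2 = 0.0
    for r in row_tot:
        for c in col_tot:
            expected = row_tot[r] * col_tot[c] / n
            chi2 += (table[(r, c)] - expected) ** 2 / expected
    k = min(len(row_tot), len(col_tot))
    return math.sqrt(chi2 / (n * (k - 1)))


# Made-up rock-type labels for a handful of depths (not study data).
core = [1, 1, 1, 2, 2, 3, 3, 4]
logs = [1, 1, 2, 2, 2, 3, 3, 4]
print(round(match_percentage(core, logs), 1), round(cramers_v(core, logs), 3))
```

For a single rock type the same diagonal reading applies: 62 matched samples out of 74 gives 100 × 62 / 74 ≈ 84%.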
The hydraulic flow unit method fits the core data better and is more applicable and reliable; this does not mean that the supervised ML method is inadequate, as it remains representative. HFU gave satisfactory results for both estimates, RT and K (Salman et al., 2023). In both models, ML and HFU, the contingency and Cramér's V are high, which indicates their validity. The main diagonal numbers in Tables 2 and 3 represent how many samples match between the core RT and the log RT for each group. For example, in Table 3, the number of RT1 samples is 74, of which 62 (84%) matched; the other rock types are read in the same way.

Conclusions
• Accurately predicting the rock type is essential for a reliable petrophysical evaluation and for deriving a good permeability model that represents the studied reservoir.
• Logs bridge the gaps between core samples because they record continuously, unlike core samples, where plugs are taken at selected depths.
• Permeability differs significantly between rock types, whereas porosity differs only slightly; permeability therefore distinguishes effectively between the various rock types.
• Two methods were used: machine learning (unsupervised and supervised), considered a new technique, and hydraulic flow units, the classical method for rock typing and permeability estimation.
• The supervised machine-learning method predicts permeability better than the unsupervised method because of the learning step that precedes the classification of porosity and permeability into groups.
• The training data sets from the different approaches and techniques make it safe to draw several useful conclusions from the permeability models, permeability being the most important rock property in the reservoir.
• New techniques such as machine learning worked for estimating rock type and permeability but need to be more precise. They could work more efficiently with more cored wells or other reservoir data, after comparison with core data.
• The output permeability curve trends of the two methods were almost the same when applied to three uncored wells; however, because the results of the second method, the hydraulic flow unit, matched and were compatible with the core result groups, it was concluded to be the better method.

Fig. 1. Flow chart showing the rock-type and permeability estimation workflow for the two methods.

Fig. 2. a) The left plot shows core permeability versus core porosity for the 12 points that have SCAL data; b) the right plot shows capillary pressure versus core saturation for the same 12 points.

Fig. 3. The supervised method: a) distribution of the core permeability and porosity groups; b) the group percentage for all core data; c) statistics of the permeability regression equations.

Fig. 4. The unsupervised method: a) distribution of the core permeability and porosity groups; b) the group percentage for all core data; c) statistics of the permeability regression equations.

Fig. 8. Permeability prediction and rock typing by applying the two approaches to the uncored wells.

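Table 2. a) Contingency table of the supervised petrophysical rock type; b) contingency table of the unsupervised petrophysical rock type.
* The main diagonal numbers in Tables 2 and 3 represent how many samples match between the core RT and the log RT for each group.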