Saturday, August 30, 2014

Visualizing Clusters of Earthquakes and Wells in Oklahoma

Clusters of Quakes and Wells 

I decided to look at the spatial correlations of clusters of quake and well data in Oklahoma after the previous blogs showing a strong time-series correlation. The main conclusion of this analysis is that every major cluster of earthquakes is associated with a cluster of injection wells. 


Data cleaning

I took the same data sets as in the previous blogs. I updated the cleaning of well data to include only wells with a MaxPress listed above 1000 (presumably psi) and those listed as "active". The result is a list of 3200 active high-pressure wells.

Establishing Clusters

A key element of cluster analysis is the choice of the distance function. I played around with various functions and ended up using the geographical distance squared. If you imagine that planar diffusion spreads as a circle this makes sense. 

To speed up the calculation I wrote my own Mathematica function for the squared geographic distance between points:

(* squared distance in degrees^2; 0.663 rescales the difference in the first coordinate *)
mapsqDistance[x_List, y_List] :=
 0.663*(x[[1]] - y[[1]])^2 + (x[[2]] - y[[2]])^2

It accounts for the latitude of Oklahoma (about 35.5 degrees). Results were accurate to 0.25% between individual points about 70 miles apart.
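As a rough check of where the 0.663 comes from, here is a minimal sketch. It assumes the points are stored as {longitude, latitude} in degrees (the factor sits on the first coordinate), that one degree of latitude is roughly 69 miles, and it uses two made-up test points rather than any of the real data.

```mathematica
(* Squared distance from the post; points assumed to be {longitude, latitude} in degrees *)
mapsqDistance[x_List, y_List] :=
 0.663*(x[[1]] - y[[1]])^2 + (x[[2]] - y[[2]])^2

(* A degree of longitude shrinks by Cos[latitude], so the squared factor is about 0.663 *)
Cos[35.5 Degree]^2

(* Hypothetical test points, not taken from the data sets *)
p1 = {-97.5, 35.5};
p2 = {-96.5, 36.0};

(* Approximate miles, assuming about 69 miles per degree of latitude *)
approxMiles = 69.0*Sqrt[mapsqDistance[p1, p2]]

(* Built-in reference value; GeoPosition expects {latitude, longitude} *)
exactMiles = QuantityMagnitude[
  GeoDistance[GeoPosition[Reverse[p1]], GeoPosition[Reverse[p2]]], "Miles"]

(* Relative difference between the two estimates *)
Abs[approxMiles - exactMiles]/exactMiles
```

The flat approximation only holds over short distances near this latitude, which is all the clustering needs.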

Mapping the results

I experimented with different numbers of clusters of wells and quakes and generally found about 7-8 was the right range. With fewer, major groups were missed. With many more, there was a lot of overlap between clusters.

A result of a well cluster run is shown below for NCluster = 7. I did several of these runs with the data points in different random orders. The clusters were very stable, with only minor differences between runs.
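The post does not say which clustering routine was used, so the following is only a sketch under the assumption that the built-in FindClusters was called with the custom distance function. The data are synthetic blobs, and the names wells and nCluster are placeholders.

```mathematica
mapsqDistance[x_List, y_List] :=
 0.663*(x[[1]] - y[[1]])^2 + (x[[2]] - y[[2]])^2

(* Synthetic stand-in data: two blobs of {longitude, latitude} points *)
SeedRandom[1];
blob[center_, n_] :=
  Table[center + RandomVariate[NormalDistribution[0, 0.2], 2], {n}];
wells = Join[blob[{-97.5, 35.5}, 50], blob[{-95.9, 36.2}, 30]];

nCluster = 2;

(* Shuffle the points before each run to see how stable the clusters are *)
clusters = FindClusters[RandomSample[wells], nCluster,
   DistanceFunction -> mapsqDistance];

Length /@ clusters   (* number of points per cluster *)
Median /@ clusters   (* coordinate-wise median of each cluster *)
```

Re-running the FindClusters line a few times and comparing the medians is one simple way to judge stability against the order of the data.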

Clusters of Injection Wells in Oklahoma


The locations of individual wells are shown, with points color-coded by cluster. The large circles are centered at the median of the cluster data points, and the radius of each circle is representative of the number of points in the cluster and their distance apart.

Radius = 0.004*Sqrt[(WCluster[[i]] // Length)/Norm[StandardDeviation[WCluster[[i]]]]]

The square root makes the area of the circle proportional to the number of points divided by the spread. For a given number of points, a large standard deviation (a less tightly clustered group) is indicated by a smaller circle.

Here are the clusters for the earthquakes with NCluster = 8. The tight clusters of earthquakes in the Tulsa area are evident as are less concentrated clusters in the South and Southeast of the State (as revealed by the small circles).



Clusters of Earthquakes in Oklahoma


While the graphs are visually appealing, the standard deviation is confounded with the number of points in the cluster. A cluster with a large number of points spread over a large area would be hard to distinguish from a small, tight cluster.
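The confounding is easy to see with a little arithmetic. This sketch uses the same radius formula, written as a function of a point count n and a spread s; the two (n, s) pairs are made up for illustration.

```mathematica
(* Radius formula from above, as a function of point count n and spread s *)
circleRadius[n_, s_] := 0.004*Sqrt[n/s]

(* A large, spread-out cluster and a small, tight one (hypothetical numbers) *)
circleRadius[400, 2/5]    (* about 0.126 *)
circleRadius[100, 1/10]   (* about 0.126 *)

circleRadius[400, 2/5] == circleRadius[100, 1/10]   (* True *)
```

Both clusters have n/s = 1000, so they are drawn with identical circles even though one has four times as many points.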

Going 3D

Making the circles into cylinders solves the problem of confounding. In the plot below the radius of the cylinder is proportional to the square root of the standard deviation (so the cross-sectional area of the cylinder reflects the spatial spread of the points). I have made the volume of the cylinder proportional to the number of points in the cluster (this means the height is proportional to the number of elements divided by the standard deviation).
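Here is a minimal sketch of how such a cylinder could be built. The helper name clusterCylinder, the scale factor k, and the synthetic cluster are all assumptions for illustration, not the code behind the plot.

```mathematica
(* Build a cylinder for one cluster of {longitude, latitude} points.
   k is a hypothetical scale factor for the radius. *)
clusterCylinder[c_List, k_: 1] := Module[{s, r, h, m},
  s = Norm[StandardDeviation[c]];   (* spread of the cluster *)
  r = k*Sqrt[s];                    (* radius ~ Sqrt[spread] *)
  h = Length[c]/(Pi*r^2);           (* height chosen so that volume = point count *)
  m = Median[c];                    (* center of the base *)
  Cylinder[{Append[m, 0], Append[m, h]}, r]]

(* Synthetic cluster, not real data *)
SeedRandom[3];
quakes = Table[{-97.0, 35.8} + RandomVariate[NormalDistribution[0, 0.2], 2], {60}];

cyl = clusterCylinder[quakes];

(* BoxRatios evens out the very different horizontal and vertical scales *)
Graphics3D[{Red, cyl}, BoxRatios -> {1, 1, 1}, Axes -> True]
```

Since the volume is Pi r^2 h, the height works out to the point count divided by Pi k^2 times the spread, which is the "number of elements divided by the standard deviation" relation up to a constant.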

Oklahoma Earthquake and Injection Well Clusters. Well clusters are green and earthquake clusters are red. Cylinder volume is proportional to the number of elements in the cluster. The tallest cylinder, cut off for clarity, is about double the height it appears in the graph.

This analysis shows quite plainly the correlation between wells and earthquakes. While the overlaps are not perfect, one can make an intuitive connection between well clusters and earthquake clusters in an almost 1:1 fashion. Every major cluster of earthquakes is closely associated with a cluster of wells. 

The converse isn't true; the well cluster in the panhandle has no earthquake cluster associated with it. However, I discovered quite by accident (I forgot to turn on the spatial filtering in the earthquake data cleaning at one point) that there is an earthquake cluster in the Texas panhandle just to the south of it. I don't have time this morning, but I'll post that analysis later on.

It's interesting that the overall numbers of wells don't correlate well to the number of earthquakes. This might be related to further details (like magnitude of the quake or injection volume) that I haven't yet explored. 



Friday, August 15, 2014

Stepping on the accelerator - Oklahoma earthquakes and injection wells

Introduction

In my last blog I showed some quick data exploration of available earthquake and injection well data, showing both a strong temporal and a strong spatial correlation between earthquakes and wells.
Of course, strictly speaking, correlation by itself does not prove causation, but at the same time it doesn't rule it out, either. If I step on the gas pedal in my car, it accelerates.

That last analogy got me thinking. If I were going to "gather evidence" that pushing on the accelerator makes my car go, what experiment would I run?

The Simple Experiment 

The simplest experiment might be to push on the gas pedal, measure speed, and then push down again, and see if the speed increases. Without really understanding how my engine works, I would gather evidence (one way or the other) about the relationship between the two. And let's face it, many people out there have little to no idea how an internal combustion engine works, let alone look under the hood of their car, yet through repeated "evidence gathering" are pretty confident their gas pedal makes their car go.

The Data

Data sources were discussed in the previous blog. Oklahoma well data are sourced from the Oklahoma Corporation Commission. Earthquake data are reported by the USGS.

Stepping on the gas

In this situation the gas pedal is assumed to be the mechanical integrity testing (MIT) of wells. Of course, the mechanical link is more complex. Actual well use, or perhaps ultimate waste disposal, may be correlated. But the idea is exactly like the car example above. Push on the gas, see how fast the car goes. Push down more on the gas, and see if the car goes faster...



This graph plots the number of earthquakes and well MIT dates in each calendar year since 1970. There are two steps up in well activity, and in each case it appears to be accompanied by a rise in earthquake activity. Note that the correlation is not perfect. In the years between 1994 and 2006 a constant level of well activity was accompanied by a decrease in earthquake activity. In 2001, for instance, no earthquakes were measured.
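Counting events per calendar year can be sketched as follows. The dates are invented {year, month, day} lists standing in for the USGS quake dates and the Last_MIT_Date values, and countByYear is a hypothetical helper.

```mathematica
(* Hypothetical event dates as {year, month, day}, NOT the real data *)
quakeDates = {{1975, 3, 1}, {1975, 9, 12}, {2010, 1, 5}, {2011, 11, 6}, {2011, 11, 8}};
mitDates   = {{1974, 6, 1}, {2008, 2, 3}, {2009, 7, 9}, {2009, 8, 1}, {2010, 4, 4}};

years = Range[1970, 2014];

(* Number of events falling in each calendar year of the range *)
countByYear[dates_List, range_List] :=
  With[{ys = First /@ dates}, Count[ys, #] & /@ range];

quakeCounts = countByYear[quakeDates, years];
mitCounts   = countByYear[mitDates, years];

ListLinePlot[
 {Transpose[{years, quakeCounts}], Transpose[{years, mitCounts}]},
 PlotLegends -> {"Earthquakes", "Well MIT dates"},
 AxesLabel -> {"Year", "Count"}]
```

Years with no events come out as zeros rather than being dropped, which keeps gaps visible in the plot.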

The correlation

When we step on the gas in a car, there's a bit of a lag while "stuff happens" and then the car moves forward. Looking at the above graph, it appears there's about a two-year lag between the steps in drilling activity and the apparent steps in earthquake activity. The analysis below looks at the frequency of earthquakes in year n as a function of the drilling activity in year n-2. (I also looked at other variations, and this one seemed to offer the cleanest relationship.)






The fit works out to about 0.15 quakes per well (the 90% confidence interval is 0.13 to 0.17 quakes per well, and the R-squared value of the fit is 0.90). The earthquake and drilling activity since 2007 obviously has the biggest influence on the fit.
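A lagged fit of this kind can be sketched with LinearModelFit. The yearly counts below are invented, and the model includes an intercept by default; whether the original fit was forced through the origin is not stated, so treat this as one possible setup rather than the exact analysis.

```mathematica
(* Hypothetical yearly counts for ten consecutive years, NOT the real data *)
wellsPerYear  = {20, 22, 21, 60, 65, 70, 150, 160, 170, 180};
quakesPerYear = {3, 2, 4, 3, 4, 9, 10, 11, 22, 25};

lag = 2;

(* Pair the wells in year n - lag with the quakes in year n *)
pairs = Transpose[{Drop[wellsPerYear, -lag], Drop[quakesPerYear, lag]}];

fit = LinearModelFit[pairs, w, w, ConfidenceLevel -> 0.90];

fit["BestFitParameters"]             (* {intercept, slope in quakes per well} *)
fit["ParameterConfidenceIntervals"]  (* 90% intervals for the same two parameters *)
fit["RSquared"]
```

Changing lag to 0, 1 or 3 and comparing the R-squared values is a quick way to try the other variations mentioned above.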

Conclusion

If we step on the gas pedal in our car, the car moves forward. Although there are many steps in between stepping on the accelerator and the car moving (mechanical linkages, computer analysis, fuel flow modulation, spark plug ignition, etc.), the correlation of one action to the other leaves little doubt about causality.

Similarly, this data strongly suggests that putting wells into production ultimately is correlated to earthquakes. While the mechanics between the two are not explained by this data, our confidence in the linkage between one and the other is certainly increased by this data.

It looks very much like the gas pedal might be making the car go....

Next steps

There are lots of opportunities to dig in deeper.

As a next step I want to look at the "swarms" that are apparent in the geographical data as shown in the previous blog. Perhaps by narrowing the geographical regions of analysis below the state level to even a few km from the sites of earthquakes (as suggested might be the case here) one might find much stronger correlations.

It'd be great to get the actual flow data, but so far I haven't been able to find it online. That data might shed light on the gap in the earthquake data in the late '90s, one way or the other. It may also explain why the two-year lag exists. Is that a meaningful timeframe relevant to real drilling operations? It may even suggest ways to mitigate consequences.

If someone knows where or how to get that data, please make a comment or send a link.



Tuesday, August 5, 2014

Oklahoma Earthquakes and their correlation to Injection Wells

Correlating Oklahoma Earthquakes and Injection Wells


I got interested in the correlation of earthquakes to injection wells when, after hearing a report on the radio, I went to a favorite data source, Wolfram Alpha, and entered "Oklahoma Earthquakes 2004 to 2014." The sharp increase in the frequency of earthquakes in recent years was striking. Below is my version of the data.


Hypothesis: Earthquake locations and timing are correlated to injection wells.

Conclusions: The data below will show qualitatively that the correlations appear to be quite strong.
(Of course correlation does not necessarily prove causation).

Background Info

A collection of background and reference material I've found.
Here is an informative presentation on Injection Wells by an engineer at the Chesapeake Corp.
Real time tracking from Tulsa World.
Instructive video on fracking technology.
Here's a good video of a scientist at the USGS talking about her study.

Data sources

Well data

For injection wells apparently the Oklahoma Corporation Commission is in charge of permitting. Fracking is excluded from regulation as part of the 2005 Energy Policy Act.

I got the data from a Google Fusion Table available here.
Below is a heat map of the OK injection well data.



Earthquake data

Earthquake data is available from the USGS here.
To retrieve data I used the coordinates North 37, South 33.6, East 265.6, West 257. Later in my analysis I restricted data points to be within Oklahoma State borders.
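The East and West values look like longitudes measured in degrees east on a 0 to 360 scale; that reading is an assumption. Under it, a sketch of the conversion to signed longitudes and a simple bounding-box filter might look like this, with made-up test points.

```mathematica
(* Convert a 0..360 longitude to the signed -180..180 convention *)
toSignedLongitude[lon_] := Mod[lon, 360, -180]

{west, east} = toSignedLongitude /@ {257, 265.6}   (* about {-103, -94.4} *)
{south, north} = {33.6, 37};

(* True when a {longitude, latitude} point lies inside the search box *)
inBoxQ[{lon_, lat_}] := west <= lon <= east && south <= lat <= north

(* Hypothetical points: central Oklahoma, the Texas panhandle, and one too far east *)
points = {{-97.5, 35.5}, {-101.8, 35.2}, {-90.0, 35.0}};

Select[points, inBoxQ]   (* keeps the first two points *)
```

The second point passes the box test even though it is in Texas, which is why a separate filter on the state borders is still needed.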

I searched back to 1970 and generally used data for magnitude 2 or greater. This data is plotted in the intro.

Correlation of Quakes and Wells

Time Correlation

The simplest data exploration is to look at cumulative counts of earthquakes and drilling permits.
As a proxy for the well date, I use the Last_MIT_Date for the well from the data, assuming the MIT (Mechanical Integrity Test) is done prior to but close to actual use.
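Cumulative counts are a running sum of the per-year counts. A minimal sketch with invented numbers:

```mathematica
(* Hypothetical per-year counts, NOT the real data *)
years = Range[2009, 2013];
quakesPerYear = {1, 3, 2, 10, 25};

cumulativeQuakes = Accumulate[quakesPerYear]   (* {1, 4, 6, 16, 41} *)

(* Scale to a 0..1 range so series of very different sizes can share one plot *)
normalized = N[cumulativeQuakes/Last[cumulativeQuakes]];

ListLinePlot[Transpose[{years, normalized}],
 AxesLabel -> {"Year", "Fraction of total"}]
```

Normalizing each series by its final value is one way to overlay quakes and well dates when their totals differ by orders of magnitude.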


The correlation is unarguably strong.

Geographical Correlation

The next thing to look at is whether the locations of earthquakes correlate to the location of heavy drilling activity.


Indeed there seems to be a high degree of correlation here as well. The heaviest concentrations of drilling activity are correlated with the highest density of earthquakes.

Putting it together

We can put the time and location data together in the form of an animated .gif file. Each snapshot of the animation below represents a time slice of about 6 months. (I intend to add dates to the animation later.)
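One way to build such an animation is to cut the events into six-month slices, plot each slice, and export the list of frames. The events below are random stand-ins stored as {decimalYear, longitude, latitude}, and the file name is a placeholder.

```mathematica
(* Hypothetical events as {decimalYear, longitude, latitude}, NOT the real data *)
SeedRandom[2];
events = Table[
   {RandomReal[{2010, 2014}], RandomReal[{-100, -95}], RandomReal[{34, 37}]},
   {200}];

sliceLength = 0.5;                         (* slice width in years *)
starts = Range[2010, 2013.5, sliceLength]; (* start of each slice *)

(* One frame per slice; the plot label shows the start of the slice *)
frames = Table[
   ListPlot[
    Cases[events,
     {t_, lon_, lat_} /; t0 <= t < t0 + sliceLength :> {lon, lat}],
    PlotRange -> {{-100, -95}, {34, 37}},
    PlotLabel -> t0],
   {t0, starts}];

Export["quakes.gif", frames, "DisplayDurations" -> 0.5]
```

Fixing PlotRange keeps the map from jumping between frames, and the PlotLabel is a cheap way to put a date on each slice.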


You have to look quick, but on the last time-slice of the gif the earthquake activity really spikes. This is also seen in the graph in the introduction. Cumulative analysis shows that the number of earthquakes between Jan 2013 and Jun 2014 is more than triple Oklahoma's cumulative total counted from Jan 1970! There seem to be three big swarms, which will be a subject of further investigation.

It's important to note that the current correlation of earthquakes and drilling, by itself, does not establish causation. However, I agree with editorial writer Wayne Greene that the correlation is strong enough that the drilling and oil exploration companies have some 'splainin' to do. Umbrellas don't cause rain, but if you see a lot of them, you're dang sure it's raining.

Clearly more disclosure of data will help protect business interests.