Summary
Real World Accuracy for Most Locations on Earth:
Type | Standard Error | RMSE | Worst Error |
---|---|---|---|
Single Point | 0.8 °C | 1.2 °C | 4.2 °C |
Borehole Average | 0.7 °C | 1.1 °C | 3.9 °C |
Inaccurate Zones:
Predictions will be less accurate than the above numbers if...
- The ground is influenced by strong nearby volcanic activity or a hot spring.
  Photo by the Bureau Of Land Management via Flickr CC2.
- The location is in an extreme arctic climate (average annual temperature of -5 °C or lower).
  Photo by the Royal Olive via Flickr CC BY-NC-ND 2.0.
- The ground is in the direct path of cold mountain water runoff (e.g. steep mountain valleys).
  Photo by the David McKelvey via Flickr CC BY-NC-ND 2.0.
- The location is on the coast of an ocean or large lake, where cold water can seep into the nearby ground.
  Photo by denisbin via Flickr CC BY 2.0.
Limitations:
Our model cannot predict...
- Ground that is under water
- Antarctica
- Extreme geothermal hot spots (model is limited to predict a maximum geothermal gradient of 80-120 °C/km depending on depth)
Visualizing The Accuracy
Accuracy By Depth
Error Magnitudes (excluding "Inaccurate Zones", see below)
Predicted Temperatures vs Measured Temperatures
Maps Made using our Predictions
At 1 meter Depth
15-200 meter Depths
Testing Methodology
The metrics on this page represent the real-world performance of GTP, even in areas where no data exists.
We know this because the data used to generate the graphs and metrics on this page comes from an isolated dataset to which the AI/ML had zero exposure prior to release. This testing data is separated from all training and validation data by geographic location, data source, and many other variables. The separation was achieved using an in-house algorithm designed to select the data that was most distinctive, and therefore the most difficult to predict.
Image: Testing Dataset Locations Colored by Data Source
To eliminate possible systematic biases tied to the source of the data, we removed entire data sources from the main dataset and reserved them for the final testing process. This map shows the geographic testing locations colored by data source. Our testing set includes over 6,500 datapoints gathered from 63 unique data sources: primarily peer-reviewed scientific papers, but also government reports and other sources.
Image: Demonstration of Geographic Exclusion Circles used to Isolate Testing Data
Our testing dataset includes 546 unique locations. One of the most essential parts of our testing methodology was the creation of the circles shown in the accompanying image. We used a complex algorithm to select 136 different locations on Earth, and then drew a 100 km diameter circle around each of them. To simulate making predictions in areas with no data available, ALL data inside these circles was removed from the training and validation datasets and used ONLY during testing, so the AI/ML had no prior knowledge of these areas. The statistics you see on this page (for example "Standard Error = 0.8 °C") are derived only from predictions made inside these circles.
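The geographic exclusion described above can be illustrated with a minimal sketch. It assumes each datapoint carries a latitude and longitude, and it treats a 100 km diameter circle as a 50 km great-circle radius. The function names, the record layout, and the sample coordinates are hypothetical, and the in-house algorithm that selects the circle centers is not shown.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def split_by_exclusion_circles(points, centers, diameter_km=100.0):
    """Send every point inside any circle to the test set, all others to training."""
    radius_km = diameter_km / 2
    train_rows, test_rows = [], []
    for p in points:
        inside = any(
            haversine_km(p["lat"], p["lon"], c_lat, c_lon) <= radius_km
            for c_lat, c_lon in centers
        )
        (test_rows if inside else train_rows).append(p)
    return train_rows, test_rows


# Hypothetical circle center and datapoints, for illustration only.
centers = [(49.9, -97.1)]
points = [
    {"id": "a", "lat": 49.9, "lon": -97.1},  # at the center
    {"id": "b", "lat": 50.2, "lon": -97.1},  # roughly 33 km north
    {"id": "c", "lat": 51.0, "lon": -97.1},  # roughly 122 km north
]
train_rows, test_rows = split_by_exclusion_circles(points, centers)
print([p["id"] for p in test_rows], [p["id"] for p in train_rows])
```

Points "a" and "b" fall inside the circle and are held out for testing, while "c" stays available for training and validation.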
Image: All Data Locations Colored by Data Source
Our training and validation datasets (separate from the testing dataset used for our accuracy statistics) include over 500,000 datapoints from 183 unique data sources. This map shows roughly where that data lies geographically. Note that there are large areas with no data at all. This is why we are so careful with our ML and testing methodology: it allows us to make accurate predictions even in areas with no data, since in many areas of the world physical sensor measurements are not feasible due to geography or economics.
Additional Note - Oversampling:
Before calculating the accuracy statistics on this page, the testing dataset was "oversampled" (some rows were duplicated) to better represent the model's whole-world performance and to reduce bias towards the locations and data sources with higher volumes of data. Some of the data sources we used, although high in quality, had very few data points. Likewise, some of the geographic areas we reserved for testing had very few data points within them, and we did not want either to be under-represented in our final statistical analysis. The oversampling algorithm first oversampled points from under-represented data sources, and then did a second pass, oversampling from under-represented geographic testing circles.
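A two-pass oversampling of this kind might look like the sketch below. This is not the actual algorithm: the `min_count` thresholds, the `source` and `circle` field names, and the rule of repeating rows in order until a group reaches the threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import cycle, islice


def oversample_by(rows, key, min_count):
    """Duplicate rows so that every group (by `key`) has at least `min_count` rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    out = []
    for members in groups.values():
        out.extend(members)
        shortfall = min_count - len(members)
        if shortfall > 0:
            # Repeat the group's existing rows, in order, to cover the shortfall.
            out.extend(islice(cycle(members), shortfall))
    return out


# Hypothetical test rows: one large data source and one small one.
rows = [
    {"source": "paper_A", "circle": 1, "error": 0.5},
    {"source": "paper_A", "circle": 1, "error": -0.2},
    {"source": "paper_A", "circle": 1, "error": 0.9},
    {"source": "report_B", "circle": 2, "error": 1.4},
]

pass1 = oversample_by(rows, "source", min_count=3)   # first pass: by data source
pass2 = oversample_by(pass1, "circle", min_count=4)  # second pass: by testing circle
print(len(rows), len(pass1), len(pass2))
```

After the first pass the single row from "report_B" appears three times, so it carries the same weight as "paper_A" when the accuracy statistics are computed.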
Details on Inaccurate Zones
Volcanic Areas and Hotsprings
Zone Definition
Within 30 km of a volcano, as well as anywhere in Iceland (due to Iceland's abnormal geology).
Photo by the Bureau Of Land Management via Flickr CC2.
Prediction Accuracy in this Zone
Type | Standard Error | RMSE | Worst Error |
---|---|---|---|
Single Point | 2.0 °C | 3.2 °C | 11.3 °C |
Borehole Average | 1.7 °C | 3.0 °C | 7.3 °C |
The current version of our model can accurately predict many locations,
even directly on active volcanoes. However, the areas immediately
around magma, and the hot springs frequently found around volcanoes,
are not yet accurately predicted by our model. Temperatures in
these areas can be extremely high, which can lead to very
large errors near a volcano. The "Worst Error" listed above is
based on our testing dataset, which does not include extremely hot
surface springs (20+ °C above the ambient temperature) or magma pockets.
If the location being tested is directly on a powerful hot spring or
magma pocket, the "Worst Error" could be much higher than shown.
Extreme Cold Climates
Zone Definition
Average annual temperature of -5 °C or lower.
Photo by the Royal Olive via Flickr CC BY-NC-ND 2.0.
Prediction Accuracy in this Zone
Type | Standard Error | RMSE | Worst Error |
---|---|---|---|
Single Point | 1.7 °C | 3.6 °C | 7.4 °C |
Borehole Average | 1.2 °C | 3.4 °C | 4.5 °C |
In testing our model in extremely cold climates, we have
observed that it tends to underestimate how cold permafrost
can get, especially at greater depths. In other words, the
geothermal gradient it predicts is too high.
Areas with Highly Abnormal Groundwater Flow
Zone Definition
Areas in the flow path of water that is abnormally cold because it comes from nearby areas of much higher altitude, such as nearby mountains. Our model has an especially hard time with these areas when they are also in an arid climate.
Photo by the David McKelvey via Flickr CC BY-NC-ND 2.0.
Prediction Accuracy in this Zone
Type | Standard Error | RMSE | Worst Error |
---|---|---|---|
Single Point | 1.4 °C | 2.0 °C | 6.0 °C |
Borehole Average | 1.3 °C | 1.8 °C | 4.8 °C |
The next version of this app will add a complex groundwater
model that will allow it to predict temperatures in areas
that are highly affected by groundwater flow.
Coastal Areas
Zone Definition
Areas with porous ground near large bodies of water such as oceans or large lakes. In very humid climates this temperature effect generally either does not exist or extends at most 1 km inland, but in very arid areas with porous ground layers seawater can intrude up to 30 km, with some effect on ground temperatures.
Photo by denisbin via Flickr CC BY-NC-ND 2.0.
Prediction Accuracy in this Zone
Type | Standard Error | RMSE | Worst Error |
---|---|---|---|
Single Point | 1.6 °C | 2.3 °C | 7.0 °C |
Borehole Average | 1.4 °C | 1.9 °C | 5.8 °C |
This source of inaccuracy is something that we believe we can fix fairly soon, but for now, if you are predicting in coastal areas, be aware that the predicted temperatures will likely be too high. The more your point is surrounded by water (for example a spit, peninsula or small island), the more this applies.
Human Activity
Zone Definition
Either in the shallow soil (< 10 meters) where recent construction has occurred (e.g. a new suburb or highway), and/or within a few hundred meters of a geothermal well, fracking site, reinjection well or industrial wastewater disposal well.
Photo by Warren via Flickr CC2.
Prediction Accuracy in this Zone
Single-Point Prediction Accuracy in this Zone
Standard Error | RMSE | Worst Error |
---|---|---|
1.3 °C | 2.3 °C | 5.3 °C |
Our predictions are generally very accurate in urban areas, and even account for the urban heat effect. However, there can be localized areas of inaccuracy where we lack data about a thermally significant human activity. In our testing data this inaccuracy was most often seen on the outskirts of very large cities.
How to Verify the Accuracy Yourself
We want you to have the same confidence in these predictions as we have, and the best way to achieve that is to test the model yourself. We encourage this, but there are some important things to keep in mind to ensure that your test is valid...
Input Location Accuracy is very Important:
- Make sure you input the exact location of the temperature measurement you are comparing against. In some areas, especially mountainous areas, ground temperatures can change significantly over short distances (tens to hundreds of meters). If you use a location within 250 meters of the real location that is good enough in most situations, but the more accurate your location the better, especially in areas with steep terrain.
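The 250 meter guideline above can be checked before running a comparison. The sketch below is a minimal example that uses the haversine formula on a spherical Earth; the function names and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def location_is_close_enough(entered, measured, tolerance_m=250.0):
    """True if the entered (lat, lon) lies within the tolerance of the measurement site."""
    return distance_m(*entered, *measured) <= tolerance_m


# Hypothetical borehole location and two candidate input locations.
measured_site = (47.3769, 8.5417)
entered_near = (47.3780, 8.5417)  # roughly 120 m north of the site
entered_far = (47.3800, 8.5417)   # roughly 345 m north of the site

print(location_is_close_enough(entered_near, measured_site))
print(location_is_close_enough(entered_far, measured_site))
```

In steep terrain a tighter `tolerance_m` than the default may be appropriate, since ground temperatures there can change over tens of meters.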
Measured Temperature does not always equal Ground Temperature:
- This model is intended to predict ground temperatures (a.k.a. "formation temperature") which is NOT always the same thing as the temperature of water in the ground, and is certainly not the same thing as the temperature of water flowing from a borehole (see below if this is the only type of data you have). In most cases your data WILL be from a fluid-filled borehole, so you will have to take a few things into account when comparing it to our predictions...
- When comparing temperature measurements to our predictions we must account for the following factors:
- Drilling Disturbances: The most common and significant source of error in borehole temperature logs; drilling disturbance is a factor when a measurement is made in a borehole too soon after drilling has been completed. For shallow boreholes the recommended time to wait before measurement is at least 5 days (and this recommended time increases for deeper boreholes, up to several months), however this depends a lot on the circumstances of the drilling. The greater the difference between the drilling fluid temperature and the ambient ground temperature, and the more porous the ground, the longer you will have to wait before getting an accurate temperature reading from a borehole.
- Groundwater flow disturbances: When there is not a casing along its entire depth, the borehole can sometimes create a new flow path between soil or rock layers, resulting in significant vertical water flow within the borehole. In these cases fluid temperatures will bear little resemblance to actual ground temperatures.
- Temperature testing methodology: Temperature readings should be taken while the sensor is being lowered, not being raised, in the borehole. And the lowering should be done very slowly to avoid disturbing the fluid column as much as possible.
- Depth: Generally the top sections of the fluid column (several meters up to tens of meters, depending on conditions) have temperatures that deviate further from the surrounding ground temperature due to the temperature of the air and/or seasonal temperature changes that travel differently through water than through the ground.
- Convection currents: Cold water sinking and hot water rising (tending to reduce the overall measured gradient of a fluid column compared to the reality of the surrounding ground) are also a potential source of error, especially in larger diameter boreholes. The smaller the borehole diameter, the less of an issue this is.
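Once your measurements have been screened for the factors above, the comparison can be summarized numerically. The sketch below computes RMSE and worst absolute error using their standard definitions, which are assumed here to correspond to the "RMSE" and "Worst Error" columns on this page. "Standard Error" is left out because its exact formula is not stated. The temperature values are hypothetical.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between paired predictions and measurements."""
    errors = [p - m for p, m in zip(predicted, measured)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def worst_error(predicted, measured):
    """Largest absolute difference between a prediction and its measurement."""
    return max(abs(p - m) for p, m in zip(predicted, measured))


# Hypothetical temperatures in °C at four comparison points.
measured = [10.0, 11.0, 12.0, 13.0]
predicted = [10.5, 10.5, 13.0, 13.0]

print(round(rmse(predicted, measured), 3))  # 0.612
print(worst_error(predicted, measured))     # 1.0
```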
How to compare to TC / TRT test results:
We do not expect our model's predictions to match TC/TRT test results for specific depths (due to the inaccuracies of measuring ground temperature by depth by pumping fluid out of a borehole). However, you can still validate that the model is accurate by comparing the whole-borehole average temperatures, as long as the TC/TRT test was done properly (for example, adequate time was allowed for drilling disturbances to dissipate).
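One way to obtain a whole-borehole average from a set of per-depth predictions is a depth-weighted mean, sketched below with the trapezoidal rule. The function name, the sample profile, and the TRT value are hypothetical, and the sketch assumes the predictions cover the same depth interval as the test borehole.

```python
def borehole_average(depths_m, temps_c):
    """Depth-weighted mean temperature of a profile, using the trapezoidal rule."""
    if len(depths_m) != len(temps_c) or len(depths_m) < 2:
        raise ValueError("need at least two matching depth/temperature pairs")
    area = 0.0
    for i in range(1, len(depths_m)):
        dz = depths_m[i] - depths_m[i - 1]
        area += 0.5 * (temps_c[i] + temps_c[i - 1]) * dz
    return area / (depths_m[-1] - depths_m[0])


# Hypothetical predicted profile for a 200 m borehole (depth in m, temperature in °C).
depths = [0.0, 50.0, 100.0, 200.0]
predicted = [9.0, 9.5, 10.5, 13.0]

predicted_average = borehole_average(depths, predicted)
trt_average = 10.9  # hypothetical whole-borehole result from a TRT report

print(round(predicted_average, 4))                # 10.6875
print(round(predicted_average - trt_average, 4))  # -0.2125
```

Weighting by depth interval matters when the prediction depths are unevenly spaced; a plain mean of the four values would over-weight the shallow section.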
How to compare Fluid-Producing well test results:
This applies for example to water wells where fluid is pumped out of the well and the temperature of that fluid coming out of the well is measured.
It is very hard to get accurate ground temperature estimates from this measurement method, because the fluid temperature exiting the well is primarily a representation of the ground temperature at the specific depth where the fluid is coming from, but it is also influenced by the ground temperatures above that as it flows upwards through the well. You should only use this as a last resort validation method, and take the results with a grain of salt.
To ensure the best comparison with this method, you need to determine the depth, or range of depths, from which the fluid is being extracted, and whether the fluid spends enough time travelling UP the well to be significantly influenced by the temperatures at shallower depths. Compare your temperature readings to our predictions for the average temperature over the depth range where the fluid is coming from, and bias this depth range towards the shallower depths if the fluid spends a lot of time flowing up the well.
Selected Study Comparisons from Testing Dataset
Istanbul
Model results are represented by the dashed lines; solid lines are the measured data from the study. This graph compares the data from Fig. 9 of the study to the predictions made by our model for the exact same times, depths and locations. This data is in the seasonal variation zone, which is very difficult to model, and the model is still able to predict the temperature with a very high degree of accuracy.
Winnipeg
Borehole 'GSC 7001' is the most thoroughly measured borehole in this study. This graphic compares the measurements from Figure 5 of the study to the predictions made by our model for the exact same times, depths and locations. Model results are represented by the dashed lines; solid lines are the measured data from the study. Our predictions are all within 1 degree of the measured data, but as you can see the shape of the temperature profile is different. This is why, although our model is accurate for a given depth and point, the thermal gradients it predicts for the top 200 meters may still be inaccurate.
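The difference between point accuracy and gradient accuracy can be made concrete with a small sketch. The numbers below are hypothetical and are not taken from the Winnipeg study. The gradient is estimated as the least-squares slope of temperature against depth, which is one common choice and an assumption here.

```python
def gradient_c_per_km(depths_m, temps_c):
    """Least-squares slope of temperature versus depth, in °C per km."""
    n = len(depths_m)
    mean_z = sum(depths_m) / n
    mean_t = sum(temps_c) / n
    sxy = sum((z - mean_z) * (t - mean_t) for z, t in zip(depths_m, temps_c))
    sxx = sum((z - mean_z) ** 2 for z in depths_m)
    return 1000.0 * sxy / sxx


# Hypothetical profiles (depth in m, temperature in °C).
depths = [50.0, 100.0, 150.0, 200.0]
measured = [6.0, 6.5, 7.0, 7.5]
predicted = [6.8, 6.9, 7.0, 7.1]

# Every prediction is within 1 °C of its measurement...
worst = max(abs(p - m) for p, m in zip(predicted, measured))
# ...yet the fitted gradients differ by a factor of five.
grad_measured = gradient_c_per_km(depths, measured)
grad_predicted = gradient_c_per_km(depths, predicted)

print(round(worst, 2), round(grad_measured, 2), round(grad_predicted, 2))
```

Here the worst point error is 0.8 °C, while the measured profile has a gradient of about 10 °C/km and the predicted profile about 2 °C/km.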
Zurich
In Zurich
Near Zurich Influenced By Mountain Runoff
Comparison to data from Fig. 6 from this study demonstrates again our model's accuracy even in areas affected by the Urban Heat Effect. However, this study also, once again, demonstrates that our model, even where it provides accurate point temperature predictions, does not guarantee accurate shallow thermal gradients.
More Information
This Ground Temperature Predictor (GTP) application is a hybrid physics and machine learning software.
Using datasets collected from governments and researchers around the globe, with total datapoints in the millions, the algorithm was trained with Artificial Intelligence (AI)/Machine Learning (ML) using extensive hyperparameter searching, statistical data processing, and traditional thermodynamic modeling to ensure robust data projections.
GTP leverages large quantities of covariate data for key geospatial, topographical, geophysical, thermophysical, climatic, and human impact variables to generate predictions.
This software has been thoroughly tested against measured data and is comparable to traditional calculation methods for ground temperature when the thermal properties of the exact site have already been measured (which almost never happens). Otherwise, for the vast majority of situations, GTP far outperforms traditional methods, and not only in terms of accuracy: acquiring ground temperature estimates with this tool is many times faster and easier than conventional methods.
This software is continuously updated by the experts at Umny Inc. to further improve its accuracy and speed. If you have questions, feedback, or experimental data you would like to test, please feel free to send us an email.