Digging on deep data: a real-world global treasure hunt
Yhana Lucas
Jan 31
Can data scientists make geology a real science?
Geologists are often given a pretty hard time by other scientists (and TV characters) about geology being more of an art than a science; unpacking why unveils a huge opportunity for data scientists.
A mind-boggling 4.5 billion years of chaos has created Planet Earth as we know it today, so the locations and types of mineral deposits are due to epic space collisions, massive moving plates of rock and magma, and a dynamo in the centre as hot as the surface of the sun somehow producing a stabilizing magnetic field.
Going back to basics, most modern sciences follow the scientific method to test and modify hypotheses; observe, measure, experiment, analyse, report and so on.
Well, in geology, it’s just not that easy.
Mineralization processes, including mountain building, earthquake activity and groundwater flow for example, are a tightly-coupled web of causes and effects spanning from the nanoscale to the radius of the earth, over timescales from microseconds to billions of years.
This makes it really hard to build a rigorous, quantitative physical model for ore body formation (beyond the one-hit-wonder that is the planet we live on).
Geologists are, in a way, passive scientists, mere witnesses of the amazing processes that happen around our one Earth.
Since we can’t use physical models to easily explain orebodies at the scales we need to, there’s a huge opportunity for data scientists to explain them with data instead!
The current standard for mineral exploration is an iterative process of collecting different datasets and subjectively applying descriptive geological interpretation to known deposit styles, piece by piece, with valuable results that are few and far between.
It’s rather like blind men encountering their first elephant, and each investigating only one part of the creature, but claiming to have a whole picture.
And because the interpretation of the data is carried out subjectively by geologists, there are plenty of examples (e.g. Mt Remarkable in Western Australia) where explorers have missed things that were staring them in the face, because they were looking for a different deposit type, or have drilled in the wrong place because they’ve interpreted the geophysics incorrectly, or simply because geologists are better at visualising ore deposit models in two-dimensional slices rather than a three-dimensional volume.
This status quo of holistic interpretation and testing, guessing at secrets held within the earth’s depths, has contributed to the extended periods of time spent on discovering any deposit.
All of this makes mineral exploration an intriguing realm for data scientists, while making data scientists an invaluable asset to exploration companies.
With such large amounts of ground to cover, data to work with, and stories to tell, it is a perfect area for machine learning to once again prove itself.
Machine learning has catapulted us forwards in several areas, which aren’t unlike minerals exploration in their challenges and opportunities — for example, remote sensing and agriculture.
Deep learning and deep recurrent networks have been applied in remote sensing to perform image processing, interpretation, data fusion and time-series analysis — such as removing cloud cover, classifying objects, detecting and tracking targets.
Remote sensing advancements have even affected the geology world in recent years; for instance, Geoscience Australia, in collaboration with CSIRO (the Commonwealth Scientific and Industrial Research Organisation, Australia’s peak public research-body), has developed algorithms to generate cover depth predictions.
In agriculture, the same machine learning methods have applications for detecting weeds and disease, predicting yield and crop quality, and managing fauna, water, and soil.
But even with the leaps and bounds achieved in these fields, there are still many opportunities where data science could be applied to mineral exploration.
So, how and why?
As you might have gathered from the “Geology 101” first half of this article, it’s not easy to identify economic mineral deposits.
Because positive results are so rare, applying data science isn’t as simple as cutting and pasting legacy machine learning algorithms, and physical models for inversion just aren’t available.
The field requires more complex deep learning applications, not simple AI or rudimentary machine learning.
Deep learning in the broader field of AI (Goodfellow et al. 2016)
There are a few particular challenges which the right approaches could overcome.
For instance, whilst recognising different geological data markers is easy enough for a machine, every deposit is different.
So, deposit-finding isn’t as simple as training based on defined positives and negatives in a large training dataset — because each recognisable past scenario will likely never occur again.
However, machines can be guided by strong priors and identify useful information in datasets which may come together to suggest a deposit.
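To see why rare positives and strong priors matter so much, here is a minimal sketch using Bayes' rule. All the numbers are hypothetical and chosen only for illustration: a model that flags 90% of real deposits and raises a false alarm on 5% of barren ground, applied first at a 1-in-1,000 base rate and then with a more favourable prior of the kind a geologist might supply.

```python
def posterior_deposit_probability(prior, sensitivity, false_positive_rate):
    """Bayes' rule: probability of a deposit given that the model flagged the target."""
    true_alarms = sensitivity * prior
    false_alarms = false_positive_rate * (1.0 - prior)
    return true_alarms / (true_alarms + false_alarms)


# Hypothetical numbers, for illustration only.
base_rate = 0.001       # assume 1 in 1,000 targets hosts an economic deposit
informed_prior = 0.02   # assumed prior for ground a geologist rates as favourable

for label, prior in [("base rate", base_rate), ("informed prior", informed_prior)]:
    p = posterior_deposit_probability(prior, sensitivity=0.9, false_positive_rate=0.05)
    print(f"{label}: prior={prior:.3f} -> posterior={p:.3f}")
```

With these assumed rates, a flagged target at the raw base rate has under a 2% chance of being a deposit, while the same flag on ground with the informed prior is right roughly a quarter of the time. The model is identical in both cases; only the prior has changed.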
Geological data is also complex and highly dimensional, but it’s also typically quite constrained.
For example, we know that certain minerals will always form together based on rock composition, temperature and pressure.
Manifold learning, which seeks to reduce dimensionality of non-linear datasets, could help solve the challenge of separating large-scale geological processes present in the dataset (e.g. regional metamorphic gradient) from smaller-scale processes indicative of a deposit (e.g. local chemical changes around veins which host gold).
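As a rough illustration of what manifold learning does, the sketch below implements a stripped-down version of one well-known method, Isomap, using only the Python standard library. It is not the approach of any particular exploration team, and the data is synthetic: points spread along a curved arc stand in for samples that vary along a single hidden gradient. The function name `isomap_1d` and its parameters are made up for this example.

```python
import math


def isomap_1d(points, n_neighbors=4, n_iter=200):
    """Minimal Isomap sketch: kNN graph -> geodesic distances -> 1-D classical MDS."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # 1. Keep only edges to each point's nearest neighbours (symmetrised).
    geo = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        geo[i][i] = 0.0
        nearest = sorted(range(n), key=lambda j: dist[i][j])[1:n_neighbors + 1]
        for j in nearest:
            geo[i][j] = geo[j][i] = dist[i][j]

    # 2. Floyd-Warshall shortest paths approximate distances along the manifold.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if geo[i][k] + geo[k][j] < geo[i][j]:
                    geo[i][j] = geo[i][k] + geo[k][j]

    # 3. Classical MDS: double-centre the squared geodesic distances...
    sq = [[d * d for d in row] for row in geo]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    # ...then find the leading eigenvector by power iteration.
    v = [math.sin(i + 1.0) for i in range(n)]  # arbitrary non-zero start vector
    for _ in range(n_iter):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return [math.sqrt(max(eigenvalue, 0.0)) * x for x in v]


# Synthetic data: 40 points along three quarters of a circle.
angles = [1.5 * math.pi * i / 39 for i in range(40)]
arc = [(math.cos(t), math.sin(t)) for t in angles]
embedding = isomap_1d(arc)
```

Neither raw coordinate tracks position along the arc on its own (the x value, for instance, falls and then rises again), but the single coordinate in `embedding` should follow it closely, up to an arbitrary sign. Real geochemical or geophysical data would be far noisier and higher-dimensional, and the neighbour count would need tuning.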
Warwick Anderson, founder of OreFox, who is using artificial intelligence techniques in mineral exploration, agrees, telling us “machine and deep learning are showing massive potential.”
“I think where geologists and data scientists need to work together is on things like normalization, transforms and outlier detection, or removal!” Anderson said.
“Geological maps can be massively extrapolated, so what value can we place on data from it?”
Multiscale approaches could be brought in here to handle data variety and sparsity, and to construct useful models in mineral exploration despite huge, noisy datasets.
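The normalization, transform and outlier steps Anderson mentions could look something like the sketch below. It makes two assumptions the article does not state: that the data is compositional (element concentrations), for which the centred log-ratio (CLR) transform is a common choice, and that outliers are flagged with a robust z-score built from the median and the median absolute deviation (MAD). The assay values are invented for illustration.

```python
import math
import statistics


def clr(composition):
    """Centred log-ratio transform of one strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def robust_z_scores(values):
    """Z-scores from the median and MAD, which resist extreme values.

    Assumes the MAD is non-zero, i.e. the values are not mostly identical.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return [0.6745 * (v - med) / mad for v in values]


# Hypothetical assays in ppm: (Cu, Zn, Pb). The last sample is copper-rich.
samples = [
    (50, 120, 30),
    (55, 110, 28),
    (48, 130, 33),
    (52, 125, 29),
    (60, 115, 31),
    (900, 118, 30),
]

copper_clr = [clr(s)[0] for s in samples]
scores = robust_z_scores(copper_clr)

# 3.5 is a commonly used cut-off for this kind of robust score.
flagged = [i for i, z in enumerate(scores) if abs(z) > 3.5]
print(flagged)
```

Whether a flagged sample is noise to remove or an anomaly worth following up is exactly the kind of call where the geologist and the data scientist need to be in the same room.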
As an added bonus, applying data science will also move the field away from descriptive, subjective analyses, towards an empirical, reproducible science.
“Geology is chaos,” according to Anderson.
“[It] can be very biased… [using] visual methods to determine rock types.
Rock types can change to other similar types in [the] space of a metre; granite to granodiorite for instance, which may have a statistical difference.”
But creating feedback loops in data science methods will ensure that no knowledge is forgotten or ignored just because the right geologist doesn’t happen to be looking at the right data at the right time.
Data science isn’t a panacea of course, and no method is able to solve the problem of resource discovery in one go.
To move forward, domain reduction is crucial, to progress towards a manageable problem and improve on the status quo.
To this end, an Australian project is opening up years of data on a large exploration project for data scientists to have at it, as part of a crowdsourcing competition with an A$1 million prize pool.
Data scientists and geoscientists working together on data (photo courtesy Unearthed Solutions)
The reward for geologists is greater than just financial recompense though.
The full history of earth’s structures has puzzled people for centuries, and minerals and metals are needed for communications devices, transport, energy, and medical technology.
Finding economic deposits of much-needed materials is exactly like looking for a proverbial needle in an equally proverbial haystack.
It’s a bit overused in the data science space, the needle-haystack analogy, but it fits.
And it fits just as well in geoscience; we could drop the hay and needle altogether and say that data science is often like looking for economic mineralization of needed materials, and vice versa, but that’s not quite as pithy.
Data science moves much faster to find the needle than mineral exploration, and finds far more lost treasures amongst the hay in the process.
And the geo world has many literal treasures yet to be uncovered!
Thank you to Holly Bridgwater, Jess Robertson, and Warwick Anderson for their input on this article.