Stanford, California - Borrowing insights from techniques used to image cancer, Stanford scientists have devised a new method for generating "training images" that can be used to fine-tune models of uncertainty about underground physical processes and structures.

Having an accurate picture of the planet's subsurface is crucial for rational decision-making across a wide variety of activities, including environmental cleanup, oil and natural gas well-drilling, and the underground sequestration of greenhouse gases that contribute to climate change.

But oftentimes, it's impossible to know all of the details about a subsurface system. It's therefore important to be able to systematically calculate the myriad unknown configurations the system might take – in other words, the uncertainty associated with that system – and then collect data to rule out unlikely scenarios.

"If you're using very simple models that represent the subsurface as simply layered, or homogeneous, then you could be in for a rude shock when you start drilling for oil or water, or injecting cleanup chemicals into a contaminated aquifer," said geostatistician Jef Caers, a professor of geological sciences at Stanford's School of Earth, Energy & Environmental Sciences.

"The wells you drill could be dry and the cleanup project that you thought would take a few months could take years. These things have actually happened, and it was due to faulty assumptions about uncertainties," Caers said.

Fixing the picture

The solution is to work from more realistic uncertainty models of the subsurface, but studying the deep Earth is difficult. The steady buildup and erosion of sediments that leads to the creation of complex structures and layers in the subsurface takes place over thousands or hundreds of thousands of years and is not easily observable.

"Ideally, we would slice a riverbed to see its vertical profile, but in reality you can't do that," said Céline Scheidt, a senior research engineer in the Department of Energy Resources Engineering.

Instead, scientists use a combination of real-world and computer-generated models to replicate the physical process or system that is of interest – for example, the deposition of sediments at the mouth of a river or the complex accumulation of rock layers. They then use geostatistical tools such as multiple-point statistics (MPS) to quantify uncertainty about the subsurface.

"We want to fundamentally understand uncertainty in naturally variable systems," said Caers, who recently published a book on MPS. "For example, in a deltaic environment, you want to know all the possible deltaic configurations you can possibly imagine. Then, when you get subsurface data about the site, you can rule out those configurations that are not consistent with the data."

Better results by training

MPS algorithms have gained popularity in the past decade due to their ability to produce geologically realistic representations of actual variability. MPS works by using "training images" to narrow down the possible permutations that a system can take. Training images contain numerical patterns that are deemed representative of a particular geologic system. Typically, only one training image is used, but for situations where large uncertainty is present, this is not sufficient.
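To make the idea of "numerical patterns" concrete, here is a minimal Python sketch that tallies how often each small block of cells appears in a toy training image. It is an illustration only, not the algorithm used in the study: the function name pattern_counts, the 2-by-2 window size and the made-up grid values are all assumptions introduced for this example.

```python
from collections import Counter


def pattern_counts(image, size=2):
    """Count every size x size pattern found in a 2D grid of category codes."""
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            # Flatten the window into a hashable tuple so it can be counted.
            window = tuple(
                image[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            )
            counts[window] += 1
    return counts


# Hypothetical toy training image: 1 = channel sand, 0 = background mud.
training_image = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

counts = pattern_counts(training_image)
total = sum(counts.values())
for pattern, n in counts.most_common():
    # Relative frequency of each multi-cell pattern in the training image.
    print(pattern, n / total)
```

Pattern frequencies like these are the kind of information a simulation can draw on to decide which local arrangements are plausible. A single training image can only supply the patterns it happens to contain, which is one way to see why one image may fall short when uncertainty is large.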

In a new study, Scheidt, Caers and their colleagues devised a new method for selecting multiple training images that can be plugged into MPS algorithms to help simulate natural variability.

The first step involved creating a miniature river basin in a lab and running an experiment that was designed to replicate the dynamic deposition of sediments in a delta. The experiment, conducted by Anjali Fernandes, a postdoctoral researcher at Tulane University, and Chris Paola, a sedimentologist at the University of Minnesota, was designed to simulate tens of thousands of years' worth of river deposition in a few days.

An overhead camera took snapshots of the experiment at one-minute intervals, and the team selected 136 images – representing about two hours – for analysis.

Once the team had a pool of images from which to select training images, two questions arose: How should the training images be selected? And how many are required to capture the variations in water flow and sediment deposition in the experiment?

"By examining the images, it became clear that more than one training image was needed," Scheidt said, "but using all of the images as training images would be impractical and unnecessary, as most likely not all of the images provide new information."

Taking a cue from cancer

The solution to their dilemma came from an unusual source. Lewis Li, a graduate student in Caers' lab, mentioned the problem to his roommate, a neuroscientist, who told Li about the "demon algorithm," which doctors use to gauge the efficacy of cancer treatments.

"The demon algorithm calculates the strain acting on a moving image over time," Caers said. "Doctors use it to determine whether a tumor's shape change is due to radiation therapy by seeing where major morphological changes to the tumor are occurring and how those changes correlate with times of treatment."

While the demon algorithm is primarily used to spot major changes in a system, Scheidt realized it could also be used to analyze a series of images and select those taken during periods when changes to a system are small and similar in nature. When the team applied the demon algorithm to their 136 snapshots, it chose six images that it deemed to be representative of the overall types of depositional changes occurring throughout the experiment.
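The general task of picking a few representative snapshots from a longer sequence can be sketched in Python. This is explicitly not the demon algorithm and not the team's procedure: it swaps in a plain cell-by-cell mismatch score and a greedy "farthest point" selection, and the function names, the toy 2-by-2 snapshots and the choice of three representatives are all assumptions made for illustration.

```python
def mismatch(a, b):
    """Fraction of cells that differ between two equally sized 2D grids."""
    pairs = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x != y for x, y in pairs) / len(pairs)


def pick_representatives(images, k):
    """Greedy farthest-point selection: indices of up to k spread-out images."""
    chosen = [0]  # start from the first snapshot (an arbitrary choice)
    while len(chosen) < min(k, len(images)):
        # Distance from each image to its nearest already-chosen image.
        nearest = [
            min(mismatch(img, images[j]) for j in chosen) for img in images
        ]
        # The image that is currently worst represented by the chosen set.
        best = max(range(len(images)), key=nearest.__getitem__)
        if nearest[best] == 0:
            break  # every remaining image duplicates a chosen one
        chosen.append(best)
    return sorted(chosen)


# Hypothetical toy snapshots: 1 = wet cell, 0 = dry cell.
snapshots = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 1]],
    [[1, 1], [0, 0]],
    [[1, 1], [0, 1]],
    [[1, 1], [1, 1]],
]

print(pick_representatives(snapshots, 3))
```

The sketch shares only the broad goal described above: keeping a handful of images that together cover the range of states in the sequence, rather than using every snapshot. A deformation-based measure such as the one the demon algorithm provides would replace the crude mismatch score used here.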

When the team used those six images as training images for the MPS algorithm, they found that the variability generated by their geostatistical method matched well with the "natural variability" represented by the overhead snapshots of the experimental basin.

"This research is a first step toward bridging the fundamental gap that exists between those fields of science focusing on the physical understanding of natural systems and the statistical representation of uncertainty when modeling complex systems," Caers said.