Scientists find molecular patterns that may help identify extraterrestrial life
A new study by a joint Japan/US-based team, led by researchers at the Earth-Life Science Institute (ELSI) at the Tokyo Institute of Technology, reports on a machine learning technique that assesses complex organic mixtures using mass spectrometry to classify them as biological or abiological.
Mass spectrometry (MS) is a principal technique that scientists will rely on in spacecraft-based searches for extraterrestrial life.
The technique can simultaneously measure multitudes of compounds present in samples, and thus provide a sort of "fingerprint" of their composition. Nevertheless, interpreting those fingerprints can be tricky.
As far as scientists can tell, all life on Earth is based on the same highly coordinated molecular principles, which gives them confidence that all Earth life is derived from a common ancient terrestrial ancestor.
However, in simulations of the primitive processes that scientists believe may have contributed to life's origins on Earth, many similar but slightly different versions of the particular molecules terrestrial life uses are often detected. Furthermore, naturally occurring chemical processes are also able to produce many of the building blocks of biological molecules.
Since we still have no known sample of alien life, this leaves scientists with a conceptual paradox: did Earth life make some arbitrary choices early in evolution that were then locked in, so that life could be constructed otherwise, or should we expect all life everywhere to be constrained in exactly the same way it is on Earth? How can we know whether the detection of a particular molecule type indicates that it was produced by extraterrestrial life?
It has long troubled scientists that biases toward life forms similar to Earth life might cause their detection methods to fail. Viking 2, in fact, returned odd results from Mars in 1976. Some of the tests it conducted gave signals considered positive for life, but the MS measurements provided no evidence for life as we know it.
More recent MS data from NASA's Mars Curiosity rover suggest there are organic compounds on Mars, but they still do not provide evidence for life.
A related problem has plagued scientists attempting to detect the earliest evidence for life on Earth: Can we tell if signals detected in ancient terrestrial samples are from the original living organisms preserved in those samples, or derived from contamination by organisms that presently occupy the planet?
Scientists at the Earth-Life Science Institute at the Tokyo Institute of Technology in Japan and the National High Magnetic Field Laboratory (The National MagLab) in the U.S. addressed this problem using a combined experimental and machine learning computational approach.
Using ultra-high-resolution MS, a technique known as Fourier-transform ion cyclotron resonance mass spectrometry (FT-ICR MS), they measured the mass spectra of a wide variety of complex organic mixtures. These included abiological samples made in the lab (which they are fairly certain are not living); organic mixtures found in meteorites, which are ~4.5-billion-year-old samples of abiologically produced organic compounds that appear never to have been alive; and laboratory-grown microorganisms that fit all the modern criteria of life, including novel microbial organisms isolated and cultured by ELSI co-author Tomohiro Mochizuki. They also measured unprocessed petroleum, which is derived from organisms that lived long ago on Earth and so provides an example of how the "fingerprint" of known living organisms might change over geological time.
These samples each contained tens of thousands of discrete molecular compounds, which provided a large set of MS spectra that could be compared and classified.
In contrast to approaches that use the accuracy of MS measurements to identify each peak with a particular molecule in a complex organic mixture, the researchers instead aggregated their data and looked at the broad statistics and distribution of signals.
Complex organic mixtures, such as those derived from living things, petroleum, and abiological samples, present very different "fingerprints" when viewed in this way.
Such patterns are much more difficult for a human to detect than the presence or absence of individual molecule types.
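As a rough illustration of what looking at the broad distribution of signals could mean in practice, the sketch below collapses a list of peaks into a coarse, normalized histogram instead of assigning each peak to a molecule. It is not the team's actual pipeline: the function name, the bin width, the m/z range, and the toy peak values are all assumptions made for this example.

```python
def coarse_fingerprint(peaks, bin_width=50.0, mz_min=100.0, mz_max=600.0):
    """Collapse a peak list into a normalized histogram over coarse m/z bins.

    peaks: iterable of (m/z, intensity) pairs. No peak is identified as a
    specific molecule; only the overall distribution of signal is kept.
    The default bin width and range are illustrative assumptions.
    """
    n_bins = int((mz_max - mz_min) / bin_width)
    totals = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            # Clamp the index to guard against floating-point edge cases.
            index = min(int((mz - mz_min) // bin_width), n_bins - 1)
            totals[index] += intensity
    total = sum(totals)
    # Normalize so that samples of different overall intensity are comparable.
    return [t / total for t in totals] if total > 0 else totals


# Hypothetical toy peaks; the last one falls outside the range and is ignored.
toy_peaks = [(120.0, 2.0), (130.0, 2.0), (420.0, 4.0), (700.0, 9.0)]
fingerprint = coarse_fingerprint(toy_peaks)
```

Each sample then becomes a short vector of fractions that sum to one, which is far easier to compare across samples than tens of thousands of individual peaks.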
The researchers fed their raw data into a machine-learning algorithm and, surprisingly, found that it could classify the samples as living or non-living with ~95% accuracy.
Importantly, this result held after the raw data had been simplified considerably, making it plausible that lower-precision instruments used on spacecraft could obtain data of sufficient resolution to reach the classification accuracy the team obtained.
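The article does not say which machine-learning algorithm was used, so the following is only a minimal sketch of the general idea: classifying simplified spectral summaries as biological or abiological and estimating accuracy on held-out samples. A nearest-centroid classifier stands in for the unnamed algorithm, and the three-number feature vectors and labels are invented toy data, not measurements from the study.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def train(samples):
    """Build a {label: centroid} model from (features, label) pairs."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(vectors) for label, vectors in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], features))


def leave_one_out_accuracy(samples):
    """Hold out each sample in turn, train on the rest, and score the guess."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        model = train(samples[:i] + samples[i + 1:])
        correct += predict(model, features) == label
    return correct / len(samples)


# Invented toy data: coarse signal fractions for six hypothetical samples.
toy_samples = [
    ([0.70, 0.20, 0.10], "biological"),
    ([0.60, 0.30, 0.10], "biological"),
    ([0.65, 0.25, 0.10], "biological"),
    ([0.20, 0.30, 0.50], "abiological"),
    ([0.10, 0.30, 0.60], "abiological"),
    ([0.15, 0.35, 0.50], "abiological"),
]
accuracy = leave_one_out_accuracy(toy_samples)
```

Holding each sample out before scoring it matters here: an accuracy figure measured on the same samples used for training would say little about how the method might perform on a genuinely unknown sample.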
The underlying reasons for the classification accuracy remain to be explored, but the team suggests it reflects how biological processes, which modify organic compounds differently than abiological processes do, are tied to the processes that enable life to propagate itself. Living systems have to make copies of themselves, while abiological processes have no internal mechanism controlling this.
"This work opens many exciting avenues for using ultra-high resolution mass spectrometry for astrobiological applications," says co-author Huan Chen of the U.S. National MagLab.
Lead author Nicholas Guttenberg adds, "While it is difficult if not impossible to characterize every peak in a complex chemical mixture, the broad distribution of components can contain patterns and relationships which are informative about the process by which that mixture came about or developed.
"If we're going to understand complex prebiotic chemistry, we need ways of thinking in terms of these broad patterns—how they come about, what they imply, and how they change—rather than the presence or absence of individual molecules.
"This paper is an initial investigation into the feasibility and methods of characterisation at that level and shows that even discarding high-precision mass measurements, there is significant information in peak distribution that can be used to identify samples by the type of process that produced them."
Co-author Jim Cleaves of ELSI says, "This sort of relational analysis may offer broad advantages for searching for life in the solar system, and perhaps even in laboratory experiments designed to recreate the origins of life." The team plans to follow up with further studies to understand exactly what aspects of this type of data analysis allow for such successful classification.