A team of astronomers from the European Space Agency has pioneered an artificial intelligence solution that could revolutionize the search for rare astronomical phenomena in the Milky Way and beyond. David O’Ryan and Pablo Gómez developed a method that leverages AI to efficiently analyze the vast array of images collected over decades by the Hubble Space Telescope.
Using this technique, O’Ryan and Gómez identified more than 1,300 anomalous objects in the archive, over 800 of which had never been documented in the scientific literature.
The research relied heavily on the Hubble Legacy Archive (HLA), which has amassed an extensive collection of space-based astronomical data from tens of thousands of distinct observational programs over the past 35 years. The HLA holds nearly 100 million “cutouts”, small sections of the sky captured by Hubble. Each cutout might contain a distant galaxy or another intriguing celestial object, and there are far too many for any individual to review manually within a lifetime.
According to O’Ryan, the lead author of the study published in the journal Astronomy & Astrophysics, nearly 35 years of archived observations from the Hubble telescope have culminated in a formidable dataset ripe for uncovering astrophysical anomalies.
The Importance of Rare Objects
Rare astronomical objects, such as colliding galaxies, gravitational lenses, and ring galaxies, are essential for advancing our understanding of the cosmos. They reveal insights into galaxy formation, demonstrate how gravity distorts light, and deepen our knowledge of how gas behaves under extreme conditions. However, because they are rare and tend to be hidden among more common galaxies, finding them has historically been difficult for astronomers.
Traditionally, astronomers have depended on specialized visual searches and the involvement of citizen scientists to spot potential anomalies. These methods have proven effective, but they cannot keep pace with the volume of images that modern instruments produce.
Historically, telescopes focused on individual objects. However, the latest telescopes can survey extensive areas of the sky and gather immense volumes of data.
To automate the search through images of celestial objects captured by various telescopes, O’Ryan and Gómez developed an AI tool named AnomalyMatch.
Training AI to Detect the Unusual
AnomalyMatch employs a neural network, a form of AI that learns to discern patterns in images in a way loosely comparable to how the human brain processes visual information. Rather than sorting objects into an exhaustive list of anomaly types, the system makes a simpler call: normal or anomalous.
This distinction is crucial, given that few examples of certain anomalous objects exist. For instance, in this study, the initial training set included only three images of rare edge-on planet-forming disks alongside 128 “normal” images. Against that handful of labels stood nearly 100 million unlabeled images, any of which might contain an anomaly.
To train the neural network, labeled and unlabeled images were used together. Another key feature of AnomalyMatch was its active learning process: after each training round, the AI ranked images by how anomalous they appeared and presented the most unusual cases to an expert reviewer.
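The ranking-and-review step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the image identifiers, the scores, and the `expert` callback are all hypothetical, and the anomaly scores are assumed to have been produced by some already-trained model.

```python
from typing import Callable, Dict, List


def top_k_for_review(scores: Dict[str, float],
                     labeled: Dict[str, int],
                     k: int) -> List[str]:
    """Return the k not-yet-labeled image ids with the highest anomaly score."""
    unlabeled = [(img_id, s) for img_id, s in scores.items()
                 if img_id not in labeled]
    # Highest score first; ties are broken by id so the order is deterministic.
    unlabeled.sort(key=lambda item: (-item[1], item[0]))
    return [img_id for img_id, _ in unlabeled[:k]]


def active_learning_round(scores: Dict[str, float],
                          labeled: Dict[str, int],
                          expert: Callable[[str], int],
                          k: int) -> Dict[str, int]:
    """One review round: rank, ask the expert, return the enlarged label set."""
    updated = dict(labeled)  # copy so the caller's labels are not mutated
    for img_id in top_k_for_review(scores, labeled, k):
        updated[img_id] = expert(img_id)  # 1 = anomaly, 0 = normal
    return updated


# Toy data (hypothetical): anomaly scores from a model, one existing label.
scores = {"cutout_a": 0.91, "cutout_b": 0.12, "cutout_c": 0.77, "cutout_d": 0.95}
labeled = {"cutout_d": 1}
expert_answers = {"cutout_a": 1, "cutout_b": 0, "cutout_c": 0}

new_labels = active_learning_round(scores, labeled, expert_answers.__getitem__, k=2)
print(new_labels)
```

In a full loop, the enlarged label set would be used to retrain the model before the next ranking pass, so each round of expert review sharpens the following one.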
The result was a system that improved with each round by combining automated ranking with expert judgment. AnomalyMatch’s search marked the first systematic AI-driven analysis of the Hubble Legacy Archive. After a training period of fewer than four hours, the AI scrutinized nearly 100 million Hubble images in approximately 70 hours, relying on advanced computing resources.
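To put those figures in perspective, the implied average throughput can be worked out directly. The sketch below uses the rounded numbers quoted above (100 million images, 70 hours), so the result is only an order-of-magnitude estimate. The variable names are illustrative.

```python
# Rounded figures from the article; the true counts are slightly lower.
n_images = 100_000_000
inference_hours = 70

seconds = inference_hours * 3600           # 252,000 seconds
images_per_second = n_images / seconds     # average rate over the whole run

print(f"{images_per_second:.0f} images per second")  # about 397
```

That is several hundred cutouts every second, sustained for nearly three days.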
Discoveries Made by the AI
After filtering through the top results, approximately 5,000 candidates were examined, leading to 1,339 unique anomalies after removing duplicates. Notably, more than 800 of these had never been mentioned in scientific literature.
“This is a remarkable application of artificial intelligence to enhance the scientific value of Hubble’s data,” remarked co-author Gómez. “It’s astonishing that so many anomalous objects remained to be uncovered in the Hubble dataset.”
The largest group of newly discovered objects involved interacting or merging galaxies. These systems often exhibited warped shapes, frequently featuring multiple bright nuclei or elongated streams of stars and gas formed through gravitational interactions with companion galaxies. Around half of the objects identified in this search were merger systems.
A Search Unlike Any Other
This search also unveiled over 100 candidate gravitational lenses: massive foreground galaxies whose gravity warps the fabric of space-time. The warping bends the light from more distant galaxies behind them into arcs or rings. Gravitational lenses allow scientists to delve deeper into dark matter research and can magnify distant galaxies that might otherwise go undetected.
Researchers also identified new jellyfish galaxies trailing long streams of gas as they move through dense galaxy clusters, irregular galaxies containing numerous massive star-forming regions, and exceedingly rare ring galaxies formed through violent collisions. Furthermore, the AI detected many edge-on planet-forming disks, which appear in a variety of colors and show distinctive butterfly-like shapes. Numerous additional objects could not be assigned to any single category.
Bracing for a Data-Rich Future
Hubble is just one of several data-rich observatories that are operating now or will be in the near future. The European Space Agency’s Euclid mission, launched in 2023, is surveying billions of galaxies. The Vera C. Rubin Observatory is set to carry out a decade-long survey expected to generate around 50 petabytes of imagery, while NASA’s Nancy Grace Roman Space Telescope is scheduled to launch by May 2027.
With this rapid influx of new telescope data, systems like AnomalyMatch will become essential. The approach can be retrained on the data produced by upcoming telescopes, so the models evolve as new datasets arrive. It requires less human oversight while still pinpointing the most promising targets for further examination.
The authors noted that AnomalyMatch excels in identifying key image features, such as tidal tails or arcs of light, rather than being distracted by random noise.
Practical Implications of This Research
The findings from this research will empower astronomers to utilize artificial intelligence and machine learning strategies to cope with the ever-growing volume of astronomical data. By efficiently identifying rare astronomical objects, researchers can gather larger sample sizes to test theories concerning galaxy evolution, gravitational forces, and dark matter.
This methodology will also allow astronomers to invest more time in analyzing results rather than searching for rare objects. While this study primarily focused on astronomical research, the techniques outlined present a potential framework applicable to other scientific domains grappling with rapidly expanding data.
Fields such as medicine and climate science could similarly benefit from the integration of machine learning and expert analysis to detect rare occurrences within extensive datasets. As telescope technology advances, systems like AnomalyMatch are poised to facilitate the discovery of entirely new types of astronomical entities as future observatories embark on deeper and more detailed surveys of the universe.