The Pause That Refreshes

11 March 2024

A pair of disc galaxies in the late stages of a merger. Credit: NASA

The Universe is filled with supermassive black holes. Almost every galaxy in the cosmos has one, and they are the black holes astronomers have studied most thoroughly. But one thing we still don’t understand is how they grew so massive so quickly. To answer that, astronomers have to identify lots of black holes in the early Universe, and since they are typically found in merging galaxies, that means astronomers have to identify those galaxies accurately. By hand. But thanks to the power of machine learning, that’s changing.[1]

With the power of current and future sky surveys, the challenge of astronomy is less about capturing the right data and more about picking the right data out of the vast trove we gather. It takes a tremendous amount of skill to distinguish a true merging galaxy from an irregular galaxy, or from two independent galaxies that just happen to be seen in the same patch of sky. People can be trained to do it well, but the need for skilled identifiers far surpasses the number of skilled people. One way to overcome this is to allow volunteers to fill the gap. In general, their identifications won’t be as accurate as those of the professionals, but a bit of statistics will allow astronomers to glean useful information.
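To give a feel for what "a bit of statistics" can mean here, the sketch below combines several volunteer labels for one galaxy into a single vote fraction. This is only an illustration, not the method used by any particular survey: the function names, the label strings, and the 0.5 threshold are all assumptions made for this example.

```python
from collections import Counter


def merger_vote_fraction(votes):
    """Return the fraction of volunteers who labelled a galaxy as a merger.

    `votes` is a list of strings, each either "merger" or "non-merger".
    These label names are hypothetical, chosen only for this example.
    """
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    return counts["merger"] / len(votes)


def classify(votes, threshold=0.5):
    """Call a galaxy a merger if enough volunteers agree.

    The 0.5 threshold is an arbitrary assumption, not a published value.
    """
    if merger_vote_fraction(votes) >= threshold:
        return "merger"
    return "non-merger"


# Toy data: five made-up volunteer votes for one galaxy.
votes = ["merger", "merger", "non-merger", "merger", "non-merger"]
print(merger_vote_fraction(votes), classify(votes))
```

Any single volunteer may be wrong, but the fraction of votes gives a rough measure of confidence that can be weighed across many galaxies.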

True positives vs false positives in machine learning identification. Credit: Avirett-Mackenzie, et al.

This new study takes a different approach. Rather than having experts train volunteers, the team used experts to train machine learning algorithms. That’s easier said than done. Even the most skilled expert will occasionally make mistakes, or have certain biases, and any software trained on that expert will have the same biases. So the team partnered with the Big Data Applications for Black Hole Evolution Studies (BiD4BEST), which is an EU project that provides a training network for black hole evolution data. Together they used skilled experts to identify galaxy mergers in both simulated data and data from the Sloan Digital Sky Survey (SDSS). By comparing the two, the team could remove biases from the machine learning data. The result was pretty successful. When the algorithm’s classifications were compared against the simulated mergers, where the true answer is known, the team found it had an accuracy of well over 80%, comparable to that of the most skilled experts.
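For readers curious how a figure like "accuracy" is computed, here is a minimal sketch that scores a classifier against known answers, as one can do with simulated mergers. It is not the study's code: the function names and the ten-galaxy toy data are invented for illustration only.

```python
def confusion_counts(true_labels, predicted_labels):
    """Count true/false positives and negatives.

    Both inputs are lists of booleans, where True means "merger".
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess and truth:
            tp += 1  # correctly flagged merger
        elif guess and not truth:
            fp += 1  # non-merger wrongly flagged as merger
        elif truth:
            fn += 1  # real merger that was missed
        else:
            tn += 1  # correctly rejected non-merger
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all galaxies that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Fraction of flagged mergers that are real mergers."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy data: ten made-up galaxies, not results from the paper.
truth = [True, True, False, False, True, False, True, False, False, True]
guess = [True, False, False, False, True, True, True, False, False, True]

tp, fp, tn, fn = confusion_counts(truth, guess)
print(accuracy(tp, fp, tn, fn), precision(tp, fp))
```

Accuracy alone can mislead when mergers are rare, which is one reason plots of true positives against false positives are also useful.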

The team then used the software to identify more than 8,000 active black holes and found an interesting connection between the growth of black holes and their galaxies. It isn’t galactic mergers themselves that trigger the growth of supermassive black holes, but large quantities of nearby cold gas. The team found that mergers only drive rapid growth when they involve star-forming galaxies rich in gas and dust. Thus, the same conditions that lead to star formation also feed the growth of supermassive black holes. This is part of the reason why galaxies and their black holes seem to grow in parallel.

As we continue to capture astronomical data at an almost exponential rate, software will be a necessary complement to skilled observers. As this study shows, the two can be used together effectively.


  1. Avirett-Mackenzie, M. S., et al. “A post-merger enhancement only in star-forming Type 2 Seyfert galaxies: the deep learning view.” Monthly Notices of the Royal Astronomical Society 528.4 (2024): 6915-6933.