An AI that taught itself what noise looks like found three times more galaxies in JWST data

A team at Tsinghua University built a neural network called ASTERIS that strips structured noise from telescope images without ever seeing a clean reference frame. Applied to JWST deep-field data, it recovered galaxies one full apparent magnitude fainter than previous pipelines could detect, and tripled the number of galaxy candidates at redshifts above 9. The paper landed in Science in April 2026, and the code is on GitHub.

One magnitude sounds modest until you remember the scale is logarithmic. A gain of 1.0 mag means detecting objects 2.5× dimmer. That’s roughly equivalent to doubling the mirror area of the telescope that took the data. ASTERIS achieves it with software alone.
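The arithmetic behind that 2.5× figure is easy to check. Here is a minimal sketch of the standard magnitude relation (5 magnitudes correspond to a factor of 100 in flux); the function name `flux_ratio` is my own choice for illustration and is not taken from the ASTERIS code.

```python
def flux_ratio(delta_mag: float) -> float:
    """Flux ratio between two sources separated by delta_mag magnitudes.

    Standard Pogson relation: 5 magnitudes = a factor of 100 in flux,
    so 1 magnitude = 100 ** (1 / 5) = 10 ** 0.4.
    """
    return 10 ** (0.4 * delta_mag)


if __name__ == "__main__":
    for dm in (0.5, 1.0, 2.5, 5.0):
        print(f"{dm:.1f} mag deeper -> sources {flux_ratio(dm):.2f}x fainter")
```

A gain of 1.0 mag comes out at a factor of about 2.51, which is where the 2.5× shorthand comes from.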

The problem: noise isn’t random

Every astronomical image is a sum of signal and noise. Stack enough exposures and the signal grows faster than the noise: the signal adds in proportion to the number of exposures, while random noise adds only as its square root. That’s why my Seestar runs 30-second subs and combines hundreds of them. But the standard assumption is that noise is uniform and Gaussian, and at professional depths it isn’t. Detector artifacts, scattered light, thermal gradients, and persistence from bright sources create structured, non-stationary noise patterns that survive conventional stacking.

These patterns set a floor. Below it, faint objects drown in correlated noise that doesn’t average away with more integration time. You could double the number of exposures and still not reach the next magnitude if the noise is spatially structured. For JWST’s Near Infrared Camera (NIRCam), that floor has been the practical limit on how deep the cosmic dawn surveys can reach. And JWST time is allocated in hours-long blocks that cost months of proposal review. If software can push the floor lower, it saves telescope time that no committee can create.
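A toy simulation makes the floor visible. The sketch below is my own illustration, not anything from the ASTERIS paper: it stacks a blank field whose pixels carry fresh Gaussian noise in every exposure plus, optionally, a fixed pattern that is identical in every exposure. That fixed pattern is a deliberately crude stand-in for structured noise (real artifacts move and change between dithered exposures), and the function name and all the numbers are assumptions.

```python
import math
import random


def stack_residual_rms(n_exposures, n_pixels=1000, sigma_random=1.0,
                       sigma_pattern=0.0, seed=0):
    """RMS error of a mean stack of a blank (zero-signal) field.

    Every pixel gets independent Gaussian noise in each exposure, plus an
    optional fixed pattern that repeats unchanged in every exposure.
    """
    rng = random.Random(seed)
    # One pattern value per pixel, shared by all exposures.
    pattern = [rng.gauss(0.0, sigma_pattern) for _ in range(n_pixels)]
    total_sq = 0.0
    for p in range(n_pixels):
        mean = sum(pattern[p] + rng.gauss(0.0, sigma_random)
                   for _ in range(n_exposures)) / n_exposures
        total_sq += mean * mean  # the true signal is zero, so this is error
    return math.sqrt(total_sq / n_pixels)


if __name__ == "__main__":
    for n in (1, 4, 16, 64, 256):
        white = stack_residual_rms(n)
        floored = stack_residual_rms(n, sigma_pattern=0.2)
        print(f"N={n:4d}  random only: {white:.3f}  "
              f"with fixed pattern: {floored:.3f}")
```

With random noise alone, the residual should fall roughly as 1/√N. With the pattern switched on, it should level off near the pattern amplitude (0.2 here) however many exposures go into the stack.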

How ASTERIS works

ASTERIS (Astronomical Self-supervised Transformer-based Denoising) reframes the multi-exposure dataset as a three-dimensional spatiotemporal volume: two spatial dimensions plus time, where time is the sequence of individual exposures. A hybrid architecture combining 3D attention blocks with convolution layers processes this volume end to end.

The self-supervised part is what makes the system generalizable. Traditional supervised denoising needs paired data: a noisy image and a corresponding clean ground truth. Those pairs don’t exist for real astronomical observations. You can’t photograph the same patch of sky with a noise-free telescope. ASTERIS instead masks portions of the input volume and trains the network to predict the masked pixels from their spatiotemporal neighbours. The noise model is never specified. The network learns what noise looks like in a given instrument by seeing how pixel values vary across exposures, and it learns what signal looks like by seeing spatial coherence that persists across time.
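Why can a network learn anything useful from noisy pixels alone? The toy below is a sketch of the general masked-prediction idea, not the ASTERIS network: two hand-written predictors stand in for a trained model, the noise is simple independent Gaussian noise (the easy case, unlike the structured noise the paper targets), and every name and number is an assumption made for illustration. It shows that the loss measured against held-out noisy pixels ranks predictors the same way as the error against the clean image, which is the one thing a real observation never provides.

```python
import math
import random

SIGMA = 0.5  # assumed per-pixel noise level for this toy example


def make_volume(n_exp=16, size=24, seed=1):
    """Return (clean, volume): a smooth blob and n_exp noisy exposures of it."""
    rng = random.Random(seed)
    c = (size - 1) / 2
    clean = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / 30.0)
              for x in range(size)] for y in range(size)]
    volume = [[[clean[y][x] + rng.gauss(0.0, SIGMA) for x in range(size)]
               for y in range(size)] for _ in range(n_exp)]
    return clean, volume


def temporal_predictor(volume, t, y, x):
    """Predict a masked pixel from the same position in the other exposures."""
    others = [volume[s][y][x] for s in range(len(volume)) if s != t]
    return sum(others) / len(others)


def spatial_predictor(volume, t, y, x):
    """Predict a masked pixel from its four neighbours in the same exposure."""
    f = volume[t]
    return (f[y - 1][x] + f[y + 1][x] + f[y][x - 1] + f[y][x + 1]) / 4.0


def evaluate(predictor, clean, volume):
    """Return (masked_loss, true_error) as mean squared errors.

    masked_loss compares each prediction with the held-out NOISY pixel.
    true_error compares it with the clean image, which is only known here
    because the data are simulated.
    """
    masked_sum = truth_sum = 0.0
    count = 0
    size = len(clean)
    for t in range(len(volume)):
        for y in range(1, size - 1):      # skip the border so that all
            for x in range(1, size - 1):  # four neighbours exist
                pred = predictor(volume, t, y, x)
                masked_sum += (pred - volume[t][y][x]) ** 2
                truth_sum += (pred - clean[y][x]) ** 2
                count += 1
    return masked_sum / count, truth_sum / count


if __name__ == "__main__":
    clean, volume = make_volume()
    for name, fn in (("temporal", temporal_predictor),
                     ("spatial", spatial_predictor)):
        masked, truth = evaluate(fn, clean, volume)
        print(f"{name:8s} masked loss {masked:.3f}  true error {truth:.3f}  "
              f"difference {masked - truth:.3f}")
```

Because neither predictor ever looks at the pixel it is predicting, the masked loss should exceed the true error by roughly the noise variance (`SIGMA ** 2`) in both cases. Minimising the masked loss is then equivalent to minimising the error against the unseen clean image, at least under this independent-noise assumption.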

Because the training scheme makes no assumptions about the noise distribution, the same architecture works on data from different instruments. The paper validates on both JWST/NIRCam and the Subaru telescope’s Hyper Suprime-Cam, two instruments with very different noise characteristics, without retraining from scratch.

What it found

The benchmark numbers are clear. On synthetic injection tests (fake sources inserted into real images at known positions and brightnesses), ASTERIS pushes the 90%-completeness detection limit 1.0 magnitude deeper while holding purity at 90%. Point spread function shape and photometric accuracy are preserved. The denoiser isn’t hallucinating structure or biasing flux measurements.
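For readers who haven’t met the two metrics, here is a minimal sketch of how completeness and purity can be computed from an injection test. It is not the paper’s evaluation code. The matching radius, the greedy matching rule and the function names are all assumptions, and it treats every detection without an injected counterpart as spurious, which only holds for a field with no real sources in it.

```python
import math


def match_sources(injected, detected, radius=1.5):
    """Greedy one-to-one match of detections to injected positions.

    injected, detected: lists of (x, y) pixel positions.
    Returns (matched injected indices, matched detected indices) as sets.
    """
    matched_inj, matched_det = set(), set()
    for i, (xi, yi) in enumerate(injected):
        best, best_d = None, radius
        for j, (xd, yd) in enumerate(detected):
            if j in matched_det:
                continue  # each detection may claim only one injected source
            d = math.hypot(xd - xi, yd - yi)
            if d <= best_d:
                best, best_d = j, d
        if best is not None:
            matched_inj.add(i)
            matched_det.add(best)
    return matched_inj, matched_det


def completeness_and_purity(injected, detected, radius=1.5):
    """Completeness = recovered / injected; purity = matched / detected."""
    matched_inj, matched_det = match_sources(injected, detected, radius)
    completeness = len(matched_inj) / len(injected) if injected else float("nan")
    purity = len(matched_det) / len(detected) if detected else float("nan")
    return completeness, purity


if __name__ == "__main__":
    injected = [(10, 10), (20, 20), (30, 30), (40, 40), (50, 50)]
    detected = [(10.3, 9.8), (20.1, 20.4), (29.5, 30.2), (40.0, 41.0), (70, 70)]
    c, p = completeness_and_purity(injected, detected)
    print(f"completeness {c:.2f}, purity {p:.2f}")
```

In this made-up example four of the five injected sources are recovered and one of the five detections is spurious, so both metrics come out at 0.80. Repeating the test in bins of injected brightness gives the magnitude at which completeness drops to 90%.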

Applied to JWST’s Cosmic Dawn deep fields, the results are more dramatic. ASTERIS identified over 160 galaxy candidates at redshift z ≳ 9, roughly three times the count from previous reduction pipelines working the same data. These are objects whose light has been travelling, and stretching with cosmic expansion, for over 13 billion years: galaxies that existed when the universe was roughly 550 million years old or younger. Their rest-frame ultraviolet luminosities are about 1.0 magnitude fainter than what earlier methods could pull from the noise.

That number matters for a specific reason. The faint end of the ultraviolet luminosity function at z > 9 constrains how many small, young galaxies were forming stars in the first few hundred million years after the Big Bang. Those galaxies are the ones thought to have produced enough ionising photons to end the cosmic dark ages and reionise the intergalactic medium. A threefold increase in known candidates at that epoch changes the statistics on which reionisation models survive.

The validation images also revealed low-surface-brightness galaxy structures and gravitationally lensed arcs that weren’t detectable before. These are faint extended features that sit right in the regime where structured noise dominates.

Why self-supervised matters

The self-supervised training scheme is what lets ASTERIS generalize across instruments.

Supervised denoising methods for astronomy do exist. They work well on the specific instrument they were trained on, and they break when you hand them data from a different camera, a different dither pattern, or a different sky background. Each new dataset needs new training pairs, which in practice means expensive simulations that may not capture the real instrument’s noise floor.

ASTERIS sidesteps this by learning the noise structure directly from the observation it’s about to denoise. Give it a new stack of exposures from a telescope it has never seen, and it adapts. The Tsinghua team showed this explicitly by running the same trained network on Subaru/HSC data without fine-tuning. HSC is a ground-based instrument that observes through the atmosphere, with a completely different detector and a different pixel scale. The denoiser still improved detection limits.

For the field, this is a practical gain. JWST and Rubin Observatory will produce data at a rate that makes per-instrument, per-survey retraining expensive. A self-supervised approach that bootstraps from the data itself fits the workflow far better.

What this means for the rest of us

I run a Seestar S50 from a Bortle 7 balcony in Nicosia. ASTERIS isn’t coming to my laptop any time soon. The method processes multi-exposure dithered survey data, not single-target live stacks. But the principle underneath it is the same one that makes live stacking work on smart telescopes: combining temporal information to separate signal from noise.

The gap between professional and amateur pipelines has been closing for years. Plate solving was once an observatory-only technique; now it runs on a £400 smart scope. Grading and rejecting bad sub-frames, routine in amateur astrophotography software like Siril and PixInsight, is a simpler version of the same spatiotemporal filtering ASTERIS performs. The self-supervised trick, learning noise from the data rather than from a model, is the kind of idea that could migrate downward once compute costs drop and someone packages it into a plugin.
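To make the sub-frame grading idea concrete, here is a generic sketch of rejecting outlier frames before averaging. It is not the algorithm used by Siril or PixInsight, both of which offer their own grading and rejection options. The quality score (star FWHM in my example), the threshold `kappa` and the function names are all assumptions.

```python
import statistics


def keep_good_frames(scores, kappa=3.0):
    """Return indices of frames whose score is not a high-side outlier.

    scores: one quality metric per sub-frame, where larger is worse
    (for example star FWHM in pixels, or background level).
    The spread is estimated from the median absolute deviation (MAD),
    scaled by 1.4826 to match a Gaussian standard deviation.
    Caveat: if more than half the scores are identical the MAD is zero
    and the limit collapses to the median; a real tool needs a fallback.
    """
    med = statistics.median(scores)
    mad = statistics.median(abs(s - med) for s in scores)
    limit = med + kappa * 1.4826 * mad
    return [i for i, s in enumerate(scores) if s <= limit]


def mean_stack(frames, keep):
    """Average the kept frames pixel by pixel (frames are 2D lists)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(frames[i][y][x] for i in keep) / len(keep)
             for x in range(width)] for y in range(height)]


if __name__ == "__main__":
    # Hypothetical FWHM per sub; frame 4 caught a gust of wind.
    fwhm = [2.1, 2.3, 2.0, 2.2, 6.5, 2.4, 2.1, 2.2]
    keep = keep_good_frames(fwhm)
    print("frames kept:", keep)
```

This is a hard, hand-tuned threshold applied to whole frames. The appeal of the ASTERIS approach is that the equivalent decisions are learned per pixel from the data instead.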

The more immediate impact is scientific. Every deep-field image ever taken by JWST is now, in principle, deeper than anyone thought. The raw exposures sit on the MAST archive, publicly accessible. ASTERIS can be run against them retroactively. The team released the code, the weights are trainable from the target data itself, and the pipeline doesn’t need telescope-specific calibration files beyond what’s already bundled with the exposures. You don’t usually get to upgrade your telescope after the observations are done.

The numbers

Detection gain: +1.0 apparent magnitude at 90% completeness and purity
High-z yield: over 160 galaxy candidates at z ≳ 9 (about 3× previous methods on the same data)
Architecture: 3D attention-convolution hybrid, self-supervised masked prediction
Validated on: JWST/NIRCam, Subaru/HSC
Paper: Guo et al. 2026, Science (DOI: 10.1126/science.ady9404)
Code: github.com/freemercury/ASTERIS_THU
Preprint: arXiv:2602.17205
