The activated sludge (AS) process is used to treat sewage and industrial wastewater. In this process, pollutants are removed by a diverse group of microorganisms. Because AS is a unique, controllable engineered ecosystem, it is attractive to ecologists studying microbial community assembly. A recent study reports a new machine learning approach that can distinguish metagenome-assembled genomes (MAGs) of AS bacteria from those of bacteria in other environments. Using this method, the researchers identified functional features that are likely vital for AS bacteria to adapt to treatment bioreactors. They also found that few microorganisms are shared between different wastewater treatment plants, although some AS MAGs may have been missed because of short sequencing read length or low sequencing depth. The results likely represent the largest reported AS genome collection in the world and suggest that this machine learning approach could help scientists better understand how microbial communities are assembled in nature.