Summary: Peer-reviewed output on synthetic data generation, evaluation, and contamination has doubled versus the prior year, with strong contributions from East Asian labs. The surge reflects both concerns about scaling headroom and tightening legal exposure over scraped human data.
Facts on record: 10 observed sources contribute to this signal.