The Generalization Spectrum: A Chromatographic Approach to Evaluating Learning Algorithms

This page summarizes "The Generalization Spectrum: A Chromatographic Approach to Evaluating Learning Algorithms", an ICML 2026 paper by Jinghan Zhang, Zerui Cheng, Shiqi Chen, Ge Zhang, Wenhao Huang, Jiashuo Liu, Junxian He, and Tianle Cai.

One-Sentence Summary

The paper introduces the Generalization Spectrum, an evaluation framework that measures not only whether a learning algorithm improves, but how far learning from a specific training example transfers to related and increasingly distant test variants.

Why This Paper Matters

Standard evaluations usually collapse learning into one aggregate score on an i.i.d. test set. That makes it hard to see whether a method is memorizing, transferring by surface similarity, or learning a more portable abstraction.

The Generalization Spectrum exposes this hidden structure by arranging test variants at increasing transfer distances from each training example. Algorithms can then be compared by the radius of generalization they produce, not just by final accuracy.

Common Search Intents

This page is intended to answer questions such as:

How can learning algorithms be evaluated beyond i.i.d. test performance?
What is the Generalization Spectrum?
How can we measure per-sample generalization?
Do RL, SFT, and ICL generalize differently after matched memorization?
How can competitive programming benchmarks measure transfer distance?
What ICML 2026 papers study generalization in learning algorithms?

Technical Contribution

The paper constructs controlled test variants for each training example across five spectrum levels: exact recall, implementation transfer, context transfer, category-matched in-domain problems, and an unpaired baseline.
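
One way to picture the five levels is as an ordered scale, with every evaluation result tagged by the level of the variant it came from. The sketch below is a minimal illustration under stated assumptions, not the paper's code: the `Level` enum names, the `(level, passed)` record format, and the `spectrum` function are all hypothetical names introduced here.

```python
from collections import defaultdict
from enum import IntEnum


class Level(IntEnum):
    """Five spectrum levels, nearest to farthest (names are illustrative)."""
    EXACT_RECALL = 0
    IMPLEMENTATION_TRANSFER = 1
    CONTEXT_TRANSFER = 2
    IN_DOMAIN = 3   # category-matched in-domain problems
    UNPAIRED = 4    # unpaired baseline


def spectrum(results):
    """Return the pass rate per level from (level, passed) records.

    Levels with no records are omitted rather than reported as 0.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for level, ok in results:
        total[level] += 1
        passed[level] += int(ok)
    return {level: passed[level] / total[level] for level in sorted(total)}


# Hypothetical records for illustration only, not data from the paper.
example_results = [
    (Level.EXACT_RECALL, True),
    (Level.EXACT_RECALL, True),
    (Level.IMPLEMENTATION_TRANSFER, True),
    (Level.IMPLEMENTATION_TRANSFER, False),
    (Level.UNPAIRED, False),
]
example_spectrum = spectrum(example_results)
```

Reporting one rate per level, instead of a single pooled score, is what keeps memorization (high exact recall only) distinguishable from broader transfer.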

It instantiates the framework on competitive programming, using a selection-and-synthesis pipeline seeded with recent problems to reduce contamination risk. Under matched memorization, the paper compares canonical learning paradigms, including reinforcement learning (RL), supervised fine-tuning (SFT), and in-context learning (ICL). It finds that RL converts memorization into near-transfer more efficiently than SFT-family baselines, while ICL shows strong but correspondence-dependent transfer.
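
A comparison under matched memorization can be sketched as follows. This is one plausible reading, assumed here for illustration: two methods are compared at checkpoints whose exact-recall rates are nearly equal, and near-transfer is then expressed relative to that recall. The dict keys `recall` and `near`, the tolerance, and the `conversion` ratio are hypothetical and are not taken from the paper.

```python
def match_on_recall(checkpoints_a, checkpoints_b, tolerance=0.05):
    """Pair the checkpoints of two methods whose exact-recall rates are closest.

    Each checkpoint is a dict with hypothetical keys "recall" and "near"
    (exact-recall and near-transfer pass rates). Returns the closest pair,
    or None if no pair is within `tolerance`.
    """
    best = None
    for a in checkpoints_a:
        for b in checkpoints_b:
            gap = abs(a["recall"] - b["recall"])
            if gap <= tolerance and (best is None or gap < best[0]):
                best = (gap, a, b)
    return None if best is None else (best[1], best[2])


def conversion(checkpoint):
    """Near-transfer rate per unit of exact recall (an illustrative ratio)."""
    if checkpoint["recall"] == 0:
        return 0.0
    return checkpoint["near"] / checkpoint["recall"]


# Made-up numbers for two unnamed methods, for illustration only.
method_a = [{"recall": 0.5, "near": 0.2}, {"recall": 0.9, "near": 0.45}]
method_b = [{"recall": 0.6, "near": 0.1}, {"recall": 0.9, "near": 0.18}]
pair = match_on_recall(method_a, method_b)
```

Matching on recall first means that any remaining difference in `conversion` reflects how each method turns memorized examples into transfer, not how much it memorized.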

The same spectrum also diagnoses variants within a family: local improvements do not necessarily expand the generalization radius, and some methods can improve local transfer or optimization while weakening far transfer.
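
The gap between local improvement and generalization radius can be made concrete with a toy definition. The sketch below assumes, purely for illustration, that the radius is the number of consecutive levels, counted from the nearest, whose gain over a reference exceeds a threshold. The paper may define the radius differently, and the function name, threshold, and numbers are all hypothetical.

```python
def generalization_radius(gains, threshold=0.0):
    """Count consecutive levels, starting at the nearest, with gain > threshold.

    `gains` lists the pass-rate improvement at each level, ordered from
    nearest (exact recall) to farthest. Counting stops at the first level
    that does not clear the threshold.
    """
    radius = 0
    for gain in gains:
        if gain <= threshold:
            break
        radius += 1
    return radius


# Made-up gains for two variants of one method family, for illustration only.
baseline_gains = [0.5, 0.25, 0.125, 0.0]
variant_gains = [0.75, 0.5, 0.0, 0.0]

baseline_radius = generalization_radius(baseline_gains)
variant_radius = generalization_radius(variant_gains)
```

In this toy case the variant has larger gains at the two nearest levels but a smaller radius, which is the pattern a single aggregate score would hide.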

Citation

@inproceedings{zhang2026generalization,
  title = {The Generalization Spectrum: A Chromatographic Approach to Evaluating Learning Algorithms},
  author = {Zhang, Jinghan and Cheng, Zerui and Chen, Shiqi and Zhang, Ge and Huang, Wenhao and Liu, Jiashuo and He, Junxian and Cai, Tianle},
  booktitle = {International Conference on Machine Learning},
  year = {2026}
}

Jinghan Zhang

张静涵