“Our benchmarks are primarily designed to measure convergent thinking rather than divergent thinking. Current AI systems excel at producing answers that align with existing knowledge consensus, but struggle with the kind of contrarian, paradigm-challenging insights that drive scientific revolutions.”