Visual Reasoning Benchmark: Evaluating Multimodal LLMs on Classroom-Authentic Visual Problems from Primary Education
21 days ago via arXiv
arxiv.org | AI Score: 44% | paper