STEM Dataset
Bilingual Multimodal STEM Dataset: a curated collection of 500 Math and Physics questions in Malay and English, some accompanied by relevant images.
problem
AI models often struggle with bilingual and multimodal STEM tasks because few high-quality, domain-specific datasets cover Malay alongside English.
solution
We curated a dataset of 500 Math and Physics questions in Malay and English and paired it with a public leaderboard for benchmarking AI model performance.
result
AI teams now have a shared resource for fine-tuning and evaluating models on bilingual, multimodal STEM questions, and a common benchmark for comparing results.