Evaluate prediction files against MMOU benchmark data
Interact with a multimodal chatbot using text and images