Multimodal Moderator

Created by team West Coast Avengers on January 15, 2024

Current content moderators can process either text or images, but not both, so separate applications are needed to moderate multiple content types. They also produce binary output that doesn’t explain why certain content is inappropriate. Our solution is the Multimodal Moderator: a single application that checks whether text or an image is appropriate. It can “understand” the message and explain why certain content is not appropriate. We built it with Zapier, using Discord as the front end. Zapier is a no-code tool that connects to an AI model and does not require a constantly running program to connect with Discord. GPT-4 Vision, hosted on the Clarifai platform, is the engine that checks whether the text or image is appropriate, and it sends back a narrative explaining its reasoning.
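The flow described above (one request covering text, an image, or both, answered with a verdict plus a narrative reason) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's Zap configuration or prompt: the prompt wording, the `VERDICT:`/`REASON:` reply format, and the names `build_prompt`, `parse_reply`, `moderate`, and `fake_model` are all hypothetical, and the hosted model is replaced by a local stand-in function. A real vision model would receive the image as an image input rather than as a URL inside the prompt text.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModerationResult:
    appropriate: bool
    reason: str  # narrative explanation instead of a bare yes/no


def build_prompt(text: Optional[str], image_url: Optional[str]) -> str:
    """Assemble one instruction covering whichever content types are present.

    The wording and reply format are assumptions made for this sketch.
    """
    if not text and not image_url:
        raise ValueError("need text, an image URL, or both")
    parts = [
        "You are a content moderator. Reply with 'VERDICT: OK' or "
        "'VERDICT: FLAG' on the first line, then 'REASON: <explanation>'."
    ]
    if text:
        parts.append(f"Message text: {text}")
    if image_url:
        parts.append(f"Attached image: {image_url}")
    return "\n".join(parts)


def parse_reply(reply: str) -> ModerationResult:
    """Split the model reply into a verdict and its explanation.

    Anything other than an explicit 'VERDICT: OK' is treated as flagged,
    so an unparseable reply fails safe.
    """
    verdict_line, _, rest = reply.strip().partition("\n")
    appropriate = verdict_line.strip().upper() == "VERDICT: OK"
    reason = rest.strip()
    if reason.upper().startswith("REASON:"):
        reason = reason[len("REASON:"):].strip()
    return ModerationResult(appropriate, reason)


def moderate(
    text: Optional[str],
    image_url: Optional[str],
    ask_model: Callable[[str], str],
) -> ModerationResult:
    """Run one moderation check; ask_model is whatever reaches the model."""
    return parse_reply(ask_model(build_prompt(text, image_url)))


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for the hosted model, used only for this demo."""
    if "spam" in prompt.lower():
        return "VERDICT: FLAG\nREASON: The message looks like unsolicited advertising."
    return "VERDICT: OK\nREASON: Nothing objectionable found."


if __name__ == "__main__":
    result = moderate("Buy cheap spam pills now", None, fake_model)
    print(result.appropriate, "-", result.reason)
```

Passing the model call in as a function keeps the verdict-and-reason handling separate from whichever service actually hosts the model, so the same parsing could sit behind a Zapier step or a standalone bot.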

"Seems like this project was never built during the hackathon, as per the GitHub commit, and the code provided does not match the bot shown. It needs a lot of work to reach the MVP stage. All the best!"


Muhammad Inaamullah

Machine Learning Engineer

"The code provided in the repository is not related to the bot implementation. Cannot find prompts and models which have been used to check messages. When tried, the bot was slow to respond and could not recognize the image."


Nikita Ladyzhnikov

Lead Frontend Engineer @ Clarifai