The system generates emotions from images. It is primarily intended for game NPCs, so that they can react emotionally to the game environment.

The sentiment analysis model is an LSTM-based dense neural network that is fed Word2Vec embeddings. It was trained on data generated with cohere.ai using the prompt: "I felt <emotion> when I saw <img2txt>".

System flow: img2txt -> cohere.ai generator -> text2emote -> LSTM+DenseNN

Images are passed to a CLIP Interrogator (BLIP + CLIP (ViT-B/32)) to generate text descriptions. The cohere.ai generator then elaborates each description into an emotional response using the prompt: "When I saw <img2txt output>, I felt emotions such as".
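The two prompts and the order of the stages can be sketched in Python as below. This is only an illustrative sketch, not the project's actual code: the function names and the three callables (`img2txt`, `generate`, `classify`) are hypothetical stand-ins for the CLIP Interrogator, the cohere.ai generator, and the LSTM+DenseNN classifier, and feeding the prompt plus the generated continuation to the classifier is an assumption.

```python
from typing import Callable

# Prompt templates quoted from the description above.
TRAINING_TEMPLATE = "I felt {emotion} when I saw {description}"
GENERATION_TEMPLATE = "When I saw {description}, I felt emotions such as"


def build_training_prompt(emotion: str, description: str) -> str:
    """Prompt used to generate labelled training text for the classifier."""
    return TRAINING_TEMPLATE.format(emotion=emotion, description=description.strip())


def build_generation_prompt(description: str) -> str:
    """Prompt used at run time to elaborate an image description."""
    return GENERATION_TEMPLATE.format(description=description.strip())


def react_to_image(
    image_path: str,
    img2txt: Callable[[str], str],   # hypothetical: image path -> text description
    generate: Callable[[str], str],  # hypothetical: prompt -> generated continuation
    classify: Callable[[str], str],  # hypothetical: text -> emotion label
) -> str:
    """Run the stages in the order given by the system flow."""
    description = img2txt(image_path)
    prompt = build_generation_prompt(description)
    elaboration = generate(prompt)
    # Assumption: the classifier sees the prompt together with its continuation.
    return classify(f"{prompt} {elaboration}")
```

Passing the stages in as callables keeps the sketch independent of any particular client library, so each stage can be replaced by a stub when testing the flow.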