PolyGlot Gemini lens for PDFs

Created by team PolyGlot Gemini on December 27, 2023

Problem Statement:
1) Over 70% of PDFs contain critical data in images such as charts and tables, especially research articles.
2) As of today, Gemini is released for English only.

Can we build a solution for:
1) Answering natural language questions based on images in PDFs?
2) Making Gemini accessible to non-English speakers?

By leveraging Spire, OpenAI GPT-3.5, Gemini Pro Vision, and TruLens, I have built an application that solves both problems:
- Spire for image extraction
- OpenAI for translation to English (optional)
- Gemini Pro Vision for the answer
- TruLens for monitoring
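The four stages listed above can be sketched as a small pipeline. This is only an illustrative outline, not the project's actual code: the class name `PdfQaPipeline` and the four callables are hypothetical placeholders. A real version would put Spire, OpenAI GPT-3.5, Gemini Pro Vision, and TruLens behind them; no vendor API calls are shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PdfQaPipeline:
    """Hypothetical wiring of the four stages; every stage is injected."""

    # Stage 1 (Spire in the project): PDF path -> list of image bytes.
    extract_images: Callable[[str], List[bytes]]
    # Stage 3 (Gemini Pro Vision in the project): (image, question) -> answer.
    answer_from_image: Callable[[bytes, str], str]
    # Stage 2 (OpenAI GPT-3.5 in the project): optional translation to English.
    translate_to_english: Optional[Callable[[str], str]] = None
    # Stage 4 (TruLens in the project): optional sink for one record per answer.
    monitor: Optional[Callable[[dict], None]] = None

    def ask(self, pdf_path: str, question: str) -> List[str]:
        """Return one answer per image extracted from the PDF."""
        # Translation is optional, so English questions pass through unchanged.
        if self.translate_to_english is not None:
            english_question = self.translate_to_english(question)
        else:
            english_question = question

        answers: List[str] = []
        for index, image in enumerate(self.extract_images(pdf_path)):
            answer = self.answer_from_image(image, english_question)
            if self.monitor is not None:
                # Hand the monitoring stage everything it needs to evaluate.
                self.monitor(
                    {
                        "image_index": index,
                        "question": english_question,
                        "answer": answer,
                    }
                )
            answers.append(answer)
        return answers
```

Keeping each stage as a plain callable means the translation and monitoring steps can be switched off independently, and any stage can be replaced with a stub for testing without network access or API keys.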

"Great use of Gemini to make PDFs and images more accessible + use of TruLens to make sure it's safe. Areas of improvement:
- A narrower use case can often be more impactful than a general one, and bring a lot of value! Focus on selling to your first customers, not the whole market.
- It would have been nice to see evaluations that validated the core capabilities of the app in addition to the harmlessness evaluations you completed."

Josh Reini

DevRel

"excellent work. amazing and very useful idea"

Walaa Nasr Elghitany

Data scientist and doctor