
PolyGlot Gemini lens for PDFs

Created by team PolyGlot Gemini on December 27, 2023

Gemini AI, GPT-3.5, TruLens

Problem Statement:
1) Over 70% of PDFs contain critical data in images such as charts and tables, especially research articles.
2) At the time of writing, Gemini is available in English only.

Can we build a solution for:
1) Answering natural-language questions based on images in PDFs?
2) Making Gemini accessible to non-English speakers?

By leveraging Spire, OpenAI GPT-3.5, Gemini Pro Vision, and TruLens, I have built an application that solves both problems:
- Spire for image extraction
- OpenAI for translation to English (optional)
- Gemini Pro Vision for the answer
- TruLens for monitoring
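The four stages above can be sketched as a small pipeline. This is only an illustration of the flow described here, not the team's actual code: the class `PdfQaPipeline`, its field names, and the stub functions are all hypothetical, and the real Spire, OpenAI, Gemini Pro Vision, and TruLens calls are deliberately left out and replaced by plain callables so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class PdfQaPipeline:
    """Hypothetical orchestration of the four stages; every name is illustrative."""

    # Stage 1: pull images out of a PDF (Spire in the original app).
    extract_images: Callable[[str], List[bytes]]
    # Stage 3: answer a question about one image (Gemini Pro Vision in the original app).
    answer_from_image: Callable[[str, bytes], str]
    # Stage 2 (optional): translate the question to English (OpenAI GPT-3.5 in the original app).
    translate_to_english: Optional[Callable[[str], str]] = None
    # Stage 4: a plain in-memory log standing in for TruLens monitoring.
    log: List[Dict[str, object]] = field(default_factory=list)

    def ask(self, pdf_path: str, question: str) -> List[str]:
        # Translate only when a translator was supplied.
        if self.translate_to_english is not None:
            english_question = self.translate_to_english(question)
        else:
            english_question = question

        answers: List[str] = []
        for index, image in enumerate(self.extract_images(pdf_path)):
            reply = self.answer_from_image(english_question, image)
            # Record one entry per image so each answer can be reviewed later.
            self.log.append(
                {"image_index": index, "question": english_question, "answer": reply}
            )
            answers.append(reply)
        return answers


# Stub stages (assumptions) so the sketch runs without any external service.
def fake_extract(pdf_path: str) -> List[bytes]:
    return [b"chart", b"table"]


def fake_translate(question: str) -> str:
    return "EN:" + question


def fake_answer(question: str, image: bytes) -> str:
    return f"{question} -> {len(image)} bytes"


pipeline = PdfQaPipeline(
    extract_images=fake_extract,
    answer_from_image=fake_answer,
    translate_to_english=fake_translate,
)
answers = pipeline.ask("paper.pdf", "Que montre le graphique ?")
print(answers)
```

Passing the stages in as callables keeps the translation step optional, as in the description, and lets each stub be swapped for a real client once its API has been checked against the vendor's documentation.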

Category tags:

Productivity, Summarization

Github Presentation Demo


Raghavan Muthuregunathan
Senior Engineering Manager


AI app: PolyGlot Gemini lens for PDFs for Gemini AI Hackathon