2025

Resumix

AI-assisted research corpus explorer

The Challenge

Research papers are long, dense, and hard to cross-reference quickly. Going from a raw PDF to a useful answer still demands too much linear reading, when the real need is often to explore a corpus and surface the relevant passages fast.

The Approach

The goal was to build a full chain, from document import to conversational exploration:

research PDF upload and paper import through OpenAlex
text extraction with PyMuPDF, followed by chunking and sentence-transformer embeddings
FAISS vector indexing for semantic search across the corpus
a FastAPI backend with Redis to orchestrate ingestion, indexing, and queries
a SvelteKit interface for browsing documents and chatting with a local LLM through Ollama/LangChain
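
The chunking and semantic-search steps above can be illustrated with a minimal, dependency-free sketch. It is not the project's code: the function names, the window sizes, and the toy bag-of-words `embed` are assumptions made for illustration. In the stack described here, a sentence-transformer model would produce the vectors and a FAISS index would replace the brute-force loop in `search`.

```python
import math
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are hypothetical)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the current window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a sentence-transformer embedding."""
    return Counter(w.strip(".,;:!?()").lower() for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Brute-force top-k search; a vector index such as FAISS would replace this."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]
```

Calling `search("text extraction from pdf", chunks, k=3)` returns the three best-scoring chunks with their similarity scores, which is the kind of passage list a RAG pipeline would hand to the LLM as context. The overlap between windows is a common way to avoid cutting a relevant sentence in half at a chunk boundary.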

The Result

A working platform for exploring large research corpora that is pleasant to use. The key pieces are reliable: PDF import, OpenAlex retrieval, the RAG pipeline, and LLM conversation. The tool makes it practical to query documents that run into the hundreds of pages in natural language.

Tech Stack

Python, FastAPI, SvelteKit, TypeScript, Tailwind, FAISS, Ollama, Redis