RAGChat: A Web App for Intelligent PDF Information Retrieval and Question Answering
In today’s digital era, vast amounts of information are stored in unstructured document formats such as PDFs. Extracting relevant insights from these files often requires manually reading through large volumes of text, which is both time-consuming and inefficient. This challenge has created a growing need for intelligent systems that can automatically understand, retrieve, and summarize information from documents — enabling users to access knowledge effortlessly.
RAGChat is a web-based intelligent information retrieval and question-answering system designed to address this need. The platform allows users to upload one or more PDF files and interact with them through natural language queries. Instead of searching manually, users can simply ask questions, and RAGChat intelligently retrieves the most relevant document sections to generate accurate and context-aware answers.
The system leverages the power of Retrieval-Augmented Generation (RAG), a hybrid approach combining semantic search and large language models (LLMs). Uploaded documents are processed by splitting the text into chunks, generating vector embeddings, and storing them in a vector database. When a user submits a query, the system retrieves semantically similar chunks and passes them to an LLM (via Together AI) to produce precise, human-like responses.
By integrating natural language processing (NLP), vector embeddings, and web-based interactivity, RAGChat offers a practical and efficient solution for students, researchers, and professionals who frequently work with large document collections, transforming the way we access and comprehend information.
Objectives:
DA1 – Proof of Concept Development:
The primary objective of DA1 is to design and implement a proof-of-concept (PoC) version of the product within a Docker container. This initial prototype will feature a command-line interface (CLI) that demonstrates the system’s core functionality — including PDF text extraction, vector embedding generation, and intelligent query response using the Retrieval-Augmented Generation (RAG) pipeline. This stage establishes the foundational architecture and validates the feasibility of the proposed solution.
DA2 – Full Web Application Implementation:
The objective of DA2 is to extend the proof-of-concept into a fully functional web-based application. This phase involves developing a complete system using FastAPI for backend services, Streamlit for the interactive user interface, and Docker for containerization and deployment. Integration with the Together AI API will enable the system to generate accurate, context-aware responses to user queries.
DA3 – Final Presentation and Documentation:
The objective of DA3 is to present the finalized product during the Docker Showdown event, showcasing its capabilities, design, and performance. This phase also includes the preparation and submission of comprehensive project documentation, covering system architecture, implementation details, usage guidelines, and future enhancement possibilities.
Name of the containers involved and the download links:
Both the backend and frontend containers are built from the python:3.10-slim base image, available from Docker Hub (https://hub.docker.com/_/python).
Name of the other software involved along with the purpose:
VS Code - Environment for Software Development
Docker Desktop - To manage the containers created
All the other packages:
FastAPI
Uvicorn
ChromaDB
Pydantic
PyPDF2
Together AI API
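For reference, the combined dependency set corresponds roughly to the following requirements file (illustrative and unpinned; in the actual build the backend and frontend containers each use their own smaller requirements file, as described later):

```text
fastapi
uvicorn
chromadb
pydantic
PyPDF2
numpy
requests
python-multipart
streamlit
```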
Overall architecture:
Procedure:
Part 1: Prototype Development – Creating a Working Model in a Single Container
This phase focused on developing an initial prototype of the product — a simplified version of the Retrieval-Augmented Generation (RAG) system — within a single Docker container. The main objective was to ensure that the fundamental logic for document retrieval and language model integration worked correctly before moving on to multi-container orchestration.
Step 1: Acquiring API Credentials
The first step involved obtaining the required API keys from Together AI, which provides access to powerful large language models (LLMs). These keys are essential for authentication and for making secure API calls to the model endpoint.
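For illustration, a minimal authenticated call to Together AI's OpenAI-compatible chat-completions endpoint looks roughly like the sketch below; the model name is a placeholder, and the key is assumed to be supplied through a TOGETHER_API_KEY environment variable rather than hard-coded:

```python
import os
import requests

# Minimal sketch of a Together AI chat-completion call; the model name is
# a placeholder, and the key is read from an environment variable.
TOGETHER_API_URL = "https://api.together.xyz/v1/chat/completions"
API_KEY = os.environ["TOGETHER_API_KEY"]

def ask_llm(prompt: str) -> str:
    response = requests.post(
        TOGETHER_API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "meta-llama/Llama-3-8b-chat-hf",  # placeholder model name
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```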
Step 2: Setting Up the Docker Environment
Next, a Docker container was created to encapsulate the development environment.
This required studying Docker concepts, including images, containers, and networking, to understand how applications can be packaged and deployed in isolated environments.
A base Python image was selected, and a Dockerfile was written to install the necessary dependencies such as FastAPI, PyPDF2, NumPy, and Requests.
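The Dockerfile followed the usual pattern for a Python service, roughly as sketched below (file and entry-point names are illustrative, not the exact ones used):

```dockerfile
# Sketch of the prototype Dockerfile; file and entry-point names are illustrative.
FROM python:3.10-slim

# Working directory inside the container
WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and run the CLI entry point of the proof of concept
COPY . .
CMD ["python", "main.py"]
```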
Step 3: Developing the Core Logic
In this step, the core logic of the prototype was implemented. The process involved:
Writing a Python script that accepts a PDF file as input.
Extracting the textual content from the uploaded document using PyPDF2.
Generating vector embeddings for the extracted text using a local embedding model (with ChromaDB serving as the vector store in earlier versions).
Passing user queries along with the retrieved document context to the Together AI LLM for response generation.
This single-container prototype successfully demonstrated the workflow of uploading a document, retrieving relevant text sections, and generating meaningful responses from the LLM.
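In code, this flow looks roughly like the following sketch; the function names and chunk size are illustrative, and embed() stands in for whichever embedding model is used:

```python
import numpy as np
from PyPDF2 import PdfReader

def extract_text(pdf_path: str) -> str:
    """Pull the raw text out of every page of the PDF."""
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def chunk_text(text: str, size: int = 800) -> list[str]:
    """Split the document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def top_k_chunks(query_vec: np.ndarray, chunk_vecs: np.ndarray,
                 chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by cosine similarity to the query and keep the best k."""
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

# Usage (assuming a hypothetical embed() returning one vector per string,
# and ask_llm() from the earlier sketch):
#   chunks = chunk_text(extract_text("paper.pdf"))
#   context = top_k_chunks(embed([query])[0], embed(chunks), chunks)
#   answer = ask_llm(f"Context:\n{''.join(context)}\n\nQuestion: {query}")
```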
Part 2: Full Application Development
After validating the core logic, the next step was to build a complete web-based application with separate containers for the frontend and backend. This phase focused on modularization, scalability, and inter-container communication.
Step 1: Creating a Dedicated Frontend Container
A new Docker container was created for the frontend interface using Streamlit. Streamlit was chosen for its simplicity and efficiency in building interactive data applications and chat interfaces with minimal effort.
Step 2: Designing the Frontend Interface
A user-friendly frontend was developed to allow users to:
Upload PDF documents.
Enter questions or prompts related to the uploaded content.
View responses generated by the backend LLM service.
The interface was built using Streamlit and connected to the backend through REST API endpoints using the requests library.
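A minimal version of this frontend might look like the sketch below; the backend hostname, endpoint paths, and JSON shapes are assumptions that mirror the /upload and /ask endpoints described in the next step:

```python
import requests
import streamlit as st

# Sketch of the Streamlit frontend; BACKEND_URL assumes the backend's
# service name on the internal Docker network.
BACKEND_URL = "http://backend:8000"

st.title("RAGChat")

# 1. Upload a PDF and forward it to the backend for extraction/embedding.
pdf = st.file_uploader("Upload a PDF", type="pdf")
if pdf is not None:
    requests.post(f"{BACKEND_URL}/upload",
                  files={"file": (pdf.name, pdf.getvalue(), "application/pdf")})
    st.success("Document indexed.")

# 2. Ask a question and display the LLM's answer.
question = st.text_input("Ask a question about the document")
if question:
    resp = requests.post(f"{BACKEND_URL}/ask", json={"question": question})
    st.write(resp.json().get("answer", "No answer returned."))
```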
Step 3: Implementing the Backend Container
The backend container was designed using FastAPI, which provided a lightweight and efficient framework for handling HTTP requests.
The backend included two major endpoints:
/upload: to handle document uploads and text extraction.
/ask: to process user queries, retrieve relevant document information, and generate responses using the Together AI model.
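A skeleton of these two endpoints is sketched below; index_document and answer_question are hypothetical helpers standing in for the extraction/embedding and RAG logic described earlier:

```python
from fastapi import FastAPI, File, UploadFile
from pydantic import BaseModel

app = FastAPI()

class Question(BaseModel):
    question: str

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    # Extract text from the uploaded PDF and store its embeddings.
    index_document(await file.read())  # hypothetical helper
    return {"status": "indexed", "filename": file.filename}

@app.post("/ask")
def ask(q: Question):
    # Retrieve relevant chunks and query the Together AI model.
    return {"answer": answer_question(q.question)}  # hypothetical helper
```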
Part 3: The Final Prototype in the Multi-Container Architecture
Step 1: Connecting Frontend and Backend Containers
Once both containers were ready, they were configured to communicate via Docker's internal network. The backend service was exposed on port 8000, while the Streamlit frontend ran on port 8501. Environment variables were used for configuration, ensuring security and flexibility.
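A docker-compose.yml wiring the two services together might look roughly like this (service and path names are illustrative); note that on the Compose network the frontend reaches the backend by service name rather than localhost:

```yaml
# Sketch of a docker-compose.yml for the two services; names are illustrative.
services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - TOGETHER_API_KEY=${TOGETHER_API_KEY}   # injected at runtime, not baked in
    volumes:
      - ./uploads:/app/uploads                 # persistent storage for uploaded PDFs
  frontend:
    build: ./frontend
    ports:
      - "8501:8501"
    environment:
      - BACKEND_URL=http://backend:8000        # resolved via Docker's internal DNS
    depends_on:
      - backend
```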
Final Project Screenshot:
What modifications were made to the containers:
Backend Container
The base image python:3.10-slim was downloaded and used as the foundation for the backend environment.
A dedicated working directory (/app) was created inside the container to store application files.
All required dependencies such as FastAPI, Uvicorn, PyPDF2, NumPy, Requests, and python-multipart were installed using the requirements file.
The backend application files were copied into the container, and a port (8000) was configured for the FastAPI service.
Environment variables (like API keys) and volume mounts for uploaded files were set up during container runtime to enable persistent storage and secure configuration.
Frontend Container
The frontend container was also built using the python:3.10-slim base image for consistency and smaller image size.
The Streamlit-based user interface files were added to the container’s working directory.
Essential packages such as Streamlit and Requests were installed through the requirements file.
The container was configured to run the Streamlit application on port 8501 and connect to the backend container through the internal Docker network.
Necessary environment and network configurations were applied in Docker Compose to ensure smooth communication between the frontend and backend services.
What are the outcomes of your DA?
The development of RAGChat resulted in a fully functional, intelligent, and user-friendly system for efficient information retrieval and question answering from PDF documents. The product successfully integrates Retrieval-Augmented Generation (RAG) techniques with modern web technologies to provide accurate, context-aware responses to user queries. By combining FastAPI for backend processing, Streamlit for the interactive user interface, Docker for containerization, and the Together AI API for natural language understanding, RAGChat demonstrates a seamless workflow for document analysis and intelligent interaction. The system enables users to upload PDF files, extract and embed textual content, and retrieve relevant information using semantic search powered by vector embeddings. As an outcome, RAGChat proves to be an effective and scalable solution for students, researchers, and professionals who need to extract insights quickly from large document collections, showcasing the practical potential of AI-driven document intelligence in real-world applications.
Conclusion:
The development of RAGChat has been a highly insightful and rewarding experience, combining concepts from artificial intelligence, web development, and containerization. Through this project, I gained a strong practical understanding of Docker, learning how to containerize applications, manage dependencies, and ensure consistent deployment across different environments. The system successfully achieves its goal of providing intelligent, context-aware information retrieval and question answering from PDF documents, demonstrating the power of Retrieval-Augmented Generation (RAG) and modern AI tools. While the current implementation effectively integrates FastAPI, Streamlit, and the Together AI API, there remains significant potential for further enhancement. Overall, this project not only strengthened my technical skills but also provided valuable experience in building and deploying real-world AI-driven web applications using Docker.
References:
Spoken Tutorial (https://spoken-tutorial.org/tutorial-search/?search_foss=Docker&search_language=English) served as an excellent resource for understanding the fundamental concepts and practical applications of Docker.
Acknowledgement:
I would like to express my sincere gratitude to VIT Chennai, School of Computer Science and Engineering (SCOPE), for offering the Cloud Computing course (Course Code: BCSE408L) during the Fall Semester 2025. This course provided a valuable platform to apply theoretical knowledge to a practical, hands-on project, greatly enhancing my understanding of containerization and cloud-based deployment.
Finally, I would like to extend my heartfelt thanks to my professor, Dr. T. Subbulakshmi, for her valuable guidance, clear instructions, and continuous support throughout all three phases of this Design Assignment. Her encouragement and insights were instrumental in successfully completing this project.
Name: Arjun A