RAGChat: A Web App for Intelligent PDF Information Retrieval and Question Answering
In today’s digital era, vast amounts of information are stored in unstructured document formats such as PDFs. Extracting relevant insights from these files often requires manually reading through large volumes of text, which is both time-consuming and inefficient. This challenge has created a growing need for intelligent systems that can automatically understand, retrieve, and summarize information from documents — enabling users to access knowledge effortlessly.
RAGChat is a web-based intelligent information retrieval and question-answering system designed to address this need. The platform allows users to upload one or more PDF files and interact with them through natural language queries. Instead of searching manually, users can simply ask questions, and RAGChat intelligently retrieves the most relevant document sections to generate accurate and context-aware answers.
The system leverages the power of Retrieval-Augmented Generation (RAG), a hybrid approach combining semantic search and large language models (LLMs). Uploaded documents are processed by splitting the text into chunks, generating vector embeddings, and storing them in a vector database. When a user submits a query, the system retrieves semantically similar chunks and passes them to an LLM (via Together AI) to produce precise, human-like responses.
By integrating natural language processing (NLP), vector embeddings, and web-based interactivity, RAGChat offers a practical and efficient solution for students, researchers, and professionals who frequently work with large document collections, transforming the way we access and comprehend information.
Objectives:
DA1 – Proof of Concept Development:
The primary objective of DA1 is to design and implement a proof-of-concept (PoC) version of the product within a Docker container. This initial prototype will feature a command-line interface (CLI) that demonstrates the system’s core functionality — including PDF text extraction, vector embedding generation, and intelligent query response using the Retrieval-Augmented Generation (RAG) pipeline. This stage establishes the foundational architecture and validates the feasibility of the proposed solution.
DA2 – Full Web Application Implementation:
The objective of DA2 is to extend the proof-of-concept into a fully functional web-based application. This phase involves developing a complete system using FastAPI for backend services, Streamlit for the interactive user interface, and Docker for containerization and deployment. Integration with the Together AI API will enable the system to generate accurate, context-aware responses to user queries.
DA3 – Final Presentation and Documentation:
The objective of DA3 is to present the finalized product during the Docker Showdown event, showcasing its capabilities, design, and performance. This phase also includes the preparation and submission of comprehensive project documentation, covering system architecture, implementation details, usage guidelines, and future enhancement possibilities.
Name of the containers involved and the download links:
Both the backend and frontend containers are built from the python:3.10-slim base image, available from Docker Hub (https://hub.docker.com/_/python).
Name of the other software involved along with the purpose:
VS Code - Environment for Software Development
Docker Desktop - To manage the containers created
All the other packages:
FastAPI
Uvicorn
ChromaDB
Pydantic
PyPDF2
Together AI API
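For reference, the combined dependency set corresponds roughly to the following requirements file (illustrative and unpinned; in the actual build the backend and frontend containers each use their own smaller requirements file, as described later):

```text
fastapi
uvicorn
chromadb
pydantic
PyPDF2
numpy
requests
python-multipart
streamlit
```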
Overall architecture:
Procedure:
Part 1: Prototype Development – Creating a Working Model in a Single Container
This phase focused on developing an initial prototype of the product — a simplified version of the Retrieval-Augmented Generation (RAG) system — within a single Docker container. The main objective was to ensure that the fundamental logic for document retrieval and language model integration worked correctly before moving on to multi-container orchestration.
Step 1: Acquiring API Credentials
The first step involved obtaining the required API keys from Together AI, which provides access to powerful large language models (LLMs). These keys are essential for authentication and for making secure API calls to the model endpoint.
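For illustration, a minimal authenticated call to Together AI's OpenAI-compatible chat-completions endpoint looks roughly like the sketch below; the model name is a placeholder, and the key is assumed to be supplied through a TOGETHER_API_KEY environment variable rather than hard-coded:

```python
import os
import requests

# Minimal sketch of a Together AI chat-completion call; the model name is
# a placeholder, and the key is read from an environment variable.
TOGETHER_API_URL = "https://api.together.xyz/v1/chat/completions"
API_KEY = os.environ["TOGETHER_API_KEY"]

def ask_llm(prompt: str) -> str:
    response = requests.post(
        TOGETHER_API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "model": "meta-llama/Llama-3-8b-chat-hf",  # placeholder model name
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```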
Step 2: Setting Up the Docker Environment
Next, a Docker container was created to encapsulate the development environment.
This required studying Docker concepts, including images, containers, and networking, to understand how applications can be packaged and deployed in isolated environments.
A base Python image was selected, and a Dockerfile was written to install the necessary dependencies such as FastAPI, PyPDF2, NumPy, and Requests.
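The Dockerfile followed the usual pattern for a Python service, roughly as sketched below (file and entry-point names are illustrative, not the exact ones used):

```dockerfile
# Sketch of the prototype Dockerfile; file and entry-point names are illustrative.
FROM python:3.10-slim

# Working directory inside the container
WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and run the CLI entry point of the proof of concept
COPY . .
CMD ["python", "main.py"]
```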
Step 3: Developing the Core Logic
In this step, the core logic of the prototype was implemented. The process involved:
Writing a Python script that accepts a PDF file as input.
Extracting the textual content from the uploaded document using PyPDF2.
Generating vector embeddings for the extracted text using a local embedding model (with ChromaDB serving as the vector store in earlier versions).
Passing user queries along with the retrieved document context to the Together AI LLM for response generation.
This single-container prototype successfully demonstrated the workflow of uploading a document, retrieving relevant text sections, and generating meaningful responses from the LLM.
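In code, this flow looks roughly like the following sketch; the function names and chunk size are illustrative, and embed() stands in for whichever embedding model is used:

```python
import numpy as np
from PyPDF2 import PdfReader

def extract_text(pdf_path: str) -> str:
    """Pull the raw text out of every page of the PDF."""
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def chunk_text(text: str, size: int = 800) -> list[str]:
    """Split the document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def top_k_chunks(query_vec: np.ndarray, chunk_vecs: np.ndarray,
                 chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by cosine similarity to the query and keep the best k."""
    sims = chunk_vecs @ query_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

# Usage (assuming a hypothetical embed() returning one vector per string,
# and ask_llm() from the earlier sketch):
#   chunks = chunk_text(extract_text("paper.pdf"))
#   context = top_k_chunks(embed([query])[0], embed(chunks), chunks)
#   answer = ask_llm(f"Context:\n{''.join(context)}\n\nQuestion: {query}")
```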
Part 2: Full Application Development
After validating the core logic, the next step was to build a complete web-based application with separate containers for the frontend and backend. This phase focused on modularization, scalability, and inter-container communication.
Step 1: Creating a Dedicated Frontend Container
A new Docker container was created for the frontend interface using Streamlit. Streamlit was chosen for its simplicity and efficiency in building interactive data applications and chat interfaces with minimal effort.
Step 2: Designing the Frontend Interface
A user-friendly frontend was developed to allow users to:
Upload PDF documents.
Enter questions or prompts related to the uploaded content.
View responses generated by the backend LLM service.
The interface was built using Streamlit and connected to the backend through REST API endpoints using the requests library.
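A minimal version of this frontend might look like the sketch below; the backend hostname, endpoint paths, and JSON shapes are assumptions that mirror the /upload and /ask endpoints described in the next step:

```python
import requests
import streamlit as st

# Sketch of the Streamlit frontend; BACKEND_URL assumes the backend's
# service name on the internal Docker network.
BACKEND_URL = "http://backend:8000"

st.title("RAGChat")

# 1. Upload a PDF and forward it to the backend for extraction/embedding.
pdf = st.file_uploader("Upload a PDF", type="pdf")
if pdf is not None:
    requests.post(f"{BACKEND_URL}/upload",
                  files={"file": (pdf.name, pdf.getvalue(), "application/pdf")})
    st.success("Document indexed.")

# 2. Ask a question and display the LLM's answer.
question = st.text_input("Ask a question about the document")
if question:
    resp = requests.post(f"{BACKEND_URL}/ask", json={"question": question})
    st.write(resp.json().get("answer", "No answer returned."))
```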
Step 3: Implementing the Backend Container
The backend container was designed using FastAPI, which provided a lightweight and efficient framework for handling HTTP requests.
The backend included two major endpoints:
/upload: to handle document uploads and text extraction.
/ask: to process user queries, retrieve relevant document information, and generate responses using the Together AI model.
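A skeleton of these two endpoints is sketched below; index_document and answer_question are hypothetical helpers standing in for the extraction/embedding and RAG logic described earlier:

```python
from fastapi import FastAPI, File, UploadFile
from pydantic import BaseModel

app = FastAPI()

class Question(BaseModel):
    question: str

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    # Extract text from the uploaded PDF and store its embeddings.
    index_document(await file.read())  # hypothetical helper
    return {"status": "indexed", "filename": file.filename}

@app.post("/ask")
def ask(q: Question):
    # Retrieve relevant chunks and query the Together AI model.
    return {"answer": answer_question(q.question)}  # hypothetical helper
```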
Part 3: The Final Prototype in the Multi-Container Architecture
Step 1: Connecting Frontend and Backend Containers
Once both containers were ready, they were configured to communicate via Docker's internal network. The backend service was exposed on port 8000, while the Streamlit frontend ran on port 8501. Environment variables were used for configuration, ensuring security and flexibility.
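A docker-compose.yml wiring the two services together might look roughly like this (service and path names are illustrative); note that on the Compose network the frontend reaches the backend by service name rather than localhost:

```yaml
# Sketch of a docker-compose.yml for the two services; names are illustrative.
services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - TOGETHER_API_KEY=${TOGETHER_API_KEY}   # injected at runtime, not baked in
    volumes:
      - ./uploads:/app/uploads                 # persistent storage for uploaded PDFs
  frontend:
    build: ./frontend
    ports:
      - "8501:8501"
    environment:
      - BACKEND_URL=http://backend:8000        # resolved via Docker's internal DNS
    depends_on:
      - backend
```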
Final Project Screenshot:
What modifications were made to the containers:
Backend Container
The base image python:3.10-slim was downloaded and used as the foundation for the backend environment.
A dedicated working directory (/app) was created inside the container to store application files.
All required dependencies such as FastAPI, Uvicorn, PyPDF2, NumPy, Requests, and python-multipart were installed using the requirements file.
The backend application files were copied into the container, and a port (8000) was configured for the FastAPI service.
Environment variables (like API keys) and volume mounts for uploaded files were set up during container runtime to enable persistent storage and secure configuration.
Frontend Container
The frontend container was also built using the python:3.10-slim base image for consistency and smaller image size.
The Streamlit-based user interface files were added to the container’s working directory.
Essential packages such as Streamlit and Requests were installed through the requirements file.
The container was configured to run the Streamlit application on port 8501 and connect to the backend container through the internal Docker network.
Necessary environment and network configurations were applied in Docker Compose to ensure smooth communication between the frontend and backend services.
What are the outcomes of your DA?
The development of RAGChat resulted in a fully functional, intelligent, and user-friendly system for efficient information retrieval and question answering from PDF documents. The product successfully integrates Retrieval-Augmented Generation (RAG) techniques with modern web technologies to provide accurate, context-aware responses to user queries. By combining FastAPI for backend processing, Streamlit for the interactive user interface, Docker for containerization, and the Together AI API for natural language understanding, RAGChat demonstrates a seamless workflow for document analysis and intelligent interaction. The system enables users to upload PDF files, extract and embed textual content, and retrieve relevant information using semantic search powered by vector embeddings. As an outcome, RAGChat proves to be an effective and scalable solution for students, researchers, and professionals who need to extract insights quickly from large document collections, showcasing the practical potential of AI-driven document intelligence in real-world applications.
Conclusion:
The development of RAGChat has been a highly insightful and rewarding experience, combining concepts from artificial intelligence, web development, and containerization. Through this project, I gained a strong practical understanding of Docker, learning how to containerize applications, manage dependencies, and ensure consistent deployment across different environments. The system successfully achieves its goal of providing intelligent, context-aware information retrieval and question answering from PDF documents, demonstrating the power of Retrieval-Augmented Generation (RAG) and modern AI tools. While the current implementation effectively integrates FastAPI, Streamlit, and the Together AI API, there remains significant potential for further enhancement. Overall, this project not only strengthened my technical skills but also provided valuable experience in building and deploying real-world AI-driven web applications using Docker.
References:
Spoken Tutorial (https://spoken-tutorial.org/tutorial-search/?search_foss=Docker&search_language=English) served as an excellent resource for understanding the fundamental concepts and practical applications of Docker.
Acknowledgement:
I would like to express my sincere gratitude to VIT Chennai, School of Computer Science and Engineering (SCOPE), for offering the Cloud Computing course (Course Code: BCSE408L) during the Fall Semester 2025. This course provided a valuable platform to apply theoretical knowledge to a practical, hands-on project, greatly enhancing my understanding of containerization and cloud-based deployment.
Finally, I would like to extend my heartfelt thanks to my professor, Dr. T. Subbulakshmi, for her valuable guidance, clear instructions, and continuous support throughout all three phases of this Design Assignment. Her encouragement and insights were instrumental in successfully completing this project.
Name: Arjun A