RAG Book Recommendation Chatbot

Retrieval-augmented generation (RAG) is one of the most useful and popular applications of large language models. LLMs often hallucinate, but RAG, along with prompt engineering, is a powerful way to improve the accuracy of an LLM’s responses.

I like to read, and I found a dataset of books scraped from Goodreads. Using that dataset, I created a simple RAG chatbot that recommends books to users.

The backend is developed with FastAPI and the frontend with Streamlit. The two are deployed in separate containers and communicate over a virtual Docker network: the backend is reachable only by the frontend, while the frontend is exposed to users via an EC2 instance.

I used OpenAI’s text-embedding-ada-002 model to embed the book descriptions and indexed the embeddings, along with relevant metadata, in Qdrant. The frontend posts each user message to the backend, which embeds the message and retrieves the top 5 most similar books from Qdrant. The retrieved titles, authors, descriptions, and genres are passed to OpenAI’s chat completion API, and the resulting completion is returned to the frontend and appended to the conversation history maintained in both the backend and the frontend.
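
As a rough illustration, the retrieval step might look like the sketch below. This is a minimal sketch rather than the project’s actual code: the collection name ("books"), the payload fields, and the function name retrieve_books are assumptions.

```python
# Minimal sketch of the retrieval step, assuming the openai (>=1.0) and
# qdrant-client packages. The collection name "books" and the payload
# fields are illustrative assumptions, not the project's actual names.
from openai import OpenAI
from qdrant_client import QdrantClient

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
qdrant = QdrantClient(url="<my-qdrant-cluster-url>", api_key="<my-qdrant-api-key>")

def retrieve_books(message: str, top_k: int = 5) -> list[dict]:
    # Embed the user message with the same model used to index the books
    embedding = openai_client.embeddings.create(
        model="text-embedding-ada-002",
        input=message,
    ).data[0].embedding

    # Retrieve the top-k most similar books from Qdrant
    hits = qdrant.search(collection_name="books", query_vector=embedding, limit=top_k)
    return [hit.payload for hit in hits]  # title, author, description, genres
```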

Code repo

Tools, libraries, and APIs:

  * Python
  * FastAPI (backend)
  * Streamlit (frontend)
  * OpenAI API (text-embedding-ada-002 embeddings and chat completions)
  * Qdrant (vector database)
  * Docker and Docker Compose (containerization and networking)
  * AWS EC2 (hosting)

UI:

[Screenshot of the Streamlit chat UI]

Architecture:

[Architecture diagram]

Project directory structure:

.
├── backend
│   ├── __init__.py
│   ├── dependencies.py             # Dependency functions to inject BookRetriever into /chat/ operation function
│   ├── main.py                     # Create one instance of BookRetriever to be shared among all users
│   ├── models
│   │   ├── __init__.py
│   │   ├── assistant.py            # BookAssistant class definition
│   │   └── retriever.py            # BookRetriever class definition
│   └── routers
│       ├── __init__.py
│       └── chat.py                 # /chat/ endpoint
└── frontend
    └── app.py                      # Streamlit application for UI

Future work:

Backend

The backend is developed with FastAPI and provides a single endpoint, /chat/, to handle user messages.

Key components:

  * BookRetriever (models/retriever.py): embeds the user message and retrieves the top 5 most similar books from Qdrant. A single instance is created in main.py and injected into the /chat/ operation function via the dependency in dependencies.py, so it is shared among all users.
  * BookAssistant (models/assistant.py): passes the retrieved books to OpenAI’s chat completion API and maintains the backend copy of each user’s conversation history.
  * /chat/ router (routers/chat.py): receives user messages from the frontend and returns the generated recommendation; a sketch of the endpoint follows this list.
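
The endpoint wiring might look roughly like this sketch of routers/chat.py. The ChatRequest fields and the get_retriever, retrieve, and respond names are assumptions for illustration, not the actual implementation.

```python
# Hypothetical sketch of routers/chat.py. The ChatRequest fields and the
# get_retriever / retrieve / respond names are assumptions for illustration.
from fastapi import APIRouter, Depends
from pydantic import BaseModel

from ..dependencies import get_retriever  # yields the shared BookRetriever
from ..models.assistant import BookAssistant
from ..models.retriever import BookRetriever

router = APIRouter()
assistant = BookAssistant()  # maintains the backend copy of conversation histories

class ChatRequest(BaseModel):
    user_id: str  # unique ID generated by the frontend
    message: str

@router.post("/chat/")
def chat(
    request: ChatRequest,
    retriever: BookRetriever = Depends(get_retriever),
) -> dict:
    # Retrieve the top 5 most similar books, then generate a recommendation
    books = retriever.retrieve(request.message, top_k=5)
    reply = assistant.respond(request.user_id, request.message, books)
    return {"reply": reply}
```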

Frontend

The frontend is developed with Streamlit. It renders the chat UI, generates unique user IDs, and sends each user message to the backend as an HTTP POST request.
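
A minimal version of frontend/app.py along these lines could look like the sketch below. The backend hostname and port ("backend:8000") and the response shape ({"reply": ...}) are assumptions.

```python
# Illustrative sketch of frontend/app.py. The backend hostname and port
# ("backend:8000") and the response shape ({"reply": ...}) are assumptions.
import uuid

import requests
import streamlit as st

BACKEND_URL = "http://backend:8000/chat/"  # resolvable only on the Docker network

st.title("Book Recommendation Chatbot")

# Generate a unique user ID once per session
if "user_id" not in st.session_state:
    st.session_state.user_id = str(uuid.uuid4())
if "history" not in st.session_state:
    st.session_state.history = []

# Replay the conversation history kept on the frontend
for role, text in st.session_state.history:
    st.chat_message(role).write(text)

if message := st.chat_input("What would you like to read?"):
    st.chat_message("user").write(message)
    response = requests.post(
        BACKEND_URL,
        json={"user_id": st.session_state.user_id, "message": message},
        timeout=60,
    )
    reply = response.json()["reply"]
    st.chat_message("assistant").write(reply)
    st.session_state.history += [("user", message), ("assistant", reply)]
```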

Containerization and networking

This project uses separate Docker containers for the backend and frontend.

Refer to ./compose.yaml for the Docker Compose configuration.
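
For reference, a configuration along these lines might look like the sketch below; the service names, ports, and build contexts are assumptions, so treat ./compose.yaml as the source of truth.

```yaml
# Illustrative compose.yaml sketch; service names, ports, and build
# contexts are assumptions. See ./compose.yaml for the actual configuration.
services:
  backend:
    build: ./backend
    env_file: ./backend/.env
    networks:
      - internal              # not published to the host: only the frontend can reach it
  frontend:
    build: ./frontend
    ports:
      - "8501:8501"           # the only port exposed outside the Docker network
    depends_on:
      - backend
    networks:
      - internal

networks:
  internal:
```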

Setup instructions

  1. Create a file called ./backend/.env with these environment variables defined:
     QDRANT_URL=<my-qdrant-cluster-url>
     QDRANT_API_KEY=<my-qdrant-api-key>
     OPENAI_API_KEY=<my-openai-api-key>
    
  2. Install Docker and Docker Compose on your machine
  3. In your terminal, navigate to the directory containing compose.yaml and execute docker compose up. This command builds and starts both containers.
  4. Open your web browser and navigate to http://localhost:8501 to access the application UI

Dataset and embedding

The dataset contains book data scraped from Goodreads, including fields such as the description, genres, number of ratings, and average rating on a 5-star scale.

For each book, I concatenated the description with the genres list, fed the combined text into the embedding model, and stored the resulting embedding, along with the book’s metadata, in the vector database.
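
The indexing step could be sketched as follows. The collection name, the 1536-dimension vector size of text-embedding-ada-002, and the record fields are assumptions based on the description above, and the sample record is a stand-in for the parsed Goodreads dataset.

```python
# Minimal indexing sketch, assuming the openai (>=1.0) and qdrant-client
# packages. The collection name "books" and the record fields are assumptions.
from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

openai_client = OpenAI()
qdrant = QdrantClient(url="<my-qdrant-cluster-url>", api_key="<my-qdrant-api-key>")

# text-embedding-ada-002 produces 1536-dimensional vectors
qdrant.recreate_collection(
    collection_name="books",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

books = [  # stand-in for the parsed Goodreads dataset
    {
        "title": "Dune",
        "author": "Frank Herbert",
        "description": "A noble family battles for control of a desert planet.",
        "genres": ["Science Fiction", "Classics"],
    },
]

for i, book in enumerate(books):
    # Concatenate the description with the genres list before embedding
    text = f"{book['description']} Genres: {', '.join(book['genres'])}"
    embedding = openai_client.embeddings.create(
        model="text-embedding-ada-002",
        input=text,
    ).data[0].embedding
    # Store the embedding together with the book's metadata as the payload
    qdrant.upsert(
        collection_name="books",
        points=[PointStruct(id=i, vector=embedding, payload=book)],
    )
```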