Books Recommender System

Introduction

Recommender systems play a crucial role in cross-selling and in reducing customer churn across many industries. When text data is available, a recommender system can use the semantic similarity between documents to find the ones closest to a given item.

The text is converted into embeddings, which are dense vector representations. A single vector can represent many words within a context window, and the length of that window depends on the capabilities of the embedding model.
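
As a minimal illustration of how dense vectors can express similarity, the sketch below compares toy three-dimensional vectors with cosine similarity. The vectors and their labels are made up for illustration, and real embeddings have far more dimensions. The similarity metric used in this project is not stated, so cosine similarity is an assumption here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (hypothetical values).
space_opera = [0.9, 0.1, 0.0]
hard_sci_fi = [0.8, 0.2, 0.1]
cookbook = [0.0, 0.1, 0.9]

# The two science-fiction vectors point in nearly the same direction,
# so their score is higher than the score against the cookbook vector.
print(cosine_similarity(space_opera, hard_sci_fi))
print(cosine_similarity(space_opera, cookbook))
```

Books whose descriptions are semantically close should produce vectors with a high score, which is what lets a recommender rank candidates.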

The objective of this work was to create a recommender system based on semantic embeddings to suggest similar books based on their descriptions and categories.

Data:

The data utilized in this project is sourced from the following Kaggle repository.
Records missing the book's category, description, publication year, average rating, or number of pages were filtered out.
The remaining book descriptions were then converted to lowercase and cleaned with several regular expressions.
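
The filtering and cleaning steps above could look roughly like the sketch below. The field names and the specific regular expressions are assumptions chosen for illustration, since the project does not list the exact columns or patterns it used.

```python
import re

# Hypothetical field names; the real dataset columns may differ.
REQUIRED_FIELDS = (
    "category",
    "description",
    "publication_year",
    "average_rating",
    "num_pages",
)


def has_required_fields(record):
    """True when every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def clean_description(text):
    """Lowercase the text and apply a few illustrative regex clean-ups."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)          # drop HTML-like tags
    text = re.sub(r"[^a-z0-9\s.,'-]", " ", text)  # drop unusual symbols
    text = re.sub(r"\s+", " ", text)              # collapse whitespace
    return text.strip()


def prepare(records):
    """Filter out incomplete records and clean the remaining descriptions."""
    return [
        {**record, "description": clean_description(record["description"])}
        for record in records
        if has_required_fields(record)
    ]
```

Calling `prepare` on a list of dictionaries returns only the complete records, each with a normalized description ready to be embedded.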

Tools used:

LangChain
Python
OpenAI API
Pinecone

Architecture used:

The code is stored in a GitHub repository, and the data is embedded with OpenAI's Ada002 embedding model.
The embeddings and their metadata are saved in a Pinecone vector store.
The application is continuously deployed on Streamlit and queries the data stored in Pinecone on every request.
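
The request flow described above (embed the text, store vectors with metadata, query for the nearest neighbours) can be sketched with an in-memory stand-in. This is not the Pinecone client or the OpenAI API: `InMemoryVectorStore`, `fake_embed`, the keyword vocabulary, and the sample books are all hypothetical placeholders so that the example runs without credentials.

```python
import math

# Hypothetical keyword vocabulary used by the placeholder embedder.
VOCAB = ("space", "recipe", "murder")


def fake_embed(text):
    """Placeholder for an embedding API call: smoothed keyword counts."""
    lowered = text.lower()
    return [lowered.count(word) + 0.01 for word in VOCAB]


class InMemoryVectorStore:
    """Toy stand-in for a hosted vector store; not the Pinecone client API."""

    def __init__(self):
        self._items = {}  # id -> (unit-length vector, metadata)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    def upsert(self, item_id, vector, metadata):
        """Insert or overwrite a vector together with its metadata."""
        self._items[item_id] = (self._normalize(vector), metadata)

    def query(self, vector, top_k=3):
        """Return up to top_k (score, id, metadata) tuples, best match first."""
        q = self._normalize(vector)
        scored = [
            (sum(a * b for a, b in zip(q, v)), item_id, meta)
            for item_id, (v, meta) in self._items.items()
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


store = InMemoryVectorStore()
store.upsert("b1", fake_embed("A space crew explores deep space"),
             {"title": "Hypothetical Book A", "category": "Science Fiction"})
store.upsert("b2", fake_embed("A recipe for every season"),
             {"title": "Hypothetical Book B", "category": "Cooking"})
store.upsert("b3", fake_embed("A murder on a space station"),
             {"title": "Hypothetical Book C", "category": "Mystery"})

results = store.query(fake_embed("space adventure"), top_k=2)
for score, item_id, meta in results:
    print(item_id, meta["title"], round(score, 3))
```

In the deployed application, the embedding step and the store would be replaced by the hosted services, but the shape of each request stays the same: one embedding call followed by one nearest-neighbour query.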


