In today’s digital world, businesses are exploring ways to make online interactions easier to access. Online shoppers often struggle to find relevant information and assistance for their desired products. Unlike a physical store where a friendly chat with an employee helps you discover the right products, the online environment tends to overwhelm users with too much information.

This project aims to bridge this gap by introducing personalized shopping assistance tailored to individual needs. The goal is to improve the overall customer experience by addressing the shortcomings of generic online interactions. The solution comes in the form of a chatbot, developed using a Large Language Model (LLM) and enriched with relevant data. For this, a technique called Retrieval-Augmented Generation (RAG) is used. While this chatbot is designed specifically for a hardware store, it can be adapted to suit different needs.

The chatbot functions as an AI-assistant, enabling users to engage in conversations, obtain product information, and seek guidance for their DIY projects. The primary focus of this study revolves around the question: “How can the performance of an AI-driven chatbot be optimized to better assist users in obtaining information about DIY projects and product recommendations?“.

By answering this research question, we hope to provide practical insights into improving the performance of AI-driven chatbots, offering a more user-friendly and supportive online shopping experience.


The chatbot uses an LLM to answer a user query. Next to that, it makes use of product data and expert blogs, which are saved in a vector database.The final product is a proof-of-concept that demonstrates the possibilities of generative AI in e-commerce. The following figure gives an example of how the chatbot can be used.

As can be seen, the answer includes several links to the correct products. Furthermore, the answer ends with a link to a blog post which contains more relevant information.


To understand how the chatbot works and its impact on user interactions, let’s look at the steps it takes to provide personalized information:

  1. User Query Submission: The process begins when a user submits a question to the chatbot, seeking information about DIY projects or product recommendations.
  2. Query Refinement: The chatbot refines the user’s query by sending a request to the OpenAI API, utilizing the chat history as context.
  3. Query Classification: A request is sent to the OpenAI API to categorize the refined query into specific classes, such as ‘recommendations,’ ‘comparisons,’ ‘step-by-step guides,’ ‘availability,’ or ‘other.’
  4. Product Retrieval: The chatbot retrieves the top products most relevant to the user’s query from the vector database. This step ensures that the chatbot provides accurate and context-specific product recommendations.
  5. Blog Retrieval: Simultaneously, the chatbot fetches the most relevant blog related to the user’s query from the vector database. It also retrieves a collection of relevant blog chunks.
  6. Specification Generation: The chatbot sends a request to the OpenAI API to generate a list of product specifications based on the user’s query and the retrieved blogs.
  7. Additional product retrieval: For each listed product specification, the chatbot refines the information by retrieving additional products closely matching the user’s query from the vector database.
  8. Response Generation: Finally, a request is sent to the OpenAI API to formulate a comprehensive response to the user’s query. The chatbot utilizes the classified query, blog data, and product information to craft a detailed and contextually rich response.

Model development

To assess the progress made in the chatbot’s performance, we use a systematic approach of evolving through distinct versions. This approach enables a straightforward comparison between versions, helping us build upon the successes of each prior iteration.

Version 0: Laying the Foundation

The initial version lays the foundation, integrating a Large Language Model (LLM) with product data and expert blogs. Additionally, the query is classified in one of the query classes. The chatbot can now provide basic information, but there’s room for improvement.

Version 1: The Power of Prompt Engineering

Prompt engineering becomes a focal point in Version 1, breaking down the chatbot’s tasks into smaller, more manageable steps. This strategic move results in a notable increase in performance, demonstrating the significance of prompt engineering in optimizing the chatbot’s understanding of user queries.

Version 2: Smarter Information Retrieval

Acknowledging the need for smarter information retrieval, Version 2 introduces the use of the LLM to generate a list of product specifications based on the user’s query. In earlier versions, the chatbot would only suggest one type of product even if the user needed more, this solution forces the chatbot to obtain more varied products.

Version 3: Balancing Efficiency and Cost

Efficiency and cost considerations come into play in Version 3. The chatbot’s functionality is refined, with careful decisions on when to use advanced models and when a simpler approach suffices. Breaking down blog data into smaller parts allows the chatbot to retrieve specific information, striking a balance between efficiency and cost-effectiveness.

Version 4: Speeding Up the Process

Recognizing the importance of speed, Version 4 removes the summarization part, streamlining the chatbot’s response time. Furthermore, the product information is simplified, keeping only the essential details.

Version 5: Fine-Tuning for Excellence

Version 5 creates a fine-tuned model, trained on data obtained from answers generated in Version 4, making it require fewer instructions and tokens. It uses a faster and cheaper model (gpt-3.5), enhancing the chatbot’s efficiency. This proves to be a highly effective strategy, significantly reducing costs, response times, and enhancing overall performance.


To optimize the AI-driven chatbot for DIY projects and product recommendations, we evolved through different versions. Strategies like prompt engineering and query classification are used to improve user interactions. Balancing complexity and efficiency involved selecting models wisely and breaking down data for cost-effectiveness. Later versions prioritized speed by simplifying information, and the most optimal version makes use of model fine-tuning for increased efficiency.

The results clearly show that putting a lot of focus on prompt engineering brings significant gains to the chatbot’s performance. Improvements in economic considerations emphasize the need to look beyond just accuracy. Fine-tuning is an effective way to boost performance while keeping costs and duration low.

Emma Beekman

Intern Data Science at Squadra Machine Learning Company