Use Case

  1. Home
  2. Recommendation Engine

Uses Cases:
Recommendation Engine

In today's competitive landscape, delivering personalized experiences is vital for business success. A recommendation engine enhances customer satisfaction and boosts sales by providing tailored product suggestions based on individual preferences and behaviors. By analyzing user interactions, it uncovers hidden patterns that help businesses connect with their customers more effectively.

What will You Do

Discover how to leverage user behavior data to enhance product recommendations.

Learn effective strategies for personalizing customer experiences based on preferences and interactions.

Find out which algorithms are best suited for generating accurate recommendations – and why.

See a sample recommendation model that illustrates how to connect users with relevant products.

Introduction

In the world of data analysis and business solution development, having a representative and meaningful dataset is key to understanding various aspects of a business's operations. One dataset that is often used as a reference is the Northwind dataset. This dataset provides a comprehensive overview of product sales operations, involving key entities such as Customer, Order, Product, and Category.

In this section, we will explore and introduce the Northwind dataset. From the key entities to the relationships between them, this dataset provides a strong foundation for developing various analytics solutions, including the creation of a product recommendation system. Let us understand the characteristics of this dataset, and how the information contained in it can be a valuable source of knowledge for decision-making and business strategy development.

Dataset

The Northwind dataset is a sample dataset that presents comprehensive information about product sales business operations, with the main entities involving Customer, Order, Product, and Product Category. Specifically designed to provide application and query examples in a relational and graph database environment, Northwind not only provides a comprehensive overview of the relationships between its entities, but also offers potential for a variety of uses, including the development of recommendation engines. By mining Customer, Order, Product, and Product Category data, this dataset can assist in creating an intelligent recommendation system, enabling the use of purchase patterns and preferences to provide relevant product recommendations.

In the context of a graph database, the Northwind dataset can be represented as a graph where entities are nodes and relationships between entities are edges. Each node in the graph represents a different type of entity, and the relationships between edges represent the associations between those entities.

Nodes

There are four main entities that form the basic framework of business operations and product transactions. The four entities, namely Customer, Product, Order, and Category, create a rich structure of information about customers, products, orders, and product categories.

  • • Customer
  • Customer is an entity that represents individuals or companies that make purchases or transactions in the system. Each Customer node has properties such as customerID, companyName, contactName, and other contact information.

    • customerID: A unique identification for each customer in the dataset. It is a string used to distinguish one customer from another.
    • companyName: The name of the company or business entity of the customer. This property provides information about the legal entity that made the transaction or purchase.
    • contactName: The name of the contact associated with the customer. This is the name of the person who is responsible for or has a relationship with the business transaction.
    • contactTitle: The title or position held by the customer contact. It provides additional context about the contact's role or responsibilities within the company.
    • address: The physical address or office of the customer.
    • city: The name of the city where the customer is located.
    • country: The country where the customer is located.
    • region: A specific region or part of the country where the customer is located.
    • postalCode: The zip code or postal code associated with the customer's address.
    • phone: The phone number that can be used to contact the customer.
    • fax: The fax number associated with the customer.
  • • Product
  • Order is an entity that represents an order or transaction made by a Customer. Each Order node has properties such as orderID, orderDate, customerID, and other order information.

    • orderID: Unique identification for each order.
    • customerID: The unique identification of the customer who placed the order. It is a foreign key that can be used to link the Order node with the Customer node.
    • freight: The shipping cost or freight cost of the order.
    • orderDate: The date when the order was placed.
    • requiredDate: The date on which the order is expected to arrive or be fulfilled.
    • shipAddress: The shipping address of the order.
    • shipCity: The destination city of the order.
    • shipCountry: The country to which the order is destined.
    • shipName: The name of the order recipient.
    • shipPostalCode: The postal code of the order destination.
    • shipRegion: The region or specific part of the country the order is destined for.
    • shipVia: The shipping method or third party responsible for shipping the order.
    • shippedDate: The date when the order was actually shipped.
  • • Category
  • Category is an entity that represents a particular category or type of product. Each Category node has properties such as categoryID, categoryName, and other category information.

    • categoryID: Unique identification for each product category in the dataset.
    • categoryName: The name of the category that provides information about the types of products included in the category.
    • description: A short description that provides additional information about the type of product or characteristic of the category.
    • picture: Contains image data representing the category that has been encrypted in base64 format.

Relationships

In a graph database, relationships are very important. Relationships make it possible to traverse the graph so as to find relationships between entities. In a graph database, relationships between nodes can be represented by edges or relationships of a certain type. In the context of the Northwind dataset, there are three types of relations: PURCHASED, ORDERS, and PART_OF.

• PURCHASED

Cypher
(customer)-[:PURCHASED]->(order)

An example of using the PURCHASED relation:

The PURCHASED relation can be used to describe the relationship between a Customer node and an Order node. For example, if a customer purchases a particular product, you can use the PURCHASED relation to connect the Customer node with the Order node, indicating that the customer has made a purchase or order.

• ORDERS

Cypher
(order)-[:ORDERS]->(product)

An example of using the ORDERS relation:

The ORDERS relation can be used to describe the relationship between the Order node and the Product node. It reflects that within an order, there are certain products ordered or purchased by the customer.

• PART_OF

Cypher
(product)-[:PART_OF]->(category)

An example of using the PART_OF relation:

The PART_OF relation can be used to describe the relationship between a Product node and a Category node. If a product belongs to a category, the PART_OF relation can be used to show the relationship between the product and the category.


Nodes and Relationship in Northwind Graph

The result of this query will contain all paths that satisfy the defined graph pattern, which consists of nodes and relations connected through the relations :PURCHASED, :ORDERS, and :PART_OF. Each path in the result will include all the nodes and relations involved in the pattern.

  • • Customer and Order
    • Customer node represents the customer involved in the transaction.
    • [:PURCHASED] connects customers with the orders they place.
    • Order node represents an order placed by a particular customer.
  • • Order and Product
    • Order represents orders that have been created.
    • [:ORDERS] connects orders with the products in the order.
    • Product represents the products that have been ordered in order.
  • • Order and Product
    • Customer node represents the customer involved in the transaction.
    • [:PURCHASED] connects customers with the orders they place.
    • Order node represents an order placed by a particular customer.

Data Preparation

In exploring the Northwind dataset, data preparation is a crucial step to understanding the relationships between key entities. In this section, we will enter the Data Preparation stage, where we will create two important nodes, namely the Customer node and the Order node. These nodes will be the foundation for understanding the dynamics of the relationship between customers and orders in the Northwind dataset environment. With the creation of the nodes and the relationship between them, we will have a strong foundation to explore and analyze the business interactions that occur within the dataset.

Create Nodes

In a graph database, nodes are fundamental components that represent entities or objects in the system. Each node usually has a unique identification and can be associated with various properties that describe its characteristics. Nodes are used to model and store relevant entity data. To create nodes, we can use Cypher queries. Here is an example of how to create a node to represent a customer and order in the Northwind database:

Create New Customer Node

Cypher
// Create a new Customer node
CREATE (:Customer {
customerID: 'ALFKI',
companyName: 'Alfreds Futterkiste',
contactName: 'Maria Anders',
contactTitle: 'Sales Representative',
address: 'Obere Str. 57',
city: 'Berlin',
country: 'Germany',
postalCode: '12209',
phone: '030-0074321',
fax: '030-0076545'
});
  • In the example above:
    • Create : Used to create a new node.
    • (:Customer) : Gives the newly created node the label "Customer".
    • {...} : Provides properties for the node.

This query creates a new Customer node in the graph database with properties that include customer-related information such as address, company name, contact, and more.


Create New Order Node

Cypher
// Create a new Order node
CREATE (:Order {
orderID: '11011',
customerID: 'ALFKI',
employeeID: 3,
freight: 1.21,
orderDate: datetime('1998-04-09T00:00:00.000'),
requiredDate: datetime('1998-05-07T00:00:00.000'),
shipAddress: 'Obere Str. 57',
shipCity: 'Berlin',
shipCountry: 'Germany',
shipName: "Alfred's Futterkiste",
shipPostalCode: '12209',
shipRegion: NULL,
shipVia: 1,
shippedDate: datetime('1998-04-13T00:00:00.000')
});
  • In the example above:
    • Create : Used to create a new node.
    • (:Customer) : Gives the newly created node the label "Order".
    • {...} : Provides properties for the node.

This query creates a new Order node in the graph database with properties that include order-related information such as address, customer, ship time, and more.


Create Relationships

Relationships in a graph database represent connections or associations between nodes. They are a fundamental component that allows modeling the relationships between different entities in a system. Relationships are used to capture meaningful interactions, dependencies, or associations between nodes. Here is an example of how to create a relationship between a customer and order in the Northwind database:

Cypher
MATCH (customer:Customer {customerID: 'ALFKI'})
MATCH (order:Order {orderID: '11011'})
CREATE (customer)-[:PURCHASED]->(order);

The given Cypher query aims to create a [:PURCHASED] relationship between the Customer node and the Order node.

1. Matching the Existing Customer Nodes

Cypher
MATCH (customer:Customer {customerID: 'ALFKI'})

Using MATCH to search for the customer node with customerID 'ALFKI'. The customer variable is used to reference this Customer node.

2. Matching the Existing Order Nodes

Cypher
MATCH (order:Order {orderID: '11011'})

Using MATCH to search for the customer node with orderID '11011'. The order variable is used to reference this Order node.

3. Connects the Customer Node and Order Node

Cypher
CREATE (customer)-[:PURCHASED]->(order)

Use CREATE to create a relationship of type [:PURCHASED] between the Customer node and the Order node. This relationship indicates that the customer has made a purchase from a specific order.

Executing this query will create a new relationship between the Customer node and the Order node that matches the predefined matching conditions. This relationship will have a [:PURCHASED] label, which provides the context that the customer has made a purchase through a specific order.

Recommendation and Result

Creating a recommendation engine using the Northwind dataset in a graph database involves leveraging relationships between entities to create personalized product recommendations for customers. Graph databases are well suited for this due to their ability to traverse and query relationships efficiently. The following is how we can approach building using Northwind:

  • • Identify Customer Purchase History

    Start by tracing the graph from a specific customer node to the orders they have placed and the products in those orders.

  • • Explore Similar Customers

    Identify other customers who have purchased similar products or products from the same category.

  • • Recommend Products Purchased by Similar Customers

    Recommend products purchased by similar customers but not yet purchased by the target customer.


Queries

In the development of marketing and customer experience strategies, the discovery and implementation of appropriate product recommendations play a very important role. The two main approaches used to provide relevant product recommendations are Category-Based Recommendation and Collaborative Filtering Recommendation. In this section, we will explore both approaches and see how they can be applied in the context of the Northwind dataset. Category-Based Recommendation focuses on product category preferences, while Collaborative Filtering Recommendation uses customer purchase patterns to provide personalized recommendations. Let's explore more about how these two approaches can improve the customer shopping experience and provide valuable insights for business strategy.


Category-Based Recommendation - By Category

The category-based recommendation approach utilizes information about products that customers have liked or previously purchased to suggest other products in the same category. The principle is that customers who like products in a category may also be interested in other products in the same category. Below is an example of applying category-based recommendations using the Northwind dataset.

Identifying the Most Popular Products

Cypher
MATCH (cust:Customer)-[:PURCHASED]->(:Order)-[o:ORDERS]->(p:Product), (p)-[:PART_OF]->(c:Category)
RETURN DISTINCT cust.customerID as CustomerID, c.categoryName AS ProductCategory, p.productName AS Product, SUM(o.quantity) AS
TotalProductsPurchased
ORDER BY TotalProductsPurchased DESC

This query involves searching for patterns of product purchases by customers and generating summary statistics, such as product category, products purchased, and total number of purchases.

  • 1. Matching Graph Patterns

    Using MATCH to match graph patterns. The cust node represents a customer, the [:PURCHASED] relation relates customers to orders, the [o:ORDERS] relation relates orders to products by storing the quantity, and the [:PART_OF] relation relates products to categories.

  • Cypher
    MATCH (cust:Customer)-[:PURCHASED]->(:Order)-[o:ORDERS]->(p:Product), (p)-[:PART_OF]->(c:Category)
  • 2. Return the Expected Data
  • Cypher
    RETURN DISTINCT cust.customerID as CustomerID, c.categoryName AS ProductCategory, p.productName AS Product, SUM(o.quantity) AS TotalProductsPurchased

    Use RETURN to specify the columns to be returned:

    • cust.contactName : Customer ID.
    • c.categoryName : Name of the product category.
    • p.productName : Product name.
    • SUM(o.quantity) : The total number of products purchased, calculated using the SUM aggregation function from the quantity in the [o:ORDERS] relation.
  • 3. Using DISTINCT

    Using DISTINCT to ensure that each row of the returned result is unique.

  • 4. Sorting the Results
  • Cypher
    ORDER BY TotalProductsPurchased DESC

    Using ORDER BY to sort the results by the total number of products purchased in descending order (DESC).

    From the query results, we can get useful insights or understanding related to product purchasing patterns by customers. "Guaraná Fantástica" and "Konbu" were the two best-selling products with a total of 248 purchases each. In third place was "Camembert Pierrot" with 243 total purchases.


New Product Recommendation

Cypher
MATCH (c:Customer)-[:PURCHASED]->(o:Order)-[:ORDERS]->(p:Product)<-[o2:ORDERS]-(:Order)-[:ORDERS]->(p2:Product)-[:PART_OF]->(cat:Category)<-[:PART_OF]-(p)
WHERE c.customerID = 'QUICK' AND NOT((c)-[:PURCHASED]->(:Order)-[:ORDERS]->(p2)) AND p.productName = 'Camembert Pierrot'
RETURN p.productName AS has_purchased, cat.categoryName AS category, p2.productName AS recommendation, SUM(o2.quantity) AS
TotalProductsPurchased
ORDER BY TotalProductsPurchased DESC
  • 1. Matching Graph Patterns
  • Cypher
    MATCH (c:Customer)-[:PURCHASED]->(o:Order)-[:ORDERS]->(p:Product)<-[:ORDERS]-(o2:Order)-[:ORDERS]->(p2:Product)-[:PART_OF]->(cat:Category)<-[:PART_OF]-(p)

Using MATCH to match graph patterns. Here:

  • c:Customer : Customer node with label Customer.
  • o:Order : Order node with label Order.
  • p:Product : Product node with label Product.
  • p2:Product : The second product node.
  • cat:Category : Category node with the label Category.
  • This pattern includes :PURCHASED , o2:ORDERS, and :PART_OF relations between nodes that create product, order, and category purchase paths.
  • 2. Filter Customers and Products Purchased
  • Cypher
    WHERE c.customerID = 'QUICK' AND NOT((c)-[:PURCHASED]->(:Order)-[:ORDERS]->(p2)) AND p.productName = 'Camembert Pierrot'

Using WHERE to apply a filter for customers with ID 'QUICK' who have not purchased the same product (Camembert Pierrot) and retrieve the product with that name.

  • 3. Return the Expected Data
  • Cypher
    RETURN p.productName AS has_purchased, cat.categoryName AS category, p2.productName AS recommendation, SUM(o2.quantity) AS TotalProductsPurchased

Use RETURN to return specific columns:

  • p.productName : The name of the product that the customer has purchased.
  • c.categoryName : The name of the product category.
  • p2.productName : The name of the recommended product.
  • SUM(o.quantity) : The total number of products purchased, calculated using the aggregation function SUM of the quantities (quantity) in the relation [o2:ORDERS].
  • 4. Sorting the Results
  • Cypher
    ORDER BY TotalProductsPurchased DESC

Using ORDER BY to sort the results by the number of total products purchased of the recommended product order in descending order (DESC).


Result

In the query result, there are other product recommendations from the "Dairy Products" category that may be of interest to 'QUICK' customers based on the purchase pattern of other customers who also purchased 'Camembert Pierrot'. "Queso Manchego La Pastora" is the top recommendation with a total of 148 purchases by other customers. "Mascarpone Fabioli" was also recommended, although with a lower total purchase (8).

  • • First Query: Identifying Key Purchases
    • A customer with ID "QUICK" frequently purchases products recorded in the Northwind database. In an effort to understand this customer's preferences and purchasing patterns, the first query was conducted to find out what products he buys the most.
    • The results of the first query show that the product "Camembert Pierrot" from the "Dairy Products" category is one of the most frequently purchased products by customer "QUICK" with a total purchase of 243 units.
  • • Second Query: Improve Product Recommendations
    • By knowing the featured products of "QUICK" customers, the next step is to improve their shopping experience by recommending new products that have never been purchased by those customers before.
    • A second query was conducted to find products in the same category as "Camembert Pierrot" that have never been purchased by "QUICK" customers.
    • The result of the second query shows that "QUICK" customers can be directed to the product "Queso Manchego La Pastora" from the category "Dairy Products" as a recommendation. This product was chosen because it has the same category as the product that has been purchased by the customer, hoping to offer a new variety and fulfill their needs.
  • • New Product Recommendation
    • "QUICK" customers received a recommendation to try a new product, "Queso Manchego La Pastora," a product from the same category as their favorite, "Camembert Pierrot."
    • Customers may consider trying this new product, increasing the variety of products they purchase and feeling more included in their shopping experience.

By using this kind of query, companies can improve customer experience by providing relevant product recommendations based on other customers' purchasing behavior. This can help in marketing strategies, increasing sales, and even improving customer satisfaction.


Collaborative Filtering Recommendation - By Product Purchase

Collaborative filtering is a technique in recommender systems where recommendations are provided based on behavioral patterns and preferences of users similar to the target user. The principle is that recommending products based on similarities in purchases by other customers, without a specific focus on a particular category. Below is an example of applying collaborative filtering recommendations using the Northwind dataset.

Find Behavioral Patterns

Cypher
MATCH (targetCustomer:Customer {customerID: 'CONSH'})-[:PURCHASED]->(:Order)-[:ORDERS]->(p1:Product) WITH targetCustomer, p1
MATCH (otherCustomer:Customer)-[:PURCHASED]->(order:Order)-[:ORDERS]-(p2:Product)
WHERE otherCustomer <> targetCustomer AND NOT (targetCustomer)-[:PURCHASED]->(p2)
WITH targetCustomer, p2, COUNT(order) AS purchaseFrequency
RETURN p2.productName AS RecommendedProduct, purchaseFrequency AS PurchaseFrequency
ORDER BY purchaseFrequency DESC LIMIT 6;

The purpose of this query is to provide product recommendations to customers with ID 'CONSH' based on product purchasing patterns by other customers who have similar purchasing preferences.

  • 1. Search for Target Customer Purchases
  • Cypher
    MATCH (targetCustomer:Customer {customerID: 'CONSH'})-[:PURCHASED]->(:Order)-[:ORDERS]->(p1:Product) WITH targetCustomer, p1

Retrieves the products that have been purchased by the target customer ('CONSH').

Using MATCH to match the graph pattern. Here:

  • targetCustomer : The target customer node with customerID 'CONSH'.
  • p1 : The product node purchased by the target customer.
  • 2. Calculating the Frequency of Purchases by Other Customers
Cypher
MATCH (otherCustomer:Customer)-[:PURCHASED]->(order:Order)-[:ORDERS]-(p2:Product)
WHERE otherCustomer <> targetCustomer AND NOT (targetCustomer)-[:PURCHASED]->(p2)
WITH targetCustomer, p2, COUNT(order) AS purchaseFrequency

Search for products purchased by other customers and ensure that we only consider customers who are different from the target customer and products that have never been purchased by the target customer. Then calculate the frequency of purchase of each product by other customers.

Uses MATCH to match graph patterns.

  • otherCustomer : Another customer node that is not the same as the target customer.
  • p2 : A product node that is different from the product purchased by the target customer.
  • Using WHERE to filter otherCustomer instead of targetCustomer. Also ensures that targetCustomer has not purchased product p2.
  • Uses WITH to store the previous matching result and count the number of orders placed by otherCustomer.
  • 3. Return Product Recommendations
Cypher
RETURN p2.productName AS RecommendedProduct, purchaseFrequency AS PurchaseFrequency
ORDER BY purchaseFrequency DESC
LIMIT 6;

Use RETURN to return a product name (RecommendedProduct) containing product recommendations and their purchase frequency (PurchaseFrequency) for product p2, sorted in descending order based on the product's purchase frequency and limit the number of recommendations displayed.


Result

From the query results, we can infer some insights related to product recommendations to 'CONSH' customers based on other customers' purchasing patterns:

  • • Most Recommended Product
  • "Raclette Courdavault" is the most recommended product with a purchase frequency of 378. This shows that this product is highly demanded by other customers who share similar preferences with 'CONSH' customers.

  • • Most Recommended Product
  • "Guaraná Fantástica", "Camembert Pierrot", and "Gorgonzola Telino" have a high frequency of purchase with a score of 357 each. These products are also top recommendations with high popularity.

  • • Diversification of Recommended Products
  • Product recommendations include various types, such as beverages ("Guaraná Fantástica"), cheeses ("Raclette Courdavault", "Camembert Pierrot", "Gorgonzola Telino"), pasta ("Gnocchi di nonna Alice"), and cakes ("Tarte au sucre"). This shows the variety of products recommended to 'CONSH' customers.

  • • Potential 'CONSH' Customer Favorites
  • Based on the recommendations, we can assume that customer 'CONSH' is likely to have a preference for food products such as cheese, drinks, pasta, and cake based on the purchasing patterns of other customers.

  • • Potential Marketing Strategy
  • This information can be used to develop more targeted marketing strategies, such as special promotions, bundle packages, or discount campaigns for the most recommended products.

By understanding the results of this query, companies can improve the shopping experience of 'CONSH' customers by providing product recommendations that are more in line with their preferences.