Bundled Items

ASKED IN INTERVIEW
4pts
Amazon

Problem Statement

Amazon's recommendation engine needs to know which products have a high affinity for one another. A bundle is defined as any two unique products purchased within the same order.

Write an SQL query to find the top 5 most frequent product pairs.

Rules:

  • A pair consists of product_a and product_b.
  • To avoid duplicates, product_a must always be strictly less than product_b (alphabetically or numerically).
  • Output columns: product_a, product_b, bundle_count.
  • Results must be returned in descending order of bundle_count.
  • Only return the top 5 results.

Table Schema:

  • OrderItems: order_id, product_id.
Tests your understanding of
Self Join, Aggregation, Filtering and Market Basket Analysis

Input Tables

OrderItems
order_id(INTEGER)product_id(INTEGER)
1101
1102
2101
2102
2103
3101
3102
4104
4105

Expected Output

product_a(INTEGER)product_b(INTEGER)bundle_count(INTEGER)
1011023
1011031
1021031
1041051

Tags

MediumASKED IN INTERVIEWSelf JoinAggregationFilteringMarket Basket Analysis
15-20 min
47%

Hints