Overview

In this guide, you’ll learn how to use the Oxla database with eCommerce datasets through various sample queries. These datasets include information about sales transactions, such as customer details, supplier data, and order details.

You’ll explore how to analyze data by joining related tables and running queries to support informed decision-making for your business.

Data Source: https://github.com/apache/doris/tree/master/tools/ssb-tools .

Prerequisite
Ensure you’ve created the Data Storage with the appropriate demo dataset, the Cluster is running, and you’re connected to it using the PostgreSQL client. For more details, refer to the Quickstart - Oxla SaaS page.

Datasets Structure

Let’s explore the tables and their structures to understand better and fully utilize the eCommerce dataset.

Sample Queries

Sample 1: Find Customers with the Most Total Order

This example calculates the total order quantity for each customer by joining the customer and lineorder tables.

SELECT c.c_name AS customer_name, SUM(l.lo_quantity) AS total_order_quantity
FROM customer c
JOIN lineorder l ON c.c_custkey = l.lo_custkey
GROUP BY c.c_name
ORDER BY total_order_quantity DESC
LIMIT 10;

The result displays the 10 customer names with the highest total order quantity.

   customer_name    | total_order_quantity 
--------------------+----------------------
 Customer#002750845 |                15982
 Customer#002399458 |                15912
 Customer#001711105 |                15809
 Customer#001709917 |                15740
 Customer#000274669 |                15682
 Customer#001900141 |                15637
 Customer#001435495 |                15574
 Customer#002980366 |                15567
 Customer#000184897 |                15494
 Customer#002495023 |                15491
(10 rows)

Sample 2: Average Order Quantity and Revenue per Customer

In this example, we aim to retrieve the average order quantity and revenue per customer. We will use the LIMIT clause to find the top 5 customers based on their revenue.

SELECT c.c_name AS customer_name, 
       AVG(l.lo_quantity) AS avg_order_quantity, 
       AVG(l.lo_revenue) AS avg_order_revenue
FROM customer c
JOIN lineorder l ON c.c_custkey = l.lo_custkey
GROUP BY c.c_name
ORDER BY avg_order_revenue DESC
LIMIT 5;

This query displays the average order quantities and revenues for the top 5 customers.

   customer_name    | avg_order_quantity | avg_order_revenue 
--------------------+--------------------+-------------------
 Customer#002535380 | 31.348148148148148 |  4482488.65925926
 Customer#002547308 | 31.519083969465647 | 4419699.503816794
 Customer#002906426 | 29.959798994974875 | 4371241.924623116
 Customer#000640106 | 29.676470588235293 | 4370783.544117647
 Customer#002794070 | 29.681528662420384 | 4370103.968152867
(5 rows)

Sample 3: Find the Top 3 Sold Parts

This query lists the top 3 parts by total revenue and total quantity sold.

SELECT p.p_name AS part_name, 
       SUM(l.lo_quantity) AS total_quantity_sold, 
       SUM(l.lo_revenue) AS total_revenue
FROM lineorder l
JOIN part p ON l.lo_partkey = p.p_partkey
GROUP BY p.p_name
ORDER BY total_revenue DESC
LIMIT 3;

It displays the 3 parts that are most sold based on the quantity sold.

     part_name     | total_quantity_sold | total_revenue 
-------------------+---------------------+---------------
 aquamarine beige  |             3871448 |  562986438412
 aquamarine black  |             3662718 |  521680920074
 aquamarine bisque |             3692528 |  519083669350
(3 rows)

Sample 4: Monthly Revenue and Order Count per Supplier

This example shows how to calculate each supplier’s monthly revenue and order count.

SELECT s.s_name AS supplier_name, 
       EXTRACT(YEAR FROM l.lo_orderdate) AS order_year,
       EXTRACT(MONTH FROM l.lo_orderdate) AS order_month,
       COUNT(*) AS order_count,
       SUM(l.lo_revenue) AS total_revenue
FROM lineorder l
JOIN supplier s ON l.lo_suppkey = s.s_suppkey
JOIN part p ON l.lo_partkey = p.p_partkey
GROUP BY s.s_name, order_year, order_month
ORDER BY order_year, order_month, total_revenue DESC
LIMIT 10;

You will get the total revenue for each supplier during the respective period.

   supplier_name    | order_year | order_month | order_count | total_revenue 
--------------------+------------+-------------+-------------+---------------
 Supplier#000178170 |       1992 |           1 |          61 |     272171509
 Supplier#000024596 |       1992 |           1 |          64 |     261609404
 Supplier#000056984 |       1992 |           1 |          55 |     260734222
 Supplier#000163429 |       1992 |           1 |          56 |     259933505
 Supplier#000054770 |       1992 |           1 |          59 |     259102073
 Supplier#000141166 |       1992 |           1 |          63 |     259080854
 Supplier#000005060 |       1992 |           1 |          61 |     258319232
 Supplier#000065417 |       1992 |           1 |          59 |     256780075
 Supplier#000040250 |       1992 |           1 |          59 |     256695041
 Supplier#000171886 |       1992 |           1 |          56 |     255549592
(10 rows)