Skip to main content

Overview

In this guide, you’ll learn how to use the Oxla database with eCommerce datasets through various sample queries. These datasets include information about sales transactions, such as customer details, supplier data, and order details. You’ll explore how to analyze data by joining related tables and running queries to support informed decision-making for your business. Data Source: https://github.com/apache/doris/tree/master/tools/ssb-tools .
Prerequisite
Ensure you’ve created the Data Storage with the appropriate demo dataset, the Cluster is running, and you’re connected to it using the PostgreSQL client. For more details, refer to the Quickstart - Oxla SaaS page.

Datasets Structure

Let’s explore the tables and their structures to understand better and fully utilize the eCommerce dataset.
Attribute NameType
c_custkeyInteger
c_nameText
c_addressText
c_cityText
c_nationText
c_regionText
c_phoneText
c_mktsegmentText
Attribute NameType
lo_orderkeyInteger
lo_linenumberInteger
lo_custkeyInteger
lo_partkeyInteger
lo_suppkeyInteger
lo_orderdateDate
lo_orderpriorityText
lo_shippriorityInteger
lo_quantityInteger
lo_extendedpriceInteger
lo_ordtotalpriceInteger
lo_discountInteger
lo_revenueInteger
lo_supplycostInteger
lo_taxInteger
lo_commitdateDate
lo_shipmodeText
Attribute NameType
s_suppkeyInteger
s_nameText
s_addressText
s_cityText
s_nationText
s_regionText
s_phoneText
Attribute NameType
p_partkeyInteger
p_nameText
p_mfgrText
p_categoryText
p_brandText
p_colorText
p_typeText
p_sizeInteger
p_containerText

Sample Queries

Sample 1: Find Customers with the Most Total Order

This example calculates the total order quantity for each customer by joining the customer and lineorder tables.
SELECT c.c_name AS customer_name, SUM(l.lo_quantity) AS total_order_quantity
FROM customer c
JOIN lineorder l ON c.c_custkey = l.lo_custkey
GROUP BY c.c_name
ORDER BY total_order_quantity DESC
LIMIT 10;
The result displays the 10 customer names with the highest total order quantity.
   customer_name    | total_order_quantity 
--------------------+----------------------
 Customer#002750845 |                15982
 Customer#002399458 |                15912
 Customer#001711105 |                15809
 Customer#001709917 |                15740
 Customer#000274669 |                15682
 Customer#001900141 |                15637
 Customer#001435495 |                15574
 Customer#002980366 |                15567
 Customer#000184897 |                15494
 Customer#002495023 |                15491
(10 rows)

Sample 2: Average Order Quantity and Revenue per Customer

In this example, we aim to retrieve the average order quantity and revenue per customer. We will use the LIMIT clause to find the top 5 customers based on their revenue.
SELECT c.c_name AS customer_name, 
       AVG(l.lo_quantity) AS avg_order_quantity, 
       AVG(l.lo_revenue) AS avg_order_revenue
FROM customer c
JOIN lineorder l ON c.c_custkey = l.lo_custkey
GROUP BY c.c_name
ORDER BY avg_order_revenue DESC
LIMIT 5;
This query displays the average order quantities and revenues for the top 5 customers.
   customer_name    | avg_order_quantity | avg_order_revenue 
--------------------+--------------------+-------------------
 Customer#002535380 | 31.348148148148148 |  4482488.65925926
 Customer#002547308 | 31.519083969465647 | 4419699.503816794
 Customer#002906426 | 29.959798994974875 | 4371241.924623116
 Customer#000640106 | 29.676470588235293 | 4370783.544117647
 Customer#002794070 | 29.681528662420384 | 4370103.968152867
(5 rows)

Sample 3: Find the Top 3 Sold Parts

This query lists the top 3 parts by total revenue and total quantity sold.
SELECT p.p_name AS part_name, 
       SUM(l.lo_quantity) AS total_quantity_sold, 
       SUM(l.lo_revenue) AS total_revenue
FROM lineorder l
JOIN part p ON l.lo_partkey = p.p_partkey
GROUP BY p.p_name
ORDER BY total_revenue DESC
LIMIT 3;
It displays the 3 parts that are most sold based on the quantity sold.
     part_name     | total_quantity_sold | total_revenue 
-------------------+---------------------+---------------
 aquamarine beige  |             3871448 |  562986438412
 aquamarine black  |             3662718 |  521680920074
 aquamarine bisque |             3692528 |  519083669350
(3 rows)

Sample 4: Monthly Revenue and Order Count per Supplier

This example shows how to calculate each supplier’s monthly revenue and order count.
SELECT s.s_name AS supplier_name, 
       EXTRACT(YEAR FROM l.lo_orderdate) AS order_year,
       EXTRACT(MONTH FROM l.lo_orderdate) AS order_month,
       COUNT(*) AS order_count,
       SUM(l.lo_revenue) AS total_revenue
FROM lineorder l
JOIN supplier s ON l.lo_suppkey = s.s_suppkey
JOIN part p ON l.lo_partkey = p.p_partkey
GROUP BY s.s_name, order_year, order_month
ORDER BY order_year, order_month, total_revenue DESC
LIMIT 10;
You will get the total revenue for each supplier during the respective period.
   supplier_name    | order_year | order_month | order_count | total_revenue 
--------------------+------------+-------------+-------------+---------------
 Supplier#000178170 |       1992 |           1 |          61 |     272171509
 Supplier#000024596 |       1992 |           1 |          64 |     261609404
 Supplier#000056984 |       1992 |           1 |          55 |     260734222
 Supplier#000163429 |       1992 |           1 |          56 |     259933505
 Supplier#000054770 |       1992 |           1 |          59 |     259102073
 Supplier#000141166 |       1992 |           1 |          63 |     259080854
 Supplier#000005060 |       1992 |           1 |          61 |     258319232
 Supplier#000065417 |       1992 |           1 |          59 |     256780075
 Supplier#000040250 |       1992 |           1 |          59 |     256695041
 Supplier#000171886 |       1992 |           1 |          56 |     255549592
(10 rows)