A Data-Driven Deep Dive into Olist's Business Performance (2016-2018)
Olist is one of the largest e-commerce marketplaces in Brazil. Between 2016 and 2018, the platform experienced rapid transaction growth. However, to sustain profitability and ensure high customer satisfaction, management needed clear visibility into the company's operational bottlenecks, sales trends, and customer behavior.
To transform raw transactional data into actionable business strategies, this project seeks to answer five key Business Objectives:
To answer these Business Objectives, a robust data pipeline was developed, transitioning from raw data extraction to visual storytelling.
The data used is the "Brazilian E-Commerce Public Dataset by Olist," sourced from Kaggle. It contains over 100,000 anonymized orders from 2016 to 2018, spread across multiple relational tables (Orders, Customers, Payments, Products, and Reviews).
PostgreSQL was used to clean, join, and aggregate the raw data. Click on the toggles below to view the specific queries designed to process the data for each business bjectives:
SELECT
DATE_TRUNC('month', o.order_purchase_timestamp::TIMESTAMP)::DATE AS order_month,
COUNT(DISTINCT o.order_id) AS total_orders,
ROUND(SUM(op.payment_value)::NUMERIC, 2) AS total_revenue
FROM orders o
JOIN order_payments op ON o.order_id = op.order_id
WHERE o.order_status = 'delivered'
AND o.order_purchase_timestamp IS NOT NULL
GROUP BY order_month
ORDER BY order_month ASC;
SELECT
pt.product_category_name_english AS category_name,
COUNT(oi.product_id) AS total_items_sold,
ROUND(SUM(oi.price)::NUMERIC, 2) AS total_sales_revenue,
ROUND(AVG(r.review_score)::NUMERIC, 2) AS average_rating
FROM order_items oi
JOIN products p ON oi.product_id = p.product_id
JOIN product_category_name_translation pt ON p.product_category_name = pt.product_category_name
JOIN orders o ON o.order_id = oi.order_id
LEFT JOIN order_reviews r ON o.order_id = r.order_id
WHERE o.order_status = 'delivered'
GROUP BY category_name
ORDER BY total_sales_revenue DESC
LIMIT 10;
SELECT
c.customer_state,
COUNT(o.order_id) AS total_orders,
ROUND(AVG(EXTRACT(DAY FROM (o.order_delivered_customer_date::TIMESTAMP - o.order_purchase_timestamp::TIMESTAMP))), 1) AS avg_delivery_days,
SUM(CASE WHEN o.order_delivered_customer_date::TIMESTAMP > o.order_estimated_delivery_date::TIMESTAMP THEN 1 ELSE 0 END) AS late_deliveries,
ROUND((SUM(CASE WHEN o.order_delivered_customer_date::TIMESTAMP > o.order_estimated_delivery_date::TIMESTAMP THEN 1 ELSE 0 END) * 100.0) / COUNT(o.order_id), 2) AS late_delivery_percentage
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_status = 'delivered'
AND o.order_delivered_customer_date IS NOT NULL
GROUP BY c.customer_state
HAVING COUNT(o.order_id) > 100
ORDER BY late_delivery_percentage DESC;
SELECT
c.customer_unique_id,
c.customer_city,
c.customer_state,
COUNT(DISTINCT o.order_id) AS total_transactions,
SUM(op.payment_value) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_payments op ON o.order_id = op.order_id
WHERE o.order_status = 'delivered'
GROUP BY c.customer_unique_id, c.customer_city, c.customer_state
ORDER BY total_spent DESC
LIMIT 10;
SELECT
payment_type,
COUNT(order_id) AS total_transactions,
ROUND(AVG(payment_value)::NUMERIC, 2) AS average_transaction_value,
MAX(payment_installments) AS max_installments,
ROUND(AVG(payment_installments)::NUMERIC, 1) AS avg_installments
FROM order_payments
GROUP BY payment_type
ORDER BY total_transactions DESC;
After processing the data using the SQL queries above, the structured datasets were connected to Tableau. Explore the interactive executive dashboard below to dynamically filter the processed metrics.
Based on the interactive dashboard and the underlying data, the following results were identified:
By translating the data into actionable insights, Olist management can adopt the following strategies to optimize business performance:
Shift marketing budgets to heavily promote the Health & Beauty and Watches & Gifts categories during non-peak months to stabilize the sharp revenue drop typically seen after Black Friday.
Immediate operational focus is required for routes heading to AL and MA. Consider auditing logistics partners in these states or establishing intermediate fulfillment centers closer to the North.
Implement an exclusive loyalty tier for the top-spending customers identified in SP and RJ. Offering perks such as expedited shipping will increase Customer Lifetime Value (CLV).
Knowing that installments drive higher average order values, partner with major banks to offer "0% interest installment" promotions or targeted cashback on high-ticket items to encourage upselling.