Optimizing E-Commerce Strategy

A Data-Driven Deep Dive into Olist's Business Performance (2016-2018)

1. Executive Summary

Context

Olist is one of the largest e-commerce marketplaces in Brazil. Between 2016 and 2018, the platform experienced rapid transaction growth. However, to sustain profitability and ensure high customer satisfaction, management needed clear visibility into the company's operational bottlenecks, sales trends, and customer behavior.

Business Objectives

To transform raw transactional data into actionable business strategies, this project seeks to answer five key Business Objectives:

  1. Is the business growing consistently over time, and what are the peak sales periods?
  2. Which product categories drive the most revenue while maintaining high customer satisfaction?
  3. Which geographic regions (states) suffer from the highest rates of delivery delays?
  4. Who are the most valuable "VIP" customers based on total monetary value spent?
  5. What are the preferred payment methods, and how do installments affect average order value?

2. The Approach & Data Engineering

To answer these Business Objectives, a robust data pipeline was developed, transitioning from raw data extraction to visual storytelling.

Data Source

The data used is the "Brazilian E-Commerce Public Dataset by Olist," sourced from Kaggle. It contains over 100,000 anonymized orders from 2016 to 2018, spread across multiple relational tables (Orders, Customers, Payments, Products, and Reviews).

Data Processing (SQL)

PostgreSQL was used to clean, join, and aggregate the raw data. Click on the toggles below to view the specific queries designed to process the data for each business bjectives:

Query 1: Monthly Revenue Trend
SELECT 
    DATE_TRUNC('month', o.order_purchase_timestamp::TIMESTAMP)::DATE AS order_month,
    COUNT(DISTINCT o.order_id) AS total_orders,
    ROUND(SUM(op.payment_value)::NUMERIC, 2) AS total_revenue
FROM orders o
JOIN order_payments op ON o.order_id = op.order_id
WHERE o.order_status = 'delivered' 
  AND o.order_purchase_timestamp IS NOT NULL
GROUP BY order_month
ORDER BY order_month ASC;
Query 2: Top 10 Product Categories by Revenue & Rating
SELECT 
    pt.product_category_name_english AS category_name,
    COUNT(oi.product_id) AS total_items_sold,
    ROUND(SUM(oi.price)::NUMERIC, 2) AS total_sales_revenue,
    ROUND(AVG(r.review_score)::NUMERIC, 2) AS average_rating
FROM order_items oi
JOIN products p ON oi.product_id = p.product_id
JOIN product_category_name_translation pt ON p.product_category_name = pt.product_category_name
JOIN orders o ON o.order_id = oi.order_id
LEFT JOIN order_reviews r ON o.order_id = r.order_id
WHERE o.order_status = 'delivered'
GROUP BY category_name
ORDER BY total_sales_revenue DESC
LIMIT 10;
Query 3: Logistics Performance (Late Deliveries by State)
SELECT 
    c.customer_state,
    COUNT(o.order_id) AS total_orders,
    ROUND(AVG(EXTRACT(DAY FROM (o.order_delivered_customer_date::TIMESTAMP - o.order_purchase_timestamp::TIMESTAMP))), 1) AS avg_delivery_days,
    SUM(CASE WHEN o.order_delivered_customer_date::TIMESTAMP > o.order_estimated_delivery_date::TIMESTAMP THEN 1 ELSE 0 END) AS late_deliveries,
    ROUND((SUM(CASE WHEN o.order_delivered_customer_date::TIMESTAMP > o.order_estimated_delivery_date::TIMESTAMP THEN 1 ELSE 0 END) * 100.0) / COUNT(o.order_id), 2) AS late_delivery_percentage
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_status = 'delivered' 
  AND o.order_delivered_customer_date IS NOT NULL
GROUP BY c.customer_state
HAVING COUNT(o.order_id) > 100 
ORDER BY late_delivery_percentage DESC;
Query 4: Customer Segmentation (RFM: Monetary Value)
SELECT 
    c.customer_unique_id,
    c.customer_city,
    c.customer_state,
    COUNT(DISTINCT o.order_id) AS total_transactions,
    SUM(op.payment_value) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_payments op ON o.order_id = op.order_id
WHERE o.order_status = 'delivered'
GROUP BY c.customer_unique_id, c.customer_city, c.customer_state
ORDER BY total_spent DESC
LIMIT 10;
Query 5: Payment Behavior & Installments
SELECT 
    payment_type,
    COUNT(order_id) AS total_transactions,
    ROUND(AVG(payment_value)::NUMERIC, 2) AS average_transaction_value,
    MAX(payment_installments) AS max_installments,
    ROUND(AVG(payment_installments)::NUMERIC, 1) AS avg_installments
FROM order_payments
GROUP BY payment_type
ORDER BY total_transactions DESC;

3. Interactive Dashboard & Results

After processing the data using the SQL queries above, the structured datasets were connected to Tableau. Explore the interactive executive dashboard below to dynamically filter the processed metrics.

Based on the interactive dashboard and the underlying data, the following results were identified:

  • Revenue Trends: The company experienced steady growth, peaking significantly in November 2017. This surge is heavily attributed to Black Friday promotions. However, post-November sales immediately dropped, indicating a reliance on major seasonal events rather than steady organic retention.
  • Category Performance: Health & Beauty and Watches & Gifts are the most lucrative product categories. Not only do they generate the highest total revenue, but they also maintain excellent average review scores (>4.0), making them high-value, low-risk segments.
  • Logistics Bottlenecks: There is a severe disparity in delivery performance across Brazil. Customers in Northern states, specifically AL (Alagoas) and MA (Maranhão), experience the highest percentage of late deliveries. This directly correlates with a drop in customer satisfaction scores in those regions.
  • Customer Concentration: The "Top 10 VIP Customers" analysis shows that the highest spenders are highly concentrated in major economic hubs like Rio de Janeiro (RJ) and São Paulo (SP).
  • Payment Behavior: Credit cards are the overwhelmingly dominant payment method. Furthermore, the data reveals that customers who utilize credit card installments tend to have a significantly higher average transaction value compared to those paying via single-payment methods like Boleto.

4. Strategic Recommendations

By translating the data into actionable insights, Olist management can adopt the following strategies to optimize business performance:

🎯 Optimize Marketing Allocation

Shift marketing budgets to heavily promote the Health & Beauty and Watches & Gifts categories during non-peak months to stabilize the sharp revenue drop typically seen after Black Friday.

🚚 Restructure Northern Logistics

Immediate operational focus is required for routes heading to AL and MA. Consider auditing logistics partners in these states or establishing intermediate fulfillment centers closer to the North.

💎 Launch a VIP Loyalty Program

Implement an exclusive loyalty tier for the top-spending customers identified in SP and RJ. Offering perks such as expedited shipping will increase Customer Lifetime Value (CLV).

💳 Leverage Credit Incentives

Knowing that installments drive higher average order values, partner with major banks to offer "0% interest installment" promotions or targeted cashback on high-ticket items to encourage upselling.