🧠 Objective

Segment customers of a UK-based online gift retailer using transaction data to uncover behavior-driven groups for targeted marketing.

📦 Dataset

  • Transactions from Dec 2010 to Dec 2011
  • 541,909 records, 8 columns
  • Key columns: InvoiceDate, Quantity, UnitPrice, CustomerID, Country

🔍 Methodology

  1. Data Wrangling:

    • Removed canceled orders and missing customer IDs
    • Filtered out rows where Quantity or UnitPrice was zero or negative
    • Parsed dates into datetime format
  2. Feature Engineering:

    • Built RFM features:
      • Recency: Days since last purchase
      • Frequency: Total purchases
      • Monetary: Total amount spent
  3. Preprocessing:

    • Log-transformed and scaled RFM values
    • Applied PCA for 2D visualization
  4. Clustering (K-Means):

    • Chose k = 3 as the optimal cluster count using the elbow and silhouette methods
    • Assigned customers to 3 clusters
  5. Visualization:

    • PCA scatter plot of clusters
    • Boxplots and heatmap of RFM by cluster
    • Choropleth map of customer countries
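
The steps above can be sketched end to end as follows. This is a minimal illustration rather than the notebook's actual code: the data is synthetic, the "C" prefix on InvoiceNo as the cancellation marker is an assumption about the dataset, Frequency is assumed to mean distinct invoices per customer, and the helper names (clean, rfm, snapshot, Amount) are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the transaction table (illustration only, not the real data).
rng = np.random.default_rng(0)
n = 600
invoice_ids = rng.integers(1, 200, n).astype(str)
cancel_prefix = np.where(rng.random(n) < 0.05, "C", "")  # assumed cancellation marker
df = pd.DataFrame({
    "InvoiceNo": np.char.add(cancel_prefix, invoice_ids),
    "InvoiceDate": pd.Timestamp("2010-12-01")
    + pd.to_timedelta(rng.integers(0, 373, n), unit="D"),
    "Quantity": rng.integers(-2, 20, n),
    "UnitPrice": rng.uniform(0.0, 10.0, n).round(2),
    "CustomerID": rng.choice(np.append(np.arange(1, 61, dtype=float), np.nan), n),
})

# 1. Data wrangling: parse dates, drop missing IDs, cancellations, non-positive values.
df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])
clean = df.dropna(subset=["CustomerID"])
clean = clean[~clean["InvoiceNo"].str.startswith("C")]
clean = clean[(clean["Quantity"] > 0) & (clean["UnitPrice"] > 0)].copy()
clean["Amount"] = clean["Quantity"] * clean["UnitPrice"]

# 2. Feature engineering: one RFM row per customer.
snapshot = clean["InvoiceDate"].max() + pd.Timedelta(days=1)  # reference date
rfm = clean.groupby("CustomerID").agg(
    Recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),
    Frequency=("InvoiceNo", "nunique"),  # assumption: purchases = distinct invoices
    Monetary=("Amount", "sum"),
)

# 3. Preprocessing: log-transform to tame skew, then standardize.
X = StandardScaler().fit_transform(np.log1p(rfm))

# 4. Clustering: record inertia (elbow) and silhouette for a range of k.
inertia, scores = {}, {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertia[k] = model.inertia_
    scores[k] = silhouette_score(X, model.labels_)
best_k = max(scores, key=scores.get)  # may differ from 3 on synthetic data

# Fit with k = 3 to mirror the text, then project to 2D for plotting.
rfm["Cluster"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
coords = PCA(n_components=2).fit_transform(X)
rfm["PC1"], rfm["PC2"] = coords[:, 0], coords[:, 1]
```

The inertia and scores dictionaries hold the values one would plot for the elbow and silhouette checks, and PC1/PC2 are the coordinates for the PCA scatter plot.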

🧩 Segment Profiles

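| Cluster | Recency | Frequency | Monetary | Segment Type         |
|---------|---------|-----------|----------|----------------------|
| 0       | Low     | High      | High     | 💎 Lapsed VIPs        |
| 1       | High    | Low       | Low      | 🧊 One-Time Buyers    |
| 2       | Medium  | Medium    | Medium   | ⏳ Mid-Tier Customers |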
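
One way to derive Low/Medium/High labels like those in the table is to rank the per-cluster medians of each RFM feature. The sketch below assumes that approach; the RFM values are hypothetical and chosen only to reproduce the pattern in the table, and the names profile and levels are illustrative.

```python
import pandas as pd

# Hypothetical per-customer RFM values with cluster labels (illustration only).
rfm = pd.DataFrame({
    "Recency":   [5, 12, 8, 300, 250, 280, 60, 90, 75],
    "Frequency": [20, 15, 18, 1, 1, 2, 5, 6, 4],
    "Monetary":  [5000.0, 4200.0, 4600.0, 120.0, 90.0, 150.0, 900.0, 1100.0, 800.0],
    "Cluster":   [0, 0, 0, 1, 1, 1, 2, 2, 2],
})

# Median of each feature per cluster (robust to a few extreme spenders).
profile = rfm.groupby("Cluster")[["Recency", "Frequency", "Monetary"]].median()

# Rank the three clusters within each feature and map ranks to labels.
rank_to_label = {1: "Low", 2: "Medium", 3: "High"}
levels = profile.rank().astype(int).apply(lambda col: col.map(rank_to_label))
print(levels)
```

This ranks raw values, so Recency here is a day count. If recency were instead converted to a score where a low value means a long gap since the last purchase, the ordering of that column would flip.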

🧠 Business Insight

  • Cluster 0: High spenders, but inactive → win back with rewards or early access.
  • Cluster 1: Likely one-time buyers → nudge with follow-ups or discounts.
  • Cluster 2: Moderate engagement → develop into loyal customers with tailored campaigns.

🛠️ Tools

  • Python (pandas, sklearn, seaborn, plotly)
  • K-Means, PCA
  • Jupyter Notebook

📓 View Full Jupyter Notebook on GitHub