Ecommerce-Machine Learning and AI
Ecommerce

Data-driven strategic decision making through a better understanding of customers and their choices

Objective

  • Identify business problems that can be solved through analytics depending on the availability of data.
  • Optimize the strategic decision making by developing a better understanding of customers and their choices.

Outcome

  • The problem space to be addressed with data-driven analytics in this niche domain was identified.
  • Customer segmentation, churn prediction, propensity modelling, and recommender engines supported data-driven decision making.

What are the most common problems faced by e-commerce companies?

Segmentation of customers using their demographic and buying / visit behavior

For each month, the average growth for both the groups is compared against each other for all the 5 segments.

  • Demographics
  • Visits on websites and apps
  • Buying on websites and apps
  • Visiting and buying from stores
Demographics

Customer   Age   Location   Product   Views
A          25    North      Mobile    4
B          33    West       Book      6

K-Means Clustering:

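K-means is an iterative, unsupervised machine learning algorithm that partitions the data into k clusters by repeatedly assigning each record to its nearest cluster centre and then recomputing the centres.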

Age: 21 – 25 yrs

  • Location: North India
  • Gender: Male
  • Product: Mobile

Age: 30 – 35 yrs

  • Location: West India
  • Gender: Female
  • Product: Book
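As a rough illustration of how k-means could produce segments like the two above, the sketch below clusters customers on two numeric attributes. The records, the choice of k = 2, and the function names are assumptions made for this example and are not taken from the case study. In practice the attributes would usually be scaled first, and categorical fields such as location or gender would need to be encoded.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iterations=100, seed=0):
    """Cluster points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct records
    labels = None
    for _ in range(max_iterations):
        # Assignment step: label each point with the index of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical (age, product views) records, not data from the case study.
customers = [(22, 4), (24, 5), (25, 4), (23, 3), (31, 6), (33, 6), (34, 7), (35, 5)]
centroids, labels = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

On this toy data the younger and older customers end up in separate segments. Which segment receives label 0 or 1 depends on the random initialisation.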

Customer churn prediction

How can customers who are likely to churn due to inactivity be identified, and when should they be sent promotional offers to retain them?

Preliminary factors that affect churn of the customers

Product Attributes

  • Unavailability of products on the website
  • Variety of products available
  • Quality of products
  • Competitors pricing
  • Promotion of products

Service Attributes

  • Complaint resolution time
  • Loyalty programs
  • Post purchase service
  • Delivery delays
  • Staff behaviour at stores

Some of the Hypotheses are...

  • Instances of unavailable products
  • Variety of products that are better priced
  • Promotions for the products on the website
  • Resolution time for the complaints registered at the call centre
  • Delays in delivery of the products
  • Number of hours of soft-skills training given to the staff at the store

Approach to predict when the existing customers can churn


Customer omni-channel behaviour

How can customers who are likely to view products on the website and then make purchases in stores be identified?

Preliminary factors affecting propensity to buy online

  • Unavailability of product reviews on website
  • Activity on social media
  • Competitors pricing
  • Promotions of the products
  • Service at the showroom
  • Demographics
  • Visits on website
  • Product views
  • Visits to store
  • Purchase history
  • Shipping charges
  • Return policies
  • Product description
  • User experience on website
  • Options for fast-track delivery

Some of the Hypotheses are…

  • Number of reviews for the products
  • Activity on social media for the company
  • Number of customers in the same area
  • The user experience for any website visitor
  • The familiarity of the brand ambassador
  • Number of visits on the website for a product
  • The store visits for the website visitor

Predict the Omni-channel behavior of customer – viewing products online yet buying in stores


Personalized marketing

What are the products that should be recommended to the customers based on their past behaviour?

Preliminary factors that affect personalized marketing.

Company Attributes

  • Products that are viewed together
  • Products bought together
  • Categories of products
  • Range of prices
  • Promotions of the products
  • Service at the showroom

Customer Attributes

  • Demographics
  • Visits on website
  • Product views
  • Device used
  • Purchase history
  • Session Behavior
  • Referring Website

Website attributes

  • Views of product
  • Visitors’ frequency
  • Recency of visits

Some of the Hypotheses are…

  • Customers are recommended complementary products while buying the primary product
  • A product highly viewed by a particular segment is recommended to the customer
  • The customer has visited a product page recently
  • A particular category of products has been viewed earlier by a visitor
  • Products with the highest number of views are shown as suggestions
  • Promotional festive offers are displayed

…then customers:

  • end up buying the complementary products
  • end up viewing that product
  • will view the same product while navigating other products
  • will view similar products in the same price range if suggested
  • will view that product
  • will buy the promoted products
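The first hypothesis (recommending complementary products alongside the primary product) can be sketched with a simple co-purchase count. This is only one possible approach and not necessarily the recommender engine used here. The order baskets and function names below are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical order baskets, not data from the case study.
orders = [
    {"mobile", "mobile case", "screen guard"},
    {"mobile", "mobile case"},
    {"book", "bookmark"},
    {"mobile", "screen guard"},
    {"book", "reading lamp", "bookmark"},
]

def build_co_purchase_counts(orders):
    """Count how often each pair of products appears in the same order."""
    co_counts = defaultdict(Counter)
    for basket in orders:
        for a, b in combinations(sorted(basket), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts

def recommend(product, co_counts, top_n=2):
    """Return the products most often bought together with `product`."""
    # Sort by descending count, breaking ties alphabetically for stable output.
    ranked = sorted(co_counts[product].items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]

co_counts = build_co_purchase_counts(orders)
print(recommend("mobile", co_counts))  # ['mobile case', 'screen guard']
print(recommend("book", co_counts, top_n=1))  # ['bookmark']
```

The same counting scheme could be applied to products viewed together by substituting browsing sessions for order baskets.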

Personalized marketing to the customers using a recommender engine


Data Preparation

The data preparation phase covers all activities needed to construct the final dataset [data that will be fed into the modeling tool(s)] from the initial raw data. Data preparation tasks are likely to be performed multiple times and not in any prescribed order. Tasks include table, record, and attribute selection, as well as transformation and cleaning of data for modeling tools.

  • Select Data
  • Clean Data
  • Construct Data
  • Integrate Data
  • Format Data
  • Data Set Description

Sources Of Relevant Data And Gaining Access


Obtain Data Set For Modeling And Establish Dependency

  • Select Data: decide on the data to be used for analysis. Criteria include relevance to the data mining goals, quality, and technical constraints such as limits on data volume or data types.
  • Clean Data: raise the data quality to the level required by the selected analysis techniques. This may involve selecting clean subsets of the data, inserting suitable defaults, or more ambitious techniques such as estimating missing data by modeling.
  • Construct Data: perform constructive data preparation operations such as producing derived attributes, entire new records, or transformed values for existing attributes.
  • Integrate Data: combine information from multiple tables or records to create new records or values.
  • Format Data: apply formatting transformations, which are primarily syntactic modifications that do not change the meaning of the data but might be required by the modeling tool.
  • Data Set Description: create the final data set required for modeling and give it business definitions for proper consumption.
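The data preparation tasks can be illustrated with a small sketch. The tables, column names, and the choice of the median as the default for missing ages are all assumptions made for this example. Integration is done before construction here because the derived attribute depends on the joined order data, which is consistent with the tasks not following a prescribed order.

```python
from statistics import median

# Hypothetical raw tables, not data from the case study.
raw_customers = [
    {"customer": "A", "age": "25", "location": "North ", "fax": None},
    {"customer": "B", "age": None, "location": "west", "fax": None},
    {"customer": "C", "age": "41", "location": "North", "fax": None},
]
raw_orders = [
    {"customer": "A", "amount": 120.0},
    {"customer": "A", "amount": 80.0},
    {"customer": "B", "amount": 40.0},
]

def prepare(customers, orders):
    # Select: keep only the attributes relevant to the modeling goal.
    selected = [{key: row[key] for key in ("customer", "age", "location")} for row in customers]

    # Clean: fill missing ages with a default (the median) and normalise text.
    default_age = median(int(r["age"]) for r in selected if r["age"] is not None)
    for row in selected:
        row["age"] = int(row["age"]) if row["age"] is not None else default_age
        row["location"] = row["location"].strip().title()

    # Integrate: combine order records with the customer records.
    totals, counts = {}, {}
    for order in orders:
        cid = order["customer"]
        totals[cid] = totals.get(cid, 0.0) + order["amount"]
        counts[cid] = counts.get(cid, 0) + 1

    # Construct: add derived attributes.
    for row in selected:
        cid = row["customer"]
        row["order_count"] = counts.get(cid, 0)
        row["avg_order_value"] = totals[cid] / counts[cid] if cid in counts else 0.0

    # Format: emit rows in a fixed column order for the modeling tool.
    columns = ("customer", "age", "location", "order_count", "avg_order_value")
    return columns, [tuple(row[c] for c in columns) for row in selected]

columns, rows = prepare(raw_customers, raw_orders)
for row in rows:
    print(row)
```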

Data Integration and Develop Understanding

Exploratory Data Analysis

The data understanding phase starts with initial data collection and proceeds with activities that enable you to become familiar with the data, identify data quality problems, discover first insights into the data, and/or detect interesting subsets to form hypotheses regarding hidden information.

  • Collect Initial Data
  • Describe Data
  • Explore Data
  • Verify Data Quality

Sources Of Relevant Data And Gaining Access


Conduct Exploratory Data Analysis, including

Univariate non-graphical: to help identify any outliers and better understand the distribution of the sample by primarily utilizing descriptive statistics

  • Frequency reports
  • Measures of spread
  • Central tendency measures
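A minimal sketch of these univariate, non-graphical checks using Python's standard library is shown below. The sample values are hypothetical, and the 1.5 × IQR rule used to flag outliers is one common convention rather than a method stated in this document.

```python
from collections import Counter
from statistics import mean, median, mode, quantiles, stdev

# Hypothetical product-view counts per customer, not data from the case study.
product_views = [4, 6, 5, 4, 7, 4, 6, 30]

# Frequency report: how often each value occurs.
frequency = Counter(product_views)

# Central tendency measures.
center = {
    "mean": mean(product_views),
    "median": median(product_views),
    "mode": mode(product_views),
}

# Measures of spread.
q1, _, q3 = quantiles(product_views, n=4)
iqr = q3 - q1
spread = {
    "min": min(product_views),
    "max": max(product_views),
    "stdev": stdev(product_views),
    "iqr": iqr,
}

# Flag values outside 1.5 * IQR of the quartiles as potential outliers.
outliers = [v for v in product_views if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(frequency.most_common(3))
print(center)
print(spread)
print("potential outliers:", outliers)
```

Here the mean (8.25) sits well above the median (5.5), which already hints at the extreme value that the outlier check then flags.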

Univariate graphical: visualization of the descriptive statistics from the non-graphical techniques, including but not limited to:

  • Histograms
  • Box-plots
  • Quantile normal plot (looks at the observed and expected values)
  • Stem and leaf plots

Multivariate non-graphical: to understand the relationship between two or more of the variables contained in the databases through statistical techniques, including but not limited to:

  • Cross-tabs
  • Correlation
  • Covariance
  • ANOVA
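The first three of these techniques can be sketched with the standard library alone, as below. The records are hypothetical, and the covariance and correlation helpers are written out by hand for illustration. ANOVA is omitted because it is usually run with a statistics package.

```python
from collections import Counter
from math import sqrt

# Hypothetical customer records, not data from the case study.
customers = [
    {"location": "North", "product": "Mobile", "age": 25, "views": 4},
    {"location": "West", "product": "Book", "age": 33, "views": 6},
    {"location": "North", "product": "Mobile", "age": 22, "views": 3},
    {"location": "West", "product": "Book", "age": 35, "views": 7},
]

# Cross-tab: count of customers for each (location, product) combination.
crosstab = Counter((c["location"], c["product"]) for c in customers)

def covariance(xs, ys):
    """Sample covariance of two equal-length numeric sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation coefficient, derived from the covariance."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

ages = [c["age"] for c in customers]
views = [c["views"] for c in customers]

print(dict(crosstab))
print("covariance(age, views):", round(covariance(ages, views), 2))
print("correlation(age, views):", round(correlation(ages, views), 3))
```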

Multivariate graphical:

  • Scatterplots
  • Side-by-side box-plots

Data Modeling

The modeling phase selects and applies modeling techniques and calibrates their parameters to optimal values. Several techniques often exist for the same type of problem, and some have specific requirements on the form of the data, so stepping back to the data preparation phase is often necessary.

  • Select Modeling Techniques
  • Generate Test Design
  • Build Model
  • Assess Model

Select And Fine Tune The Model For Optimum Results


Segmentation Modeling

  • Multinomial logistic regression
  • Segmentation techniques
  • Traditional cluster analysis
  • Regression guided segmentation
  • Classification guided segmentation
  • Bayesian hierarchical models

Persistency/Churn Modeling:

  • Proportional hazards model
  • Kaplan-Meier
  • Other Survival modeling techniques
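As an illustration of the Kaplan-Meier technique listed above, the sketch below estimates the probability that a customer is still active after a given number of months. The tenure data is hypothetical. Customers who had not churned by the end of observation are treated as censored, meaning they count as at risk up to their last observed month but not as churn events.

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] at each time a churn was observed."""
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, e in zip(durations, churned) if d == t and e)
        leaving = sum(1 for d in durations if d == t)  # churned or censored at t
        if events:
            # Multiply by the fraction of at-risk customers who did not churn at t.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical tenures in months; False marks a customer still active (censored).
months_active = [2, 3, 3, 5, 6, 8]
churned = [True, True, False, True, False, True]

curve = kaplan_meier(months_active, churned)
for month, probability in curve:
    print(f"month {month}: {probability:.3f}")
```

A proportional hazards model extends this idea by relating the churn hazard to covariates such as the product and service attributes listed earlier.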

Model Evaluation and Maturity Iteration

At this stage in the project, you have built a model (or models) that appears to have high quality from a data analysis perspective. Before proceeding to final deployment of the model, it is important to thoroughly evaluate it and review the steps executed to create it, to be certain the model properly achieves the business objectives.

  • Evaluate Results
  • Review Process
  • Determine Next Steps

Measure And Improve Performance

[Figure: model performance (%) as variables are added to the model: Variable 1; Variables 1 and 2; Variables 1 through n]

Deployment and Continuous Improvement

Creation of the model is generally not the end of the project. Even if the purpose of the model is to increase knowledge of the data, the knowledge gained needs to be organized and presented in a way that the customer can use. Deployment involves applying “live” models within an organization’s decision-making processes and improving them over time with machine learning techniques.

  • Deployment
  • Track and Monitor
  • Machine Learning (On-Going)

Put Insights Into Practice


Deployment

  • Establish the drivers of customer persistency
  • Identify the strategy for using the score during the life cycle of the customer
  • Integrate the score into applicable processes
  • Learn from persistent customers and apply techniques to reduce churn

Track, Monitor and Learn

  • Monitor predictions against actual results to confirm that the model remains applicable
  • Identify discrepancies where they exist and feed what is learned back into the model to improve its predictions
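One way to track predictions against actual outcomes is sketched below. The customer identifiers, the churn flags, and the choice of precision and recall as tracking metrics are assumptions made for this example.

```python
def compare(predicted, actual):
    """Compare predicted churn flags with observed outcomes per customer."""
    true_pos = sum(1 for c in predicted if predicted[c] and actual[c])
    false_pos = sum(1 for c in predicted if predicted[c] and not actual[c])
    false_neg = sum(1 for c in predicted if not predicted[c] and actual[c])
    return {
        # Share of predicted churners who actually churned.
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        # Share of actual churners the model caught.
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
        # Customers where prediction and outcome disagree, for follow-up analysis.
        "discrepancies": sorted(c for c in predicted if predicted[c] != actual[c]),
    }

# Hypothetical churn predictions and observed outcomes for one period.
predictions = {"A": True, "B": False, "C": True, "D": False, "E": True}
actuals = {"A": True, "B": True, "C": False, "D": False, "E": True}

report = compare(predictions, actuals)
print(report)
```

Reviewing the discrepant customers each period gives concrete cases to learn from before the model is retrained.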