Machine learning and data visualization for clickstream analysis


Before analyzing customer data, we need to describe the customers. Descriptive features for customers usually revolve around three categories: revenues, demographics, and behavior. While revenues and demographics are easy to quantify, customer behavior is harder to define and therefore harder to quantify.

Customer behavior depends heavily on the kind of business. Behavior in energy usage requires different metrics than behavior in newspaper reading. Loyalty to the business is measured differently in subscription-based businesses than in a traditional retail store. When it comes to online shopping, the range of possible behaviors is just overwhelming.

How long does it take before we finally decide to buy a product? How long do we read the product description? How many times do we come back to the same page in search of a convincing reason to buy? How many other product pages do we explore to compare?
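Each of these questions can be turned into a number once a visitor's clicks are available as timestamped events. The following is a minimal sketch, not a definitive implementation: the event layout `(timestamp, page, action)`, the action names `"view"` and `"purchase"`, the `/product/` URL prefix, and the function name `buying_metrics` are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical click events for one visitor: (timestamp, page, action).
events = [
    (datetime(2024, 1, 1, 10, 0), "/product/A", "view"),
    (datetime(2024, 1, 1, 10, 2), "/product/B", "view"),
    (datetime(2024, 1, 1, 10, 3), "/product/A", "view"),
    (datetime(2024, 1, 1, 10, 7), "/product/A", "purchase"),
]


def buying_metrics(events, product_page):
    """Summarize one visitor's path to a purchase on product_page."""
    ordered = sorted(events, key=lambda e: e[0])

    # Time of the first purchase on this page, if there is one.
    purchase_time = next(
        (t for t, page, action in ordered
         if page == product_page and action == "purchase"),
        None,
    )
    if purchase_time is None:
        return None  # this visitor never bought the product

    # Only events up to and including the purchase count as the buying path.
    path = [e for e in ordered if e[0] <= purchase_time]

    view_times = [t for t, page, action in path
                  if page == product_page and action == "view"]
    first_view = view_times[0] if view_times else purchase_time

    # Time on the page: gap between each view and the visitor's next event.
    seconds_on_page = sum(
        (path[i + 1][0] - t).total_seconds()
        for i, (t, page, action) in enumerate(path[:-1])
        if page == product_page and action == "view"
    )

    # Other product pages viewed along the way (assumes a /product/ prefix).
    other_products = {
        page for _, page, action in path
        if action == "view"
        and page != product_page
        and page.startswith("/product/")
    }

    return {
        "seconds_to_purchase": (purchase_time - first_view).total_seconds(),
        "seconds_on_page": seconds_on_page,
        "visits_to_page": len(view_times),
        "other_products_viewed": len(other_products),
    }


metrics = buying_metrics(events, "/product/A")
```

In this toy example the visitor views the product twice, compares it with one other product, and buys seven minutes after the first view. Real logs would need extra care, for instance capping the time attributed to a page when the visitor goes idle.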

We are not all the same when it comes to buying. There are the impulsive buyers, the buyers who need deep reflection before buying, the buyers who need comparisons to be convinced, and so on. We all follow our own buying path—even more so when it comes to online shopping.

Clicks, visiting times, purchases, and related actions are recorded on all websites. If you are just a guest, your actions are recorded anonymously. If you are a known customer, your actions are recorded in connection with your user ID. Anonymously or not, all of us leave a trail as we click our way from page to page.
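To make the idea of a recorded trail concrete, here is a small sketch of how such events might be represented and grouped per visitor. It is an illustration under stated assumptions, not how any particular website stores its logs: the `ClickEvent` fields, the `visitor_key` and `build_trails` helpers, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ClickEvent:
    """One recorded action; all field names are illustrative assumptions."""
    timestamp: datetime
    session_id: str                 # assigned to every visitor, guest or not
    page: str
    action: str                     # e.g. "view" or "purchase"
    user_id: Optional[str] = None   # None for an anonymous guest


def visitor_key(event):
    """Identify the visitor by user ID when known, else by session."""
    if event.user_id is not None:
        return f"user:{event.user_id}"
    return f"guest:{event.session_id}"


def build_trails(events):
    """Group events per visitor and return each visitor's ordered page trail."""
    trails = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        trails[visitor_key(event)].append(event.page)
    return dict(trails)


log = [
    ClickEvent(datetime(2024, 1, 1, 9, 0), "s1", "/home", "view"),
    ClickEvent(datetime(2024, 1, 1, 9, 1), "s2", "/home", "view", user_id="u42"),
    ClickEvent(datetime(2024, 1, 1, 9, 2), "s1", "/product/A", "view"),
    ClickEvent(datetime(2024, 1, 1, 9, 3), "s2", "/cart", "purchase", user_id="u42"),
]

trails = build_trails(log)
```

Here the guest's clicks are tied together only by the session, while the known customer's clicks are tied to a user ID and could therefore be linked across visits.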