
Introduction
A Market Basket Analysis (MBA) is an industry-standard in retails to study consumer’s purchasing habits. It helps the retail industry to identify what items are bought together frequently. The whole mechanism is to mine the combinations or associations of items using any retail store’s transaction database.
Ever wondered why there is always chocolate syrup next to icecream stalls in a supermarket? or why you will always find spice rubs next to the meat section selling steaks? These are not coincidences; it is a very well thought out strategy based on Market Basket Analysis.
And not only retail stores but online shopping sites also use this analysis to give you recommendations. Ever wondered how your favorite online shop shows you a list of items under “Frequently bought together” that’s MBA.
MBA helps retailers in many ways. It helps them design a smart and efficient store plan which promotes product placements, which drive more sales. It also helps them design promotions based on consumers’ buying habits.
A simple example can be that a customer who’s buying a flour sometimes buys sugar. Customer who buys sugar also buys flour sometimes. But a customer who mostly buy sugar and flour together always buy eggs. Why? Because they’re baking a cake.
Essentials
The MBA is usually done using the Apriori algorithm. The process involves two stages: mining the transactions done at the store to look for itemsets and then finding confidence. Each association has three key components, which are:
Support
Support represents the number or percentage of transactions in your dataset containing the corresponding items in the rule you are assessing—for example, the number of transactions where Flour, Suger, and Eggs exist.
Confidence
Confidence shows the chances IF Flour and Suger are bought, THEN Eggs are also bought. The higher the confidence parameter value, the more likely all three items will be bought together.
Lift
Lift represents how strong the association is between the three items (Flour, Sugar, and Egg). Higher Lift represents the strongest link between the product bought together.
Python in Action
Let’s see a small example of Market Basket Analysis using the Apriori algorithm in Python. For this purpose, I will use a grocery transaction dataset available on Kaggle. You can find the dataset here.
Import libraries and read the dataset.

The dataset comprises of member number, date of transaction, and item bought. A member can have multiple transactions on the same date having a different item.

Performed one-hot encoding and converted all the unique items into individual features having datatype boolean. In our dataset, each row represents one item in a transaction. Therefore only one of the items will be TRUE in the whole row against a transaction.

In the dataset, each row represents a single item bought by a member on a given date. Having multiple occurrences for the same member on the same date buying different items, we can assume that a customer buying all those items on the same date are part of a single transaction. Makes sense?
After further preprocessing, we compiled our data into a list of itemsets, where each item is a complete transaction containing products which are bought together.

This list of itemsets is then passed to the Apriori algorithm to identify association rules. The values passed to the algorithm are required to be tuned to get the desired result.

The retrieved association rules returned by the algorithm are than formated for further analysis.

Our top ten associations based on the dataset

Conclusion
So next time, when you’re wandering the aisles of your favorite grocery store, remember that the way product stalls and product placement is done has a deep analysis behind it. Even the occasional shuffling of aisles and food stalls, which is a common occurrence, is also the direct impact of an MBA.
Market Basket Analysis is a great tool for retailers to provide excellent offerings to the customer while driving sales.
You can find all the work done in my GitHub repository here.
Leave a Reply