First, we need to decide on the objective of clustering.

One useful objective could be to find clusters or segments of products which are sold together.

So our tactic to address this problem could be in two steps:

1. Find products which are sold together

2. Find clusters of all products which are sold together

So let us first start by finding all products which are sold together.

Market basket analysis helps us to find which products are sold together.

There are various algorithms for market basket analysis.

One commonly used algorithm for market basket analysis is the Apriori algorithm.

Among the results of this algorithm are pairs of products which are frequently sold together.
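As a minimal sketch of the idea, restricted to product pairs, the snippet below first keeps items that are frequent on their own and then counts only pairs built from those items, which is the pruning step Apriori relies on. The function name `frequent_pairs`, the `min_support` threshold, and the toy invoices are assumptions made for illustration; they are not taken from the article's dataset.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(invoices, min_support):
    """Return {pair: support} for item pairs bought together often enough.

    Support is the fraction of invoices that contain the item or pair.
    """
    n = len(invoices)
    baskets = [set(invoice) for invoice in invoices]

    # Pass 1: keep only items that are frequent on their own.
    item_counts = Counter(item for basket in baskets for item in basket)
    frequent_items = {item for item, c in item_counts.items()
                      if c / n >= min_support}

    # Pass 2 (the Apriori pruning step): a pair can only be frequent if both
    # of its items are frequent, so only those candidate pairs are counted.
    pair_counts = Counter()
    for basket in baskets:
        candidates = sorted(basket & frequent_items)
        pair_counts.update(combinations(candidates, 2))

    return {pair: c / n for pair, c in pair_counts.items()
            if c / n >= min_support}


# Hypothetical toy invoices, for illustration only.
invoices = [
    ["mug", "parasol", "lantern"],
    ["mug", "parasol"],
    ["lantern", "t-light holder"],
    ["mug", "parasol", "t-light holder"],
    ["lantern", "t-light holder"],
]

pairs = frequent_pairs(invoices, min_support=0.4)
for pair, support in pairs.items():
    print(pair, support)
```

On a real invoice dataset a dedicated library implementation would likely be a better choice; this version only aims to show what "frequently sold together" means in terms of support.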

Here are some examples based on the dataset of shopping invoices:

VINTAGE BILLBOARD LOVE/HATE MUG & EDWARDIAN PARASOL RED

WHITE HANGING HEART T-LIGHT HOLDER & WHITE METAL LANTERN

So now that we have determined which products are sold together, the next step is to find clusters of such products.

To do this, we can take inspiration from graph theory, sometimes also called network theory.

We can think of each product as a node.

And if two products have been sold together, we can create an edge between their nodes.

The graph can be visualized as shown here for a few top-selling products.

Figure: Graph algorithms to make clusters

We can clearly see some clusters of products in this visualization.

Additionally, graph algorithms such as the modularity algorithm can be used to extract these clusters.

So as we have seen, we can do clustering with Apriori and graph algorithms.
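The node-and-edge idea can be sketched in a few lines. Note the swap: instead of a modularity algorithm, the sketch below groups products by connected components, a much simpler stand-in that only works when the clusters are not linked to each other at all. The helper names and the product pairs are hypothetical.

```python
from collections import defaultdict


def build_graph(pairs):
    """Adjacency sets: each product is a node, each co-sold pair an edge."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_components(graph):
    """Group nodes that are reachable from one another."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            cluster.add(node)
            # Visit neighbours that have not been assigned to a cluster yet.
            stack.extend(graph[node] - seen)
        clusters.append(cluster)
    return clusters


# Hypothetical product pairs, for illustration only.
pairs = [
    ("mug", "parasol"),
    ("parasol", "umbrella"),
    ("lantern", "t-light holder"),
]
clusters = connected_components(build_graph(pairs))
print(clusters)
```

On a dense real-world graph almost everything ends up in one component, which is exactly why community-detection methods based on modularity are the more practical choice there.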

When you think of clustering, you should not automatically reach for KMeans, DBSCAN, or other similar algorithms.

One should think of the objective of clustering and then decide on which algorithms to use.

Try to find different ways to do the same thing

One way of avoiding repetitive work and keeping the innovation edge sharp is to look at the problem from a different angle.

To illustrate this point, let us take an automobile dataset as an example.

This automobile dataset has various technical characteristics about cars.

Figure: Automotive dataset

Let us say that our objective is to find the covariance between all features.

One of the first things that comes to mind is to compute the covariance between every pair of features directly.

But repeatedly applying the same algorithm makes you lose your innovative edge.

So what could be another way to find covariance?

One such way is PCA (principal component analysis).

Though the objective of the PCA algorithm is dimensionality reduction, it is based on the variance and covariance of the features.

The directions (combinations of features) with the largest variance are kept for dimensionality reduction.

As a by-product of this algorithm, you also see which features have positive covariance and which have negative covariance.

Shown here is a plot of the features and their influence on the first principal component (their loadings, that is, the components of the corresponding eigenvector).

One can conclude that width, length, and height have positive covariance and are positively correlated.

And that mpg and rpm have negative covariance with them and are negatively correlated.

Figure: PCA to find variance

Here we see that we can solve a problem with a different approach.
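To make the link between the two approaches concrete, here is a small sketch using only the standard library. It computes the covariance matrix directly, then obtains the first principal component from that same matrix by power iteration. The car measurements are invented, the function names are hypothetical, and the features are standardized first (an assumption made here because the columns are on very different scales), so the covariances are in fact correlations.

```python
import math


def covariance_matrix(rows):
    """Sample covariance of standardized columns (equal to correlations)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(d)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]
    return [[sum(zr[i] * zr[j] for zr in z) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iterations=500):
    """Leading eigenvector of cov via power iteration: the PC1 loadings."""
    # Start from the first unit vector so that the first feature ends up
    # with a positive loading (the overall sign of an eigenvector is arbitrary).
    v = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Invented toy data, for illustration only: length, width, mpg, rpm.
features = ["length", "width", "mpg", "rpm"]
cars = [
    (160, 64, 38, 5800),
    (170, 65, 31, 5500),
    (175, 66, 27, 5200),
    (185, 68, 24, 5000),
    (195, 71, 19, 4500),
    (200, 72, 17, 4200),
]

cov = covariance_matrix(cars)      # the direct approach
loadings = first_component(cov)    # the PCA approach, from the same matrix

for name, loading in zip(features, loadings):
    print(f"{name:>6}: {loading:+.2f}")
```

In this toy data, features whose loadings share a sign vary together, and features with opposite signs vary in opposite directions, which matches the signs in the covariance matrix itself.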

This also helps us to understand how different algorithms are related.

Once you develop this way of thinking about the relations and similarities between algorithms, you can start approaching a problem from several different angles.

This will take you to the next level of data science innovation.

Use Deep Learning not as an end result, but as a source of data

How many times have we seen those green boxes on images after running a YOLO algorithm?

When YOLO was introduced to data scientists a few years back, it was super fun and exciting.

Many data scientists used YOLO for object detection on a variety of images and videos.

However, just using YOLO for object detection has now become very mechanical.

All those green boxes do not excite data scientists the way they did a few years back.

Deep learning is very innovative and cutting edge.

But the manner in which it is used has become very mechanical.

One way to keep its usage innovative is to think of deep learning as a source of data.

Imagine you have video footage of people moving in a retail shop.

We can use deep learning algorithms to detect people, as shown in the example below.

The examples illustrate identifying persons in an airport or in a retail shop.

Figure: Deep learning YOLO output

Now what if we consider the above result not as an end, but as a data source?

YOLO detects objects, but it also gives the position of each object.

So the result of YOLO can then be fed into another algorithm, such as path analysis.

We can analyze how people move.

We can use it to find zones in a retail shop or airport where there is a lot of movement and zones which are not very busy.

We can also find the trajectories which people usually take.

For example, in a retail shop, we can find out which zones are visited before shoppers come to the cash counter, as well as which shoppers do not pass through the cash counters.

Illustrated below is a path analysis of zones visited in a retail shop.

Figure: Path analysis based on output from YOLO

Here we are going beyond just object detection and drawing green boxes.
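A rough sketch of such a path analysis is shown here. It assumes the detections have already been extracted as `(person_id, frame, box)` tuples with boxes given as `(x1, y1, x2, y2)`. The `person_id` is an assumption, because a detector alone does not follow the same person across frames, so a separate tracking step would have to supply it. The zone layout and all coordinates are invented.

```python
from collections import Counter, defaultdict

# Hypothetical rectangular zones of a shop floor, in pixel coordinates:
# name -> (x_min, y_min, x_max, y_max).
ZONES = {
    "entrance": (0, 0, 100, 100),
    "shelves": (100, 0, 200, 100),
    "cash counter": (200, 0, 300, 100),
}


def zone_of(box):
    """Map a bounding box (x1, y1, x2, y2) to the zone holding its centre."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    for name, (x0, y0, x1, y1) in ZONES.items():
        if x0 <= cx < x1 and y0 <= cy < y1:
            return name
    return None


def zone_paths(detections):
    """Turn (person_id, frame, box) detections into per-person zone paths."""
    paths = defaultdict(list)
    ordered = sorted(detections, key=lambda d: (d[0], d[1]))
    for person_id, _frame, box in ordered:
        zone = zone_of(box)
        path = paths[person_id]
        # Record a zone only when the person enters it, not on every frame.
        if zone is not None and (not path or path[-1] != zone):
            path.append(zone)
    return dict(paths)


# Invented detections, for illustration only.
detections = [
    (1, 0, (10, 10, 50, 90)),
    (1, 1, (110, 10, 150, 90)),
    (1, 2, (120, 10, 160, 90)),
    (1, 3, (210, 10, 250, 90)),
    (2, 0, (20, 10, 60, 90)),
    (2, 1, (130, 10, 170, 90)),
    (2, 2, (30, 10, 70, 90)),
]

paths = zone_paths(detections)
# How busy each zone is, counted as detections per zone.
zone_traffic = Counter(zone_of(box) for _, _, box in detections)
# Shoppers whose path never reaches the cash counter.
skipped_cash = [pid for pid, path in paths.items()
                if "cash counter" not in path]

print(paths)
print(zone_traffic)
print(skipped_cash)
```

Using the centre of the box is only one possible choice; for floor zones seen from an overhead or angled camera, the bottom-centre of the box (roughly where the feet are) may be a better position estimate.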

Thinking of the output of deep learning algorithms as a source of data will help you apply intelligent algorithms to produce business-oriented and exciting outcomes.

In this article we saw some interesting ways to think about data science.

Use these techniques in order not to get trapped into a mechanical and assembly-line manner of doing data science.

The data science profession is innovative by nature.

So think different, think beyond the usual, and keep your innovative edge always sharp.