Now we have to count how many items sold in previous period (could be year, or month…).
In data analysis, counting the appearance of single object is called “frequency”.
To count the frequency, in this case, we will segment the data set by:CountrySizeGender (as I mentioned before, Size could not be used without Gender, so we have to include the Gender too)Our crosstable will have 2 dimensions only but we already have 3 dimensions here so we have to split the table into 2 different one by Gender.
This is the result:Sizes distributed by countryDefine the problemThe problem: Based on our observation, what is the number of shoes (model, size) that are likely to be sold?The condition: We will find the answer of our problem with 95% confidence intervalThe plan:- Use the last 12 months data set- Use men shoes data set- Use US data setThis step, you will think about why do we use men shoes and US data set only.
Let’s think: if a man visited the store, would a woman come?.We cannot answer, because gender in this case is identical.
It doesn’t have any reflection together.
In another hand, for instance, if a kid visited the store, there is a high chance that his parents come too, because that kid cannot go shopping alone and using money by himself.
But for our case, if a man visited the store, he could come alone, or with his bros, or with his girlfriend, or even a group of his friends.
There is no connection between the chance of a male visitor and a female visitor.
That’s how we call it identical.
Of course, the Country is the same with Gender, they are both identical.
Then why 12 months data set?.Because it’s enough the represent a full cycle of a year.
Shopping behavior mostly bases on a full year cycle such as: year-end sale off, Black Friday, company’s promotion on its anniversary, people likely to change their shoes when winter comes (to buy warmer shoes)…Men shoes sold in 2016 distributed by Size and MonthPut the data to heatmap, let’s see the hidden message from our data setMen shoes sold in 2016 distributed by Size and Month — Heatmap on TableauNow we can easily see the hidden message:Too small sizes (6 to 7.
5) and too big sizes (13 to 16) is not popular.
So the chance that a man visits our store and buys a pair of shoes on these sizes is quite low, especially the oversize 15, and 16 — we have no unit sold for this size whole year!Average sizes (9 to 10.
5) seems very good on unit sold.
US man has foot size of 9.
5 seems out number of other sizes.
Focus to produce and store this size, we have great chance to sell shoes and get money!But this is just a high level of examination.
Keep diving deeper to reveal the entire of hidden messages our data set hiding.
CalculationBecause we have 17 different shoes sizes (men only — based on our plan), we need to calculate 17 different CI (confidence intervals).
Firstly let’s calculate the means with Microsoft Excel using the =average(number1, [number2].
) function (in this post, I will not mention much about programming.
Just stick with KISS principal)Men shoes, US 2016 & MeanWe do not know the population variance and our data set consists of only 12 observations (represented by 12 months of 2016), then we have to use the T-Statistic.
Let’s find the value in T-Statistic Table for 95% Confidence Interval with 11 Degrees of Freedom!Let’s explain a bit in case you concern:11 is the Degree of Freedom of 12 months calculated using df = n— 1 (with n=12)0.
025 is the Significant Level of 95% Confidence Interval calculated using SL = (1 – CI) / 2 (prefer image below)Probability Density 2 tailsAlright!.Look at the T-Statistic Table below and we can find out the T-Value is 2.
201 (right at the cross point between red column and blue row)T-Statistic Table of 95%CI (0.
05%SL) and 11DFT-Statistic ValueNow, let’s calculate Standard Errors and Margin Errors!In Microsoft Excel, you can calculate Standard Errors by using this formula:=STDEV.
)/SQRT(n)with:number1, number2… are number of shoes was soldn = 12 (months)And Margin Errors = Standard Errors * t(11, 0.
025)Mean, SE, and MELast one, calculate the Confidence IntervalConfidence Interval formulaI know that you are getting headache with above formula but wait!.We already have the Mean and Margin Errors, right?.Then we can calculate the CI as (Mean — ME; Mean + ME)Mean, SE, ME, and CI(95%)In 95% of the cases, the true population mean of sales for each shoes will fall into the respective interval.
The ceiling values (upper bound, or the higher values) of the CI shows us the maximum number of shoes needed.
And vice versa for the flooring values as they are the minimum number of shoes was sold.
As we don’t want to be low in stock a possible solution to the problem is get as many pairs of shoes as the closest number to the ceiling limit of the Confidence Interval to maximum selling possibility.
And of course, we need to store more than the flooring limit of the CI to ensure all customers (who are men, and have particular shoes size) can buy their favorite shoes instead of leaving our store without buying because the items are out of stocks.
Let’s round up the CI and see:Conclusions based on sample data setDid we solved the problem?Nope!.Don’t blame on meAlright.
This is the conclusion for the year of 2016 but now is 2019, so can we use it?.Absolutely yes!.We can calculate for 2017, 2018 and then get the Mean of them and give the prediction for 2019The prediction for 2019 can be incorrect due to other factors such as: fashion trend this year is changed, people like using sneakers more than shoes; or there is another shoes store just opened next to us.
But this post ONLY analyze on historical dataExtension!.Let’s deep dive a bit moreI wonder do you notice the number of 105 and 175 in the bottom on previous image of “Conclusions based on sample data set”.
What does it mean?105: the minimum number of pairs of shoes was sold.
So as a store manager, we have to prepare big inventory enough to store this number of shoes or we will run out of stock175: a big inventory is good, but it’s no need to have a huge inventory!.Big enough to store 175 pairs is enough.
Remember, the bigger inventory you have, the more expensive you pay for managementWhat do you do for size US 16?.If your eyes are keen enough, you can see there was no product sold with this size.
Let’s check the data for 2017, and 2018 as well.
If no item was sold, you need to stop producing or importing this sizeWhat if your inventory is not big enough to store the minimum number of shoes?. More details