Understand Your Customer And Beat The Competition with Conjoint Analysis

Andrej Pivčević

Dec 4, 2018

In this small case study, I will show you how you can understand your customers through their actual underlying utilities and preferences, using a concrete example of a conjoint analysis.

The case is fictional.

Conjoint analysis is a set of methods that enables you to derive the underlying utilities and preferences of consumers by looking at their decisions.

In contrast to classical methods, you do not need to run after customers and ask them what they like; you simply observe their actual choices or judgements.

Based on the customers’ choices, you then derive the most likely set of preferences, here called the utility function.

If you want to know more about conjoint analysis, then check out my in-depth article about conjoint analysis.

If you want to know how you can build your own conjoint analysis, check out my detailed step-by-step guide for constructing your own conjoint analysis.

The Problem: Can a laptop startup company compete against Apple, Dell and co.?

In today’s small case, I will help a laptop startup company named Ethos understand its primary target customer: students at a university.

Ethos wants to sell its laptop mainly through online platforms and is excited to bring its vision into reality.

However, they know that they have to make the right decisions and have three main questions that they want to have answered.

What would be the ideal laptop for students?

Will Ethos face any disadvantages because the startup is unknown compared to well-known brands like Apple, Dell or Asus?

What would be the preference share in the market?

The Method and Premises: Constructing a Conjoint Analysis

In this section, I will briefly go through the seven steps of my guide to constructing your own conjoint analysis.

Step 1: The Problem and Attributes

After talking to the product manager of Ethos, it is clear that the attributes we want to examine are the following: the brand, the number of cores, the RAM, the size of the hard drive, the display size, the display quality and touch screen functionality.

These are thought to be the most important variables, because consumers base their decisions on them.

Ideally, the variables have resulted from a qualitative investigation such as focus groups and interviews.

One interesting point is that we might expect an interaction between the variable cores and RAM, since many cores with little RAM is thought to be much less interesting for a consumer than many cores with lots of RAM.

If the concept of interactions is new to you, then I recommend you look at the two articles provided in the introduction that provide the theoretical background.
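To make the idea of an interaction concrete, here is a minimal sketch in base R. The rating values are invented for illustration and are not data from the Ethos study. An interaction is present when the gain from upgrading the cores differs across the RAM levels.

```r
# Hypothetical mean ratings (scale 1-9) for each Cores x RAM combination.
# These numbers are made up purely to illustrate the concept.
ratings <- matrix(c(3, 4, 4,
                    3, 6, 8),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(Cores = c("Dual Core", "Quad Core"),
                                  RAM   = c("4 GB", "8 GB", "16 GB")))

# Gain from moving from dual core to quad core at each RAM level
gain_from_cores <- ratings["Quad Core", ] - ratings["Dual Core", ]
print(gain_from_cores)

# Without an interaction, the gain would be identical at every RAM level
has_interaction <- length(unique(gain_from_cores)) > 1
print(has_interaction)
```

In this toy table the extra cores are worth nothing with 4 GB of RAM and a lot with 16 GB, which is exactly the pattern we suspected for Cores and RAM.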

Step 2: The Preference Model

In our case the problem is relatively clear: we want to understand the potential customer.

Therefore, a vector model or a mixed model cannot help us further.

The ideal-point solution, on the other hand, offers an interesting map for each person, but it is less useful in answering the second and third questions that Ethos posed.

The ideal model would be a part-worth model in our case.

A part-worth model fits very well with using a fractional factorial design.

We can use it to answer all three questions and we can even visualize the results with clear graphs.

This makes it the ideal model in order to understand the customer.
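As a rough sketch of what the part-worth model assumes, the utility of a laptop profile can be written as the sum of one part-worth per attribute level. The notation below is illustrative and an assumption of this sketch, not taken from the guide referenced above.

```latex
U(x) = \beta_0 + \sum_{j=1}^{J} \sum_{k=1}^{K_j} \beta_{jk}\, x_{jk},
\qquad
x_{jk} =
\begin{cases}
1 & \text{if attribute } j \text{ takes level } k, \\
0 & \text{otherwise.}
\end{cases}
```

Here J is the number of attributes (seven in our case), K_j is the number of levels of attribute j, and each coefficient \beta_{jk} is the part-worth of level k of attribute j.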

With respect to predicting the market share, the mixed model should be preferred over the part-worth model.

However, since we have mostly categorical variables, and for the sake of simplicity, we will use the part-worth model to predict the market share as well.

Step 3: Data Collection

If we want to use a part-worth model, it makes most sense to use the concept evaluation method.

Since Ethos wants to sell its laptop online, the goal is to make the conjoint analysis as similar to this situation as possible.

Since laptops on a platform like Amazon are usually presented as complete concepts, concept evaluation seems to be the best fit.

By keeping the study close to that setting, we increase the probability that we can later generalize the results to the real case, e.g. students buying laptops from Ethos on an online platform like Amazon one day.

Another consideration is that customers who search for laptops on online platforms do not buy them right away.

According to the interviews conducted with potential customers prior to constructing the conjoint analysis, they do not make an immediate purchase decision.

Rather, they first go through the laptops they can find online and make a first evaluation of them.

Then, in most cases, they decide on the one they consider best given their preferences.

This makes us believe that it makes sense to ask our customers to rate each alternative rather than let them make decisions immediately.

Step 4: Experimental Design

Since there are no interaction effects, we will use a fractional factorial design, which we can generate simply using the package “DoE.base” in R.

Using this package, it is possible to test out the optimal number of levels and variables for a fractional factorial design.

There are many other packages available, but “DoE.base” is the simplest and most straightforward option, as the other packages require more in-depth knowledge.

We use the following code to generate a fractional factorial design and insert our level descriptions:

    ####################### Preparation ####
    # Step 4: Experimental Design
    # Creating a fractional design
    install.packages("DoE.base")
    library(DoE.base)

    test.design <- oa.design(nlevels = c(6, 2, 3, 3, 3, 2, 2))
    FracDesign <- as.data.frame(test.design)
    names(FracDesign) <- c("Brand", "Cores", "RAM", "HardDrive", "DSize", "DQuality", "TouchScreen")
    levels(FracDesign$Brand) <- c("Apple", "Lenovo", "Acer", "Asus", "Ethos", "Other")
    levels(FracDesign$Cores) <- c("Dual Core", "Quad Core")
    levels(FracDesign$RAM) <- c("4 GB", "8 GB", "16 GB")
    levels(FracDesign$HardDrive) <- c("256 GB", "512 GB", "1024 GB")
    levels(FracDesign$DSize) <- c("12 Inch", "14 Inch", "15.2 Inch")
    levels(FracDesign$DQuality) <- c("Normal", "HD")
    levels(FracDesign$TouchScreen) <- c("Yes", "No")
    rm(test.design)

    # Save design into an Excel file
    install.packages("xlsx")
    library(xlsx)
    write.xlsx(FracDesign, "C:/Users/Economalytics/Desktop/ExperimentalDesign.xlsx")

The idea is that each person who participates in our conjoint analysis will go through each “run” of the design and rate the laptop.

The more people participate, the better and more precise information we will have in order to estimate the market share and to understand our potential customers.

Now, let’s have a look at how many runs would be necessary if we were to run a full factorial design:

    # Example for full factorial design
    install.packages("AlgDesign")
    library(AlgDesign)
    numberlevel <- c(6, 2, 3, 3, 3, 2, 2)
    fulldesign <- gen.factorial(numberlevel)
    nrow(fulldesign) # Runs full factorial
    nrow(FracDesign) # Runs fractional factorial

Here, the advantage of the fractional factorial design becomes evident.

With a full factorial design, each person participating in the study would need to go through 1296 runs, i.e. rate 1296 laptops!

Using a fractional factorial design, we managed to reduce this to only 36 runs, an incredible reduction of 97%.
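The run counts can be double-checked without any extra packages. The following is a minimal base R sketch; the variable names are my own, and the 36 runs are simply the figure reported for the fractional design.

```r
# Number of levels per attribute, as used in the design
n_levels <- c(Brand = 6, Cores = 2, RAM = 3, HardDrive = 3,
              DSize = 3, DQuality = 2, TouchScreen = 2)

# A full factorial design contains every combination of levels
full_runs <- prod(n_levels)

# Cross-check by actually enumerating all combinations
all_profiles <- expand.grid(lapply(n_levels, seq_len))
stopifnot(nrow(all_profiles) == full_runs)

frac_runs <- 36  # size of the fractional design reported in the text
reduction <- 1 - frac_runs / full_runs

cat("Full factorial runs:", full_runs, "\n")
cat("Reduction in percent:", round(100 * reduction), "\n")
```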

However, this is only possible if there are no interaction effects between our variables.

Initially we expected an interaction between the variables Cores and RAM, but after some interviews there does not seem to be any significant interaction.

Therefore, the main prerequisite for a fractional factorial design is met.

We will not discuss the disadvantages and further thoughts for designing an experiment here, because we want to keep it simple.

Step 5: Presentation of Alternatives

Since it was clear from the very beginning that Ethos would go for an online sales strategy, it was very important to make the presentation of each alternative as realistic as possible.

While for a physical shop you might showcase prototypes of different products in the real environment and then ask for a rating, you will also want to make it realistic for the online scenario.

Since Ethos is considering selling its laptops on Amazon, because it would be difficult to attract customers from scratch, it was necessary to adapt the concept evaluation to the design of Amazon, including the advantages and disadvantages it might bring.

Therefore, I constructed an experimental homepage resembling Amazon for collecting data.

Below you can find an example of how the 20th run would look on the homepage:

Another consideration is that it might be useful to add a description of all attributes and why they might be important, before the customer starts to rate the laptops.

A laptop purchase by a student can be considered an investment on which they will spend a considerable amount of time and inform themselves prior to the purchase.

It cannot be compared to a drink in the supermarket or an ice cream.

We need to make sure that the customer can fully inform themselves before they make decisions.

Therefore, we include a description of all attributes, their importance and their relevance.

For instance, we would explain that high RAM might be important if you edit videos, edit high-resolution images or process large amounts of data.

Furthermore, we would add a constraint such that participants have to read through the description and cannot complete the whole experiment in under 30 minutes.

This forces the customer to think every option through and really engage with the alternatives in order to achieve realistic and accurate ratings.

Step 6: Measurement Scale

Since we want the customer to rate each alternative, we will need a metric measurement, in particular a Likert scale.

Likert scales are treated here as interval scales, which means that we only learn how much the overall utility would increase by changing the level of an attribute.

We are also restricted to an interval scale due to the fact that we chose a part-worth model as well as a fractional factorial design.

A continuous or ratio variable would generally not be possible with a fractional factorial design or part-worth model unless we made assumptions about linearity and interactions that are simply unrealistic.

But the advantage of a Likert scale is that it has proven to be more reliable in studies.

The rating of a run might look like this:

Step 7: Estimation Method

Finally, there is not much room left to choose from the pool of estimation methods.

The best-fitting estimation method for our case is multiple linear regression, used to estimate the utility function for each individual, because it is perfectly able to estimate each factor and is well suited to a fractional factorial design.

What the method will do in our context, in a nutshell, is to look at the ratings of a customer and calculate the most likely utility function.

Hence it tries to understand the choices and understand which attributes are the most important for each individual.
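To illustrate what the regression does for a single respondent, here is a minimal base R sketch on a toy problem with only two attributes. The ratings are invented and built to be exactly additive, so the regression recovers the part-worths that were put in; none of the numbers come from the Ethos study.

```r
# Toy profiles: a full 3 x 2 design on two attributes (illustration only)
profiles <- expand.grid(
  RAM = factor(c("4 GB", "8 GB", "16 GB"), levels = c("4 GB", "8 GB", "16 GB")),
  TouchScreen = factor(c("No", "Yes"), levels = c("No", "Yes"))
)

# Hypothetical ratings of one respondent:
# base rating 3, +2 for 8 GB, +4 for 16 GB, +1 for a touch screen
ram_effect   <- c("4 GB" = 0, "8 GB" = 2, "16 GB" = 4)
touch_effect <- c("No" = 0, "Yes" = 1)
profiles$Rating <- unname(3 +
  ram_effect[as.character(profiles$RAM)] +
  touch_effect[as.character(profiles$TouchScreen)])

# The regression coefficients are the estimated part-worths,
# each measured relative to the first level of its attribute
fit <- lm(Rating ~ RAM + TouchScreen, data = profiles)
partworths <- round(coef(fit), 2)
print(partworths)
```

With real ratings the fit will not be perfect, but the interpretation of the coefficients stays the same.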

Finally, we create a survey, gather participants from our target group and we let them participate in our survey.

They basically rate each run from the experimental design with a number ranging from 1 to 9, where the higher number indicates that the laptop suits the preferences more.

9 indicates a perfect fit, while 1 a very bad fit.

Now that we prepared the complete conjoint analysis, it is about time to collect the data.

For our case, we create a simulated dataset using the following code:

    ####################### Creating Utility Functions ####
    # Data Collection (Create Dataset)
    # Create basis
    set.seed(1234)
    n <- 89 # number of participants
    Data <- data.frame(Participant = 1:n)
    Data$Participant <- as.factor(Data$Participant)
    for (run in 1:36) {
      Data[, paste("Run", as.character(run), sep = "")] <- sample(c(1:9), n, replace = TRUE)
    }

    # Shaping the data
    Data[, c(6,11,17,28,33)] <- Data[, c(6,11,17,28,33)] + 2 # Improve Apple
    Data[, c(8,13,14,15,18,35)] <- Data[, c(8,13,14,15,18,35)] - 2 # Decrease Ethos
    Data[, c(2,4,5,7,8,11,12,13,16,18,19,25,28,29,31,32,33,37)] <- Data[, c(2,4,5,7,8,11,12,13,16,18,19,25,28,29,31,32,33,37)] - 0.6
    Data[, c(2,3,5,9,11,13,15,16,19,23,26,30)] <- Data[, c(2,3,5,9,11,13,15,16,19,23,26,30)] + 0.9
    Data[, c(2,3,6,9,10,13,18,19,20,21,22,23,25,28,29,31,33,35)] <- Data[, c(2,3,6,9,10,13,18,19,20,21,22,23,25,28,29,31,33,35)] + 1
    Data[,-1] <- round(Data[,-1])
    Data[,-1][Data[,-1] < 1] <- 1
    Data[,-1][Data[,-1] > 9] <- 9

Solution: What is important to customers and how do you win them?

Now that we collected the data, it is time to run the analysis.

We will run the analysis in three steps and try to answer the questions that Ethos needs answered.

First of all, we will estimate the part-worth model and visualize it for a few variables.

The part worth models are supposed to help us understand the target customers and help us derive the “ideal” laptop.

In a second step, we will dig deeper into our customers’ minds and try to understand what variables really matter.

For this purpose, we will calculate the relative variable importance and compare these.

In particular, we want to understand how the brand influences consumers and whether there are any disadvantages for Ethos.

Finally, we will show you quickly how you can estimate your potential future preference share and make simulations.

The questions that we want to answer are the following:

What would be the ideal laptop for students?

Will Ethos face any disadvantages because the startup is unknown compared to well-known brands like Apple, Dell or Asus?

What would be the preference share in the market?

Step 1: Estimating the Part-Worth Models

First of all, we need to merge the results with the design, so that each row represents a laptop with its features followed by the ratings it received from the 89 participants:

    ########################## Estimating the Part-Worth Models
    # Merging FracDesign and Data
    install.packages("data.table")
    library(data.table)

    Data$Participant <- NULL
    Data <- transpose(Data)
    rownames(Data) <- c(1:36)
    Conjoint <- cbind(FracDesign, Data)

In the next step, we estimate the part-worth values for each person using a multiple linear regression model.

At this point, the procedure might differ depending on the purpose, but since we want to estimate the preference share at a later point in time, we need a model for each person.

    # Compute linear regression for each person
    install.packages("rlist")
    library(rlist)
    Regressions <- list()
    for (person in 8:ncol(Conjoint)) {
      model <- lm(Conjoint[, person] ~ factor(Brand) + factor(Cores) + factor(RAM) +
                    factor(HardDrive) + factor(DSize) + factor(DQuality) +
                    factor(TouchScreen), data = Conjoint)
      Regressions <- list.append(Regressions, model)
    }

The estimates of the linear regression are our part-worth utilities; we need to remember, however, that for each categorical variable one level is used as the reference level.

This means that for one level in each categorical variable no estimate will be shown because its value will be automatically 0.

This shows us that part-worth utilities are interval scale variables.

We will need to consider this when we construct a dataframe with all the part-worth utilities for each person.

The following code does exactly that.

It creates a dataframe where each row represents a level of a variable and where each column represents a participant.

    # Create dataframe with part-worth values
    vars <- c("Intercept", rep("Brand", 6), rep("Cores", 2), rep("RAM", 3),
              rep("HardDrive", 3), rep("DSize", 3), rep("DQuality", 2), rep("TouchScreen", 2))
    lvls <- c("Intercept",
              as.character(levels(Conjoint$Brand)),
              as.character(levels(Conjoint$Cores)),
              as.character(levels(Conjoint$RAM)),
              as.character(levels(Conjoint$HardDrive)),
              as.character(levels(Conjoint$DSize)),
              as.character(levels(Conjoint$DQuality)),
              as.character(levels(Conjoint$TouchScreen)))
    Results <- data.frame(Variable = vars, Levels = lvls)
    for (person in 1:n) {
      est <- as.vector(Regressions[[person]]$coefficients)
      # Insert a 0 for the reference level of each variable
      partworths <- c(est[1], 0, est[2:6], 0, est[7], 0, est[8:9], 0, est[10:11],
                      0, est[12:13], 0, est[14], 0, est[15])
      Results[, paste("Person", person, sep = "")] <- round(partworths, digits = 1)
    }

Now that we have the table, we simply calculate the average for each level and plot the result for each variable.

Optionally, it might be also interesting to add the standard deviation as whiskers for each level as well.

The standard deviation would tell us how homogenous the target group is with respect to one level and might give us a hint on whether it would be even useful to offer more than one laptop.
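The averaging and the standard deviation described above can be sketched in base R as follows. The small table is a made-up stand-in for the part-worth table (three respondents, one variable), so the names and numbers are assumptions for illustration only.

```r
# Hypothetical part-worth table: one row per level, one column per person
partworths <- data.frame(
  Variable = c("RAM", "RAM", "RAM"),
  Levels   = c("4 GB", "8 GB", "16 GB"),
  Person1  = c(0, 1.5, 0.5),
  Person2  = c(0, 2.5, 1.5),
  Person3  = c(0, 2.0, 1.0)
)

# Columns that hold the individual part-worths
person_cols <- grep("^Person", names(partworths))

# Average part-worth per level and its spread across respondents
partworths$Mean <- rowMeans(partworths[, person_cols])
partworths$SD   <- apply(partworths[, person_cols], 1, sd)

print(partworths[, c("Variable", "Levels", "Mean", "SD")])
```

The Mean column is what would be plotted per level, and the SD column could be drawn as whiskers around it.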

These tables are the core of every conjoint analysis and give us precious information on how changing the feature of our laptop for Ethos would improve the utility.

For instance, increasing the hard drive from 256 GB to 512 GB interestingly decreases the utility substantially, which might be a sign that the target group has a low budget and prefers other features.

We might also have included price as a feature, for instance to assess the price sensitivity of our target group.

A more interesting approach would be to use price as the response variable instead of a rating, e.g. to measure utility on the basis of how much our future customers would be willing to pay for a laptop.

Using these figures we can already answer the first two questions that Ethos had:

What would be the ideal laptop for students?

The answer to this question is simple.

We just look at the levels that maximize the utility within each variable: an Asus or Lenovo laptop with a dual core processor, a simple 14-inch screen, 256 GB of hard drive, 8 GB RAM and a touch screen.

Will Ethos face any disadvantages because the startup is unknown compared to well-known brands like Apple, Dell or Asus?

As you have surely realized, Ethos cannot produce “Asus” or “Lenovo” laptops, and that gives rise to brand disadvantages.

Interestingly, Ethos will have a brand advantage compared to Apple or Acer, but will clearly be disadvantaged against Asus or Lenovo.

This implies that the target customers are a little brand sensitive, but not as much as expected, especially because the more prestigious brand Apple scores lower.

However, if Asus, Lenovo and Ethos produce the exact same laptops, Ethos would tend to lose, so it has to leverage different strategies in order to beat the competition.

Either a) it starts building the brand and creates a unique user experience (for instance online order, extra offers like free streaming) in order to offset the brand disadvantage in that segment or b) it uses a more tailored marketing strategy with different market channels that will help them to be always a step closer to the target customer than the competition.

The decision is up to the managers of Ethos.

Step 2: The Relative Variable Importance

So, in building the required laptop, where should Ethos start?

What should be the first priority?

A simple approach to this question is to look at the relative variable importance, which basically tells us how important a variable is compared to the others when a consumer makes a purchase decision.

The relative importances can be simply calculated in two steps.

First, for each variable calculate the biggest possible difference by subtracting the level with the lowest utility from the level with the highest utility.

Second, for a variable A, its relative variable importance is simply the ratio of the biggest possible difference of A and the sum of all biggest possible differences of all variables.
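The two steps just described can be sketched in a few lines of base R. The part-worths below are hypothetical values for a single respondent and only three variables, chosen so the arithmetic is easy to follow.

```r
# Hypothetical part-worths for one respondent (reference levels are 0)
partworths <- list(
  Brand       = c(Apple = 0, Lenovo = 0.6, Ethos = 0.2),
  RAM         = c("4 GB" = 0, "8 GB" = 0.9, "16 GB" = 0.3),
  TouchScreen = c(Yes = 0, No = -0.3)
)

# Step 1: biggest possible difference within each variable
ranges <- sapply(partworths, function(u) max(u) - min(u))

# Step 2: each range divided by the sum of all ranges
importance <- ranges / sum(ranges)

print(round(importance, 3))
```

In this toy case RAM accounts for half of the total range, Brand for a third and TouchScreen for a sixth.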

Luckily, there is an R package that computes relative importances for us. Note that the “lmg” metric used below decomposes each model’s R² across the variables, so it is not identical to the range-based calculation described above.

    # Compute relative importance
    install.packages("relaimpo")
    library(relaimpo)
    library(ggplot2)

    # Variables listed in the same order as the terms of the regression models
    Importances <- data.frame(Variable = c("Brand", "Cores", "RAM", "HardDrive", "DSize", "DQuality", "TouchScreen"))
    for (model in 1:n) {
      relImp <- calc.relimp(Regressions[[model]], type = c("lmg"), rela = TRUE)
      relImp <- as.vector(relImp@lmg)
      Importances[, paste("Person", model, sep = "")] <- round(relImp, digits = 3)
    }
    Importances$Average <- rowMeans(Importances[,-1])
    ggplot(Importances, aes(x = reorder(Variable, Average), y = Average)) +
      geom_col() +
      coord_flip() +
      scale_y_continuous(labels = function(x) paste(x*100, "%"))

For predicting the market share, we will assume that the board of Ethos decided to produce the “ideal” laptop that we defined in the first step.

Now the board wants to know what the potential market share would be in the best case if Ethos went to market with this laptop.

Ideally, we would now have a dataframe with all the available laptops from all brands that our laptop would need to compete against.

However, for the sake of simplicity, I will create an example competition of 49 different laptops that Ethos’ laptop will compete against.

The following code will create the laptop list:

    ######### Predict potential market share
    # Simulate laptops from brands
    vnames <- c("Brand", "Cores", "RAM", "HardDrive", "DSize", "DQuality", "TouchScreen")
    brand <- sample(c("Apple", "Lenovo", "Acer", "Asus", "Other"), 49, replace = TRUE)
    cores <- sample(c("Dual Core", "Quad Core"), 49, replace = TRUE)
    ram <- sample(c("4 GB", "8 GB", "16 GB"), 49, replace = TRUE)
    harddrive <- sample(c("256 GB", "512 GB", "1024 GB"), 49, replace = TRUE)
    dsize <- sample(c("12 Inch", "14 Inch", "15.2 Inch"), 49, replace = TRUE)
    dquality <- sample(c("Normal", "HD"), 49, replace = TRUE)
    touchscreen <- sample(c("Yes", "No"), 49, replace = TRUE)
    Market <- data.frame(a = brand, b = cores, c = ram, d = harddrive, e = dsize, f = dquality, g = touchscreen)
    names(Market) <- vnames

Now I create the fitted (predicted) values for each user and each laptop using the regression models derived earlier.

    # Calculate utility scores for each laptop for each user
    for (participant in 1:n) {
      Market[, paste("P", participant, sep = "")] <- predict(Regressions[[participant]], newdata = Market[, 1:7])
    }

Finally, I just have to look at which laptop “wins” for each participant and count the brands in order to derive the “market share”.

The “win” can be considered to be the laptop purchase decision that this individual would make under neutral and optimal conditions.

And here comes one main limitation of the procedure.

What I calculate here, in fact, will not be the real market share.

It will rather be a “preference” share, because you will never find neutral and optimal conditions at the market.

Neutral and optimal conditions would mean, for instance, that there is no distribution channel advantage for any of the brands or that the consumer had the chance to evaluate all the laptops in the basket, which is rather unlikely.

Of course, it is possible to enhance the method by correcting the result for these constraints present in the real market, but the preference share tells you how you would stand if you did not have any disadvantages (or advantages, depending on the perspective) compared to your competitors.
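The “winner takes the customer” logic behind the preference share can be illustrated on a tiny example. This is a minimal base R sketch with invented utility scores for four laptops and five respondents; it uses which.max, which picks the first laptop in case of a tie.

```r
# Hypothetical predicted utilities: rows are laptops, columns are respondents
brand <- c("Asus", "Asus", "Lenovo", "Ethos")
utilities <- matrix(c(6, 5, 7, 4, 5,
                      5, 7, 6, 5, 4,
                      7, 6, 5, 6, 6,
                      4, 5, 6, 5, 7),
                    nrow = 4, byrow = TRUE,
                    dimnames = list(Laptop = c("Asus A", "Asus B", "Lenovo", "Ethos"),
                                    Person = paste0("P", 1:5)))

# For each respondent, find the laptop with the highest utility
winners <- apply(utilities, 2, which.max)

# Count the winning brands and convert the counts to shares
shares <- prop.table(table(brand[winners]))
print(shares)
```

Here Asus wins two of five respondents with two different laptops, which also shows why a brand with many models in the market has an easier time collecting preference share.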

    # Determine the potential market share
    library(ggplot2) # if not already loaded
    purchased <- unlist(apply(Market[, 8:ncol(Market)], 2, function(x) which(x == max(x))))
    purchased <- Market$Brand[purchased]
    brandcount <- as.data.frame(table(purchased))
    brandcount$Freq <- brandcount$Freq / sum(brandcount$Freq)
    ggplot(brandcount, aes(x = purchased, y = Freq)) + geom_bar(stat = "identity")

Now we can see that, given our simulated market, the market would be governed mainly by Acer, Asus and Lenovo, and Ethos would be far off with approximately 1% market share.

Is that surprising? No, for two reasons.

Firstly, as we found out, Ethos faces some significant brand disadvantages.

Secondly, Ethos is selling only one laptop compared to any of the other competitors who are selling 10 laptops on average in our simulated market.

Conclusion: A unique distribution channel is the key to beat the competition

In the end, we managed to answer all three questions that our consulting client Ethos had, and we demonstrated how powerful and informative a conjoint analysis can be.

Of course, there are some disadvantages that we have not touched upon, like the fact that it is difficult to gather data accurately.

When you conduct the conjoint analysis, you should also integrate ways to ensure validity and reliability.

However, the main advantage of a conjoint analysis is that it is flexible and you can adapt it to your needs.

First, you can use different preference models if you want to achieve more realistic results.

Second, after you derived the preferences, you can conduct further analyses on them.

You could conduct a principal component analysis or cluster analysis to find out which customers are similar.

You could also calculate how many different laptops you should launch to optimize your market share or you might even combine conjoint analysis with machine learning methods.
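As a rough idea of what such a follow-up cluster analysis might look like, here is a minimal sketch using kmeans from base R. The six respondents and their two part-worths are invented, and choosing two clusters is an assumption made purely for the illustration.

```r
set.seed(1)

# Hypothetical part-worths: one row per respondent, one column per level
pw <- rbind(c(2.0, 0.1), c(2.2, 0.0), c(1.9, 0.2),   # care mostly about RAM
            c(0.1, 1.8), c(0.0, 2.1), c(0.2, 2.0))   # care mostly about touch screen
colnames(pw) <- c("RAM_16GB", "TouchScreen_Yes")
rownames(pw) <- paste0("Person", 1:6)

# Group respondents with similar preferences into two segments
km <- kmeans(pw, centers = 2, nstart = 10)

print(km$cluster)
print(round(km$centers, 2))
```

Each cluster centre can then be read as the average part-worth profile of one customer segment.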

Third, instead of using survey data, you might also use actual purchase data.

So what is the story now? Ethos will be able to gain a 1% market share if it is able to produce and sell the laptops for a competitive price and if the market conditions were ideal.

Ethos now knows how the customer thinks and which laptop would fit their needs.

However, will that be enough to beat the competition? I would say no, because Ethos will need to develop a unique distribution channel if it wants to beat the competition.

The reason is simple.

You can produce the ideal laptop, but if your customer never finds out, they will never buy it.

Therefore, you will need to be one step ahead of your competition.

What do you think? Give me feedback, and tell me how you liked the article ;)!

Originally published at economalytics.com on December 4, 2018.