A Bayesian approach to estimate the effect of a content and a weekday on the post published on a Facebook page

Gulzina Kuttubekova

Jan 3

I assume a reader has knowledge of Bayesian statistics. Sample code is here.

Today, social media has become an interactive platform where customers can exchange ideas and leave reviews about products.

Based on marketing statistics, adults aged 18–34 are most likely to follow their favorite brand using social media (95%).

As a result, these adults become potential customers and a good target for the industry giants.

Almost every popular brand uses this tool “for granted” because it is:

Impressed with a brand’s new post about a product, followers of that page tend to “interact” with the post by “liking” it, leaving comments, and sharing it with their connections.

The total number of interactions can serve as a good indicator of post impact.

There might be a tremendous impact or no impact at all.

Also, the extent of the impact may depend on other factors, such as the weekday on which the post was published, its content, etc.

Our goal is to help business owners make smarter decisions with Facebook ads by identifying whether the content of a post or the weekday on which it is published affects the total number of interactions with that post.

To this end, we developed a hierarchical conditionally autoregressive (CAR) model, which takes time dependence into consideration.

We used a dataset of posts published during 2014 on the Facebook page of a renowned cosmetics brand.

For the analysis, we chose “Total Interactions” as the primary response variable, defined as:

(1) Total interactions = (# of likes + # of comments + # of shares)

To model the post impact, we selected the categorical variables “Type” and “Post weekday”.

Here, “Type” categorizes each post as a link, photo, status, or video.

All of these metrics are known prior to post publication.

The “Post weekday” indicates the day of the week on which the post was published.

For starters, descriptive analysis shows that, on average, Facebook users interacted with video posts the most, while links were clicked the least.

As the histogram shows, the distribution of “Total Interactions” is highly skewed to the right.

Since the model expects normally distributed data, we log-transformed the response variable (after adding a small constant) to obtain approximately normally distributed data.
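A minimal sketch of this preprocessing step is shown here in Python (the post itself used R and Stan). The function names and the example counts are hypothetical, and the offset of 1 is an assumption: the post only says that a small constant was added, which is needed because a post with zero interactions would otherwise give log(0).

```python
import math

def total_interactions(likes, comments, shares):
    """Total interactions = likes + comments + shares."""
    return likes + comments + shares

def log_response(total, offset=1.0):
    """Log-transform a count; `offset` is an assumed small constant that keeps log() defined at 0."""
    return math.log(total + offset)

# Hypothetical (likes, comments, shares) triples, not rows from the real dataset.
posts = [(0, 0, 0), (79, 4, 17), (1500, 120, 380)]
ys = [log_response(total_interactions(*p)) for p in posts]

for p, y in zip(posts, ys):
    print(p, round(y, 3))
```

The transform compresses the long right tail: the raw totals above span three orders of magnitude, while the transformed values stay within a single-digit range.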

For future reference, the log-transformed “Total Interactions” is denoted as follows:

(2) y = log(Total Interactions)

Post impact can be measured by the number of total interactions along with the content and the weekday on which the post was published.

Since the variable “weekday” is neighborhood dependent, it is assumed to have a spatial random effect on y.

Another categorical variable “type” is assumed to have a fixed effect on y.

Therefore, we add the CAR model and the linear trend terms to the final model.

For all “type” and “weekday” combinations, we will calculate the posterior probabilities that, for a given combination i, the marginal mean is higher than those of the others:

for j = {1, 2, …, n}/{i}, where n = # of “type” and “weekday” combinations.
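One way to estimate such probabilities from MCMC output is to count the share of posterior draws in which one combination's marginal mean exceeds another's. The sketch below is written in Python rather than the R and Stan used for the post, reads the comparison as pairwise (one probability per pair i, j), and uses synthetic draws. The combination labels, the draw values, and the helper name `prob_greater` are all hypothetical.

```python
import random

def prob_greater(draws_i, draws_j):
    """Share of paired posterior draws in which combination i exceeds combination j."""
    wins = sum(1 for a, b in zip(draws_i, draws_j) if a > b)
    return wins / len(draws_i)

rng = random.Random(1)
# Synthetic stand-ins for posterior draws of the marginal means; NOT results from the post.
draws = {
    "video+Thu": [rng.gauss(0.9, 0.3) for _ in range(4000)],
    "photo+Mon": [rng.gauss(0.2, 0.3) for _ in range(4000)],
    "link+Sun": [rng.gauss(-0.5, 0.3) for _ in range(4000)],
}

# One probability for every ordered pair (i, j) with i != j.
pairwise = {
    (i, j): prob_greater(draws[i], draws[j])
    for i in draws
    for j in draws
    if i != j
}

for (i, j), p in pairwise.items():
    print(f"P(mean[{i}] > mean[{j}] | data) is about {p:.3f}")
```

Pairing the draws by index matters when the draws come from one joint posterior, because it preserves the correlation between the two means.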

Also, posterior distributions of fixed and random effect parameters will help us to answer the problem statement question.

Model

For the model of interest, let y_i be the value for the ith post.

Let i, ranging from 1 to 500, index the posts, each published on a particular day with particular content.

We assume that the y_i are distributed independently.

Then, for a given post, i = {1, 2, 3, …}:

where the mean is defined as:

where the effect of days is defined as:

Furthermore, you can see how we defined the priors for the model:

Let’s break this down.

First, as a diffuse prior for the standard deviation, we chose a half-Cauchy prior with positive support.

Second, we selected uniform priors over the real line for all four fixed-effect parameters.

These uniform priors have sufficient coverage over the possible values of the linear trend term, covering both negative and positive trends.

Finally, the parameter of the CAR model is assumed to have a uniform distribution constrained by the eigenvalues of

We used Stan with the R package “rstan” to generate samples from the posterior distribution of the parameters.
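For orientation, a hierarchical model with a CAR term that matches the verbal description in this section could be written along the following lines. This is only a sketch under stated assumptions, not the author's exact specification: every symbol here (σ, β, φ, α, τ, W, D, x_i, d(i), s) is chosen for illustration, and the proper-CAR parameterization with its eigenvalue bounds is one common choice.

```latex
% Assumed notation: x_i encodes the post type, d(i) is the weekday of post i,
% W is a weekday adjacency matrix, D is the diagonal matrix of neighbor counts.
\begin{aligned}
y_i \mid \mu_i, \sigma &\sim \mathcal{N}(\mu_i, \sigma^2), \qquad i = 1, \dots, 500, \\
\mu_i &= x_i^\top \beta + \phi_{d(i)}, \\
\phi = (\phi_1, \dots, \phi_7)^\top &\sim \mathcal{N}\!\bigl(0, [\tau (D - \alpha W)]^{-1}\bigr), \\
\sigma &\sim \text{Half-Cauchy}(0, s), \qquad p(\beta_k) \propto 1, \quad k = 1, \dots, 4, \\
\alpha &\sim \text{Uniform}\bigl(1/\lambda_{\min},\, 1/\lambda_{\max}\bigr),
\end{aligned}
```

In this parameterization, λ_min and λ_max are the smallest and largest eigenvalues of D^{-1/2} W D^{-1/2}, and keeping α between their reciprocals makes the precision matrix τ(D − αW) positive definite. The scale s and any prior on τ are left unspecified because the text does not state them.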

We ran 4 MCMC chains with 10000 iterations each, the first 5000 of which were burn-in (we’ll discuss MCMC in the next post).

This left 4 × 5000 = 20000 MCMC samples for the standard deviation, the fixed effects, the weekday effects, and the CAR parameter.

The potential scale reduction factors (Rhat) on split chains are given below.

Note that Rhat = 1 for all parameters, which indicates that the chains converged.
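To make the diagnostic concrete, here is a minimal sketch of the split-chain potential scale reduction factor for one scalar parameter, written in plain Python. It follows the basic textbook formula (each chain is cut in half, then between-piece and within-piece variances are compared) and is not necessarily identical to what rstan reports, since newer versions add refinements such as rank normalization. The function name `split_rhat` and the simulated chains are hypothetical.

```python
import math
import random
import statistics

def split_rhat(chains):
    """Basic split-chain Rhat for one parameter; `chains` is a list of lists of draws."""
    half = min(len(c) for c in chains) // 2
    # Cut every chain into a first and a second half of equal length.
    pieces = []
    for c in chains:
        pieces.append(c[:half])
        pieces.append(c[half:2 * half])
    means = [statistics.fmean(p) for p in pieces]
    within = statistics.fmean([statistics.variance(p) for p in pieces])  # W
    between = half * statistics.variance(means)                          # B
    var_plus = (half - 1) / half * within + between / half
    return math.sqrt(var_plus / within)

rng = random.Random(0)
# Four simulated chains that all sample the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four simulated chains where two are stuck around a different value (not converged).
stuck = [[rng.gauss(shift, 1.0) for _ in range(1000)] for shift in (0.0, 0.0, 3.0, 3.0)]

rhat_mixed = split_rhat(mixed)
rhat_stuck = split_rhat(stuck)
print(f"well-mixed chains: Rhat = {rhat_mixed:.3f}")
print(f"stuck chains:      Rhat = {rhat_stuck:.3f}")
```

Values close to 1 mean the pieces agree with each other; values well above 1 mean that at least some chains are exploring different regions.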

Results

In the previous table, you can also see the posterior medians and 95% credible intervals for each parameter.

Furthermore, below, you can find the posterior distributions of the standard deviation, the fixed effects, the weekday effects, and the CAR parameter.

Using the resulting posteriors of the fixed-effect and weekday-effect parameters, and the equation for the mean, we calculated the posterior of the marginal mean of each “type” and “weekday” combination.

After that, we determined the marginal posterior medians and 95% credible intervals for each of these means.
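The summaries described here can be sketched in a few lines of Python. The sketch assumes that the mean of a combination is the sum of one type effect and one weekday effect, which is how the text describes it, and it uses synthetic draws in place of the real Stan output. The variable names and numbers are hypothetical.

```python
import random
import statistics

def summarize(draws):
    """Posterior median and equal-tailed 95% interval from a list of draws."""
    cuts = statistics.quantiles(draws, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return statistics.median(draws), cuts[0], cuts[-1]

rng = random.Random(42)
# Synthetic stand-ins for posterior draws; NOT the fitted values from the post.
type_effect_video = [rng.gauss(0.8, 0.2) for _ in range(20000)]
weekday_effect_thu = [rng.gauss(0.1, 0.1) for _ in range(20000)]

# Each draw of the marginal mean combines one draw of each effect (assumed additive).
mean_video_thu = [b + p for b, p in zip(type_effect_video, weekday_effect_thu)]

median, lower, upper = summarize(mean_video_thu)
print(f"median = {median:.3f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```

Summing the effects draw by draw, instead of summing their medians, carries the full posterior uncertainty through to the combined mean.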

Our goal is to identify whether there is an effect of the content of a post or the day it was published on the number of total interactions that post collects.

To answer the posed question, we found the posterior probability that, for a given type + weekday combination, its marginal mean is higher than those of the others (see previous table).

The video + Thursday combination has the highest posterior marginal mean, 0.9529.

This means a post has a higher impact if it’s a video published on Thursday.

The posterior of the Thursday effect has greater values than those of the other days, while the posterior of ????_???????????? has smaller values.

The credible intervals for the weekday effects are tight and centered around zero, shifting slightly to the left after Thursday.

The following figure shows that videos, on average, have a higher number of total interactions.

Status and photo posts collect nearly the same number of interactions.

As expected, link posts have the smallest number of total interactions.

This analysis can be extended further by including “type” and “weekday” interactions in the model.

Originally published at kgulzina.github.io.