Plotly Express: the Good, the Bad, and the Ugly
It might be newer, but is it better?
Reilly Meinert, Jun 14

Creating effective data visualizations is an important part of data science, from the beginning of a project to the end.
Using visualizations during your exploratory data analysis is a great way to get a good idea of what your data is about.
Creating visualizations at the end of your project is a great way to communicate your findings in an easy-to-understand way.
There are so many different tools for data visualization in Python, from cult favorites like Matplotlib and Seaborn, to the newly-released Plotly Express.
All three are fairly simple to use and don’t require in-depth programming knowledge, so how do you decide which one to use?

What is Plotly Express?

If you’ve ever used Plotly, or even just looked at code written for it, you know that it’s definitely not the simplest library to use for visualizations.
That’s where Plotly Express comes in.
Plotly Express is a high-level wrapper for Plotly, which essentially means it does a lot of the things you can do in Plotly with a much simpler syntax.
It is pretty easy to use, and doesn’t require connecting your file to Plotly or specifying that you want to work with Plotly offline.
After Plotly Express is installed, a simple import plotly_express as px is all you need to start creating simple, interactive visualizations with Python.
The Good

There are several advantages to using Plotly Express to create visualizations.
The entire visualization can be created with one line of code (kind of).

    px.scatter(df, x='ShareWomen', y='Median',
               color='Major_category',
               size='Total', size_max=40,
               title='Median Salary vs Share of Women in a Major',
               color_discrete_sequence=px.colors.qualitative.Paired,
               hover_name='Major')

While it technically took 6 lines to create this, it still only took a single command.
When creating a Plotly Express visualization, everything can be done in the same command, from adjusting the size of the graphic, to the colors it uses, to the axis labels.
In my opinion, Plotly Express is the easiest way to quickly create and modify a visualization.
Also, the visualization is automatically interactive, which brings me to my next point.
A mouseover of a specific point will bring up a box that has any of the information that was used to create the graph, as well as any extra information you want to include.
In this particular graph, including hover_name = 'Major' made the major that each point represents the title of its hover box.
This allows us to get a lot of information out of our graphic that we wouldn’t be able to get otherwise.
We can also see what the two largest majors are, which we were unable to do when creating a similar plot using Seaborn.
You can isolate certain information.
Clicking a category in the legend of the visualization twice will isolate that category so it is the only one we can see in the graphic.
Clicking it once will remove that category, so we can see all of the categories with the exception of that one.
If you want to zoom in on a certain area, all you have to do is click and drag to create a rectangle that encompasses the smaller area you want to examine more closely.
You can animate change.
One of the coolest features available with Plotly Express is the ability to add an animation frame.
By doing so, you allow yourself to view how something changes over a certain variable.
Most often, the animation frame is based on year, so you can visualize how something changes over time.
Not only is this cool to see as you’re creating visualizations for yourself, but being able to create an animated AND interactive visualization seriously makes you look like you know what you’re doing.
The Bad

It doesn’t have a ton of features.
Don’t get me wrong, there is a LOT you can do with Plotly Express.
It just doesn’t have as many options when it comes to adjusting the appearance of your graph.
In Seaborn, for example, you can change the way the points on your categorical scatterplot line up by setting options like jitter = False or kind = 'swarm'.
To my knowledge, neither of these is possible using Plotly Express.
This really isn’t the end of the world, especially considering that one of the main goals of Plotly Express was to allow users to quickly and easily create interactive visualizations while performing exploratory data analysis.
I would guess that most people using it for this purpose don’t care too much about how their points are lined up on their scatter plot.
You need to set the color every single time you create a new graph.

    # Seaborn
    sns.catplot(x='Major_category', y='Median', kind='box', data=df)
    plt.xticks(rotation=90)
    plt.show()

    # Plotly Express
    px.box(df, x='Major_category', y='Median', hover_name='Major')

You would expect both of these to create very similar visualizations, and they do (for the most part).

Boxplots created with Seaborn (left) and Plotly Express (right)

Before the Seaborn code was run, the color scheme was set to “Paired”, and that setting carries through the rest of the notebook unless it is changed later on.
In Plotly Express, the color scheme a graphic uses needs to be included in the creation of every single plot.
Additionally, Seaborn automatically assigns different colors to different categories, whereas Plotly Express does not.
This may or may not be a good thing, depending on your data.
A good rule of thumb is that if there is no reason to give categories different colors in your plot, don’t.
You may want each category to have its own color across all of your visualizations, or you may not.
If you do, simply add color = 'Major_category' to your px.box() call.
It might be a little bit inconvenient, but it’s also not a big deal.
However, there are some issues that arise in Plotly Express when assigning different colors to different categories.

    px.box(df, x='Major_category', y='Median', color='Major_category',
           color_discrete_sequence=px.colors.qualitative.Paired,
           hover_name='Major')

Plotly Express creates significantly smaller boxplots when the categories are assigned a color.
Because the plot is interactive and you can mouse over the points to see which values they correspond to, this isn’t as much of a problem as it could be.
It is, however, still pretty annoying for one small thing to change the actual format of the graphic by that much.
I couldn’t find any information on this issue, and I don’t believe that there is a good fix for this, which brings me to my next point.
Plotly Express is still relatively new, so there’s not a lot of online help available.
Plotly Express was released in March of this year, so as of now, it is only 3 months old.
As a result, there haven’t been many questions asked and answered about it online.
Another issue is that any time you Google “how to do x in plotly express”, all the information that comes up is about Plotly itself.
I imagine that this will become less of an issue as time goes on.
I also imagine that some of these plotting problems are simply bugs that will hopefully be resolved as time goes on.
The Ugly

Plotly Express is still fairly new and is meant for exploratory data analysis, so some of the issues with it are downright ugly.
This line graph

    # Seaborn
    sns.relplot(x='ShareWomen', y='Women', kind='line', data=df)
    plt.title('Share of Women vs Total Women in a Particular Major')
    plt.show()

    # Plotly Express
    px.line(df, x='ShareWomen', y='Women',
            title='Median Salary vs Share of Women in a Major',
            color_discrete_sequence=px.colors.qualitative.Paired,
            hover_name='Major')

Line Graphs created with Seaborn (left) & Plotly Express (right)

By default, Seaborn sorts the points by their x and y values, which keeps the graph from looking like the one on the right.
Plotly Express simply plots the points in the order they appear in the dataframe.
There is no sort parameter in the Plotly Express line() function, so your best bet would be to sort the data by the variable of interest and then plot the data accordingly.
Not the most complicated fix, but it’s definitely inconvenient nonetheless.
This violin plot

    px.violin(df, x='Major_category', y='Median', color='Gender Majority',
              color_discrete_sequence=px.colors.qualitative.Paired,
              hover_name='Major')

Once again, we’ve run into the issue where assigning colors to different categories has made them appear much smaller on the visualization than they did before they were broken up by color.
If you didn’t construct this plot, you wouldn’t even know that you were looking at a violin plot, and there is absolutely no way of getting a good idea of the densities of these things with this plot.
Sure, you can still mouse over the graphic to get the associated numerical values, but one of the main reasons for creating visualizations is to get a feel for your data visually, and this visualization just doesn’t do that.
There’s not a good way to put the visualizations into a presentation.
Again, because Plotly Express is meant mainly for exploratory data analysis, it isn’t a surprise that this isn’t a feature offered by Plotly Express.
I’m not that familiar with the regular version of Plotly, but I believe that you need to create presentations through their website in order to include the interactive visualizations in your presentation.
This is probably due more to PowerPoint and Google Slides not being able to embed interactive visualizations than to anything about how Plotly builds its visualizations.
Conclusion

Plotly Express is really cool.
I would 100% recommend using it for exploratory data analysis.
The interactivity of the plots allows you to do a much more thorough investigation of your data with ease.
Being able to mouse over a point and get all of the information associated with it allows you to draw better conclusions than if you just had to look at a graph and guess what the points were.
However, it probably won’t be able to do everything you want it to as well as you would like.
I would recommend using Plotly Express in addition to Matplotlib and Seaborn in order to create the best array of visualizations possible.
I would highly recommend checking out this code in order to see many more of the different options available with Plotly Express.