Getting Started with TabPy
An introduction to integrating Tableau and Python
Kayla Hartman
Jan 16
In any field, it is essential to be resilient and adaptive, approaching new tools and ideas with diligence.
In a field as young as data science, it is even more important to be a self-learner and to take on new challenges.
As I entered the final month of my data science bootcamp, I thought back to the chart below, which I had seen at the beginning of my course.
com/the-most-in-demand-skills-for-data-scientists-4a4a8db896db
Jeff Hale scraped several job listing sites and outlined the technology skills most commonly listed in data science job postings.
While my bootcamp has covered a large number of these skills, it would be impossible for the program to introduce every skill pertinent to data science.
Just because a skill was not introduced in the program doesn’t mean it won’t be useful during my career.
If they do come up, I want to ensure that I am ready to best utilize available resources and my existing knowledge to develop a well-rounded understanding of these skills.
For this blog, I decided to dive into one of these relevant concepts that we did not cover in my bootcamp.
I had heard about Tableau several times before, but I had never used it.
I understood its capabilities as a data visualization tool that easily connects to several data sources and allows users to build dashboards, but I didn’t know much beyond that, so I decided to investigate.
Once I downloaded the free trial (https://www.
com/products/trial), I imported a CSV file I had used for a recent project using data from James LeDoux.
This file included a large amount of data about coffee beans and their quality.
Once I imported the file, I was able to view the data as a table in Tableau, as shown below.
After importing this dataset, I was pleasantly surprised by how simple it was to create data visualizations with the user-centered interface.
The dataset features were listed on the left side of the workbook and the geographic coordinates associated with the country of origin for each bean were automatically generated.
I was able to simply drag the measures listed below to the areas labeled “columns” and “rows” to generate visualizations.
To create the heat map below, I dragged the “latitude” and “longitude” tags to the “column” and “row” entries, as shown.
To create the scatterplot below, I simply dragged the “Aroma” and “Aftertaste” fields to the “column” and “row” entries.
Next, I clicked on the arrow on the right side of the “Aroma” and “Aftertaste” tags in order to select standard deviation as the measure to display.
After creating the heat map, scatterplot, and a few other basic visualizations, I was curious about how to integrate Python with Tableau.
I discovered that this can easily be accomplished using the Tableau Python Server (TabPy) API, which enables remote execution of Python code.
Connecting Tableau with TabPy
In order to integrate TabPy with Tableau, I cloned the following GitHub repository.
tableau/TabPy: Execute Python code on the fly and display results in Tableau visualizations (github.com)
Next, using instructions on the repository and instructions here, I followed the steps below:
1. Install tabpy-server by typing the following in the command line: pip install tabpy-server
2. In the cloned repository, go to the path tabpy-server/tabpy_server.
3. Run the file tabpy. In order to run this file, I needed to change the file common/config. template to common/config.
4. Open a Tableau workbook. Follow the path Help > Settings and Performance > Manage External Service Connection to select an external service connection in Tableau.
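Before pointing Tableau at the server, it can help to confirm that the server started in the steps above is actually reachable. The sketch below is one way to do that from Python. It assumes TabPy's defaults as I understand them from its documentation (port 9004 and an /info endpoint that returns JSON); neither is stated in the steps above, and the helper names tabpy_info_url and fetch_tabpy_info are my own, so adjust them if your setup differs.

```python
import json
from urllib.request import urlopen


def tabpy_info_url(host="localhost", port=9004):
    """Build the URL of the server's info endpoint.

    Assumption: the server listens on port 9004 and exposes /info.
    """
    return f"http://{host}:{port}/info"


def fetch_tabpy_info(host="localhost", port=9004, timeout=5):
    """Return the server's self-description as a dict.

    Raises OSError (for example a refused connection) if nothing is
    listening at the given host and port.
    """
    with urlopen(tabpy_info_url(host, port), timeout=timeout) as response:
        return json.load(response)
```

With the server running, calling fetch_tabpy_info() should return a dictionary describing it; a refused connection usually means the server is not running or is listening on a different port.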
Using TabPy to run Python in Tableau
At this point, the connection was set up to run Python code in the Tableau workbook.
This can be done by filling in calculated fields.
Calculated fields follow the general format seen in the examples below.
The function can begin with SCRIPT_REAL, SCRIPT_INT, SCRIPT_STR, or SCRIPT_BOOL, which indicates the function’s return type.
Within the quotation marks, the inputs are referred to as _arg1, _arg2, … , _argN.
The closing quotation mark is followed by a comma and the argument definitions.
According to a community post on Tableau’s website, these functions are table calculations, so the arguments passed to them must be aggregated, for example with SUM(), MAX(), MIN(), or ATTR().
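To make the format described above more concrete, here is a minimal sketch of what the Python side of a calculated field looks like. The calculated field in the comment and the script itself are hypothetical examples of mine, and run_script is a simplified local stand-in for the server rather than TabPy's actual implementation. It only shows how the aggregated inputs show up as the lists _arg1, _arg2, and so on, and how a list is returned.

```python
import textwrap

# Hypothetical calculated field, as it might be typed into Tableau
# (the [Aroma] field comes from the coffee dataset; the script is illustrative):
#
#   SCRIPT_REAL("
#   mean = sum(_arg1) / len(_arg1)
#   return [x - mean for x in _arg1]
#   ", SUM([Aroma]))

SCRIPT_BODY = """
mean = sum(_arg1) / len(_arg1)
return [x - mean for x in _arg1]
"""


def run_script(script, *columns):
    """Run a script body locally with its inputs bound to _arg1.._argN.

    A simplified stand-in for what happens on the Python side; it is not
    TabPy's actual implementation.
    """
    arg_names = [f"_arg{i}" for i in range(1, len(columns) + 1)]
    # Wrap the script body in a function so that "return" is valid.
    source = f"def _user_script({', '.join(arg_names)}):\n"
    source += textwrap.indent(script, "    ")
    namespace = {}
    exec(source, namespace)
    return namespace["_user_script"](*(list(column) for column in columns))


# The aggregated values arrive as a list, and a list of the same length
# goes back to Tableau.
aroma = [7.5, 8.0, 8.5]
centered = run_script(SCRIPT_BODY, aroma)
print(centered)  # [-0.5, 0.0, 0.5]
```

Because the example returns numbers, SCRIPT_REAL is the matching prefix; a script returning text would use SCRIPT_STR instead.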
Next Steps
This blog post only provides a general overview of what can be done with Tableau and TabPy.
There are a plethora of other capabilities.
Tableau can be connected to additional data sources and can create real-time dashboards that are constantly updated.
I simply wanted to gain an understanding of how Python can be incorporated into Tableau, and there are certainly other ways to do so.
The links below include additional resources describing how to use Tableau and more advanced ways of incorporating Python, such as machine learning models.
I plan to use these resources to continue my exploration of Tableau and hope to further my ability to integrate my knowledge of Python and Tableau.
Building advanced analytics applications with TabPy: Back in November, we introduced TabPy, making it possible to use Python scripts in Tableau calculated fields.
Tableau and Python Integration | Tableau Community Forums: must have an understanding of table calculations, including an understanding of how dimensions and measures affect the…
Data Visualisation with Tableau: Our goal as Data Analysts is to arrange the insights of our data in such a way that everybody who sees them is able to…
Tableau Training & Tutorials: Learning Are you doing deep data prep and analysis? Responsible for creating content for others? If you have Tableau…