And what are visualization researchers doing to tackle these challenges? Unsurprisingly, implementing NLIs is challenging since we need to build software that first interprets the human language as other humans would and then performs an appropriate set of actions based on its interpretation.
Below I list a subset of challenges these systems pose from a design and usability standpoint.
I also highlight how researchers are trying to address some of these challenges in the context of NLIs for data visualization.
Ambiguity and Underspecification: User questions are often ambiguous and underspecified.
For example, imagine you were exploring an Olympic Games dataset with details about all medals won by different countries over several decades.
While exploring this dataset with an NLI, you say “show me medals for hockey and skating by country.” This seemingly simple query presents multiple ambiguities.
Specifically, the word “medals” can map to either the total number of medals or specific types of medals (e.g., gold, silver, or bronze). Similarly, “hockey” and “skating” can refer to different sports (e.g., ice hockey vs. field hockey, or figure skating vs. speed skating). Even if we assume all of these ambiguities were resolved, there is still the question of which visualization to use (e.g., a grouped vs. stacked bar chart).
One proposed solution to expose and help users resolve these ambiguities has been to use multimodal input (i.e., leveraging another form of input such as a mouse).
In this approach, given an input query, the system tries to identify ambiguous phrases using a suite of string matching and word association algorithms.
These ambiguous phrases are then displayed through interactive widgets (referred to as ambiguity widgets) — allowing users to refine their queries and resolve the ambiguity.
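As a rough illustration of the detection step, the sketch below flags query tokens that match more than one data attribute, which is exactly the situation an ambiguity widget would surface. The column names are hypothetical stand-ins for the Olympic Games dataset, and the matching logic (substring plus fuzzy fallback) is a simplification, not the actual algorithms these systems use.

```python
import difflib

# Hypothetical dataset columns; names are illustrative only.
COLUMNS = ["Total Medals", "Gold Medals", "Silver Medals", "Bronze Medals",
           "Ice Hockey", "Field Hockey", "Figure Skating", "Speed Skating"]

def ambiguity_candidates(token, columns=COLUMNS, cutoff=0.4):
    """Return every column that loosely matches a query token.

    More than one match signals an ambiguity the UI should expose,
    e.g. via a dropdown 'ambiguity widget'."""
    token = token.lower()
    matches = [c for c in columns if token in c.lower()]
    if not matches:  # fall back to fuzzy string similarity
        matches = difflib.get_close_matches(token, columns, n=5, cutoff=cutoff)
    return matches

# "medals" matches four columns, "hockey" matches two -> both ambiguous
print(ambiguity_candidates("medals"))
print(ambiguity_candidates("hockey"))
```

Each list with more than one entry would become a widget letting the user pick the intended interpretation.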
This idea was first presented and incorporated in a system called DataTone (Figure 3) by Adobe.
An example of how the DataTone system handles the ambiguous query corresponding to the Olympic Games dataset can be seen in Figure 3.
Figure 3: A snapshot of the DataTone system highlighting ambiguities through interactive widgets.
The image is taken from the DataTone research paper.
Figure 4: Tableau’s response to the underspecified query “What’s the correlation of GDP?” The image is based on their research paper.
An associated problem to ambiguity is that of underspecification.
While ambiguity arises when the system has multiple options to consider, underspecification refers to cases where the input query lacks enough information (e.g., attributes or keywords that help infer intent) for the system to make a decision.
People naturally tend to be imprecise when asking questions, frequently presenting incomplete or underspecified queries.
For instance, when exploring a dataset about the population and associated economic metrics of different countries, one might wonder about the gross domestic product (GDP) of different countries and ask the question “What’s the correlation of GDP?” While this question may be clear to the user, from the system’s standpoint, it is both ambiguous (e.g., there might be different GDP columns, perhaps for multiple years) and underspecified (e.g., correlation of GDP against what?). Addressing such issues, researchers at Tableau recently devised a set of techniques to infer missing details in a query based on the meaning and usage frequency of different data attributes, as well as constraints imposed by the system’s supported operations.
Using these inference techniques, given the underspecified query “What’s the correlation of GDP?”, the system can generate a scatterplot visualizing GDP per Capita and Life Expectancy since they were the most popular combination of attributes explored for the same dataset (Figure 4).
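One way to picture the usage-frequency part of this inference is the toy sketch below: given the single attribute a query mentions, it picks the partner attribute most often paired with it in past sessions. The usage log, attribute names, and matching rule are all hypothetical, standing in for whatever statistics the real system keeps.

```python
from collections import Counter

# Hypothetical usage log: attribute pairs users previously explored together.
USAGE_LOG = [
    ("GDP per Capita", "Life Expectancy"),
    ("GDP per Capita", "Life Expectancy"),
    ("GDP per Capita", "Population"),
    ("Population", "Life Expectancy"),
]

def infer_missing_attribute(mentioned, log=USAGE_LOG):
    """Given the one attribute a query mentions, pick the attribute most
    frequently paired with it in past exploration sessions."""
    partners = Counter()
    for a, b in log:
        if mentioned in a:
            partners[b] += 1
        elif mentioned in b:
            partners[a] += 1
    return partners.most_common(1)[0][0] if partners else None

# "What's the correlation of GDP?" -> pair GDP with its most popular partner
print(infer_missing_attribute("GDP"))
```

A system could then plot the mentioned attribute against the inferred partner, much like the scatterplot in Figure 4.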
Preserving context to support an analytic flow: To support natural language input, it is not enough for a system to let users enter one-off commands that each result in a visualization.
During visual data analysis, users often need to iterate upon their questions and refine existing visualizations — diving deeper into specific aspects of a chart or adding new visualizations to the current view.
Supporting such actions implies that the system should enable a “conversation” between the user and the data.
A key component of supporting a conversation is interpreting the context in which a query is posed.
In other words, in addition to interpreting the current query, the system also needs to consider previously issued queries (to identify data attributes and values used) and the active view (e.g., visualizations, colors) so it can answer questions effectively.
For instance, in Figure 2, if as a follow-up to the question “What is the profit for each state?” the user asked “What about different cities?”, the system needs to understand that the implicit attribute the user is referring to is “Profit” and accordingly adjust the choropleth map to color cities as opposed to states.
As a first step towards supporting such interactions, research systems have employed conversational centering techniques from the field of pragmatics, a subfield of linguistics focusing on the ways in which context contributes to meaning.
Figure 5: An example of pragmatics concepts being used to support a follow-up command.
The image is taken from this research paper by Tableau.
Consider the example in Figure 5 where a user is exploring a Seattle house price dataset.
The user first says “houses less than 1M in Ballard,” in response to which the system applies two filters: one on sales price and one on the neighborhood “Ballard”.
Next, with the previous query in mind, the user simply says “townhomes” implying that the system must show townhomes in the Ballard neighborhood that cost less than 1M.
To capture this implicit meaning, the system preserves (or technically, retains) the neighborhood and price filters and changes (or technically, shifts) the focus on all houses in the dataset to only townhomes.
This combination of the retain and shift operations results in the system updating the chart as per the user’s expectations.
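A minimal sketch of the retain/shift idea might model the conversation state as a focus plus a set of filters: a follow-up utterance retains prior filters, merges in any new ones, and shifts only the entity in focus. The state representation and field names here are illustrative assumptions, not the actual data structures these systems use.

```python
def apply_utterance(context, utterance_filters, new_focus=None):
    """Retain existing filters, merge in new ones, and optionally
    shift the focus of the conversation."""
    new_context = dict(context)
    new_context["filters"] = {**context.get("filters", {}), **utterance_filters}
    if new_focus is not None:  # shift: narrow what is being shown
        new_context["focus"] = new_focus
    return new_context

# "houses less than 1M in Ballard" -> two filters on all houses
state = {"focus": "all houses", "filters": {}}
state = apply_utterance(state, {"price": "< 1M", "neighborhood": "Ballard"})
# follow-up "townhomes" -> filters are retained, focus shifts
state = apply_utterance(state, {}, new_focus="townhomes")
print(state)
```

After the second utterance, the price and neighborhood filters persist while the focus has shifted to townhomes, mirroring the Figure 5 example.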
Discoverability: A key challenge faced by users of NLIs (particularly new users) is that they are unsure about what the system is capable of doing (i.e., which operations it can perform) and whether the system expects them to conform to a specific language structure. This uncertainty about what can be asked, and how, is commonly referred to as the lack of discoverability.
Although the advances in natural language understanding are allowing users to more freely phrase their intended operations, I would argue that discovering “what” can be done remains an open problem.
Compared to the other challenges, discoverability has received relatively little attention in current visualization NLIs.
The most common approach current systems take to aid discoverability is using an autocomplete feature.
An example of this can be seen in Microsoft’s Power BI (Figure 6).
However, as found in a study, this approach gives users a false sense of the system’s ability to interpret more complex queries.
Furthermore, this approach is limited to text input and does not work well for spoken commands.
Figure 6: Autocomplete feature in Microsoft Power BI Q&A.
The image is taken from this official blog post by Microsoft.
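To make the autocomplete idea concrete, here is a toy prefix-completion sketch over a fixed command vocabulary. The vocabulary and commands are invented for illustration; real systems like Power BI Q&A use far richer, schema-aware completion.

```python
import bisect

# Hypothetical command vocabulary the NLI understands (kept sorted).
VOCAB = sorted([
    "average profit by state",
    "average sales by region",
    "filter to last year",
    "show profit by state",
    "show sales by region",
])

def autocomplete(prefix, vocab=VOCAB, limit=3):
    """Return up to `limit` known commands starting with `prefix`.

    Because `vocab` is sorted, all completions form one contiguous run,
    which binary search locates in O(log n)."""
    i = bisect.bisect_left(vocab, prefix)
    out = []
    while i < len(vocab) and vocab[i].startswith(prefix) and len(out) < limit:
        out.append(vocab[i])
        i += 1
    return out

print(autocomplete("show"))
```

As the text notes, a drawback of this design is visible even in the sketch: it only surfaces commands that literally share a prefix, which can mislead users about what freer phrasings the system can interpret, and it offers nothing for spoken input.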
A more recent approach to aid discoverability of natural language commands is using command suggestions.
As shown in Figure 2, Tableau’s Ask Data system tries to help users start interacting with the system by suggesting sample commands.
We also recently proposed a general framework to suggest natural language commands as a way to enhance the discoverability of natural language input in speech-based multimodal interfaces.
While the idea of suggesting commands is simple at a surface-level, deciding which commands to show, and when and how the suggestions must be made are important design considerations that warrant additional research.
Furthermore, the effects of such suggestions during visual data analysis (e.g., do they encourage or discourage users from thinking about new questions) are yet to be investigated.
Given the increasing popularity of speech-based UIs, devising more ways to address the challenge of discoverability remains an open area for research.
Emerging Themes and Future Directions
Figure 7: Interactive data facts in Voder helping users interpret visualizations and embellish them to highlight key findings.
Natural language as i̵n̵p̵u̵t̵ output: Until now, I have only discussed natural language as an input modality (i.e., as a way for users to communicate with the system).
An emerging theme in visualization research is to complement visualizations with natural language and use language as an output modality (i.e., as a way for systems to communicate with users).
This idea is being explored commercially by companies such as Narrative Science and Automated Insights that offer services to “summarize” visualizations in plain text.
We also recently presented a system called Voder (see Figure 7) that automatically infers key data facts from a visualization and allows users to interact with these facts to highlight them in the visualization.
Current work on this topic is largely focusing on exploring the best way to identify key facts and creating natural language generation (NLG) models to present them in a communicable manner.
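In the spirit of Voder (though not its actual implementation), the sketch below infers one simple kind of data fact, an extremum with its share of the total, from a bar chart's underlying data and phrases it in plain text. The data and templated phrasing are hypothetical.

```python
def extremum_fact(data, measure="sales"):
    """Infer a simple 'highest value' data fact from chart data and
    render it as a natural language sentence."""
    top = max(data, key=lambda d: d[measure])
    share = top[measure] / sum(d[measure] for d in data)
    return f"{top['category']} has the highest {measure} ({share:.0%} of the total)."

# Hypothetical bar chart data
bars = [{"category": "West", "sales": 500},
        {"category": "East", "sales": 300},
        {"category": "South", "sales": 200}]
print(extremum_fact(bars))
# -> West has the highest sales (50% of the total).
```

A system like Voder would generate many such candidate facts, rank them, and let users click one to highlight the corresponding marks in the chart.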
However, with increasing concerns regarding the ethical dimensions of visualization research pertaining to automated analysis, I believe an interesting opportunity lies in leveraging NLG to communicate the logic behind the system’s actions, building trust in users’ minds and leading to more confident decision making.
In my opinion, this is not only specific to NLIs but is an area where natural language research can contribute to data analytics and visualization tools in general.
Complementing Visual Data Analysis with Question Answering: While this post mostly exemplifies queries that create or modify visualizations, not all queries during data analysis may need a visualization as a response.
For instance, looking at the map in Figure 2, one might ask “What were sales in California last year?” or “What are the total sales across regions?” In such cases, all the user wants is the value of sales.
Although the system could generate visualizations in response to these questions, it is probably better if the system returned the actual value or answer and not just a chart that the user needs to interpret to get the answer.
Going forward, I hypothesize that to truly support a cycle of visual analysis through natural language, we need to design systems that can not only render a visualization that “contains” an answer but also provide direct responses when appropriate.
Building such tools warrants further research at the intersection of data visualization systems that can identify the best chart in response to a question and question answering systems, or specifically, NLIs for databases that can directly compute and present values in response to user questions.
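One way to sketch this routing decision is a simple heuristic: a query that fully specifies an aggregate and its filters, with no grouping, can be answered with a single value; anything else falls back to a visualization. The query-spec format, field names, and heuristic below are all assumptions made for illustration.

```python
def respond(query_spec, rows):
    """Route a query to a direct value answer or a chart, using a toy
    heuristic: fully-specified aggregates with no group-by get a value."""
    agg = query_spec.get("aggregate")
    group_by = query_spec.get("group_by")
    filtered = [r for r in rows
                if all(r.get(k) == v
                       for k, v in query_spec.get("filters", {}).items())]
    if agg == "sum" and not group_by:
        return ("value", sum(r[query_spec["measure"]] for r in filtered))
    return ("chart", filtered)  # fall back to rendering a visualization

# Hypothetical sales data
rows = [{"state": "California", "sales": 120},
        {"state": "California", "sales": 80},
        {"state": "Texas", "sales": 90}]

# "What were sales in California?" -> a value, not a chart
print(respond({"aggregate": "sum", "measure": "sales",
               "filters": {"state": "California"}}, rows))
# "What are sales across states?" -> a chart over the grouped data
print(respond({"aggregate": "sum", "measure": "sales",
               "group_by": "state"}, rows)[0])
```

A production system would of course combine this with real query interpretation and chart recommendation, but the routing decision itself is the point.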
This post only briefly discusses some of the ongoing research on NLIs for data visualization.
A more comprehensive review of a subset of the systems described in this post along with additional research opportunities can be found in our paper.
If you find this topic exciting, let’s keep talking (both to systems and each other).
You can learn more about our work on natural language and multimodal interfaces for data analysis with visualization on the Information Interfaces Research Group project page or my personal website.
Acknowledgments
Thanks to Vidya Setlur, Enrico Bertini, Jessica Hullman, and the Georgia Tech Visualization Lab for their feedback on this post.