Word trees with googleVis 0.6.4
It’s been a while since the last update to googleVis. Well, the Google Chart Tools are fairly settled now, but some time ago Google added Word Trees:
A word tree depicts multiple parallel sequences of words. It could be used to show which words most often follow or precede a target word (e.g., “Cats are…”) or to show a hierarchy of terms (e.g., a decision tree). Google word trees are able to process large amounts of text quickly. Modern systems should be able to handle novel-sized amounts of text without significant delay.
Ashley Baldry contributed the gvisWordTree function to googleVis 0.6.4, which allows us to create word trees from R.
Examples
Let’s take a look at the Cats data in googleVis:
library(googleVis)
Cats
##                           Phrase Size Sentiment
## 1      cats are better than dogs    1         8
## 2                cats eat kibble    1         5
## 3  cats are better than hamsters    1         6
## 4               cats are awesome    1        10
## 5            cats are people too    1         9
## 6                  cats eat mice    1         7
## 7                   cats meowing    1         3
## 8             cats in the cradle    1         5
## 9                  cats eat mice    1         7
## 10     cats in the cradle lyrics    1         5
## 11               cats eat kibble    1         5
## 12             cats for adoption    1         5
## 13               cats are family    1         8
## 14                 cats eat mice    1         7
## 15  cats are better than kittens    1         5
## 16                 cats are evil    1         0
## 17                cats are weird    1         2
## 18                 cats eat mice    1         7
Default Word Tree
To visualise the phrases in the Cats data and analyse the order of words in those phrases, we can simply call gvisWordTree on the data and specify the name of the column containing the phrases.
# set googleVis plot option to display chart in RMarkdown
op <- options(gvis.plot.tag='chart')
# create word tree chart
wt1 <- gvisWordTree(Cats, textvar = "Phrase")
plot(wt1)
Hover over the words to see information about their frequency. Click on any word in the chart to make it the root of the word tree.
Styling a Word Tree
As with the other googleVis functions, we can set various options to change the root, style and look of the plot.
Here is one example with ‘cats’ set as the root and some styling options set. For more details, visit the Google documentation.
Cats2 <- Cats
Cats2$Phrase.style <- ifelse(Cats$Sentiment >= 7, "green",
                             ifelse(Cats$Sentiment <= 3, "red", "black"))
wt2 <- gvisWordTree(Cats2, textvar = "Phrase",
                    stylevar = "Phrase.style",
                    options = list(fontName = "Times-Roman",
                                   wordtree = "{word: 'cats'}",
                                   backgroundColor = "#cba"))
plot(wt2)
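The nested ifelse() call is easy to get wrong at the boundaries, so it can help to tabulate the colours before plotting. The small sketch below repeats the same rule on the Sentiment values copied from the Cats printout above. The names sentiment and style are chosen only for this illustration and are not part of googleVis.

```r
# Sentiment column of the Cats data, copied from the printout above
sentiment <- c(8, 5, 6, 10, 9, 7, 3, 5, 7, 5, 5, 5, 8, 7, 5, 0, 2, 7)

# Same rule as in the styling example: high scores green, low scores red,
# everything in between black
style <- ifelse(sentiment >= 7, "green",
                ifelse(sentiment <= 3, "red", "black"))

# Number of phrases per colour
print(table(style))
```

With these values, 8 phrases end up green, 7 black and 3 red.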
Implicit and explicit Word Trees
There are two ways to create word trees: implicitly (default) and explicitly.
The choice is specified with the wordtree.format option.
- ‘implicit’: The word tree will take a set of phrases, in any order, and construct the tree according to the frequency of the words and sub-phrases.
- ‘explicit’: We tell the word tree what connects to what, how big to make each sub-phrase, and what colours to use.
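To get a feel for what the implicit format works out on its own, here is a minimal base R sketch that counts which word follows the root word in a handful of phrases taken from the Cats data. It only illustrates the idea of grouping phrases by a shared prefix and is not how the Google chart is implemented. The names phrases, second_word and branch_counts are made up for this example.

```r
# A few phrases copied from the Cats data shown earlier
phrases <- c("cats are better than dogs",
             "cats eat kibble",
             "cats are awesome",
             "cats eat mice",
             "cats meowing")

# Split each phrase into words on single spaces
words <- strsplit(phrases, " ", fixed = TRUE)

# Take the word that directly follows the root word 'cats'
second_word <- vapply(words, function(w) w[2], character(1))

# Count how often each branch occurs directly below the root
branch_counts <- table(second_word)
print(branch_counts)
```

Here ‘are’ and ‘eat’ each start two branches and ‘meowing’ starts one, which is roughly the information the chart conveys through font size.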
Example of an explicit word tree:
# Explicit word tree
exp.data <- data.frame(id = as.numeric(0:9),
                       label = letters[1:10],
                       parent = c(-1, 0, 0, 0, 2, 2, 4, 6, 1, 7),
                       size = c(10, 5, 3, 2, 2, 2, 1, 1, 5, 1),
                       stringsAsFactors = FALSE)
wt3 <- gvisWordTree(exp.data, idvar = "id", textvar = "label",
                    parentvar = "parent", sizevar = "size",
                    options = list(wordtree = "{format: 'explicit'}"),
                    method = "explicit")
plot(wt3)
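With the explicit format a mistyped parent id is easy to miss, so a quick sanity check of the structure before plotting may be useful. The base R sketch below rebuilds the id, label and parent columns of exp.data and assumes, as the data above suggests, that parent = -1 marks the root. The names nodes, roots, orphans and parent_label are hypothetical helpers for this illustration only.

```r
# Same structure as exp.data above, without the size column
nodes <- data.frame(id = 0:9,
                    label = letters[1:10],
                    parent = c(-1, 0, 0, 0, 2, 2, 4, 6, 1, 7),
                    stringsAsFactors = FALSE)

# Assumption: rows with parent == -1 are roots
roots <- nodes$label[nodes$parent == -1]

# Every other parent value should match an existing id
orphans <- nodes$label[nodes$parent != -1 & !(nodes$parent %in% nodes$id)]

# Look up the label of each node's parent (NA for the root)
nodes$parent_label <- nodes$label[match(nodes$parent, nodes$id)]
print(nodes)
```

For this data there is a single root, ‘a’, and no orphaned rows, so every label can be traced back to the root.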
For other chart types, visualisations and documentation see the googleVis vignettes on CRAN.
Citation
For attribution, please cite this work as: Markus Gesmann (May 24, 2019) Word trees with googleVis 0.6.4. Retrieved from https://magesblog.com/post/word-trees-with-googlevis-0.6.4/
@misc{ 2019-word-trees-with-googlevis-0.6.4,
author = { Markus Gesmann },
title = { Word trees with googleVis 0.6.4 },
url = { https://magesblog.com/post/word-trees-with-googlevis-0.6.4/ },
year = { 2019 },
updated = { May 24, 2019 }
}