This article has moved: https://solomonmg.github.io/blog/2012/visualization-series-insight-from-cleveland-and-tufte-on-plotting-numeric-data-by-groups/
I’ve read Nathan Yau’s book “Visualize This,” which was a pretty nice intro, but I have to admit you have really succeeded in pulling all the main arguments for “perfect” visualisations together. Great article, can’t wait for the next one. Good work!
While I certainly agree with you, Tufte, and Cleveland, there remains the aphorism, “figures don’t lie, but liars figure”. If one’s venue is political communication, one’s agenda is the prime objective, not objectivity. Thus, figuring with figures (in the pictorial sense) in an effective way, meaning to convince the unbelievers of one’s agenda (especially when said agenda is actually contrary to the interests of the unbelievers; see: Frank on Kansas), is the goal. If that means manipulating weaknesses in human perception, then that’ll be done. The Mad Men have been doing it for decades. Nixon, the first in my memory, used Ad Men deeply.
From an academic point of view, where objectivity and fairness are the goals, then certainly. As a “know thine enemy” exercise, by all means, continue.
Thanks for the comment, Robert; it gives me an excuse to delve into some social-historical issues related to data and visualization. You are right that data can always be subject to cherry-picking, biases in measurement, subtle transformations, and sometimes outright fabrication, all of which can serve to manipulate the audience’s interpretation of the actual state of things. This is especially concerning because people who do not analyze data are generally not prepared to detect and counter these obfuscations. In fact, it is often difficult even for data analysts to detect such problems. But just because some people use visualization to manipulate doesn’t mean we should throw away such a valuable tool.
Data visualization is intricately linked to the scientific revolution (i.e., the Enlightenment), which by most accounts has made the world a better place. I’ll argue that the critical catalysts of the scientific Enlightenment were Galileo, for his emphasis on objective measurement of physical phenomena and experimental intervention (which eventually led to an early theory of inertia and the eventual acceptance of Copernicus’ heliocentric model of the solar system); and Descartes, for his emphasis on deductive logic and for the invention of the Cartesian plane, which brings together measurement, mathematics, and geometry, depicting the relationship between variables (or functions) visually. This in my view constitutes the birth of modern data visualization.
The development of the Cartesian plane as a tool for understanding data, variables, functions, etc., had a powerful impact on our world. It was crucial to the discovery/invention of Calculus, without which very little of our modern world would be possible. More immediately, it would be nearly impossible to understand or teach math or science or any quantitative endeavor without visualization.
While folks do use visualization as a tool of manipulation, the way to neutralize any such deception is the same as for propaganda in prose: education. Rather than rejecting visualization as a tool of the propagandist, we ought to improve society’s understanding of data and visualization.
LOL
“I did see plenty of maps, however, which I suppose one could argue are reminiscent of noodles.”
Nice and needed article
Thanks
Interesting and useful. Just one objection: Since you mention the interplay of lexical and visual expressions, I can’t help cringing when you write “I suppose I could extend the axis from 0-100.” A hyphen is not a synonym for the word “to.” The hyphenated expression “0-100” already means “the range from 0 to 100,” so you’re effectively writing “from from 0 to 100.” Thought I had to comment on this because I’d hate to see any similar construction be absorbed into one of your nice graphic examples.
Thanks for the comment, you’re probably right. Fixed.
Great read, thank you! I realize this is over two years old, but I was linked here from a newer article.
Regarding the Primary Elections example (plot vs pie vs stacked bar), I feel it’s worth mentioning that simply adding Candidate dimension / vote percent measure labels directly to pie slices and stacked bar segments debunks the cognitive processing argument. That being said, neither scales well as the number of candidates grows, and the plot is clearly more effective.
Hi,
many thanks for this! I tried to run your script, but it throws up a message:
“Error: Use ‘theme’ instead. (Defunct; last used in version 0.9.1)”
Being ignorant of the finer details of R (just a beginner), could you tell me how to fix the script so that it runs on a current installation of R? That would be much appreciated.
Other than that, I have profited from reading your material and would like to say thank you for sharing it!
Best,
Andreas
I enjoyed the article. The link to Cleveland’s paper was broken. I found it here: https://www.cs.ubc.ca/~tmm/courses/cpsc533c-04-spr/readings/cleveland.pdf.
Great article!
Do you have the link for the “primaryres.csv” dataset?
Best,
Yes, I’ve now updated the code to point to the right location for the data files and updated the syntax to work with the latest versions of ggplot2 and dplyr.