This article has moved: https://solomonmg.github.io/blog/2014/when-to-use-stacked-barcharts/
TBH, even these examples need more context to be illuminating. The campaign style chart gives the impression that all methods are equally important, whilst ‘Endorsements’ might still be 50x more important than ‘Sharing Content’, even for Democrats.
Hey Andrew, right, these are purely descriptive. You should not conclude that one campaign style is more effective based on this visual. I’d encourage you to have a look at the full post with context if you haven’t already at https://www.facebook.com/notes/facebook-data-science/campaign-rhetoric-and-style-on-facebook-in-the-2014-us-midterms/10152581594083859
The R code does not run. The first error is in line 12. Where does the data come from that drives the code?
I have had a look now. As you say, gives context. Some elegant combo of the two tables in that section would be interesting
The last one has me thinking a bit. It seems it would be of interest in a traditional 5 point rating scale (agree/disagree) situation if you were interested in polarity (or lack thereof) in responses. It seems that this information could be informative where just plotting the means (probably via dot plot as you advocate) might obscure some differences in response styles to questions. Am I on the wrong track? And are there any issues with such an approach?
I think the use case you describe is great. But you might run into problems if you’re interested in the difference in the distribution across subgroups. If you examine the proportion of responses at each of the 5 levels of agreement for each of those contrasts, you can quickly run into multiple comparisons issues, especially in post-hoc analyses (e.g., if you were to look at every possible subgroup in your data set). This is less of an issue when you’re contrasting just the mean of (ideally a battery of) 5-point responses. Note that it’s also less of an issue when you are working with larger data sets that yield more precise estimates.
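To put rough numbers on the multiple comparisons concern above, here is a minimal Python sketch. It is an illustration, not code from the post. The function names are hypothetical, the 0.05 threshold is just the conventional choice, and the tests are treated as independent, which is a simplification (proportions across the 5 levels sum to one, so the real tests are correlated). It counts one test per response level for every pair of subgroups, then shows how the chance of at least one false positive grows with that count, alongside the per-test threshold a Bonferroni correction would require.

```python
from math import comb


def n_contrasts(n_subgroups: int, n_levels: int = 5) -> int:
    """Number of tests if every pair of subgroups is compared at every level."""
    return comb(n_subgroups, 2) * n_levels


def familywise_error(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n_tests tests.

    Assumes the tests are independent (a simplification for illustration).
    """
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


for k in (2, 4, 8):
    by_level = n_contrasts(k)               # one test per level per pair
    by_mean = n_contrasts(k, n_levels=1)    # one test per pair (means only)
    print(
        f"{k} subgroups: {by_level} level-wise tests "
        f"(FWER ~ {familywise_error(by_level):.3f}, "
        f"Bonferroni alpha = {bonferroni_alpha(by_level):.5f}); "
        f"{by_mean} mean contrasts (FWER ~ {familywise_error(by_mean):.3f})"
    )
```

Comparing means needs one test per pair of subgroups rather than five, which is why the problem is milder in that case.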
Awesome. Your point is well taken. I was thinking of it more as a descriptive tool than anything. I work as a consultant and I’m trying to get my colleagues to use R and ggplot more often in their reports, because as you’ve noted elsewhere, it helps explain data to non-experts (I’m a relative newcomer to R having been trained primarily on SPSS). The standard now seems to be to hit clients over the head with frequencies or means until their eyes glaze over, so I’m always looking for new potential graphs to make things more intelligible.
One other question. Should there be a corresponding dataset for the code? Perhaps that should be included in the “party” object?
I just posted the code as an example.
Awesome, no worries. It just helps a bit for me to see what the data look like before I get to manipulating. I’ll play around with the code and make something work. Thanks for posting this stuff, I’ll keep reading.