Data Viz Done Right

February 11, 2016

London Viz Club: The History of Famous People

Inspired by the fabulous VizClub projects that the folks in Leicester have been doing, a few of us decided to give it a shot here in London. The VizClub normally meets at a pub, but given the noisiness of pubs in London, we decided to meet at The Data School. The great Sophie Sparks of the Tableau Public team started a Twitter chat and invited Graeme Wiggins, Emily Chen, Matthew Nixon, Waseem Ali, Eric Hannell and me. But Sophie had a surprise in store for us: she brought along Andy Cotgreave (sound the groans). However, she brought beer and pizza, so we let Andy stay.

We had been discussing using the data from the Open Beer Database to try to build something that would let people identify the beers they might like best. Graeme had warned us that the data wasn't particularly exciting, but we marched on anyway, blind to his advice.

Emily did a fabulous job of joining all of the various datasets using Alteryx and quickly got us a clean data set we could visualise. And boy was Graeme right; there was absolutely nothing interesting about the data. All it had was a list of beers, their ABV and IBU and their location. That's it. So we built a map, then another map, then a bar chart and we all were quickly bored.

On to Plan B. Andy C mentioned that he had been wanting for a long time to have a crack at making over this chart called Horizontal History (click on the image to view a larger version):

Sweet! This looked like a fabulous idea, yet like most projects, finding the data quickly became a problem. We ended up finding a great data set by MIT as part of their Pantheon project. So exciting! Until we looked at the data and realized it only included birth years.

To build a timeline-like viz, we would need death dates for those no longer living. Ugh!!! Back to Google we went and this time we found this data set that included many more people and also their death dates. We downloaded this file (which was in JSON format) and Emily began combining them in Alteryx. This took way, way longer than we expected because we couldn't figure out how to get the JSON Parse tool in Alteryx to behave like we expected. We wasted a good hour here.

While Emily was working on that, I decided to see if anyone had already built a tool to convert JSON to CSV and, lo and behold, I found this great little tool. A few minutes later I had a CSV and we were able to join this CSV with the TSV from Pantheon within Tableau. Phew! That took way too long.
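For anyone who would rather script the JSON-to-CSV step, here is a minimal sketch in Python. It is not the tool we used; it assumes the JSON is a flat array of objects, and the field names and values in the sample (`name`, `birth_year`, `death_year`) are made up for illustration rather than taken from the real data set.

```python
import csv
import io
import json


def json_records_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    # Collect every key that appears, preserving first-seen order,
    # so records with missing fields still line up under one header.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()


# Hypothetical sample: the second person has no death year (still living).
sample = (
    '[{"name": "Person A", "birth_year": 1800, "death_year": 1870},'
    ' {"name": "Person B", "birth_year": 1950}]'
)
print(json_records_to_csv(sample))
```

Nested JSON would need flattening first, which is probably where most of the real effort would go.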

By this time, it was about 9:30pm (we started at 6:15) and the team needed to get going. So we started playing with the data and built a simple timeline. Then we started playing with some of the dimensions that we got from the Pantheon dataset.

For example, only about 14% of the famous people in the list are women. What??? That's sad.

Note: Not all women are shown (this is merely a screenshot)

Ok, what occupations are associated with these women?

Note: Top 15 occupations only
On we went with several more iterations and the questions were flying about. Fortunately, Tableau makes it possible to answer all of these questions at a super fast pace. At this point we needed to build something, anything, so we could get home. Since Andy C had left, we decided (well, I decided) to pick on him. We all know his great love for pies, and who doesn't love a good donut, so we built a donut chart of all of the historical figures sorted by their name and used the Cyclic color palette. We wanted to make sure Andy could see it well, so we stuck him in the middle of the chart like a donut chart.

Then someone proposed sorting the names by birth year and then changing the fonts to Comic Sans and Papyrus, really only in an effort to troll Andy for leaving. Yes, this was it! Have a look at the tooltips (hover outside of Andy's pretty face)...fabulous!

Don't worry, you'll have a chance to improve this in a future Makeover Monday.

February 9, 2016

Tableau Tip Tuesday: How Discrete & Continuous Dimensions Affect Your Tick Marks & Gridlines

In this week's Tableau Tip Tuesday, I show you the impact that discrete and continuous pills have on your tick marks and gridlines. This is very important because, if not used properly, you could easily mislead your audience.

February 8, 2016

Makeover Monday: How Many Blacks Did the Police Kill in 2015?

This week's Makeover Monday was much harder than I anticipated and I must note that it took me way, way over an hour to create something I was happy with (more about that in a bit). The website that we reviewed had a series of three charts about police killings in the United States. I'll focus on the first two:

What works:
  • Good title and subtitle
  • Bars are sorted properly
  • Using a rate is good practice because it normalizes the data
  • Using a different color for the U.S. average

What doesn't work:
  • I hate charts that make me turn my head sideways to read the labels.
  • I would have the U.S. average as a reference line.
  • There's no sense for the rest of the population.
  • It feels like there's more to the story.

Ok, so how about the second chart:

What works:
  • Good title and subtitle
  • Ranking the police departments
  • Using a rate is good practice because it normalizes the data
  • Calling out those police departments that have only killed blacks

What doesn't work:
  • I almost didn't notice the U.S. average (it's above the first police department).
  • The column headers should wrap so they fit better.
  • Again, it feels like there's more to the story.
  • The table makes comparing police departments harder than necessary.

A quick bit of background before we get to my viz. Last week, we brought Caroline Beavon to the Data School to teach an infographics and information visualisation course. I would highly, highly recommend the course. It was a perfect blend of the courses that Andy Kirk and Cole Nussbaumer teach, if you've ever been to their classes. We learned a ton about knowing your audience, choosing the aim for your visualisation and picking out the proper story in the data. In addition, we designed several infographics, which is something I was particularly excited to put into practice this week. With that being said, here is my makeover of the two original charts, but really, it's completely different and delves much deeper into the story the data is trying to tell.

You can download the workbook from Tableau Public here.

February 3, 2016

Dear Data Two | Week 42: Laughing

I couldn't wait for this week to come around. We laugh A LOT at the Data School so I knew data collection would not only be easy, but fun too. I also have this weird way of making fun of people as a way of showing that I like them or that they are a friend.

For my analysis, I only considered laughing that occurred at the Data School, which means data is only through Thursday as we had a company trip the rest of the week.

I collected a few characteristics about each laugh:

  1. When
  2. Who was I with
  3. What was I laughing about
  4. How big was the laugh

From there, I started my analysis, focusing mainly on Lorna's accusations that I pick on her too much. What you'll find in the analysis below is that it looks like I laughed AT Lorna way more than I laughed WITH her. However, once I created the postcard and included what we were laughing about, it's evident that I was often laughing at someone else, not her.

In the end, I wanted the postcard to look like an emoji, so I built an emoji view of all of the laughs in Tableau first before creating the postcard. What I like about the postcard is that it gave me the opportunity to incorporate more information and more detail into the final product.

Overall, a very fun week!

January 31, 2016

Makeover Monday: Travel Agents Are a Relic of the Past and Hotels Could Be Next


The chart that we're reviewing this week for Makeover Monday is from Tech Chart of the Day. The title of the article, "Travel Agents Are a Relic of the Past and Hotels Could Be Next", is quite catchy, but the accompanying chart leaves a lot to be desired.

What can be improved?

  1. The title is boring and doesn't capture my attention.
  2. I'm not exactly sure which axis goes with which colour because neither axis is labeled.
  3. The colours of the lines are too similar for me.
  4. For some reason, my eyes want to match the darker blue line with online hotel revenue and the lighter blue line with # of travel agents, but they're actually the opposite.
  5. A dual-axis chart is often used to show a correlation, but is the correlation explained by this chart?

Ultimately, if I have to work this hard to understand such a simple chart, then it must not be done well.

This week, I'm going to walk you through my step-by-step makeover, ending with my final version. Each step along the way addresses one or more of the problems mentioned above. Let's get started.

First, I'll (1) change the title, (2) add the axis titles, (3) change the line colours to make them easier to distinguish, (4) colour each axis to match its line colour to aid in understanding, and (5) colour the keywords in the title to match the line colours.

Ok, that's definitely better, yet for me it still doesn't address the issue with correlations and dual-axis charts. Let's make one minor change and convert the number of travel agents to a bar chart to make the view a combination chart.

A combination chart like this makes our eyes separate the two metrics while still allowing us to see the patterns of both at the same time. For example, it's easy to see that online hotel revenue is steadily increasing and the number of travel agents is decreasing.

However, the story here is the growth of revenue for online hotel services like Airbnb, VRBO, Flipkey, and HomeAway compared to the decline of travel agents. So, with that in consideration, I wanted to see the respective growth (or decline) rates since 2000.

In this view, notice how I've significantly changed the title to meet the objectives of the story and how I've changed the axis titles. This view clearly shows the significant increase in online hotel revenue since 2000 and the fairly significant decrease in travel agents. Yet I still don't love it.
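If you want to check the "% change since 2000" numbers outside of Tableau, the calculation is just each year's value compared to the first year's value. Here is a minimal sketch in Python; the function name and the figures are made up for illustration and are not the values from the chart.

```python
def pct_change_since_base(values):
    """Percent change of each value relative to the first (base) value."""
    base = values[0]
    return [(value - base) / base * 100 for value in values]


# Hypothetical series, with the first entry standing in for the year 2000.
travel_agents = [100.0, 90.0, 75.0]
online_hotel_revenue = [20.0, 30.0, 50.0]

print(pct_change_since_base(travel_agents))         # [0.0, -10.0, -25.0]
print(pct_change_since_base(online_hotel_revenue))  # [0.0, 50.0, 150.0]
```

Putting both measures on this common scale is what lets a decline and an increase share one axis, at the cost of hiding the raw magnitudes.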

I'm thinking a connected scatterplot might do the trick. As Ben Jones said:
The connected scatterplot imparts a sense of travelling a pathway through a terrain that has twists and turns, loops and sudden rises and falls that encode how the two different variables changed together.
That's exactly what I'm looking for; a method for showing how the two variables move together. So taking the % change since 2000 from above and converting it into a connected scatterplot, I get this:

I really like how this shows how the decrease in travel agents and the increase in online hotel revenue shift together. I also added annotations to drive the point home even further. Yet I still feel like this isn't quite done. I like the annotations, but not the axis scales. While the % change gives me nice context, I feel like I'm losing the overall magnitude of the changes.

Lastly, I removed the % change over time from each axis and went back to the raw values. I then made the line dashed because I feel like the dashed lines show the trails through time better.

In particular, I like how the design of this connected scatterplot starts at the upper left and moves down and to the right. I used Cole Nussbaumer's "where are your eyes drawn" test by turning my head away then back and seeing where my eyes go first. They went directly to the upper left dot for the year 2000, just like I had hoped.

You can download this workbook from my Tableau Public profile here.

January 26, 2016

Tableau Tip Tuesday: How to Create Diverging Bar Charts


UPDATE: Some followers have let me know that they've heard these charts called tornado charts or butterfly charts. Basically, the consensus is that this type of chart doesn't have a name. That's ok though!

For this week's tip, I show you how to make diverging bar charts, or bikini charts, or whatever they are called.

January 25, 2016

Makeover Monday: Two-Thirds of Americans Don’t Have Enough Money Saved

For this week's Makeover Monday, we look at this article from Go Banking Rates. In the article, there are two charts that need some work. First, there is this exploding donut chart.

What works well?

  1. The data is sorted clockwise by the savings amount.
  2. The donut chart starts at 12 o'clock.
  3. The labels tell us the actual values.

What needs improvement?

  1. Remove the donut as it's hard to compare slices.
  2. By adding all of the labels, they've basically created a table.
  3. The title could be improved.

Here is my initial makeover of this chart:

In this view, I've incorporated a few elements to aid in understanding:

  1. The title is more meaningful and lets the reader know what they are looking at.
  2. I colour-coded the bars by the survey response rate.
  3. I labeled the bars with the response rate and the cumulative response rate.
  4. I kept the sort order by the savings amount. This makes it easier to see how little savings people have.
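For the labels in point 3, the cumulative response rate is simply a running total of the response rates in the sort order of the bars. Here is a minimal sketch in Python; the function name and percentages are hypothetical placeholders, not the survey's actual figures.

```python
from itertools import accumulate


def with_cumulative(rates):
    """Pair each response rate with the running total up to that bar."""
    return list(zip(rates, accumulate(rates)))


# Hypothetical response rates (%), ordered by savings amount.
rates = [30, 20, 10, 40]
for rate, cumulative in with_cumulative(rates):
    print(f"{rate}% ({cumulative}% cumulative)")
```

The running total only makes sense because the bars keep a meaningful order (point 4); re-sorting by response rate would change what the cumulative label says.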

Ok, so that's a quick glance at the overall numbers. The article later breaks these down by age groups in this stacked bar chart.

Again, this chart has several issues that hinder comprehension:

  1. The title is pretty meaningless.
  2. The stacked bar chart allows you to only easily compare the first and last segments across ages.
  3. I find the text on the bars to be distracting.
  4. I have to look back and forth to the colour legend to keep things straight in my head, which slows down comprehension.

To make over this chart, I iterated through several options, none of which I really love, but I like them all better than the stacked bars.

This view is useful in that it allows me to compare the different savings amounts for each age group. What it lacks, though, is the ability to compare across age groups for the different savings amounts. To do that, I need to flip the chart like this:

Great, but ideally we should be able to compare in both directions. A line chart can help me understand and compare the data across both dimensions.

What I like about the line chart is that it helps me see patterns better. For example, you can clearly see that as people age there are more people with $10,000 or more in savings. At the same time, I can understand the mix within each age band.

One last alternative that works similarly to a line chart, except the patterns aren't as easy to see, is a heat map.

The heat map is great in that it shows me concentrations and I can include the overall values. The downside is that the patterns are not as easy to distinguish as in the line chart.

Overall, this exercise showed me that there's no one "best" way to visualise this data set. The chart I would choose to display would depend on the message I want to send and the question my audience needs to answer.  Below is the Tableau workbook with all of my iterations.

January 21, 2016

Dear Data Two | Week 41: Music

Week 41: Music was probably the easiest from a data collection perspective so far. In week 32, I tracked everything I listened to, so I didn't want to do that again. Inspired by Andy Cotgreave's great post about his music trends I thought I would do an analysis on the music that's in my iTunes playlist. I've never thought about looking at the type of music I listen to, how long songs are, nor the bands and their songs that I prefer, so this provided me with the perfect opportunity.

This time, I started my analysis in Vizable and after a few minutes I had a good feeling for the overall patterns in the data.

Using Vizable this way definitely helped speed up my analysis in Tableau. I like how Vizable keeps me from overthinking the analysis. It encourages me to play rather than overdiagnose. That being what it is, I was quite surprised to learn how much alternative music I listen to. I suppose that goes back to when I was in college and that genre first became more popular.

I also had a playlist of the songs that are in my primary running playlist, so I used that throughout the analysis as well to help me see if what I listened to in general was the same music I listen to when I run.

For the final visualisation, I accidentally created this radial pie chart thingy. I would never create something like this if it weren't for this project. I consider this more "data art" than data visualisation.

Click through the story below to see how I conducted my analysis and to see the postcard I created.

January 19, 2016

Tableau Tip Tuesday: Combining Shapes and Colours for Simpler Legends

In this week's Tableau Tip Tuesday, I look back to Dear Data Two Week 40 and how I combined the color and shape legends into a single legend.

January 18, 2016

Makeover Monday: Are Consumers Bored With Technology?

For week 3 of Makeover Monday, I challenged Andy Cotgreave to not only makeover this graphic below, but also to use only greyscale colours. Let's start by looking at the graph/chart/infographic in question:

This chart is so bad, it's tough to know where to start. Let's start with what works well:

  • There's a clear title and subtitle that tell us what the chart is about.
  • While I don't use icons often, their use of icons for each type of technology might aid some people in understanding, though I suspect they add them more for decoration.
  • There's a clear order to the donuts, from highest to lowest based on the 2016 purchase rate.
  • The font is consistent throughout the graph.
  • They fit a lot of information in a small space.

Let's now consider what could be improved:
  • The use of donut charts makes comparing the technologies more difficult than necessary.
  • They used sized bubbles for negatives and positives. Really bad idea because this might make people think -1% is the same as +1%.
  • Using bubbles to represent 2015 makes comparing 2015 values really difficult as you have to do the math in your head. Quick, which technology ranks third for 2015?
  • It's harder than necessary to compare 2015 to 2016 for each technology.
  • The red/green bubbles will be challenging for the red/green colour-blind folks.

With these problems in consideration, I've created this alternative version. This took only about 15 minutes to create and 15 minutes to tidy up and organize. Click on the image to view the interactive version and to download the workbook.

In this view, I've focused on the change between 2016 and 2015 in both views. The slope graph on the left helps you see the ranking of each technology in each year and allows you to see the year-over-year change. The bar chart on the right shows only the year-over-year change, sorted in descending order by the change, whereas the original version ordered them by the purchase intent rate.
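The ordering behind the bar chart is easy to reproduce: subtract the 2015 rate from the 2016 rate for each technology, then sort by that difference. Here is a minimal sketch in Python, with made-up technology names and rates standing in for the real survey numbers.

```python
def year_over_year_change(rates_2015, rates_2016):
    """Return (technology, change) pairs sorted from biggest gain to biggest drop."""
    changes = {tech: rates_2016[tech] - rates_2015[tech] for tech in rates_2016}
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)


# Hypothetical purchase intent rates (%), not the values from the original graphic.
rates_2015 = {"Gadget A": 40, "Gadget B": 25, "Gadget C": 10}
rates_2016 = {"Gadget A": 38, "Gadget B": 30, "Gadget C": 11}

for tech, change in year_over_year_change(rates_2015, rates_2016):
    print(f"{tech}: {change:+d} points")
```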

Both graphs use the same colour scale: black for an increase, grey for a decrease. I had considered using a diverging scale, but I thought that it made the distinction between positive and negative growth too difficult to understand.

My initial idea was this bikini chart, but it has the major problem of making comparisons between 2016 and 2015 nearly impossible. It also only allows me to sort by one of the years. I wanted to include it in this post, though, so you could get some other ideas.

January 16, 2016

Dear Data Two | Week 40: Meeting New People

For week 40, meeting new people, I was really hoping that I would meet a ton of new people like Stefanie and Giorgia did when they collected data for this week. I even tweeted out that I would buy people coffee, but I had no takers and didn't meet anyone new through Tuesday.

I reached out to Jeff and told him what was happening and I came up with the idea of looking at my new Twitter followers this week. I believe Jeff is going to do the same. I then set up an IFTTT channel to log new followers to a spreadsheet so that I could start my Tableau analysis.

Not surprisingly, most of my new followers (60%) were males, as Twitter tends to have more males.

Next, I looked at when they started following me. The IFTTT channel logs in my local time, which I then broke down by AM/PM.

64% of my new followers were after 12pm GMT
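The AM/PM breakdown is simple enough to sketch in Python as well. This assumes the follow times are available as ISO-style timestamps in local time; the actual format of the IFTTT log isn't shown here, and the sample timestamps below are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def am_pm_share(timestamps):
    """Return the percentage of timestamps that fall before and after midday."""
    labels = [
        "AM" if datetime.fromisoformat(ts).hour < 12 else "PM"
        for ts in timestamps
    ]
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total * 100 for label in ("AM", "PM")}


# Hypothetical follow times (local time).
follows = [
    "2016-01-12T08:15:00",
    "2016-01-12T13:40:00",
    "2016-01-13T19:05:00",
    "2016-01-14T23:59:00",
]
print(am_pm_share(follows))
```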

I didn't have very much metadata to work with otherwise this week, so from this point, I focused on different ways I could visualise my postcard in Tableau.

Click through the story points below to see my brief analysis and my different postcard options. The last tab has the actual postcard for Jeff.

January 11, 2016

Makeover Monday: Stephen Curry Hates Mid-Range Jump Shots


This week's Makeover Monday looks at this simple stacked bar chart from Sports Chart of the Day:

What works well:

  1. Nice labelling on the axes; this makes it clear what data is displayed
  2. Nice annotation of the point of the chart

What could be done better:

  1. Needs a better title
  2. Needs to incorporate shooting % so that I can better understand the relationship between shots taken and shooting %
  3. Group together all shots over 30 feet
  4. Find a better way to compare the number of shots taken in each range. I find the stacked bars hard to compare within a shot distance.
  5. Make the different shot ranges more clear, e.g., what defines a long 2?
  6. Provide a summary
  7. How are Curry's shot selections changing over time?

With these considerations in mind, here is my makeover of the chart:

Note: If you'd like to participate in Makeover Monday, check this link for the details.

January 7, 2016

Dear Data Two | Week 39: Beauty

Last week I tracked the rather depressing topic of negative thoughts, so it was great to turn the page this week and begin taking in the things around me that I find beautiful. I started the week tracking every little thing I found beautiful, which made the data collection overwhelming. So I stepped back and decided to only track those things where in my head I said "Wow! That's stunning!"

This week really helped me appreciate the exceptional beauty that is around me every day, whether it be our wonderful parks, the people I see, or the amazing architecture London offers. I had this week off from work, so I was able to spend a lot of time with my family going on walks, visiting museums, playing games, etc. So the topic, combined with the time I spent with my family, really made for a wonderfully positive week after a rather crappy Christmas week.

My analysis of the data started much like every other week: exploring what each dimension I tracked offered, looking for stories, trying to find patterns. I didn't learn a whole lot about myself, but trying to emulate what Giorgia did in her postcard for this week helped me continue to learn about using Tableau as a drawing canvas.

The trouble I run into time and time again is that I have so many dimensions that I want to put in a single view, but if I add them all in Tableau, the view quickly becomes impossible to comprehend. However, I'm really beginning to appreciate the freedom that drawing on a postcard gives me. I can incorporate as many dimensions as I'd like through the use of symbols and marks in a way that doesn't clutter the visualisation and aids in understanding.

I'm not sure why, but I feel like I'm starting to grow as a data artist, particularly when it comes to pen and paper. My ideas are flowing at the moment. Let's hope I can continue the momentum for the remaining 13 weeks.

Explore my week below by clicking through the story points. Enjoy!