## Are We Solving The Wrong Problems With Machine Learning?

This post is about corn, and how it gets from growing in fields onto your table.

Below is a video of a corn harvesting machine:

And here is a video of people gathering corn:

So, I hear you say, what does all this have to do with machine learning?

A lot, as it so happens.

## Training an RNN on the Archer Scripts

#### Introduction

So all the hype these days is around “AI”, as opposed to “machine learning” (though I’ve yet to hear an exact distinction between the two), and one of the tools that seems to get talked about most is Google’s TensorFlow.
I wanted to play around with TensorFlow and RNNs a little, since they’re not the type of machine learning I’m most familiar with, and see what kind of outputs I could come up with without a big investment of time.

#### Background

A little digging and I came across this tutorial, which is a pretty good brief intro to RNNs; it uses Keras and works character-wise.
This in turn led me to word-rnn-tensorflow, which, expanding on the work of others, uses a word-based model (instead of a character-based one).
I wasn’t about to spend my whole weekend rebuilding RNNs from scratch – no sense reinventing the wheel – so I just thought it’d be interesting to play around a little with this one, and perhaps give it a more interesting dataset. Shakespeare is ok, but why not something a little more culturally relevant… like, I dunno, the scripts from a certain cartoon featuring a dysfunctional foul-mouthed spy agency?
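For context, the character-level approach from that tutorial boils down to something like the sketch below. To be clear, this is just a minimal illustration in Keras, not the word-rnn-tensorflow code I actually used; the corpus filename, sequence length and layer sizes are placeholders of my own rather than anything from the original.

```python
# A minimal sketch of the character-level approach (not the word-rnn-tensorflow code);
# the corpus filename and hyperparameters are placeholders, purely for illustration.
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

text = open("archer_scripts.txt").read().lower()           # hypothetical corpus file
chars = sorted(set(text))
char_to_idx = {c: i for i, c in enumerate(chars)}

seq_len, step = 40, 3
X_idx, y_idx = [], []
for i in range(0, len(text) - seq_len, step):
    X_idx.append([char_to_idx[c] for c in text[i:i + seq_len]])
    y_idx.append(char_to_idx[text[i + seq_len]])

# One-hot encode inputs and targets (for a large corpus you'd want a generator instead)
X = np.eye(len(chars), dtype=np.float32)[np.array(X_idx)]  # (samples, seq_len, vocab)
y = np.eye(len(chars), dtype=np.float32)[np.array(y_idx)]  # (samples, vocab)

model = Sequential()
model.add(LSTM(128, input_shape=(seq_len, len(chars))))
model.add(Dense(len(chars), activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.fit(X, y, batch_size=128, epochs=1)                   # one epoch, just as a smoke test
```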

## When to Use Sequential and Diverging Palettes

#### Introduction

I wanted to take some time to talk about an important rule for the use of colour in data visualization.
The more I’ve worked in visualization, the more I have come to feel that one of the most overlooked and under-discussed facets (especially for novices) is the use of colour. A major pet peeve of mine, and a mistake I see all too often, is the use of a diverging palette where a sequential one is called for, or vice-versa.
So what is the difference between a sequential and a diverging palette, and when is it correct to use each? The answer is one that arises very often in visualization: it all depends on the data, and what you’re trying to show.

#### Sequential vs. Diverging Palettes

First of all, let’s define what we are discussing here.
**Sequential Palettes**
A sequential palette ranges between two colours (typically with one “main” colour), going from white or a lighter shade to a darker one by varying one or more of the parameters in the HSV/HSL colour space (usually saturation or value/luminosity, or both).
For me, at least, varying hue means going between two very distinct colours, and this is usually not good practice if your data vary linearly, as it is much closer to a diverging palette, which we will discuss next. There are other reasons why this is bad visualization practice, and, of course, exceptions to this rule, which we will discuss later in the post.
 A sequential palette (generated in R)
**Diverging Palettes**
In contrast to a sequential palette, a diverging palette ranges between three or more colours, with the different colours being quite distinct (usually having different hues).
While technically a diverging palette could have as many colours as you’d like (such as the rainbow palette, which is the default in some visualization tools like MATLAB), diverging palettes usually range between two contrasting colours at either end, with a neutral colour or white in the middle separating the two.
 A diverging palette (generated in R)
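The figures above were generated in R, but just to make the distinction concrete in code, here is a rough sketch of the two palette types using matplotlib; the specific colour values are purely illustrative.

```python
# A rough sketch of the two palette types; colour values chosen purely for illustration.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

# Sequential: one hue, ranging from near-white to a dark shade
sequential = LinearSegmentedColormap.from_list("seq", ["#f7fbff", "#08306b"])

# Diverging: two contrasting hues with a neutral colour in the middle
diverging = LinearSegmentedColormap.from_list("div", ["#b2182b", "#f7f7f7", "#2166ac"])

gradient = np.linspace(0, 1, 256).reshape(1, -1)
fig, axes = plt.subplots(2, 1, figsize=(6, 2))
for ax, cmap, title in zip(axes, [sequential, diverging], ["Sequential", "Diverging"]):
    ax.imshow(gradient, aspect="auto", cmap=cmap)
    ax.set_title(title)
    ax.set_axis_off()
plt.tight_layout()
plt.show()
```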

#### When to Use Which

So now that we’ve defined the two different palette types of interest, when is it appropriate and inappropriate to use them?

The rule for the use of diverging palettes is very simple: they should only be used when there is a value of importance around which the data are to be compared.

This central value is typically zero, with negative values corresponding to one hue and positive the other, though this could also be done for any other value, for example, comparing numbers around a measure of central tendency or reference value.

**A Simple Example**
For example, looking at the Superstore dataset in Tableau, a visualizer might be tempted to make a map such as the one below, with colour encoding the total sales in dollars in each city:

Here points on the map correspond to the cities and are sized by total number of sales and coloured by total sales in dollars. Looks good, right? The cities with the highest sales clearly stick out in green against the dark red.

Well, yes, but do you see a problem? Look at the generated palette:

The scale ranges from the minimum sales in dollars ($4.21) to the max (~$155K), so we cover the whole range of the data. But what about the midpoint? It’s just the dead centre point between the two (roughly $77K), which doesn’t correspond to anything meaningful in the data – so why would the hue change from red to green at that point?

This is a case better suited to a sequential palette, since all the values are positive and we’re not highlighting a meaningful value around which the range of the data falls. A better choice is shown below:

Here, the range is fully covered and there is no midpoint; the palette ranges from light green to dark. The extreme values still stand out in dark green, but there is no well-defined centre where the hue arbitrarily changes, so this is a better choice.

There are other ways we could improve this visualization’s encoding of quantity as colour: for one, by using endpoints that would be more meaningful to business users instead of just the range of the data (say, $0 to $150K+), and another which we will discuss later.

Taking a look at the two palettes together, it’s clearer which is a better choice for encoding the always positive value of the metric sales dollars across its range:

**Going Further**
Okay, so when would we want to use a diverging palette? As per the rule, when there is a meaningful midpoint or other important value you want to contrast the data around.

For example, in our Superstore data, sales dollars are always positive, but profit can be positive or negative, so it is appropriate to use a diverging palette in this case, with one hue corresponding to negative values and another to positive, and the neutral colour in the middle occurring at zero:

Here it is very clear which values fall at the extremes of the range, but also which are closer to the meaningful midpoint (zero): that one city in Montana is in the negative, and the others don’t seem to be very profitable either; we can tell they are close to zero by how washed out their colours are.

Tableau is smart enough to set the midpoint at zero for our diverging palette. Again, you could tinker with the range to make the endpoints more meaningful (e.g. round values), as well as varying the range: sometimes a symmetrical range for a diverging palette is easier to interpret from a numerical standpoint, though of course you have to keep in mind how this is going to affect the perceptual salience of the colour values for the corresponding data.
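If you are building this kind of view in code rather than in Tableau, the same idea applies: pin the centre of the diverging palette to the meaningful reference value (here zero), not to the middle of the data range. Here is a minimal sketch with matplotlib, assuming a recent version that provides TwoSlopeNorm; the profit numbers are made up purely for illustration.

```python
# Centre a diverging colormap on zero rather than on the midpoint of the data range.
# The profit values are made up purely for illustration.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm

profit = np.array([-120.0, -15.0, 3.0, 40.0, 250.0, 900.0])

# vcenter=0 is the meaningful reference value; vmin/vmax just span the observed data
norm = TwoSlopeNorm(vmin=profit.min(), vcenter=0, vmax=profit.max())

plt.scatter(range(len(profit)), profit, c=profit, cmap="RdYlGn", norm=norm, s=100)
plt.colorbar(label="Profit ($)")
plt.axhline(0, color="grey", linewidth=0.5)
plt.show()
```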

## Plotting Choropleths from Shapefiles in R with ggmap – Toronto Neighbourhoods by Population

#### Introduction

So, I’m not really a geographer. But any good analyst worth their salt will eventually have to do some kind of mapping or spatial visualization. Mapping is not really a forte of mine, though I have played around with it some in the past.
I was working with some shapefile data a while ago and thought about how it’s funny that so much spatial data is dominated by a format that is basically proprietary. I looked around for some good tutorials on using shapefile data in R, and even so it took me a while to figure it out, longer than I would have thought.
So I thought I’d put together a simple example of making nice choropleths using R and ggmap. Let’s do it using some nice shapefile data of my favourite city in the world courtesy of the good folks at Toronto’s Open Data initiative.

#### Background

We’re going to plot the shapefile data of Toronto’s neighbourhoods boundaries in R and mash it up with demographic data per neighbourhood from Wellbeing Toronto.
We’ll need a few spatial plotting packages in R (ggmap, rgeos, maptools).
Also, the shapefile threw some kind of weird error when I originally tried to load it into R, but it was nothing that loading it into QGIS once and re-saving it couldn’t fix. The working version is available on the github page for this post.

#### Analysis

First let’s just load in the shapefile and plot the raw boundary data using maptools. What do we get?
```r
# Read the neighborhood shapefile data and plot
library(maptools)

shpfile <- "NEIGHBORHOODS_WGS84_2.shp"
sh <- readShapePoly(shpfile)
plot(sh)
```
This just yields the raw polygons themselves. Any good Torontonian would recognize these shapes. There are maps like these, with words squished into the polygons, hanging in lots of print shops on Queen Street. Also, as someone pointed out to me, most T-dotters think of the grid of downtown streets as running directly north-south and east-west, but it actually sits at an angle.

Okay, that’s a good start. Now we’re going to include the neighbourhood population from the demographic data file by attaching it to the dataframe within the shapefile object. We do this using the merge function; basically this is like an SQL join. Also, I need to convert the neighbourhood number to a numeric first so things work, because R is treating it as a string.

```r
# Add demographic data
# The neighbourhood ID is a string - change it to a numeric
sh@data$AREA_S_CD <- as.numeric(sh@data$AREA_S_CD)

# Read in the demographic data and merge on Neighbourhood Id
demo <- read.csv(file="WB-Demographics.csv", header=T)
sh2 <- merge(sh, demo, by.x='AREA_S_CD', by.y='Neighbourhood.Id')
```
Next we’ll create a nice white to red colour palette using the colorRampPalette function, and then we have to scale the population data so it ranges from 1 to the max palette value and store that in a variable. Here I’ve arbitrarily chosen 128. Finally we call plot and pass that vector of colours into the col parameter:
```r
# Set the palette
p <- colorRampPalette(c("white", "red"))(128)
palette(p)

# Scale the total population to the palette
pop <- sh2@data$Total.Population
cols <- (pop - min(pop))/diff(range(pop))*127+1
plot(sh, col=cols)
```
And here’s the glorious result!

Cool. You can see that the population is greater for some of the larger neighbourhoods, notably on the east end and in The Waterfront Communities (i.e. condoland).

I’m not crazy about this white-red palette, so let’s use RColorBrewer’s Spectral palette, which is one of my faves:

```r
# RColorBrewer, Spectral palette
library(RColorBrewer)

p <- colorRampPalette(brewer.pal(11, 'Spectral'))(128)
palette(rev(p))
plot(sh2, col=cols)
```

There, that’s better. The dark red neighborhood is Woburn. But we still don’t have a legend so this choropleth isn’t really telling us anything particularly helpful. And it’d be nice to have the polygons overplotted onto map tiles. So let’s use ggmap!

#### ggmap

In order to use ggmap we have to decompose the shapefile of polygons into something ggmap can understand (a dataframe). We do this using the fortify command. Then we use ggmap’s very handy qmap function, to which we can just pass a search term like we would in Google Maps; it fetches the map tiles for us automatically, and then we overplot the data using standard calls to geom_polygon, just like you would in other visualizations using ggplot.

The first polygon call is for the filled shapes and the second is to plot the black borders.

```r
# GGPLOT
library(ggmap)

points <- fortify(sh, region = 'AREA_S_CD')

# Plot the neighborhoods
toronto <- qmap("Toronto, Ontario", zoom=10)
toronto +
  geom_polygon(aes(x=long, y=lat, group=group, alpha=0.25), data=points, fill='white') +
  geom_polygon(aes(x=long, y=lat, group=group), data=points, color='black', fill=NA)
```
Voila!

Now we merge the demographic data just like we did before, and ggplot takes care of the scaling and legends for us. It’s also super easy to use different palettes by using scale_fill_gradient and scale_fill_distiller for ramp palettes and RColorBrewer palettes respectively.

```r
# Merge the fortified polygon data with the demographic data, using the neighbourhood ID
points2 <- merge(points, demo, by.x='id', by.y='Neighbourhood.Id', all.x=TRUE)
points2 <- points2[order(points2$order), ]  # keep the polygon vertex order intact after the merge

# Plot
toronto +
  geom_polygon(aes(x=long, y=lat, group=group, fill=Total.Population), data=points2, color='black') +
  scale_fill_gradient(low='white', high='red')

# Spectral plot
toronto +
  geom_polygon(aes(x=long, y=lat, group=group, fill=Total.Population), data=points2, color='black') +
  scale_fill_distiller(palette='Spectral') + scale_alpha(range=c(0.5,0.5))
```

So there you have it! Hopefully this will be useful for other R users wishing to make nice maps in R using shapefiles, or those who would like to explore using ggmap.

#### References & Resources

Neighbourhood boundaries at Toronto Open Data:
Demographic data from Well-being Toronto:

## Toronto Data Science Group – One or the Other: An Overview of Binary Classification Methods

So Chris was kind enough to invite me to speak at the Toronto Data Science Group again this past Thursday. I spoke on binary classification, and made an effort to cover a fair bit of ground and some technical detail, while still making it accessible. I wanted to give an overview for an audience that was more interested in the ‘how’, and the practical realities of using classification to solve problems within an organization.

As before, I’ll keep my observations more about presenting and less about the content.

The meetup is a lot different now, having presentations at venues like MaRS or the conference room at Thompson Hotel with large audiences, as opposed to the early days when it was much smaller.

Speaking to a larger group is challenging; both in that it’s more nerve-racking, and I also noticed it was harder to make eye contact and include the whole audience than I am used to with smaller groups. The temptation is to just look out straight ahead in front of you. Speaking in front of a podium has its disadvantages this way, but it does keep you anchored and give you something on which to rest your hands and remain centered. Looking back toward the screen is usually a bad idea when presenting regardless of audience size, unless you are pointing something out, and is doubly so when that screen is very large and above you.

Some folks were kind enough to take some photos of me during the talk for social media and the like. In retrospect, while I do try to have a very visual style (and inject some humour with it) I think it can come across as overly simplistic and flippant in certain contexts, such as with this larger group. There’s a balance to be struck there, I’m sure. Also, as always, you need to be mindful of how large you are making things on your slides (especially text), given the size of the screen with respect to the venue.

The point I made about the explainability of different classification methods to the non-technical audience or end consumer (i.e. client) receiving the results of their application was less controversial than I would have thought. Chris commented on this as well.

As always I was overly ambitious and was able to get through a lot less material in the timeframe than I originally would have thought.

I was asked some very insightful and detailed questions, some of which I wasn’t totally prepared to answer. Talking about something is fairly easy, I think, because you can put together exactly what you want to say and rehearse; it’s in the answering of the questions that people decide whether you really know the subject, or are just putting pretty pictures up on the screen and painting in broad verbal strokes. Many people in the audience seemed to have assumed that because I was speaking on the topic of binary classification I was a complete expert on it – there’s a danger here too, I think, when you see anyone give a presentation.

All in all, I think the talk was very well received. As always I learned a lot putting it together, and even more afterward, discussing with Toronto’s data scientists and knowledgeable analysts with insightful points of view.

Looking forward to the next one.

## I’m Dreaming of a White Christmas

I’m heading home for the holidays soon.

It’s been unseasonably warm this winter, at least here in Ontario, so much so that squirrels in Ottawa are getting fat. I wanted to put together a really cool post predicting the chance of a white Christmas using lots of historical climate data, but it turns out Environment Canada has already put together something like that by crunching some numbers. We can just slam this into Google Fusion Tables and get some nice visualizations of simple data.

#### Map

It seems everywhere above a certain latitude has a much higher chance of having a white Christmas in recent times than places closer to the American border and on the coast, which I’m going to guess is likely due to how cold it gets in those areas on average during the winter. Sadly, Toronto has less than a coin-flip’s chance of a white Christmas in recent times, with only a 40% chance of snow on the ground come the holiday.

#### Chart

But just because there’s snow on the ground doesn’t necessarily mean that your yuletide weather is worthy of a Christmas storybook or holiday movie. Environment Canada also has a definition for what they call a “Perfect Christmas”: 2 cm or more of snow on the ground and snowfall at some point during the day. Which Canadian cities had the most of these beautiful Christmases in the past?

Interestingly Ontario, Quebec and Atlantic Canada are better represented here, which I imagine has something to do with how much precipitation they get due to proximity to bodies of water, but hey, I’m not a meteorologist.
A white Christmas would be great this year, but I’m not holding my breath. Either way it will be good to sit by the fire with an eggnog and not think about data for a while. Happy Holidays!

## I Heart Sushi

#### Introduction

I like sushi.

I’ve been trying to eat a bit better lately though (aren’t we all?) and so got to wondering: just how bad for you is sushi exactly? What are some of the better nutritional choices I can make when I go out to eat at my favorite Japanese(ish) place? What on the menu should I definitely avoid?

And then I got to thinking, the way I normally think about the world: hey, it’s all just data. I remembered how I could just take some nutritional information as raw data, as I’d previously done ages ago for Mickey D’s, and see if anything interesting pops out. Plus this seemed like as good an excuse as any to do some work with the good old data analysis and visualization stack for Python, and IPython notebooks, instead of my usual go-to tool of R.

So let’s have a look, shall we?

#### Background

As always, the first step is getting the data; sometimes the most difficult step. Here the menu I chose to use was the one from Sushi Stop (I am in no way affiliated nor associated with said brand, nor am I endorsing it), where the nutritional information unfortunately was only available as a PDF, as is often the case.

This is a hurdle that data analysts (but more often, I think, research analysts and data journalists) can run into. Fortunately there are tools at our disposal to deal with this kind of thing, so not to worry. Using the awesome Tabula and a little bit of ad hoc cleaning from the command line, it was a simple matter of extracting the data from the PDF into a convenient CSV. Boom, and we’re ready to go.

Tabula

The data comprises 335 unique items in 17 different categories with 15 different nutritional variables. Let’s dig in.

#### Analysis

First we include the usual suspects in the python data analysis stack (numpy, matplotlib and pandas), then read the data into a dataframe using pandas.

In [1]:
```python
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```
In [2]:
```python
data = pd.read_csv("tabula-nutritional-information.csv", delimiter=",")
```

Okay, what are we working with here? Let’s take a look:

In [3]:
```python
print data.columns
print len(data.columns)
data.head()
```
```
Index([u'category', u'item', u'serving_size', u'calories', u'fat', u'saturated_fat',
       u'trans_fat', u'cholesterol', u'sodium', u'carbohydrates', u'fibre', u'sugar',
       u'protein', u'vitamin_a', u'vitamin_c', u'calcium', u'iron'], dtype='object')
17
```
Out[3]:
| | category | item | serving_size | calories | fat | saturated_fat | trans_fat | cholesterol | sodium | carbohydrates | fibre | sugar | protein | vitamin_a | vitamin_c | calcium | iron |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | APPETIZERS & SALADS | Shrimp Tempura | 60 | 180 | 8.0 | 0.0 | 0 | 40 | 125 | 18 | 0 | 0 | 8 | 0 | 0 | 0 | 0 |
| 1 | APPETIZERS & SALADS | Three salads | 120 | 130 | 3.5 | 0.0 | 0 | 60 | 790 | 13 | 4 | 8 | 8 | 2 | 6 | 40 | 8 |
| 2 | APPETIZERS & SALADS | Wakame | 125 | 110 | 2.0 | 0.0 | 0 | 0 | 1650 | 13 | 4 | 9 | 0 | 0 | 0 | 110 | 0 |
| 3 | APPETIZERS & SALADS | Miso soup | 255 | 70 | 3.0 | 0.5 | 0 | 0 | 810 | 8 | 1 | 1 | 6 | 0 | 0 | 20 | 25 |
| 4 | APPETIZERS & SALADS | Grilled salmon salad | 276 | 260 | 19.0 | 2.5 | 0 | 30 | 340 | 12 | 3 | 6 | 12 | 80 | 80 | 8 | 8 |

5 rows × 17 columns

Let’s look at the distribution of the different variables. You can see that most are heavily skewed or follow power law / log-normal type distributions, as most things in nature do. Interestingly, there is a little blip in the serving sizes around 600, which we’ll see later is the ramen soups.

In [4]:
```python
# Have a look
plt.figure(0, figsize=(25,12), dpi=80)
for i in range(2,len(data.columns)):
    fig = plt.subplot(2,8,i)
    plt.title(data.columns[i], fontsize=25)
    plt.hist(data[data.columns[i]])
#    fig.tick_params(axis='both', which='major', labelsize=15)
plt.tight_layout()
```

Let’s do something really simple, and without looking at any of the other nutrients just look at the caloric density of the foods. We can find this by dividing the number of calories in each item by the serving size. We’ll just look at the top 10 worst offenders or so:

In [5]:
```python
data['density'] = data['calories']/data['serving_size']
data[['item','category','density']].sort('density', ascending=False).head(12)
```
Out[5]:
| | item | category | density |
|---|---|---|---|
| 314 | Yin Yang Sauce | EXTRAS | 5.000000 |
| 311 | Ma! Sauce | EXTRAS | 4.375000 |
| 75 | Akanasu (brown rice) | HOSOMAKI | 3.119266 |
| 0 | Shrimp Tempura | APPETIZERS & SALADS | 3.000000 |
| 312 | Spicy Light Mayo | EXTRAS | 2.916667 |
| 74 | Akanasu | HOSOMAKI | 2.844037 |
| 67 | Akanasu avocado (brown rice) | HOSOMAKI | 2.684564 |
| 260 | Teriyaki Bomb - brown rice (1 pc) | TEMARI | 2.539683 |
| 262 | Teriyaki Bomb - brown rice (4 pcs) | TEMARI | 2.539683 |
| 201 | Inferno Roll (brown rice) | SUMOMAKI | 2.395210 |
| 259 | Teriyaki Bomb (1 pc) | TEMARI | 2.380952 |

12 rows × 3 columns

The most calorically dense thing is Yin Yang Sauce, which as far as I could ascertain is just mayonnaise and something else put in a yin-yang shape on a plate. Excluding the other sauces (I assume Ma! also includes mayo), the most calorically dense foods are the variations of the Akanasu roll (sun-dried tomato pesto, light cream cheese, sesame), shrimp tempura (deep fried, so not surprising) and the teriyaki bombs, which are basically seafood, cheese and mayo smushed into a ball, deep fried and covered with sauce (guh!). I guess sun-dried tomato pesto has a lot of calories. Wait a second, does brown rice have more calories than white? Oh right, sushi is made with sticky rice, and yes, yes it does. Huh, today I learned.

We can get a more visual overview of the entire menu by plotting the two quantities together: calories on the y-axis against serving size on the x-axis, so that caloric density is the slope from the origin to each point. Here we colour by category and get a neat little scatterplot.

In [7]:
```python
# Get the unique categories
categories = np.unique(data['category'])

# Get the colors for the unique categories
cm = plt.get_cmap('spectral')
cols = cm(np.linspace(0, 1, len(categories)))

# Iterate over the categories and plot
plt.figure(figsize=(12,8))
for category, col in zip(categories, cols):
    d = data[data['category']==category]
    plt.scatter(d['serving_size'], d['calories'], s=75, c=col, label=category.decode('ascii', 'ignore'))
    plt.xlabel('Serving Size (g)', size=15)
    plt.ylabel('Calories', size=15)
    plt.title('Serving Size vs. Calories', size=18)
legend = plt.legend(title='Category', loc='center left', bbox_to_anchor=(1.01, 0.5),
                    ncol=1, fancybox=True, shadow=True, scatterpoints=1)
legend.get_title().set_fontsize(15)
```

You can see that the nigiri & sashimi generally have smaller serving sizes and so fewer calories. The ramen soup is in a category all its own, with much larger serving sizes than the other items, as I mentioned before and as we saw in the histograms. The other rolls are kind of in the middle. The combos, small ramen soups and some of the appetizers and salads also sit away from the ‘main body’ of the rest of the menu.

Points which lie further above the line y=x have higher caloric density, and you can see that even though the top ones we picked out above had the highest raw values and we can probably guess where they are in the graph (the sauces are the vertical blue line near the bottom left, and the Akanasu are probably those pairs of dark green dots to the right), there are other categories which are probably worse overall, like the cluster of red which is sushi pizza. Which category of the menu has the highest caloric density (and so is likely best avoided) overall?

In [8]:
```python
# Find the most calorically dense categories on average
density = data[['category','density']]
grouped = density.groupby('category')
grouped.agg(np.average).sort('density', ascending=False).head()
```
Out[8]:
| category | density |
|---|---|
| EXTRAS | 2.421875 |
| SUSHI PIZZA | 2.099515 |
| CRISPY ROLLS | 1.969304 |
| TEMARI | 1.807691 |
| HAKO | 1.583009 |

5 rows × 1 columns

As expected, we see that other than the extras (sauces) which have very small serving sizes, on average the sushi pizzas are the most calorically dense group of items on the menu, followed by crispy rolls. The data confirm: deep fried = more calories.

What if we were only concerned with fat (as many weight-conscious people dining out are)? Let’s take a look at the different categories with a little more depth than just a simple average:

In [9]:
```python
# Boxplot of fat content
fat = data[['category','fat']]
grouped = fat.groupby('category')

# Sort
df2 = pd.DataFrame({col:vals['fat'] for col,vals in grouped})
meds = df2.median()
meds.sort(ascending=True)
df2 = df2[meds.index]

# Plot
plt.figure(figsize=(12,8))
fatplot = df2.boxplot(vert=False)
```

While the combos and the appetizers and salads have very wide ranges in their fat content, we see again that the sushi pizza and crispy rolls have the most fat collectively and so are best avoided.

Now, another thing people are often worried about when they are trying to eat well is the amount of sodium they take in. So let’s repeat our previous approach to visually examining caloric density, only this time plotting it as a single metric on the x-axis and looking at where different items on the menu sit with regard to their salt content on the y-axis.

In [10]:
```python
fig = plt.figure(figsize=(12,8))
plt.xlim(0,6)
plt.ylim(-50, 2000)
for category, col in zip(categories, cols):
    d = data[data['category']==category]
    plt.scatter(d['density'], d['sodium'], s=75, c=col, label=category.decode('ascii', 'ignore'))
    plt.xlabel('Caloric density (calories/g)', size=15)
    plt.ylabel('Sodium (mg)', size=15)
    plt.title('Sodium vs. Caloric Density', size=18)
legend = plt.legend(title='Category', loc='center left', bbox_to_anchor=(1.01, 0.5),
                    ncol=1, fancybox=True, shadow=True, scatterpoints=1)
legend.get_title().set_fontsize(15)
```

Here we can see that while the extras (sauces) are very calorically dense, you’re probably not going to take in a crazy amount of salt unless you go really heavy on them (bottom right). If we’re really worried about salt, the ramen soups should be avoided, as most of them have very high sodium content (the straight line of light green near the left), some north of 1500 mg, which is the daily recommended intake from Health Canada for adults 14-50. There are also some of the other items we’ve seen before not looking so good (sushi pizza). Some of the temari (like the teriyaki bombs) and sumomaki (‘regular’ white-on-the-outside maki rolls) should probably be avoided too. But which ones?

A plot like this is pretty crowded, I’ll admit, so it is really better explored interactively, and we can do that using the very cool (and very much under development) MPLD3 package, which combines the convenience of matplotlib with the power of D3.

Below is the same scatterplot, only interactive, so you can mouse over and see what each individual point is. The items to be most avoided (top right in grey and orange), are indeed the teriyaki bombs, as well as the inferno roll (tempura, light cream cheese, sun-dried tomato pesto, red and orange masago, green onion, spicy light mayo, spicy sauce, sesame) as we saw before. Apparently that sun-dried tomato pesto is best taken in moderation.

The Akanasu rolls are the horizontal line of 4 green points close by. Your best bet is probably just to stick to the nigiri and sashimi, and maybe some of the regular maki rolls closer to the bottom left corner.

In [11]:
```python
import mpld3

fig, ax = plt.subplots(figsize=(12,8))
ax.set_xlim(0,6)
ax.set_ylim(-50,2000)
N = 100
for category, col in zip(categories, cols):
    d = data[data['category']==category]
    scatter = ax.scatter(d['density'], d['sodium'], s=40, c=col, label=category.decode('ascii', 'ignore'))
    labels = list(d['item'])
    tooltip = mpld3.plugins.PointLabelTooltip(scatter, labels=labels)
    mpld3.plugins.connect(fig, tooltip)
mpld3.display()
```
Out[11]:

#### Conclusion

Well, there we have it folks. A simple look at the data tells us some common-sense things we probably already knew:

- Deep fried foods will make you fat
- Mayo will make you fat
- Soup at Japanese restaurants is very salty
- Sashimi is healthy if you go easy on the soy

And surprisingly, one thing I would not have thought: that sundried tomato pesto is apparently really bad for you if you’re eating conscientiously.

That’s all for now. See you next time and enjoy the raw fish.

#### References and Resources

Sushi Stop – Nutritional Information (PDF)
http://www.sushishop.com/themes/web/assets/files/nutritional-information-en.pdf

Health Canada – Sodium
http://www.hc-sc.gc.ca/fn-an/nutrition/sodium/index-eng.php

code & data on github
https://github.com/mylesmharrison/i_heart_sushi/