#Foodporn: Obesity Patterns in Culinary Interactions
1 Qatar Computing Research Institute, Qatar
2 Queen Mary University of London, UK
3 University of Cambridge, UK
arXiv:1503.01546v1 [cs.CY] 5 Mar 2015

ABSTRACT
We present a large-scale analysis of Instagram pictures taken at 164,753 restaurants by millions of users. Motivated by the obesity epidemic in the United States, our aim is three-fold: (i) to assess the relationship between fast food and chain restaurants and obesity, (ii) to better understand people’s thoughts on and perceptions of their daily dining experiences, and (iii) to reveal the nature of social reinforcement and approval in the context of dietary health on social media. When we correlate the prominence of fast food restaurants in US counties with obesity, we find that the Foursquare data shows a greater correlation, at 0.424, than official survey data from the County Health Rankings. Our analysis further reveals a relationship between small businesses and local foods and better dietary health, with such restaurants getting more attention in areas of lower obesity. However, even in such areas, social approval favors unhealthy foods high in sugar, with donut shops producing the most liked photos. Thus, the dietary landscape our study reveals is a complex ecosystem, with fast food playing a role alongside social interactions and personal perceptions, which often may be at odds.

Keywords
Social Media, Foursquare, Instagram, Dietary Health, Fast Food, Food Perception, Social Approval, Obesity

1. INTRODUCTION
Food and dining is an important social and cultural experience, and today our social media feeds are filled with individuals checking in at restaurants with friends, sharing both healthy and unhealthy dining experiences, and recommending restaurants and dishes to their social network^1 – to the point that some have suggested our obsession with food has grown to a “food fetish” [16]. But precisely because of its ubiquity and popularity, there are strong indicators of the usefulness of these media [1, 10] to record the everyday interactions between individuals, society, and food.

Meanwhile, the alarming rise in economic and societal costs of obesity and diabetes has put these diet-related ailments on an “epidemic” scale [18], and fast food has been widely cited as an important contributor [12]. Recent studies have shown that neighborhoods with more fast food restaurants had significantly higher odds of diabetes and obesity [2], and, in particular, childhood obesity [4]. These statistics, however, are static, and fail to capture the everyday dining outings of their participants, their thoughts and feelings about the food, and the social setting in which it takes place. Recently, the medical and healthcare communities have proposed utilizing social media data as a useful resource for monitoring people’s eating habits and specifically those of obesity and diabetes patients [6].

In this paper, we use data from two of the world’s largest Online Social Networks (OSNs), namely Foursquare and Instagram, in order to study (i) the relationship between fast food and chain restaurants and obesity, (ii) the user thoughts, feelings, and perceptions of their dining experiences, and (iii) social approval and interaction captured on these sites. Instagram is currently the world’s most popular photo-sharing platform and, with over 300 million users, is bigger than Twitter. Using this rich source of data and social interactions, we focus on the pictures shared at 164,753 restaurants around the United States by over 3 million individuals.

This data, we find, reveals the correlation between the prevalence of fast food restaurants in a county and its obesity rate more clearly than statistics obtained by the Robert Wood Johnson Foundation and the University of Wisconsin Population Health Institute in the 2013 County Health Rankings. Chain and fast food restaurants have fewer pictures and unique users visiting them, and pictures taken there receive both fewer likes and comments. Indeed, the connection between local restaurants and obesity is emphasized by the fact that users from low-obesity regions use tags such as #smallbiz and #eatlocal much more frequently than users in more obese areas. Even the users themselves realize the food in fast food places is unhealthy, with pictures taken there having twice as many tags associated with unhealthy topics. Unfortunately, social feedback (in terms of likes) often favors unhealthy restaurants – donut, cupcake, and burger places – despite individual users associating #foodporn with predominantly Asian cuisines. These and other findings we describe here provide a window into the daily dining experiences of millions of people, potentially informing intervention and policy decisions.

^1 http://www.psychologytoday.com/blog/comfort-cravings/201008/10-reasons-why-people-post-food-pictures-facebook
[Figure 1, three panels, x-axis: County (n=316): (a) Population per county used; (b) Pictures posted per county used; (c) Unique users per county]
Figure 1: Populations, pictures posted, and unique users per US county in the Instagram dataset, ordered by population (shown in (a)) in all three graphs
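The proportionality check that Figure 1 summarizes (Section 2.5) rests on Spearman's rank correlation between per-county quantities such as population and unique users. The following is a minimal sketch of that statistic, computed as the Pearson correlation of average ranks. The function names and the example county values are hypothetical and are not taken from the paper's data or code.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain (unweighted) Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-county values, for illustration only.
population = [100, 200, 300, 400, 500]
unique_users = [1, 3, 2, 5, 4]
rho = spearman(population, unique_users)
```

Because only ranks enter the computation, the statistic is insensitive to the heavy skew in county populations, which is presumably why a rank correlation suits this comparison.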
Institute. It provides vital health factors including obesity and diabetes rates, demographics including income and education, and community variables including prevalence of fast food restaurants, in each county in America. In particular, we focus on the obesity rate, which ranges from counties with 13% obesity (Teton, Wyoming) to 48% (Greene, Alabama), as well as the percentage of fast food restaurants.

2.4 Fast food restaurants
Among its meta-data, the Foursquare dataset contains a classification of restaurants, which includes fast food. However, there is often variability in the way individual restaurants are associated with a class, with some fast food chains sometimes having a Burger designation. In an attempt to at least capture the most prominent fast food chains, we supplement this knowledge with the list of top 50 fast food chains by sales^6 and Wikipedia lists of fast-food and fast-casual chains in the US^7. We then consider all restaurants in the Fast Food and Burger categories of Foursquare or in these lists to be fast food places.

^6 http://www.qsrmagazine.com/reports/qsr50-2012-top-50-chart
^7 https://en.wikipedia.org/wiki/List_of_restaurant_chains_in_the_United_States

2.5 Population representativeness
Finally, we check whether the number of users our sample contains is representative of the overall population of the counties, summarized in Figure 1. We find that Spearman’s rank correlation ρ between the county population and the unique users is 0.746. We also find that the users contribute a proportional number of pictures, with ρ=0.9954 between the number of users and that of pictures per county. Thus, although the subset is likely to be of tech-savvy and young people, it is at least proportional to the overall population.

3. FAST FOOD
In his book, “Fast Food Nation: The Dark Side of the All-American Meal”, Eric Schlosser puts fast food as a central player in the rise of obesity in the US [12]. Here, we examine to what extent data obtained from CHR and that obtained using social media reveal the relationship between the prominence of fast food restaurants and obesity.

3.1 Fast food and obesity
Are fast food restaurants correlated with obesity? Here, we consider only the counties which have more than 5 unique restaurants in our Foursquare dataset (N=280). We first examine the County Health Rankings (CHR) data, correlating the percentage of fast food places with the percentage of obese population, with a resulting Pearson correlation of 0.246. When we weight by population (using weighted correlation), we get a slightly higher value of 0.267. Now, an alternative source of information is the share of fast food places in our Foursquare dataset, in which the restaurants are present only if a check-in took place. We find a substantially higher correlation of 0.424. The relationship between obesity rate and percent fast food places for the two sources of data is shown in Figures 4a,b.

[Figure 4 panels (a)-(c): y-axis "% Obesity"]
Figure 4: County-wide percentage of fast food places as measured by CHR (a) or Foursquare (b), and number of chain restaurants (c) to percentage of obesity

Note that our data shows a much less prominent share of fast food restaurants. This may be due to our definition of fast food restaurants (see previous section) differing from that of CHR (unfortunately, their definition is not publicly available). Alternatively, social media users may label the restaurants as something other than fast food. Finally, they may not check in to fast food restaurants as much, and we are capturing the peculiar dining behavior of the population active on Foursquare.

We also check the correlation of the prevalence of fast food restaurants and various demographics (available in the CHR data). Among race, income, and poverty-related variables, the highest was the correlation with the population of children under 18 years of age at 0.499 for CHR and 0.450 for Foursquare data, indicating higher exposure of families and kids to these restaurants.

3.2 Local places versus chain restaurants
Because our dataset allows us to get an aggregate view of all restaurants Foursquare users visit, we also take a data-driven approach to defining what is a chain restaurant. Concretely, across the 164,753 unique Foursquare locations we looked for exact repetitions in place names, irrespective of upper or lower case. The most frequent repetition was “Starbucks” (4,870) followed by other popular chains. Any name repeated 10 or more times was then considered a chain. Other places exactly matching, up to casing, or containing such a location, such as “Starbucks at Super Target”, were then considered as part of a chain. This worked well and also caught frequent variants such as both “McDonald’s” and “McDonalds”. Manual inspection showed that the minimum threshold of 10 worked well to distinguish between proper chains and merely frequently repeated restaurant names such as “Asia Wok”. The threshold was also low enough to allow a restaurant to have a handful of local branches, without being considered a chain. Out of the chain restaurants we detected, 30% were not designated as fast food (i.e. they are “slow” food), suggesting that most chains are fast food, but not all (whereas only 8% of fast food restaurants were not detected as chain). Since 72% of our dataset is slow food, chains are overwhelmingly likely to be fast food, compared to the overall distribution.

Figure 4c shows the relation of the number of chain restaurants in a county to % obesity. There is no clear correlation between obesity and these restaurants. However, we do find a large discrepancy in the activity of Instagram users – they are much more likely to share a photo from a local restaurant than a chain. On average, there are 150 photos shared from a chain restaurant, compared to 396 at a local one.

We also find that although chain restaurants benefit from [...] local chains get 107 users. Per-user, there is also a difference, with those going to local restaurants sharing 3.6 images compared to 3.2 at a chain. Similar observations can be made for fast vs slow food. Figures 5(a-d) illustrate these statistics, with long tails of extremely active restaurants (note that y-axis is in log scale) for local and slow food restaurants.

[Figure 5 panels (a)-(d): y-axes "Photos per restaurant" (a,c) and "Photos per user" (b,d); x-axes "Is fast food" (a,b) and "Is a chain" (c,d), with values 0 and 1]
Figure 5: Distribution of per-restaurant statistics: number of photos posted for a restaurant (a,c) and average number of photos each user posted (b,d) for fast-food vs slow-food (a,b) and local vs chain (c,d) restaurants (y-axis is log-scale)

4. FOOD PERCEPTION
Users tag their pictures to provide the context in which the dietary experience took place. First, we turn our attention to the most used hashtag in our dataset, which is also one unambiguously expressing delight in a dietary experience: #foodporn. We track this hashtag across the restaurant categories, and the resulting top and bottom restaurants by shares of the tag are listed in Table 1. Asian foods, including Indonesian, Malaysian, and Vietnamese, dominate the top, along with Molecular Gastronomy, which uses technical innovations to create new textures and dishes. At the bottom we see drinking establishments (which is understandable, since #foodporn is about food, not drinks), as well as Swiss, German, and Fast food. Also, Wings and Fish & Chips join Gluten-free as the least exciting restaurants. This ranking is in stark contrast to the number of pictures associated with the categories (shown in the Npic column), with American, Coffee Shop, and Mexican being the most popular, suggesting that the food people love often does not come from their everyday visitations. But it is not necessarily the case that exciting foods come from little-posted places. In fact, the correlation between the #foodporn hashtag and number of pictures is only slightly negative at -0.174.

Note that whereas #foodporn is the most frequently used hashtag (1,890,691 occurrences, followed by #food, #nyc, #yum, #love, #dinner, and #instagood), among the frequent hashtags, the one associated with a camera application, #vscocam^8, is the most liked one (on average and in median). However, #instahealth, which is associated with health and motivation, is even more liked when used, even though it is not in the top 50.

^8 https://play.google.com/store/apps/details?id=com.vsco.cam&hl=en
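Returning to the data-driven chain definition of Section 3.2, the name-repetition heuristic can be sketched as follows. This is an illustrative reading of the description, not the authors' code. The function name `detect_chains`, the parameter `min_repeats`, and the use of plain substring containment for "containing such a location" are assumptions.

```python
from collections import Counter


def detect_chains(place_names, min_repeats=10):
    """Flag places that belong to a chain, following the Section 3.2 description.

    A lower-cased name repeated at least `min_repeats` times is treated as a
    chain name. Any place whose lower-cased name equals or contains a chain
    name is flagged as part of that chain (assumed: substring containment).
    """
    normalized = [name.lower() for name in place_names]
    counts = Counter(normalized)
    chain_names = {name for name, count in counts.items() if count >= min_repeats}
    flags = [
        any(chain in name for chain in chain_names)  # equality is a special case
        for name in normalized
    ]
    return chain_names, flags


# Hypothetical place names, for illustration only.
names = ["Starbucks"] * 9 + ["starbucks", "Starbucks at Super Target"] + ["Asia Wok"] * 3
chain_names, is_chain = detect_chains(names)
```

Substring containment can over-match when a short chain name happens to appear inside an unrelated place name, which may be one reason the text reports checking the threshold by manual inspection.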
Table 1: Top and bottom restaurant categories by shares of #foodporn

Rank  Category              #foodporn share  Npic
1     Malaysian             0.21             6,169
2     Vietnamese            0.17             119,915
3     Indonesian            0.17             257
4     Dumplings             0.17             8,092
5     Molecular Gastronomy  0.16             6,668
6     Korean                0.16             115,898
7     Peruvian              0.15             7,472
8     Mac & Cheese          0.15             6,154
9     Thai                  0.14             119,563
10    Japanese              0.14             260,188
11    Dim Sum               0.14             24,322
12    Filipino              0.13             6,138
13    Sushi                 0.12             425,425
14    Chinese               0.12             178,512
15    Asian                 0.12             210,087
...
66    Fish & Chips          0.06             538
67    Gluten-free           0.06             614
68    Gastropub             0.05             139,059
69    Fast Food             0.05             114,103
70    Tea Room              0.05             69,477
71    Food                  0.05             21,619
72    Wings                 0.05             98,737
73    Swiss                 0.05             4,198
74    German                0.04             38,412
75    Juice Bar             0.04             38,944
76    Brewery               0.02             284,258
77    Afghan                0.02             229
78    Coffee Shop           0.02             848,212
79    Winery                0.01             31,229
80    Distillery            0.00             2,315

4.1 Hashtag labeling
Beyond #foodporn, among the many dimensions present in these tags, we focus on those concerning health, emotion, and social aspects. We use the CrowdFlower^9 platform to crowdsource the labeling of the top 2,000 used tags. For each dimension, a worker completed a task consisting of 10 words. For each word, we required 3 labels by different annotators. The experiments ran in “quiz” mode, requiring correct answers for 3 out of 4 posed gold-standard questions to begin the task, and with 1 test question in every following task for quality control. The agreement was high, with label overlap at 92-99%. Our labeling effort resulted in four binary variables: healthy (refers to healthy food), unhealthy (refers to unhealthy food), social (refers to a social setting), and emotion (expresses some emotion). We made this term list available to download online^10.

^9 http://www.crowdflower.com/
^10 http://bit.ly/14ROSvn

4.2 Perception of fast food
Using the above hashtag classifications, we can now observe the extent to which each tag group was used in pictures taken at fast food restaurants versus all others. Figure 6 shows the proportion of the use of these tags. Hashtags that have to do with emotion are the most used, and those about healthy foods the least. We see a marked difference in the use of unhealthy food tags, with fast food showing twice as much use as slow food, indicating that the users perceive it to be unhealthy. However, when we compare the use of #foodporn, we see it used more for slow food restaurants. Social occasions are also more present in slow food places. Keep in mind that the sensitivity of this method depends on the extent of the vocabulary, and the particular percentages of the images are not as informative as their comparative proportion.

[Figure 6: y-axis ticks 0.00 to 0.20; x-axis "Tag type" with groups emotion, healthy, unhealthy, social, foodporn; legend entry "fast food"]
Figure 6: Proportion of pictures with tags of a type (and single hashtag #foodporn) in fast food versus all other restaurants

When we look at chains versus local, we find that, on the contrary, users tended to post hashtags related to social events at chain restaurants (at 0.158) more than local (0.137) (significant at p<0.001), suggesting that chain restaurants which may not be labeled as fast food (such as The Cheesecake Factory or California Pizza Kitchen) play an important role in people’s social outings.

4.3 Chatter around obesity
Beyond the fast food restaurants, we are interested in the user activities and food perception that can be associated with areas of high obesity. Thus, we segment the data into low, middle, and high terciles by the percentage of obese residents, with the breaks at 21.8% and 29.0%, which were computed over the range of the counties in our dataset. Whereas the top ranking hashtags in each tercile are quite similar (dominated by #foodporn, #food, and #yum), when we subtract the rankings of the lowest from the highest tercile, we reveal tags which are associated more with high or low obesity regions.

Table 2 shows a selection of hashtags which have the most different rankings between the lowest and highest terciles. Concretely, we considered the top 20 terms in two lists that were filtered according to the minimum number of occurrences required (>= 100 for more strict or >= 500 for more inclusive lists). Note that this is not a complete list of terms, most of which are places, cuisines, and particular restaurants; the tags in the table are those left after excluding these. First, we notice that food sharing tags, including #sharefood and #ilovesharingfood, appear in the high obesity list, as well as #fatlife. Interestingly, whereas #desserts is more prominent in high obesity areas, the singular term #dessert is the 17th most popular tag in low obesity areas – showing a fine distinction between having one versus more than one dessert. On the low obesity
list we find tags associated with healthy food – #guiltfree, #eatwell, #plantbased – and those referring to local foods – #smallbiz, #eatlocal. There were also more individualistic tags such as #mycurrentsituation and #myview. Once again, we see an association between small local restaurants and healthier communities.

Table 2: Health and perception-related hashtags more prominent in high and low obesity counties, determined by prominence rank difference

List  High Obesity        Low Obesity
100   #foodstyling        #smallbiz
      #ilovesharingfood   #guiltfree
      #foodstamping       #whodat
      #foodphoto          #mycurrentsituation
      #fatlife
      #f52grams
500   #tacotuesday        #myview
      #foodforfoodies     #eatlocal
      #firsttime          #plantbased
      #finedining         #smoke
      #sharefood          #eatwell

5. SOCIAL APPROVAL
Finally, we turn to variables we associate with social interaction, and, more concretely, approval and engagement – the likes and comments a picture receives. Figure 7 shows the average number of likes (a) and comments (b) for a picture from areas with low, medium, and high obesity rate (as defined in the previous section). We find a large distinction between the implicit approval in terms of likes and conversation engagement in terms of comments between the three groups. Pictures taken in high obesity areas tend to have fewer likes and comments, and in terms of likes, 56% fewer than their counterparts coming from low obesity areas.

[Figure 7: axes "Average # of likes" and "Average # of comments"; surviving value labels: 33.37, 1.956]

Considering the tags associated with the pictures, we also find a distinct difference between the likes and comments when certain types of tags are present. Figure 8 shows the average number of likes for pictures with certain tags (for pictures having at least one tag). We observe that both healthy and unhealthy tags provoke more likes, echoing the observation of [14] that there are societal pressures both for and against unhealthy lifestyles. In low obesity areas healthy and unhealthy tags are associated with (roughly) a similar number of likes whereas this difference is more pronounced in high obesity areas. Interestingly, “social” gains in popularity compared to “none”. Is being gregarious linked with [...]

[Figure 8: x-axis "Obesity tercile" (low, med, high); y-axis ticks 15 to 45; legend: Healthy, Unhealthy, Social, Emotion, None]
Figure 8: Average number of likes for pictures having or not having a certain type of hashtag, with 95% confidence bars

Finally, we find that pictures of dessert are most liked in our dataset, followed by mac & cheese, burgers, and French food (see Table 3). Interestingly, donuts and cupcakes top the rankings for low-obesity areas, and instead New American and street food dominate in high-obesity ones (full lists omitted due to space). Overall, the foods at the top of this list are high in sugar, fat, and carbohydrates, which have been shown to have addictive properties [5, 17]. Furthermore, the overwhelming approval of donuts in the low-obesity tercile is surprising, at 70 likes on average per picture, showing that the online interactions may bring out our desires, but not necessarily illustrate the typical offline behavior.

6. RELATED RESEARCH
Our study contributes to a growing body of work which uses social media for monitoring health-related activity. The potential for such studies to provide insights into larger scale societal behaviour trends and social interactions presents a unique advantage over standard methods of tracking dietary behavior, such as food diaries. Recently, Culotta [3] performed a linguistic analysis of tweets from US users and found many categories that were significant predictors of health statistics including teen pregnancy, health insurance coverage, and obesity. Silva et al. [13] use 5 million Foursquare check-ins (using Twitter data) and survey data, finding strong temporal and spatial correlation between individuals’ cultural preferences and their eating and drinking habits. Fried et

Table 3: Top and bottom restaurant categories by average number of likes per picture, and overall number of pictures

[...] density of fast food restaurants, even amplifying the effect beyond the available statistics.
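The tercile comparison of Section 4.3 (segmenting pictures by county obesity rate, ranking hashtags per tercile, and subtracting the low-tercile rank from the high-tercile rank) can be sketched as below. This is a hedged illustration only. The function names, the handling of values that fall exactly on a break, the alphabetical tie-breaking, and the choice to apply the minimum-occurrence filter to total counts are all assumptions that the text does not specify.

```python
from collections import Counter

# Tercile breaks reported in Section 4.3 (percent of obese residents).
LOW_BREAK, HIGH_BREAK = 21.8, 29.0


def tercile(obesity_pct):
    """Assign a county obesity rate to a tercile (break handling is assumed)."""
    if obesity_pct < LOW_BREAK:
        return "low"
    if obesity_pct < HIGH_BREAK:
        return "med"
    return "high"


def rank_tags(counts, tags):
    """Rank tags by descending count (1 = most used); ties break alphabetically."""
    ordered = sorted(tags, key=lambda tag: (-counts[tag], tag))
    return {tag: position + 1 for position, tag in enumerate(ordered)}


def rank_differences(pictures, min_count=100):
    """Return {tag: rank in high tercile - rank in low tercile}.

    `pictures` is an iterable of (county_obesity_pct, hashtags) pairs.
    Negative values mean the tag ranks higher (is more prominent) in
    high-obesity counties; positive values mean the opposite.
    """
    counts = {"low": Counter(), "med": Counter(), "high": Counter()}
    for obesity_pct, hashtags in pictures:
        counts[tercile(obesity_pct)].update(set(hashtags))
    total = counts["low"] + counts["med"] + counts["high"]
    kept = [tag for tag, count in total.items() if count >= min_count]
    low_rank = rank_tags(counts["low"], kept)
    high_rank = rank_tags(counts["high"], kept)
    return {tag: high_rank[tag] - low_rank[tag] for tag in kept}


# Hypothetical pictures, for illustration only.
example = [
    (18.0, ["#eatlocal", "#food"]),
    (19.0, ["#eatlocal", "#food"]),
    (20.0, ["#food"]),
    (25.0, ["#food"]),
    (33.0, ["#fatlife", "#food"]),
    (35.0, ["#fatlife", "#food"]),
    (36.0, ["#food"]),
]
differences = rank_differences(example, min_count=1)
```

Sorting the result by value would place tags characteristic of high-obesity counties at one end and those characteristic of low-obesity counties at the other, which corresponds to the two columns of Table 2.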