My Home Nutrition Analysis

Code for this project can be found at this repository. Check out some of my other work here.

Objective:

Over the past few years, I have started paying more attention to my nutrition, since nutrition is a large part of maintaining a healthy lifestyle. With this analysis, I want to observe how the nutrition of foods within my home varies. In addition to our household foods, I collected some nutritional data for fast food items we commonly enjoy.

About The Data:

I collected the data by reading the nutrition facts labels on food items in my house. For whole fruits and fast food items, I took the nutrition values from the relevant company's website.

Each item lists its macronutrients and micronutrients in the units given on the nutrition label. Some items contain a value of 0.99, which is a placeholder I used when the nutrition label said '< 1g'. Some items were given a value of zero for certain micronutrients where the label listed the micronutrient as a percentage rather than an actual measurement.
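
The placeholder rule can be written down as a small helper. This is a minimal sketch, assuming label values are typed in as strings in grams such as '12g' or '< 1g'; the name `label_to_grams` is hypothetical and is not part of the original notebook.

```python
PLACEHOLDER_BELOW_ONE = 0.99  # value the dataset uses when a label reads "< 1g"

def label_to_grams(label_text):
    """Convert a gram label such as '12g' or '< 1g' to a float (assumes grams only)."""
    text = label_text.strip().lower().replace(" ", "")
    if text.endswith("g"):
        text = text[:-1]  # drop the trailing unit
    if text == "<1":
        return PLACEHOLDER_BELOW_ONE
    return float(text)

print(label_to_grams("< 1g"))
print(label_to_grams("12g"))
print(label_to_grams("0.5 g"))
```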

Not all of the foods in my house were used for this data. I mainly collected the nutrition facts for foods and drinks we regularly consume, since some items in our pantry never get touched.

NOTE: I use the term "Caution Zone" to describe foods which I feel should be consumed cautiously. This is not a definitive categorization, since most foods can be consumed in moderation, and just because an item did not meet the caution criteria does not mean it is exempt from moderation. The values I used for the caution zones correspond to 16-20% of my personal daily intake, which I determined from a macronutrient calculator linked at the end of this notebook.[4]
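
To make the 16-20% rule concrete, the per-serving thresholds quoted later in this notebook (10 g of fat, 44 g of carbohydrates, 9 g of total sugars) can be turned back into the range of daily targets they would be consistent with. This is only a back-calculation sketch. The actual daily targets from the calculator are not stated here, and the function name is hypothetical.

```python
# Per-serving caution thresholds quoted in the fat, carbohydrate, and sugar sections (grams).
caution_thresholds_g = {"total fat": 10, "carbohydrates": 44, "total sugars": 9}

# The thresholds are described as 16-20% of a daily target.
LOW_SHARE, HIGH_SHARE = 0.16, 0.20

def implied_daily_range(threshold_g):
    """Return the (min, max) daily target in grams consistent with a threshold."""
    # A threshold that is 20% of the target implies the smallest target, 16% the largest.
    return threshold_g / HIGH_SHARE, threshold_g / LOW_SHARE

for nutrient, threshold in caution_thresholds_g.items():
    low, high = implied_daily_range(threshold)
    print(f"{nutrient}: {threshold} g per serving -> roughly {low:.1f} to {high:.1f} g per day")
```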

Analysis

I will begin by importing the modules, loading the Excel file with the data, and making sure there is no missing data and no incorrect data types.

In [1]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly as py
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
In [2]:
df = pd.read_excel('https://github.com/a-camarillo/nutrition-analysis/blob/master/data/nutrition_facts.xlsx?raw=true')
In [3]:
df.head()
Out[3]:
  | Food Item | Category | Serving Size | Serving Measurement(grams) | Calories | Total Fat(grams) | Saturated Fat(grams) | Trans Fat(grams) | Polyunsaturated Fat(grams) | Monounsaturated Fat(grams) | ... | Vitamin B6 | Folate(micrograms | Folic Acid(micrograms) | Vitamin A(micrograms) | Vitamin C(milligrams) | Vitamin E(milligrams) | Zinc(milligrams) | Vitamin B12(micrograms) | Phosphorus(milligrams) | Magnesium(milligrams)
0 | Oroweat Whole Wheat Bread | Bread | 1 Slice | 38.0 | 100 | 1.0 | 0.0 | 0.0 | 0.5 | 0.0 | ... | 0.0 | 0 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0
1 | Cap'n Crunch Crunch Berries | Cereal | 1 Cup | 37.0 | 150 | 2.0 | 0.5 | 0.0 | 0.0 | 0.5 | ... | 0.4 | 200 | 133 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0
2 | Cap'n Crunch | Cereal | 1 Cup | 38.0 | 150 | 2.0 | 0.5 | 0.0 | 0.0 | 0.5 | ... | 0.4 | 200 | 134 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0
3 | Kellogg's Froot Loops | Cereal | 1 1/3 Cup | 39.0 | 150 | 1.5 | 0.5 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0 | 45 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0
4 | Post Honey Bunches of Oats | Cereal | 1 Cup | 42.0 | 170 | 3.0 | 0.0 | 0.0 | 0.5 | 1.5 | ... | 0.0 | 400 | 240 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0

5 rows × 34 columns

In [4]:
df.isna().sum()
Out[4]:
Food Item                     0
Category                      0
Serving Size                  0
Serving Measurement(grams)    0
Calories                      0
Total Fat(grams)              0
Saturated Fat(grams)          0
Trans Fat(grams)              0
Polyunsaturated Fat(grams)    0
Monounsaturated Fat(grams)    0
Cholesterol(milligrams)       0
Sodium(milligrams)            0
Total Carbohydrates(grams)    0
Fiber(grams)                  0
Total Sugars(grams)           0
Added Sugars(grams)           0
Protein(grams)                0
Vitamin D(micrograms)         0
Calcium(milligrams)           0
Iron(milligrams)              0
Potassium(milligrams)         0
Thiamin(milligrams)           0
Riboflavin(milligrams)        0
Niacin(milligrams)            0
Vitamin B6                    0
Folate(micrograms             0
Folic Acid(micrograms)        0
Vitamin A(micrograms)         0
Vitamin C(milligrams)         0
Vitamin E(milligrams)         0
Zinc(milligrams)              0
Vitamin B12(micrograms)       0
Phosphorus(milligrams)        0
Magnesium(milligrams)         0
dtype: int64
In [5]:
df.dtypes
Out[5]:
Food Item                      object
Category                       object
Serving Size                   object
Serving Measurement(grams)    float64
Calories                        int64
Total Fat(grams)              float64
Saturated Fat(grams)          float64
Trans Fat(grams)              float64
Polyunsaturated Fat(grams)    float64
Monounsaturated Fat(grams)    float64
Cholesterol(milligrams)         int64
Sodium(milligrams)              int64
Total Carbohydrates(grams)      int64
Fiber(grams)                  float64
Total Sugars(grams)           float64
Added Sugars(grams)           float64
Protein(grams)                float64
Vitamin D(micrograms)         float64
Calcium(milligrams)             int64
Iron(milligrams)              float64
Potassium(milligrams)           int64
Thiamin(milligrams)           float64
Riboflavin(milligrams)        float64
Niacin(milligrams)            float64
Vitamin B6                    float64
Folate(micrograms               int64
Folic Acid(micrograms)          int64
Vitamin A(micrograms)         float64
Vitamin C(milligrams)         float64
Vitamin E(milligrams)         float64
Zinc(milligrams)              float64
Vitamin B12(micrograms)       float64
Phosphorus(milligrams)          int64
Magnesium(milligrams)           int64
dtype: object

There are no missing values and all of the data types are as expected, so now I'm going to do some quick cleaning of the column names to make things a little easier.
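
One way the renaming could be done is sketched below. This is an assumption about the approach, not necessarily the code used in this notebook, and `clean_column_name` and `UNIT_ABBREVIATIONS` are hypothetical names. It lower-cases each header, joins words with underscores, and abbreviates the unit in parentheses, which reproduces the naming pattern in the output that follows.

```python
UNIT_ABBREVIATIONS = {"grams": "g", "milligrams": "mg", "micrograms": "mcg"}

def clean_column_name(name):
    """Lower-case a raw header, join words with underscores, abbreviate the unit."""
    base, _, unit = name.partition("(")
    # rstrip(")") tolerates headers such as 'Folate(micrograms' that lack a closing parenthesis.
    unit = unit.rstrip(")").strip().lower()
    cleaned = "_".join(base.lower().split())
    if unit:
        cleaned += "_" + UNIT_ABBREVIATIONS.get(unit, unit)
    return cleaned

raw_names = ["Food Item", "Serving Measurement(grams)", "Vitamin B6", "Folate(micrograms"]
print([clean_column_name(name) for name in raw_names])

# With the DataFrame loaded earlier, the same function could be applied as:
# df = df.rename(columns=clean_column_name)
```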

In [9]:
df.columns
Out[9]:
Index(['food_item', 'category', 'serving_size', 'serving_measurement_g',
       'calories', 'total_fat_g', 'saturated_fat_g', 'trans_fat_g',
       'polyunsaturated_fat_g', 'monounsaturated_fat_g', 'cholesterol_mg',
       'sodium_mg', 'total_carbohydrates_g', 'fiber_g', 'total_sugars_g',
       'added_sugars_g', 'protein_g', 'vitamin_d_mcg', 'calcium_mg', 'iron_mg',
       'potassium_mg', 'thiamin_mg', 'riboflavin_mg', 'niacin_mg',
       'vitamin_b6', 'folate_mcg', 'folic_acid_mcg', 'vitamin_a_mcg',
       'vitamin_c_mg', 'vitamin_e_mg', 'zinc_mg', 'vitamin_b12_mcg',
       'phosphorus_mg', 'magnesium_mg'],
      dtype='object')

Before I begin visualizing the data, I am going to create a function for normalization, which allows the food items to be compared on an equal per-100-gram basis in addition to per serving.

In [10]:
def per_100g(series):
    ''' Pass in a macronutrient series and find its value per 100 grams
        for each item '''

    # Divide by the serving weight in grams, then scale up to 100 grams.
    value = (series / df['serving_measurement_g']) * 100
    return value

Fat

The first macronutrient I want to analyze is total fat. I will begin by adding a column for fat per 100 grams and looking at some of the top results.
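
A minimal sketch of this step, assuming a horizontal bar chart drawn with matplotlib. The rows and the column name `fat_per_100g` are hypothetical stand-ins so the example runs on its own. The real notebook would use the full nutrition DataFrame.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in rows; the real notebook uses the full nutrition DataFrame.
df = pd.DataFrame({
    "food_item": ["Item A", "Item B", "Item C"],
    "serving_measurement_g": [30.0, 100.0, 450.0],
    "total_fat_g": [15.0, 5.0, 45.0],
})

def per_100g(series):
    """Scale a per-serving nutrient Series to a per-100-gram basis."""
    return (series / df["serving_measurement_g"]) * 100

df["fat_per_100g"] = per_100g(df["total_fat_g"])

# Keep the 20 highest values (fewer if the frame is smaller), sorted for plotting.
top_fat = df.nlargest(20, "fat_per_100g").sort_values("fat_per_100g")

fig, ax = plt.subplots(figsize=(8, 6))
ax.barh(top_fat["food_item"], top_fat["fat_per_100g"])
ax.set_xlabel("Total fat (g per 100 g)")
ax.set_title("Top 20 Items by Fat Per 100g")
fig.tight_layout()
```

Sorting in ascending order before calling `barh` places the largest bar at the top of the chart.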

Out[16]:
Text(0.5, 1.0, 'Top 20 Items by Fat Per 100g')

Much to my surprise, Best Foods Mayonnaise and Skippy Peanut Butter are considerably high in fat per 100 grams. As someone who frequently consumes both of these products, I will definitely have to monitor my intake to avoid having too much fat in my diet.

Another surprisingly high contender for me is Ritz Crackers. It's hard not to eat an entire pack of these in one sitting, but I might have to reconsider the next time a craving hits.

Among the top 20 items for fat per 100 grams, the expected fast food items are there but much lower than some of the household items.

I want to also look at how the fat per 100 grams compares to a single serving since I am curious to see if the same items are as fatty relative to serving size.

The above plot compares each item's single serving to its respective total fat per 100 grams. Some takeaways from this plot are:

Assuming you adhere to proper serving sizes, Ritz Crackers and the Sabra Hummus are not as fattening as the previous plot might have indicated. Because each has a small serving size relative to its fat per 100 grams, the actual fat per serving is relatively small (about 5g each).

Lucerne Cheese Blend is also not as bad as the fat per 100 grams alone might have indicated, however it should still be consumed cautiously since the fat for a single serving is still about.

THE CAUTION ZONE: I am considering the caution zone (for total fat) to be foods that have high fat content per serving (greater than or equal to 10g). On the plot, these can be identified as the items at or beyond roughly the 10g mark for Total Fat Per 100 Grams combined with a Serving Size of 100 grams or more.

Looking all the way to the right is my go-to Rubio's choice, the Ancho Citrus Shrimp Burrito. At about 450 grams for the burrito and 10 grams of fat per 100 grams of serving, this burrito packs a whopping 45 grams of fat. This is definitely something to take note of, as I have never shied away from eating the whole thing in one sitting.

On the opposite side of the graph, and also worth noting, is one of my favorites, Skippy Creamy Peanut Butter. Although its serving size is on the lower end, the high fat per 100 grams reveals a single serving of peanut butter to have about 16 grams of fat. Again, the amount of peanut butter I use is something I will have to keep in mind the next time I go to make a sandwich.
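
The per-serving figures above come from reversing the per-100-gram normalization: fat per serving = fat per 100 g × serving size in grams / 100. A minimal sketch using the approximate burrito figures quoted in the text. The function names are hypothetical, and the second item is an invented example.

```python
FAT_CAUTION_G = 10  # per-serving caution threshold for total fat, in grams

def fat_per_serving(fat_per_100g, serving_g):
    """Convert a per-100-gram fat value back to grams of fat in one serving."""
    return fat_per_100g * serving_g / 100

def in_fat_caution_zone(fat_per_100g, serving_g, threshold_g=FAT_CAUTION_G):
    """True when one serving contains at least threshold_g grams of fat."""
    return fat_per_serving(fat_per_100g, serving_g) >= threshold_g

# Approximate burrito figures from the text: ~450 g serving, ~10 g fat per 100 g.
print(fat_per_serving(10, 450))       # 45.0
print(in_fat_caution_zone(10, 450))   # True

# A hypothetical 30 g serving at 15 g fat per 100 g stays below the threshold.
print(in_fat_caution_zone(15, 30))    # False
```

This also shows why a small serving can still land in the caution zone: what matters is the product of the two axes, not either one alone.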

Other culprits of high fat vary from fast food items like fries to some favorite household foods like tortillas.

I would also like to reiterate, as I likely will in each section, that the caution zone is not definitive and does not mean these items have to be excluded from one's diet; rather, I feel they should be consumed in moderation.

Carbohydrates

For Carbohydrates and Protein I will perform analysis similar to Fats.

Out[22]:
Text(0.5, 1.0, 'Top 20 Items by Carbohydrates Per 100g')

Looking at the carbohydrates per 100 grams, the main culprits are, for the most part, as expected. Many items in this list are grain-based products, which are known to have a higher carbohydrate content.

The surprise items on this list are the fruit snacks, Fruit By The Foot and Welch's mixed fruit. Since these are fruit-based foods, I did not expect them to rank high in carbohydrates.

The first thing I noted from this visualization is that the Annie's Organic Macaroni and Cheese actually contains more carbs than the Kraft Single Serving. However, the Kraft Macaroni and Cheese does contain more fat so there is the trade-off.

The second, more obvious, thing I noted is how few items there are in the caution zone for carbohydrates. The criterion for the carb caution zone was for a single serving to contain 44 grams of carbs or more.

So despite cereals topping the charts for carbs per 100 grams, if you adhere to the single serving size, they are actually an adequate source of carbohydrates.

Sugars

Sugars are a form of carbohydrate and contribute to overall carbohydrate intake, so for the sake of consistency I will analyze sugar content next.

Interestingly enough, there appears to be quite a bit of overlap not only between the total sugars and added sugars items, but with the high carb items as well. This makes sense since sugar content makes up part of the total carbohydrate content, so any item with both high carbohydrates and high sugar could be a potential red flag.

One other thing I find interesting from these charts is that the Snickers Candy Bar is second highest in terms of total sugars per 100 grams but does not appear in the top 20 for added sugar per 100 grams. This indicates that in terms of sugar content, a Snickers Bar might actually be better than some other food choices here.
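
A comparison like this can also be checked directly rather than by eye. The sketch below assumes per-100-gram sugar columns named `total_sugars_per_100g` and `added_sugars_per_100g` (hypothetical names) and uses invented rows. It finds items that rank high for total sugars but not for added sugars.

```python
import pandas as pd

TOP_N = 2  # the notebook's charts use 20; 2 keeps this toy example readable

# Hypothetical values for illustration only.
sugars = pd.DataFrame({
    "food_item": ["Item A", "Item B", "Item C", "Item D"],
    "total_sugars_per_100g": [50.0, 45.0, 40.0, 10.0],
    "added_sugars_per_100g": [50.0, 0.0, 38.0, 5.0],
})

top_total = set(sugars.nlargest(TOP_N, "total_sugars_per_100g")["food_item"])
top_added = set(sugars.nlargest(TOP_N, "added_sugars_per_100g")["food_item"])

only_in_total = sorted(top_total - top_added)  # high total sugar, not high added sugar
in_both = sorted(top_total & top_added)        # high on both rankings

print("High total sugar but not high added sugar:", only_in_total)
print("High in both:", in_both)
```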

The first graph here displays the total sugars per 100 grams versus single serving size in grams, with caution on any items that contain 9 or more grams of sugar per serving. The second graph contains all of the items which contain added sugars, and since added sugars should generally be avoided, this can be considered its own caution zone.
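
The caution criteria used across the fat, carbohydrate, and sugar sections can be applied in one pass. A minimal sketch with invented rows. The column names follow the cleaned names shown earlier, while `CAUTION_THRESHOLDS_G` and the sample values are assumptions made for illustration.

```python
import pandas as pd

# Per-serving thresholds in grams, as stated in the fat, carbohydrate, and sugar sections.
CAUTION_THRESHOLDS_G = {"total_fat_g": 10, "total_carbohydrates_g": 44, "total_sugars_g": 9}

# Hypothetical per-serving values for illustration only.
foods = pd.DataFrame({
    "food_item": ["Item A", "Item B", "Item C"],
    "total_fat_g": [16.0, 1.5, 0.0],
    "total_carbohydrates_g": [6, 33, 45],
    "total_sugars_g": [3.0, 12.0, 45.0],
    "added_sugars_g": [2.0, 12.0, 0.0],
})

# One boolean column per nutrient: True when a serving meets or exceeds the threshold.
caution = pd.DataFrame(
    {col: foods[col] >= limit for col, limit in CAUTION_THRESHOLDS_G.items()}
)
# Any item containing added sugars is treated as its own caution group.
caution["added_sugars_g"] = foods["added_sugars_g"] > 0
caution.insert(0, "food_item", foods["food_item"])

print(caution)
print(caution.drop(columns="food_item").sum())  # number of flagged items per nutrient
```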

Surprisingly all of the fruit/fruit-based items, except the single clementine, met the criteria for being high in sugar. Although fruits, particularly whole fruits, are generally considered to be essential in a well-balanced diet, those generally consumed in my household are still high in natural sugar.

The big culprits from this sugar analysis are cereals. In the previous section I noted that cereal can still be considered an adequate source of carbohydrates, but further investigation shows that an overwhelming amount of its total carbohydrates comes from added sugar.

To get a better understanding, below is a plot showing the relationship between food items' carbohydrates content and their added sugar content.

The cereals from this data reside in the middle of the plot, and it can be seen that added sugar makes up between 1/3 and 1/2 of total carbohydrates for a single serving. Some huge red flags are the Coca-Cola and Aunt Jemima Syrup, which both get 100% of their carbohydrates from added sugars.
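
The fractions quoted here are simply added sugars divided by total carbohydrates for one serving. A minimal sketch with invented values. The item names and the column `added_sugar_share` are hypothetical.

```python
import pandas as pd

# Hypothetical single-serving values for illustration only.
servings = pd.DataFrame({
    "food_item": ["Cereal-like item", "Soda-like item", "Bread-like item"],
    "total_carbohydrates_g": [32, 40, 20],
    "added_sugars_g": [12.0, 40.0, 2.0],
})

# Share of each serving's carbohydrates that comes from added sugar (0.0 to 1.0).
servings["added_sugar_share"] = (
    servings["added_sugars_g"] / servings["total_carbohydrates_g"]
)

print(servings.sort_values("added_sugar_share", ascending=False))
```

A share of 1.0 corresponds to the "100% of carbohydrates from added sugars" case described above.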

The American Heart Association's recommended added sugar intake is no more than 36 grams a day for men and 25 grams for women, so it is quite alarming that some of these foods contain half of that amount, or exceed it, in just a single serving.[5]
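
To put a single serving in context, its added sugar can be expressed as a share of the daily limits quoted above. A minimal sketch. The function name and the 18 g serving are hypothetical.

```python
# Daily added sugar limits in grams, as quoted above from the American Heart Association.
AHA_DAILY_ADDED_SUGAR_LIMIT_G = {"men": 36, "women": 25}

def share_of_daily_limit(added_sugars_g, group):
    """Return one serving's added sugar as a fraction of the daily limit for a group."""
    return added_sugars_g / AHA_DAILY_ADDED_SUGAR_LIMIT_G[group]

# Hypothetical serving containing 18 g of added sugar.
for group in AHA_DAILY_ADDED_SUGAR_LIMIT_G:
    print(f"{group}: {share_of_daily_limit(18, group):.0%} of the daily limit")
```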

Protein

Now for the analysis of protein content. A high-protein diet is generally recommended as part of good nutrition, so I will omit a caution zone for these items.

However, it is important to note that excess protein can actually be detrimental, since excess protein is stored as fat.[6]

Out[35]:
Text(0.5, 1.0, 'Top 20 Items by Protein Per 100g')

There isn't a whole lot of surprise in the protein results, but I was surprised to see the El Pollo Loco Chicken Breast have the highest protein content. I expected either the Foster Farms Chicken Breast or Foster Farms Ground Turkey to contain the most protein.

One other surprise to me is how little protein the Ancho Citrus Shrimp Burrito has, especially in comparison to the amount of fat and carbohydrates it contains.

The rest of the high protein items are what I expected, being lean meat, fish, or legumes.

Conclusion

The first thing I conclude from this analysis is that I should definitely reconsider ordering the Ancho Citrus Shrimp Burrito, since it does not seem nutritionally worth it to me. Additionally, I want to avoid cereal as much as possible due to the high content of added sugar in a single serving alone.

Overall, anything in excess can be detrimental, and moderation is important in sustaining good nutrition. No single caution-zone item will destroy a diet, just as no single non-caution item will fix it; the critical part is to maintain balance.

Resources

[1] A link to an overview provided by USDA for more resources related to macronutrients:

https://www.nal.usda.gov/fnic/macronutrients

[2] A link to the Dietary Reference Intake for macronutrients (a consensus study publication):

https://www.nap.edu/catalog/10490/dietary-reference-intakes-for-energy-carbohydrate-fiber-fat-fatty-acids-cholesterol-protein-and-amino-acids

[3] A link to current USDA dietary guidelines, which focus more on the substance of foods consumed (vegetables, fruits, etc.) as opposed to simply the macronutrients:

https://health.gov/our-work/food-nutrition/2015-2020-dietary-guidelines

[4] A link for a macronutrient calculator:

https://www.calculator.net/macro-calculator.html

[5] A link to an article on sugar intake:

https://www.heart.org/en/healthy-living/healthy-eating/eat-smart/sugar/added-sugars

[6] A link to an article on protein intake:

https://www.healthline.com/health/too-much-protein