A Dataset for Evaluating Large Language Models on Nutrition Estimation from Meal Descriptions
NutriBench consists of 11,857 meal descriptions generated from real-world global dietary intake data. NutriBench is human-verified and annotated with macro-nutrient labels, including carbohydrates, proteins, fats, and calories.
All data can be accessed in the data directory.
We divide the data into four subsets based on the source (What We Eat in America (WWEIA) or FAO/WHO GIFT) and the measurement units used in the meal descriptions (natural, e.g., 'a cup', or metric, e.g., '50g').
The subsets are named in the following format:
{data_source}_{measurement_units}.csv
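# pip install pandas
import pandas as pd
# Choose a subset by data source and measurement units
source = 'wweia'
units = 'natural'
# Files follow the {data_source}_{measurement_units}.csv naming format
file_path = f'data/{source}_{units}.csv'
df = pd.read_csv(file_path)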
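To see which subsets are present without hard-coding their names, the naming format above can be parsed back out of the file names. The following is a minimal sketch, not part of the released code: the helper name `list_subsets` is hypothetical, and it assumes that the units token contains no underscore, so splitting on the last underscore recovers both parts.

```python
from pathlib import Path


def list_subsets(data_dir="data"):
    """Map (data_source, measurement_units) to the CSV path of each subset.

    Hypothetical helper: assumes files are named
    {data_source}_{measurement_units}.csv, as described above.
    """
    subsets = {}
    for path in sorted(Path(data_dir).glob("*_*.csv")):
        # Split on the last underscore so the final token is the units part.
        source, units = path.stem.rsplit("_", 1)
        subsets[(source, units)] = path
    return subsets


if __name__ == "__main__":
    # Print every subset found in the data directory.
    for (source, units), path in list_subsets().items():
        print(source, units, path)
```

Each returned path can be passed to `pd.read_csv` exactly as in the snippet above.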
For any questions, please contact the authors at {dongx1997,mdhaliwal}@ucsb.edu