Applicability of Machine Learning Techniques in Food Intake Assessment: A Systematic Review
To cite this article: Larissa Oliveira Chaves, Ana Luiza Gomes Domingos, Daniel Louzada
Fernandes, Fabio Ribeiro Cerqueira, Rodrigo Siqueira-Batista & Josefina Bressan (2021):
Applicability of machine learning techniques in food intake assessment: A systematic review,
Critical Reviews in Food Science and Nutrition, DOI: 10.1080/10408398.2021.1956425
REVIEW
ABSTRACT
The evaluation of food intake is important in scientific research and clinical practice to understand the relationship between diet and health conditions of an individual or a population. Large volumes of data are generated daily in the health sector. In this sense, Artificial Intelligence (AI) tools have been increasingly used, for example, the application of Machine Learning (ML) algorithms to extract useful information, find patterns, and predict diseases. This systematic review aimed to identify studies that used ML algorithms to assess food intake in different populations. A literature search was conducted using five electronic databases, and 36 studies met all criteria and were included. According to the results, there has been a growing interest in the use of ML algorithms in the area of nutrition in recent years. Also, supervised learning algorithms were the most used, and the most widely used method of nutritional assessment was the food frequency questionnaire. We observed a trend in using the data analysis programs, such as R and WEKA. The use of ML in nutrition is recent and challenging. Therefore, it is encouraged that more studies are carried out relating these themes for the development of food reeducation programs and public policies.
KEYWORDS
Food intake; diet; artificial intelligence; machine learning; supervised and unsupervised algorithms; computational tools
accurately interpret information derived from the algorithms in a clinical setting and epidemiological studies. In addition, there are still studies to be carried out involving the application of ML algorithms in the area of nutrition, especially in food intake. Therefore, this review can serve as a guide for health professionals interested in food intake and data science, and to enlighten as well as to assist other researchers in the development of new studies. This systematic review aimed to identify and analyze original articles that applied ML algorithms of supervised and unsupervised approaches to assess food consumption in different populations.
Methods
Protocol and registration
This review was conducted in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) (Liberati et al. 2009) and was registered in the PROSPERO database (www.crd.york.ac.uk/prospero/), registration number CRD42020198633.
Literature research
Two authors (L.O.C and A.L.G.D) independently searched for original articles that used ML algorithms to evaluate food intake using the following electronic databases: MEDLINE (PubMed, www.pubmed.com), Lilacs (www.lilacs.bvsalud.org), Science Direct (www.sciencedirect.com), SciELO (www.scielo.org) and Google Scholar (https://scholar.google.com.br/). The following descriptors were used as a strategy for research in titles and abstracts: (“machine learning” OR “deep learning” OR “data mining” OR “unsupervised learning”) AND (“food intake” OR “diet” OR “food pattern” OR “dietary pattern” OR “food frequency consumption” OR “food questionnaire”) NOT review.
The research strategy was not restricted by publication year and language. The research was conducted between July 1st and 6th, 2020. A reverse search was conducted to identify relevant articles cited in the selected studies.
comparison of the compiled data to ensure their integrity and reliability was conducted by the authors. Divergent decisions were resolved by consensus or by consultation with a third author (D.L.F). For each included study, the following information was extracted: authors, year of publication, the country where the research was developed, the objective of the study, characteristics of the participants, method of food intake evaluation, ML approach and algorithms, and computational tools.
Data analysis
All studies selected in this article are summarized in Table 1 according to their main characteristics. The studies were organized chronologically by year of publication, starting with the first published study. The year of publication, location of the study, methods for assessing food consumption, and ML approaches and algorithms along with the computational tools used were considered the main characteristics of this systematic review.
The performance of a meta-analysis was not justified due to the heterogeneity among the studies included. Therefore, according to the Cochrane manual, the authors performed a systematic review (Higgins and Green 2011).
Results
Study selection
A total of 252 studies were identified through the searches in the databases. After the removal of 49 duplicate studies, 203 unique records remained, of which 133 studies were excluded based on their titles and abstracts because they were considered irrelevant: 72 did not use ML algorithms to evaluate food consumption, 45 were animal studies, 11 were not original articles, 3 did not use ML algorithms and 2 were in vitro studies. The remaining 70 studies were reviewed and evaluated in full for eligibility, and 36 met all the criteria adopted for this systematic review and were thus included (Figure 1).
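As a quick consistency check, the counts reported for study selection can be recomputed. The sketch below uses only the numbers stated in the text; the variable names are illustrative, and the number of full-text exclusions is derived here rather than quoted from the review.

```python
# Counts as reported in the study selection paragraph.
identified = 252
duplicates = 49
screening_exclusions = {
    "did not use ML algorithms to evaluate food consumption": 72,
    "animal studies": 45,
    "not original articles": 11,
    "did not use ML algorithms": 3,
    "in vitro studies": 2,
}
included = 36

# Each stage of the selection flow follows from the previous one.
unique_records = identified - duplicates
excluded_on_screening = sum(screening_exclusions.values())
full_text_assessed = unique_records - excluded_on_screening

# Derived value (not stated explicitly in the text).
excluded_at_full_text = full_text_assessed - included

print(unique_records, excluded_on_screening, full_text_assessed, excluded_at_full_text)
```

The derived values (203 unique records, 133 screening exclusions, 70 full-text assessments) agree with the reported ones, which implies 34 exclusions at the full-text stage.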
Table 1. Continued.
Iwendi et al. (2020), China | Recommending diets using deep learning in medical data, to detect which food should be administered | 30 men and women of hospitals | - | Food intake data available from the hospital database | Supervised - MLP, RNN, GRU, LSTM, LR, Naïve Bayes, RF | Google Colaboratory
Abbreviations: AI: Artificial Intelligence; ALM: Appendicular Lean Mass; ANN: Artificial Neural Networks; BC: Bladder Cancer; BMD: Bone Mineral Density; CARTs: Classification and Regression Trees; CNN: Convolutional
Neural Network; CRC: Predictors of Colorectal Cancer; CVD: Cardiovascular Disease; DHQ: Diet History Questionnaire; DQI: Diet Quality Indexes; EPIC: European Prospective Investigation into Cancer and Nutrition; FAH:
Food at Home; FAFH: Food Away From Home; FFQ: Food Frequency Questionnaire; FGFQ: Food Groups Frequency Questionnaire; GA: Genetic Algorithms; GI: Glycemic Index; GPLVM: Gaussian Process Latent Variable
Model; GR: Generalized Regression; GRU: Gated Recurrent Units; GSEM: Generalized Structural Equation Model; GTM: Generative Topological Mapping; HEI: Healthy Eating Index; ICP-OES: Inductively Coupled Plasma
Optical Emission Spectrometry; k-NN: k-Nearest Neighbors; KPCA: Kernel Principal Component Analysis; LR: Logistic Regression; LSTM: Long Short-Term Memory; MARS: Multivariate Adaptive Regression Splines; ML:
Machine Learning; MLP: Multilayer Perceptron; NDNS: National Diet and Nutrition Survey; NHANES: National Health and Nutrition Examination Survey; NMR: Nuclear Magnetic Resonance; OT: On Track; PCA: Principal
Component Analysis; PPGR: Postprandial Glycemic Response; PSID: Panel Study of Income Dynamics; RF: Random Forest; RNN: Recurrent Neural Network; R24H: 24-h recall; SDHQ: Short Dietary Habits Questionnaire;
SEBBQ: Short Eating Habits Behaviors & Beliefs Questionnaire; SNAP: Supplemental Nutrition Assistance Program; SOM: Self-Organizing Map; STDD: Short-Term Depression Detector; SVM: Support Vector Machine; T2D:
Type 2 Diabetes; TMLE: Targeted Maximum Likelihood Estimation; WW: Weight Watchers.
CRITICAL REVIEWS IN FOOD SCIENCE AND NUTRITION 7
studies were conducted. For a better visual analysis of these results, the web tool ColorBrewer 2.0 was used to choose a color palette that could also be differentiated by colorblind individuals.
Population characteristics
As shown in Table 2, five studies (13.9%) did not describe the population characteristics for gender and age. Four studies (11.1%) included only women, and 25 studies (69.4%) included individuals of both sexes. Regarding the age group, 20 studies (55.6%) were conducted with adult and elderly population, three studies (8.3%) with adults, two studies (5.6%) with children and adults, and only one study (2.8%) with children (Lazarou et al. 2012). It was observed that the studies included mostly individuals with overweight or obesity (11.1%) and with a diagnosis of cancer (11.1%), followed by studies with postmenopausal women (8.3%) and
8 L. OLIVEIRA CHAVES ET AL.
Table 2. Characteristics of the population and publications in relation to the year and countries in which the studies were conducted.
Characteristics References
Publications in relation to the year
First: 2008 Hearty and Gibney 2008
2008–2016 Hearty and Gibney 2008; De Cos Juez et al. 2009; Ordóñez et al. 2009; Zenitani, Nishiuchi, and Kiuchi 2010; De Cos
Juez et al. 2011; Lazarou et al. 2012; Silvera et al. 2014; Zeevi et al. 2015; Giabbanelli and Adams 2016
2017–2020 Dipnall et al. 2017; Kanerva et al. 2018; Mezgec and Seljak 2017; Mutter et al. 2017; Silva et al. 2018; Easton, Sicilia,
and Stephens 2019; Forman, Goldstein, Zhang, et al. 2019; Guan et al. 2018; Jia et al. 2019; Rosso and Giabbanelli
2018; Shiao et al. 2018a; Shiao et al. 2018b; Shiokawa, Date, and Kikuchi 2018; Panaretos et al. 2018; Faruqui et al.
2019; Forman, Goldstein, Crochiere, et al. 2019; Hamad et al. 2019; Shao et al. 2019; Yu et al. 2020; Burgermaster
et al. 2020; He et al. 2020; Kwon et al. 2020; Jiang et al. 2020; Xu et al. 2020; Bodnar et al. 2020; Narziev et al.
2020; Iwendi et al. 2020
Publications in relation to the countries
Australia Guan et al. 2018
China Mutter et al. 2017; Shao et al. 2019; Iwendi et al. 2020
Denmark Jiang et al. 2020
Finland Kanerva et al. 2018
Greece Panaretos et al. 2018
Ireland Hearty and Gibney 2008
Israel Zeevi et al. 2015
Japan Zenitani, Nishiuchi, and Kiuchi 2010; Shiokawa, Date, and Kikuchi 2018
Mexico Easton, Sicilia, and Stephens 2019
Republic of Cyprus Lazarou et al. 2012
Slovenia Mezgec and Seljak 2017
South Korea Kwon et al. 2020; Narziev et al. 2020
Spain De Cos Juez et al. 2009; Ordóñez et al. 2009; De Cos Juez et al. 2011
United Kingdom Giabbanelli and Adams 2016; Rosso and Giabbanelli 2018; He et al. 2020
United States Silvera et al. 2014; Dipnall et al. 2017; Forman, Goldstein, Zhang, et al. 2019; Shiao et al. 2018a; Shiao et al. 2018b;
Faruqui et al. 2019; Forman, Goldstein, Crochiere, et al. 2019; Hamad et al. 2019; Burgermaster et al. 2020; Xu et al.
2020; Bodnar et al. 2020
United States and Canada Silva et al. 2018
Population characteristics
Adults Faruqui et al. 2019; Xu et al. 2020; Narziev et al. 2020
Adult and elderly Hearty and Gibney 2008; De Cos Juez et al. 2009; Ordóñez et al. 2009; De Cos Juez et al. 2011; Silvera et al. 2014;
Zeevi et al. 2015; Dipnall et al. 2017; Kanerva et al. 2018; Mutter et al. 2017; Easton, Sicilia, and Stephens 2019;
Forman, Goldstein, Zhang, et al. 2019; Guan et al. 2018; Shiao et al. 2018a; Shiao et al. 2018b; Panaretos et al.
2018; Forman, Goldstein, Crochiere, et al. 2019; Shao et al. 2019; He et al. 2020; Kwon et al. 2020; Jiang et al. 2020
Anemia Mutter et al. 2017
Both sexes Hearty and Gibney 2008; Zenitani, Nishiuchi, and Kiuchi 2010; Lazarou et al. 2012; Silvera et al. 2014; Zeevi et al. 2015;
Giabbanelli and Adams 2016; Dipnall et al. 2017; Kanerva et al. 2018; Mutter et al. 2017; Easton, Sicilia, and
Stephens 2019; Forman, Goldstein, Zhang, et al. 2019; Guan et al. 2018; Rosso and Giabbanelli 2018; Shiao et al.
2018a; Shiao et al. 2018b; Panaretos et al. 2018; Faruqui et al. 2019; Forman, Goldstein, Crochiere, et al. 2019;
Shao et al. 2019; Yu et al. 2020; He et al. 2020; Kwon et al. 2020; Jiang et al. 2020; Xu et al. 2020; Narziev
et al. 2020
Cancer Silvera et al. 2014; Shiao et al. 2018a; Shiao et al. 2018b; Yu et al. 2020
Children Lazarou et al. 2012
Children and adults Giabbanelli and Adams 2016; Rosso and Giabbanelli 2018
Depression Xu et al. 2020; Bodnar et al. 2020
Did not describe Jia et al. 2019; Shiokawa, Date, and Kikuchi 2018; Hamad et al. 2019; Iwendi et al. 2020; Burgermaster et al. 2020
Overweight and / or obesity Kanerva et al. 2018; Forman, Goldstein, Zhang, et al. 2019; Guan et al. 2018; Forman, Goldstein, Crochiere, et al. 2019
Postmenopausal women De Cos Juez et al. 2009; Ordóñez et al. 2009; De Cos Juez et al. 2011
Type 2 diabetes mellitus Burgermaster et al. 2020
Two or more noncommunicable diseases Easton, Sicilia, and Stephens 2019; Faruqui et al. 2019
Women De Cos Juez et al. 2009; Ordóñez et al. 2009; De Cos Juez et al. 2011; Bodnar et al. 2020
individuals diagnosed with depression (5.6%), anemia (2.8%) and type 2 diabetes mellitus (2.8%). Only two studies (5.6%) investigated a population with two or more noncommunicable diseases (NCD). The other studies did not inform the population characteristics. Note that most of the studies included in this review were with a population composed of both sexes, adults and elderly, with the presence of overweight, obesity, or cancer.
Methods of food intake evaluation
A total of 13 studies (36.1%) used the FFQ as a method to evaluate food consumption, followed by ten studies (27.8%) that used smartphone/software applications. Five studies (13.9%) used other types of questionnaires, four studies (11.1%) used the food registry, two studies (5.6%) used the R24H applied by a trained interviewer, and only one study (2.8%) used hedonic scales. Four of the selected studies (11.1%) used food intake data recorded in database systems and/or studies already published. One study (2.8%), aimed at detecting and recognizing food and beverage images, used the Google site as a search tool (Table 3).
ML algorithms and computational tools
The studies included in this review used different ML algorithms, and the most used ones were in the supervised learning category. Of the 36 studies included, 32 studies (88.9%) used supervised approach algorithms, of which 23 studies (63.9%) were of the classification type and 14 studies (38.9%) of the regression type. The most used classification algorithms were based on Decision Trees with 13 studies (36.1%), and the Artificial Neural Networks with 6 studies
Figure 3. Heatmap relating the number of articles published in the countries where the studies were conducted.
(16.7%). We found only four studies (11.1%) in which unsupervised approaches were applied. The clustering algorithms found were Hierarchical clustering, k-means, and Self-Organizing Maps, while the Apriori algorithm was the only association rule procedure encountered. Only one study (2.8%) applied both ML approaches, supervised and unsupervised (Kwon et al. 2020) (Table 3).
Nine studies (25%) did not inform the computational tools used. However, ten studies (27.8%) used the Statistical Program R, five studies (13.9%) used the WEKA program, three studies (8.3%) used SAS, two studies (5.6%) used STATA, two studies (5.6%) used Google Colaboratory, one study (2.8%) used CART, one study (2.8%) MATLAB, one study (2.8%) SPSS and one study (2.8%) Google Custom Search API. Only one study (2.8%) used two computational tools, R and MATLAB (Table 3).
Discussion
Overview of growth in studies and publications involving ML and nutrition
To our knowledge, this was the first systematic review that analyzed the application of different ML algorithms, of both supervised and unsupervised approaches, to evaluate food intake in healthy and unhealthy individuals. Currently, there is considerable scientific interest in the use of those algorithms due to the high predictive performance in large volumes of data, such as in agriculture, transport, finance, criminal justice, and health (Cutillo et al. 2020). In the health area, ML algorithms have great potential to improve the results of patients from clinical research to hospital care, helping in the process of diagnosis and prediction of diseases (Cutillo et al. 2020; Fernandes and Filho 2019).
This growth in studies and publications involving the application of ML algorithms in health is confirmed in our review by the significant increase in studies that addressed ML and food intake from 2016 onwards, with the start of the peak in 2017. Of the 36 studies included, 27 were published between 2017 and mid-2020. Moreover, we observed an increase since 2011 in studies related to AI in the health area, with peaks also from the year 2017 onwards (data not shown). This observation emphasizes that the use of computational methods is not restricted only to nutrition. Therefore, there is a trend in the use of AI in the health area in general.
Even though there has been a recent increase in the number of publications addressing the various applications of ML algorithms in nutrition, the use of artificial intelligence approaches in other areas has been under discussion for
Table 3. Characteristics of included studies in relation to the method of assessing food intake and the type of algorithm and computational tools.
Assessment methods of food intake
Food consumption data recorded in database systems and / Silvera et al. 2014; Giabbanelli and Adams 2016; Shiokawa, Date, and Kikuchi 2018; Iwendi et
or studies already published al. 2020
Food Frequency Questionnaires De Cos Juez et al. 2009; Ordóñez et al. 2009; De Cos Juez et al. 2011; Lazarou et al. 2012;
Silvera et al. 2014; Kanerva et al. 2018; Easton, Sicilia, and Stephens 2019; Shiao et al. 2018a;
Shiao et al. 2018b; Panaretos et al. 2018; Yu et al. 2020; Jiang et al. 2020; Bodnar et al. 2020
Food registry Hearty and Gibney 2008; Mutter et al. 2017; Rosso and Giabbanelli 2018; He et al. 2020
Google as a search tool Mezgec and Seljak 2017
Hedonic scales Xu et al. 2020
24-hour recall Dipnall et al. 2017; Kwon et al. 2020
Other types of questionnaires Lazarou et al. 2012; Zeevi et al. 2015; Guan et al. 2018; Hamad et al. 2019; Xu et al. 2020
Smartphone / software applications Zenitani, Nishiuchi, and Kiuchi 2010; Zeevi et al. 2015; Silva et al. 2018; Forman, Goldstein,
Zhang, et al. 2019; Jia et al. 2019; Faruqui et al. 2019; Forman, Goldstein, Crochiere, et al.
2019; Shao et al. 2019; Burgermaster et al. 2020; Narziev et al. 2020
Supervised approach algorithms
Type: classification Hearty and Gibney 2008; Ordóñez et al. 2009; De Cos Juez et al. 2011; Lazarou et al. 2012;
Silvera et al. 2014; Giabbanelli and Adams 2016; Kanerva et al. 2018; Mezgec and Seljak 2017;
Silva et al. 2018; Easton, Sicilia, and Stephens 2019; Forman, Goldstein, Zhang, et al. 2019; Jia
et al. 2019; Rosso and Giabbanelli 2018; Shiokawa, Date, and Kikuchi 2018; Panaretos et al.
2018; Faruqui et al. 2019; Forman, Goldstein, Crochiere, et al. 2019; Shao et al. 2019; Yu et al.
2020; Burgermaster et al. 2020; Bodnar et al. 2020; Narziev et al. 2020; Iwendi et al. 2020
Type: regression De Cos Juez et al. 2009; Zenitani, Nishiuchi, and Kiuchi 2010; Zeevi et al. 2015; Dipnall et al.
2017; Kanerva et al. 2018; Forman, Goldstein, Zhang, et al. 2019; Shiao et al. 2018a; Shiao et
al. 2018b; Forman, Goldstein, Crochiere, et al. 2019; Hamad et al. 2019; He et al. 2020; Kwon
et al. 2020; Xu et al. 2020; Iwendi et al. 2020
Artificial Neural Networks Hearty and Gibney 2008; De Cos Juez et al. 2011; Mezgec and Seljak 2017; Silva et al. 2018; Jia
et al. 2019; Faruqui et al. 2019
Decision Trees Hearty and Gibney 2008; Ordóñez et al. 2009; Lazarou et al. 2012; Silvera et al. 2014; Giabbanelli
and Adams 2016; Kanerva et al. 2018; Rosso and Giabbanelli 2018; Shiokawa, Date, and
Kikuchi 2018; Panaretos et al. 2018; Shao et al. 2019; Yu et al. 2020; Burgermaster et al. 2020;
Narziev et al. 2020
Unsupervised approach algorithms
Apriori Guan et al. 2018; Jiang et al. 2020
Hierarchical Clustering Guan et al. 2018
K-means Kwon et al. 2020
Self-Organizing Map Mutter et al. 2017; Jiang et al. 2020
Computational tools
CART Silvera et al. 2014
Did not inform De Cos Juez et al. 2009; Ordóñez et al. 2009; Zeevi et al. 2015; Easton, Sicilia, and Stephens
2019; Forman, Goldstein, Zhang, et al. 2019; Jia et al. 2019; Faruqui et al. 2019; Jiang et al.
2020; Bodnar et al. 2020
Google Colaboratory Iwendi et al. 2020; Silva et al. 2018
Google Custom Search API Mezgec and Seljak 2017
MATLAB He et al. 2020
R De Cos Juez et al. 2011; Kanerva et al. 2018; Mutter et al. 2017; Guan et al. 2018; Panaretos et
al. 2018; Forman, Goldstein, Crochiere, et al. 2019; Hamad et al. 2019; Yu et al. 2020;
Burgermaster et al. 2020; Kwon et al. 2020
SAS Zenitani, Nishiuchi, and Kiuchi 2010; Shiao et al. 2018a; Shiao et al. 2018b
SPSS Hearty and Gibney 2008
STATA Dipnall et al. 2017; Xu et al. 2020
Statistical Program R and MATLAB Shiokawa, Date, and Kikuchi 2018
WEKA Lazarou et al. 2012; Giabbanelli and Adams 2016; Rosso and Giabbanelli 2018; Shao et al. 2019;
Narziev et al. 2020
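The shares of computational tools reported in the Results can be recomputed from the number of references in each "Computational tools" row of Table 3. The sketch below is only an illustrative check: the counts were tallied by hand from the table, the row "Statistical Program R and MATLAB" is shortened to "R and MATLAB", and the helper name share is hypothetical.

```python
# Number of references listed in each "Computational tools" row of Table 3
# (hand-tallied from the table; labels shortened for readability).
tool_counts = {
    "CART": 1,
    "Did not inform": 9,
    "Google Colaboratory": 2,
    "Google Custom Search API": 1,
    "MATLAB": 1,
    "R": 10,
    "SAS": 3,
    "SPSS": 1,
    "STATA": 2,
    "R and MATLAB": 1,
    "WEKA": 5,
}

total_studies = sum(tool_counts.values())


def share(count, total):
    """Percentage of `total`, rounded to one decimal place as in the text."""
    return round(100 * count / total, 1)


tool_shares = {tool: share(n, total_studies) for tool, n in tool_counts.items()}

# List the tools from most to least used.
for tool, pct in sorted(tool_shares.items(), key=lambda item: -item[1]):
    print(f"{tool}: {tool_counts[tool]} studies ({pct}%)")
```

The rows add up to the 36 included studies, and the rounded shares match the percentages given in the text for each tool.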
many years (Smallwood and Sondik 1973). One of the possible explanations for the increase in the use of ML algorithms in nutrition and health in general in recent years is precisely the search for more accurate procedures to meet the needs of professionals in their daily decision-making activities, treatment options, and reduction of health costs (Reis et al. 2017). In the long term, it is believed that ML approaches will benefit professionals in diverse fields, by offering objective suggestions and ways to improve the efficiency, reliability, and accuracy of processes.
Influence of regionalization on food consumption
According to the results achieved, a small diversity of countries investigating food intake using ML algorithms was noted. Most studies were conducted in North America, followed by Europe, Asia, and Oceania, and no studies were developed in Central America, South America, and Africa. It is known that food intake is highly influenced by the region in which we live, so it is important that countries can conduct their research to understand the eating behavior of individuals in the same region (Latha and Thegaleesan 2019).
The food intake pattern is directly influenced by social, cultural, and economic factors (Savage, Bambrick, and Gallegos 2020). In addition, the characteristics of the population in terms of customs, level of education, knowledge about healthy eating, workplace, family and friends circle also have a major impact on food choice and habits (Latha and Thegaleesan 2019). These differences can be observed between different countries or regions within the same country (Vasileska and Rechkoska 2012).
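The regional ranking mentioned in this section can be reproduced from the per-country rows of Table 2. The sketch below is a rough check under stated assumptions: the per-country counts were tallied by hand from Table 2, and the country-to-region grouping (for example, Mexico under North America and the Republic of Cyprus under Europe) is an illustrative choice, not one taken from the review.

```python
from collections import Counter

# Number of references listed in each country row of Table 2 (hand-tallied).
studies_per_country = {
    "Australia": 1, "China": 3, "Denmark": 1, "Finland": 1, "Greece": 1,
    "Ireland": 1, "Israel": 1, "Japan": 2, "Mexico": 1, "Republic of Cyprus": 1,
    "Slovenia": 1, "South Korea": 2, "Spain": 3, "United Kingdom": 3,
    "United States": 11, "United States and Canada": 1,
}

# Assumption: illustrative grouping of countries into regions.
region_of = {
    "Australia": "Oceania",
    "China": "Asia", "Israel": "Asia", "Japan": "Asia", "South Korea": "Asia",
    "Denmark": "Europe", "Finland": "Europe", "Greece": "Europe",
    "Ireland": "Europe", "Republic of Cyprus": "Europe", "Slovenia": "Europe",
    "Spain": "Europe", "United Kingdom": "Europe",
    "Mexico": "North America", "United States": "North America",
    "United States and Canada": "North America",
}

# Sum the country counts within each region.
studies_per_region = Counter()
for country, n in studies_per_country.items():
    studies_per_region[region_of[country]] += n

# Regions ordered from most to fewest studies.
ranking = [region for region, _ in studies_per_region.most_common()]
print(studies_per_region)
print(ranking)
```

Under this grouping the order is North America, Europe, Asia, Oceania, as stated in the text. The country rows of Table 2, as reproduced here, account for 34 of the 36 included studies.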
In this review, an interesting result found is that the United States was the country that developed the largest number of studies involving ML and food consumption. It is believed that this great interest in research on nutrition is related to the low quality of the diet consumed and also to the reduction of physical activity practices of its inhabitants, which has been worsening since the 1980s (Popkin, Adair, and Ng 2012). Economic development and increasing urbanization in developed countries, such as the United States, brought benefits and negative consequences for lifestyle and dietary patterns, which include quantitative and qualitative changes in the diet. This more industrialized dietary pattern includes an increase in the consumption of high-calorie foods, refined carbohydrates, and saturated fats of animal origin, in addition to a reduction in the intake of complex carbohydrates, fibers, vitamins, and minerals (Vasileska and Rechkoska 2012).
It is essential to point out that cultural and behavioral factors are also susceptible to change and that the circle of family and friends is extremely important in the correct choice of food. In addition, an increasing number of individuals have been eating outside their homes, which further increases the consumption of processed and ultra-processed foods since access to healthy options is often limited in many places, including at work and in school environments (Latha and Thegaleesan 2019).
Methods of food intake evaluation
The adequate and reliable evaluation of food intake in scientific research is important to understand the association between diet and the health conditions of an individual or a population. It has also been useful in predicting noncommunicable diseases (NCD) (Vucic et al. 2009). However, an accurate assessment of food intake remains a major challenge, as it is subject to bias, and no single method is considered the gold standard. The most commonly used methods to assess food intake are food histories, food records, R24H, and FFQ (Vucic et al. 2009; Shim, Oh, and Kim 2014).
In this review, most of the selected studies used the FFQ to evaluate food consumption. This method is considered one of the simplest, cheapest, fastest, and easiest to administer and process, and allows for long-term food evaluation (Chmurzynska et al. 2018). This method contains a defined list of about 100 to 150 food items and options of the usual frequency of consumption over the period consulted. In some cases, portion sizes are also investigated, but little information is collected on the additional characteristics of the food consumed (Shim, Oh, and Kim 2014). Despite this methodological limitation, FFQ has been widely used in epidemiological studies since the 1990s. It is important to note that the FFQ should be developed specifically for each study and research group because diet can be influenced by ethnicity, culture, economic status, among others (Shim, Oh, and Kim 2014; Rupasinghe, Perera, and Wickramaratne 2020).
The R24H and food registration have some important limitations that can influence their choices, such as collecting information for a specific period, usually for short-term intake. Thus, to measure the average intake, several R24H or food records are necessary. Repeated measurement requires resources and time and can influence respondents’ food intake, improving the quality of the diet, changing or omitting information intentionally (Rupasinghe, Perera, and Wickramaratne 2020). The R24H is conducted by interview and usually requires 20 to 30 minutes, and the information depends on the interviewees’ memory and the interviewer’s skills. On the other hand, food recording is a method that takes more time to obtain accurate data and respondents must undergo prior training. Therefore, a high level of motivation becomes necessary. Also, each questionnaire requires a thorough review to ensure that all reported data are correct (Shim, Oh, and Kim 2014; Rupasinghe, Perera, and Wickramaratne 2020).
However, both methods to evaluate food intake also have common strengths, such as being easy to apply, covering a wide variety of foods, being made up of open questions that allow the collection of a large amount of information on consumption, and being usable to estimate the average consumption of a given population. Moreover, food registration does not depend on the individual’s memory since the information is self-reported when the food is consumed (Shim, Oh, and Kim 2014; Chmurzynska et al. 2018).
As seen above, the most used methods nowadays have many limitations, including memory dependency, understanding of food portions, literacy, and training of interviewers. Motivated by the development of reliable evaluation methods, technology emerges as a viable solution to current methodological deficiencies with the potential to improve adherence, communication, and data quality (Sharp and Allman-Farinelli 2014). As a result, a large number of studies that used mobile applications were found in our review. In recent years, mobile devices have been used to evaluate individual and group diets in real-time, incorporating their daily food routines. The easy access and interactive features of these applications, such as setting goals and dietary lapses, allow users to monitor the diet and trigger healthier behaviors. Mobile applications have demonstrated validity and reliability, similar to conventional methods. In addition, the use of the application feeds continuous progress data to be used in future studies (Chmurzynska et al. 2018; Ahn et al. 2019).
ML algorithms and computational tools
ML is a subarea of AI whose objective is to develop algorithms that give computers or computer systems the ability to learn specific knowledge, behavior, or pattern automatically or semi-automatically from examples or informed observations (Michalski, Carbonell, and Mitchell 2013). ML approaches can be of the following types: supervised, unsupervised, semi-supervised and reinforcement learning. Here, we will discuss the first two main approaches, which were the ones found in the studies selected in this review.
Supervised learning approach
In situations where supervised learning is applied, one has prior knowledge of the values of the output variable, i.e., the
classes or labels represented by categorical or continuous values of the input dataset used, which is composed of registers (instances) and variables (attributes). Therefore, the objective of supervised learning is to learn, employing algorithms for this type of task, a mapping function that best approximates the relationship between input data and observable output so that when new instances are available, the output can be predicted with considerable accuracy (Pedregosa et al. 2011).
This learning process works in the following way: first, the dataset is split into two parts: training and test data. A predictive model is then built based on an algorithm that uses the training set so that the resulting model learns patterns by associating the input data values with the output labels. After the training, the model will receive the test set split, which was left out of the previous step, and it will apply the knowledge learned from previous experiences (training data) to this test set so that the accuracy, sensitivity, specificity, and other important statistical measures are calculated to evaluate the predictive power of the model (Dey 2016). Thus, together with performance metrics, the model's ability to generalize, that is, to predict labels for instances not seen during the training, will be evaluated (Michalski, Carbonell, and Mitchell 2013).
Interestingly, it was observed that most of the studies included in this review, totaling 32 out of 36, used some supervised learning algorithm. Below, we will discuss these algorithms, which are of the classification and regression type, and the reasons why they were the most used in the studies included in this review.
Classification algorithms. The classification algorithms are used when the goal is to map the input variables to a specific categorical class. It is common to find in the literature works that apply some of the supervised algorithms based on Decision Trees (DT), Artificial Neural Networks (ANN), Naïve Bayes (NB), Support Vector Machine (SVM), Logistic Regression, and k-Nearest Neighbor (kNN) (Khan et al. 2010). In this review, we provide further details of the algorithms based on DT and ANN, as they are the most present approaches in the selected studies.
DT or derivatives from this approach are quite popular classification algorithms used to build predictive models (Rajput et al. 2011). This type of technique expresses the possible results of a series of choices related to attributes
It is believed that the vast demand for this type of algorithm in the review studies is due to its advantages, especially: fast construction of the predictive model; fast classification of new instances; no need for normalization or standardization in the preprocessing phase; simplicity in understanding and interpreting the rules generated even for non-specialist users, as the resulting tree provides a consolidated view of the classification logic (Khan et al. 2010; Rajput et al. 2011).
According to Yu et al. (2020), the application of the DT algorithm in their study strongly contributed to the high accuracy found in the proposed classification, indicating that the ML can adequately deal with missing data and measurements of complex investigation. The investigators concluded that the DT algorithm provided an effective approach to identify some food groups related to bladder cancer risk.
Another very powerful and frequently applied supervised ML approach is ANN. ANN algorithms are inspired by the operating structure of the biological neural system concerning the ability to learn from data and improve its performance according to what was learned through operations such as parallel calculations for data processing and knowledge representation (Tan, Steinbach, and Kumar 2006).
An ANN is built by a set of processing units, also known as neurons, linked by weighted connections or synaptic weights responsible for the propagation of attribute values between the neurons in the layers (Tan, Steinbach, and Kumar 2006; Michalski, Carbonell, and Mitchell 2013). A neuron is a component that calculates the weighted sum of the values received as input, applies an activation function, and passes the result forward to the next layer. The intermediate layers, if any, between the input and output layers, are known as hidden layers. The value propagation process continues until reaching the output layer with the predicted response (Tan, Steinbach, and Kumar 2006; Michalski, Carbonell, and Mitchell 2013).
The main advantage of implementing ANN-based algorithms is the high capacity to learn from large volumes of data, whether structured or not and in diverse applications (e.g., speech recognition, machine translation, image captioning generator, among many others). However, ANN has some disadvantages, such as its high computational cost and physical memory use. Moreover, their training is relatively slow, and the results learned are difficult for users to interpret (Khan et al. 2010).
and classes through rules. Each tree is represented through a Many of the classification algorithms, including ANN, are
structure with nodes and branches, and each non-leaf node known to be difficult to understand and explained in simple
in the tree is a decision rule. A DT usually uses the top- terms how the predictions were made. When built, these
down approach, i.e., from the root node to leaves (Rajput models are called black-box models. In the health area, this
et al. 2011). It is started with a single root (parent node), kind of model is even more challenging because the profes-
representing the most important attribute in the dataset sionals will be apprehensive in making decisions, especially
according to an impurity metric. Between the root and the those related to death risk, without a firm understanding of
(child nodes), which represent other attributes, the branches how the algorithm came to that predicted recommendation
or edges connect these nodes and represent the possible val- (Khan et al. 2010).
ues of the attribute analyzed by the predecessor or parent A study by Silva et al. (2018) trained an effective food
node. Finally, after traversing the tree, one reaches the leaf classification model in a food image dataset and found that
nodes representing the target, i.e., predicted classes neural networks achieved an overall performance of 87.2%
(Dey 2016). (with 90.0% sensitivity and specificity of 84.4%). When the
model was trained based on this food image dataset, it achieved a precision of 65.5% (with a sensitivity of 59.0% and a specificity of 72.0%). They concluded that the main contribution of neural networks is that they automatically learn features through convolutional layers, with high performance and accuracy.

In this context, we strongly believe that the frequent and broad use of DT in the studies addressed in this systematic review was due to the speed of training and, mainly, to the clear explanation DT provides for the results found by the model (Lundberg and Lee 2017).

Regression algorithms. Regression algorithms are used in situations where the aim is to map the input variables to an output with a continuous value, i.e., any numerical value between two limits (Kan et al. 2019). Note that regression algorithms, as well as classification algorithms, were widely used in the studies selected in this review.

The objective of regression is to define the parameter values of a mathematical equation that defines y (the output to be predicted) as a function of variables x (input variables) so that the error between the adjusted curve and all the data points is minimized. This equation, the final model, can then be used to predict the result for new instances. In general, a model fits the data well if the differences between the observed values and the predicted values are small and unbiased (Pedregosa et al. 2011).

Among the many forms of regression described in the literature (Multiple Linear Regression and Lasso Regression, among others), one must select the technique that best explains the data to be analyzed. The best way to verify this is by applying different regression models and comparing their performance in predicting new instances (Goldstein, Navar, and Carter 2017). For regression, we use as a measure of performance the calculated error between the model obtained (curve) and the points belonging to the training set. Thus, the smaller the prediction error, the better the final performance of the model (Goldstein, Navar, and Carter 2017).

The study by Pagamunici et al. (2014) aimed to develop a gluten-free granola of high nutritional value and evaluate it during storage using ML techniques such as multivariate analysis and simple linear regression. Over the storage period analyzed, a positive correlation was observed between appearance and general acceptance, and the product remained stable in relation to these parameters. The results of this study demonstrate the high contribution and effectiveness of the application of regression analysis. These analyses enabled the presentation of an innovative predictive report that gives greater prediction accuracy and a better-tuned model to identify significant predictors.

It is important to mention that some of the algorithms used in classification problems, such as DT, Random Forest, SVM, and ANN, also work as regression algorithms. However, those algorithms are modified to suit the desired output type, in this case a numeric value rather than a categorical label (Rodriguez-Galiano et al. 2015).

Unsupervised learning approach

Unlike supervised approaches, the algorithms of unsupervised learning are used to explore unlabeled data, i.e., data whose instances have no associated value or category. As a result, unsupervised learning algorithms do not aim to make predictions but, instead, to find potentially useful hidden structures and patterns that humans can interpret and that allow a better description and understanding of the data (Tan, Steinbach, and Kumar 2006).

In this approach, the task of the ML algorithms is not to find the right output from the input data but to explore the data and find clusters or make inferences according to the similarities, patterns, and differences found when evaluating the attributes of the instances, without any previous training (Tan, Steinbach, and Kumar 2006). The motivation for using this approach is its ability to provide initial insights that can then be used to test scientific hypotheses and to conduct research from a starting point for analysis.

Unsupervised learning tasks typically consist of finding underlying groups (clusters) in the data and/or revealing important association rules (Dey 2016). In our review, only four studies applied unsupervised ML approaches, of which three used clustering algorithms and one used association rules algorithms.

Clustering algorithms. The most common task in unsupervised learning is clustering. In this case, the unlabeled data are analyzed and organized in clusters by their similarities or dissimilarities (Tan, Steinbach, and Kumar 2006). How similar or dissimilar the instances are to each other is measured using a proximity calculation, such as the Euclidean distance (Pedregosa et al. 2011). The goal is to create a clustering (a set of clusters) where instances in the same cluster are very similar to each other (each cluster is cohesive), while instances in distinct clusters are highly dissimilar (i.e., clusters are well separated from each other) (Zheng et al. 2019). In a sense, clustering algorithms reveal hidden categories, i.e., each cluster can be thought of as a class of its instances (Ghorbani and Ghousi 2019).

The k-means algorithm is the most widely used clustering algorithm. To apply this procedure, it is necessary to give as input to the algorithm the number k of clusters sought. Initially, k centroids (center points) are randomly defined. Then, in each following iteration, every instance is associated with its closest centroid, and each centroid is redefined according to the grouped instances (typically, the new centroid location is the mean point of the cluster). The redefinition of centroids and the resulting association of instances continue over multiple iterations until the centroids no longer change (Jain 2010).

According to the study by Kwon et al. (2020), the application of the k-means algorithm was fundamental to extract important and hidden information, such as the relationship between total energy and protein intake, which was difficult to distinguish with conventional analyses. The k-means algorithm contributed to the proper formation of clusters and the comparison of risk factors between them. Cluster-specific risk factors were found to include high consumption
of fat and smoking in the men’s cluster and low consumption of carbohydrates, protein, fat, and alcohol in the women’s cluster.

Self-Organizing Map (SOM) is another well-known clustering technique that was found in the studies selected in this review. SOM is a particular type of unsupervised neural network, where neurons are arranged in a 2-dimensional grid. Throughout the iterations, the neurons gradually agglutinate around regions presenting a high density of data points. Therefore, regions with many neurons can be interpreted as clusters (Fernandes and Filho 2019).

The study by Mutter et al. (2017) used the SOM algorithm to highlight the inherent natural heterogeneity of nutritional profiles and how they are associated with incident anemia in a population setting, showing how nutritional and economic differences between northern and southern Jiangsu predict differences in incident anemia. The authors highlighted the excellent contribution of this algorithm to the complete separation between training and evaluation data, which is one of the strengths of the SOM approach, as its architecture avoids overfitting. Transparency is another strong point of the SOM approach, where the process of defining subgroups and investigating their profiles is guided by the user and open to constructive criticism from other observers.

Another clustering technique found in the studies included in this review was hierarchical clustering, where data are partitioned successively, producing a hierarchical representation of the group. Hierarchical methods require a matrix containing metrics of distance between clusters; this matrix is known as a matrix of similarities between groups. Distance methods between groups, such as the Euclidean distance, are used to calculate proximity values between groups. Through the analysis of the dendrogram (a diagram that shows the hierarchy and the relationship of the clusters in a structure), it is possible to infer the number of suitable clusters. Hierarchical clustering generally falls into two types: agglomerative, with a bottom-up approach in which all elements start separately and are grouped in stages, one by one, to form clusters; and divisive, with a top-down approach in which all elements start together in a single cluster and divisions are performed recursively as the hierarchy is descended. As with the agglomerative method, we choose the optimal number of clusters from all possible combinations.

Guan et al. (2018), in their study, applied hierarchical clustering to explore food choices at meals in a sample of overweight and obese participants. This algorithm allowed the identification of food clusters closely related to meals based on reported foods and item frequencies in the screening of dietary data. According to the authors, these results can aid in the development of strategies to improve food choices and behavior change at the individual level through a deeper understanding of these choices.

potentially relevant associations or regularities between items (or attribute values) of the instances (Lakshmi and Vadivu 2017). Rules can be represented by the following implication: X → Y, where X is called the rule antecedent and Y is called the consequent.

The most well-known association rule algorithm is Apriori. Initially, it identifies the frequent individual items, i.e., those whose number of occurrences in the dataset is greater than or equal to a threshold called minimum support. In the second iteration, the algorithm searches for frequent pairs of items containing the frequent individual items of the previous iteration, taking the same minimum support into account. Similarly, in the third iteration, the algorithm determines the frequent item triplets containing the pairs found in iteration 2. Apriori keeps augmenting the sets of frequent items in each following iteration until no changes are detected. Finally, the resulting frequent item sets are used to build the association rules that unveil trends in the dataset. To evaluate the putative rules, a minimum confidence value is considered. The confidence value measures the strength of the implication described by the rule, i.e., it measures how often items in Y appear in instances that contain X. Rules generated with a confidence value below the minimum are discarded. Additionally, the lift measure can be used to evaluate the degree of correlation between X and Y in a rule (Lakshmi and Vadivu 2017).

The study by Jiang et al. (2020) used the Apriori algorithm to reveal correlations between dietary factors and anthropometric changes in middle-aged Danish citizens. This study successfully identified subgroups that shared similar dietary, lifestyle, and anthropometric profiles. The authors mention that this algorithm effectively contributes to the evaluation of eating habits assessed by food frequency questionnaires, and that it was able to retrieve known association rules, such as the beneficial role of fruits and red meats in relation to changes in waist circumference in both sexes.

As demonstrated in this review, supervised learning algorithms, whether of classification or regression, are more widely used for data mining. This is because supervised learning is a much more objective task than unsupervised learning, which has an exploratory character. Additionally, for the same reason, supervised learning models are easier to validate, and there are more validation metrics available. Also, the possibility of classifying future instances with the resulting model is of broad application (Tan, Steinbach, and Kumar 2006).

On the other hand, it is not always possible to perform supervised learning, as it is common for only unlabeled data to be available. In such cases, unsupervised learning can be of great utility, and that is why this machine learning field is also of great importance (Tan, Steinbach, and Kumar 2006). Exploratory analyses make it possible to obtain initial insights and an understanding of the behavior of the data, thus facilitating the conduct of research that does not yet have a final objective outlined.
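
As an illustration of the association rule measures described in this section (minimum support, confidence, and lift), the following minimal Python sketch computes them for a rule X → Y and runs a simplified level-wise search for frequent item sets in the spirit of Apriori. The transactions, item names, and function names are hypothetical and are not taken from any of the reviewed studies; support is expressed here as a relative frequency rather than an absolute count, and the candidate-pruning step of the full Apriori algorithm is omitted for brevity.

```python
from itertools import combinations

# Hypothetical toy data: each instance is the set of food items one person reported.
TRANSACTIONS = [
    {"fruit", "vegetables", "fish"},
    {"fruit", "vegetables"},
    {"fruit", "red_meat"},
    {"vegetables", "fish"},
    {"fruit", "vegetables", "red_meat"},
]


def support(itemset, transactions):
    """Fraction of instances that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(x, y, transactions):
    """How often the items in Y appear in instances that contain X."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def lift(x, y, transactions):
    """Confidence of X -> Y relative to the baseline frequency of Y."""
    return confidence(x, y, transactions) / support(y, transactions)


def frequent_itemsets(transactions, min_support):
    """Simplified level-wise search: grow item sets one item at a time,
    keeping only those whose support reaches `min_support`."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items
             if support([i], transactions) >= min_support]
    frequent = []
    while level:
        frequent.extend(level)
        size = len(level[0]) + 1
        # Candidates of the next size are unions of two frequent sets.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == size}
        level = [c for c in candidates
                 if support(c, transactions) >= min_support]
    return frequent


if __name__ == "__main__":
    x, y = {"fruit"}, {"vegetables"}
    print("support:", support(x | y, TRANSACTIONS))
    print("confidence:", confidence(x, y, TRANSACTIONS))
    print("lift:", lift(x, y, TRANSACTIONS))
    print("frequent item sets:", frequent_itemsets(TRANSACTIONS, 0.4))
```

In this sketch, a lift close to 1 suggests that X and Y occur together about as often as would be expected if they were independent, whereas values well above or below 1 suggest a positive or negative association, respectively.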
paper, has led to a change in traditional data analysis forms. Historically, the most used programs in the medical field are SAS, SPSS, and STATA. However, some difficulties may arise with these programs, such as updating or adding datasets of different types and sources, or handling unstructured data such as text and images (Fernandes and Filho 2019).

In this work, a change in data analysis tools was noted, as the vast majority of studies used the R software. This software is an open-source, multi-platform, and free statistical environment created by Ross Ihaka and Robert Gentleman in 1997 (Matloff 2009). R has become so popular because it has a wide variety of integrated functions, packages, and libraries that perform tasks ranging from simple to more complex, such as applying statistical tests and ML algorithms (Murrell 2005; Matloff 2009).

Therefore, there was a growing trend to use the R software instead of classic statistical software, especially among researchers who are not in the field of computing, such as those in health, since R is more accepted by the scientific community and is a more complete program.

The use of WEKA in the studies included in this review is also noteworthy. Its high adoption is probably due to its simplicity, to the fact that it contains many ML libraries implemented and ready to use, and, very importantly, to the fact that it can be used without prior programming knowledge.

Strengths

This systematic review had several strengths, including the fact that it is the first systematic review that analyzed the application of different ML algorithms to evaluate food intake in healthy and unhealthy individuals. In addition, we included all studies regardless of the characteristics of the populations studied, the type of study, the language, and the year of publication. This decision allowed a broader search to identify and include all studies investigating food intake with the application of different ML algorithms. Another strong point was the inclusion of studies that identified food and beverage images to evaluate food consumption. This inclusion allowed a more comprehensive review of ML applications in the area of health, focusing on nutrition, highlighting the growth in the use of these algorithms in recent years and the countries involved in this research. Besides, the main methods used to evaluate food intake, the main ML algorithms, and the computational tools employed were presented.

Conclusion

This review summarizes the latest information on the use of different ML algorithms to evaluate food intake. It can serve as a guide for health professionals who want to work in the area of AI. It is concluded from the results found that, currently, there is a great and growing interest in the use of ML algorithms in the area of nutrition, as shown by a significant increase in publications in recent years.

In addition, it is also noted that the supervised learning algorithms, more precisely those based on Decision Trees, were the most used. The more frequent use of DT is possibly because they are fast to apply and simple to understand, and because their results are easy to interpret and explain. In addition, there was a change in the use of computational tools for statistical analysis, with a tendency to use other software, such as R, for being more complete, instead of the classic statistical programs.

Regarding the assessment of food intake, we observed that the FFQ was the most used method since it allows a long-term evaluation and is simple, fast, and easy to administer and process. However, even considering the importance of investigating food intake in each population and how useful ML algorithms can be, there was little diversity of countries involved in the studies analyzed. In this sense, it is encouraged that studies focusing on the application of ML algorithms in the investigation of food intake in each country are conducted, as the problems faced by different regions require different levels of research and intervention for the development of food reeducation programs and specific public policies. Furthermore, health professionals should understand that the use of ML is a collaborative activity, combining professional experience with data analysis and processing, in order to facilitate decision making in planning and delivering health care.

We would like to stress that there are other machine learning methods applied to food intake besides the ones that we cited here. However, we explored the most frequently applied ML procedures to provide an overview of the main ML methods used in relevant publications in recent years.

We suggest that researchers who use machine learning techniques in their studies mention broader search terms (such as machine learning, deep learning, and data mining) in their texts, and not just the specific names of techniques, in order to expand their visibility and make their articles easier to identify through search engines.

Author contributions

L.O.C., A.L.G.D., D.L.F., and J.B. designed the study. L.O.C. and A.L.G.D. selected and reviewed the articles and extracted the data. L.O.C., A.L.G.D., and D.L.F. analyzed and interpreted the data and drafted the manuscript. J.B., R.D-B., and F.R.C. improved the manuscript and critically revised the scientific content. All authors read and approved the final manuscript.

Conflict of interest

The authors have no relevant interests to declare.

Funding

This work was supported by the Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG), Belo Horizonte, Brazil; the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Brasília, Brazil; and the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Brasília, Brazil.
of esophageal and gastric cancers: Classification tree analysis. Annals of Epidemiology 24 (1):50–57. doi:10.1016/j.annepidem.2013.10.009.
Singh, P. S., Singh, and G. S. Pandi-Jai. 2018. Effective heart disease prediction system using data mining techniques. International Journal of Nanomedicine 13:121–124. doi:10.2147/IJN.S124998.
Siqueira-Batista, R., and E. Silva. 2019. Notas sobre os fundamentos matemáticos da Inteligência Artificial. Revista de Ciência, Tecnologia e Inovação 4:44–54.
Smallwood, R. D., and E. J. Sondik. 1973. The optimal control of partially observable Markov processes over a finite horizon. Operations Research 21 (5):1071–1088. doi:10.1287/opre.21.5.1071.
Tan, P. N., M. Steinbach, and V. Kumar. 2006. Introduction to data mining. São Carlos: Pearson Education.
Vasileska, A., and G. Rechkoska. 2012. Global and regional food consumption patterns and trends. Procedia - Social and Behavioral Sciences 44:363–369. doi:10.1016/j.sbspro.2012.05.040.
Vucic, V., M. Glibetic, R. Novakovic, J. Ngo, D. Ristic-Medic, J. Tepsic, M. Ranic, L. Serra-Majem, and M. Gurinovic. 2009. Dietary assessment methods used for low-income populations in food consumption surveys: A literature review. British Journal of Nutrition 101 (S2):S95–S101. doi:10.1017/S0007114509990626.
Xu, R., B. E. Blanchard, J. M. McCaffrey, S. Woolley, L. M. L. Corso, and V. B. Duffy. 2020. Food liking-based diet quality indexes (DQI) generated by conceptual and machine learning explained variability in cardiometabolic risk factors in young adults. Nutrients 12 (4):882. doi:10.3390/nu12040882.
Yu, E. Y. W., A. Wesselius, C. Sinhart, A. Wolk, M. C. Stern, X. Jiang, L. Tang, J. Marshall, E. Kellen, P. van den Brandt, et al. 2020. A data mining approach to investigate food groups related to incidence of bladder cancer in the bladder cancer epidemiology and nutritional determinants international study. The British Journal of Nutrition 124 (6):611–619. doi:10.1017/S0007114520001439.
Zeevi, D., T. Korem, N. Zmora, D. Israeli, D. Rothschild, A. Weinberger, O. Ben-Yacov, D. Lador, T. Avnit-Sagi, M. Lotan-Pompan, et al. 2015. Personalized nutrition by prediction of glycemic responses. Cell 163 (5):1079–1094. doi:10.1016/j.cell.2015.11.001.
Zenitani, S. H., Nishiuchi, and T. Kiuchi. 2010. Smart-card-based automatic meal record system intervention tool for analysis using data mining approach. Nutrition Research (New York, N.Y.) 30 (4):261–270. doi:10.1016/j.nutres.2010.04.003.
Zheng, Q., H. Delingette, K. Fung, S. E. Petersen, and N. Ayache. 2019. Unsupervised shape and motion analysis of 3822 cardiac 4D MRI of UK Biobank. Preprint submitted to arXiv.