The EAT-Lancet Commission’s Planetary Health Diet Compared With the Institute for Health Metrics and Evaluation Global Burden of Disease Ecological Data Analysis

Background This article aimed to compare the EAT-Lancet Commission’s “Planetary Health Diet” (PHD) with the Institute for Health Metrics and Evaluation (IHME) Global Burden of Disease Study 1990-2017 (GBD2017) dietary and other risk factor data. In the PHD/GBD comparison, we also intended to show the relevance of a new multiple regression analysis methodology with dietary and non-dietary risk factors (independent variables) for noncommunicable disease (NCD) deaths/100000/year in males and females 15-69 years old from 1990 to 2017 (NCDs, dependent variable). Methods We formatted worldwide GBD2017 dietary risk factors and NCD data on 1120 worldwide cohorts to obtain 7846 population-weighted cohorts. Each cohort represented about one million people, totaling about 7.8 billion people from 195 countries. With an empirically derived methodology, we compared the PHD animal- and plant-sourced food recommended ranges (kilocalories/day=KC/d) with optimal dietary ranges (KC/d) from GBD cohort data. Using GBD data subsets with low and high animal food consumption cohorts, our new GBD multiple regression formula derivation methodology equated risk factor formula coefficients to their population-attributable risk percents (PAR%s). 
Results We contrasted PHD recommendations for the available 14 dietary risk factors (KC/d means and ranges) with our GBD analysis methodology’s optimal ranges for each dietary variable (KC/d mean and range): PHD beef, lamb, and pork mean: 30 KC/d (range: 0-60 KC/d)/GBD processed meat: 8.86 (1.69-16.03)+GBD red meat: 44.52 (20.37-68.68), PHD fish: 40 (0-143)/GBD: 19.68 (3.45-35.90), PHD whole milk or equivalents: 153 (0-306)/GBD: 40.00 (18.89-61.11), PHD poultry: 62 (0-124)/GBD: 56.10 (24.13-88.07), PHD eggs: 19 (0-37)/GBD: 19.42 (9.99-28.86), PHD saturated oils: 96 (0-96)/GBD added saturated fatty acids (SFA): 116.55 (104.04-129.07), PHD all added sugars: 120 (0-120)/GBD sugary beverages: 286.37 (256.99-315.76), PHD tubers or starchy vegetables: 39 (0-78)/GBD potatoes: 84.16 (75.75-92.58)+GBD sweet potatoes: 9.21 (4.05-14.37), PHD fruits: 126 (63-189)/GBD: 63.03 (21.61-113.71), PHD vegetables: 78.32 (9.48-196.14)/GBD: 85.05 (66.75-103.36), PHD nuts: 291 (0-437)/GBD nuts and seeds: 10.97 (5.95-15.98), PHD whole grains: 811 (811/811)/GBD: 56.14 (50.53-61.76), PHD legumes: 284 (0-379)/GBD: 59.93 (45.43-74.43), and total animal food PHD: (0/400)/GBD: 329.84 (212.49-447.19). Multiple regression low and high animal food subsets’ (animal foods mean=147.09 KC/d versus animal foods mean=482.00 KC/d) formulas each with 28 dietary and non-dietary risk factors (independent variables) accounted for 52.53% and 28.83% of their respective total formula PAR%s with NCDs (dependent variable). Conclusions GBD data modeling supported many but not all the PHD dietary recommendations. GBD data suggested that the amount of animal food consumed was the dominant determinant of NCDs in countries globally. Adding to the univariate associations, multiple regression risk factor formulas with risk factor coefficients equated to their PAR%s further elucidated dietary influences on NCDs.
This paper and the soon-to-be-released IHME GBD2021 (1990-2021) data should help inform the EAT-Lancet 2.0 Commission’s work.


Introduction
The main characteristics of the IHME GBD data sources, the protocol for the GBD study, and all risk factor values have been published by IHME GBD data researchers and discussed elsewhere [5]. These include detailed descriptions of categories of input data, potentially important biases, and methodologies of analysis. We did not clean or preprocess any of the GBD data. GBD cohort risk factor and health outcome data from the IHME had no missing records other than dietary covariates (poultry, eggs, potatoes, corn, rice, and sweet potatoes) for the United States. Table 1 lists the relevant GBD dietary risk factors and covariates with definitions of those risk factor exposures [6].
Food risk factors came from surveys that IHME researchers utilized as grams/day (g/day) consumed on average. GBD dietary covariate data originally came from Food and Agriculture Organization [7] surveys of animal and plant food commodities available per capita in countries worldwide (i.e., potatoes, corn, rice, sweet potatoes, poultry, and eggs), as opposed to consumed on average.

Study design and population
For NCDs, dietary risk factors, non-dietary risk factors, and combinations of dietary or non-dietary risk factors, we averaged the values for ages 15-49 years together with those for ages 50-69 years for each male and female cohort for each year. Finally, for each male and female cohort, we averaged the yearly means of the NCD rate and of the dietary and other risk factor exposures over all 28 years (1990-2017) using the computer software program R (R Foundation for Statistical Computing, Vienna, Austria).
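The two-stage averaging described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' R code; the record layout, the function name `average_cohort_values`, and the use of a simple unweighted mean of the two age groups are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def average_cohort_values(records):
    """Average one variable per cohort (hypothetical helper).

    records: iterable of (cohort_id, year, age_group, value) tuples,
    where age_group is "15-49" or "50-69" (assumed layout).
    """
    # Stage 1: average the two age groups within each cohort-year.
    by_cohort_year = defaultdict(list)
    for cohort_id, year, _age_group, value in records:
        by_cohort_year[(cohort_id, year)].append(value)

    yearly_means = defaultdict(list)
    for (cohort_id, _year), values in by_cohort_year.items():
        yearly_means[cohort_id].append(mean(values))

    # Stage 2: average the yearly means over all available years.
    return {cohort_id: mean(vals) for cohort_id, vals in yearly_means.items()}
```

The same function would be applied once per variable (the NCD rate and each risk factor exposure), giving one value per male or female cohort.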
World population data from the World Bank or the Organization for Economic Co-operation and Development could not be used because they did not include all 195 countries or any subnational data. To weight the data according to population, internet searches (mostly Wikipedia) yielded the most recent population estimates for countries and subnational states, provinces, and regions. The 1120 GBD cohorts available were population-weighted by the software program R, resulting in an analysis dataset with 7846 population-weighted cohorts, representing about 7.8 billion people projected for 2019. Each male or female cohort in the population-weighted analysis dataset represented approximately one million people (range: from <100000 to 1.5 million).
Table 2 details how omega-3 fatty acid gram/day was converted to fish gram/day using data on the omega-3 fatty acid content of frequently eaten fish from the National Institutes of Health Office of Dietary Supplements (the United States) [8]. As shown in Table 3, we converted all of the animal and plant food data, including alcohol and sugary beverage consumption, from gram/day to KC/d. For the gram/day to KC/d conversions, we used the Nutritionix Track app (Nutritionix LLC, Washington, DC) [9], which tracks types and quantities of foods consumed.
Saturated fatty acid (SFA: 0-1 portion of the entire diet KC/d) data were not available with GBD2017 data, so we used GBD SFA risk factor data from GBD2016. Polyunsaturated fatty acid (PUFA) and trans fatty acid (TFA) GBD risk factor data from 2017 (0-1 portion of the entire diet KC/d) were also utilized, but monounsaturated fat data were not available. These fatty acid data, expressed for each cohort as 0-1 portion of the entire diet, were converted to KC/d by multiplying by the total KC/d available per capita (a covariate from the Food and Agriculture Organization [7]).
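The text does not spell out how the 1120 cohorts became 7846 population-weighted cohorts. One plausible reading, sketched here in Python purely as an assumption, is that each cohort row was repeated roughly once per million people it represents, with at least one row kept for small cohorts. The function name, the `population` key, and the rounding rule are hypothetical.

```python
def replicate_by_population(cohorts, people_per_row=1_000_000):
    """Repeat each cohort row in proportion to its population.

    cohorts: list of dicts that each carry a "population" key (assumed).
    Assumption: one row per ~1 million people, minimum one row per cohort.
    """
    weighted = []
    for cohort in cohorts:
        copies = max(1, round(cohort["population"] / people_per_row))
        # Copy the dict so later edits to one row do not affect the others.
        weighted.extend(dict(cohort) for _ in range(copies))
    return weighted
```

Under this reading, a cohort of 3.2 million people would contribute three rows and a cohort of 400000 people would contribute one, which is in line with each weighted row representing from <100000 to about 1.5 million people.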
The NCD outcome (dependent variable) was deaths/100000/year of male and female cohorts 15-69 years old from (1) cardiovascular diseases, (2) type 1 diabetes, (3) type 2 diabetes, (4) chronic respiratory disease, (5) chronic renal disease, (6) liver cirrhosis, (7) inflammatory bowel disease, (8) liver cancer, (9) esophageal cancer, (10) stomach cancer, (11) prostate cancer, (12) breast cancer, (13) bladder cancer, (14) non-Hodgkin's lymphoma, (15) ovarian cancer, (16) brain cancer, (17) lung cancer, (18) multiple myelomas, (19) colorectal cancer, (20) kidney cancer, (21) melanoma, (22) pancreatic cancer, and (23) many other less common noncommunicable diseases.
We did not include three of the plant food covariates (potatoes, corn, and rice) with the healthy plant foods because of the following: (1) Half or more of potatoes available worldwide were ultra-processed and contained many additives [11]. (2) Corn available included high-fructose corn syrup as demonstrated by the high correlation of corn with sugary beverages in this database (r=0.330, 95% confidence interval {CI}=0.310-0.349, p<0.0001). (3) Rice available was mostly refined without bran (the fibrous outer layer) and germ (the nutritious core). Whole grain rice was included in the analysis with the whole grains.
The mean ratio of added fats and oils to total fat (KC/d) across the 169 countries with data was 0.556. We used this value as the ratio for the 26 countries that did not have data. To derive the added SFA, added PUFA, and added TFA for each country, we multiplied the SFA, PUFA, and TFA of that country by their respective ratios of the added fats and oils/total fat (KC/d).
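The fatty acid conversions described above amount to two multiplications: the 0-1 dietary portion times total available KC/d, then times the country's added fats and oils/total fat ratio. The Python sketch below is illustrative only; the function name and the example inputs are assumptions, while the 0.556 fallback is the mean ratio stated in the text for countries without data.

```python
# Mean ratio of added fats and oils to total fat, used when a country has no data.
DEFAULT_ADDED_FAT_RATIO = 0.556


def added_fatty_acid_kcal(fraction_of_diet, total_kcal_per_day, added_fat_ratio=None):
    """Convert a fatty acid exposure (0-1 portion of diet) to added KC/d.

    fraction_of_diet: GBD exposure as a 0-1 portion of total dietary energy.
    total_kcal_per_day: total KC/d available per capita (FAO covariate).
    added_fat_ratio: country-specific ratio, or None to use the global mean.
    """
    ratio = DEFAULT_ADDED_FAT_RATIO if added_fat_ratio is None else added_fat_ratio
    fatty_acid_kcal = fraction_of_diet * total_kcal_per_day  # portion -> KC/d
    return fatty_acid_kcal * ratio  # keep only the added fats and oils share
```

For example, a hypothetical cohort with SFA at 0.10 of a 2500 KC/d diet and no country ratio would be assigned 0.10 × 2500 × 0.556 = 139 KC/d of added SFA.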

Statistical methods
To determine the strengths of the risk factor correlations with NCDs, we utilized Pearson correlation coefficients: r, 95% confidence intervals (CIs), and p values. We did this for the entire analysis dataset and subgroups including continents, countries, and sociodemographic index quartiles.
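As an illustration of the correlation statistics named above, the following Python sketch computes a Pearson r and a 95% CI using the Fisher z-transformation. The authors ran their analysis in SAS; the function names here are hypothetical, and the Fisher method is an assumption about how the intervals were obtained.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r from n observations (Fisher z-transformation)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

Applied to the worldwide LDL cholesterol correlation reported in the Discussion (r=-0.339 with n=7846 cohorts), `fisher_ci(-0.339, 7846)` gives an interval of about -0.358 to -0.319.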
The EAT-Lancet Commission authors [2] considered 0-400 KC/d of animal foods as optimal for human health and global climate. For our comparison, we began with finding the 1000 cohorts (500 pairs of males and females), representing about one billion people 15-69 years old with the lowest NCDs (mean male/female {m/f}). From these low-NCD cohorts, we obtained the subsets with mean animal-sourced food consumption of <400 KC/d and ≥400 KC/d. From this start, we defined low and high animal food subsets as follows: (1) Low animal food subset=animal food seven of <400 KC/d of the lowest 1000 NCD cohorts (mean KC/d m/f)+all cohorts with animal food seven (mean m/f KC/d)<the animal food seven of the lowest NCD cohorts' country with the lowest animal food seven (e.g., Kenya). (2) High animal food subset=all 1000 lowest NCD cohorts+all cohorts with animal food seven of ≥400 KC/d.
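The subset definitions above can be sketched in Python as follows. This is an interpretation, not the authors' code: the dictionary keys `ncd` and `animal_kcal`, the function name, and the choice to take the floor as the lowest animal food seven value among the low-NCD pairs under the cutoff (rather than a country-level mean) are assumptions.

```python
def split_subsets(pairs, n_lowest=500, cutoff=400.0):
    """Build the low and high animal food subsets from m/f cohort pairs.

    pairs: list of dicts with "ncd" (deaths/100000/year) and
    "animal_kcal" (mean m/f animal food seven, KC/d); assumed layout.
    Assumes at least one low-NCD pair falls below the cutoff.
    """
    ranked = sorted(pairs, key=lambda p: p["ncd"])
    lowest_ncd = ranked[:n_lowest]
    others = ranked[n_lowest:]

    # Low-NCD pairs below the 400 KC/d animal food cutoff.
    low_ncd_low_animal = [p for p in lowest_ncd if p["animal_kcal"] < cutoff]
    # Lowest animal food intake among those low-NCD pairs (e.g., Kenya).
    floor = min(p["animal_kcal"] for p in low_ncd_low_animal)

    low_subset = low_ncd_low_animal + [p for p in others if p["animal_kcal"] < floor]
    high_subset = lowest_ncd + [p for p in others if p["animal_kcal"] >= cutoff]
    return low_subset, high_subset
```

Note that low-NCD pairs below the cutoff appear in both subsets under this reading, because the high subset is defined to contain all of the lowest-NCD cohorts.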
With these low animal food and high animal food subsets, we derived multiple regression formulas from NCDs (dependent variable) versus dietary and other risk factors (independent variables). See Appendices for the detailed methodology of deriving multiple regression formulas with risk factor coefficients equated to their population-attributable risk percents (PAR%s).
We also developed a methodology for estimating the optimal dietary risk factor (KC/d) range upper and lower boundaries to minimize NCDs (see Appendices). We used SAS OnDemand for Academics software 9.4 (SAS Institute, Cary, NC) for the data analysis.
There were 23 dietary risk factors potentially relating to NCDs, including two risk factor combinations, six dietary covariates, and total available KC/d. Table 5 also includes 20 non-dietary risk factors that we screened for significant PAR%s for NCDs. Table 6 shows the 500 pairs (mean males/females) of cohorts (n=1000 cohorts, representing about one billion people) with the lowest NCDs and the 500 pairs of cohorts with the highest NCDs. This breakdown will facilitate the comparisons of the EAT-Lancet Commission's Planetary Health Diet recommendations with GBD data analysis. Table 8 and Table 9 list the low-NCD countries in Table 7, distinguishing the low-NCD countries or subnational states/regions with mean animal food seven of <400 KC/d and animal food seven of ≥400 KC/d.
Table 10 shows scenarios of data from 12 continents, countries, and SDI quartiles that illustrate the diverse relationships of animal food seven with NCDs in different subsets of the global analysis dataset. For example, animal food seven positively correlated with NCDs in five subsets, negatively correlated with NCDs in six subsets, and had no significant correlation in one.

Multiple regression-derived formulas of risk factors versus NCDs
With methods detailed in Appendices, we derived a multiple regression risk factor formula with risk factors from the lowest 416 NCD cohort pairs with mean m/f animal food seven of <400 KC/d (Table 7) together with m/f pairs of cohorts with mean animal food seven of <149 KC/d (the lowest mean m/f animal food seven {KC/d} in Table 3, Kenya). Table 12 shows the three-step derivation of the multiple regression risk factor formula from this low animal food seven subset with paired risk factors.
Table 14 shows the derivation for the above subset (n=2724 cohorts) with individual cohorts.
Note the similarities and differences of the low animal food seven formulas. All animal food seven and healthy plant seven risk factors in both formulas had negative coefficients. This suggests that populations in this subset would have had a lower risk of early deaths from NCDs with higher consumption of healthy animal and plant foods.
In contrast to Table 15 with mean values of male/female cohorts, Table 17 shows the high animal food seven subset with individual male and female cohorts. From the above Table 17 subset, Table 18 shows the two-step derivation of the high animal food seven multiple regression risk factor formula.
Deriving multiple regression risk factor formulas with mean risk factor values from male and female cohorts eliminated the dominant role of sex in some of the PAR% values. The difference between the total formula PAR% with m/f mean values and unpaired risk factors was accounted for by sex differences. Illustrating this factor, males worldwide had much higher NCDs and were exposed to more meat than females (e.g., mean red meat: males=60. Major differences between exposures of males and females affected the multiple regression formula when males and females were unpaired but not when combined as mean values.

Comparing Planetary Health Diet (PHD) recommendations with GBD data
In what is analogous to the 22 dietary risk factors with GBD data, EAT [1] published the PHD animal-and plant-based foods (KC/d) recommended ranges. To compare GBD data with the PHD, we began with the mean m/f values from  Table 19 shows the three-step derivation of estimates of the optimal ranges for 22 dietary risk factors, as well as the PHD KC/d suggested dietary ranges of 14 dietary risk factors.
This suggested that less animal food consumption, other factors being equal, would reduce NCD risk. Less animal food seven would also decrease the risks of common cancers (Table 8 and Table 9).

Discussion
The EAT-Lancet Commissioners [2] designated red meat as a detrimental food item for which worldwide consumption should be reduced by more than 50%. Table 19 indicated more nuanced health effects of red meat and processed meat than simply being detrimental at any consumption level. These GBD data suggested that minimizing NCDs in developing countries with low animal food seven requires dramatically increasing meat production and consumption. Conversely, reducing meat consumption in high-meat-eating, wealthy countries such as the United States (mean m/f red meat=138.72 KC/d, and mean m/f processed meat=39.90 KC/d, Table 9) would be associated with lower NCDs.
Since the PHD lower boundaries for all animal foods were zero, GBD data did not support the PHD recommendation that humans can thrive on a lifelong vegan diet. There has never been a documented case of a human living into old age without ever eating animal-sourced foods. Evolutionary biologist Katharine Milton [13] persuasively maintained that humans could not have satisfied the high nutritional and metabolic demands required to develop a highly evolved, large brain without meat. This is not to say that adopting a vegan diet to counteract overweight or obesity with the associated metabolic and other complications would be inappropriate.
Table 19 shows that the GBD fish consumption optimum range mean (19.68 KC/d) almost doubled the mean consumption of fish worldwide (9.99 KC/d, Table 5). The upper boundary of the fish optimal ranges with GBD data and the PHD recommendation are similar (GBD: 35.90 KC/d versus PHD: 40 KC/d, Table 19). In any case, it will be challenging to even double worldwide fish consumption. About 60% of world fish stocks are fully fished, more than 30% are overfished, and catch by global marine fisheries has been declining since 1996. In addition, a rapidly expanding aquaculture sector can negatively affect coastal habitats and freshwater and terrestrial systems (related to the area directly used for aquaculture and feed production) [14]. To improve human health and reduce NCDs, environmentally regenerating methods of fish farming and aquaculture should be sought.
It would take increasing the consumption of milk-derived products by over sixfold worldwide to achieve the PHD 2050 mean milk recommendation of 153 KC/d (Table 19). The 1990-2017 worldwide mean per capita milk consumption was 25.04 KC/d (Table 5). While increasing global milk output sixfold, it would be practically impossible to halve global processed and red meat consumption (recommended in the PHD). However, with the GBD optimum range mean m/f milk consumption being 40.00 KC/d (Table 19), GBD data suggest that significantly increasing worldwide dairy cow milk production with additional cows going predominantly to developing countries may reduce global NCDs. Countries with mean m/f milk production and consumption greater than the upper boundary of the GBD optimal range (mean m/f milk of >61.11 KC/d, Table 19) might want to reduce milk production, which has been proposed in some European countries based on greenhouse gas emissions [15].
Except for dairy food consumption, the comparisons of GBD with PHD optimal dietary ranges of animal food seven in Table 19 show a significant degree of concordance in orders of magnitude of the mean and upper boundary values for (1) processed meat+red meat/beef, lamb, and pork; (2) fish; (3) poultry; (4) eggs; (5) added SFA; and (6) animal food seven.
With the GBD animal food seven optimal range of 212.49-447.19 KC/d (Table 19) and 20 low-NCD countries with <400 KC/d animal food seven consumption (Table 8), GBD data support the EAT-Lancet Commission's contention that >400 KC/d of animal food seven is not required for optimal human health. Indeed, early deaths from common cancers were much lower in cohorts with <400 KC/d than cohorts with animal foods of ≥400 KC/d (Table 8 and Table 9).
The amounts of sugary beverages in the GBD optimal range (256.99-315.76 KC/d) were clearly not optimal. The PHD recommendation of 0-120 KC/d for all added sugar would be better for global health but probably not practical. The low global price of sugar ($0.214/pound in February 2023; one pound of sugar contains 1864 KC [16]) suggests that sugar is replacing healthy foods, especially in poor countries. Compared with the rest of the world (mean m/f sugary beverages=298.36 KC/d, Table 19).
The large reduction in starchy vegetable intake recommended in the PHD appeared to be based on prospective observational studies by the Harvard Department of Nutrition [17,18] that showed potatoes are associated with an increased risk of type 2 diabetes and hypertension among US health professionals. However, half or more of the potatoes consumed worldwide were in the form of ultra-processed food products [11]. Data from 79 high- and middle-income countries showed that ultra-processed products dominate the food supplies of high-income countries and that their consumption is now rapidly increasing in middle-income countries [19]. Indeed, recent large prospective observational studies have found higher consumption of ultra-processed foods including potatoes associated with an increased risk of cardiovascular disease incidence and mortality [20].
Maillot and associates [21] found in an econometric evaluation of food groups that "Starches and grains were unique because they were low in disqualifying nutrients yet provided low-cost dietary energy." Headey and Alderman [22] found that "In lower-income countries, healthy foods were generally expensive, especially most animal-sourced foods." Given the low cost of starchy vegetables relative to animal foods, fruits, vegetables, and nuts and seeds, there would seem to be no reason to severely reduce starchy vegetable consumption (including minimally processed potatoes) worldwide.
This analysis shows that these crop increases would not be practical from a worldwide farming perspective and would not be necessary to minimize NCDs. If the global population moved animal food seven consumption into the GBD optimal range to minimize NCDs (212.49-447.19 KC/d), the global animal food consumption would probably not increase above the worldwide mean animal food seven consumption from 1990 to 2017 (worldwide mean animal food seven=254.66 KC/d, Table 5).
The global average rice intake was 152.00 KC/d (Table 5). In Table 11 . This might be explained by relatively inexpensive rice substituting for more expensive healthy animal and plant foods in poor countries. It might also relate to mostly refined white rice (without bran {the fibrous outer layer} and germ {the nutritious core}) having less nutrition than whole grain rice [23].
In 2014, Mozaffarian et al. [24] attributed 1.65 million cardiovascular deaths worldwide to sodium consumption above 2.0 g per day. However, based on a prospective cohort study, O'Donnell et al. [25] reported an optimal average sodium intake range of 3-5 g/day, with cardiovascular events most prominently associated with higher sodium intake (>5 g/day) in those with hypertension. The joint working group of the World Heart Federation, the European Society of Hypertension, and the European Public Health Association in 2017 [26] concluded that the guidelines restricting sodium intake were far too restrictive.
In this GBD analysis, Japanese had the world's highest mean sodium (gram/day) (sodium=6.01 g/day versus global average sodium {gram/day}=4.45 g/day, Table 5). Japanese also had a relatively high prevalence of smoking (smoking prevalence=26.8%) and the lowest mean NCDs in the world after Kuwait (mean Japanese NCDs=725.61 deaths/100000/year). Even a 5 g/day guideline may not be needed for people without medical indications for restrictions on sodium intake. The American Heart Association might note these GBD data in reconsidering sodium intake recommendations.
The worldwide negative correlation of LDL cholesterol with NCDs was also unexpected (r, -0.339; 95% CI, -0.358 to -0.319; p<0.0001, Table 5). In the high animal food seven subset (NCDs of <1070.23 or animal food seven of >400 KC/d, n=1722 cohorts m/f, The non-dietary risk factors in the multiple regression risk factor formula have plausible PAR%s given whether the subset analyzed had low animal food seven (Tables 10-13) or high animal food seven (Tables 15-17). As might be expected, vitamin A deficiency in children, severe underweight in children, ambient air pollution, and household air pollution were prominent in the low animal food seven/low sociodemographic index (mean SDI=0.411) cohorts. Smoking prevalence appeared only when cohorts were unpaired (Table 14), allowing the higher NCDs and higher smoking prevalence in males to have full influence.
In the high animal food seven/high sociodemographic index cohorts (mean SDI=0.750, Tables 15-17), the major non-dietary risk factors were stopping breast feeding before six months, smoking prevalence, lead, and body mass index.
The limitations of this study included using observational data, which can only show association and cannot establish causation between risk factors and NCDs. Also, this study focused on the relationship between diet and NCDs at the population level and did not provide individual-level analysis. Our study was subject to all the limitations discussed in previous GBD publications [29,30]. These included gaps, biases, and inconsistencies in data sources, as well as limitations in the methods of data processing and estimation.
Having comprehensive data on dietary inputs is key to more accurate and reliable analyses. These GBD data on animal foods, plant foods, alcohol, sugary beverages, and fatty acids were not comprehensive and comprised only 1218.98 KC/d per person on average worldwide (Table 5). Subnational data on all risk factors were available in only four countries. Because the data formatting and statistical methodology were new, this was necessarily a post hoc analysis, and no pre-analysis protocol was possible. We and other researchers should repeat this GBD data analysis when the IHME releases the GBD2021 data and makes them available to IHME volunteer collaborators.
GBD data modeling supported many but not all of the PHD dietary recommendations. This evidence-based methodology of analyzing IHME GBD data may have advantages over systematic literature review studies in developing health policy strategies, clinical practice guidelines, and public health recommendations. First, using a form of artificial intelligence (a large dataset from 195 countries), this study provided comprehensive analyses of the relationship between dietary and non-dietary risk factors and NCDs in selected subsets. Second, it provided estimates of optimal ranges of food risk factors for minimizing NCDs, using a methodology that can apply to individual noncommunicable diseases (e.g., colon cancer, ischemic heart disease, or BMI). Third, the multiple regression analyses provided quantitative formulas for estimating the risk of NCDs based on various risk factors in selected subsets of the GBD data. This can be useful for identifying high-risk populations and targeting interventions. Last but not least, this study included data on 20 low-NCD countries with relatively low animal food intake (mean m/f animal food seven of <400 KC/d). This can be helpful for identifying dietary and lifestyle patterns that may be protective against NCDs or other health outcomes (e.g., BMI). It can also lead climate scientists to learn from countries that have limited greenhouse gas emissions from animal foods while achieving low NCDs.

Appendices
Appendix 1: Methodology for deriving multiple regression risk factor formulas
Table 20 provides an overview of the steps in deriving the multiple regression risk factor formula.
Derive the parameter estimates and the partial R² of all the variables. Multiply each dietary and non-dietary risk factor by its parameter estimate or its partial R² based on empirical judgement.
Step 10: From the above, derive a preliminary NCD risk factor formula and a final risk factor formula. Copy the preliminary risk factor formula into Excel. Algebraically equate the preliminary risk factor formula into the final risk factor formula.
Our multiple regression formula derivation method differed from standard modeling in several important ways. We did not seek to minimize the number of individual dietary and non-dietary risk factors included or to maximize the total variance (and population-attributable risk percents {PAR%s}) of each formula. Instead, we developed several strategies to combat the confounding of risk factors by risk factor to risk factor interactions and to enhance the plausibility of each risk factor's PAR%.
GBD analysis database subsets were used to derive two risk factors versus NCD multiple regression formulas: (1) The first analysis included those cohorts with mean (m/f pairs) animal food seven of <400 KC/d out of the 500 pairs of the lowest NCD cohorts (1000 cohorts, representing about one billion people) and all other mean m/f cohorts with animal food intake less than the mean m/f animal food consumption of the lowest NCD country in the subset (e.g., Kenya). (2) The second included all 500 m/f pairs of cohorts with the lowest NCDs and all other cohort pairs with mean animal food seven consumption of ≥400 KC/d (i.e., all 500 pairs of cohorts in the lowest NCD subset and say 500-1500 pairs of other cohorts with mean m/f animal food seven of ≥400 KC/d).
Using Statistical Analysis System (SAS) and Excel (spreadsheets), the resulting multiple regression analysis-derived formulas with dietary and other risk factors (independent variables) for NCDs (dependent variable) came from these subsets. Since there was no published ecological epidemiologic methodology to derive PAR%s for each of >20 risk factors, we used the following empirically developed 10 steps in the multiple regression analyses:
independent variables, they would be multiplied by the partial R² instead of the parameter estimate to capture only the additional total formula R² they contributed to the risk factor formula. An implausible dietary risk factor sign (e.g., "-" for sugary beverages or alcohol) might be reversed when the risk factor became an independent variable in the multiple regression. All independent risk factors would have as their coefficients their partial R-squared values. If implausible risk factor(s) signs were not reversed in the multiple regression, the partial R-squared coefficient(s) would be reversed in the plausible direction.
(c) Step 9a-b created a single combination risk factor variable composed of all the dietary and non-dietary risk factors. We called this preliminary risk factor formula 1 and copied it into a data step in SAS.
10. With the single combination risk factor variable derived in step 9, we performed the following steps to equate the risk factor coefficients to their PAR%s:
(a) In Excel, we totaled the risk factor coefficients of the single combination risk factor variable ("preliminary risk factor formula 1").
(b) We determined the correlation (r) of the preliminary risk factor formula 1 in SAS, copied it into Excel, and subsequently calculated the R² of the risk factor formula.
(c) We then divided the preliminary risk factor formula 1's R² (step 10b) by the sum of the absolute values of the risk factor coefficients (step 10a) to generate a multiplier.
(d) We copied preliminary risk factor formula 1 onto an adjacent location in Excel in preparation to equate the risk factor coefficients to their PAR%s by using the multiplier.
(e) We then multiplied each risk factor coefficient in step 10d by the multiplier.
(f) We multiplied the results by 100 to derive the final risk factor formula with coefficients equated to final PAR%s.
(g) Finally, we took the final risk factor formula from step 10f to the PROC CORR function in SAS to confirm that it had the same r and R² as preliminary risk factor formula 1.
Table 21 provides a methodology synopsis.
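Step 10 can be illustrated with a short Python sketch. It is not the authors' SAS/Excel workflow; the function names, the dictionary layout, and the toy inputs are assumptions. It rescales the preliminary coefficients so that their absolute values sum to the formula's R² × 100, and it relies on the fact that multiplying a linear combination by a positive constant leaves its Pearson r with NCDs unchanged (the check in step 10g).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def formula_values(coefs, risk_rows):
    """Evaluate a risk factor formula (sum of coefficient * exposure) per cohort."""
    return [sum(coefs[name] * row[name] for name in coefs) for row in risk_rows]


def equate_coefficients_to_par(coefs, risk_rows, ncd):
    """Rescale preliminary coefficients so they read as PAR%s (steps 10a-10f)."""
    total_abs = sum(abs(c) for c in coefs.values())        # step 10a
    r = pearson_r(formula_values(coefs, risk_rows), ncd)   # step 10b
    r_squared = r * r
    multiplier = r_squared / total_abs                     # step 10c
    final = {name: c * multiplier * 100 for name, c in coefs.items()}  # 10e-10f
    return final, r_squared
```

With this scaling, the absolute values of the final coefficients sum to the total formula PAR% (R² × 100), so each coefficient can be read as that risk factor's share of it.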

Appendix 2: Methodology for deriving the optimal ranges of dietary risk factors (KC/d)
Step 1. Select the subsets to be used for deriving the optimal ranges for dietary risk factors (KC/d). Intent/purpose 1: Select a low animal food seven subset. Intent/purpose 2: Select a high animal food seven subset.
From the two GBD subsets used in deriving the multiple regression risk factor formulas (Appendix 1), we derived optimal range estimates for 22 dietary risk factors (including animal food seven and healthy plant seven) with the following steps: