
Field validation of food outlet databases: the Latino food environment in North Carolina, USA

Published online by Cambridge University Press:  17 June 2014

Pasquale E Rummo*
Affiliation:
University of North Carolina at Chapel Hill, Carolina Population Center, CB# 8120, University Square, 123 West Franklin Street, Chapel Hill, NC 27516-2524, USA
Penny Gordon-Larsen
Affiliation:
University of North Carolina at Chapel Hill, Carolina Population Center, CB# 8120, University Square, 123 West Franklin Street, Chapel Hill, NC 27516-2524, USA Department of Nutrition, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Sandra S Albrecht
Affiliation:
University of North Carolina at Chapel Hill, Carolina Population Center, CB# 8120, University Square, 123 West Franklin Street, Chapel Hill, NC 27516-2524, USA Department of Nutrition, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
*Corresponding author: Email prummo@live.unc.edu

Abstract

Objective

Obtaining valid, reliable measures of food environments that serve Latino communities is important for understanding barriers to healthy eating in this at-risk population.

Design

The primary aim of the study was to examine agreement between retail food outlet data from two commercial databases, Nielsen TDLinx (TDLinx) for food stores and Dun & Bradstreet (D&B) for food stores and restaurants, relative to field observations of food stores and restaurants in thirty-one census tracts in Durham County, NC, USA. We also examined differences by proportion of Hispanic population (</≥23·4 % Hispanic population) in the census tract and for outlets classified in the field as ‘Latino’ on the basis of signage and use of Spanish language.

Setting

One hundred and seventy-four food stores and 337 restaurants in Durham County, NC, USA.

Results

We found that overall sensitivity of food store listings in TDLinx was higher (64 %) than listings in D&B (55 %). Twenty-five food stores were characterized by auditors as Latino food stores, with 20 % identified in TDLinx, 52 % in D&B and 56 % in either source. Overall sensitivity of restaurants (68 %) was higher than sensitivity of Latino restaurants (38 %) listed in D&B. Sensitivity did not differ substantially by Hispanic composition of neighbourhoods.

Conclusions

Our findings suggest that while TDLinx and D&B commercial data sources perform well for total food stores, they perform less well in identifying small and independent food outlets, including many Latino food stores and restaurants.

Type
Short Communication
Copyright
Copyright © The Authors 2014 

Evidence suggests that the local food environment has implications for diet and physical activity behaviours( 1–6 ), but a lack of accurate environmental data remains problematic for much of this research( 7–10 ). Most studies have relied on two secondary commercial data sources, Dun & Bradstreet (D&B) and InfoUSA, to characterize the retail food environment( 9–14 ). Results from field validation studies demonstrate moderate levels of agreement between these data sources and ground-level observations( 6,9–11,13,15 ), suggesting that these data are best used in combination when characterizing the retail food environment( 12,16–18 ).

However, no studies have assessed the validity of Nielsen TDLinx (TDLinx), a commercial database known for its rigorous data collection and research-based outlet type classification( 12,19,20 ). Unlike other commercial databases that update listings on a quarterly basis (e.g. D&B), TDLinx updates its listings on a monthly basis( 21 ), providing an advantage in areas with rapid food outlet turnover.

Few studies have evaluated the accuracy of commercial listings to characterize the food environment in areas with a high proportion of Latino residents, a fast-growing segment of the US population with high risk for diet-related chronic diseases( 22 ). While Latinos, particularly less acculturated Latinos, tend to shop at tiendas( 23,24 ), it is unknown how well represented tiendas and other small specialty stores are in commercial data sources. Obtaining valid, reliable measures of food environments in Latino communities is important for understanding barriers to healthy eating in this at-risk population.

Our primary aim was to examine agreement between retail food outlet data from two commercial databases, TDLinx (food stores only) and the D&B Duns Market Identifiers File (food stores and restaurants), relative to a field-based census of food stores and restaurants in thirty-one census tracts in Durham County, NC, USA of varying Hispanic population composition. We also tested whether agreement differed by Hispanic composition of the census tract and by field-based classification of ‘Latino’ stores.

Methods

Geographic area

Direct field observations were conducted in thirty-one of the sixty census tracts in Durham County, NC, USA, an area experiencing rapid population growth and an increase in its Hispanic population( 25 ). Census tracts were selected to obtain a balanced representation of neighbourhoods with predominantly Hispanic, Black and White populations. We visited the census tracts with the highest proportions of non-Hispanic White (n 10), non-Hispanic Black (n 10) and Hispanic (n 10) residents. Given its population and food outlet density, we also included the census tract containing the Central Business District (CBD). The observed tracts represented 49·9 % of the Durham County population.

Data sources

We obtained data for Durham County from two commercial databases for 2012: Nielsen TDLinx (referenced May 2012; Nielsen, New York, NY, USA)( 19 ) and the D&B Duns Market Identifiers File (referenced July 2012; Dun & Bradstreet, Inc., Short Hills, NJ, USA)( 26 ). TDLinx uses official industry-standard definitions for food store categories when available or its own rigorously developed definitions supported by trade associations (e.g. Food Marketing Institute) and trade publications (e.g. Progressive Grocer), classified with a standard trade channel and sub-channel code (see online supplementary material, Supplemental Table 1). D&B uses eight-digit US Census standard industry classification (SIC) codes to categorize food outlets. TDLinx only captures food stores with ≥$US 1 million in sales, while D&B does not have a criterion for sales volume and collects both food store and restaurant data.

Table 1 Agreement statistics for Nielsen TDLinx (TDLinx) and Dun & Bradstreet (D&B), by food store and restaurant type, relative to field observations of food stores and restaurants in thirty-one census tracts in Durham County, NC, USA, July–August 2012

N/A, not applicable.

* Match defined as a food store or restaurant observed in the field and listed in a secondary data source.

† Food outlets that were classified as both a food store and a restaurant by the field auditors (n 10) were included in both the food store and restaurant counts, regardless of their classification in secondary data sources.

‡ One food store was not given a category during the field audit.

§ One food store in the grocery and supermarket category was not given a sub-category during the field audit.

|| Two restaurants were not given a store sub-category during the field audit.

¶ Three matches were categorized as ‘both food store and restaurant’ by the field team.

** Eight matches were categorized as ‘both food store and restaurant’ by the field team.

†† Eight matches were categorized as ‘both food store and restaurant’ by the field team.

Field census

We developed an iPad data collection program, adapted from a web-based Counter Tobacco Audit Tool( 27 ), that was preloaded with harmonized categories of food stores (from TDLinx and D&B) and restaurants (D&B); categories not found in the TDLinx and D&B databases for Durham County were classified as ‘Other’ (Supplemental Table 1). Between July and August 2012 (4 weeks), two teams of two trained data collectors each conducted a driving census of all food stores and restaurants in the thirty-one census tracts, recording and classifying all food outlets and collecting latitude and longitude of the locations using the iPad data collection tool.

The pairs of field data collectors (one driver, one data collector) drove all roads and streets in each census tract except private, unpaved or residential roads. All food outlets open for business and selling publicly accessible food were included and the following data were collected: name, address, latitude/longitude, currently open/closed, outlet type and whether it was a primarily Latino outlet. Conjoined outlets (e.g. KFC/Taco Bell) were separately classified as two outlets. All field censuses took place between 09·00 and 17·00 hours, with data collection from the car except in the CBD where, due to store density, data were collected on foot.

Outlets were classified as a food store, restaurant or both using categories and type sub-categories (Supplemental Table 1) based on characteristics observed from the outside and at the entrance of each establishment. Size of the facility, items sold, type of service provided and posted menus (restaurants only) guided the selection of outlet type. Stores and restaurants were classified as Latino/non-Latino on the basis of store name and language of signage on windows and doors (English, mostly Spanish, both languages equally)( 28 ).

Reliability analysis

Inter-rater reliability for identifying food outlets was conducted in the census tract that contained the CBD and a second census tract containing the largest number of food outlets. The observed proportion of agreement for both census visits in each tract (i.e. number of agreements divided by the total observations) was calculated for food stores, restaurants and total food outlets.
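The observed proportion of agreement described above is a simple ratio. The following is a minimal sketch of that calculation, not the authors' code; the function name `proportion_agreement` is an assumption introduced for illustration, and the example counts are the CBD figures reported in the Discussion (thirty-seven agreements out of forty-seven outlets).

```python
def proportion_agreement(agreements: int, total_observations: int) -> float:
    """Observed proportion of agreement: agreements / total observations."""
    if total_observations <= 0:
        raise ValueError("total_observations must be positive")
    if not 0 <= agreements <= total_observations:
        raise ValueError("agreements must lie between 0 and total_observations")
    return agreements / total_observations


# CBD tract: both census visits agreed on 37 of 47 outlets (from the Discussion).
cbd_agreement = proportion_agreement(37, 47)
print(f"CBD tract agreement: {cbd_agreement:.0%}")
```

This statistic does not correct for chance agreement; it only reports the share of observations on which the two visits concurred.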

Statistical analysis

Sensitivity (proportion of outlets observed on the ground that were listed in the commercial databases) was calculated to assess the level of agreement between field census and secondary data sources of food stores (TDLinx and D&B) and restaurants (D&B), with the field census considered as the ‘gold standard’. Food outlets from the field census and commercial databases were matched based on food outlet name and address. Sensitivity was calculated by Latino/non-Latino classification and by Hispanic composition of the census tract (defined as ≥23·4 % Hispanic population (upper quartile of distribution)). Food outlets present in TDLinx or D&B and absent from the field census were investigated using the databases’ latitude and longitude coordinates and ArcGIS and Google Earth.
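As an illustration of the matching and sensitivity steps, the sketch below matches field records to database listings on a normalized name and address and then computes sensitivity against the field census. The article does not specify the matching rules beyond name and address, so exact matching after normalization is an assumption, as are the helper names (`normalize`, `match_key`, `sensitivity`) and every outlet record shown, which is hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lower-case, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(name: str, address: str) -> tuple:
    """Key used to decide whether two records describe the same outlet."""
    return normalize(name), normalize(address)


def sensitivity(field_outlets, database_outlets) -> float:
    """Share of field-observed outlets (the gold standard) listed in the database."""
    listed = {match_key(name, addr) for name, addr in database_outlets}
    matched = sum(1 for name, addr in field_outlets if match_key(name, addr) in listed)
    return matched / len(field_outlets)


# Hypothetical records, for illustration only.
field = [
    ("Tienda La Esperanza", "101 Example St."),
    ("Corner Mart", "202 Sample Ave"),
    ("Main Street Grocery", "303 Main St"),
    ("Taqueria El Sol", "404 Demo Rd"),
]
database = [
    ("CORNER MART", "202 Sample Ave."),
    ("Main Street Grocery", "303 Main St"),
    ("Closed Deli", "505 Gone Blvd"),  # listed but not seen in the field
]

print(sensitivity(field, database))
```

Database listings that are absent from the field census (such as the last hypothetical record) do not affect sensitivity; in the study these were investigated separately using coordinates, ArcGIS and Google Earth.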

Results

Inter-rater reliability was 91 % for all food outlets in one census tract and 79 % in the census tract containing the CBD, a tract with a relatively high number of outlets. The data collectors identified 174 food stores on the ground across the thirty-one census tracts (Table 1). One hundred and eleven (64 %) and ninety-five (55 %) of these food stores were listed in TDLinx and D&B, respectively. For TDLinx and D&B combined, 131 (75 %) food stores observed on the ground were listed in either source. For TDLinx, sensitivity was highest for convenience stores (76 %), whereas agreement in D&B was highest for grocery stores and supermarkets (65 %); levels of agreement in TDLinx and D&B were lowest for small specialty stores (6 % and 29 %, respectively).
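The food store percentages above follow directly from the reported counts. The sketch below recomputes them and, as a derived quantity not stated in the article, uses inclusion-exclusion to estimate how many stores appear in both databases; the variable and function names are assumptions for illustration.

```python
FIELD_STORES = 174  # food stores observed in the field census

# Counts of field-observed stores listed in each source (from the Results).
listed = {"TDLinx": 111, "D&B": 95, "either source": 131}


def sensitivity_pct(matched: int, observed: int) -> float:
    """Sensitivity as a percentage of field-observed outlets."""
    return 100 * matched / observed


for source, matched in listed.items():
    print(f"{source}: {sensitivity_pct(matched, FIELD_STORES):.1f} %")

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B| (derived, not reported).
in_both = listed["TDLinx"] + listed["D&B"] - listed["either source"]

# Gain in sensitivity from adding D&B to TDLinx, in percentage points.
gain_points = sensitivity_pct(listed["either source"], FIELD_STORES) - sensitivity_pct(
    listed["TDLinx"], FIELD_STORES
)

print(f"Listed in both sources: {in_both}")
print(f"Gain from combining sources: {gain_points:.1f} percentage points")
```

Under these counts, most of the stores found in D&B were already in TDLinx, which is why combining the two sources adds only a modest number of additional matches.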

The field data collectors identified 337 restaurants (Table 1). Among these, 228 (68 %) were listed in D&B. A moderately high proportion of counter-service restaurants and sit-down restaurants were missing from D&B (40 % and 32 %, respectively).

Twenty-five food stores were characterized by data collectors as Latino food stores, with 20 % identified in TDLinx, 52 % in D&B and 56 % in either D&B or TDLinx (Table 2). The data collectors identified twenty-six Latino restaurants, 38 % of which were listed in D&B. Agreement between the databases and the field census of food stores and restaurants did not differ substantially by Hispanic composition of census tracts (Table 2).
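The Latino food store percentages can be converted back to counts because, with only twenty-five stores, each reported percentage corresponds to a single whole number. The sketch below does this back-calculation; the counts and the overlap between sources are derived here for illustration and are not reported in the article, and all names are assumptions.

```python
LATINO_STORES = 25  # Latino food stores identified in the field

# Reported sensitivities (%) for Latino food stores.
reported_pct = {"TDLinx": 20, "D&B": 52, "either": 56}

# Back-calculated counts: each percentage maps to one whole number out of 25.
counts = {source: round(LATINO_STORES * pct / 100) for source, pct in reported_pct.items()}

# Inclusion-exclusion gives the overlap and the source-specific remainders.
in_both = counts["TDLinx"] + counts["D&B"] - counts["either"]
only_tdlinx = counts["TDLinx"] - in_both
only_dnb = counts["D&B"] - in_both
in_neither = LATINO_STORES - counts["either"]

print(counts)
print(f"both: {in_both}, TDLinx only: {only_tdlinx}, D&B only: {only_dnb}, neither: {in_neither}")
```

On this reading, adding TDLinx to D&B recovers just one additional Latino food store, while eleven of the twenty-five appear in neither database.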

Table 2 Agreement statistics for Latino food stores and restaurants, and for all food stores and restaurants by Hispanic composition of the census tract, for Nielsen TDLinx (TDLinx) and Dun & Bradstreet (D&B) relative to field observations of food stores and restaurants in thirty-one census tracts in Durham County, NC, USA, July–August 2012

N/A, not applicable.

* Match defined as a food store observed in the field and listed in a secondary data source.

† Food outlets that were classified as both a food store and a restaurant by the field auditors (n 10) were included in both the food store and restaurant counts, regardless of their classification in secondary data sources.

‡ One food store was not given a store sub-category during the field audit.

§ Hispanic census tract defined as ≥23·4 % Hispanic population (upper quartile).

Discussion

Studies investigating associations between neighbourhood food environments and health outcomes commonly use commercial data sources to characterize the food environment. These data are usually less expensive and more time-efficient than direct field observations, albeit of lesser quality and validity. Secondary data sources often underestimate total food outlets, resulting in inaccuracies that may bias study findings( 10 ). Furthermore, the quality and validity of these data may differ by racial/ethnic composition of the population( 7,9,11,13 ). While others have investigated validity of food stores in rural( 29,30 ) and Native American communities( 9 ), there has been little research in Latino communities and for Latino food outlets, despite the fact that Latinos are at high risk for diet-related chronic diseases.

No research to date has investigated the validity of TDLinx, a comprehensive and time-varying database of retail food stores. We found that overall agreement between field census and TDLinx data in Durham, NC, USA was higher than that for D&B, suggesting that TDLinx may be more useful for characterizing total food stores. Additionally, we found that combining both secondary data sources improved overall sensitivity by approximately 11 percentage points (75 % for either database minus 64 % for TDLinx alone). On the other hand, the comparatively low levels of agreement in TDLinx and D&B for small specialty stores (6 % and 29 %, respectively) suggest that smaller stores were poorly identified by both databases. Our reliability assessment in the CBD indicated agreement for thirty-seven of forty-seven food stores and restaurants. We speculate that reasons for the relatively poor reliability included: stores were closed (n 2); lack of signage or poor signage (n 5); and human error (n 3), potentially due to the high density of stores in the CBD (n 47).

In our study, the accuracy of food outlet listings in both databases did not differ considerably between Hispanic and non-Hispanic census tracts. TDLinx captured only 20 % of Latino stores (compared with 64 % of overall stores), while D&B performed better, capturing 52 % of Latino stores (compared with 55 % of overall stores). However, Latino-specific accuracy was much poorer than in the total sample. Furthermore, the added value of using both databases for this purpose was minimal (56 %), suggesting that both secondary data sources may be inadequate for characterizing local Latino food stores. However, it is possible that such food stores in Durham County, NC, an area with a new and growing Latino population( 25 ), may not have yet become part of these commercial food listings.

A potential limitation of TDLinx is that the database only captures food outlets with ≥$US 1 million in sales. Latino food stores, such as tiendas and bodegas, tend to be smaller than non-Latino food stores( 31 ) and thus more likely to be missed in the commercial databases. Latino food stores captured in D&B and absent from TDLinx had sales volumes in the hundreds of thousands of dollars, and thus did not meet the sales volume criterion of TDLinx. Tiendas and bodegas are an important food resource in Latino communities and immigrant neighbourhoods, but it is unclear to what extent these stores are supportive of healthy eating.

Although data collectors were extensively trained before collecting data, the field team may have under-counted food outlets. The commercial data were obtained approximately a month prior to the field data collection, during which time food outlets may have opened, closed or moved, resulting in additional variation in food outlet counts. These results may also not be generalizable to other areas with different neighbourhood characteristics (e.g. areas with more long-standing Latino communities or a higher percentage of Hispanic residents). Nevertheless, ours is the first study to assess the validity of a novel commercial database, TDLinx, in Latino and non-Latino food outlets. In addition, we compare findings using D&B, a more commonly used database, which had relatively similar sensitivity compared with other studies( 32 ).

Because of the comparatively higher agreement between TDLinx and the field census for total food stores, our study provides support for using TDLinx, alone or combined with other commercial databases such as D&B, to characterize neighbourhood food stores. However, both secondary data sources poorly identified small and independent food stores, with D&B performing slightly better for Latino food stores. Investigators should be cautious of using these data to characterize neighbourhoods with small and ethnic food stores, and consider supplementing secondary data sources with primary data collection if resources are available.

Acknowledgements

Acknowledgments: The authors would like to thank Marc Peterson, of the University of North Carolina at Chapel Hill (UNC-CH) Carolina Population Center (CPC), and the CPC Spatial Analysis Unit for creation of the environmental variables; Nicole Wilkes, Matthew Lewis, Andrew Bousquet, Molly O’Dwyer and Antony Wambui for the audit; Dr Kurt Ribisl for the iPad program; and Ms Erica Brody for her helpful administrative assistance. Financial support: This study was supported by the National Institutes of Health (NIH; grant numbers R01-HL104580 and R01-HL114091). Additional support came from the NIH, the UNC-CH Clinic Nutrition Research Center (grant number NIH DK56350) and the UNC-CH CPC (grant number R24 HD050924); and from contracts with the University of Alabama at Birmingham, Coordinating Center (contract number N01-HC-95095); the University of Alabama at Birmingham, Field Center (contract number N01-HC-48047); the University of Minnesota, Field Center (contract number N01-HC-48048); Northwestern University, Field Center (contract number N01-HC-48049); and the Kaiser Foundation Research Institute (contract number N01-HC-48050 from the National Heart, Lung, and Blood Institute). Support for S.S.A. (PhD, MPH) was from the Postdoctoral Ruth L. Kirschstein National Research Service Award (award number T32 HD07168-33) through the UNC-CH CPC. The NIH and related contracts had no role in the design, analysis or writing of this article. Conflict of interest: None. Authorship: P.G.-L. and S.S.A. designed the study; P.G.-L. and S.S.A. coordinated data collection; P.E.R. carried out data analysis and interpretation and drafted the manuscript; P.G.-L. and S.S.A. made major revisions to the manuscript and all authors approved it for submission. Ethics of human subject participation: Ethical approval was not required.

Supplementary material

To view supplementary material for this article, please visit http://dx.doi.org/10.1017/S1368980014001281

References

1. Caspi, CE, Sorensen, G, Subramanian, SV et al. (2012) The local food environment and diet: a systematic review. Health Place 18, 1172–1187.
2. Ding, D & Gebel, K (2012) Built environment, physical activity, and obesity: what have we learned from reviewing the literature? Health Place 18, 100–105.
3. Feng, J, Glass, TA, Curriero, FC et al. (2010) The built environment and obesity: a systematic review of the epidemiologic evidence. Health Place 16, 175–190.
4. Holsten, JE (2009) Obesity and the community food environment: a systematic review. Public Health Nutr 12, 397–405.
5. Sallis, JF, Floyd, MF, Rodriguez, DA et al. (2012) Role of built environments in physical activity, obesity, and cardiovascular disease. Circulation 125, 729–737.
6. Gustafson, A, Hankins, S & Jilcott, S (2012) Measures of the consumer food store environment: a systematic review of the evidence 2000–2011. J Community Health 37, 897–911.
7. Rossen, LM, Pollack, KM & Curriero, FC (2012) Verification of retail food outlet location data from a local health department using ground-truthing and remote-sensing technology: assessing differences by neighborhood characteristics. Health Place 18, 956–962.
8. Cummins, S & Macintyre, S (2009) Are secondary data sources on the neighbourhood food environment accurate? Case-study in Glasgow, UK. Prev Med 49, 527–528.
9. Fleischhacker, SE, Rodriguez, DA, Evenson, KR et al. (2012) Evidence for validity of five secondary data sources for enumerating retail food outlets in seven American Indian communities in North Carolina. Int J Behav Nutr Phys Act 9, 137.
10. Liese, AD, Colabianchi, N, Lamichhane, AP et al. (2010) Validation of 3 food outlet databases: completeness and geospatial accuracy in rural and urban food environments. Am J Epidemiol 172, 1324–1333.
11. Han, E, Powell, LM, Zenk, SN et al. (2012) Classification bias in commercial business lists for retail food stores in the US. Int J Behav Nutr Phys Act 9, 46.
12. Auchincloss, AH, Moore, KA, Moore, LV et al. (2012) Improving retrospective characterization of the food environment for a large region in the United States during a historic time period. Health Place 18, 1341–1347.
13. Powell, LM, Han, E, Zenk, SN et al. (2011) Field validation of secondary commercial data sources on the retail food outlet environment in the US. Health Place 17, 1122–1131.
14. Gustafson, AA, Lewis, S, Wilson, C et al. (2012) Validation of food store environment secondary data source and the role of neighborhood deprivation in Appalachia, Kentucky. BMC Public Health 12, 688.
15. Longacre, MR, Primack, BA, Owens, PM et al. (2011) Public directory data sources do not accurately characterize the food environment in two predominantly rural states. J Am Diet Assoc 111, 577–582.
16. Bader, MD, Ailshire, JA, Morenoff, JD et al. (2010) Measurement of the local food environment: a comparison of existing data sources. Am J Epidemiol 171, 609–617.
17. Hosler, AS & Dharssi, A (2010) Identifying retail food stores to evaluate the food environment. Am J Prev Med 39, 41–44.
18. Svastisalee, CM, Holstein, BE & Due, P (2012) Validation of presence of supermarkets and fast-food outlets in Copenhagen: case study comparison of multiple sources of secondary data. Public Health Nutr 15, 1228–1231.
19. The Nielsen Company (2012) Gain a comprehensive view of retail with the leader in location information management with Nielsen TDLinx. http://nielsen.com/content/dam/nielsen/en_us/documents/pdf/Fact%20Sheets%20III/Nielsen%20TDLinx.pdf (accessed December 2013).
20. Hoehner, CM & Schootman, M (2010) Concordance of commercial data sources for neighborhood-effects studies. J Urban Health 87, 713–725.
21. The Nielsen Company (2011) Counter Store Audit Center. http://www.nielsen.com/us/en.html (accessed December 2013).
22. Perez-Escamilla, R (2011) Acculturation, nutrition, and health disparities in Latinos. Am J Clin Nutr 93, issue 5, 1163S–1167S.
23. Ayala, GX, Mueller, K, Lopez-Madurga, E et al. (2005) Restaurant and food shopping selections among Latino women in Southern California. J Am Diet Assoc 105, 38–45.
24. Kaufman, L & Karpati, A (2007) Understanding the sociocultural roots of childhood obesity: food practices among Latino families of Bushwick, Brooklyn. Soc Sci Med 64, 2177–2188.
25. City of Durham (2010) Demographic and Economic Profile. http://durhamnc.gov/ich/cb/cdd/Documents/5%20yr%20con%20plan%20comm%20prof%20draft.pdf (accessed December 2013).
26. Dun & Bradstreet Inc. (2012) D&B – Dun’s Market Identifiers. http://library.dialog.com/bluesheets/pdf/bl0516.pdf (accessed December 2013).
27. Ribisl, KM & Myers, A (2011) Counter Tobacco. http://audit.countertobacco.org (accessed December 2013).
28. Emond, JA, Madanat, HN & Ayala, GX (2012) Do Latino and non-Latino grocery stores differ in the availability and affordability of healthy food items in a low-income, metropolitan region? Public Health Nutr 15, 360–369.
29. Sharkey, JR, Dean, WR, Nalty, CC et al. (2013) Convenience stores are the key food environment influence on nutrients available from household food supplies in Texas Border Colonias. BMC Public Health 13, 45.
30. Pitts, SB, Bringolf, KR, Lawton, KK et al. (2013) Formative evaluation for a healthy corner store initiative in Pitt County, North Carolina: assessing the rural food environment, Part 1. Prev Chronic Dis 10, E121.
31. Food Marketing Institute (2005) El Mercado 2004: A Perspective on US Hispanic Shopping Behavior. http://www.fmi.org/news-room/latest-news/view/2005/05/01/new-fmi-report-examines-purchasing-preferences-and-behaviors-of-u.s.-hispanic-grocery-shoppers#sthash.BrDH6w5N.dpuf (accessed December 2013).
32. Fleischhacker, SE, Evenson, KR, Sharkey, J et al. (2013) Validity of secondary retail food outlet data: a systematic review. Am J Prev Med 45, 462–473.