Volume 37 (2021): Issue 2 (June 2021)
    Special Issue on New Techniques and Technologies for Statistics
Journal Details
Format: Journal
First Published: 01 Oct 2013
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

A Product Match Adjusted R Squared Method for Defining Products with Transaction Data

Published Online: 22 Jun 2021
Page range: 411 - 432
Received: 01 Jun 2019
Accepted: 01 Apr 2020
Abstract

The occurrence of relaunches of consumer goods at the barcode (GTIN) level is a well-known phenomenon in transaction data of consumer purchases. GTINs of disappearing and reintroduced items have to be linked in order to capture possible price changes.

This article presents a method that groups GTINs into strata (‘products’) by balancing two measures. The first is an explained variance (R squared) measure of the ‘homogeneity’ of GTINs within products; the second expresses the degree to which products can be ‘matched’ over time with respect to a comparison period. The resulting product ‘match adjusted R squared’ (MARS) combines explained variance in product prices with product match over time, so that different stratification schemes can be ranked according to the combined measure.
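The trade-off between homogeneity and match can be illustrated with a minimal Python sketch. It is not the article's own formulation: the abstract does not state how the two measures are combined, so the sketch assumes a simple product of a quantity-weighted R squared and a quantity-weighted match share. The function names, the data and the three stratification schemes are all hypothetical.

```python
from collections import defaultdict

def r_squared(items, product_of):
    """Quantity-weighted share of price variance explained by the grouping.

    items: list of (gtin, price, quantity) for one period.
    product_of: function mapping a GTIN to its product (stratum) label.
    """
    total_q = sum(q for _, _, q in items)
    mean = sum(p * q for _, p, q in items) / total_q
    total_var = sum(q * (p - mean) ** 2 for _, p, q in items)
    if total_var == 0:
        return 1.0  # all prices equal: any grouping is fully homogeneous
    sums = defaultdict(lambda: [0.0, 0.0])  # product -> [sum of p*q, sum of q]
    for gtin, p, q in items:
        s = sums[product_of(gtin)]
        s[0] += p * q
        s[1] += q
    between = sum(sq * (spq / sq - mean) ** 2 for spq, sq in sums.values())
    return between / total_var

def match_share(items_now, items_base, product_of):
    """Quantity share of current items whose product also exists in the base period."""
    base_products = {product_of(g) for g, _, _ in items_base}
    total_q = sum(q for _, _, q in items_now)
    matched_q = sum(q for g, _, q in items_now if product_of(g) in base_products)
    return matched_q / total_q

def mars(items_now, items_base, product_of):
    # Assumption: the two measures are combined by multiplication.
    return r_squared(items_now, product_of) * match_share(items_now, items_base, product_of)

# Hypothetical data: A1 is relaunched as A2, and A3 is a new item of the same brand.
base = [("A1", 1.8, 10), ("B1", 4.0, 10)]
now = [("A2", 2.0, 10), ("A3", 3.0, 10), ("B1", 4.0, 10)]

schemes = {
    "gtin": lambda g: g,          # every GTIN is its own product
    "brand": lambda g: g[0],      # first character stands in for a brand attribute
    "single": lambda g: "all",    # one product containing everything
}
scores = {name: mars(now, base, f) for name, f in schemes.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy example the GTIN-level scheme is perfectly homogeneous but matches poorly because of the relaunch, the single-stratum scheme matches perfectly but explains no price variance, and the brand-level scheme ranks first under the assumed combined score.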

MARS has been applied to a broad range of product types. Individual GTINs are suitable as products for food and beverages, but not for product types with higher rates of churn, such as clothing, pharmacy products and electronics. In these cases, products are defined as combinations of characteristics, so that GTINs with the same characteristics are grouped into the same product. Future research will focus on further development of MARS, such as attribute selection when data sets contain large numbers of variables.
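Defining products as combinations of characteristics amounts to grouping GTINs by a tuple of selected attribute values. The following Python sketch shows the idea under stated assumptions: the GTINs, attribute names and values are invented for illustration, and the helper `group_by_characteristics` is hypothetical rather than taken from the article.

```python
from collections import defaultdict

def group_by_characteristics(gtin_attributes, selected):
    """Group GTINs that share the same values on the selected attributes.

    gtin_attributes: dict mapping GTIN -> dict of attribute name -> value.
    selected: list of attribute names that define a product.
    """
    products = defaultdict(list)
    for gtin, attrs in gtin_attributes.items():
        key = tuple(attrs[name] for name in selected)
        products[key].append(gtin)
    return dict(products)

# Hypothetical clothing items; the GTINs and attributes are made up.
catalogue = {
    "8710001": {"brand": "X", "type": "T-shirt", "colour": "blue"},
    "8710002": {"brand": "X", "type": "T-shirt", "colour": "red"},
    "8710003": {"brand": "Y", "type": "T-shirt", "colour": "blue"},
}

by_brand_type = group_by_characteristics(catalogue, ["brand", "type"])
by_all = group_by_characteristics(catalogue, ["brand", "type", "colour"])
```

Selecting fewer attributes yields broader products that are easier to match over time but less homogeneous; selecting all attributes here puts every GTIN in its own product.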
