Analytics & Data Science

Web Scraping for Consumer Insight “Nuggets”

Despite its potential, web scraping is not a methodology that many consumer researchers feel they can consider. It remains an untapped opportunity, a black box. One reason: There are few standards or uniform methods to evaluate web scraping research. With little consensus, we researchers are, as James Madison put it, “…in a wilderness without a single footstep to guide us.” Scraped data is very different from data generated by conventional consumer research methods. Furthermore, the current literature does not adequately describe the decisions and judgment calls required in the process of web scraping. This makes it difficult to replicate methods or compare findings.

The internet plays an increasingly central role in consumers’ daily lives. Every second, consumers create terabytes of data containing rich information about their opinions, preferences and consumption choices. The massive volume and variety of consumers’ digital footprints present many opportunities for researchers to examine and test theories about consumer processes and behaviors in the field.

In this paper, Johannes Boegershausen, Abhishek Borah and Andrew Stephen outline the key challenges, state-of-the-art remedies, best practices and corresponding evaluation standards for web scraping in consumer research. They provide a structured workflow designed to achieve a sufficient level of consistency and standardization in how web scraping is conducted, documented, reported and evaluated in both the research and peer review processes.

The authors propose four interdependent facets necessary for generating credible, scientific findings from web scraping research:

  1. Design transparency.
  2. Analytic reproducibility.
  3. Analytic robustness.
  4. Effect replicability and generalizability.

The structured workflow offers a pathway for generating interesting, impactful and credible consumer research findings. With it, more researchers can embrace web scraping as an avenue for producing timely and credible scholarly knowledge about consumer behavior.

Read the working paper here.

The ARF’s 2021 Research Agenda

How does the ARF select topics to research? Our foundation uses a democratic, 401K approach, allowing member organizations to vote with their dollars on which areas of research receive funding. All projects are grouped into seven topic areas: cross-platform, ROI and attribution, leveraging data, consumer attitudes, advertising creative, teams and talent, and future states (driven by tech changes/social changes).

“2020 was such a tumultuous year, it begs for an ARF research agenda that helps members navigate the now and the future.” Paul Donato, ARF CRO.

To keep members abreast of the current research agenda, we have compiled this year’s list, with descriptions of each project, available for download below. Projects of particular note include The Study of Device and Account Sharing, Brand Loyalty & Lifetime Value, How Did Brands Handle 2020, the ARF Podcast Series and an eBook entitled WOW! The Wit and Wisdom of Erwin Ephron.

See the agenda here.