
Human Experience: Why Attention AI Needs Human Input

Dr. Matthias Rothensee, CSO & Partner, eye square

Stefan Schoenherr, VP Brand and Media & Partner, eye square

Speakers Matthias Rothensee and Stefan Schoenherr of eye square discussed the need for a human element in, and oversight of, AI. Opening the discussion on the state of attention and AI, Matthias acknowledged that the race for attention is one of the defining challenges of our time for modern marketers. He quoted author Rex Briggs, who noted the "conundrum at the heart of AI: its greatest strength can also be its greatest weakness." Matthias explained that AI is remarkably powerful at recognizing patterns in big data sets, but that this power carries risks (e.g., finding spurious patterns, hallucinations). Stefan then examined a case study built around an M&M's advertisement, which measured real viewers' attention with eye-tracking technology and compared it to AI-predicted results. The goal was to better understand where AI is good at predicting attention and where it still needs to improve (a minimal sketch of such a prediction-vs-eye-tracking comparison follows the takeaways below). Results from the case study indicated areas for AI improvement in gaze cueing, movement, contrast, complexity and nonhuman entities (e.g., a dog). The static nature of AI (prediction models are often built on static attention databases) can also become a challenge when the goal is to track dynamic attention trends. Key takeaways:
  • Predictive AI is good at replicating human attention for basic face and eye images, high-contrast scenes (e.g., probability of looking at things that stand out) and slow-paced scene cuts where AI can detect details.
  • AI seems unaware of a common phenomenon called the "cueing effect" (i.e., humans pay attention not only to people's faces but also to where those faces are looking), which leads to incorrect predictions.
  • AI has difficulty deciphering scenes with fast movement (it shows inertia), in contrast to slow-paced scenes, where it excels at replicating human feedback; for fast-moving scenes, human measurement remains more accurate.
  • AI over-allocates attention to contrast (e.g., in an ad featuring a runner, AI gave attention to the trees surrounding the runner), whereas humans pick out the main subject of an image.
  • AI decomposes human faces (e.g., AI is obsessed with human ears), whereas humans can detect the focal point of a human face. In addition, AI hallucinates, underestimating facial effects.
  • AI has difficulties interpreting more complex visual layouts (e.g., complex product pack shots are misinterpreted).
  • AI is human centric and does not focus well on nonhuman entities such as a dog (e.g., in scenes where a dog was present, AI disregarded the dog altogether).
  • AI tends to be more static in nature (e.g., AI prediction models are often built based on static attention databases), which can be a problem when comparing this to dynamic attention trends.
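To make the comparison concrete, here is a minimal sketch, in Python, of how a model's predicted-attention (saliency) map can be scored against eye-tracking data, using two standard saliency metrics: Pearson correlation (CC) against a smoothed fixation-density map, and normalized scanpath saliency (NSS). All data, dimensions and function names are hypothetical; eye square's actual pipeline is not described in the talk.

```python
# Scoring a predicted saliency map against human eye-tracking fixations.
# Everything here is a hypothetical stand-in, not eye square's method.
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_density(fixations, shape, sigma=25):
    """Turn (row, col) fixation points into a smoothed density map."""
    density = np.zeros(shape)
    for r, c in fixations:
        density[r, c] += 1
    return gaussian_filter(density, sigma=sigma)

def correlation_coefficient(saliency, density):
    """Pearson CC between predicted saliency and human fixation density."""
    s = (saliency - saliency.mean()) / saliency.std()
    d = (density - density.mean()) / density.std()
    return float((s * d).mean())

def nss(saliency, fixations):
    """Mean z-scored saliency at the locations humans actually fixated."""
    z = (saliency - saliency.mean()) / saliency.std()
    return float(np.mean([z[r, c] for r, c in fixations]))

# Hypothetical example: one 480x640 frame from an ad.
rng = np.random.default_rng(0)
predicted = gaussian_filter(rng.random((480, 640)), sigma=30)  # stand-in for a model's saliency map
human_fix = [(200, 320), (210, 330), (400, 100)]               # stand-in eye-tracking fixations

print("CC :", correlation_coefficient(predicted, fixation_density(human_fix, predicted.shape)))
print("NSS:", nss(predicted, human_fix))
```

Higher CC and NSS mean the model's predicted attention lines up better with where people actually looked; the gaps the speakers describe (cueing, motion, nonhuman subjects) would show up as low scores on exactly those scenes.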


Evidence-Based Social Media Advertising: Two Field Experiments

Prof. Rachel Kennedy, Associate Director (Product Development), Ehrenberg-Bass Institute for Marketing Science

Beginning her discussion, Rachel Kennedy (Ehrenberg-Bass Institute) noted that Artificial Intelligence (AI) and other developments in computational advertising could mean that key media principles, developed for traditional advertising, no longer apply. She reviewed empirical evidence, primarily from traditional media, supporting the idea that for media to work, it must consistently reach category buyers with both continuity and recency, while acknowledging the evolving media landscape. Building on that, she detailed two field experiments on social media, conducted with Stephen Bellman and Zachary Anesbury, also of the Ehrenberg-Bass Institute. The experiments assessed: (1) whether AI-based optimization outperformed simpler, evidence-based optimization methods, by running the platform algorithms on YouTube and Meta, and (2) whether bursting was more effective than continuous advertising at reaching category buyers. The experimental design used matched cells (randomized zip codes matched on demographics such as people per household, median weekly income, monthly repayments and motor vehicles per dwelling), with equal budgets per cell (a minimal sketch of this kind of matched-cell assignment follows the takeaways below). Rachel noted that the standing principles will likely still have a role; the research aimed to understand which ones, and how they apply to the current media landscape. Results from the experiments were uneven and varied, indicating room for improvement. Key takeaways:
  • AI and ML in programmatic advertising may be discovering and applying new media principles, drawing on a wider variety of data points than any human could.
  • Experiment 1 (platform optimizer vs. simple reach principle): based on the impressions, clicks and reach reported by the digital agency responsible for scheduling the media, AI-based optimization appeared to beat the simpler, evidence-based reach optimization.
    • However, AI did not outperform the simple media principles overall.
    • These findings suggest that traditional media placement strategies can be just as effective as AI-based strategies for certain goals.
  • Experiment 2: Bursting is better than continuous advertising for reaching as many category buyers as possible.
    • However, neither campaign performed significantly better than the unexposed control cell.
  • Overall, results from these experiments were messy, indicating the need for improvement, particularly in platform-side tools (e.g., inadequate capping options, excessive budget spending and forecasting tools in need of enhancement).
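For illustration, here is a minimal sketch of matched-cell assignment of the kind described above: zip codes are ranked on standardized demographic covariates, paired with their nearest rank neighbor, and one member of each pair is randomly assigned to each cell. The column names and data are hypothetical; any design detail beyond those listed above is an assumption.

```python
# Matched-cell geo assignment: pair demographically similar zips, then
# randomize within each pair. Data and column names are hypothetical.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

zips = pd.DataFrame({
    "zip": [f"Z{i:03d}" for i in range(8)],
    "people_per_hh": rng.normal(2.6, 0.3, 8),
    "median_weekly_income": rng.normal(1500, 200, 8),
    "vehicles_per_dwelling": rng.normal(1.8, 0.2, 8),
})

# Rank zips on a standardized composite of the matching covariates,
# then pair adjacent ranks so each pair is demographically similar.
covs = zips[["people_per_hh", "median_weekly_income", "vehicles_per_dwelling"]]
score = ((covs - covs.mean()) / covs.std()).sum(axis=1)
ordered = zips.loc[score.sort_values().index].reset_index(drop=True)

assignments = []
for i in range(0, len(ordered), 2):
    pair = ordered.iloc[i:i + 2].copy()
    flip = rng.permutation(["AI_optimized", "reach_principle"])  # randomize within pair
    pair["cell"] = flip[: len(pair)]
    assignments.append(pair)

design = pd.concat(assignments, ignore_index=True)
print(design[["zip", "cell"]])
```

Randomizing within matched pairs, rather than across all zips at once, is what keeps the cells balanced on the listed demographics while still giving each zip an equal chance of either treatment.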


Business Outcomes in Advertising Powered by Machine Learning

Brett Mershmann, Sr. Director, Research & Development (R&D), NCSolutions

Brett Mershmann's (NCSolutions) discussion focused on how to quantify the incremental advantages of contemporary machine learning (ML) frameworks over more traditional incrementality measurements. Beginning the presentation, Brett provided an overview of both traditional modeling techniques and more contemporary ML campaign measurements. To understand the differences, Brett detailed an 11-experiment process using real observational household data intersected with real campaign impression data, but with outcomes simulated from a defined outcome function. The experiments measured accuracy, validity and power. The team also compared ML with randomized controlled trials (RCTs), noting that RCTs are the gold standard but are not always feasible: they ran both an RCT and an ML analysis by creating test-control groups on real, limited data, applying to each the same outcome function, which depended on a larger set of variables. In closing, Brett shared results from these experiments, which supported ML as a powerful measurement method and a viable alternative to RCTs, and he highlighted the importance of getting the correct data into these models for optimum results (a minimal sketch contrasting these estimators follows the takeaways below). Key takeaways:
  • A survey from the CMO Council indicated that 56% of marketers want to improve their campaign measurement performance in the next 12 months.
  • Traditional campaign measurement techniques use household matching (nearest-neighbor), household matching (propensity) and inverse propensity weighting (IPW), based on simple statistical models applied uniformly. These methods simulate balanced test and control groups to estimate a group-wise counterfactual.
  • The ML measurement technique, per NCSolutions' methodology, is computationally robust on large, complex data sets; it recognizes that data is not one-size-fits-all and estimates a counterfactual for each individual observation.
  • Simple A/B testing does not capture the true effect, while the counterfactual approach uses a "what-if" model to estimate it.
  • The experiments comparing ML to traditional methods, measuring accuracy, validity and power showed that:
    • Accuracy: Machine learning was the most accurate method in 55% of scenarios, compared with inverse propensity weighting (9%), propensity matching (27%) and nearest-neighbor matching (8%).
    • Validity: Measured as the percentage of scenarios with the true effect inside the confidence interval, ML gave valid estimates most often (91%), compared with inverse propensity weighting (82%), propensity matching (64%) and nearest-neighbor matching (73%).
    • Power: Machine learning is more statistically powerful: the average confidence-interval width for machine learning was 1.48, compared to inverse propensity weighting (1.56), propensity matching (1.78) and nearest-neighbor matching (1.72).
  • Results from ML vs. RCTs: Both ML and RCT are accurate in campaign measurement, both methods are generally valid, but ML is more powerful.
    • Overall, ML can be an adequate substitute for RCTs, providing meaningful estimates when running an RCT is not possible.
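The contrast between these estimators can be sketched on simulated data. The Python sketch below computes a naive exposed-vs-unexposed difference, an IPW estimate, and an ML counterfactual estimate on households whose ad exposure is confounded with their covariates. The ML estimator shown is a generic T-learner, used here as a stand-in; it is not NCSolutions' proprietary method, and all data is simulated.

```python
# Naive difference vs. IPW vs. ML counterfactual (T-learner) on simulated
# household data. Illustrative only; not NCSolutions' methodology.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(7)
n = 20_000

# Household covariates; exposure is confounded with them (certain households
# are likelier to see the ad), which is what biases the naive estimate.
X = rng.normal(size=(n, 3))
p_exposed = 1 / (1 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
t = rng.random(n) < p_exposed

true_lift = 0.5
y = 2.0 + X[:, 0] + 0.5 * X[:, 2] + true_lift * t + rng.normal(0, 1, n)

# 1) Naive A/B-style difference: biased because exposure is not randomized.
naive = y[t].mean() - y[~t].mean()

# 2) IPW: reweight outcomes by the estimated propensity of exposure.
ps = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]
ipw = np.mean(t * y / ps) - np.mean((~t) * y / (1 - ps))

# 3) T-learner: fit separate outcome models for exposed / unexposed,
#    then average the per-household counterfactual differences.
m1 = GradientBoostingRegressor().fit(X[t], y[t])
m0 = GradientBoostingRegressor().fit(X[~t], y[~t])
t_learner = np.mean(m1.predict(X) - m0.predict(X))

print(f"true lift: {true_lift:.2f}  naive: {naive:.2f}  "
      f"IPW: {ipw:.2f}  ML: {t_learner:.2f}")
```

On data like this, the naive difference drifts away from the true lift while the IPW and ML estimates recover it, which mirrors the talk's point that the counterfactual approach, not simple A/B comparison, captures the true effect.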


The Power of AI for Effective Advertising in an ID-free World

Rachel Gantz, Managing Director, Proximic by Comscore

Amid heightened regulation of the advertising ecosystem, Rachel Gantz of Proximic by Comscore discussed diverse AI applications and implementation tactics for effectively reaching audiences in an increasingly ID-free environment. Rachel framed the research by calling signal loss a "massive industry challenge." The digital advertising environment was built on ID-based audience targeting, she remarked, but with the loss of this data and the rise of privacy regulation, advertisers have shifted their focus to first-party and contextual targeting (which includes predictive modeling). Her discussion focused on the many impacts predictive AI is having on contextual targeting in a world increasingly devoid of third-party data, supported by results from an experiment. The research asked how AI-powered, ID-free audience targeting tactics perform compared with their ID-based counterparts, considering audience reach, cost efficiency (eCPM), in-target accuracy and inventory placement quality (a minimal sketch of these metrics follows the takeaways below). Key takeaways:
  • Fifty to sixty percent of programmatic inventory has no IDs associated with it and that includes alternative IDs.
  • Specific to mobile advertising, many advertisers saw 80% of their iOS scale disappear overnight.
  • In an experiment, two groups were exposed to two simultaneous campaigns focused on holiday shoppers: the first group (campaign A) was an ID-based audience, while the second was an ID-free predictive audience.
    • Analyzing reach: ID-free targeting nearly doubled the advertisers' reach versus the same audience targeted with ID-based tactics.
    • Results from cost efficiency (eCPM): ID-free AI-powered contextual audiences saw 32% lower eCPMs than ID-based counterparts.
    • In-target rate results: Significant accuracy was confirmed (84%) when validating if users reached with the ID-free audience matched the targeting criteria.
    • Inventory placement quality: ID-free audience ads appeared on higher quality inventory, compared to the same ID-based audience (ID-free 27% vs. ID-based 21%).
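For reference, the metrics compared in the experiment reduce to simple arithmetic over campaign logs. The sketch below computes reach, eCPM and in-target rate for two hypothetical campaigns; the figures are invented for illustration (loosely echoing the reported directions) and are not Proximic by Comscore's data.

```python
# Campaign comparison metrics from hypothetical logs. Numbers are invented.
def ecpm(spend: float, impressions: int) -> float:
    """Effective cost per thousand impressions."""
    return spend / impressions * 1000

def in_target_rate(validated_in_target: int, measured: int) -> float:
    """Share of measured users confirmed to match the targeting criteria."""
    return validated_in_target / measured

campaigns = {
    "A_id_based": {"spend": 50_000.0, "impressions": 8_000_000,
                   "reach": 1_200_000, "in_target": 700, "measured": 1_000},
    "B_id_free":  {"spend": 50_000.0, "impressions": 11_800_000,
                   "reach": 2_300_000, "in_target": 840, "measured": 1_000},
}

for name, c in campaigns.items():
    print(f"{name}: reach={c['reach']:,} "
          f"eCPM=${ecpm(c['spend'], c['impressions']):.2f} "
          f"in-target={in_target_rate(c['in_target'], c['measured']):.0%}")
```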


FORECASTING 2023: Managing Risk — How Businesses Can Get Better Visibility into the Near and Long-Term Future

Managing business risk involves having a rational, data-driven view of the future while simultaneously being as prepared as possible for external shocks — from a global pandemic and the ensuing supply-chain disruptions, to inflation, data signal losses, war, and great power competition. At our annual Forecasting event, held virtually on July 18, leading experts shared how businesses can adapt forecasting techniques to manage risk.


MODERATED TRACK DISCUSSIONS: Understanding Audiences

In a follow-up discussion for the "Understanding Audiences" track, Havas Media's Peter Sedlarcik delves deeper into how the panelists measure for their clients: the challenges of creating custom platforms, how technology's rapid advances affect the way they reconcile data, and how they balance rigorous methodology with dynamic measurement approaches.

Leveraging A/B Testing to Understand Consumer Behavior

Vidyotham Reddi shared insights from Mars’ approach to A/B testing as their gold standard of learning. Framing it within the timely “virus” context, Vidyotham reinforced A/B testing’s dependability, versatility, and precision for marketing and understanding consumer behavior as part of Mars’ overall strategy.
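As a reminder of what an A/B test reduces to statistically, here is a minimal sketch of a two-proportion z-test comparing conversion rates for two ad variants. The counts are hypothetical, and the talk does not describe Mars' internal tooling.

```python
# Two-proportion z-test for an A/B test; counts below are hypothetical.
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for rate B vs. rate A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # rate under the null
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = two_proportion_ztest(conv_a=480, n_a=10_000, conv_b=560, n_b=10_000)
print(f"z={z:.2f}, p={p:.4f}")  # small p -> variant B's lift is unlikely to be chance
```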

Day 4 Panel Discussion & Closing Remarks

Maggie Zhang of NBC Universal invited all the presenters back to a wrap-up session called "Attribution Pivot," where she asked what challenges marketers are facing and how they are meeting them. Each presenter offered insight into important attribution challenges that they, as marketers, or their clients are facing. Limitations include the inability to do A/B testing, privacy issues and the looming deprecation of cookies. It is also difficult to determine long-term lift, such as lifetime value.

Leveraging Look Alike Models when A/B Testing isn’t an Option

It isn't always possible to perform A/B tests when evaluating the impact of paid media campaigns. Caroline Iurillo and Megan Lau of Microsoft outlined the company's development of a strategy that matches campaign exposure data with a customer database and then creates "look-alikes" for non-exposed customers to make audiences as comparable as possible. Lifts in perceptions, behaviors and revenue can then be compared (in aggregate) between exposed customers and their non-exposed look-alikes to determine the effectiveness of a campaign.
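Here is a minimal sketch of the look-alike idea: each exposed customer is matched to the most similar non-exposed customer on profile features, and lift is the aggregate outcome difference between exposed customers and their matched look-alikes. The features, data and nearest-neighbor matching choice are assumptions for illustration; Microsoft's actual matching criteria are not described.

```python
# Look-alike matching for campaign lift when A/B testing isn't possible.
# Features and data are hypothetical stand-ins, not Microsoft's criteria.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)

# Customer features (e.g., tenure, past spend, engagement), standardized.
exposed_X = rng.normal(size=(500, 4))
control_X = rng.normal(size=(5_000, 4))
exposed_y = rng.normal(1.10, 1, 500)    # outcome (e.g., revenue index)
control_y = rng.normal(1.00, 1, 5_000)

# For each exposed customer, find the single most similar non-exposed one.
nn = NearestNeighbors(n_neighbors=1).fit(control_X)
_, idx = nn.kneighbors(exposed_X)
lookalike_y = control_y[idx.ravel()]

lift = exposed_y.mean() - lookalike_y.mean()
print(f"aggregate lift vs. look-alikes: {lift:+.3f}")
```

The matching step is doing the work an A/B split would otherwise do: it makes the comparison group resemble the exposed group on observable features, so the remaining outcome gap is a more credible estimate of campaign effect.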