Online interactions leave a trail that can be assembled into time-stamped sequences of events joinable to the same entity (device, individual, or household). The identifiers that make this join possible include cookies, tags, account logins, email and physical addresses, geolocation, IP addresses, credit card numbers, phone numbers, etc.
Marketers want to use this data to understand how their advertising works at the level of the individual.
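To make the idea of "sequences of events joinable to the same entity" concrete, here is a minimal sketch. The log rows, the `identity_map` lookup, and the `build_sequences` helper are all hypothetical names invented for illustration; real identity resolution is far messier than a dictionary lookup.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw log rows: (identifier, ISO timestamp, event type).
events = [
    ("cookie:abc", "2023-01-02T10:00:00", "ad_impression"),
    ("email:jo@example.com", "2023-01-03T09:30:00", "purchase"),
    ("cookie:xyz", "2023-01-01T08:00:00", "ad_click"),
    ("cookie:abc", "2023-01-01T12:00:00", "site_visit"),
]

# Hypothetical identity map: which identifiers belong to which entity.
identity_map = {
    "cookie:abc": "user_1",
    "email:jo@example.com": "user_1",
    "cookie:xyz": "user_2",
}


def build_sequences(events, identity_map):
    """Group events by entity and sort each group by timestamp."""
    sequences = defaultdict(list)
    for identifier, ts, event_type in events:
        entity = identity_map.get(identifier)
        if entity is None:
            continue  # identifier cannot be joined to a known entity
        sequences[entity].append((datetime.fromisoformat(ts), event_type))
    return {entity: sorted(seq) for entity, seq in sequences.items()}


sequences = build_sequences(events, identity_map)
for entity, seq in sequences.items():
    print(entity, [event_type for _, event_type in seq])
```

In this toy log, the cookie and the email address resolve to the same entity, so the purchase ends up in the same time-ordered sequence as the earlier site visit and ad impression.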
Using the simulated big data, we wanted to answer the following questions:
- Does having enough data mean that we can recover unbiased estimates from the event log data – given that we know the real answers?
- If not, given what we know about the real data generating process:
  - What are the biases?
  - What do we need to do to remove the bias?
  - What is the best overall methodology to handle them?
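The first question can be illustrated with a toy simulation in which the true ad effect is known by construction. This is only a sketch under assumed parameters, not the data generating process used in the study: the `naive_lift` function, the 30% share of high-intent users, the targeting probabilities, and the conversion rates are all invented for illustration.

```python
import random


def naive_lift(n_users, true_lift=0.02, seed=0):
    """Exposed-minus-unexposed conversion rate in a simulated event log.

    Assumption: ad targeting favours high-intent users, who would also
    convert more often without any ad. The true effect of the ad is
    `true_lift` for every user.
    """
    rng = random.Random(seed)
    exposed = [0, 0]    # [users, conversions]
    unexposed = [0, 0]  # [users, conversions]
    for _ in range(n_users):
        high_intent = rng.random() < 0.3
        p_exposed = 0.8 if high_intent else 0.2   # targeting depends on intent
        saw_ad = rng.random() < p_exposed
        base_rate = 0.10 if high_intent else 0.02
        p_convert = base_rate + (true_lift if saw_ad else 0.0)
        converted = rng.random() < p_convert
        group = exposed if saw_ad else unexposed
        group[0] += 1
        group[1] += converted
    return exposed[1] / exposed[0] - unexposed[1] / unexposed[0]


for n in (2_000, 200_000):
    print(n, round(naive_lift(n), 4))
```

Under these assumed parameters the naive comparison converges to roughly 0.06, about three times the true lift of 0.02. Adding users makes the estimate more stable, but it stabilises around the wrong number, because the bias comes from who gets exposed rather than from sample size.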