Keynote: one neat trick to run better experiments

23 Nov 2019
10:55 - 11:30
Keynote Room

Lukas Vermeer (NL)

5 minute video summary

Faye – Marketing Analyst, feedback through our #CH2019 attendee survey:

Loved this session. Very nice level of information, great presentation.



Check the live notes of Lukas’s talk

Questions asked by attendees through our #CH2019 app:

  • How often does an SRM occur at Booking?
  • How do you know you’ve checked for all possible variables? Is there a defined list of variables?
  • What is the difference between SRM and selection bias?
  • Do tests exist that calculate expected SRM?
  • What do we need to do with tests that have run for weeks and then turn out to be suffering from SRM?
  • If you have run a test or research, and there’s SRM, can you correct your results, or is it game over? (You will probably answer this later in your talk.)
  • Should you also check SRM when both sample sizes are equal in number?
  • Could cookie banners and consents be causing SRM? How to get around it?
  • Google specifically says in their guidance on A/B testing and Google Search not to exclude the bot, because that’s against guidelines, but that could well be a big reason for SRM. What’s your opinion?
  • Could responsibility for SRM find itself into the contracts of experimentation tools, ideally acting as processors? Where does the liability for bias lie?
  • Is SRM another word for response bias/sampling errors? Or is the difference in numbers, whilst being sampled proportionally, a problem as well?
  • Is it possible to fix your data (with SRM) after the test has ended? And how? Cleaning data afterwards?
  • When does an SRM error happen the most? At the beginning or the end of an experiment?
  • Is it possible to decrease the risk of SRM by doing more research upfront?
  • How do you have enough time for writing papers and doing analysis while keeping up with the latest memes?
  • How do you check SRM with dynamic traffic allocation?
  • Is SRM making multi-armed bandit testing obsolete?
  • How come SRM is only getting attention in the industry now?
  • So, talking about groups in a test that are not equal: how do you correct for a test at the bottom of the page when comparing it to the A group?
  • Can you have an SRM on a session level but not on a user level, or vice versa?
  • What do we have to do to discuss this issue before testing/doing well at the start?
  • Is SRM something that can happen in exact science experiments?
  • If you have machine learning tests, how do you track SRM?
  • So conclusion: the more you know about data (SRM, false positives), the less we are sure about the results. Let’s start redesigning again 😄
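
Several of the questions above ask how an SRM (sample ratio mismatch) is detected in practice. As a rough illustration, and not something taken from the talk itself, one common approach is a chi-squared goodness-of-fit test that compares the observed group sizes with the intended split. The function name, the example counts, and the 0.001 alert threshold below are all assumptions made for this sketch.

```python
import math


def srm_p_value(control_count, treatment_count, expected_control_share=0.5):
    """Chi-squared goodness-of-fit p-value for a two-group experiment.

    Hypothetical helper for illustration: it compares observed group sizes
    with the sizes expected under the intended traffic split.
    """
    if not 0 < expected_control_share < 1:
        raise ValueError("expected_control_share must be between 0 and 1")

    total = control_count + treatment_count
    expected_control = total * expected_control_share
    expected_treatment = total - expected_control

    chi2 = (
        (control_count - expected_control) ** 2 / expected_control
        + (treatment_count - expected_treatment) ** 2 / expected_treatment
    )
    # With two groups there is one degree of freedom, and the chi-squared
    # survival function for one degree of freedom equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Assumed alert threshold; pick one that suits your own tolerance for false alarms.
SRM_ALERT_THRESHOLD = 0.001

# Made-up counts for a test that was meant to split traffic 50/50.
p_value = srm_p_value(50_000, 51_500)
print("possible SRM" if p_value < SRM_ALERT_THRESHOLD else "no SRM detected", p_value)
```

A small p-value only says that the observed split is unlikely under the intended allocation. It does not say what caused the mismatch, and that question still needs investigation.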

The average #CH2019 attendee experience score on a 1 (awful) to 5 (awesome) scale was 4.71!
