Keynote: The Jungle of Experiments for AI Systems

21 Nov 2025
17:30 - 18:05
Keynote Room


Jelena Nadj – Data Scientist @ Hipcamp

Jelena Nadj at #CH2025

Small video preview

Full keynote video access available for members (tickets here).

Garret – VP Global CX, feedback through our #CH2025 attendee survey:

Interesting topic. Highlighted the challenges of testing a learning system against a learning system. Different from other topics, which made it interesting.

Slides

(direct download link)


Notes

This is the link to the full summary of Jelena's talk.

Questions asked by attendees through our #CH2025 app:

  • If the old model was so easy to interpret, couldn't you just replicate its logic internally?
  • Do you think the term AI is overused? Would you prefer we use "machine learning"?
  • Sounds like the new model was optimizing for CTR, not CR or revenue. Did you consider using a different optimization target for the model?
  • When testing models with feedback loops, would you recommend an initial burn-in period, to allow it to learn, which gets excluded from the final result analysis?
  • What goal or goals was the algorithm told to learn? Like, was it interaction on the SERP, or was it bookings after interaction? Wouldn't that be the easiest fix for the model? Would love to understand how you train the model.
  • How long did it take for the feedback loop to begin degrading the performance of the variation?
  • What caused the external vendor to give you only 3 months to replace them?
  • How do you actually measure the quality of learning? How do you measure that a model is smarter or dumber?
  • How resource intensive is it to keep tracking learning and feedback loops?
  • Cool slides! How did you make these?
  • If you had seen a positive result, would you have been worried about the feedback loop?
  • How did you find out that reviews were the issue? How do you identify these factors in general?
  • Why didn't you let the new model learn in the background before you ran the A/B test?
  • How long will it take to make a positive story about this subject?
  • Is it worth it? With tool costs and lower results, why not just kill it?
  • How did you teach the model what metric is most important?
  • How did the company react to the results considering the internal pressures to show positive AI initiatives?
  • How do you build the real-user learning/training requirements of your ML variant into your experiment setup/calculations to ensure you're giving it the best chance but still maintain fairness?
  • When you changed the control data, did you reset the variable model data?
  • So we should test models' learning trend sustainability rather than pure performance?
  • Did you use AI to understand the bad performance of the AI model?
  • Was your AI model better at the end?
  • How do you track learning during experimentation?
  • Great to hear a story, an experiment that didn’t go to plan!
  • So it was more like model parameter optimization?

Become an attendee of our next event!

The average #CH2025 attendee experience score on a 1 to 5 scale was 4.74 (NPS +85)!

Subscribe to the Conversion Hotel ticket notification email list!
Or check which #CH tickets/products are currently available.

<< Back to the full #CH2025 overview!