Powerful tests and reproducible benchmarks with `pytest-cases`

Great projects deserve great tests! ;)

Sylvain Marié

Data, Data Science, Science, Test Libraries (pytest/nose/...), Testing

Schedule: Wed, Jul 28, 11:30-12:00 CEST (30 min)

`pytest` is undoubtedly the most popular test framework for Python. Its fixture and parametrization mechanisms, together with its detailed hook API and vibrant plugin ecosystem, make it a must-know for any developer wishing to create quality software. Some key limitations in its engine and API, however, prevent users from expressing richer testing scenarios. Creating tests with complex parametrization is complicated, and users who explore this direction may lose the legendary elegance and maintainability of pytest tests along the way.

`pytest-cases` extends `pytest` so that users can manage their parameters the same way they manage tests: as Python functions. This separates test cases from test functions in an elegant way, encouraging developers to add more cases without decreasing readability. With this new paradigm it becomes very simple to create tests where several kinds of parameters coexist nicely: datasets can come from files, databases and simulations; distinct algorithms can be evaluated and compared, with possible variants; etc. Fixtures can be leveraged by any of these, in an intuitive manner.

Finally, `pytest-cases` can also be used in combination with `pytest-harvest` to generate scientific results tables in a reproducible manner. Reproducible research projects may therefore wish to use it to replace an ad hoc benchmark engine. That way, adding datasets and algorithms to the benchmark becomes as easy as creating new Python functions.

This presentation is for Python developers and data scientists at ease with Python, with basic experience of `pytest`.

Type: Talk (30 mins); Python level: Intermediate; Domain level: Beginner

Sylvain Marié

Schneider Electric

Sylvain received his General Engineering degree from CentraleSupelec (Paris) and an MSc in Machine Learning from UCL (London) in 2005, with awards for his thesis on Automated Medical Diagnosis using Semi-Supervised Learning. He joined Schneider Electric as an embedded software engineer, to imagine how industrial gateways could leverage SOA/M2M/Web2.0. His work within EU-funded innovation projects was published and transferred into industrial IoT offers.

In 2010, Sylvain joined an energy efficiency program for tertiary buildings, leading Monitoring & Analytics topics. He developed a platform used for BI and Visual Analytics prototypes. He opened collaborations with major universities and labs and assessed key technology partners.

Since 2013 Sylvain has been leading projects spanning from AI Research [1] to Analytics-as-a-Service industrialization in multiple market segments, with production targets such as the various Schneider Electric EcoStruxure Advisors [2] and the Exchange [3]. He has supervised four PhD students, runs an internal group of Python users, and is an active contributor to the broader open source Python community, through both flagship libraries (scikit-learn, nox, pytest...) and his own libraries (pyfields, pytest-cases, makefun...) [4]. Finally, since 2020 Sylvain has taught a small "datascience with python" course for Masters students.

[1] https://scholar.google.fr/citations?user=PRZ7h8sAAAAJ
[2] https://www.se.com/ww/en/work/campaign/innovation/overview.jsp
[3] https://exchange.se.com/
[4] https://github.com/smarie/