This program is tentative and subject to change.

Tue 22 Oct 2024 12:00 - 12:30 at Pacific C - NSAD: Session 1

Due to its interdisciplinary nature, the development of data science code is subject to a wide range of potential errors that can easily compromise the final results. Several tools have been proposed to help data scientists identify the most common low-level programming issues. We discuss the steps needed to implement a tool that instead focuses on higher-level errors specific to the data science pipeline. To this end, we propose a static analysis that assigns ad hoc abstract datatypes to the program variables, which are then checked for consistency when calling functions defined in data science libraries. By adopting a descriptive (rather than prescriptive) abstract type system, we obtain a linter tool that reports data-science-related code smells. Although still a work in progress, the current prototype is able to identify and report the code smells contained in several examples of questionable data science code.
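To make the idea concrete, the following is a minimal sketch of what such an analysis could look like, not the authors' prototype. It assumes a toy set of abstract types (`RawData`, `ScaledData`, `Unknown`), a single hypothetical smell (scaling data before the train/test split), and recognition of library calls purely by name (`read_csv`, `fit_transform`, `train_test_split`). All of these choices are illustrative assumptions.

```python
import ast

# Hypothetical abstract types; the names are illustrative assumptions.
RAW, SCALED, UNKNOWN = "RawData", "ScaledData", "Unknown"


def call_name(node):
    """Return the trailing name of a called function, e.g. 'read_csv'."""
    func = node.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return None


def abstract_type(expr, env):
    """Describe an expression with an abstract type; never rejects code."""
    if isinstance(expr, ast.Name):
        return env.get(expr.id, UNKNOWN)
    if isinstance(expr, ast.Call):
        name = call_name(expr)
        if name == "read_csv":
            return RAW
        if name == "fit_transform":
            return SCALED
    return UNKNOWN


def lint(source):
    """Return (line, message) pairs for smells found in top-level statements."""
    env, smells = {}, []
    for stmt in ast.parse(source).body:
        # Check every call in the statement against the current environment.
        for node in ast.walk(stmt):
            if isinstance(node, ast.Call) and call_name(node) == "train_test_split":
                if any(abstract_type(arg, env) == SCALED for arg in node.args):
                    smells.append(
                        (node.lineno, "data scaled before train/test split")
                    )
        # Record the abstract type of simple 'name = expr' assignments.
        if (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        ):
            env[stmt.targets[0].id] = abstract_type(stmt.value, env)
    return smells


# Questionable pipeline: the scaler sees all rows before the split.
SMELLY = """
df = pd.read_csv("data.csv")
X = scaler.fit_transform(df)
X_train, X_test = train_test_split(X)
"""

# Same steps with the split performed first.
CLEAN = """
df = pd.read_csv("data.csv")
train, test = train_test_split(df)
train = scaler.fit_transform(train)
"""
```

Calling `lint(SMELLY)` reports the split on line 4 of the snippet, while `lint(CLEAN)` reports nothing. The types here only describe what a value is likely to be and produce warnings, which is one way to read the descriptive (rather than prescriptive) approach mentioned in the abstract.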

Tue 22 Oct

Displayed time zone: Pacific Time (US & Canada)

11:00 - 12:30
NSAD: Session 1 (NSAD) at Pacific C
11:00
5m
Opening
NSAD
Vincenzo Arceri University of Parma, Italy, Michele Pasqua University of Verona
11:05
55m
Keynote
Abstract Domains for Machine Learning Verification
NSAD
Caterina Urban Inria & École Normale Supérieure | Université PSL
12:00
30m
Full-paper
Towards a High Level Linter for Data Science (Full Paper)
NSAD
Greta Dolcetti Ca' Foscari University of Venice - Department of Environmental Sciences, Informatics and Statistics, Agostino Cortesi Università Ca' Foscari Venezia, Caterina Urban Inria & École Normale Supérieure | Université PSL, Enea Zaffanella University of Parma, Italy