Leveraging Tools to Better Grok Model Calibration
Corresponding Author: Brice Nichols, Puget Sound Regional Council
Presented By: Bhargava Sana, San Francisco County Transportation Authority
Abstract
Model calibration is often performed at aggregate levels using simple comparison metrics and visuals, but in some cases, more granularity is required. As models become more disaggregate and complex, understanding outputs requires more advanced visuals, but these can still be built from an existing suite of utilities.
Based on recent experience testing and calibrating an open-source dynamic transit assignment modeling tool (Fast-Trips), the authors found that a combination of Python scripting, out-of-the-box dashboard-building tools (e.g., Tableau), and common data such as GTFS feeds was critical in assessing model performance beyond bar charts and tables.
With dashboard tools and scripting growing in options and popularity, we propose a session composed of short demonstrations of calibration dashboard tools used at various agencies, with facilitated discussion of the considerations behind each approach. The session proposers will recruit other agencies they know could contribute, specifically agencies whose tools provide windows into disaggregate results or that use interactive platforms such as Tableau, Shiny, and Plotly. Innovative visualizations in GIS and detailed spreadsheet results would be interesting to include as well. The goal is to showcase a set of approaches for digging deeper into model performance, with an eye toward an activity-based and dynamic modeling future.
The authors’ work, as an example, includes a transit path analysis and a performance dashboard built in Tableau, with data prepared in Python. The dashboard displays aggregate-level results but also lets users zoom in on specific paths to isolate problematic trends (such as a model assigning too many transfers) or to visually identify unreasonable paths. The authors hope to draw out more examples of similar applications for DTA, activity-based models, and other fine-grained model types.
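As a rough illustration of the kind of Python data prep that could sit behind such a dashboard, the sketch below counts transfers per assigned path and flags paths above a threshold. It is not the authors' code: the record layout, the field names (person_id, trip_id, leg, route_id), the sample rows, and the MAX_TRANSFERS threshold are all hypothetical and do not reflect the actual Fast-Trips output schema.

```python
from collections import defaultdict

# Hypothetical path-link records: one row per transit leg of an assigned path.
# Field names and values are illustrative only, not the Fast-Trips schema.
links = [
    {"person_id": "p1", "trip_id": 1, "leg": 1, "route_id": "A"},
    {"person_id": "p1", "trip_id": 1, "leg": 2, "route_id": "B"},
    {"person_id": "p2", "trip_id": 1, "leg": 1, "route_id": "A"},
    {"person_id": "p3", "trip_id": 1, "leg": 1, "route_id": "A"},
    {"person_id": "p3", "trip_id": 1, "leg": 2, "route_id": "B"},
    {"person_id": "p3", "trip_id": 1, "leg": 3, "route_id": "C"},
    {"person_id": "p3", "trip_id": 1, "leg": 4, "route_id": "D"},
]

# Assumed threshold: paths with more transfers than this are flagged for review.
MAX_TRANSFERS = 2


def count_transfers(rows):
    """Return {(person_id, trip_id): transfers}, where transfers = transit legs - 1."""
    legs = defaultdict(int)
    for row in rows:
        legs[(row["person_id"], row["trip_id"])] += 1
    return {path: n - 1 for path, n in legs.items()}


def flag_paths(transfers, max_transfers=MAX_TRANSFERS):
    """Return the sorted list of paths whose transfer count exceeds the threshold."""
    return sorted(path for path, n in transfers.items() if n > max_transfers)


def transfer_distribution(transfers):
    """Return {transfer count: number of paths}, the aggregate view for a chart."""
    dist = defaultdict(int)
    for n in transfers.values():
        dist[n] += 1
    return dict(sorted(dist.items()))


transfers = count_transfers(links)
flagged = flag_paths(transfers)
distribution = transfer_distribution(transfers)

print("Transfer distribution:", distribution)
print("Paths to inspect:", flagged)
```

In a workflow like the one described, the distribution would feed the aggregate view, while the flagged path identifiers would serve as the filter for zooming in on individual paths. Both outputs could be written to CSV for a dashboard tool to read.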