Assessing Data Mining Protocol Priority in Multi-Modal Data in New York City
Corresponding Author: Joon Park, New York City Dept. of Transportation
Presented By: Kyeongsu Steve Kim, Louis Berger
Abstract
The New York City Department of Transportation (NYCDOT) has recently developed extensive transportation database systems covering traffic volumes and speeds, bicycle and pedestrian volumes, crashes and injuries/fatalities, signal timing plans, and geometry. These database and control systems are operated by different units at NYCDOT as independent databases or operation systems. Each database helps reduce data collection and reduction effort, but none eliminates the additional manual processing needed to interface with operational evaluation and analysis. Useful functions for analyzing various transportation data within a specific area-wide or regional boundary are lacking. Therefore, a data mining system (DMS) is urgently needed to connect the various databases and streamline data analysis through a dynamic data interface system. In addition, a data exchange engine should be developed to support transportation analysis and real-time control over various analysis periods, building on traditional transportation data analysis.
The purpose of this study was three-fold: (1) discuss updates to the existing traffic, bicycle, and pedestrian data collection processes using available technologies, (2) assess the related data mining protocol priority, and (3) suggest an effective design for integrating NYCDOT databases into an Integrated Transportation Database System (ITDS). The benefits of ITDS deployment include (1) visualizing mobility and safety data, (2) summarizing site-specific dynamic data at resolutions ranging from a single signal cycle length to 24/7, (3) presenting time-series data analysis and large-scale or network-wide snapshots, (4) validating collected transportation data, (5) monitoring the most up-to-date traffic conditions, and (6) projecting future transportation conditions.
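As a rough illustration of benefit (2), the sketch below joins two separately maintained record sets on a shared site identifier and summarizes them at two resolutions, one signal cycle and one full day. It is only a sketch under stated assumptions: the record layouts, the site IDs, the sample values, the 90-second cycle length, and the function names `bin_key` and `summarize` are all hypothetical and do not describe NYCDOT's actual databases or the ITDS design.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records from two independent databases (illustrative values).
# Volume records: (site_id, timestamp, vehicle_count)
volumes = [
    ("INT-001", datetime(2024, 1, 1, 8, 0, 0), 42),
    ("INT-001", datetime(2024, 1, 1, 8, 1, 0), 38),
    ("INT-001", datetime(2024, 1, 1, 8, 2, 30), 40),
    ("INT-002", datetime(2024, 1, 1, 8, 0, 30), 15),
]
# Crash records: (site_id, timestamp)
crashes = [
    ("INT-001", datetime(2024, 1, 1, 8, 1, 10)),
]


def bin_key(ts, bin_seconds):
    """Return (date, bin start in seconds since midnight) for a timestamp."""
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    return ts.date(), seconds - seconds % bin_seconds


def summarize(volumes, crashes, bin_seconds):
    """Combine both record sets per site and time bin."""
    summary = defaultdict(lambda: {"volume": 0, "crashes": 0})
    for site, ts, count in volumes:
        summary[(site, *bin_key(ts, bin_seconds))]["volume"] += count
    for site, ts in crashes:
        summary[(site, *bin_key(ts, bin_seconds))]["crashes"] += 1
    return dict(summary)


# Assumed 90-second signal cycle versus a full 24-hour day.
by_cycle = summarize(volumes, crashes, bin_seconds=90)
by_day = summarize(volumes, crashes, bin_seconds=86400)

for key in sorted(by_day):
    print(key, by_day[key])
```

The same `summarize` call serves both resolutions because only the bin width changes; a production system would more likely push this aggregation into the database layer.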
From a technical perspective, advanced big data technologies include the IoT (Internet of Things), neural networks, statistical clustering, NoSQL (Not only SQL), and computer vision. Integrating these technologies will allow real-time data feeds, reconciliation of models for pattern detection, and updates of the major factors used in demand estimation and operational network projection. All data mining outputs can be used for advanced real-time analysis and control of traffic.
Based on the assessment of data mining protocol priority, the findings of this research indicate that an integrated DMS deployment can provide macroscopic real-time information to monitor traffic conditions, project future traffic patterns, visualize congestion and safety data, and support effective real-time control policies along key corridors and across regional networks.