Detecting, tracking and counting people getting on/off a metropolitan train using a standard video camera

Sergio A. Velastin, Rodrigo Fernández, Jorge E. Espinosa, Alessandro Bay

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

The main source of delay in public transport systems (buses, trams, metros, railways) is the time spent at stations. For example, a public transport vehicle can travel at 60 km per hour between stations, but its commercial speed (average en-route speed, including any intermediate delay) is no more than half of that value. Therefore, the problem that public transport operators must solve is how to reduce delays at stations. From the perspective of transport engineering, there are several ways to approach this issue, from the design of infrastructure and vehicles to passenger traffic management. The tools normally available to traffic engineers are analytical models, microscopic traffic simulation and, ultimately, real-scale laboratory experiments. In every case, the data required are the number of passengers getting on and off vehicles and the number of passengers waiting on platforms. Traditionally, such data have been collected manually through field counts or from videos that are then processed by hand. At the same time, public transport networks, especially metropolitan railways, have an extensive monitoring infrastructure based on standard video cameras. These cameras are usually observed manually or with very basic signal processing support, so there is significant scope for improving data capture and for automating the analysis of site usage, safety, and surveillance. This article shows a way of collecting and analyzing the data needed both to feed traffic models and to analyze laboratory experiments, exploiting recent intelligent sensing approaches. The paper presents a new public video dataset gathered from real-scale laboratory recordings. Part of this dataset has been annotated by hand, marking up head locations to provide a ground truth on which to train and evaluate deep learning detection and tracking algorithms. Tracking outputs are then used to count people getting on and off, achieving a mean accuracy of 92% with less than 0.15% standard deviation on 322 mostly unseen dataset video sequences.
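To illustrate how tracking outputs can be turned into boarding and alighting counts, the following is a minimal sketch and not the authors' implementation. It assumes that each track is a time-ordered list of head-centre positions in pixels, that the train door can be approximated by a horizontal image line at a hypothetical `door_y`, and that the platform lies below that line in the image (larger y). The function name `count_boarding_alighting` and the sample tracks are invented for this example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) head centre in pixels; assumed track format


def count_boarding_alighting(
    tracks: Dict[int, List[Point]], door_y: float
) -> Tuple[int, int]:
    """Count tracks that cross the door line, by direction.

    Assumption for this sketch: platform side is y > door_y,
    train side is y <= door_y. Only the first and last positions
    of each track are compared, so a track that crosses and then
    returns is not counted.
    """
    boarding = 0
    alighting = 0
    for points in tracks.values():
        if len(points) < 2:
            continue  # a single detection carries no direction
        start_y = points[0][1]
        end_y = points[-1][1]
        if start_y > door_y >= end_y:
            boarding += 1  # started on the platform, ended on the train
        elif start_y <= door_y < end_y:
            alighting += 1  # started on the train, ended on the platform
    return boarding, alighting


# Hypothetical tracks keyed by track identifier.
sample_tracks = {
    1: [(100.0, 300.0), (102.0, 250.0), (105.0, 180.0)],  # platform to train
    2: [(200.0, 150.0), (198.0, 220.0), (195.0, 310.0)],  # train to platform
    3: [(300.0, 320.0), (305.0, 310.0)],                  # stays on platform
}

boarded, alighted = count_boarding_alighting(sample_tracks, door_y=240.0)
print(f"boarding={boarded}, alighting={alighted}")
```

Comparing only the endpoints of each track keeps the sketch short; a real system would also need to handle fragmented tracks and identity switches, which this example ignores.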

Original language: English
Article number: 6251
Pages (from-to): 1-20
Number of pages: 20
Journal: Sensors
Volume: 20
Issue number: 21
State: Published - 1 Nov 2020

Bibliographical note

Funding Information:
Sergio A. Velastin is grateful for funding received from the Universidad Carlos III de Madrid, the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement No. 600371, the Ministerio de Economía, Industria y Competitividad (COFUND2013-51509), the Ministerio de Educación, Cultura y Deporte (CEI-15-17), and Banco Santander. Rodrigo Fernández and Sergio A. Velastin gratefully acknowledge the Chilean National Science and Technology Council (Conicyt) for its funding under CONICYT-Fondecyt Regular Grant Nos. 1120219, 1080381 and 1140209 (“OBSERVE”).

Publisher Copyright:
© 2020 by the authors. Licensee MDPI, Basel, Switzerland.

Keywords

  • Camera sensor
  • Deep learning
  • Multi-object tracking
  • People counting
  • People detection
  • People tracking
