Abstract
Extracting essential information, including the topological structure of rail tracks, the position of switches, and their current state, can increase safety by reducing human error while also boosting the efficiency of rail transportation. Despite the impressive advancements in the field of autonomous driving, computer vision approaches in the rail domain remain a small niche. To advance the state of the art in railway scene understanding, we propose a novel multi-task architecture, called UOLO, which combines the ubiquitous YOLOv5 object detector with the established U-Net model for semantic segmentation. We further enhance this network with additional geometric features computed using traditional computer vision approaches. Our approach achieves state-of-the-art results while respecting the real-time constraints that would be required in practice.
Citation
@Article{Manole2025UOLOAM,
author = {Alexandru Manole and Laura Diosan},
journal = {IEEE Transactions on Intelligent Vehicles},
title = {UOLO: A Multitask U-Net YOLO Hybrid Model for Railway Scene Understanding},
year = {2025}
}
