Evaluating Deep Learning Models for Cross-Platform UI Component Detection: A Study Across Web, Desktop, and Mobile Interfaces (2025)

User interfaces differ across web, desktop, and mobile platforms, not only in layout but also in how buttons, icons, and text are rendered. This makes it hard for deep learning models trained on one platform to accurately detect UI components on another. In this paper, we evaluate the cross-domain generalization of three modern object…
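Cross-platform generalization of this kind is usually quantified by matching a model's detections against ground-truth boxes separately for each target platform. The sketch below is an illustrative assumption, not the paper's protocol: it uses greedy IoU matching at a fixed threshold and reports a per-platform recall.

```python
def iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def per_platform_recall(ground_truth, detections, thresh=0.5):
    """ground_truth / detections: {platform: [box, ...]}.
    Returns, per platform, the fraction of ground-truth boxes
    greedily matched by some detection with IoU >= thresh."""
    scores = {}
    for platform, gt_boxes in ground_truth.items():
        dets = list(detections.get(platform, []))
        hits = 0
        for gt in gt_boxes:
            best = max(dets, key=lambda d: iou(gt, d), default=None)
            if best is not None and iou(gt, best) >= thresh:
                hits += 1
                dets.remove(best)  # each detection matches at most once
        scores[platform] = hits / len(gt_boxes) if gt_boxes else 1.0
    return scores
```

A model trained on web screenshots would typically score high on the "web" key and lower on "mobile", which is the gap such a study measures.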

GenGUI: A Dataset for Automatic Generation of Web User Interfaces Using ChatGPT (2025)

Identifying the elements of a user interface is a problem of growing interest, given how extensively users now interact with machines: digital technologies are increasingly used to carry out almost any daily task. Computer vision can help in different applications, such as accessibility, testing, or automatic code…

The Impact of Augmentation Techniques on Icon Detection Using Machine Learning Techniques (2024)

This article examines the use of image augmentation techniques to improve icon detection in mobile interfaces, a task made difficult by the small size of graphical user interface (GUI) elements and the lack of comprehensive datasets. It evaluates whether diversifying the dataset or using specific augmentation methods alone can enhance detection performance. The study…
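Augmenting detection data is slightly trickier than augmenting classification data, because the bounding boxes must be transformed together with the pixels. A minimal sketch of one common technique, horizontal flipping, on corner-format boxes; the (x_min, y_min, x_max, y_max) convention is an assumption here, and production pipelines typically delegate this to a library such as Albumentations.

```python
def hflip_boxes(boxes, img_w):
    """Mirror (x_min, y_min, x_max, y_max) boxes for a horizontally
    flipped image of width img_w. The y extent is unchanged; the x
    extent is reflected and re-ordered so x_min < x_max still holds."""
    return [(img_w - x_max, y_min, img_w - x_min, y_max)
            for (x_min, y_min, x_max, y_max) in boxes]
```

For example, an icon box at x in [10, 30] on a 100-pixel-wide screenshot lands at x in [70, 90] after the flip, keeping the label aligned with the mirrored pixels.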

The Impact of Data Annotations on the Performance of Object Detection Models in Icon Detection for GUI Images (2024)

Detecting icons in Graphical User Interfaces (GUIs) is essential for effective application automation. This study examines the impact of different annotation methods on the performance of object detection models for icon detection in GUIs. We compared manual, automated, and hybrid annotations using three models: Faster R-CNN, YOLOv8, and YOLOv9. The results show that manual…
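Whichever annotation route is taken (manual, automated, or hybrid), the boxes ultimately have to be serialized in the format each trainer expects. A minimal sketch of converting a pixel-coordinate box to the normalized `class cx cy w h` line consumed by YOLO-family trainers; the corner-box input convention is an assumption.

```python
def to_yolo_line(cls_id, box, img_w, img_h):
    """Convert an (x_min, y_min, x_max, y_max) pixel box into a YOLO
    annotation line: class id followed by center-x, center-y, width,
    and height, all normalized to [0, 1] by the image dimensions."""
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{cls_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
```

Faster R-CNN pipelines, by contrast, usually keep absolute corner coordinates (e.g. COCO-style JSON), so an annotation-comparison study needs converters like this in both directions.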