FaceARG – Age, Race, Gender, Accessories dataset


The human face holds a privileged position in multi-disciplinary research because it conveys many kinds of information: demographic attributes (age, race, gender, ethnicity), social signals, emotional expression, etc. Studies have shown that, because of the distribution of ethnicity/race in training datasets, biometric algorithms suffer from the "cross-race effect": they perform better on subjects closer to the "country of origin" of the algorithm. We built an annotated dataset of facial images and proposed a deep learning approach for automatic human race and ethnicity detection from facial images. Four state-of-the-art convolutional neural networks (CNNs) were fully trained to differentiate between the following racial classes: African-American, Asian, Caucasian and Indian. We also studied which facial features and zones the CNNs extract in order to identify race in the context of facial accessories.

We gathered over 175,000 facial images from the Internet and had four independent human annotators label the images with race information. The training database is made publicly available. To our knowledge, this is the largest available face database annotated with race, age, gender and accessories information.

Figure: data distribution of the race-annotated images.


Please fill in the agreement form to get access to the dataset. Please use your institutional email address when requesting access to the data. We use your email and contact information only to identify your institution.

The database contains the training and test images together with the corresponding annotation csv files.
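As a rough illustration of working with the annotation files, the sketch below counts how many images carry each race label, using only the Python standard library. The column names (image, race, age, gender, accessories) and the sample rows are assumptions made up for this example; they are not taken from the dataset, so check the header of the real csv files before adapting it.

```python
import csv
import io
from collections import Counter

# Made-up sample in a HYPOTHETICAL layout: the real annotation files may use
# different column names, value encodings, or delimiters.
SAMPLE_CSV = """image,race,age,gender,accessories
img_0001.jpg,Asian,25,F,glasses
img_0002.jpg,Caucasian,40,M,none
img_0003.jpg,Asian,31,M,hat
img_0004.jpg,Indian,19,F,none
"""


def label_distribution(csv_file, column="race"):
    """Count how many rows carry each distinct value of `column`."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# For a real file, replace io.StringIO(...) with open(path, newline="").
counts = label_distribution(io.StringIO(SAMPLE_CSV))
for label, n in counts.most_common():
    print(f"{label}: {n}")
```

The same function can be pointed at another column (for example column="gender", if such a column exists) to inspect the balance of the other annotations.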

The trained models are accessible here (they will be transferred to GitHub).

Terms and conditions


If you use these data please cite the following work: 

Adrian Darabant, Diana Borza and Radu Danescu. Recognizing Human Races through Machine Learning - A Multi-Network, Multi-Features Study. Mathematics, vol. 9, no. 2, article 195, 2021.

Bibtex format:

@Article{math9020195,
	AUTHOR = {Darabant, Adrian Sergiu and Borza, Diana and Danescu, Radu},
	TITLE = {Recognizing Human Races through Machine Learning—A Multi-Network, Multi-Features Study},
	JOURNAL = {Mathematics},
	VOLUME = {9},
	YEAR = {2021},
	NUMBER = {2},
	URL = {https://www.mdpi.com/2227-7390/9/2/195},
	ISSN = {2227-7390},
	DOI = {10.3390/math9020195}
}



Please contact Adrian Darabant (dadi (at) cs.ubbcluj.ro) or Diana Borza (dianaborza (at) cs.ubbcluj.ro) with any questions or comments about the database.