Detecting the Most Important Classes from Software Systems with Self Organizing Maps

  • E.-M. Manole, Department of Computer Science, Faculty of Mathematics and Computer Science, Babeș-Bolyai University, Cluj-Napoca, Romania


Self Organizing Maps (SOM) are unsupervised neural networks suited for visualisation and clustering analysis. This study uses SOM to solve a software engineering problem: detecting the most important (key) classes in software projects. Key classes are meant to link the most valuable concepts of a software system, and in general they are found in the solution documentation. UML models created in the design phase become outdated over time and tend to be a source of confusion in large legacy software. Developers therefore try to reconstruct class diagrams from the source code using reverse engineering. However, the resulting diagram is often very cluttered and difficult to understand. There is interest in automatic tools for building concise class diagrams, but the machine learning possibilities have not yet been fully explored. This paper proposes two algorithms that transform SOM into a classification algorithm for this task, which involves separating the important classes, which should appear on the diagrams, from the other, less important ones. Moreover, SOM is a reliable visualisation tool that is able to provide insight into the structure of the analysed projects.
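As an illustration of the general idea of using a SOM for classification, the following is a minimal sketch and not either of the two algorithms proposed in the paper, whose details are not given in this abstract. It assumes one common approach: train the map without labels, then label each node by majority vote of the training samples mapped to it. The two features per class (hypothetical normalised metrics), the toy data, and all function names are assumptions made for the example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weights are closest to x."""
    return min(
        weights,
        key=lambda n: sum((wi - xi) ** 2 for wi, xi in zip(weights[n], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def label_nodes(weights, data, labels):
    """Assign each hit node the majority label of the samples mapped to it."""
    votes = {}
    for x, y in zip(data, labels):
        votes.setdefault(best_matching_unit(weights, x), []).append(y)
    return {node: max(set(v), key=v.count) for node, v in votes.items()}


def predict(weights, node_labels, x):
    """Classify x with the label of the nearest labelled node."""
    labelled = {node: weights[node] for node in node_labels}
    return node_labels[best_matching_unit(labelled, x)]


# Toy data (assumed): two hypothetical metrics per class, scaled to [0, 1].
samples = [
    [0.90, 0.85], [0.80, 0.90], [0.95, 0.80],                # key classes
    [0.10, 0.15], [0.20, 0.10], [0.05, 0.20], [0.15, 0.05],  # other classes
]
labels = ["key"] * 3 + ["other"] * 4

som = train_som(samples, rows=3, cols=3)
node_labels = label_nodes(som, samples, labels)
print(predict(som, node_labels, [0.90, 0.85]))
```

The trained grid can also be inspected directly, for example by printing which node each class maps to, which corresponds to the visualisation use of SOM mentioned above.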


How to Cite
MANOLE, E.-M. Detecting the Most Important Classes from Software Systems with Self Organizing Maps. Studia Universitatis Babeș-Bolyai Informatica, [S.l.], v. 66, n. 1, p. 54-73, July 2021. ISSN 2065-9601.