3D Deformable Object Matching using Graph Neural Networks

M.-A. Loghin

doi:10.24193/subbi.2024.1.02

M.-A. Loghin, Department of Computer Science, Babes-Bolyai University, 1, M. Kogalniceanu Street, 400084, Cluj-Napoca, Romania

Abstract

Most current advances in computer vision focus on two-dimensional imagery, covering problems such as classification, regression, and the lesser-known object matching problem. While object matching can be viewed as a solved problem in two dimensions, there is still a long way to go in three dimensions, especially for non-rigid objects. The problem consists of matching a given object to a target object. We propose a solution based on Graph Neural Networks that tries to generalize over multiple objects at once, using self-attention and cross-attention blocks in the network. To test our solution, we used five convolutional operators for the layers of the model: GCNConv, ChebConv, SAGEConv, TAGConv, and FeaStConv. This paper aims to find the best operators for our architecture and task. Our approach obtained favourable results when predicting the barycentric weights, while struggling to predict the triangle indices. The best results were obtained with GCNConv for triangle index prediction and with FeaStConv for barycentric coordinate prediction.
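The abstract describes a match in terms of a triangle index and barycentric weights. As a rough illustration of that representation only (not the paper's model or data pipeline), the sketch below computes barycentric weights for a point relative to one triangle and then recovers the point from the pair (triangle index, weights). The function names `barycentric_weights` and `point_from_match`, as well as the toy mesh, are hypothetical and chosen for this example.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def barycentric_weights(p: Vec3, a: Vec3, b: Vec3, c: Vec3) -> Vec3:
    """Weights (wa, wb, wc) of p with respect to triangle abc.

    If p lies off the triangle's plane, the weights describe its
    orthogonal projection onto that plane. The weights sum to 1.
    """
    v0, v1, v2 = _sub(b, a), _sub(c, a), _sub(p, a)
    d00, d01, d11 = _dot(v0, v0), _dot(v0, v1), _dot(v1, v1)
    d20, d21 = _dot(v2, v0), _dot(v2, v1)
    denom = d00 * d11 - d01 * d01
    if denom == 0.0:
        raise ValueError("degenerate triangle")
    wb = (d11 * d20 - d01 * d21) / denom
    wc = (d00 * d21 - d01 * d20) / denom
    return (1.0 - wb - wc, wb, wc)


def point_from_match(
    vertices: Sequence[Vec3],
    triangles: Sequence[Tuple[int, int, int]],
    tri_index: int,
    weights: Vec3,
) -> Vec3:
    """Recover a 3D point from a (triangle index, barycentric weights) pair."""
    i, j, k = triangles[tri_index]
    a, b, c = vertices[i], vertices[j], vertices[k]
    return tuple(
        weights[0] * a[n] + weights[1] * b[n] + weights[2] * c[n]
        for n in range(3)
    )


# Toy target mesh with a single triangle (assumed data, for illustration).
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
triangles = [(0, 1, 2)]

weights = barycentric_weights((0.25, 0.25, 0.0), *vertices)
recovered = point_from_match(vertices, triangles, 0, weights)
```

Under this encoding, a predicted match has a discrete part (which triangle) and a continuous part (where inside it), which is one way to read the abstract's separate results for triangle index prediction and barycentric weight prediction.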

References

[1] Adedigba, A. P., Adeshina, S. A., Aina, O. E., and Aibinu, A. M. Optimal hyperparameter selection of deep learning models for COVID-19 chest x-ray classification. Intell Based Med 5 (Apr. 2021), 100034.
[2] Chang, A. X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., Xiao, J., Yi, L., and Yu, F. ShapeNet: An Information-Rich 3D Model Repository. Tech. Rep. arXiv:1512.03012 [cs.GR], Stanford University — Princeton University — Toyota Technological Institute at Chicago, 2015.
[3] Christoffersen, P., and Jacobs, K. The importance of the loss function in option valuation. Journal of Financial Economics 72, 2 (2004), 291–318.
[4] Cosmo, L., Rodolà, E., Bronstein, M. M., Torsello, A., Cremers, D., and Sahillioglu, Y. Partial Matching of Deformable Shapes. In Eurographics Workshop on 3D Object Retrieval (2016), A. Ferreira, A. Giachetti, and D. Giorgi, Eds., The Eurographics Association.
[5] Cosmo, L., Rodolà, E., Masci, J., Torsello, A., and Bronstein, M. M. Matching deformable objects in clutter. In 2016 Fourth International Conference on 3D Vision (3DV) (2016), pp. 1–10.
[6] Defferrard, M., Bresson, X., and Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. In Proceedings of the 30th International Conference on Neural Information Processing Systems (Red Hook, NY, USA, 2016), NIPS’16, Curran Associates Inc., pp. 3844–3852.
[7] Du, J., Zhang, S., Wu, G., Moura, J. M. F., and Kar, S. Topology adaptive graph convolutional networks. CoRR abs/1710.10370 (2017).
[8] Dyke, R. M., Stride, C., Lai, Y.-K., Rosin, P. L., Aubry, M., Boyarski, A., Bronstein, A. M., Bronstein, M. M., Cremers, D., Fisher, M., Groueix, T., Guo, D., Kim, V. G., Kimmel, R., Lähner, Z., Li, K., Litany, O., Remez, T., Rodolà, E., Russell, B. C., Sahillioğlu, Y., Slossberg, R., Tam, G. K. L., Vestner, M., Wu, Z., and Yang, J. Shape correspondence with isometric and non-isometric deformations. In Eurographics Workshop on 3D Object Retrieval (2019), S. Biasotti, G. Lavoué, and R. Veltkamp, Eds., The Eurographics Association.
[9] Dyke, R. M., Zhou, F., Lai, Y.-K., Rosin, P. L., Guo, D., Li, K., Marin, R., and Yang, J. SHREC 2020 Track: Non-rigid shape correspondence of physically-based deformations. In Eurographics Workshop on 3D Object Retrieval (2020), T. Schreck, T. Theoharis, I. Pratikakis, M. Spagnuolo, and R. C. Veltkamp, Eds., The Eurographics Association.
[10] Fey, M., and Lenssen, J. E. Fast graph representation learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds (2019).
[11] Gao, X., and Fang, Y. A note on the generalized degrees of freedom under the l1 loss function. Journal of Statistical Planning and Inference 141, 2 (2011), 677–686.
[12] Ginzburg, D., and Raviv, D. Dual geometric graph network (dg2n) iterative network for deformable shape alignment. In 2021 International Conference on 3D Vision (3DV) (Dec. 2021), IEEE.
[13] González Izard, S., Sánchez Torres, R., Alonso Plaza, Ó., Juanes Méndez, J. A., and García-Peñalvo, F. J. Nextmed: Automatic imaging segmentation, 3d reconstruction, and 3d model visualization platform using augmented and virtual reality. Sensors 20, 10 (2020).
[14] Grozdev, S., and Dekov, D. Barycentric coordinates: Formula sheet. International Journal of Computer Discovered Mathematics 1 (03 2016), 72–82.
[15] Hamilton, W. L., Ying, R., and Leskovec, J. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems (Red Hook, NY, USA, 2017), NIPS’17, Curran Associates Inc., pp. 1025–1035.
[16] Joo, H., Jeong, Y., Duchenne, O., and Kweon, I. S. Graph-based shape matching for deformable objects. In 2011 18th IEEE International Conference on Image Processing (2011), pp. 2901–2904.
[17] Kim, J., and Shontz, S. M. An improved shape matching algorithm for deformable objects using a global image feature. In Advances in Visual Computing (Berlin, Heidelberg, 2010), G. Bebis, R. Boyle, B. Parvin, D. Koracin, R. Chung, R. Hammound, M. Hussain, T. Kar-Han, R. Crawfis, D. Thalmann, D. Kao, and L. Avila, Eds., Springer Berlin Heidelberg, pp. 119–128.
[18] Kipf, T. N., and Welling, M. Semi-supervised classification with graph convolutional networks. CoRR abs/1609.02907 (2016).
[19] Lazarou, M., Li, B., and Stathaki, T. A novel shape matching descriptor for real-time static hand gesture recognition. Computer Vision and Image Understanding 210 (2021), 103241.
[20] Li, Y., Wang, Y., Case, M., Chang, S.-F., and Allen, P. K. Real-time pose estimation of deformable objects using a volumetric approach. In 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (2014), pp. 1046–1052.
[21] Lian, Y., and Chen, M. Ca-cgnet: Component-aware capsule graph neural network for non-rigid shape correspondence. Applied Sciences 13, 5 (2023).
[22] Lähner, Z., Rodolà, E., Bronstein, M. M., Cremers, D., Burghard, O., Cosmo, L., Dieckmann, A., Klein, R., and Sahillioglu, Y. Matching of Deformable Shapes with Topological Noise. In Eurographics Workshop on 3D Object Retrieval (2016), A. Ferreira, A. Giachetti, and D. Giorgi, Eds., The Eurographics Association.
[23] Marin, R., Rakotosaona, M.-J., Melzi, S., and Ovsjanikov, M. Correspondence learning via linearly-invariant embedding, 2020.
[24] Muthukumar, V., Narang, A., Subramanian, V., Belkin, M., Hsu, D., and Sahai, A. Classification vs regression in overparameterized regimes: Does the loss function matter? J. Mach. Learn. Res. 22, 1 (jan 2021).
[25] Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., and Chintala, S. Pytorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 2019, pp. 8024–8035.
[26] Prisacariu, V. A., and Reid, I. D. Pwp3d: Real-time segmentation and tracking of 3d objects. International Journal of Computer Vision 98, 3 (Jul 2012), 335–354.
[27] Ranftl, R., Bochkovskiy, A., and Koltun, V. Vision transformers for dense prediction. ArXiv preprint (2021).
[28] Ranftl, R., Lasinger, K., Hafner, D., Schindler, K., and Koltun, V. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (2020).
[29] Rodolà, E., Bronstein, A. M., Albarelli, A., Bergamasco, F., and Torsello, A. A game-theoretic approach to deformable shape matching. In 2012 IEEE Conference on Computer Vision and Pattern Recognition (2012), pp. 182–189.
[30] Sarlin, P., DeTone, D., Malisiewicz, T., and Rabinovich, A. Superglue: Learning feature matching with graph neural networks. CoRR abs/1911.11763 (2019).
[31] Schulman, J., Lee, A., Ho, J., and Abbeel, P. Tracking deformable objects with point clouds. In 2013 IEEE International Conference on Robotics and Automation (2013), pp. 1130–1137.
[32] Veltkamp, R. Shape matching: similarity measures and algorithms. In Proceedings International Conference on Shape Modeling and Applications (2001), pp. 188–197.
[33] Verma, N., Boyer, E., and Verbeek, J. Dynamic filters in graph convolutional networks. CoRR abs/1706.05206 (2017).
[34] Wu, Y., Yan, W., Kurutach, T., Pinto, L., and Abbeel, P. Learning to manipulate deformable objects without demonstrations. In Robotics Science and Systems (07 2020), pp. 65–76.
[35] Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., and Yu, P. S. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems 32, 1 (2021), 4–24.
[36] Yang, L., and Shami, A. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 415 (2020), 295–316.
[37] Zhou, J., Cui, G., Hu, S., Zhang, Z., Yang, C., Liu, Z., Wang, L., Li, C., and Sun, M. Graph neural networks: A review of methods and applications. AI Open 1 (2020), 57–81.