The impact of the number representation format in parameterized text queries on the accuracy of 3D model generation
DOI: https://doi.org/10.32620/reks.2025.3.02
