A research team has developed a Gaussian Splatting processing platform that supports end-to-end processing from data acquisition to multi-platform rendering. Their framework provides a solid foundation for the large-scale adoption and future study of Gaussian Splatting technology.

The research was published in the journal Visual Intelligence on March 27, 2026.

3D Gaussian Splatting is an advanced computer graphics technique that uses millions of tiny points, or "splats," to create highly lifelike 3D scenes. Because of its exceptional rendering quality and real-time capabilities, 3D Gaussian Splatting offers strong support for applications such as virtual reality, augmented reality, and next-generation immersive media. In recent years, researchers have extended 3D Gaussian Splatting to dynamic scenes, proposing various 4D Gaussian Splatting representations.
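To make the "splat" idea concrete: each primitive in a 3D Gaussian Splatting scene is typically parameterized by a position, an anisotropic covariance (usually stored as per-axis scales plus a rotation quaternion), an opacity, and view-dependent color coefficients. The sketch below is a generic illustration of this common representation, not the authors' implementation; all field names are assumptions.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GaussianSplat:
    """One anisotropic 3D Gaussian primitive (illustrative field names)."""
    position: np.ndarray   # (3,) center in world space
    scale: np.ndarray      # (3,) per-axis extent of the ellipsoid
    rotation: np.ndarray   # (4,) unit quaternion (w, x, y, z)
    opacity: float         # alpha used during blending
    sh_coeffs: np.ndarray  # (K, 3) spherical-harmonic color coefficients

    def covariance(self) -> np.ndarray:
        """Sigma = R S S^T R^T: the 3x3 covariance defined by scale and rotation."""
        w, x, y, z = self.rotation
        # Standard quaternion-to-rotation-matrix conversion.
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T
```

A full scene stores millions of these primitives, which is exactly why the storage questions discussed below matter.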

3D Gaussian Splatting is a key technology in a revolutionary immersive medium known as volumetric video. Volumetric video is a 3D recording technique that uses multiple cameras to film a person or object and creates a digital 3D model viewable from any angle, rather than a flat image. "Volumetric video enables free-viewpoint exploration of immersive virtual environments, effectively narrowing the gap between digital and physical realities," said Professor Jingyi Yu from the School of Information Science and Technology, ShanghaiTech University.

However, volumetric video technology faces two main challenges: the massive storage and transmission overhead associated with temporal sequences, and a fragmented tool-chain ecosystem that hinders efficient research and development. These challenges prevent researchers from extending the technology to dynamic scenes.

For 3D Gaussian Splatting to find its way into practical use, the challenge of storage cost must be overcome. "The most pressing issue is the high storage cost, especially for dynamic scenes where the introduction of the temporal dimension dramatically increases the data volume, imposing greater demands on storage, transmission, and real-time interaction," said Dr. Lan Xu, also from the School of Information Science and Technology, ShanghaiTech University.

Existing solutions typically address isolated stages and lack a unified, end-to-end workflow from data acquisition to final viewing. Earlier studies have focused on the compression and optimization of Gaussian Splatting; however, this work has been scattered across different code bases and limited by inconsistent data formats and incompatible data loaders. These problems hinder the reproduction, comparison, and integration of the different methods.

"To address these challenges, we propose a comprehensive dynamic Gaussian processing framework that provides a complete, end-to-end pipeline. This framework systematically integrates the entire process, from data acquisition and standardized preprocessing to a suite of diverse dynamic Gaussian reconstruction algorithms," said Dr. Xu.

The team's platform incorporates a range of mainstream and cutting-edge 3D and 4D Gaussian Splatting reconstruction methods. These methods provide standardized data preprocessing interfaces and unified data loading mechanisms, and include a general compression framework that can be adapted to multiple representations.

One of the framework's core contributions is a general-purpose compression framework, compatible with the outputs of various reconstruction methods, which significantly reduces the storage footprint of dynamic sequences while maintaining high visual fidelity.
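The article does not detail the team's compression method. As a generic illustration of why splat attributes compress well, a common baseline in Gaussian Splatting codecs is to quantize 32-bit float attributes (opacities, scales, color coefficients) to 8-bit codes with a stored range, before any entropy coding. A minimal sketch under that assumption:

```python
import numpy as np

def quantize_u8(x: np.ndarray):
    """Uniform 8-bit quantization of a float attribute array.
    Returns the codes plus the (lo, hi) range needed to dequantize."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) or 1.0  # guard against constant attributes
    codes = np.round((x - lo) / scale * 255.0).astype(np.uint8)
    return codes, lo, hi

def dequantize_u8(codes: np.ndarray, lo: float, hi: float) -> np.ndarray:
    scale = (hi - lo) or 1.0
    return codes.astype(np.float32) / 255.0 * scale + lo

# 4 bytes -> 1 byte per value: a 4x reduction before entropy coding,
# with reconstruction error bounded by half a quantization step.
opacities = np.linspace(0.0, 1.0, 1000, dtype=np.float32)
codes, lo, hi = quantize_u8(opacities)
recovered = dequantize_u8(codes, lo, hi)
```

Real systems layer smarter tricks on top (per-channel ranges, temporal prediction between frames, entropy coding), but the float-to-code trade-off shown here is the basic lever.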

The team has also developed a cross-platform real-time rendering plugin that supports high-quality, interactive, free-viewpoint experiences for users on desktop, mobile, and XR devices.

Their work also includes a large-scale, high-quality dynamic human motion capture dataset. To build it, they constructed a dense multi-view acquisition system comprising 81 synchronized RGB cameras. With this 81-camera array, the team captured over 130 sequences of diverse human motions, including complex interactions with topological changes. The system records timecode-aligned 3840 x 2160 video at 30 frames per second.
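To put the capture system's output in perspective, a back-of-the-envelope calculation (assuming uncompressed 8-bit RGB, 3 bytes per pixel; the actual pipeline will of course compress) shows why storage and transmission dominate the problem:

```python
# Raw data rate of the 81-camera array at 3840 x 2160, 30 fps,
# assuming uncompressed 8-bit RGB (3 bytes per pixel).
WIDTH, HEIGHT, FPS, CAMERAS, BYTES_PER_PIXEL = 3840, 2160, 30, 81, 3

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL           # 24,883,200 B per camera frame
bytes_per_second_per_cam = bytes_per_frame * FPS             # ~746.5 MB/s per camera
bytes_per_second_total = bytes_per_second_per_cam * CAMERAS  # ~60.5 GB/s for the full array

print(f"per frame:  {bytes_per_frame / 1e6:.1f} MB")
print(f"per camera: {bytes_per_second_per_cam / 1e6:.1f} MB/s")
print(f"full array: {bytes_per_second_total / 1e9:.1f} GB/s uncompressed")
```

Roughly 60 GB of raw pixels per second of capture, before any reconstruction output is even produced, which is the scale the compression framework above is built to tame.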

"With this platform, we aim to establish a complete pipeline from data acquisition to practical application, promoting the large-scale adoption of Gaussian Splatting technologies in real-world scenarios and providing a reliable and efficient experimental foundation for future research," said Dr. Xu.

The ShanghaiTech University research team includes Shengkun Zhu, Chengcheng Guo, Yuanji Lu, Zhehao Shen, Yize Wu, Yu Hong, Yiwen Cai, Meihan Zheng, Yingliang Zhang, Lan Xu, and Jingyi Yu.

Funding information

This research was funded by the National Natural Science Foundation of China, the National Key R&D Program of China, the Central Guided Local Science and Technology Foundation of China, the MoE Key Lab of Intelligent Perception and Human-Machine Collaboration (ShanghaiTech University), and the Shanghai Frontiers Science Center of Human-centered Artificial Intelligence.


About the Authors

Dr. Jingyi Yu is an OSA Fellow, IEEE Fellow, and ACM Distinguished Scientist, and Director of the MoE Key Lab of Intelligent Perception and Human-Machine Collaboration. He received his B.S. with honors in Computer Science and Applied Mathematics from Caltech in 2000 and his Ph.D. in EECS from MIT in 2005. He is the Inaugural Chair Professor of ShanghaiTech University and also serves as Vice President of the university. Dr. Yu has worked extensively on computational imaging, computer vision, computer graphics, and bioinformatics. He has won several Best Paper Awards at top conferences, including the 2025 ACM SIGGRAPH Best Paper Award, the 2025 SIGGRAPH Best in Show Award (Emerging Technology), and a 2024 SIGGRAPH Best Paper Nomination. His student received the 2024 CVPR Best Student Paper Award. He was the first to introduce large visual models into chip design and received DAC Best Paper Honorable Mentions in both 2024 and 2025. He has served on the editorial boards of leading journals and was Program Chair of CVPR 2021 and ICCV 2027, as well as General Chair of ICCV 2025.

Dr. Lan Xu is an Assistant Professor with the School of Information Science and Technology at ShanghaiTech University, China. He received his PhD in Electronic and Computer Engineering from the Hong Kong University of Science and Technology (HKUST), Hong Kong, China, after which he joined ShanghaiTech as a tenure-track Assistant Professor and PI. His research lies at the intersection of computer vision, computer graphics, and computational photography. He has published numerous papers in top-tier conferences and journals, including SIGGRAPH, CVPR, ICCP, IEEE TRO, IEEE TPAMI, and ACM TOG.

About Visual Intelligence

Visual Intelligence is a global, peer-reviewed, open-access journal dedicated to the theory and practice of visual intelligence. The journal is the official publication of the China Society of Image and Graphics (CSIG), with Article Processing Charges fully covered by the Society. It focuses on the foundations of visual computing, the methodologies employed in the field, and the applications of visual intelligence, while particularly encouraging submissions that address rapidly advancing areas of visual intelligence research.

/Public Release. This material from the originating organization/author(s) may be of a point-in-time nature and edited for clarity, style, and length. Mirage.News does not take institutional positions or sides; all views, positions, and conclusions expressed herein are solely those of the author(s).


