3D Language Gaussian Splatting (LangSplat)

Having semantics in a 3D reconstruction is extremely powerful, as they can be used for segmentation or connected to an LLM to retrieve localised information. Could we do that for 3D Gaussian splatting?

Take a look at “LangSplat: 3D Language Gaussian Splatting” from Tsinghua University and Harvard University.

This method grounds CLIP features into a set of 3D language Gaussians, attaining precise 3D language fields while being 199× faster than LERF.

They propose to learn hierarchical semantics using SAM, thereby eliminating the need to query the language field extensively across various scales and the need for regularization with DINO features.
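To make the querying idea a bit more concrete, here is a minimal sketch of how a text query could be scored against rendered language features at a few semantic levels. It is not the authors' implementation: the level names, the tiny 3-dimensional vectors, and the helper functions `cosine`, `relevancy_map` and `query_levels` are all hypothetical, and plain cosine similarity stands in for whatever relevancy score the paper actually uses.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def relevancy_map(feature_map, text_embedding):
    """Per-pixel cosine similarity for an H x W grid of feature vectors."""
    return [[cosine(f, text_embedding) for f in row] for row in feature_map]

def query_levels(level_maps, text_embedding):
    """Score the query against each semantic level and keep the best one.

    level_maps maps a (hypothetical) level name to an H x W grid of
    feature vectors. Returns (best_level_name, relevancy_map_for_it).
    """
    best_name, best_map, best_score = None, None, -math.inf
    for name, fmap in level_maps.items():
        rel = relevancy_map(fmap, text_embedding)
        score = max(max(row) for row in rel)
        if score > best_score:
            best_name, best_map, best_score = name, rel, score
    return best_name, best_map

# Toy 1 x 2 "images" with 3-dimensional features standing in for rendered
# language features; real CLIP embeddings are much higher-dimensional.
levels = {
    "whole": [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]],
    "part": [[[0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]],
}
query = [0.0, 0.0, 1.0]  # hypothetical text embedding

best_level, rel = query_levels(levels, query)
```

The point of the sketch is that with a small, fixed set of levels the query only needs one pass per level, rather than a sweep over many scales.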

I had overlooked this method, now accepted to CVPR 2024, but I'm glad I found it again. Have a look as well.

🧙 Paper Authors: Minghan Qin1*, Wanhua Li2*†, Jiawei Zhou1, Haoqian Wang1†, Hanspeter Pfister2 (* indicates equal contribution, † means co-corresponding author). 1Tsinghua University, 2Harvard University

This post is licensed under CC BY 4.0 by the author.