Bootstrap Modal Example

Robust Dual-Modal Speech Keyword Spotting for XR Headsets

Abstract: Although speech interaction is widely used in the Extended Reality (XR) domain, conventional vocal speech keyword spotting systems still face serious challenges, ...

GitHub

Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment

Phantom is a unified video generation framework for single and multi-subject references, built on existing text-to-video and image-to-video architectures. It achieves cross-modal alignment using ...
