SGSG: Stroke-Guided Scene Graph Generation

  • Qixiang Ma
  • Runze Fan
  • Lizhi Zhao
  • Jian Wu
  • Sio Kei Im
  • Lili Wang

Research output: Contribution to journal › Article › peer-review

Abstract

3D scene graph generation is essential for spatial computing in Extended Reality (XR), providing structured semantics for task planning and intelligent perception. However, unlike instance-segmentation-driven setups, generating semantic scene graphs still suffers from limited accuracy due to the coarse and noisy point cloud data typically acquired in practice, and from the lack of interactive strategies to incorporate users' spatialized and intuitive guidance. We identify three key challenges: designing controllable interaction forms, involving guidance in inference, and generalizing from local corrections. To address these, we propose SGSG, a Stroke-Guided Scene Graph generation method that enables users to interactively refine 3D semantic relationships and improve predictions in real time. We propose three types of strokes and a lightweight SGstrokes dataset tailored for this modality. Our model integrates stroke guidance representation and injection for spatio-temporal feature learning and reasoning correction, along with intervention losses that combine consistency-repulsive and geometry-sensitive constraints to enhance accuracy and generalization. Experiments and a user study show that SGSG outperforms state-of-the-art methods 3DSSG and SGFN in overall accuracy and precision, surpasses JointSSG in predicate-level metrics, and reduces task load across all control conditions, establishing SGSG as a new benchmark for interactive 3D scene graph generation and semantic understanding in XR.

Original language: English
Pages (from-to): 9792-9802
Number of pages: 11
Journal: IEEE Transactions on Visualization and Computer Graphics
Volume: 31
Issue number: 11
Publication status: Published - 2025

Keywords

  • Extended Reality
  • Scene Graph Generation
  • Spatial Computing
  • User Interaction
