SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
Now on arXiv → arxiv.org/abs/2505.21795
📈 Enabling fast, scalable annotation with minimal supervision.
See the qualitative results 👇
📈 +9.3% mIoU on LVIS-92i
⚡ 3× faster than prior methods
💡 Only 234M parameters (4-5× smaller than competitors)
SAM2 features are rich, but optimized for tracking.
🧠 Insert bottleneck adapters into the frozen SAM2 backbone (see the sketch below)
📉 The adapters restructure the feature space to disentangle semantics
📈 Result: features cluster semantically, even for unseen classes (see PCA👇)
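Here's a minimal, hedged sketch of what "bottleneck adapters in a frozen SAM2" can look like in PyTorch. It illustrates the general technique only, not the paper's exact module: the class names, bottleneck width, and insertion point are all assumptions.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project -> nonlinearity -> up-project, with a residual skip.
    Only these few weights train; the backbone stays frozen."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)
        # Zero-init the up-projection so the adapter starts as an identity
        # map and leaves the frozen features untouched at step 0.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

class AdaptedBlock(nn.Module):
    """Wraps one frozen SAM2 encoder block with a trainable adapter."""
    def __init__(self, block: nn.Module, dim: int):
        super().__init__()
        self.block = block
        for p in self.block.parameters():
            p.requires_grad = False  # SAM2 weights stay frozen
        self.adapter = BottleneckAdapter(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.block(x))
```

With narrow bottlenecks like this, the trainable parameter count stays tiny relative to the backbone, consistent with the small 234M total above.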
🔹 Textual Prompts for SAM2: Early fusion of visual-text cues via a novel adapter (one possible design sketched after this list)
🔹 Temporal Modeling: Essential for video understanding, beyond frame-by-frame object tracking
🔹 Tracking Bias: Correcting SAM2's tracking bias for text-aligned object discovery
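On the textual-prompts direction: the thread doesn't describe the adapter, so the following is purely a hypothetical sketch of one common early-fusion pattern (image tokens cross-attending to text embeddings). `TextFusionAdapter`, the dimensions, and the token shapes are all assumptions, not the actual design.

```python
import torch
import torch.nn as nn

class TextFusionAdapter(nn.Module):
    """Hypothetical early fusion: image tokens attend to text tokens
    before decoding. Illustrative only, not the proposed adapter."""
    def __init__(self, img_dim: int = 256, txt_dim: int = 512, heads: int = 8):
        super().__init__()
        self.txt_proj = nn.Linear(txt_dim, img_dim)  # align text width to image width
        self.attn = nn.MultiheadAttention(img_dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(img_dim)

    def forward(self, img_tokens: torch.Tensor, txt_tokens: torch.Tensor) -> torch.Tensor:
        # img_tokens: (B, N, img_dim), txt_tokens: (B, T, txt_dim)
        txt = self.txt_proj(txt_tokens)
        fused, _ = self.attn(query=img_tokens, key=txt, value=txt)
        return self.norm(img_tokens + fused)  # residual keeps frozen features intact

# Shape check with dummy tensors (e.g. a flattened 64x64 feature map
# and 8 text tokens from any text encoder):
adapter = TextFusionAdapter()
img = torch.randn(2, 64 * 64, 256)
txt = torch.randn(2, 8, 512)
out = adapter(img, txt)  # -> (2, 4096, 256)
```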