cs.CV

T-REN: Learning Text-Aligned Region Tokens Improves Dense Vision-Language Alignment and Scalability

Savya Khosla, Sethuraman T, Aryan Chadha, Alex Schwing, Derek Hoiem
4/20/2026 · arXiv
