CAGE-SGG: Counterfactual Active Graph Evidence for Open-Vocabulary Scene Graph Generation
arXiv:2604.22274v3 Announce Type: replace
Abstract: Open-vocabulary scene graph generation (SGG) aims to describe visual scenes with flexible and fine-grained relation phrases beyond a fixed predicate vocabulary. While recent vision-language models gr…