Visual Foundation Model

Microsoft Research Blog

Swin Transformer supports 3-billion-parameter vision models that can train with higher-resolution images for greater task applicability

June 21, 2022 | Han Hu and Baining Guo

Early last year, our research team from the Visual Computing Group introduced Swin Transformer, a Transformer-based general-purpose computer vision architecture that, for the first time, beat convolutional neural networks on the important vision benchmark of COCO object detection…