TY - JOUR
AU - Yao, Ting
AU - Zhang, Yiheng
AU - Qiu, Zhaofan
AU - Pan, Yingwei
AU - Mei, Tao
AB - The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21). SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning. Ting Yao, Yiheng Zhang, Zhaofan Qiu, Yingwei Pan, Tao Mei. JD AI Research, Beijing, China. {tingyao.ustc, yihengzhang.chn, zhaofanqiu, panyw.ustc}@gmail.com, tmei@jd.com. Abstract: A steady momentum of innovations and breakthroughs has convincingly pushed the limits of unsupervised image representation learning. Compared to static 2D images, video has one more dimension (time). The inherent supervision existing in such sequential structure offers a fertile ground for building unsupervised learning models. In this paper, we compose a trilogy of exploring the basic and generic supervision in the sequence from spatial, spatiotemporal and sequential perspectives. We materialize the supervisory signals through determining whether a pair of samples is from [...] building unsupervised learning models to yield powerful and generic representations. The supervision in the video sequence generally originates from three types: spatial, spatiotemporal, and sequential. In between, spatial supervision is derived from the structures in static frame, spatiotemporal supervision reflects the correlation across different frames, and sequential supervision verifies the temporal coherence. In the literature, unsupervised learning methods for videos often involve different proxy tasks, e.g., predicting the pixel-level displacement across consecutive [...]
TI - SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning
JF - Proceedings of the AAAI Conference on Artificial Intelligence
DO - 10.1609/aaai.v35i12.17274
DA - 2021-05-18
UR - https://www.deepdyve.com/lp/unpaywall/seco-exploring-sequence-supervision-for-unsupervised-representation-XBqfxZ0rdR
DP - DeepDyve
ER -