Top Cited Papers: CVPR 2017
Best Paper Award
“Densely Connected Convolutional Networks”
G. Huang, Z. Liu, L. van der Maaten, K. Q. Weinberger

Honorable Mentions
“Annotating Object Instances with a Polygon-RNN”
L. Castrejon, K. Kundu, R. Urtasun, S. Fidler

“YOLO9000: Better, Faster, Stronger”
J. Redmon, A. Farhadi

Best Student Paper Award
“Computational Imaging on the Electric Grid”
M. Sheinin, Y. Y. Schechner, K. N. Kutulakos

Longuet-Higgins Prize (Test of Time)
“Accurate, Dense, and Robust Multi-View Stereopsis”
Y. Furukawa, J. Ponce

“Object Retrieval with Large Vocabularies and Fast Spatial Matching”
J. Philbin, O. Chum, M. Isard, J. Sivic, A. Zisserman
Curated Papers:
Densely Connected Convolutional Networks
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and more efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections (one between each layer and its subsequent layer), our network has L(L+1)/2 direct connections. For each layer, the feature maps of all preceding layers are used as inputs, and its own feature maps are used as inputs to all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmarks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state of the art on most of them, while requiring less memory and computation to achieve high performance. Code and pre-trained models are available at https://github.com/liuzhuang13/DenseNet.
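The core mechanic of the abstract above is concatenation: each layer consumes the stacked feature maps of every layer before it, so channel count grows by a fixed "growth rate" per layer. Here is a toy NumPy sketch of that connectivity pattern (random linear maps stand in for the real convolution + batch-norm layers; the layout and names are illustrative, not the paper's code):

```python
import numpy as np

def dense_block(x, num_layers, growth_rate, rng):
    """Toy dense block: each 'layer' maps the concatenation of all
    previous feature maps to growth_rate new channels, and its output
    is appended to the running list of features."""
    features = [x]  # every feature map produced so far, input included
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=0)      # (channels, H*W) layout
        w = rng.standard_normal((growth_rate, inp.shape[0]))
        out = np.maximum(w @ inp, 0.0)              # linear map + ReLU
        features.append(out)
    return np.concatenate(features, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 64))                   # 16 input channels
y = dense_block(x, num_layers=4, growth_rate=12, rng=rng)
print(y.shape)                                      # channels: 16 + 4*12 = 64
```

With L layers the block contains L(L+1)/2 direct connections, exactly the count quoted in the abstract, because layer i receives i incoming feature maps.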
Image-to-Image Translation with Conditional Adversarial Networks
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
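The "learned loss" in the abstract above is the adversarial term; pix2pix combines it with a weighted L1 term that ties the output to the paired ground truth. A minimal NumPy sketch of the generator objective, assuming a non-saturating log loss on the discriminator's score for the generated image (function names and the scalar-score interface are illustrative):

```python
import numpy as np

def pix2pix_generator_loss(d_fake, fake, target, lam=100.0):
    """Generator objective sketch: an adversarial term that rewards
    fooling the discriminator, plus a lambda-weighted L1 term that
    keeps the output close to the paired ground-truth image."""
    eps = 1e-12                                    # numerical safety for log
    adv = -np.mean(np.log(d_fake + eps))           # fool the discriminator
    l1 = np.mean(np.abs(fake - target))            # stay close to the target
    return adv + lam * l1

# An output identical to the target, shown to a fooled discriminator,
# drives the loss toward zero.
fake = target = np.ones((3, 8, 8))
print(pix2pix_generator_loss(np.array([0.99]), fake, target))
```

The large L1 weight (the paper reports λ = 100) is what keeps the generator from drifting into plausible-but-wrong outputs that the adversarial term alone would tolerate.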
Feature Pyramid Networks for Object Detection
Feature pyramids are a basic component in recognition systems for detecting objects at different scales. But pyramid representations have been avoided in recent object detectors that are based on deep convolutional networks, partially because they are slow to compute and memory intensive. In this paper, we exploit the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost. A top-down architecture with lateral connections is developed for building high-level semantic feature maps at all scales. This architecture, called a Feature Pyramid Network (FPN), shows significant improvement as a generic feature extractor in several applications. Using a basic Faster R-CNN system, our method achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles, surpassing all existing single-model entries including those from the COCO 2016 challenge winners. In addition, our method can run at 5 FPS on a GPU and thus is a practical and accurate solution to multi-scale object detection. Code will be made publicly available.
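The top-down pathway described above can be sketched in a few lines: project each backbone level to a common channel depth with a 1x1 lateral connection, then, starting from the coarsest level, upsample by 2x and add into the next finer lateral. This NumPy version uses random projection matrices and nearest-neighbour upsampling, and omits the 3x3 smoothing convolutions FPN applies to each merged map, so treat it as a shape-level illustration only:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_top_down(features, d=256, rng=None):
    """Build an FPN-style pyramid from backbone maps ordered fine -> coarse."""
    if rng is None:
        rng = np.random.default_rng(0)
    laterals = []
    for f in features:                       # 1x1 lateral: project to d channels
        w = rng.standard_normal((d, f.shape[0]))
        c, h, width = f.shape
        laterals.append((w @ f.reshape(c, -1)).reshape(d, h, width))
    merged = [laterals[-1]]                  # start from the coarsest level
    for lat in reversed(laterals[:-1]):      # walk back toward the finest
        merged.append(lat + upsample2x(merged[-1]))
    return list(reversed(merged))            # return fine -> coarse again

rng = np.random.default_rng(0)
feats = [rng.standard_normal((c, s, s)) for c, s in [(64, 32), (128, 16), (256, 8)]]
pyramid = fpn_top_down(feats, d=256, rng=rng)
print([p.shape for p in pyramid])
```

Every output level ends up with the same channel depth, which is what lets a single detection head run unchanged over all scales.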
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. João Carreira, Andrew Zisserman. CVPR 2017, pp. 4724–4733. Cited by: 631 papers, 1 patent.
Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution. Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, Ming-Hsuan Yang. CVPR 2017, pp. 5835–5843. Cited by: 347 papers.
SphereFace: Deep Hypersphere Embedding for Face Recognition. Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj, Le Song. CVPR 2017, pp. 6738–6746. Cited by: 339 papers.
End-to-End Representation Learning for Correlation Filter Based Tracking. Jack Valmadre, Luca Bertinetto, João Henriques, Andrea Vedaldi, Philip H. S. Torr. CVPR 2017, pp. 5000–5008. Cited by: 301 papers, 1 patent.
Image Super-Resolution via Deep Recursive Residual Network. Ying Tai, Jian Yang, Xiaoming Liu. CVPR 2017, pp. 2790–2798. Cited by: 298 papers, 1 patent.
Learning from Simulated and Unsupervised Images through Adversarial Training. Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Joshua Susskind, Wenda Wang, Russell Webb. CVPR 2017, pp. 2242–2251. Cited by: 276 papers, 6 patents.
Learning Deep CNN Denoiser Prior for Image Restoration. Kai Zhang, Wangmeng Zuo, Shuhang Gu, Lei Zhang. CVPR 2017, pp. 2808–2817. Cited by: 275 papers.
Beyond Triplet Loss: A Deep Quadruplet Network for Person Re-identification. Weihua Chen, Xiaotang Chen, Jianguo Zhang, Kaiqi Huang. CVPR 2017, pp. 1320–1329. Cited by: 226 papers, 1 patent.
EAST: An Efficient and Accurate Scene Text Detector. Xinyu Zhou, Cong Yao, He Wen, Yuzhi Wang, Shuchang Zhou, Weiran He, Jiajun Liang. CVPR 2017, pp. 2642–2651. Cited by: 226 papers.
Re-ranking Person Re-identification with k-Reciprocal Encoding. Zhun Zhong, Liang Zheng, Donglin Cao, Shaozi Li. CVPR 2017, pp. 3652–3661. Cited by: 226 papers, 1 patent.
OctNet: Learning Deep 3D Representations at High Resolutions. Gernot Riegler, Ali Osman Ulusoy, Andreas Geiger. CVPR 2017, pp. 6620–6629. Cited by: 215 papers, 1 patent.
Large Kernel Matters — Improve Semantic Segmentation by Global Convolutional Network. Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun. CVPR 2017, pp. 1743–1751. Cited by: 208 papers, 3 patents.
Spindle Net: Person Re-identification with Human Body Region Guided Feature Decomposition and Fusion. Haiyu Zhao, Maoqing Tian, Shuyang Sun, Jing Shao, Junjie Yan, Shuai Yi, Xiaogang Wang, Xiaoou Tang. CVPR 2017, pp. 907–915. Cited by: 198 papers, 1 patent.
A Point Set Generation Network for 3D Object Reconstruction from a Single Image. Haoqiang Fan, Hao Su, Leonidas Guibas. CVPR 2017, pp. 2463–2471. Cited by: 195 papers.
Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring. Seungjun Nah, Tae Hyun Kim, Kyoung Mu Lee. CVPR 2017, pp. 257–265. Cited by: 189 papers.
Semantic Scene Completion from a Single Depth Image. Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser. CVPR 2017, pp. 190–198. Cited by: 177 papers.
Learning Deep Context-Aware Features over Body and Latent Parts for Person Re-identification. Dangwei Li, Xiaotang Chen, Zhang Zhang, Kaiqi Huang. CVPR 2017, pp. 7398–7407. Cited by: 171 papers.
Disentangled Representation Learning GAN for Pose-Invariant Face Recognition. Luan Tran, Xi Yin, Xiaoming Liu. CVPR 2017, pp. 1283–1292. Cited by: 167 papers.
Semantic Image Inpainting with Deep Generative Models. Raymond A. Yeh, Chen Chen, Teck Yian Lim, Alexander G. Schwing, Mark Hasegawa-Johnson, Minh N. Do. CVPR 2017, pp. 6882–6890. Cited by: 161 papers.
Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition. Jianlong Fu, Heliang Zheng, Tao Mei. CVPR 2017, pp. 4476–4484. Cited by: 159 papers, 2 patents.
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. R. Qi Charles, Hao Su, Mo Kaichun, Leonidas J. Guibas. CVPR 2017, pp. 77–85. Cited by: 157 papers, 1 patent.
ChestX-Ray8: Hospital-Scale Chest X-Ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases. Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald M. Summers. CVPR 2017, pp. 3462–3471. Cited by: 156 papers, 2 patents.