The annual Computer Vision and Pattern Recognition (CVPR) conference has come to a close in New Orleans, with seven papers submitted by world-leading technology company OPPO accepted for the conference, placing it among the most successful technology companies at the event. OPPO also placed in eight of the most-watched competitions during the conference, winning three first-place, one second-place, and four third-place awards.
“In 2012, deep neural networks designed for image recognition rejuvenated the research and application of artificial intelligence. Since then, AI technology has seen a decade of rapid development,” said Guo Yandong, Chief Scientist of Intelligent Perception at OPPO.
“OPPO continues to promote artificial intelligence to perform complex perceptual and cognitive behaviors. We empower AI with superior cognitive abilities to understand and create beauty and develop embodied AI with autonomous behavior. I’m excited to see that seven of our papers have been selected for this year’s conference. Building on this success, we will continue to explore both fundamental AI and cutting-edge AI technology, as well as business applications that will allow us to bring the benefits of AI to more people.”
Seven papers accepted by CVPR 2022 showcase OPPO’s progress in creating humanizing AI
Seven papers submitted by OPPO for CVPR 2022 have been selected for presentation at the conference. Their research areas include multimodal information interaction, 3D human body reconstruction, personalized evaluation of image aesthetics, knowledge distillation, etc.
Cross-modal innovation is seen as the way to “humanize” artificial intelligence. Text is often highly abstract and simplified, while visual image data contains many specific contextual details. OPPO researchers have proposed a new CRIS framework based on the CLIP model to enable AI to gain a finer understanding of text and image data.
The biggest difference between human intelligence and artificial intelligence today lies in multimodality. Humans can readily understand information in words and pictures and establish relationships between the two. The new method proposed by OPPO improves multimodal intelligence, which could potentially enable artificial intelligence to truly understand and interpret the world through multiple forms of information such as language, hearing, and vision, making the robots and digital assistants of sci-fi movies a reality.
3D reconstruction of the human body is another area in which the OPPO Research Institute has made significant progress. At CVPR, OPPO demonstrated a process to automatically generate digital avatars of humans with clothes that behave more naturally. By analyzing RGB video of humans captured with a camera, OPPO can accurately generate dynamic 1:1 3D models that include small details like logos or fabric textures. Creating accurate 3D models of clothing has remained one of the biggest challenges in this area. The new model effectively reduces the requirements for performing 3D human body reconstruction, providing technical foundations that can be applied to areas such as virtual fitting rooms for online shopping, AI fitness classes, and the creation of realistic avatars in VR/AR worlds.
Structured local radiance fields for modeling human avatars
AI image recognition has now reached a stage where it can accurately identify a wide range of objects in an image. The ability of AI to evaluate images in terms of perceived aesthetic quality is often strongly tied to the big data used in training the AI model.
Together with Leida Li, a professor at Xidian University, OPPO researchers developed the Personalized Image Aesthetic Assessment (PIAA) model. The model is the first to optimize the aesthetic evaluation of AI by combining subjective user preferences with more general aesthetic values. Going forward, the model will be used to create personalized experiences for users, not only curating photo albums but also recommending how to take the best photo and which content a user might prefer.
Personalized assessment of image aesthetics with rich attributes
OPPO has also chosen to open-source the PIAA model evaluation dataset for developers, and a number of research institutes and universities have already expressed interest in using the data to pursue their own efforts in personalized AI aesthetic evaluation.
In addition, OPPO has proposed a multi-view 3D semantic plane reconstruction solution capable of accurately analyzing surfaces in a 3D environment. Developed in partnership with Tsinghua University, INS-Conv (INcremental Sparse Convolution) can achieve faster and more accurate online 3D instance and semantic segmentation. This can effectively reduce the computing power needed to perform environment recognition, which will allow this technology to be more easily adopted in applications such as automated driving and virtual reality.
OPPO makes AI “light” with a second place in the NAS Challenge
CVPR 2022 also saw a number of technical challenges unfold, with OPPO placing third or higher in eight of them. These include the Neural Architecture Search (NAS) Challenge, SoccerNet, SoccerNet Replay Grounding, ActivityNet Temporal Localization, and the 4th Large-scale Video Object Segmentation Challenge.
From mobile photography to automated driving, deep learning models are being applied in a growing pool of industries. However, deep learning relies heavily on big data and computing power and is costly, which presents challenges for its commercial implementation. Neural architecture search (NAS) techniques can automatically discover and implement optimal neural network architectures. In the NAS competition, OPPO researchers trained a supernet and optimized the model so that 45,000 neural subnets could inherit the supernet’s parameters.
Using the NAS technique, researchers only need to train a large supernet and create a predictor to allow subnets to learn by inheriting parameters from the supernet. This provides an efficient and inexpensive approach to obtaining a deep learning model that surpasses those designed manually by expert network architects. It could bring previously unthinkable levels of AI technology to mobile devices in the near future.
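The idea of subnets inheriting parameters from a supernet can be illustrated with a small sketch. This is not OPPO's competition entry: the search space, layer sizes, and function names below are hypothetical, the supernet weights are random rather than trained, and the performance predictor is left out. It only shows how one set of supernet weights can be sliced to supply parameters to every candidate subnet without training each one separately.

```python
import itertools
import random

# Hypothetical search space: each hidden layer may use one of these widths.
WIDTH_CHOICES = (2, 4, 8)
NUM_HIDDEN = 2
IN_DIM, OUT_DIM = 3, 1


def init_supernet(seed=0):
    """Create weight matrices sized for the widest choice in every layer.

    In a real system these weights would be trained; here they are random.
    """
    rng = random.Random(seed)
    dims = [IN_DIM] + [max(WIDTH_CHOICES)] * NUM_HIDDEN + [OUT_DIM]
    return [
        [[rng.uniform(-1, 1) for _ in range(dims[i])] for _ in range(dims[i + 1])]
        for i in range(len(dims) - 1)
    ]


def inherit_subnet(supernet, widths):
    """Slice the supernet's weights down to the chosen widths (no retraining)."""
    dims = [IN_DIM] + list(widths) + [OUT_DIM]
    return [
        [row[: dims[i]] for row in layer[: dims[i + 1]]]
        for i, layer in enumerate(supernet)
    ]


def forward(weights, x):
    """Plain MLP forward pass with ReLU on the hidden layers."""
    for depth, layer in enumerate(weights):
        x = [sum(w * v for w, v in zip(row, x)) for row in layer]
        if depth < len(weights) - 1:
            x = [max(0.0, v) for v in x]
    return x


supernet = init_supernet()
# Every combination of per-layer widths is one candidate subnet.
search_space = list(itertools.product(WIDTH_CHOICES, repeat=NUM_HIDDEN))
subnet = inherit_subnet(supernet, (4, 2))
output = forward(subnet, [1.0, 0.5, -0.5])
```

With three width choices over two layers this toy search space holds only nine subnets. A predictor of the kind mentioned above would then rank such candidates so that only the most promising ones need full evaluation.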
During CVPR 2022, OPPO also participated in seminar presentations and three high-level workshops. During the SLAM seminar, OPPO researcher Deng Fan explained how real-time vSLAM can be performed on smartphones and AR/VR devices. In the AICITY workshop, Li Wei proposed a multi-view motion localization system to identify abnormal driver behavior.
OPPO brings the benefits of AI to more people, sooner
This is the third year that OPPO has participated in CVPR. OPPO’s growing success at CVPR over these three years owes much to its continued investment in AI technology. In early 2020, the Institute of Intelligent Perception and Interaction was established under the OPPO Research Institute to further OPPO’s exploration of cutting-edge AI technologies. Today, OPPO has more than 2,650 global patent applications in the field of AI.
Guided by its brand proposition, “Inspiration Ahead”, OPPO is also working with industry partners to bring AI technology from the lab to everyday life. OPPO’s AI technology has been used to develop products and features such as the CybeReal real-time spatial AR generator, OPPO Air Glass, Omoji, and more. Through these technologies, OPPO aims to create more realistic digital worlds that combine the virtual and the real to deliver brand new experiences for users.