Samasource and Cornell Tech announced today their collaboration on iMaterialist-Fashion, a high-quality dataset to enable research into advanced methods for clothing identification. Visual analysis of clothing has received increasing attention, with benefits for both brands and consumers. Recognizing apparel products and their associated attributes (for example, lace or beading) from pictures could enhance the shopping experience and drive efficiency for retailers. The dataset will be part of the Fine-Grained Visual Categorization (FGVC) workshop this June at CVPR, the premier annual computer vision conference. The FGVC workshop is co-sponsored by Google AI.
“Quality data is important for algorithmic success. Using the SamaHub, Samasource’s fashion annotators were consistently able to produce quality results and on-time deliveries for the Cornell Tech team to help further our research and development for the fashion dataset,” says Cornell Tech professor Serge Belongie. “This dataset will facilitate significant advances in computer vision with the potential for wide-reaching consumer engagement.”
“At Samasource, we’re committed to advancing the AI industry, including supporting open source data initiatives. We’re thankful to the Cornell Tech team for sharing this vision and facilitating the development of this open source dataset. They were the ideal partner,” said Loic Juillard, VP Engineering, Samasource.
The Cornell Tech research team turned to Samasource for training data, with the goal of introducing a novel, fine-grained segmentation task that brings together the fashion and computer vision industries. The Cornell Tech team proposed a fashion taxonomy built by fashion experts and informed by product descriptions from the internet. To capture the complex structure of fashion objects and the ambiguity in descriptions obtained from crawling the web, the standardized taxonomy contains 46 apparel objects (27 main apparel items and 19 apparel parts) and 92 related fine-grained attributes. In total, around 50K clothing images from daily life, celebrity events, and online shopping were labeled by Samasource’s fashion annotators for fine-grained segmentation.