Product data is not always available, complete and correct. Not every supplier has the capacity, the focus and the resources to deliver high-quality data. As a result, revenue is lost to missing or incorrect data, or a lot of manual labor is needed to complete and correct product information.
Our solution
Squadra Machine Learning Company has developed several algorithms that can enrich or improve product data automatically. The following algorithms are available:
Scraping
We can scrape suppliers' websites and PDF files, find the relevant product features automatically and convert them to the right standards. In this way product data can be enriched, or data quality issues can be detected, with limited manual intervention.
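As a rough illustration of the idea, and not Squadra's actual implementation, the sketch below reads a specification table from an HTML fragment and maps the supplier's labels to standard attribute names. It assumes the page lists features as two-cell table rows; the class `SpecTableParser`, the function `scrape_features` and the mapping `LABEL_TO_STANDARD` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

# Hypothetical mapping from supplier labels to standard attribute names.
LABEL_TO_STANDARD = {
    "colour": "color",
    "color": "color",
    "height": "height",
    "material": "material",
}


class SpecTableParser(HTMLParser):
    """Collects (label, value) pairs from table rows with exactly two cells."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cells = None      # cells of the row being read, None outside a row
        self._in_cell = False   # True while inside a <th> or <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag in ("th", "td") and self._cells is not None:
            self._cells.append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            if len(self._cells) == 2:
                label, value = self._cells
                self.rows.append((label.strip(), value.strip()))
            self._cells = None

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data


def scrape_features(html: str) -> dict:
    """Return the features whose labels are known, keyed by standard name."""
    parser = SpecTableParser()
    parser.feed(html)
    return {
        LABEL_TO_STANDARD[label.lower()]: value
        for label, value in parser.rows
        if label.lower() in LABEL_TO_STANDARD
    }
```

Labels that are missing from the mapping are dropped here; in practice they could be logged for manual review instead.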
Feature extraction – text based
Product feature data is not always available in a structured format. Features may instead be buried in a product description or ERP text. With our advanced feature extraction algorithms these ‘hidden’ product features can be extracted with minimal manual intervention.
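To make this concrete, here is a minimal rule-based sketch that pulls a few features out of a free-text description. It only illustrates the task: the patterns, the feature names and the function `extract_features` are assumptions made for this example, and a production system would more likely learn such patterns than hard-code them.

```python
import re

NUMBER = r"(\d+(?:[.,]\d+)?)"  # accepts both 12.5 and 12,5

# Hypothetical patterns for numeric features, keyed by a standard feature name.
NUMERIC_PATTERNS = {
    "height_cm": re.compile(r"height\s*:?\s*" + NUMBER + r"\s*cm\b", re.IGNORECASE),
    "diameter_cm": re.compile(r"diameter\s*:?\s*" + NUMBER + r"\s*cm\b", re.IGNORECASE),
    "weight_g": re.compile(r"weight\s*:?\s*" + NUMBER + r"\s*g\b", re.IGNORECASE),
}

# Hypothetical closed list of colors to look for.
COLOR_PATTERN = re.compile(r"\b(red|green|blue|white|black)\b", re.IGNORECASE)


def extract_features(text: str) -> dict:
    """Return the features found in a free-text product description."""
    features = {}
    for name, pattern in NUMERIC_PATTERNS.items():
        match = pattern.search(text)
        if match:
            # Normalize the decimal comma before converting to a number.
            features[name] = float(match.group(1).replace(",", "."))
    color = COLOR_PATTERN.search(text)
    if color:
        features["color"] = color.group(1).lower()
    return features
```

Features that do not appear in the text are simply left out of the result, so missing values stay visible as gaps rather than being guessed.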
Feature extraction – image based
Product feature data is not always available in a structured format, but features may be visible in product images. With our advanced feature extraction algorithms these visual product features can be extracted automatically.
Data quality: deduplication and fuzzy matching
Product data sets might contain duplicate data: similar feature data in multiple columns, or multiple lines for the same or closely related products, sometimes misspelled or with typing errors.
Our algorithms can detect these data quality issues and in some cases correct them automatically.
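A simple way to picture fuzzy matching is to compare normalized product names pairwise and flag pairs that are nearly identical. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. The similarity threshold of 0.85 and the function names are assumptions for this example, not a description of the algorithms used on the platform.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def find_duplicates(names, threshold=0.85):
    """Return (i, j, score) for each pair of names with similarity >= threshold."""
    normalized = [normalize(n) for n in names]
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        score = SequenceMatcher(None, normalized[i], normalized[j]).ratio()
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs
```

Comparing every pair is quadratic in the number of products, so for a large catalog the comparison would normally be restricted to candidates within the same product group or sharing a key such as brand.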
Data quality: outlier detection and contradictory data
Product data might contain faulty product feature data. Our algorithms can detect ‘strange’ feature values for a product by comparing it with other products in the same product group, and can detect contradictions between the features of a single product.
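Both checks can be sketched in a few lines. The example below flags outliers within a product group using the modified z-score based on the median absolute deviation, a common robust statistic, and checks a product against hand-written consistency rules. The threshold of 3.5, the field names and the rules are assumptions for illustration only; the source does not state which methods the platform uses.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the group, so nothing can be ranked as unusual
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical consistency rules: each pairs a message with a predicate
# that is truthy when the product record contradicts itself.
RULES = [
    (
        "net weight exceeds gross weight",
        lambda p: p.get("net_weight_kg", 0) > p.get("gross_weight_kg", float("inf")),
    ),
    (
        "indoor-only product listed as weatherproof",
        lambda p: p.get("indoor_only") and p.get("weatherproof"),
    ),
]


def find_contradictions(product: dict) -> list:
    """Return the messages of all rules the product violates."""
    return [message for message, rule in RULES if rule(product)]
```

The median-based score is used instead of the mean and standard deviation because a single extreme value, such as a weight entered in grams instead of kilograms, would otherwise inflate the spread and hide itself.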
Our platform can be customized to support any available product information standard and language.
In practice
Kaemingk, an international company that specializes in selecting and distributing the most beautiful, original, and creative decorative items for home and garden, was implementing a new PIM system. When converting product data from their old solution to the new PIM system, they saved a lot of manual labor by using Squadra Machine Learning Company’s feature extraction algorithms.