We can scrape suppliers' websites and PDF files, automatically find the relevant product features and convert them to the right standards. In this way product data can be enriched, or data quality issues detected, with limited manual intervention.
Feature extraction – text based
Product feature data is not always available in a structured format. Features may instead be buried in a product description or ERP text. With our advanced feature extraction algorithms these ‘hidden’ product features can be extracted with minimal manual intervention.
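As a rough illustration of the idea only, and not the platform's actual algorithm, text-based extraction can be sketched with a few regular expressions. The pattern set, the feature names (`length_mm`, `voltage_v`, `material`) and the sample description are all assumptions made for this example.

```python
import re

# Hypothetical patterns for illustration; a real system would need far more,
# tuned per product group and language.
PATTERNS = {
    "length_mm": re.compile(r"(\d+(?:\.\d+)?)\s*mm\b", re.IGNORECASE),
    "voltage_v": re.compile(r"(\d+(?:\.\d+)?)\s*V\b"),
    "material": re.compile(r"\b(stainless steel|brass|aluminium|pvc)\b", re.IGNORECASE),
}

def extract_features(description: str) -> dict:
    """Return the first match of each pattern found in a free-text description."""
    features = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(description)
        if match:
            features[name] = match.group(1).lower()
    return features

# Sample description invented for this sketch.
example = extract_features("Hex bolt M8 x 40 mm, stainless steel, DIN 933")
```

Here `example` holds the length and material that were only implied in the free text, while features with no match (such as voltage) are simply left out.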
Feature extraction – image based
Product feature data is not always available in a structured format, but features may be visible in product images. With our advanced feature extraction algorithms these visual product features can be extracted automatically.
Data quality: deduplication and fuzzy matching
Product data sets might contain duplicate data: similar feature data in multiple columns, or multiple lines for the same or closely related products, sometimes misspelled or containing typing errors.
Our algorithms can detect these data quality issues and, in some cases, correct them automatically.
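A minimal sketch of fuzzy matching, assuming a simple character-level similarity from Python's standard `difflib` module stands in for the real matching logic. The function name `find_likely_duplicates`, the 0.9 threshold and the sample product names are assumptions for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())

def find_likely_duplicates(names, threshold=0.9):
    """Return pairs of names whose similarity ratio meets the (assumed) threshold."""
    pairs = []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs

# Sample names invented for this sketch; the second one contains a typo.
products = [
    "Hex bolt M8x40 stainless",
    "Hex bolt M8x40 stainles",
    "Copper pipe 15mm",
]
duplicates = find_likely_duplicates(products)
```

Comparing every pair is fine for a small list but grows quadratically, so a larger catalogue would typically first group candidates (for example by product group) before scoring them.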
Data quality: outlier detection and contradictory data
Product data might contain faulty product feature data. Our algorithms can detect feature values that are ‘strange’ for a product when compared with other products in the same product group, and can detect contradictions between the features of a single product.
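Both checks can be sketched in a few lines, under stated assumptions. The outlier check below uses a modified z-score based on the median absolute deviation, a common robust technique chosen here for illustration rather than the platform's confirmed method. The contradiction check uses one hypothetical rule (net weight may not exceed gross weight), and the field names and sample values are invented.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the group, nothing to compare against
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]

def find_contradictions(product: dict) -> list:
    """Apply a single hypothetical consistency rule to one product record."""
    issues = []
    net = product.get("net_weight_kg")
    gross = product.get("gross_weight_kg")
    if net is not None and gross is not None and net > gross:
        issues.append("net weight exceeds gross weight")
    return issues

# Weights within one (invented) product group; the last looks like a unit error.
weights_kg = [1.2, 1.3, 1.25, 1.1, 1.3, 125.0]
flagged = flag_outliers(weights_kg)
issues = find_contradictions({"net_weight_kg": 2.0, "gross_weight_kg": 1.5})
```

A median-based score is used instead of the mean and standard deviation because a single extreme value would otherwise inflate the spread and hide itself.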
Our platform can be customized to support any available product information standard and language.