If you watch Data Quality news, or subscribe to my Information Quality aggregator, you'll have already seen the announcement that Microsoft is purchasing data quality vendor Zoomix. Vince McBurney and others have posted analyses of the purchase, citing Zoomix's ability to better position Microsoft for MDM.
A Product Data Quality Capability
What caught my eye in the press release was the mention of product data quality. I posted last June on the challenges of standardizing product attributes. Similarly, Peter Scott has posted on the challenges of modeling products. A good deal of research has targeted how to standardize, classify, and cleanse customer/vendor names and addresses, but few have taken up the challenge of working with products.
[I should mention that Zoomix also targets other data domains such as customer name/address. I chose to only review product domain capabilities for this article.]
But, Aren't Products Easy?
The attribute set for the product domain is surprisingly rich. It encompasses a product's "state" in the production cycle, allows for composition and decomposition of multi-component products, tracks materials' sources, includes geographic-specific packaging and ingredients, and of course deals with shipping units and pricing.
The likelihood of two business partners implementing their product data in the same fashion is roughly the same as the likelihood of Larry Ellison stopping by my house for a visit.
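To make that concrete, here is a minimal sketch of two partners describing the same product in different shapes, plus the hand-written mapping needed to reconcile them. The field names, records, and the `Product` class are all hypothetical, invented purely for illustration; real feeds are far messier.

```python
from dataclasses import dataclass

# Hypothetical records: two partners describing the same product.
supplier_record = {"desc": "Widget, Blue", "pack_qty": 12, "unit_wt_oz": 8.0}
retailer_record = {"name": "BLUE WIDGET", "case_size": "12", "weight_g": 226.8}

OZ_TO_G = 28.3495  # grams per avoirdupois ounce


@dataclass(frozen=True)
class Product:
    """A common (hypothetical) target shape for both feeds."""
    name: str
    units_per_case: int
    unit_weight_g: float


def from_supplier(rec):
    # This supplier writes "Noun, Modifier"; flip it to "modifier noun".
    noun, _, modifier = rec["desc"].partition(",")
    name = f"{modifier.strip()} {noun.strip()}".strip().lower()
    weight_g = round(rec["unit_wt_oz"] * OZ_TO_G, 1)
    return Product(name, int(rec["pack_qty"]), weight_g)


def from_retailer(rec):
    # This retailer stores counts as strings and weights in grams.
    name = rec["name"].strip().lower()
    return Product(name, int(rec["case_size"]), round(float(rec["weight_g"]), 1))


same = from_supplier(supplier_record) == from_retailer(retailer_record)
```

Even in this toy case, every partner needs its own mapping function, and that is with only three attributes in play.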
I know of at least one (large) retailer whose automation pulls only the most basic information into the Product Information Management (PIM) software. Staff manually key in the rest of the data from product sheets -- the data standardization is more of a challenge than they're able to address. (I should note they have upcoming initiatives to improve their situation.)
The Promise
Since I haven't used Zoomix (though I'd be MORE than happy to put it through its paces with a trial license) I can only speak to what's written on their product sheets.
- It appears to parse semi-structured data, a hot button of mine.
- They diagram support for a SOA implementation architecture -- pretty much a requirement for any serious player these days.
- They also tout a self-learning capability to avoid needing to build up rule sets. I'll be skeptical of this last part until actually seeing it in action.
- They mention support of the major product taxonomies (UNSPSC, GS1, etc.).
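For readers wondering what "parsing semi-structured data" means for products, here is a minimal rule-based sketch. It is emphatically not how Zoomix works (I haven't seen the product); the description format, the pattern, and the `parse_description` function are my own assumptions for illustration.

```python
import re

# Hypothetical free-text description, of the kind found on a product sheet.
RAW = "ACME Cola 12 x 355ml can"

PATTERN = re.compile(
    r"^(?P<name>.+?)\s+"            # product name, matched as short as possible
    r"(?P<count>\d+)\s*x\s*"        # units per pack, e.g. "12 x"
    r"(?P<size>\d+(?:\.\d+)?)\s*"   # unit size, e.g. "355"
    r"(?P<uom>ml|l|g|kg|oz)\b"      # unit of measure
    r"(?:\s+(?P<package>\w+))?$",   # optional package type, e.g. "can"
    re.IGNORECASE,
)


def parse_description(text):
    """Split a description into attributes, or return None if it doesn't fit."""
    m = PATTERN.match(text.strip())
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "count": int(m.group("count")),
        "size": float(m.group("size")),
        "uom": m.group("uom").lower(),
        "package": m.group("package"),
    }


parsed = parse_description(RAW)
```

A single hand-written pattern like this breaks as soon as a supplier writes "355 ML x 12" instead, which is exactly why rule sets balloon and why a self-learning claim is attractive, if it holds up.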

Delivery on a promise is what matters in the end. They've raised my curiosity -- I'll be watching competitors' announcements for reactions/trends and will post my findings here.
Ivan Chong has already weighed in over at Informatica, rightly pointing out the value of DBMS-agnostic data quality rules. I would hope Microsoft will keep this in mind, but that may be overly optimistic on my part.
