Friday, June 25, 2010

Predictably Poor MetaData Quality

Big huge disclaimer -- post's title is a play on Jim Harris's excellent post titled "Predictably Poor Data Quality."


In fact, it's Jim's post that started me thinking about whether data and metadata quality issues stem from the same root source.

I work with a product that, among other things, infers table relationships and proposes ER diagrams based on statistical patterns found in the data.  We analyze the actual data values, rather than documentation and naming conventions, because the truth is often in the data.  (Of course, data quality issues add noise to that truth, but that's a topic for another day.)

Why do we analyze the data?  Because database documentation, ER diagrams, naming conventions and subject matter experts' memories are usually incomplete, if not erroneous or flat-out missing.

In other words, this solution exists because organizations have predictably poor metadata.
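To make "the truth is often in the data" a bit more concrete, here's a minimal sketch of one way a relationship can be guessed from values alone: check whether the distinct values in one column are (almost) all contained in a unique column of another table. This is only an illustration, not how the product I work with does it. The table names, column names, function names, and the 0.95 threshold are all made up for the example.

```python
def containment(child_values, parent_values):
    """Fraction of distinct non-null child values that also appear in the parent column."""
    child = {v for v in child_values if v is not None}
    parent = {v for v in parent_values if v is not None}
    if not child:
        return 0.0
    return len(child & parent) / len(child)


def candidate_foreign_keys(tables, threshold=0.95):
    """Return (child column, parent column, score) for likely parent/child pairs.

    `tables` maps table name -> {column name -> list of values}.
    The threshold is an arbitrary illustrative cutoff, not a recommended setting.
    """
    results = []
    for child_table, child_cols in tables.items():
        for child_col, child_vals in child_cols.items():
            for parent_table, parent_cols in tables.items():
                if parent_table == child_table:
                    continue
                for parent_col, parent_vals in parent_cols.items():
                    non_null = [v for v in parent_vals if v is not None]
                    # Only a column with unique values can act as a key.
                    if not non_null or len(set(non_null)) != len(non_null):
                        continue
                    score = containment(child_vals, parent_vals)
                    if score >= threshold:
                        results.append((f"{child_table}.{child_col}",
                                        f"{parent_table}.{parent_col}",
                                        score))
    return results


# Hypothetical sample data: the column names give no hint of the relationship.
tables = {
    "customers": {"id": [1, 2, 3, 4], "name": ["Ann", "Bob", "Cy", "Dee"]},
    "orders": {"order_id": [10, 11, 12], "cust_ref": [1, 1, 3]},
}

for child, parent, score in candidate_foreign_keys(tables):
    print(f"{child} -> {parent} (containment {score:.2f})")
```

Nothing in the names says cust_ref points at id, but the values do.  Real data is messier, which is why a threshold below 1.0 is used rather than demanding a perfect match.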

As with data quality, solving metadata quality means addressing the root problems.  Technology solutions can help, but will never resolve the challenge in and of themselves. 

I'd argue that the same human motivations come into play for metadata quality as for data quality.  For example:
  • short-term thinking
  • believing documentation is someone else's role
  • empire-building
  • lack of understanding of the impact
  • task overload
Jim lists some great readings on behavior and data quality.  I think they'll be worth a re-read from a metadata perspective.

What behaviors do you see as root causes for poor metadata quality?

Monday, June 07, 2010

Road Warrior: Using TripIt to Manage Itineraries

This is the first week in around three months I haven't traveled.  After a while the trips blur together, as do the hotels and departure times.  I truly do not remember on any given week what time the next week's flights are.

I started using TripIt.com in January to organize my travel details.  My memory isn't any better, but now I don't worry about it.


When I get my itinerary email from my travel agency, I just forward that email to plans@tripit.com.  TripIt parses the email into plane, car, and hotel portions, then adds driving instructions between each and the latest weather forecasts.

Okay, that's nice but still a bit ho-hum.  What else does it do? Well, it:

  • Lets me know when friends and co-workers (who've given TripIt permission to share itineraries with me) are nearby on their travels.
  • Creates calendar entries for the travel on my BlackBerry.  (I do have to open the TripIt BlackBerry app for this synchronization to occur.)
  • Lets me email updates to the itinerary and adjusts the entries -- and gets it right!
  • Integrates to a widget in my Lotus Notes sidebar.
  • Creates an RSS calendar feed that I can view in my Google and Lotus Notes calendars.  Disclaimer: my employer's travel software does send me calendar entries -- but I don't use them because it keeps getting things wrong.  TripIt's entries are much more accurate.
  • Sends me an email with my itinerary any time I request it.  For example, if I send an email to plans@tripit.com with "get car" in the subject line, it'll send me an email with today's car rental details.

End result?  My calendar is up to date with almost no effort -- in Lotus, Google, and my BlackBerry.  And the time zone adjustments are always correct.  Unlike when I do it manually...

All the details of my reservations are in the calendar entries.  My confirmation codes and hotel addresses are always at my fingertips without having to search through my emails. 

Using a trip manager is a big time-saver for me.  Might be for you too.

Saturday, June 05, 2010

Data Quality: A Philosophical Approach to Truth

Jim Harris just wrote an excellent post titled The Point of View Paradox on the importance of considering how multiple points of view can lead to multiple perceptions of truth. 

The basic thesis is that the background, history, and perceptions we bring to a situation, any situation, will impact what we perceive as "truth" in that moment.

What does this have to do with data quality?  Or 'technical' projects of any type for that matter?  More than 15 years of technology experience tells me it's a critical concept.  Some examples:

Software development:  How many times have we joked that users say "it's just what we asked for but not what we wanted"?  How often do developers and users read the same specification document and build a completely different mental image?  Um, nearly every time!  Successful development projects necessarily include checkpoints to ensure all parties' "truths" about what's being built have a chance to sync up.

DBA:  Users want all the data available, all the time, at the push of a button.  Administration wants to meet business requirements at the minimal cost.  Security officers (and auditors) want thorough separation of data access based on business roles.  DBAs want to ensure stable systems with adequate hardware and staffing for backups, redundancy, and maintenance.  All important priorities -- yet each priority impacts the ability to meet the other priorities.  (Assuming we don't have unlimited funds....)

Data Quality:  The U.S. only has 50 states.  I know this to be a truth.  I also know of a system that has 51 states -- one of which is MassHampshire.  Why?  Because the company has an employee working in New Hampshire while living in Massachusetts.  The system's "truth" of 51 states reflected its need to manage a cross-border set of regulations.  When planning the data migration to a new ERP system, what originally appeared to be bad data was actually a use case that needed to be addressed in the new system's configuration.
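Here's a minimal sketch of the kind of profiling that surfaces a MassHampshire before a migration: compare the distinct values in a state column against a reference list and report whatever doesn't match, along with how often it occurs. The function name and the sample records are hypothetical, not taken from the system described above.

```python
from collections import Counter

# The 50 two-letter U.S. state codes used as the reference "truth".
US_STATES = {
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
    "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
    "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
    "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
    "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY",
}


def unexpected_state_values(values, reference=US_STATES):
    """Return {value: count} for non-blank values missing from the reference set."""
    counts = Counter(v.strip() for v in values if v and v.strip())
    return {value: n for value, n in counts.items() if value not in reference}


# Hypothetical sample of a source system's state column.
records = ["MA", "NH", "MassHampshire", "TX", " MassHampshire ", None, ""]

print(unexpected_state_values(records))
```

The output is a list of questions to ask, not a list of rows to delete.  A value that shows up repeatedly and consistently is more likely a business rule in disguise than a typo.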


Jim's post is worth taking a few moments to internalize.  We don't have to agree with another's "truth" or point of view, but we generally do best when we at least make an attempt to understand the logic behind it.  Even in technology.

Friday, June 04, 2010

Evergreen Topic: Securing the Oracle Agent

Jeepers.  According to HitTail the most popular posts I've ever written were in April, 2006.  In dog years (which are surely even slower than blog years?) that'd be 30 years ago.

So what's the top 'oldie but goodie' topic?  Securing communications with the agent -- who knew? 
Securing Communication with the Agent
Securing Communication with the Repository

I thought the RMAN posts would be the heavy hitters, but they came in 2nd, followed by Service Broker (SQL Server) in 3rd.

My takeaway from this little exercise is that blogging about specific technical topics has an enduring value.  Now I'm thinking a few posts about configuring Optim Data Growth or InfoSphere Discovery may be in order.....

Wednesday, June 02, 2010

Apologies to Data Quality Aggregator Subscribers

Hey folks, sorry for the huge list of old posts from my blog that just showed up in the aggregator. 

I changed the aggregator to only include my posts if labeled "data quality" -- so y'all wouldn't have to read my ramblings about life as a road warrior, etc...

Unfortunately, submitting that change meant all 29 data quality posts re-fed themselves into the aggregated feed (sigh).

Have a great day.

Tuesday, June 01, 2010

Road Warrior: What I Carry

[Note: Life as a “road warrior” is one topic I expect to return to regularly as this blog evolves…]

I fly somewhere nearly every week. I’ll hit 40,000 miles for 2010 by this Friday. How does a dedicated geek decide what electronic toys to take? I decide by weight and whether it can be multi-purpose.

With these gadgets I can work, surf, read, connect. They're my lifeline.

Always with me:

  • Laptop (duh!)
  • iGo charging system: I always carry the wall plug and the car charger. I never carry the laptop’s original power cord when I travel – it’s iGo all the way.
  • iGo tips: For USB, micro USB, iPod, and my laptop.

In case you aren’t familiar with the iGo, here’s a pic of mine:

[Photo: iGo parts]

  • BlackBerry: It serves several purposes … a Kindle reader, GPS, Twitter client, Facebook client, Flickr client, and Pandora radio. Oh, and sometimes I use it as the Internet source for my laptop.
  • Spare BlackBerry battery: After reading how I overload my BlackBerry, are you surprised?
  • USB Hard Drive: I keep spare copies of my Virtual Machines on the hard drive. I also use it to transfer files between laptops instead of thumb drives – I lose thumb drives too easily.
  • LAN Cable: 10 foot cable. Wireless is almost ubiquitous, but not quite enough to risk leaving the cable at home.
  • iPod: I get my news via my nano, by listening to the Wall Street Journal This Morning podcast. It’s also great for playing workout videos to help me be less of a slug.

Sometimes with me:

  • Kindle: If I expect to have spare time to read, and if I’m not hauling a lot of other gadgets. I love my Kindle, but sometimes I leave it behind to save on weight.
  • Garmin GPS: Only if I don’t know the area. If the Garmin comes with, so does a wall-plug charger. My Garmin doesn’t charge reliably from the car charger. (As I learned when it died 10 minutes outside Taylor, Texas one day.)
  • Plantronics Bluetooth: I actually like the Jawbone better, but it’s too small. After losing two of ‘em I decided to go with the Plantronics’ larger size.

What gadgets can’t you live without when you’re traveling? Which do you leave behind?

Blog Directions: A Year and a Half Later

Wow, time really does fly...especially when you're flying.

Last year I logged just over 100,000 miles on airplanes.  It's a little slower this year -- I've flown just under 40,000 so far.  And yes I'm tired.


My life's changed a lot since I began this blog.  I was a DBA for a regional university with a strong interest in all things data quality/security.  I've evolved into a sales engineer for IBM, working closely with DBAs, Architects, and Directors for customers around the country.

I'm still me, always interested in all things data, but the way I use my technical skills has changed over time.  I think it's time for this blog to do the same.

What does that mean?  Excellent question!  Maybe in a month or two I'll figure that out.  Penelope Trunk writes "Don't get hung up on topic. As in dating, you'll know when you've found one that's the right fit."  I'm at that stage again....

Since I travel so much you can count on seeing posts about the road warrior life.  I'm still a data geek, and will always include posts along that line.  Maybe some posts about life in a sales engineer role too...

More to come.