December 21, 2011
In most ways, Data Governance of Big Data is not different from normal Data Governance. The benefits are the same. The reasons for doing it are the same. And, mostly, what needs to be done is the same.
What is different about Big Data Governance is that it covers more data types, requires more sophisticated tools, and makes the collection of metadata critical.
First of all, Big Data Governance requires performing Governance over many different types of data, not just what’s in relational databases. Certainly, the scope needs to include non-relational databases and unstructured data and documents. This itself may require new tools to deal with these other technologies.
Secondly (and maybe this should be first because it is about data volumes), more sophisticated tools are needed to assess and profile data. Big Data volumes are beyond a humanly manageable scale, and the traditional approach of profiling and managing data primarily through observation becomes infeasible.
Thirdly, collecting and documenting metadata becomes critical in order to automate as many of the Data Governance activities as possible. This item is tied to the one above: more sophisticated tools can help to infer metadata about the relationships between the data, and that metadata is required to automate the monitoring activities.
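To make the last two points concrete, here is a minimal sketch of what automated profiling and metadata-driven monitoring might look like. It is an illustration only, not any particular tool's approach. The function names (`profile_records`, `find_violations`), the sample records, and the 0.6 null threshold are all assumptions made up for this example.

```python
from collections import Counter


def profile_records(records):
    """Infer simple metadata for each field across a list of dict records.

    Hypothetical helper: real profiling tools collect far more than this.
    """
    total = len(records)
    # Records may be semi-structured, so gather every field name seen.
    field_names = sorted({name for rec in records for name in rec})
    profile = {}
    for name in field_names:
        values = [rec.get(name) for rec in records]
        # A missing field and an explicit None are both treated as null.
        present = [v for v in values if v is not None]
        profile[name] = {
            "null_fraction": (total - len(present)) / total if total else 0.0,
            "distinct_count": len({repr(v) for v in present}),
            "types": dict(Counter(type(v).__name__ for v in present)),
        }
    return profile


def find_violations(profile, max_null_fraction):
    """Use the collected metadata to flag fields with too many nulls."""
    return sorted(
        name
        for name, stats in profile.items()
        if stats["null_fraction"] > max_null_fraction
    )


# Made-up sample data with inconsistent fields, for illustration only.
sample_records = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 3, "note": "free text"},
    {"customer_id": 3, "email": "c@example.com"},
]

sample_profile = profile_records(sample_records)
flagged_fields = find_violations(sample_profile, max_null_fraction=0.6)
```

The profile is itself metadata: once it has been collected automatically, a monitoring rule such as the null threshold can run against it without anyone having to inspect the data by eye.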
In summary, the strategic reasons for doing Data Governance and the way a Big Data Governance organization is structured remain the same, but how the Data Governance of Big Data is actually performed may be very different.
December 6, 2011
The Data Management Association (DAMA) is now offering a Data Governance Certification as an option within its current Certified Data Management Professional certification. This is a natural extension, since the test for Data Governance already existed under the current certification process and merely requires a specific configuration of test modules. But what does Data Governance certification mean, and is that really what is needed?
The Data Governance certification offered by DAMA is, to a great extent, based on the Data Governance practice area described in the DAMA Data Management Body of Knowledge document (DMBOK), which was published in 2009. That practice area focuses on the best practices for a Data Governance program and organization: what activities it should be performing, what tools it should be using, and what roles and responsibilities should be present.
But do we need to be certifying that people know how to set up a Data Governance program? Should we instead be focusing on the people who actually perform Data Governance for an organization, the Data Stewards, and on what they should be doing?
Certifying Data Stewards may not be something that should be done generically. Rather, an organization may want to certify that its identified Data Stewards are knowledgeable in the agreed standard operating procedures for Data Stewards in that particular organization.
In summary, a Data Governance certification that identifies individuals who are familiar with how, in general, a Data Governance organization should be created and operated makes sense. It makes more sense for an organization to certify its Data Stewards on the particular processes unique to that organization.