<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>thoughtsondata</title>
	<atom:link href="http://thoughtsondata.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://thoughtsondata.com</link>
	<description>My thoughts on Data Management</description>
	<lastBuildDate>Thu, 25 Apr 2013 15:44:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='thoughtsondata.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>thoughtsondata</title>
		<link>http://thoughtsondata.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://thoughtsondata.com/osd.xml" title="thoughtsondata" />
	<atom:link rel='hub' href='http://thoughtsondata.com/?pushpress=hub'/>
		<item>
		<title>Drivers for Managing Data Integration – from Data Conversion to Big Data</title>
		<link>http://thoughtsondata.com/2013/04/25/drivers-for-managing-data-integration-from-data-conversion-to-big-data/</link>
		<comments>http://thoughtsondata.com/2013/04/25/drivers-for-managing-data-integration-from-data-conversion-to-big-data/#comments</comments>
		<pubDate>Thu, 25 Apr 2013 15:44:47 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Data Archiving]]></category>
		<category><![CDATA[Data Conversion]]></category>
		<category><![CDATA[Data Integration]]></category>
		<category><![CDATA[Data Management]]></category>
		<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Master Data Management]]></category>
		<category><![CDATA[Technology Strategy]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=124</guid>
		<description><![CDATA[Data management in an organization is focused on getting data to its data consumers (whether human or application). Whereas the goal of data quality and data governance is trusted data, the goal of data integration is available data – getting data to the data consumers in the format that is right for them. My new [...]]]></description>
				<content:encoded><![CDATA[<p>Data management in an organization is focused on getting data to its data consumers (whether human or application). Whereas the goal of data quality and data governance is <strong>trusted data</strong>, the goal of data integration is <strong>available data </strong>– getting data to the data consumers in the format that is right for them.<br />
My new book on Data Integration has been published and is now available: <a href="http://store.elsevier.com/product.jsp?isbn=9780123971678" target="_blank">“Managing Data in Motion: Data Integration Best Practice Techniques and Technologies”</a>. Of course the first part of a book on data management techniques has to answer the question of why an organization should invest time and effort and money. The drivers for data integration solutions are very compelling.<br />
<strong>Supporting Data Conversion</strong><br />
One very common need for data integration techniques arises when copying or moving data from one application or data store to another, either when replacing an application in a portfolio or when seeding the data needed for an additional application implementation. The data must be formatted appropriately for the new application data store, in both its technical format and its semantic business meaning.<br />
<strong>Managing the Complexity of Data Interfaces by Creating Data Hubs – MDM, Data Warehouses &amp; Marts, Hub &amp; Spoke</strong><br />
This, I think, is the most compelling reason for an organization to have an enterprise data integration strategy and architecture: hubs of data significantly simplify the problem of managing the data flowing between the applications in an organization. The number of potential interfaces between applications in an organization is a quadratic function of the number of applications: n applications can have up to n(n-1)/2 point-to-point interfaces. Thus, an organization with one thousand applications could have as many as half a million interfaces, if all applications had to talk to all others. By using hubs of data, an organization brings the potential number of interfaces down to be just a linear function of the number of applications.<br />
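The interface counts above can be checked with a little arithmetic. The following is a minimal sketch in Python, not taken from the book; the function names are hypothetical, and it assumes each interface is a two-way link either between a pair of applications (point-to-point) or between one application and the hub.<br />

```python
def point_to_point_interfaces(n_apps: int) -> int:
    """Maximum number of pairwise interfaces if every application talks to every other."""
    return n_apps * (n_apps - 1) // 2


def hub_interfaces(n_apps: int) -> int:
    """Number of interfaces if every application connects only to a shared hub."""
    return n_apps


# Compare the two growth rates for a few portfolio sizes.
for n in (10, 100, 1000):
    print(n, point_to_point_interfaces(n), hub_interfaces(n))
```

For one thousand applications this gives 499,500 possible point-to-point interfaces, against 1,000 connections to a hub.<br />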
Master Data Management hubs are created to provide a central place for all applications in an organization to get their Master Data. Similarly, Data Warehouses and Data Marts give an organization one place to obtain all the data it needs for reporting and analysis.<br />
Data hubs that are not visible to the human data consumers of the organization can be used to significantly simplify the natural complexity of data interfaces. If data being passed around in the organization is formatted, on leaving the application where it has been updated, into a common data format for that type of data, then applications updating data only need to reformat data into one format, instead of a different format for every application that needs it. Applications that need to receive the data that has been updated only need to reformat the data from the one common format into the format they need. This approach to data integration architecture is called a “hub and spoke” approach. The structure of the common data format that all applications pass their data to and from is called the “canonical model.” Applications that want a certain kind of data “subscribe” to that data, and applications that provide a certain kind of data are said to “publish” the data.<br />
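To make the hub and spoke idea concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the DataHub class, the field names (CUST_NO, CUST_NM, customer_id, name), and the two example applications are all hypothetical, and a real hub would add queuing, error handling, and security.<br />

```python
from collections import defaultdict


class DataHub:
    """Hypothetical in-memory hub: routes canonical records to subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, data_type, handler):
        # An application registers interest in one kind of data.
        self._subscribers[data_type].append(handler)

    def publish(self, data_type, canonical_record):
        # Every subscriber receives the same canonical record.
        for handler in self._subscribers[data_type]:
            handler(canonical_record)


def crm_to_canonical(row):
    """Publishing side: reformat the source layout into the canonical model, once."""
    return {"customer_id": str(row["CUST_NO"]), "name": row["CUST_NM"].title()}


billing_store = []
marketing_store = []


def billing_from_canonical(record):
    """Subscribing side: reformat the canonical model into the billing layout."""
    billing_store.append((record["customer_id"], record["name"].upper()))


def marketing_from_canonical(record):
    """A second subscriber needs no change on the publishing side."""
    marketing_store.append({"id": record["customer_id"], "display": record["name"]})


hub = DataHub()
hub.subscribe("customer", billing_from_canonical)
hub.subscribe("customer", marketing_from_canonical)
hub.publish("customer", crm_to_canonical({"CUST_NO": 42, "CUST_NM": "ACME CORP"}))
print(billing_store, marketing_store)
```

The point of the design is that the publishing application writes one translation (into the canonical model) and each subscriber writes one translation (out of it), regardless of how many applications are attached to the hub.<br />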
<strong>Integrating Vendor Packages with an Organization’s Application Portfolio</strong><br />
Current best practice is to buy vendor packages rather than develop custom applications whenever possible. This exacerbates the data integration problem because each of these vendor packages will have its own master data that has to be integrated with the organization’s master data, and each will have to either send or receive transactional data for consolidated reporting and analytics.<br />
<strong>Sharing Data Among Applications and Organizations</strong><br />
Some data just naturally needs to flow between applications to support the operational processes of the organization. These days, that flow of data usually needs to be in a real time or near real time mode, and it makes sense to solve the requirements across the enterprise or across the applications that support the supply chain of data rather than developing independent solutions for each application.<br />
<strong>Archiving Data</strong><br />
The life cycle for data may not match the life cycle for the application in which it resides. Some data may get in the way if retained in the active operational application and some data may need to be retained after an application is retired, even if the data is not being migrated to another application. All enterprises should have an enterprise archiving solution available where data can be housed and from which it can still be retrieved, even if the application from which it was taken no longer exists.<br />
Moving data out of an application data store and restructuring it for an enterprise archiving solution is an important data integration function.<br />
<strong>Leveraging Externally Available Data</strong><br />
A great deal of data is now available from government and other sites external to a company’s own, some for free and some for a fee. To leverage the value of what is available, the external data needs to be made available, in an appropriate format, to the data consumers who can use it. The amount of data now available is so vast and arrives so fast that it may not be warranted to store or persist the external data; instead, an organization can use data virtualization and streaming techniques, or keep the data outside the organization by leveraging cloud solutions that are also external.<br />
<strong>Integrating Structured and Unstructured Data</strong><br />
New tools and techniques allow analysis of unstructured data such as documents, web sites, social media feeds, audio, and video data. The analysis is most meaningful when structured data (found in databases) can be integrated with the unstructured data types listed above. Data integration techniques and new technologies such as data virtualization servers enable the integration of structured and unstructured data.<br />
<strong>Supporting Operational Intelligence and Management Decision Support</strong><br />
Using data integration to leverage big data includes not just mashing different types of data together for analysis, but being able to use data streams with that big data analysis to trigger alerts and even automated actions. Example use cases exist in every industry but some of the ones we’re all aware of include monitoring for credit card fraud as well as recommending products.</p>
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2013/04/25/drivers-for-managing-data-integration-from-data-conversion-to-big-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>Is High Availability Sexy?</title>
		<link>http://thoughtsondata.com/2013/04/10/is-high-availability-sexy/</link>
		<comments>http://thoughtsondata.com/2013/04/10/is-high-availability-sexy/#comments</comments>
		<pubDate>Wed, 10 Apr 2013 19:42:05 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Business Continuity]]></category>
		<category><![CDATA[Data Management]]></category>
		<category><![CDATA[Technology Strategy]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=120</guid>
		<description><![CDATA[The subject of business continuity has grown in appeal for me as my years in IT have grown, especially as I have personally experienced disasters big and small and the need to recover systems and facilities. I became particularly interested during my training as an IT auditor. The area of business continuity isn’t necessarily a [...]]]></description>
				<content:encoded><![CDATA[<p>The subject of business continuity has grown in appeal for me as my years in IT have grown, especially as I have personally experienced disasters big and small and the need to recover systems and facilities.  I became particularly interested during my training as an IT auditor.<br />
The area of business continuity isn’t necessarily a “sexy” part of data management, but it is a franchise requirement for most organizations and corporations and certainly critical for financial services organizations.  Interestingly, the responsibility for business continuity is a business responsibility and yet the knowledge and training for how to implement it is a specialty within information technology (IT).  I call this “the tail wagging the dog” because the responsibility cannot be delegated to IT and yet IT needs to lead the process of how to implement it.<br />
The way we implement business continuity is using techniques in disaster recovery and high availability.  Disaster recovery is how systems and access are brought back up after the loss of power, services, or access to a facility.  High availability is a similar concept, except that it maintains system continuity by switching to alternative resources automatically at the loss of any resource, system, connection, facility, etc.<br />
The rule of thumb with business continuity is that the lower the amount of time of any disruption at the loss of a resource, the higher the cost.  Thus, a high availability solution that has no disruption has the highest cost.  Organizations that require high availability solutions are therefore frequently spending millions of dollars on their disaster recovery solutions and millions more on their high availability solutions.<br />
EMC has recently released a new high availability services product and is now asking the question “why invest in both disaster recovery and high availability?” <a href="http://www.emc.com/about/news/press/2013/20130212-01.htm" title="EMC High Availability Offering">http://www.emc.com/about/news/press/2013/20130212-01.htm</a><br />
Maybe organizations that require high availability should put their business continuity budgets into that rather than dividing between both high availability and disaster recovery.  Well, it may not be such a simple answer.  Should every single application in the organization be set up with high availability?  And yet, dividing systems between continuity solutions makes testing and effecting business continuity much more difficult.  As long as the organization can prove it has a high availability solution for everything, that solution would serve any necessary disaster recovery requirements.<br />
OK.  High availability isn’t sexy.  But to me, it is slightly sexier than disaster recovery.  Certainly it is time to consider whether it is more cost effective to put the entire business continuity budget into high availability.</p>
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2013/04/10/is-high-availability-sexy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>Master Data in Big Data Management</title>
		<link>http://thoughtsondata.com/2013/04/03/master-data-in-big-data-management/</link>
		<comments>http://thoughtsondata.com/2013/04/03/master-data-in-big-data-management/#comments</comments>
		<pubDate>Wed, 03 Apr 2013 18:42:06 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Data Governance]]></category>
		<category><![CDATA[Data Management]]></category>
		<category><![CDATA[Master Data Management]]></category>
		<category><![CDATA[Metadata]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=116</guid>
		<description><![CDATA[Currently, most data management activities are segregated by data type: documents are kept in one type of file repository, emails in another, structured data in databases, etc. One of the goals and values of big data management is being able to analyze data across these repositories, but if so then how do we link the [...]]]></description>
				<content:encoded><![CDATA[<p>Currently, most data management activities are segregated by data type: documents are kept in one type of file repository, emails in another, structured data in databases, etc. One of the goals and values of big data management is being able to analyze data across these repositories, but if so then how do we link the data together?  A big part of the answer, I believe, is master data.  Master data is the data describing the important things in the organization: customers, products, employees, organizational structure, financial reporting structure, etc. </p>
<p>People with appropriate access in the organization should be able to view, for a given customer, not only the customer&#8217;s name, addresses, and other demographic information, but also emails to, from, and concerning the customer, documents related to them, audio recordings of any calls to customer service, and video of the customer visiting the organization&#8217;s offices. All the organizational information about a customer can be made appropriately available to customer service to support a customer inquiry, to identify additional products in which the customer might be interested, or to predict likely future behavior.</p>
<p>Standard business intelligence tools can be used to find and connect information about a customer located in databases.  Tools that search text can be used to find information related to a customer in document and email repositories either because these items contain text with the customer&#8217;s identifying information or because someone has tagged the documentation with the customer&#8217;s id.  Similarly, audio and video files can be searched for the customer likeness or tagged manually with customer information. Links to a customer can be made at the time the information is stored or dynamically when a query is made about the customer.  Tagging files and documents with customer identifiers can be performed automatically or manually.  The ability to attach the customer information automatically is critical to big data management since the volume of data is usually beyond human manageable scale and we need to move away from the concept of manually crafted metadata.</p>
<p>And so, if the data in databases is called &#8220;structured&#8221; with keys associated with the master data in the organization, then we can integrate that data together with the &#8220;unstructured&#8221; data in files, documents, and emails by tagging the unstructured data with the key master data information, automatically and manually, at storage and at query time.</p>
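<p>As a rough illustration of automatic tagging, the Python sketch below attaches master data keys to pieces of text. Everything in it is hypothetical: the customer ids, the identifying strings, and the tag_document function. It assumes simple case-insensitive substring matching, whereas a real solution would likely need fuzzy matching or entity resolution.</p>

```python
# Hypothetical master data: customer id -> strings that identify the customer.
CUSTOMER_MASTER = {
    "C001": ["Jane Doe", "jane.doe@example.com"],
    "C002": ["Acme Corp"],
}


def tag_document(text, master):
    """Return the sorted customer ids whose identifying strings appear in the text."""
    lowered = text.lower()
    return sorted(
        customer_id
        for customer_id, identifiers in master.items()
        if any(identifier.lower() in lowered for identifier in identifiers)
    )


email_body = "Forwarded from jane.doe@example.com regarding the Acme Corp invoice."
print(tag_document(email_body, CUSTOMER_MASTER))
```

<p>The same function could run when a document is stored, to persist the tags, or when a query is made, to link documents dynamically.</p>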
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2013/04/03/master-data-in-big-data-management/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>Don’t Get Caught in the Statistical Cobwebs of Data Quality</title>
		<link>http://thoughtsondata.com/2012/11/14/dont-get-caught-in-the-statistical-cobwebs-of-data-quality/</link>
		<comments>http://thoughtsondata.com/2012/11/14/dont-get-caught-in-the-statistical-cobwebs-of-data-quality/#comments</comments>
		<pubDate>Wed, 14 Nov 2012 19:16:22 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Data Conversion]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Project Management]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=114</guid>
		<description><![CDATA[A couple of days ago, I heard about a data conversion project where the team was taking a statistical sample of source master data and cleaning it.  The discussion I heard was on how big a sample needed to be taken in order to have a 5% margin of error and continued through a series [...]]]></description>
				<content:encoded><![CDATA[<p>A couple of days ago, I heard about a data conversion project where the team was taking a statistical sample of source master data and cleaning it.  The discussion I heard was on how big a sample needed to be taken in order to have a 5% margin of error and continued through a series of issues about statistics.  This discussion has brought me up short because sometimes we get so caught up in the mathematics and techniques of our processes that we lose the basic understanding of when different techniques are appropriate.</p>
<p>I applaud the fact that the data conversion project in question had enough foresight to include a data quality stream, certainly not always the case.  Besides the fact that we don’t always know the level of quality of our production data in systems that have been running for many years, it is frequently true that data may have to be additionally cleaned in order for it to be in a state sufficient for the running of a new system to which the data is to be migrated.  The standard method to assess the quality level of the data in the source systems is to take a statistical sample of the source data and assess whether the quality level is sufficient for the target system.  Once we’ve determined how much cleaning of the sample data is necessary to get it into proper shape, we can extrapolate the estimate across the entire set of master data in order to determine how much in time and resources would be necessary to clean the master data.</p>
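<p>The estimating method described above can be sketched in a few lines of Python. This is an illustration only: it assumes the standard sample-size formula for estimating a proportion (95% confidence, worst-case proportion of 0.5, with a finite population correction), and the record counts, defect counts, and minutes per fix are made-up numbers, not figures from the project.</p>

```python
import math


def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Sample size for estimating a proportion, with finite population correction."""
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def estimate_cleaning_effort(population, sample_n, defects_in_sample, minutes_per_fix):
    """Extrapolate the sample's defect rate to the whole dataset.

    Returns (estimated defective records, estimated hours of cleaning).
    """
    estimated_defects = round(population * defects_in_sample / sample_n)
    return estimated_defects, estimated_defects * minutes_per_fix / 60


population = 100_000                      # hypothetical number of master records
n = sample_size(population)               # how many records to inspect
defects, hours = estimate_cleaning_effort(population, n, defects_in_sample=77,
                                          minutes_per_fix=3)
print(n, defects, hours)
```

<p>Note what the result is: an estimate of how much work cleaning the whole dataset would take. The sample tells us the size of the job; it does not do the job.</p>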
<p>How does a method for statistically determining an estimate turn into the idea that we only need to clean the statistical sample of data? And even if one person accidentally skipped a few steps in specifying the process, why hasn’t anyone else realized that cleaning a statistical sample of data doesn’t make the entire dataset clean?  Somehow, an entire project team has been dazzled by the implementation of statistics, or no one really thought about it that hard because it wasn’t their job.  Anyway, if you clean a statistical sample of data then only that sample will be sufficiently clean for your target system; the rest of the data will be at the same quality level as at the start.</p>
<p>How do we prevent a problem like this?  I believe that a great deal of the problem is that most people like to be as far away from theoretical mathematical discussions as possible because, as Barbie used to say: “Math is hard”.  I think it is important, however, that people with common sense ask questions about project planning and process, even if they don’t understand the complexity of a technical design or approach.  Even very complex issues like encryption and business continuity need to make sense in their implementation and can easily be applied in slightly wrong ways that lose the intent. The economist John Kenneth Galbraith proposed in the 1960s that technicians would take over the running of companies because business people wouldn’t understand what the technicians were talking about.  That did not happen because business managers with common sense insisted that the technicians explain the concepts sufficiently to their understanding, regardless of how long it took.  We need to continue that practice with even the implementation of statistics and mathematical concepts in project planning and data management.</p>
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2012/11/14/dont-get-caught-in-the-statistical-cobwebs-of-data-quality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>Big Data and NoSQL – the problem with relational databases</title>
		<link>http://thoughtsondata.com/2012/09/25/big-data-and-nosql-the-problem-with-relational-databases/</link>
		<comments>http://thoughtsondata.com/2012/09/25/big-data-and-nosql-the-problem-with-relational-databases/#comments</comments>
		<pubDate>Tue, 25 Sep 2012 19:05:00 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Data Management]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=110</guid>
		<description><![CDATA[The NoSQL movement, where “NoSQL” stands for “Not Only SQL,” is based on the concept that relational databases are not the right database solution for all problems.  Relational databases are so ubiquitous in most organizations these days that many people may not even be aware that there are other types of databases, let alone when [...]]]></description>
				<content:encoded><![CDATA[<p>The NoSQL movement, where “NoSQL” stands for “Not Only SQL,” is based on the concept that relational databases are not the right database solution for all problems.  Relational databases are so ubiquitous in most organizations these days that many people may not even be aware that there are other types of databases, let alone when using another database might be preferable. Relational databases perform transaction update functions very well, particularly handling the difficult issues of consistency during update. Production strength relational databases can handle the complexity of two phase commit capability, where one business transaction affects multiple databases and tables, and all updates have to be effected at the same moment.</p>
<p>However, relational databases apply much of the same overhead required for complex update operations to every activity, and that can handicap them for other functions. Relational databases struggle with the efficiency of certain operations key to big data management.  Firstly, they don’t scale well to very large sizes, and although grid solutions can help with this problem, the creation of new clusters on the grid is not dynamic and large data solutions become very expensive using relational databases. Secondly, they don’t do unstructured data search very well (i.e., Google-type searching), nor do they handle data in unexpected formats well. Thirdly, but not lastly, it is difficult to implement certain kinds of basic queries using SQL and relational databases, such as the shortest path between two points.</p>
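<p>For contrast, here is what a shortest path query looks like when the data is held as a graph rather than as tables. This is a minimal Python sketch using breadth-first search; the graph, its node names, and the shortest_path function are hypothetical, and it assumes every connection has the same cost.</p>

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search: returns the path with the fewest hops, or None."""
    if start == goal:
        return [start]
    previous = {start: None}          # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in previous:
                continue              # already reached by a path at least as short
            previous[neighbor] = node
            if neighbor == goal:
                # Walk back from the goal to the start to rebuild the path.
                path = []
                current = goal
                while current is not None:
                    path.append(current)
                    current = previous[current]
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical directed graph stored as an adjacency list.
connections = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(shortest_path(connections, "A", "E"))
```

<p>The traversal follows connections outward from the starting point one hop at a time, which is a natural operation on a graph structure but an awkward one to express as joins over rows.</p>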
<p>Social networking and big data organizations such as Facebook, Yahoo, Google, and Amazon were among the first to decide that relational databases were not good solutions for the volumes and types of data that they were dealing with, hence the development of the Hadoop file system, the MapReduce programming model, and associated databases such as Cassandra and HBase.  One of the key capabilities of a Hadoop type environment is the ability to dynamically, or at least easily, expand the number of servers being used for data storage. Storing large amounts of data in a relational database gets very expensive: cost grows geometrically with the amount of data to be stored, reaching a limit in the petabyte range.  The cost of storing data in a Hadoop solution grows linearly with the volume of data and there is no ultimate limit.</p>
<p>I was a working programmer before relational databases were in common use.  Yes, we did have electricity back then.  And the databases I used were of the type called “hierarchical”.  In fact, they were more efficient, in general, for high volume individual transaction processing than relational databases, although like relational databases they were not good for data that was structured inconsistently.  But what we considered “high volume” then could be handled reasonably by my laptop now and those databases couldn’t handle dynamically allocating unlimited additional space, either.</p>
<p>In my next blog post I’ll describe some of the new classes of NoSQL databases and what problems they solve well.</p>
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2012/09/25/big-data-and-nosql-the-problem-with-relational-databases/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>Big Data Modeling  &#8211; part 2 – The Big Data Modeler</title>
		<link>http://thoughtsondata.com/2012/07/23/big-data-modeling-part-2-the-big-data-modeler/</link>
		<comments>http://thoughtsondata.com/2012/07/23/big-data-modeling-part-2-the-big-data-modeler/#comments</comments>
		<pubDate>Mon, 23 Jul 2012 21:21:37 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Data Integration]]></category>
		<category><![CDATA[Data Modeling]]></category>
		<category><![CDATA[Data Virtualization]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=108</guid>
		<description><![CDATA[Continuing my discussion of “Big Data Modeling,” what is it and is it any different from normal data modeling?  Ultimately, the questions come down to: is there a role for a modeler on Big Data projects and what does that role look like? Modeling for Communication - If modeling is the process of creating a [...]]]></description>
				<content:encoded><![CDATA[<p>Continuing my discussion of “Big Data Modeling,” what is it and is it any different from normal data modeling?  Ultimately, the questions come down to: is there a role for a modeler on Big Data projects and what does that role look like?</p>
<p>Modeling for Communication -</p>
<p>If modeling is the process of creating a simpler representation of something that does or might exist, we can use modeling to communicate information about something in a simpler way than presenting the thing itself.  After all, when describing a computer system we aren’t limited to presenting only the system itself; we present various models to communicate different aspects of what is or what might be.</p>
<p>Modeling Semantics -</p>
<p>On Big Data projects, as with all data-oriented projects, it is necessary to communicate logical and semantic concepts about the data involved in the project.  This may involve, but is not limited to, models presented in entity-relationship diagrams.  In fact, the data modeling needs are not even limited to the design of structures; they certainly include data flows, process models, and other kinds of models.  This would also include any necessary taxonomy and ontology models.</p>
<p>Modeling Design -</p>
<p>Prior to construction it is necessary to represent (design) the data structures needed for the persistent as well as the transitory data used in the project.  Persistent data structures include those in files or databases.  Transitory data structures include the messages and streams of data passing into and out of the organization as well as between applications.  For data received from other organizations or groups, this may mean receiving the structure definitions rather than designing them. This is, or is close to, the physical design level of the implementation, including the design of database tables and structures, file layouts, metadata tags, message layouts, data services, etc.</p>
<p>Modeling Virtual Layers -</p>
<p>There is a big movement in systems development toward virtualizing layers of the infrastructure, where the view presented to programmers or users may be different from the actual physical implementation.  This move toward creating virtual layers that can change independently is true in data design as well. It is necessary to design, or model, the presentation of information to the system’s users (client experience) and programmers independently of the modeling of the physical data structures. This is even more necessary for Big Data because it includes designing levels of virtualization for normalizing or merging data of different types into a consistent format.  In addition to modeling the virtual data layers, there is a need for translation from the physical data structures to the virtual level, such as between relational database structures and web service objects.</p>
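<p>As a rough illustration of a virtual layer, the Python sketch below presents one consistent customer view over two differently shaped physical sources. Everything in it is a hypothetical example of mine, not taken from any particular product: the column names, the document layout, and the <code>CustomerView</code> class are assumptions.</p>

```python
from dataclasses import dataclass

# Hypothetical physical rows, shaped the way two different stores might hold them.
RELATIONAL_ROWS = [
    {"CUST_ID": 101, "CUST_NM": "Acme Corp", "CTRY_CD": "US"},
]
DOCUMENT_RECORDS = [
    {"id": "202", "profile": {"name": "Globex", "country": "de"}},
]


@dataclass(frozen=True)
class CustomerView:
    """Virtual-layer object: the shape programmers and users see."""
    customer_id: int
    name: str
    country: str


def from_relational(row: dict) -> CustomerView:
    # Translate physical column names into the virtual model's names.
    return CustomerView(row["CUST_ID"], row["CUST_NM"], row["CTRY_CD"].upper())


def from_document(doc: dict) -> CustomerView:
    # Normalize types and casing so both sources share one format.
    profile = doc["profile"]
    return CustomerView(int(doc["id"]), profile["name"], profile["country"].upper())


def customer_views() -> list:
    """Merge both physical sources behind one consistent virtual view."""
    views = [from_relational(r) for r in RELATIONAL_ROWS]
    views += [from_document(d) for d in DOCUMENT_RECORDS]
    return sorted(views, key=lambda v: v.customer_id)
```

<p>The point of the design is that either physical source could change its layout and only its translation function would need to change; callers of <code>customer_views()</code> would be unaffected.</p>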
<p>Modeling Mappings and Transformations -</p>
<p>It is necessary in any design that involves the movement of data between systems, whether Big Data or not, to specify the lineage in the flow of data from physical data structure to physical data structure, including the mappings and transformation rules necessary from persistent data structure to message to persistent data structure.  This level of design requires an understanding of both the physical implementation and the business meaning of the data. We usually call this activity design rather than modeling.</p>
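<p>A mapping specification of this kind can be written down as data. The sketch below is a minimal Python example under my own assumptions: the field names and the three transformation rules are hypothetical, and a real specification would also carry data types, defaults, and error handling.</p>

```python
# Hypothetical source-to-target mapping: each target field names its source
# field and the transformation rule applied on the way.
MAPPINGS = [
    {"target": "customer_name", "source": "CUST_NM", "rule": str.strip},
    {"target": "country_code", "source": "CTRY", "rule": str.upper},
    {"target": "balance_cents", "source": "BAL", "rule": lambda v: round(float(v) * 100)},
]


def transform(source_record: dict) -> dict:
    """Apply every mapping rule to one source record."""
    return {m["target"]: m["rule"](source_record[m["source"]]) for m in MAPPINGS}


def lineage() -> dict:
    """Report which source field feeds each target field."""
    return {m["target"]: m["source"] for m in MAPPINGS}
```

<p>Keeping the mappings in one structure means the same definition drives both the transformation and the lineage report, so the two cannot drift apart.</p>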
<p><em>Ultimately, there is a lot of work for a data modeler on Big Data projects, although little of it may look like creating entity-relationship models.  There is a need to create models for communicating ideas, for designing physical implementation solutions, for designing levels of virtualization, and for mapping between these models and designs.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2012/07/23/big-data-modeling-part-2-the-big-data-modeler/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>Big Data Modeling  &#8211; part 1 – Defining “Big Data” and “Data Modeling”</title>
		<link>http://thoughtsondata.com/2012/07/15/big-data-modeling-part-1-defining-big-data-and-data-modeling/</link>
		<comments>http://thoughtsondata.com/2012/07/15/big-data-modeling-part-1-defining-big-data-and-data-modeling/#comments</comments>
		<pubDate>Sun, 15 Jul 2012 16:35:55 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Data Integration]]></category>
		<category><![CDATA[Data Management]]></category>
		<category><![CDATA[Data Modeling]]></category>
		<category><![CDATA[Data Virtualization]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=105</guid>
		<description><![CDATA[Last month I participated in a DataVersity webinar on Big Data Modeling.  There are a lot of definitions necessary in that discussion. What is meant by Big Data? What is meant by modeling? Does modeling mean entity-relationship modeling only or something broader? The term “Big Data” implies an emphasis on high volumes of data. [...]]]></description>
				<content:encoded><![CDATA[<p>Last month I participated in a DataVersity webinar on Big Data Modeling.  There are a lot of definitions necessary in that discussion. What is meant by Big Data? What is meant by modeling? Does modeling mean entity-relationship modeling only or something broader?</p>
<p>The term “Big Data” implies an emphasis on high volumes of data. What constitutes big volumes for an organization seems to depend on the organization and its history.  The Wikipedia definition of “Big Data” says that an organization’s data is “big” when it can’t be comfortably handled by on-hand technology solutions.  Since the current set of relational database software can comfortably handle terabytes of data and even desktop productivity software can comfortably handle gigabytes of data, “big” implies many terabytes at least.</p>
<p>However, the consensus on the definition of “Big Data” seems to follow the Gartner Group definition, which says that “Big Data” implies large volume, variety, and velocity of data.  Therefore, “Big Data” means not just data located in relational databases but files, documents, email, web traffic, audio, video, and social media, as well.  These various types of data provide the “variety”, which also covers not just data in an organization’s own data center but data in the cloud, data from external sources, and data on mobile devices.</p>
<p>The third aspect of “Big Data” is the velocity of data.  The ubiquity of sensor and global position monitoring information means that a vast amount of information is available at an ever-increasing rate from both internal and external sources.  How quickly can this barrage of information be processed?  How much of it needs to be retained and for how long?</p>
<p>What is “data modeling”? Most people seem to picture this activity as synonymous with “entity-relationship modeling”.  Is entity-relationship modeling useful for purposes outside of relational database design?  If modeling is the process of creating a simpler representation of something that does or might exist, we can use modeling to communicate information about something in a simpler way than presenting the thing itself. So modeling is used for communicating.  Entity-relationship modeling is useful for communicating information about the attributes of the data and the types of relationships allowed between the pieces of data.  This seems like it might be useful for communicating ideas outside of just relational databases.</p>
<p>Data modeling is also used to design data structures at various levels of abstraction from conceptual to physical. When we differentiate between modeling and design, we are mostly just differentiating between logical design and design closer to the physical implementation of a database. So data modeling is also useful for design.</p>
<p>In the next part of this blog I’ll get back to the question of “Big Data Modeling.”</p>
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2012/07/15/big-data-modeling-part-1-defining-big-data-and-data-modeling/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>Data Virtualization – Part 2 – Data Caching</title>
		<link>http://thoughtsondata.com/2012/06/10/data-virtualization-part-2-data-caching/</link>
		<comments>http://thoughtsondata.com/2012/06/10/data-virtualization-part-2-data-caching/#comments</comments>
		<pubDate>Sun, 10 Jun 2012 14:46:46 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Data Virtualization]]></category>
		<category><![CDATA[Technology Strategy]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=102</guid>
		<description><![CDATA[Another type of Data Virtualization that is less frequently discussed than Business Intelligence (see my previous blog) has to do with having data available in the computer’s memory, or as close to it as possible, in order to tremendously speed up both data access and update. A simplistic way of thinking about [...]]]></description>
				<content:encoded><![CDATA[<p>Another type of Data Virtualization that is less frequently discussed than Business Intelligence (see my previous blog) has to do with having data available in the computer’s memory, or as close to it as possible, in order to tremendously speed up both data access and update.</p>
<p>A simplistic way of thinking about the relative time to retrieve data is that if it takes a certain amount of time in nanoseconds to retrieve something in memory, then it will take something like 100,000 times that, or more, to retrieve data from disk (milliseconds). Depending on the infrastructure configuration, retrieving data over a LAN or from the internet may be ten to 1000 times slower than that. If I load my most heavily used data into memory in advance, or into something that behaves like memory, then my processing of that data should be sped up by multiple orders of magnitude.  Using solid state disk for heavily used data can bring access and update response times much closer to those of data in memory.  Computer memory and solid state drives, although not as cheap as traditional disk, are certainly substantially less expensive than they used to be.</p>
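<p>To make the idea concrete, here is a minimal Python sketch of a read-through cache sitting in front of a slower store. The class names are hypothetical, and the slow store only counts its reads rather than simulating real disk or network latency.</p>

```python
class SlowStore:
    """Stand-in for a disk or network source; it counts reads instead of sleeping."""

    def __init__(self, data: dict):
        self._data = data
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._data[key]


class ReadThroughCache:
    """Keeps retrieved values in memory so repeat reads skip the slow store."""

    def __init__(self, store: SlowStore):
        self._store = store
        self._memory = {}

    def preload(self, keys):
        # Load the most heavily used data into memory in advance.
        for key in keys:
            self._memory[key] = self._store.get(key)

    def get(self, key):
        if key not in self._memory:
            self._memory[key] = self._store.get(key)
        return self._memory[key]
```

<p>After <code>preload</code>, repeated calls to <code>get</code> for the same key never touch the slow store again. This sketch deliberately leaves out updates, eviction, and invalidation, which any real caching layer has to address.</p>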
<p>Memory caching of data can be done with traditional databases and sequential processing, and response time improvements of multiple orders of magnitude can be achieved.  However, really spectacular performance is possible if we combine memory caching with parallel computing and databases designed around data caching, such as GemFire.  This does require that we develop systems using these new technologies and approaches in order to take advantage of parallel processing and optimized data caching, but the results can be blazingly fast.</p>
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2012/06/10/data-virtualization-part-2-data-caching/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>People who Tweet about Data Management</title>
		<link>http://thoughtsondata.com/2012/04/30/people-who-tweet-about-data-management/</link>
		<comments>http://thoughtsondata.com/2012/04/30/people-who-tweet-about-data-management/#comments</comments>
		<pubDate>Mon, 30 Apr 2012 13:59:16 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Data Governance]]></category>
		<category><![CDATA[Data Management]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Data Security]]></category>
		<category><![CDATA[Master Data Management]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=99</guid>
		<description><![CDATA[Data Management &#38; Architecture Karen Lopez @datachick Neil Raden @NeilRaden Robin Bloor @robinbloor M. David Allen @mdavidallen Sue Geuens @suegeuens Mehmet Orun @DataMinstrel Alec Sharp @alecsharp Loretta Mahon Smith @silverdata Eva Smith @datadeva Corine Jasonius @DataGenie Peter Aiken @paiken Tony Shaw @tonyshaw Glenn Thomas @Warduke Bonnie O’Neil @bonnieoneil Rob Paller @RobPaller Pete Rivett @rivettp Charles [...]]]></description>
				<content:encoded><![CDATA[<p><span style="text-decoration:underline;">Data Management &amp; Architecture</span></p>
<p>Karen Lopez @datachick</p>
<p>Neil Raden @NeilRaden</p>
<p>Robin Bloor @robinbloor</p>
<p>M. David Allen @mdavidallen</p>
<p>Sue Geuens @suegeuens</p>
<p>Mehmet Orun @DataMinstrel</p>
<p>Alec Sharp @alecsharp</p>
<p>Loretta Mahon Smith @silverdata</p>
<p>Eva Smith @datadeva</p>
<p>Corine Jasonius @DataGenie</p>
<p>Peter Aiken @paiken</p>
<p>Tony Shaw @tonyshaw</p>
<p>Glenn Thomas @Warduke</p>
<p>Bonnie O’Neil @bonnieoneil</p>
<p>Rob Paller @RobPaller</p>
<p>Pete Rivett @rivettp</p>
<p>Charles T. Betz @CharlesTBetz</p>
<p>Tracie Larsen @RelatedStuff</p>
<p>Wayne Eckerson @weckerson</p>
<p>Julian Keith Loren @jkloren</p>
<p>Christophe @mydatanews</p>
<p>Steve Francia @spf13</p>
<p>Gorm Braavig @gormb</p>
<p>Jim Finwick @jimfinwick</p>
<p>Alexej Freund @alexej_freund</p>
<p>Corinna Martinez @Futureatti</p>
<p><span style="text-decoration:underline;">Data Quality</span></p>
<p>Jim Harris @ocdqblog &#8211; blog</p>
<p>David Loshin @davidloshin – blog</p>
<p>Rich Murnane @murnane</p>
<p>Daragh O Brien @daraghobrien</p>
<p>Jacqueline Roberts @JackieMRoberts</p>
<p>Steve Tuck @SteveTuck</p>
<p>Vish Agashe @VishAgashe</p>
<p>Julian Schwarzenbach @jschwa1</p>
<p>Henrik L. Sorensen @hlsdk</p>
<p><span style="text-decoration:underline;">MDM and Data Governance</span></p>
<p>Jill Dyche @jilldyche &#8211; blog</p>
<p>Charles Blyth @charlesblyth</p>
<p>Steve Sarsfield @stevesarsfield – blog</p>
<p>Dan Power @dan_power</p>
<p>Philip Tyler @tylep0</p>
<p><span style="text-decoration:underline;">Business Intelligence and Analytics</span></p>
<p>Marcus Borba @marcusborba</p>
<p>Tamara Dull @tamaradull</p>
<p>Claudia Imhoff @Claudia_Imhoff – blog</p>
<p>Scott Wallask @BI_expert</p>
<p>Peter Thomas @PeterJThomas – blog</p>
<p>Barney Finucane @bfinucane</p>
<p>Matt Winkleman @mattwinkleman</p>
<p>Stray_Cat @Stray_Cat</p>
<p>Brett2point0 @Brett2point0</p>
<p><span style="text-decoration:underline;">Risk Management</span></p>
<p>Peter Went @Bank_Risk</p>
<p>Joshua Corman @joshcorman</p>
<p>Michael Rasmussen @GRCPundit</p>
<p>Nenshad Bardoliwalla @nenshad</p>
<p>Gary Byrne @GRCexpert</p>
<p>Helmut Schindlwick @Schindwick</p>
<p><span style="text-decoration:underline;">Technology Companies and Data Organizations</span></p>
<p>Oracle @Oracle</p>
<p>DAMA International @DAMA_I</p>
<p>McKinsey on BT @mck_biztech</p>
<p>SmartData Collective @SmartDataCo</p>
<p>DataFlux InSight @Datafluxinsight</p>
<p>Gartner @Gartner_inc</p>
<p>TDWI @TDWI</p>
<p>Scientific  Computing @SciCom</p>
<p>Wearecloud @wearecloud</p>
<p>CloudCamp @cloudcamp</p>
<p>Panorama Software @PanoramaSW</p>
<p>Data Hole @datahole</p>
<p>BI Knowledge Base @biknowledgebase</p>
<p>EnterpriseArchitects @enterprisearchitects</p>
<p>DataQualityPro.com @dataqualitypro</p>
<p>RSA Archer eGRC @ArcherGRC</p>
<p>Exobox @Exobox_Security</p>
<p>EA_Consultant @EA_Consultant</p>
<p>Cloudbook @cloudbook</p>
<p>ID Experts @idexperts</p>
<p>IAIDQ @iaidq</p>
<p>EMC Forum @EMCForums</p>
<p>Data Junkies @datajunkies</p>
<p>True Finance Data @truefinancedata</p>
<p>Madam @TheMDMNetwork</p>
<p>IBM Initiate @IBMInitiate</p>
<p>Accelus_GRC @PaisleyGRC</p>
<p>DQ Asia Pacific @DQAsiaPacific</p>
<p>Data Guide @DataGuide</p>
<p>PCI PA-DSS Data @DataAssurant</p>
<p>DataFlux Corporation @DataFlux</p>
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2012/04/30/people-who-tweet-about-data-management/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
		<item>
		<title>Data Virtualization – Part 1 – Business Intelligence</title>
		<link>http://thoughtsondata.com/2012/03/26/data-virtualization-part-1-business-intelligence/</link>
		<comments>http://thoughtsondata.com/2012/03/26/data-virtualization-part-1-business-intelligence/#comments</comments>
		<pubDate>Mon, 26 Mar 2012 21:01:23 +0000</pubDate>
		<dc:creator>thoughtsondata</dc:creator>
				<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Data Management]]></category>
		<category><![CDATA[Data Virtualization]]></category>
		<category><![CDATA[Data Warehousing]]></category>

		<guid isPermaLink="false">http://thoughtsondata.com/?p=92</guid>
		<description><![CDATA[The big transformation that we’re all dealing with in technology today is virtualization.  There are many aspects to virtualization: infrastructure, systems, organization, office, applications. When you search on the internet for “data virtualization,” most of the references are regarding business intelligence and data warehousing uses.  In part 2 of this blog I will talk about [...]]]></description>
				<content:encoded><![CDATA[<p>The big transformation that we’re all dealing with in technology today is virtualization.  There are many aspects to virtualization: infrastructure, systems, organization, office, applications. When you search on the internet for “data virtualization,” most of the references are regarding business intelligence and data warehousing uses.  In part 2 of this blog I will talk about data virtualization and transaction processing.</p>
<p>Back in the day, when I used to build data warehouses (the 1990s and after), there was a concept of “federated data warehouses”, where data in the logical warehouse would be separated physically, either with the same schemas in multiple instances or with different types of data in different locations.  The thought was that the data would be physically separate but brought together in real time for reporting.  We also used to call that “data warehouses that don’t work”.  After all, the reason we created data warehouses in the first place was that we needed to instantiate the data consolidation in order to make the response time reasonable when trying to report on millions of records. No, really, the response time on these “federated data warehouse” systems used to be many minutes or more.</p>
<p>Now, however, the technologies involved have made huge leaps in capabilities.  The vendors have put thousands and thousands of man hours into making real-time integration and reporting work.  Many techniques enable these capabilities: specialized software and hardware (data appliances), query optimization, distributed processing and other optimization techniques, and hybrid solutions between pure virtualization and instantiation.  Specialized tuning is necessary, and the fastest solutions still involve instantiating the consolidated data in a central place.</p>
<p>Ultimately, having to do a project to physically incorporate new data into the data warehouse isn’t responsive enough to the business need for information.  It is better to have a short-term solution that allows for the quick incorporation of new data and then, if there is a continued need for the data in question and you want to speed up the response, possibly integrate the additional data into the physical data warehouse.</p>
<p>The problems being solved now for business intelligence and data virtualization include real-time data integration of multiple regional instances of a data warehouse, integrating data of different types and kinds, and integrating data from a data warehouse with big data and cloud data.  This enables much more responsive business intelligence and analytical solutions to business requests without always having to instantiate all data for analysis into a central, single, enterprise data warehouse.</p>
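<p>As a toy illustration of integrating regional instances at query time, the Python sketch below runs the same aggregate against two separate in-memory SQLite databases and combines the results without copying rows into a central store. The table, the sample rows, and the function names are hypothetical; commercial data virtualization products add query optimization and pushdown that this sketch does not attempt.</p>

```python
import sqlite3


def make_region(rows):
    """Create one in-memory regional warehouse with the shared schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def federated_totals(regions):
    """Run the same aggregate in every region and combine results at query time."""
    totals = {}
    for conn in regions:
        query = "SELECT product, SUM(amount) FROM sales GROUP BY product"
        for product, amount in conn.execute(query):
            totals[product] = totals.get(product, 0) + amount
    return totals


# Hypothetical regional instances; nothing is copied into a central store.
EAST = make_region([("widget", 10), ("gadget", 5)])
WEST = make_region([("widget", 7)])
```

<p>Calling <code>federated_totals([EAST, WEST])</code> pushes the aggregation down to each region and merges only the small result sets, which is the basic trade-off between virtualization and instantiating everything centrally.</p>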
]]></content:encoded>
			<wfw:commentRss>http://thoughtsondata.com/2012/03/26/data-virtualization-part-1-business-intelligence/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1116693bb8a98f2e622c5d40aa3af8c0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">thoughtsondata</media:title>
		</media:content>
	</item>
	</channel>
</rss>
