One of the streams that has my interest at Strata + Hadoop in Singapore this week is around safeguarding data in Big Data technologies. When I say safeguarding, I mean securing it, curating it, and sharing it properly.
Initially, Big Data technologies were developed for open, liberal use of data in specific use cases where access to, and manipulation of, the data stored in Big Data repositories was unfettered. These first use cases were typically related to e-marketing analytics: website analysis, sentiment analysis or text analytics. Now, newer use cases bring Big Data technologies under scrutiny as they are applied to data stores and management services in environments with stricter controls.
I attended a session with a panel of Chief Data Officers (CDOs) of major Singaporean and other ASEAN member banks. Their key message to the attendees was that data quality and data governance are paramount to enabling a data-driven business. Essentially, if these two areas are done well, then the CDO keeps operations running and the bank out of the papers. The CDOs said if they were doing their job right then they’d do themselves out of a job. Why? Their role is to make data just like electricity: everywhere, used by everything and without a Chief Electricity Officer running about making sure people are using it correctly. How is this utopian dream attainable?
Other sessions on safeguarding data that I attended confirmed my hunch that Big Data technologies, and the emerging utility tools that provide enterprise capabilities (e.g., data quality, data profiling, master data management, data lineage, data audit, metadata management, access authentication and authorisation), are still evolving to meet the production-ready requirements of most organisations. So, for now you have to stitch the tools together or roll your own. Thankfully, these needs are being met by “open source+” vendors such as Cloudera or utility tools like Trifacta, to name a few.
Conclusion? Don’t let the infancy of the enterprise-grade capabilities of these tools deter you from using them to make data as ubiquitous as electricity in your organisation.
For those residing in NZ, there’s more to share on this topic, and the NZ Strata + Hadoop contingent will do so in mid-to-late January 2017.