The cloudy future of data management & governance

[tl;dr The cloud enables novel ways of handling an expected explosion in data store types and instances, allowing stakeholders to know exactly what data is where at all times without human process dependencies.]

Data management & governance is a big and growing concern for organizations of all sizes. Effective data management is critical for compliance, resilience, and innovation.

Data governance is necessary to know what data you have, when you got it, where it came from, where it is being used, and whether it is of good quality or not.

While the field is relatively mature, the rise of cloud-based services and service-enabled infrastructure will, I believe, fundamentally change the nature of how data is managed in the future and enable greater agility if leveraged effectively.

Data Management Meta-Data

Data and application architects are concerned with ensuring that applications use the most appropriate data storage solution for the problem being solved. To better manage cost and complexity, firms tend to converge on a handful of data management standards (such as Oracle or SQL Server for databases; NFS or NTFS for filesystems; Netezza or Teradata for data warehousing; Hadoop/HDFS for data processing; etc.). Expertise is concentrated in central teams that manage provisioning, deployments, and operations for each platform. This introduces dependencies that project teams must plan around. It also requires forward planning and long-term commitment – so it is not particularly agile.

Keeping up with data storage technology is a challenge – technologies like key/value stores, graph databases, columnar databases, object stores, and document databases exist because they represent varying datasets in a way that is more natural for applications to consume, reducing or eliminating the ‘impedance mismatch’ between how applications view state and how that state is stored.

In particular, many datastore technologies are designed to scale up rather than out; i.e., the only way to make them perform faster is to add more CPU/memory, or faster IO hardware. While this keeps applications simpler, it requires significant forward planning and longer-term commitments to scale up, and is out of the control of application development teams. Cloud-based services can typically handle scale-out transparently, although applications may need to be aware of the data dimensions across which scale-out happens (e.g., sharding by primary key, etc).
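
To make the sharding point concrete, here is a minimal sketch of hash-based shard routing in Python. The function and shard count are purely illustrative – managed cloud datastores typically perform this routing internally from a declared partition key, but the application still has to choose a key that distributes load evenly.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a primary key to a shard using a stable hash.

    Illustrative only: a real managed service (e.g., DynamoDB)
    does this routing for you from the partition key.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Records for a given customer always land on the same shard, so the
# customer id is the data dimension the application must be aware of.
print(shard_for("customer-42", 8))
```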

On-premise, fulfilling provisioning requests for a new datastore is mostly ticket-driven, with fulfillment performed by humans rather than by software – which means an “infrastructure-as-code” approach is not feasible.

Data Store Manageability vs Application Complexity

Most firms decide that it is better to simplify the data landscape such that fewer datastore solutions are available, but to resource those solutions so that they are properly supported to handle business critical production workloads with maximum efficiency.

The trade-off is in the applications themselves: the data storage solutions available end up driving the application architecture, rather than the application architecture (i.e., the requirements) dictating the most appropriate data store solution – the choice that would result in the lowest impedance mismatch.

A typical example of an impedance mismatch is an object-oriented application (written in, say, C++ or Java) that uses a relational database. Here, object/relational mapping technologies such as Hibernate or GigaSpaces are used to map the application view of the data (which treats data as in-memory objects) to the relational view. These middle layers, while useful for naturally relational data, can be overly expensive to maintain and operate if what your application really needs is a more appropriate type of datastore (e.g., graph).
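
As a concrete illustration, here is a minimal sketch of such a mapping layer using SQLAlchemy, a Python analogue of Hibernate. The model and fields are purely illustrative; the point is that the application manipulates objects while the database sees rows.

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Customer(Base):
    # The application sees an in-memory object...
    __tablename__ = "customers"  # ...the database sees rows in this table.
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

# The ORM translates object operations into SQL behind the scenes.
with Session(engine) as session:
    session.add(Customer(name="Acme Ltd"))
    session.commit()
```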

This mismatch gets exacerbated in a microservices environment, where each microservice is responsible for its own persistence, and individual microservices are written in the language most appropriate for the problem domain. Typical imperative, object-oriented languages implementing transactional systems will lean heavily towards relational databases and ORMs, whereas applications dealing with multimedia, graphs, very large objects, or simple key/value pairs will not benefit from this architecture.

The rise of event-driven architectures (in particular, transactional ‘sagas’, and ‘aggregates’ from DDD) will also tend to move architectures away from ‘kitchen-sink’ business object definitions maintained in a single code-base into multiple discrete but overlapping schemas maintained by different code-bases, and triggered by common or related events. This will ultimately lead to an increase in the number of independently managed datastores in an organisation, all of which need management and governance across multiple environments.

For on-premise solutions, the pressure to keep the number of datastore options down, while dealing with an explosion in instances, is going to limit application data architecture choices, increase application complexity (to cope with datastore impedance mismatch), and reduce the benefits from migrating to a microservices architecture (shared datastores favor a monolithic architecture).

Cloud Changes Everything

So how does cloud fundamentally change how we deal with data management and governance? The most obvious benefit cloud brings is the variety of data storage services available, covering all the typical use cases applications need. Capacity and provisioning are no longer operational concerns, as they are handled by the cloud provider. Data store resource requirements can now be formulated in code (e.g., in CloudFormation, Terraform, etc).

This, in principle, allows applications (microservices) to choose the most appropriate storage solution for their problem domain, and to minimize the need for long-term forward planning.

Using code to specify and provision database services also has another advantage: cloud service providers typically offer the means to tag all instantiated services with your own meta-data. So you can define and implement your own data management tagging standards, and enforce these using tools provided by the cloud provider. These can be particularly useful when integrating with established data discovery tools, which depend on reliable meta-data sources. For example, tags can be defined based on a data ontology defined by the chief data office (see my previous article on CDO).
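
For example, here is a hedged sketch, using boto3, of provisioning a DynamoDB table with governance tags applied at creation time. The table name and tag vocabulary are assumptions, standing in for whatever ontology the chief data office defines.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Provision a datastore in code, with data management tags attached
# at creation time – no human ticket in the loop.
dynamodb.create_table(
    TableName="trade-events",  # hypothetical table
    AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
    Tags=[
        {"Key": "data-owner", "Value": "payments"},  # illustrative ontology terms
        {"Key": "data-classification", "Value": "confidential"},
        {"Key": "retention-period", "Value": "7y"},
    ],
)
```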

These mechanisms can be highly automated via service catalogs (such as AWS Service Catalog or ServiceNow), which allow compliant stacks to be provisioned without requiring developers to directly access the cloud provider's APIs.
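
A sketch of what that might look like with AWS Service Catalog via boto3 – the product and artifact identifiers are hypothetical, standing in for a pre-approved, pre-tagged stack:

```python
import boto3

sc = boto3.client("servicecatalog")

# Developers provision an approved stack rather than calling the
# underlying provider APIs directly.
sc.provision_product(
    ProductId="prod-documentdb",        # hypothetical catalog product
    ProvisioningArtifactId="pa-v1",     # approved template version
    ProvisionedProductName="orders-docstore",
    ProvisioningParameters=[
        {"Key": "Environment", "Value": "dev"},
    ],
)
```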

Let a thousand flowers bloom

The obvious downside to letting teams select their own storage solutions is the likely explosion of data stores – even if they are selected from a managed service catalog. But the expectation is that each distinct store would be relatively simple – at least compared to relational stores, which support many application use cases and queries in a single database.

In on-premise situations, data integration is also a real challenge – usually addressed by a myriad of ad-hoc jobs and processes whose purpose is to extract data from one system and send it to another (i.e., ETL). Usually no meta-data exists around these processes, except that afforded by proprietary ETL systems.

In the best-case integration scenarios, ‘glue’ data flows are implemented in enterprise service buses that generally have some form of governance attached – but this usually has the undesirable side-effect of introducing yet another dependency that development teams need to plan and resource. Ideally, teams want to be able to use ‘dumb’ pipes for messaging and to self-serve their message governance, such that enterprise data governance tools can still know what data is being published/consumed, and by whom.

Cloud provides two main game-changing capabilities for managing data complexity at scale. Specifically:

  • All resources that manage data can be tagged with appropriate meta-data – without needing to, for example, examine tables or know anything about the specifics of the data service. This can also extend to messaging services.
  • Serverless functions (e.g., AWS Lambda, Azure Functions, etc) can be used to implement ‘glue’ logic, and can themselves be tagged and managed in an automated way. Serverless functions can also be used to do more intelligent updates of data management meta-data – for example, updating a specific repository when a particular service is instantiated (see the sketch after this list). Serverless functions can be viewed as on-demand microservices which may have their own data stores – usually provided via a managed service.
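
As a minimal sketch of such glue logic – assuming a hypothetical AWS Lambda handler triggered by an EventBridge rule on CloudTrail CreateTable events, and a tagging standard of our own invention – the function below registers a newly created table and enforces baseline tags. The event shape and tag names are assumptions, not a definitive implementation.

```python
import boto3

tagging = boto3.client("resourcegroupstaggingapi")

def handler(event, context):
    # CloudTrail-sourced events carry the ARN of the newly created table
    # (path below assumed for illustration).
    table_arn = event["detail"]["responseElements"]["tableDescription"]["tableArn"]

    # Enforce the firm's data-management tagging standard on creation.
    tagging.tag_resources(
        ResourceARNList=[table_arn],
        Tags={"data-owner": "unassigned", "registered": "true"},
    )

    # A real implementation would also notify the enterprise data
    # catalog here, so governance metadata stays current automatically.
    return {"registered": table_arn}
```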

Data, Data Everywhere

By adopting a cloud-enabled microservice architecture, using datastore services provisioned by code, applying event-driven architecture, leveraging serverless functions, and engaging with the chief data officer on meta-data standards, it will be possible to have an unprecedented, up-to-date view of what data exists in an organization and where. It may even provide a static view of data in motion (through tagging queue and notification topic resources). The meta-data would be maintained via policies and rules implemented in service catalog templates and lambda functions triggered automatically by cloud configuration changes, so it would always be current and correct.

The CDO, as well as data and enterprise architects, would be the chief consumer of this metadata – either directly or as inputs into other applications, such as data governance tools, etc.

Conclusion

The ultimate goal is to avoid data management and governance processes which rely on reactive human (IT) input to maintain high-quality data management metadata. Reliable metadata can give rise to a whole new range of capabilities for stakeholders across the enterprise, and finally take IT out of the loop for business-as-usual data management queries, freeing up valuable resources for building even more data-driven applications.


Achieving modularity: functional vs volatility decomposition

Enterprise architecture is all about managing complexity. Many EA initiatives tend to focus on managing IT complexity, but there is only so much that can be done there before it becomes obvious that IT complexity is, for the most part, a direct consequence of enterprise complexity. To recap, complexity needs to be managed in order to maintain agility – the ability for an organisation to respond (relatively) quickly and efficiently to changes in markets, regulations or innovation, and to continue to do this over time.

Enterprise complexity can be considered to be the activities performed and resources consumed by the organisation in order to deliver ‘value’, a metric usually measured through the ability to maintain (financial) income in excess of expenses over time.

Breaking down these activities and resources into appropriate partitions that allow holistic thinking and planning to occur is one of the key challenges of enterprise architecture, and there are various techniques to do this.

Top-Down Decomposition

The natural approach to decomposition is to first understand what an organisation does – i.e., what are the (business) functions that it performs. Simply put, a function is a collection of data and decision points that are closely related (e.g., ‘Payments’ is a function). Functions typically add little value in and of themselves – rather they form part of an end-to-end process that delivers value for a person or legal entity in some context. For example, a payment on its own means nothing: it is usually performed in the context of a specific exchange of value or service.

So a first course of action is to create a taxonomy (or, more accurately, an ontology) to describe the functions performed consistently across an enterprise. Then, various processes, products or services can be described as a composition of those functions.

If we accept (and this is far from accepted everywhere) that EA is focused on information systems complexity, then EA is not responsible for the complexity relating to the existence of processes, products or services. The creation or destruction of these are usually a direct consequence of business decisions. However, EA should be responsible for cataloging these, and ensuring these are incorporated into other enterprise processes (such as, for example, disaster recovery or business continuity processes). And EA should relate these to the functional taxonomy and the information systems architecture.

This can get very complex very quickly due to the sheer number of processes, products and services – including their various variations – most organisations have. So it is important to partition or decompose the complexity into manageable chunks to facilitate meaningful conversations.

Enterprise Equivalence Relations

One way to do this at the enterprise level is to group functions into partitions (aka domains) according to synergy or autonomy (as described by Roger Sessions), for all products/services supporting a particular business. This approach is based on the mathematical concept of equivalence. Because different functions in different contexts may have differing equivalence relationships, functions may appear in multiple partitions. One role of EA is to assess and validate whether those functions are actually autonomous or whether there is the potential to group apparently duplicate functions into a new partition.
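
For reference, a minimal statement of the underlying mathematics: an equivalence relation on a set of functions is precisely what induces a partition of that set into disjoint domains.

```latex
\text{A relation } \sim \text{ on a set } F \text{ of functions is an equivalence relation when:}
f \sim f \quad \text{(reflexive)}, \qquad
f \sim g \implies g \sim f \quad \text{(symmetric)}, \qquad
(f \sim g) \wedge (g \sim h) \implies f \sim h \quad \text{(transitive)}.
\text{It partitions } F \text{ into disjoint classes } [f] = \{\, g \in F : g \sim f \,\}.
```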

Once partitions are identified, it is possible to apply ‘traditional’ EA thinking to a particular partition, because that partition is of a manageable size. By ‘traditional’ EA, I mean applying Zachman, TOGAF, PEAF, or any of the myriad methodologies/frameworks that are out there. More specifically, at that level, it is possible to establish a meaningful information systems strategy or goal for a particular partition that is directly supporting business agility objectives.

The Fallacy of Functional Decomposition

Once you get down to the level of a partition, functional decomposition becomes less useful for architecting solutions. The natural tendency for architects would be to build reusable components or services that realise the various functions that comprise the partition. In fact, this may be the wrong thing to do. As Juval Löwy demonstrates in his excellent webinar, this may result in more complexity, not less (and hence less agility).

When it comes to software architecture, the real reason to modularise your architecture is to manage volatility or uncertainty – and to ensure that volatility in one part of the architecture does not unnecessarily negatively impact another part of the architecture over time. Doing this allows agility to be maintained, so volatile parts of the application can, in fact, change frequently, at low impact to other parts of the application.

When looking at a software architecture through this lens, a quite different set of components/modules/services may become evident than those which would be obvious using functional decomposition – the example in the webinar demonstrates this very well. A key argument used by Juval in his presentation is that (to paraphrase him somewhat) functions are, in general, highly dependent on the context in which they are used, so splitting them out into separate services may require making often impossible assumptions about all possible contexts in which the functions could be invoked.

In this sense, identified components, modules or services can be considered to be providing options in terms of what is done, or how it is done, within the context of a larger system with parts of variable volatility. (See my earlier post on real options in the context of agility to understand more about options in this context.)

Partitions as Enterprise Architecture

When each partition is considered with respect to its relationship with other partitions, there is a lot of uncertainty around how different partitions will evolve. To allow for maximum flexibility, every partition should assume each other partition is a volatile part of their architecture, and design accordingly for this. This allows each partition to evolve (reasonably) independently with minimum fixed co-ordination points, without compromising the enterprise architecture by having different partitions replicate the behaviours of partitions they depend on.

This then allows:

  • Investment to be expressed in terms of impact to one or more partitions
  • Partitions to establish their own implementation strategies
  • Agile principles to be agreed on a per partition basis
  • Architectural standards to be agreed on a per partition basis
  • Partitions to define internally reusable components relevant to that partition only
  • Partitions to expose partition behaviour to other partitions in an enterprise-consistent way

In generative organisation cultures, partitions do not need to be organisationally aligned. However, in other organisation cultures (pathological or bureaucratic), alignment of enterprise infrastructure functions such as IT or operations (at least) with partitions (domains) may help accelerate the architectural and cultural changes needed – especially if coupled with broader transformations around investment planning, agile adoption and enterprise architecture.


Achieving Agile at Scale

[tl;dr Scaling agile at the enterprise level will need rethinking how portfolio management and enterprise architecture are done to ensure success.]

Agility, as a concept, is gaining increasing attention within large organisations. The idea that business functions – and in particular IT – can respond quickly and iteratively to business needs is an appealing one.

The reasons why agility is getting attention are easy to spot: larger firms are getting more and more obviously unagile – i.e., the ability of business functions to respond to business needs in a timely and sustainable manner is getting progressively worse, even as a rapidly evolving competitive and technology-led commercial environment is demanding more agility.

Couple that with the heavy cost of failing to meet ever increasing regulatory compliance obligations, and ‘agile’ seems a very good idea indeed.

Agile is a great idea, but when implemented at scale (in large enterprise organisations), it can actually reduce enterprise agility, rather than increase it, unless great care is taken.

This is partly because Agile’s origins come from developing web applications: in these scenarios, there is usually a clear customer, a clear goal (to the extent that the team exists in the first place), and relatively tight timelines that favour short or non-existent analysis/design phases. Agile is perfect for these scenarios.

Let’s call this scenario ‘local agile’. It is quite easy to imagine a situation where every team, in response to the question ‘are you doing agile?’, says ‘Yes, we are!’. So if every team is doing ‘local agile’, does that mean your organisation is now ‘agile’?

The answer is No. Getting every team to adopt agile practices is a necessary but insufficient step towards achieving enterprise agility. In particular, two key factors need to be addressed before a firm can truly be said to be ‘agile’ at the enterprise level. These are:

  1. The process by which teams are created and funded, and
  2. Enterprise awareness

Creating & Funding (Agile) Teams

Historically, teams have usually been created as a result of projects being initiated: the project passes investment justification criteria, the project is initiated, and a team is put in place, led by the project manager. This process has been owned entirely by the IT organisation, irrespective of which other organisations were stakeholders in the project.

At this point, IT’s main consideration is, will the project be delivered on time and on budget? The business sponsor’s main consideration is, will it give us what we need when we need it? And the enterprise’s consideration (which is often ignored) is who is accountable for ensuring that the IT implementation delivers value to the enterprise. (In this sense, the ‘enterprise’ could be either a major business line with full P&L responsibility for all activities performed in support of their business, or the whole organisation, including shared enterprise functions).

Delivering ‘value’ is principally about ensuring that on-going or operational processes, roles and responsibilities are adjusted to maximise the benefits of a new technology implementation – which could include organisational change, marketing, customer engagement, etc.

However, delivering ‘value’ is not always correlated with a single IT implementation; value can be derived from leveraging multiple IT capabilities in concert. Given the complexity of large organisations, it is often neither desirable nor feasible to have a single IT partner be responsible for all the IT elements that collectively deliver business value.

On this basis, it is evident that how businesses plan and structure their portfolio of IT investments needs to change dramatically. In particular,

  1. The business value agenda must be outcome-focused and explicit about which IT capabilities are required to enable it, and
  2. IT investment must be focused around the capability investment lifecycle that IT is responsible for stewarding.

In particular, ‘capabilities’ (or IT products or services) have a lifecycle: this affects the investment and expectations around those capabilities. And some capabilities need to be more ‘agile’ than others – some must be agile to be useful, whereas for others, stability may be the overriding priority, and therefore their lack of agility must be made explicit – so agile teams can plan around that.

Enterprise Awareness

‘Locally’ agile teams are a step in the right direction – particularly if the business stakeholders all agree they are seeing the value from that agility. But often this comes at the expense of enterprise awareness. In short, agility in the strict business sense can often only deliver results by ignoring some stakeholders’ interests. So ‘locally’ agile teams may feel they must minimise their interactions with other teams – particularly if those teams are not themselves agile.

If we assume that teams have been created through a process as described in the previous section, it becomes more obvious where the team sits in relation to its obligations to other teams. Teams can then make appropriate compromises to their architecture, planning and agile SDLC to allow for those obligations.

If the team was created through ‘traditional’ planning processes, then it becomes a lot harder to figure out what ‘enterprise awareness’ is appropriate (except perhaps for IT-imposed standards or gates, which only contribute indirectly to business value).

Most public agile success stories describe very well how they achieved success up to – but not including – the point at which architecture becomes an issue. Architecture, in this sense, refers either to parts of the solution architecture which can no longer be delivered via one or two members of an agile team, or to those parts of the business value chain that cannot be entirely delivered via the agile team on its own.

However, there are success stories (e.g., Spotify) that show how ‘enterprise awareness’ can be achieved without limiting agility. For many organisations, transitioning from existing organisation structures to new ‘agile-ready’ structures will be a major challenge, and far harder than simply having teams ‘adopt agile’.

Conclusions

With the increased attention on Agile, there is fortunately increased attention on scaling agile. Methodologies like Disciplined Agile Delivery (DAD) and Large-Scale Scrum (LeSS), coupled with portfolio concepts like the Scaled Agile Framework (SAFe), propose ways in which Agile can scale beyond the team and up to the enterprise level, without losing the key benefits of the agile approach.

All scaled agile methodologies call for changes in how Portfolio Management and Enterprise Architecture are typically done within an organisation, as doing these activities right is key to the success of adopting Agile at scale.


The Meaning of Data

[tl;dr The Semantic Web may be a CDO’s best friend in their efforts to help the business realise the full value of enterprise data.]

Large organisations with complex legacy infrastructure are faced with a dilemma: the technology that allowed them to grow and capture market share has reached a turning point in terms of the value returned from additional spending. The only practical investments that can be made are to shore up (protect) existing investments by performing essential maintenance, focusing on security, and (perhaps) moving to lower-cost, cloud-based infrastructure. Mandatory (usually regulatory) enhancements also need to be done.

But in terms of protecting existing revenue or capturing new markets, legacy applications are often seen as a burden. Businesses are increasingly looking outside their core IT capability to address their needs – especially in technologies that can help pinpoint opportunities in the first place (i.e., analysing and processing the digital foot-prints left by customers and prospective customers, whether those foot-prints are internal to the firm or external).

Technology is both the problem and the solution. And herein lies the dilemma: internal IT organisations cannot possibly advise on all possible technology-enabled solutions for a business. Rather, businesses need to become more “tech-savvy” – a term which is bandied about a lot these days.

What does “tech-savvy” actually mean, and where is the line drawn between a “tech-savvy” business person, and the professional IT service provider? And what form should this ‘line’ take?

Arguably, most businesses have always been tech-savvy: they knew by-and-large where technology was necessary to grow their business. So for the most successful firms, there is a *lot* of technology. Does that mean those businesses are tech-savvy?

Yes, if those firms are able to manage the complexity of their IT – in other words, to be able to adapt their IT to shifting market needs, and to incorporate/absorb innovations into their architecture without significantly reducing that agility. In practice, few firms have such mastery of their IT (and, by implication, their processes and business information systems).

So a new form of “tech savvy” is needed, that allows business folks to leverage opportunities to both find customers and meet their needs (profitably), while preventing a build-up of unsustainable complexity of processes and systems.

In essence, tech-savvy business folk need to get better at understanding data. And IT needs to get much better at making meaningful business data available to their business – irrespective of where that business data may actually reside.

What does this mean in practice? It’s all about semantics – something previously in the domain of data geeks. But major initiatives like the semantic web (led by Tim Berners-Lee, of WWW fame) are making semantics useful to traditionally non-technical people.

The tools available to business folks for discovering what information (i.e., data + context) is available are very poor, and even when the required data is found (i.e., it is known where it is), it can be difficult to extract it and use it in a productive way.

A good example of this may be observed in the proliferation of Excel spreadsheets and Access databases supporting critical business functions in organisations. These tools are necessary because the data those areas need from various systems is not where they need it when they need it – and that is often due to business needs changing faster than IT can capture, absorb and deliver requirements.

Over time (at least in principle), all meaningful business data will (must) end up being processed exclusively by controlled enterprise systems… but this doesn’t mean waiting until IT has made the necessary changes in order to effectively govern that data and make maximum use of it.

A key principle behind the semantic web is that data can be anywhere. In an enterprise scenario, that means the data could exist in:

  • an internal enterprise system
  • a file system (e.g., Excel sheet or Access database)
  • a website (file download, or website screen scraping)
  • a commercial information provider (Reuters, Bloomberg, etc)
  • a business partner/supplier
  • etc

To take advantage of this, meta-data (i.e., data about data) needs to be made available to businesses, and it needs to be business-relevant and agnostic to internal IT systems and processes.

More than this, the tools needed to discover useful information, as well as to retrieve and process that information, need to be available to tech-savvy business folk. The right set of tools, suitably agnostic to specific architectures and systems, can allow businesses to explore new opportunities and business models, while allowing the IT systems and platforms to catch up and evolve over time – noting that there is no presumption that the eventual source of useful business data originates from, or is stewarded by, internal systems managed by IT.

Such tools also need to enforce compliance, ensuring only the right people get access to the right data, and (if necessary) limiting the ability for people to pull data outside of the retrieval platform. In essence, they provide a controlled sandbox in which businesses can distill value from information, whatever its source.

Many organisations may approach this by implementing a ‘data lake’. This may (perhaps) be a workable solution for all data assets managed by the IT organisation. But it is not feasible for many sets of data which are not managed by the IT organisation, but which still have business utility.

Emerging standards and technologies are evolving to meet this need for a data-centric view of information systems – in particular, RDF, OWL and related ‘ontology’ languages, as well as standards like SPARQL, which enable the discovery and retrieval of data originating from multiple sources. But tools are still very primitive and generally require too much technology savviness. However, efforts like the Callimachus Project give a hint at the potential of data-driven applications.

At some point, it will not be unreasonable to ask an IT organisation to expose all of its data assets semantically via SPARQL end-points (with appropriate access controls), and to provide tools to businesses to allow them to explore that data and (where permissible) incorporate them into spreadsheets, models and other tools in ways that allow the business to realise value from that data without requiring IT change requests. Developing the capabilities to understand semantic data, and use it to commercial advantage, will take time but it will be a worthwhile investment (and arguably before folks start spending money on ‘big data’ projects that have no wider context).
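
To make this concrete, here is a minimal sketch of querying such a SPARQL endpoint from Python using SPARQLWrapper. The endpoint URL and ontology terms are hypothetical, standing in for whatever vocabulary the firm's data office defines.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical enterprise SPARQL endpoint exposed by IT.
sparql = SPARQLWrapper("https://data.example.com/sparql")
sparql.setReturnFormat(JSON)

# Discover datasets and their owners using an assumed firm ontology.
sparql.setQuery("""
    PREFIX firm: <https://data.example.com/ontology#>
    SELECT ?dataset ?owner WHERE {
        ?dataset a firm:Dataset ;
                 firm:owner ?owner .
    } LIMIT 10
""")

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["dataset"]["value"], row["owner"]["value"])
```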

In fact, I would go so far as to say that any provider (startup or established) delivering technology services to a business should provide SPARQL endpoints as a matter of course. Making data useful and available to the business will help the business realise the value of data and get more involved in how it is captured, processed and stored in the future – and make it easier to incorporate 3rd party solution providers into business operations.

In a nutshell, the Semantic Web may be the CDO’s best friend in their efforts to help the business realise the full value of enterprise data.


The CDO: Enterprise Architect’s best friend

[tl;dr The CDO’s agenda may be the best way to bring the Enterprise Architecture agenda to the business end of the C-suite table.]

Enterprise Architecture as a concept has for some time claimed intellectual ownership of a holistic, integrated view of architecture within an organisation. In that lofty position, enterprise architects have put a stake in the ground with respect to key architecture activities, and formalised them all within frameworks such as TOGAF.

In TOGAF, Data Architecture has been buried within the Information Systems Architecture phase of the Architecture Development Method, along with Application Architecture.

But the real world often gets in the way of (arguably) conceptually clean thinking: I note the rise of the CDO, or Chief Data Officer, usually as part of the COO function of an organisation. (This CDO is not to be confused with the Chief Digital Officer, whose primary focus could, critically, be seen as building cool apps…)

So, why has Data gotten all this attention, while Enterprise Architecture still, for the most part, languishes within IT?

Well, as with most things, it’s happening for all the wrong reasons: mostly it’s because regulators are holding businesses accountable for how they manage regulated data, and businesses need, in turn, to hold people to account for doing that. Hence the need for a CDO.

But in the act of trying to understand what data they are liable for managing, and to what extent, CDOs necessarily have to do Data Architecture and Data Governance. And at this point the CDO’s activities start dipping into new areas such as business process management, business & IT strategy – and EA.

If CDOs choose to take an expansive view of their role (and I believe they should), then it would most definitely include all the domains of Enterprise Architecture – especially if one views data complexity and technology complexity as two sides of the same enterprise complexity coin: one as a consequence of business decisions, the other as a consequence of IT decisions.

The good news is that this would at long last put EAs into a role more aligned with business goals than with the goals of the IT organisation. This is not to say that the goals of the IT organisation are in some way wrong, but rather that, because the business had abdicated its responsibility for many decisions that IT needed in order to do “the right thing”, IT has had to fill the gaps itself – and in ways which the abdicating businesses could understand/appreciate.

For anyone in the position of CTO, this should help a lot: without prescribing which technologies should be used, businesses can then give real context to their demand, which the CTO can interpret strategically and, in collaboration with his CDO colleague, use to agree a technology strategy that can deliver on those needs.

In this way, the CDO/CTO have a very symbiotic relationship: the boundaries should be fuzzy, although the responsibilities should be clear: the CDO provides the context, the CTO delivers the technology.

So collaboration is key: while in principle one should be able to describe what a business needs strictly in terms of process and data flows etc, the reality is that business needs and practices change in *response* to technology. In other words, it is not a one-way street. What technology is *capable* of doing can shape what the business *should* be doing. (In many cases, for example, technology may eliminate the need for entire processes or activities.)

Folks on the edge of this CDO/CTO collaboration have a choice: do I become predominantly business facing (CDO), or do I become predominantly technology facing (CTO)? Folks may choose to specialise, or some, like myself, may choose to focus on different perspectives according to the needs of the organisation.

One interesting impact of this approach may be to address one of the biggest questions out there at the moment: how to absorb the services and capabilities offered by innovative, efficient, and nimble external businesses into the architecture of the enterprise, while retaining control, compliance and agility over the data and processes? But more on this later.

So, where does all this sit with respect to the ‘3 pillars of a digital strategy’? The points raised there are still valid. With the right collaboration and priorities, it is possible to have different folks fill the roles of ‘client centricity’, ‘chief data officer’ and ‘head of enterprise complexity’. But for the most part, all these roles begin and end with data. A proper, disciplined and thoughtful approach to how and why data is captured, processed, stored, retrieved and transmitted should properly address firm-wide themes such as improving client experience, integrating innovative services, keeping a lid on complexity and retaining an appropriate level of enterprise agility.

Some folks may be wondering where the CIO fits in all this: the CIO maintains responsibility for the management and operation of all IT processes (as defined in The Open Group’s IT4IT model). This is a big job. But it is not the job of the CTO, whose focus is to ensure the right technology is brought to bear to solve the right problem.
