The hidden costs of PaaS & microservice engineering innovation

[tl;dr The leap from monolithic application development into the world of PaaS and microservices highlights the need for consistent collaboration, disciplined development and a strong vision in order to ensure sustainable business value.]

The pace of innovation in the PaaS and microservice space is increasing rapidly. This, coupled with increasing pressure on ‘traditional’ organisations to deliver more value more quickly from IT investments, is causing a flurry of interest in PaaS enabling technologies such as Cloud Foundry (favoured by the likes of IBM and Pivotal), OpenShift (favoured by RedHat), Azure (Microsoft), Heroku (SalesForce), AWS, Google Application Engine, etc.

A key characteristic of all these PaaS solutions is that they are ‘devops’ enabled – i.e., it is possible to automate both code and infrastructure deployment, enabling the way to have highly automated operational processes for applications built on these platforms.

For large organisations, or organisations that prefer to control their infrastructure (because of, for example, regulatory constraints), PaaS solutions that can be run in a private datacenter rather than the public cloud are preferable, as this a future option to deploy to external clouds if needed/appropriate.

These PaaS environments are feature-rich and aim to provide a lot of the building blocks needed to build enterprise applications. But other framework initiatives, such as Spring Boot, DropWizard and Vert.X aim to make it easier to build PaaS-based applications.

Combined, all of these promise to provide a dramatic increase in developer productivity: the marginal cost of developing, deploying and operating a complete application will drop significantly.

Due to the low capital investment required to build new applications, it becomes ever more feasible to move from a heavy-weight, planning intensive approach to IT investment to a more agile approach where a complete application can be built, iterated and validated (or not) in the time it takes to create a traditional requirements document.

However, this also has massive implications, as – left unchecked – the drift towards entropy will increase over time, and organisations could be severely challenged to effectively manage and generate value from the sheer number of applications and services that can be created on such platforms. So an eye on managing complexity should be in place from the very beginning.

Many of the above platforms aim to make it as easy as possible for developers to get going quickly: this is a laudable goal, and if more of the complexity can be pushed into the PaaS, then that can only be good. The consequence of this approach is that developers have less control over the evolution of key aspects of the PaaS, and this could cause unexpected issues as PaaS upgrades conflict with application lifecycles, etc. In essence, it could be quite difficult to isolate applications from some PaaS changes. How these frameworks help developers cope with such changes is something to closely monitor, as these platforms are not yet mature enough to have gone through a major upgrade with a significant number of deployed applications.

The relative benefit/complexity trade-off between established microservice frameworks such as OSGi and easier to use solutions such as described above needs to be tested in practice. Specifically, OSGi’s more robust dependency model may prove more useful in enterprise environments than environments which have a ‘move fast and break things’ approach to application development, especially if OSGi-based PaaS solutions such as JBoss Fuse on OpenShift and Paremus ServiceFabric gain more popular use.

So: all well and good from the technology side. But even if the pros and cons of the different engineering approaches are evaluated and a perfect PaaS solution emerges, that doesn’t mean Microservice Nirvana can be achieved.

A recent article on the challenges of building successful micro-service applications, coupled with a presentation by Lisa van Gelder at a recent Agile meetup in New York City, has emphasised that even given the right enabling technologies, deploying microservices is a major challenge – but if done right, the rewards are well worth it.

Specifically, there are a number of factors that impact the success of a large scale or enterprise microservice based strategy, including but not limited to:

  • Shared ownership of services
  • Setting cross-team goals
  • Performing scrum of scrums
  • Identifying swim lanes – isolating content failure & eventually consistent data
  • Provision of Circuit breakers & Timeouts (anti-fragile)
  • Service discoverability & clear ownership
  • Testing against stubs; customer driven contracts
  • Running fake transactions in production
  • SLOs and priorities
  • Shared understanding of what happens when something goes wrong
  • Focus on Mean time to repair (recover) rather than mean-time-to-failure
  • Use of common interfaces: deployment, health check, logging, monitoring
  • Tracing a users journey through the application
  • Collecting logs
  • Providing monitoring dashboards
  • Standardising common metric names

Some of these can be technically provided by the chosen PaaS, but a lot is based around the best practices consistently applied within and across development teams. In fact, it is quite hard to capture these key success factors in traditional architectural views – something that needs to be considered when architecting large-scale microservice solutions.

In summary, the leap from monolithic application development into the world of PaaS and microservices highlights the need for consistent collaboration, disciplined development and a strong vision in order to ensure sustainable business value.
The hidden costs of PaaS & microservice engineering innovation

Making good architectural moves

[tl;dr In Every change is an opportunity to make the ‘right’ architectural move to improve complexity management and to maintain an acceptable overall cost of change.]

Accompanying every new project, business requirement or product feature is an implicit or explicit ‘architectural move’ – i.e., a change to your overall architecture that moves it from a starting state to another (possibly interim) state.

The art of good architecture is making the ‘right’ architectural moves over time. The art of enterprise architecture is being able to effectively identify and communicate what a ‘right’ move actually looks like from an enterprise perspective, rather than leaving such decisions solely to the particular implementation team – who, it must be assumed, are best qualified to identify the right move from the perspective of the relevant domain.

The challenge is the limited inputs that enterprise architects have, namely:

  • Accumulated skill/knowledge/experience from past projects, including any architectural artefacts
  • A view of the current enterprise priorities based on the portfolio of projects underway
  • A corporate strategy and (ideally) individual business strategies, including a view of the environment the enterprise operates in (e.g., regulatory, commercial, technological, etc)

From these inputs, architects need to guide the overall architecture of the enterprise to ensure every project or deliverable results in a ‘good’ move – or at least not a ‘bad’ move.

In this situation, it is difficult if not impossible to measure the actual success of an architecture capability. This is because, in many respects, the beneficiaries of a ‘good’ enterprise architecture (at least initially) are the next deliverables (projects/requirements/features), and only rarely the current deliverables.

Since the next projects to be done is generally an unknown (i.e., the business situation may change between the time the current projects complete and the time the next projects start), it is rational for people to focus exclusively on delivering the current projects. Which makes it even more important the current projects are in some way delivering the ‘right’ architectural moves.

In many organisations, the typical engagement with enterprise architecture is often late in the architectural development process – i.e., at a ‘toll-gate’ or formal architectural review. And the focus is often on ‘compliance’ with enterprise standards, principles and guidelines. Given that such guidelines can get quite detailed, it can get quite difficult for anyone building a project start architecture (PSA) to come up with an architecture that will fully comply: the first priority is to develop an architecture that will work, and is feasible to execute within the project constraints of time, budget and resources.

Only then does it make sense to formally apply architectural constraints from enterprise architecture – at least some of which may negatively impact the time, cost, resource or feasibility needs of the project – albeit to the presumed benefit of the overall landscape. Hence the need for board-level sponsorship for such reviews, as otherwise the project’s needs will almost always trump enterprise needs.

The approach espoused by an interesting new book, Chess and the Art of Enterprise Architecture, is that enterprise architects need to focus more on design and less on principles, guidelines, roadmaps, etc. Such an approach involves enterprise architects more closely in the creation and evolution of project (start) architectures, which represents the architectural basis for all the work the project does (although it does not necessarily lay out the detailed solution architecture).

This approach is also acceptable for planning processes which are more agile than waterfall. In particular, it acknowledges that not every architectural ‘move’ is necessarily independently ‘usable’ by end users of an agile process. In fact, some user stories may require several architectural moves to fully implement. The question is whether the user story is itself validated enough to merit doing the architectural moves necessary to enable it, as otherwise those architectural moves may be premature.

The alternative, then, is to ‘prototype’ the user story,  so users can evaluate it – but at the cost of non-conformance with the project architecture. This is also known as ‘technical debt’, and where teams are mature and disciplined enough to pay down technical debt when needed, it is a good approach. But users (and sometimes product owners) struggle to tell the difference between an (apparently working) prototype and a production solution that is fully architecturally compliant, and it often happens that project teams move on to the next visible deliverable without removing the technical debt.

In applications where the end-user is a person or set of persons, this may be acceptable in the short term, but where the end-user could be another application (interacting via, for example, an API invoked by either a GUI or an automated process), then such technical debt will likely cause serious problems if not addressed. At the various least, it will make future changes harder (unmanaged dependencies, lack of automated testing), and may present significant scalability challenges.

So, what exactly constitutes a ‘good’ architectural move? In general, this is what the project start architecture should aim to capture. A good basic principle could be that architectural commitments should be postponed for as long as possible, by taking steps to minimise the impact of changed architectural decisions (this is a ‘real-option‘ approach to architectural change management). Over the long term, this reduces the cost of change.

In addition, project start architectures may need to make explicit where architectural commitments must be made (e.g., for a specific database, PaaS or integration solution, etc) – i.e., areas where change will be expensive.

Other things the project start architecture may wish to capture or at least address (as part of enterprise design) could include:

  • Cataloging data semantics and usage
    • to support data governance and big data initiatives
  • Management of business context & scope (business area, product, entity, processes, etc)
    • to minimize unnecessary redundancy and process duplication
  • Controlled exposure of data and behaviour to other domains
    • to better manage dependencies and relationships with other domains
  • Compliance with enterprise policies around security and data management
    • to address operational risk
  • Automated build, test & deploy processes
    • to ensure continued agility and responsiveness to change, and maximise operational stability
  • Minimal lock-in to specific solution architectures (minimise solution architecture impact on other domains)
    • to minimize vendor lock-in and maximize solution options

The Chess book mentioned above includes a good description of a PSA, in the context of a PRINCE2 project framework. Note that the approach also works for Agile, but the PSA should set the boundaries within which the agile team can operate: if those boundaries must be crossed to deliver a user story, then enterprise design architects should be brought back into the discussion to establish the best way forward.

In summary, every change is an opportunity to make the ‘right’ architectural move to improve complexity management and to maintain an acceptable overall cost of change.

Making good architectural moves

Agile quantitative analytics in Financial Services

[tl;dr technology is empowering quantitative analysts to much more on their own. IT organisations will need to think about how to make these capabilities available to their users, and how to incorporate it into IT strategies around big-data, cloud computing, security and data governance.]

The quest for positive (financial) returns in investments is helping drive considerable innovation in the space of quantitative analytics. This, coupled with the ever-decreasing capital investment required to do number-crunching, has created demand for ‘social’ analytics – where algorithms are shared and discussed amongst practitioners rather than kept sealed behind the closed doors of corporate research & trading departments.

I am not a quant, but have in the past built systems that provided an alternate (to Excel) vehicle  for quantitative research analysts to capture and publish their models. While Excel is hard to beat for experimenting with new ideas, from a quantitative analyst perspective, it suffers from many deficiencies, including:

  • Spreadsheets get complex very quickly and are hard to maintain
  • They are not very efficient for back-end (server-side) use
  • They cannot be efficiently incorporated into scalable automated workflows
  • Models cannot be distributed or shared without losing control of the model
  • Integrating spreadsheets with multiple large data sets can be cumbersome and memory inefficient at best, and impossible at worst (constrained by machine memory limits)

QuantCon was created to provide a forum for quantitative analysts to discuss and share tools and techniques for  quantitative research, with a particular focus on the sharing and distribution of models (either outcomes or logic, or both).

Some key themes from QuantCon which I found interesting were:

  • The emergence of social analytics platforms that can execute strategies on your venue of choice (e.g. quantopian.com)
  • The search for uncorrelated returns & innovation in (algorithmic) investment strategies
  • Back-testing as a means of validating algorithms – and the perils of assuming backtests would execute at the same prices in real-life
  • The rise of freely available interactive model distribution tools such as the Jupyter project (similar to Mathematica Notebooks)
  • The evolution of probabilistic programming and machine learning – in particular the PyMC3 extensions to Python
  • The rise in the number of free and commercial data sources (APIs) of data points (signals) that can be included in models

From an architectural perspective, there are some interesting implications. Specifically:

  • There really is no limit to what a single quant with access to multiple data sources (either internal or external) and access to platform- or infrastructure-as-a-service capabilities can do.
  • Big data technologies make it very easy to ingest, transform and process multiple data sources (static or real-time) at very low cost (but raise governance concerns).
  • It has never been cheaper or easier to efficiently and safely distribute, publish or share analytics models (although the tools for this are still evolving).
  • The line between the ‘IT developer’ and the user has never been more blurred.

Over the coming years, we can expect analytics (and business intelligence in general) capabilities to become key functions within every functional domain in an organisation, integrated both into the function itself through feedback loops, as well as for more conventional MIS reporting. Some of the basic building blocks will be the same as they are today, but the key characteristic is that users of these tools will be technologists specifically supporting business needs – i.e., part of ‘the business’ and not part of ‘IT’.

In the past, businesses have been supported by vendors providing expensive but easy-to-use tools to allow non-technical people to work with large datasets. The IT folks were very specifically supporting the core data warehouse & business intelligence infrastructure, or provided technical support for the development of particular reports. In these cases, the clients were typically non-technical, and the platform could only evolve as quickly as the software  vendors evolved.

The emerging (low-cost) tools for quantitative analytics will give rise to a post-Excel world of innovation, scale and distribution that will empower users, give rise to whole new business models, and be itself a big driver into how enterprise IT defines its role in the agile, unbundled, decentralised and as-a-service technology landscape of the near future.

Traditional IT organisations often saw the ‘application development’ functions as business aligned, and were comfortable with the client-oriented nature of providing technical infrastructure services to development teams. However, internal development teams supporting other (business-aligned) development teams is a fairly new concept, and will likely best be done by external specialist providers. This is a good example of where IT’s biggest role (apart from governance) in the future is in sourcing relevant providers, and ensuring business technologists are able to do their job effectively and efficiently in that environment.

In summary, technology is empowering quantitative analysts to much more on their own. IT organisations will need to think about how to make these capabilities available to their users, and how to incorporate it into IT strategies around big-data, cloud computing, security and data governance.

Agile quantitative analytics in Financial Services

Achieving modularity: functional vs volatility decomposition

Enterprise architecture is all about managing complexity. Many EA initiatives tend to focus on managing IT complexity, but there is only so much that can be done there before it becomes obvious that IT complexity is, for the most part, a direct consequence of enterprise complexity. To recap, complexity needs to be managed in order to maintain agility – the ability for an organisation to respond (relatively) quickly and efficiently to changes in markets, regulations or innovation, and to continue to do this over time.

Enterprise complexity can be considered to be the activities performed and resources consumed by the organisation in order to deliver ‘value’, a metric usually measured through the ability to maintain (financial) income in excess of expenses over time.

Breaking down these activities and resources into appropriate partitions that allow holistic thinking and planning to occur is one of the key challenges of enterprise architecture, and there are various techniques to do this.

Top-Down Decomposition

The natural approach to decomposition is to first understand what an organisation does – i.e., what are the (business) functions that it performs. Simply put, a function is a collection of data and decision points that are closely related (e.g., ‘Payments ‘is a function). Functions typically add little value in and of themselves – rather they form part of an end-to-end process that delivers value for a person or legal entity in some context. For example, a payment on its own means nothing: it is usually performed in the context of a specific exchange of value or service.

So a first course of action is to create a taxonomy (or, more accurately, an ontology) to describe the functions performed consistently across an enterprise. Then, various processes, products or services can be described as a composition of those functions.

If we accept (and this is far from accepted everywhere) that EA is focused on information systems complexity, then EA is not responsible for the complexity relating to the existence of processes, products or services. The creation or destruction of these are usually a direct consequence of business decisions. However, EA should be responsible for cataloging these, and ensuring these are incorporated into other enterprise processes (such as, for example, disaster recovery or business continuity processes). And EA should relate these to the functional taxonomy and the information systems architecture.

This can get very complex very quickly due to the sheer number of processes, products and services – including their various variations – most organisations have. So it is important to partition or decompose the complexity into manageable chunks to facilitate meaningful conversations.

Enterprise Equivalence Relations

One way to do this at enterprise level is to group functions into partitions (aka domains) according to synergy or autonomy (as described by Roger Sessions), for all products/services supporting a particular business. This approach is based on the mathematical concept of equivalenceBecause different functions in different contexts may have differing equivalence relationships, functions may appear in multiple partitions. One role of EA is to assess and validate if those functions are actually autonomous or if there is the potential to group apparently duplicate functions into a new partition.

Once partitions are identified, it is possible to apply ‘traditional’ EA thinking to a particular partition, because that partition is of a manageable size. By ‘traditional’ EA, I mean applying Zachman, TOGAF, PEAF, or any of the myriad methodologies/frameworks that are out there. More specifically, at that level, it is possible to establish a meaningful information systems strategy or goal for a particular partition that is directly supporting business agility objectives.

The Fallacy of Functional Decomposition

Once you get down to the level of partition, the utility of functional decomposition when it comes to architecting solutions becomes less useful. The natural tendency for architects would be to build reusable components or services that realise the various functions that comprise the partition. In fact, this may be the wrong thing to do. As Jüval Lowy demonstrates in his excellent webinar, this may result in more complexity, not less (and hence less agility).

When it comes to software architecture, the real reason to modularise your architecture is to manage volatility or uncertainty – and to ensure that volatility in one part of the architecture does not unnecessarily negatively impact another part of the architecture over time. Doing this allows agility to be maintained, so volatile parts of the application can, in fact, change frequently, at low impact to other parts of the application.

When looking at a software architecture through this lens, a quite different set of components/modules/services may become evident than those which may otherwise be obvious when using functional decomposition – the example in the webinar demonstrates this very well. A key argument used by Jüval in his presentation is that (to paraphrase him somewhat) functions are, in general, highly dependent on the context in which they are used, so to split them out into separate services may require making often impossible assumptions about all possible contexts the functions could be invoked in.

In this sense, identified components, modules or services can be considered to be providing options in terms of what is done, or how it is done, within the context of a larger system with parts of variable volatility. (See my earlier post on real options in the context of agility to understand more about options in this context.)

Partitions as Enterprise Architecture

When each partition is considered with respect to its relationship with other partitions, there is a lot of uncertainty around how different partitions will evolve. To allow for maximum flexibility, every partition should assume each other partition is a volatile part of their architecture, and design accordingly for this. This allows each partition to evolve (reasonably) independently with minimum fixed co-ordination points, without compromising the enterprise architecture by having different partitions replicate the behaviours of partitions they depend on.

This then allows:

  • Investment to be expressed in terms of impact to one or more partitions
  • Partitions to establish their own implementation strategies
  • Agile principles to be agreed on a per partition basis
  • Architectural standards to be agreed on a per partition basis
  • Partitions to define internally reusable components relevant to that partition only
  • Partitions to expose partition behaviour to other partitions in an enterprise-consistent way

In generative organisation cultures, partitions do not need to be organisationally aligned. However, in other organisation cultures (pathological or bureaucratic), alignment of enterprise infrastructure functions such as IT or operations (at least) with partitions (domains) may help accelerate the architectural and cultural changes needed – especially if coupled with broader transformations around investment planning, agile adoption and enterprise architecture.

Achieving modularity: functional vs volatility decomposition

The Importance of Principles in Managing Complexity

[tl;dr The first step to managing complexity is to define what is important. Principles are a great way to reach consensus on what is important, and establishing governance around adherence to those principles are the first step to getting on top of change-related complexity.]

Managing complexity is not easy, whether you are in a large organisation, or in a relatively green-field environment where growth is rapid and complexity is a far-off problem you hope to have some day…

The fact is, if your organisation suffers from a complexity problem (in technology, and implicitly, processes and data), and you are still in (a profitable) business, then it is that complexity which, for better or for worse, has been the instrument of that organisation’s success. As a consequence, it is quite hard to convince people to implement complexity management (read ‘enterprise architecture’) before complexity actually becomes a problem.

Sometimes, however, unmanaged complexity brings an organisation’s growth to a grinding halt, in the worst cases leading to eventual shutdown. (In this context, an organisation can be an independent business or a business unit within a larger company.)

Historically, investment in IT incurred large capital costs, so usually there was a large focus on realising value from that investment – i.e. to exploit it to the maximum. Exploiting technology usually means adding on more and more functionality to existing systems in ways that meet business imperatives.  But business imperatives tend to be fairly short-term in nature. Without guiding principles, the act of exploiting technology leads to ever more expensive change-related costs.

For any self-respecting IT organisation, a core set of principles, actively governed, is an absolute necessity. Principles transcend solution architectures and technologies, and enable project autonomy as decisions can be made locally without requiring ‘central’ approval, provided principles are being adhered to.

Such principles, call them ‘implementation principles’ should guide how IT organisations do their work and manage change.

But many principles that stay only within the IT organisation may break in the face of over-whelming business imperatives: on the basis that the (immediate) business needs comes first, IT often takes it on the chin and accumulates technical debt.

So it is important to agree principles that business stakeholders agree to and supports, and for which they are willing to have an open discussion about in the context of delaying projects, increasing costs, prioritising change, or driving organisational change.

Let’s call these principles ‘value principles’ – they are focused on how the business can realise value from its investments, particularly in IT.

In order to understand what this means in practice, it may be worth returning to the ‘Zero CapEx IT’ concept introduced in my last post. What if all IT was sourced through (virtual or actual) 3rd parties, so that all of IT was delivered as software-as-a-service?

In this case, the ‘implementation principles’ would be whatever guidelines or standards were used by the SaaS provider to ensure they could deliver change reliably, and maintain service according to agreed levels. The SaaS provider has a vested interest in the client getting value out of their service, but the only levers they have to ensure that is their ability to react in a timely manner to change requests.

From the business’ perspective, adopting an IT (SaaS) service requires structural change to ensure the business can take advantage of the technology. This may mean assigning roles and responsibilities, defining new activities, maybe creating whole new departments, etc. The business may be paying for the service, but unless it pro-actively maximises what the service can offer in terms of what the business needs, it may not get the value. The SaaS (i.e., IT) can do little to address this situation, and may end up losing the client if the client does not take action.

On the other hand, if a client is pro-actively engaged, then they will likely want regular changes so they can continually optimise the value of the service to their operations. This puts the onus on the provider to remain agile.

In a corporate environment, where the accountability between value and implementation is often blurred, there is usually little optionality with respect to who your provider is – or, from a provider perspective, who your client is. In fact, the client/provider perspective in a corporate environment can be damaging, as it can reinforce an us-vs-them culture, leading to mistrust and an ‘order-taking’ mentality developing between the client and the provider.

In this environment, complexity will thrive because, after all, the client is tied and therefore has to keep paying for their service. There is little incentive to be agile..if the client wants more quicker, they have to pay for it.

If, on the other hand, the client retains an active, supportive interest in how their IT is delivered, as well as what is delivered, then they will get the IT they want.

If the client is not willing to agree certain base-line principles with their (IT) provider, and to put in place structures to enforce them (including a formal waiver process), then one can expect little to improve.

On the other hand, if the client or business partner will willingly participate in establishing principles and related governance forums, this will set the tone for how the business/IT relationship will evolve going forward.

At the very least, ‘value principles’ (such as, for example, “The business will commit to implementing the necessary structural changes to maximise IT investment”) should be agreed, and the consequence/implications of these to project planning should be understood.

Note that value principles are increasingly including concepts around data governance, which historically have been an ‘implementation’ principle – for example, the principle that ‘information is an asset’.

‘Implementation principles’  are principles which determine how (IT) change is decided and implemented, and how complexity is managed, by the provider. Where the business and technology are inter-twined (i.e., the business value proposition *is* the technology), these would simply form part of the business ‘value principles’. But where the business is agnostic as to the technology, and are only focused on value, then the ‘implementation principles’ are solely for the provider to define and enforce. An example of an ‘implementation’ principle may be ‘Control of technical diversity’, with consequent implications for solution architectures.

Adopting principles means that an organisation has some sense of what it wants to be, even if only aspirationally. It allows individuals or teams to feel comfortable making decisions on their own without feeling they have to get approval, which is key for agile cultures. Establishing formal business+IT forums to review principle waivers helps provide transparency with respect to project risk and platform risk. And it allows Technical Debt to be managed, as every waiver is a recognition of technical debt incurred.

In conclusion, managing complexity is hard. But Principles can make it manageable.

The Importance of Principles in Managing Complexity

What if IT had no CapEx? A gedanken experiment.

[tl;dr Imagining IT with no CapEx opens up possibilities that may not otherwise be considered. This is a thought experiment and it may never be practical or desirable to realise, but seeking to minimise direct CapEx expenditure in portfolio planning may help the firm’s overall Agile agenda.]

The ‘no CapEx’ thought experiment is related to the ‘Lean Enterprise’ theme – i.e., how enterprises can be more agile and responsive to business change, and how this could fundamentally change organisation’s approach to IT.

CapEx (or Capital Expenditure) usually relates to large up-front costs that are depreciated over a number of years. Investments such as infrastructure and software licenses are generally treated as CapEx. OpEx, on the other hand, refers to operational expenses, incurred on a recurring basis.

The interesting thing about CapEx is that it is a commitment – i.e., there is no optionality behind it. This means if business needs change, the cost of breaking that commitment could be prohibitively high. So businesses may be forced to keep pushing square pegs through round holes in their technology, reducing the potential value of the investment.

Agility is all about optionality – the deferring of (expensive) commitments to the latest possible moment, so that the commercial benefits of that commitment can be productively exploited for the life of that commitment.

Increasingly, it is becoming less and less clear to organisations, as technology rapidly advances, where to make these commitments that do not unnecessarily limit options in the future.

So, as a thought experiment, how would a CapEx-less IT organisation actually work, and what would it look like?

The first observation is that somebody, somewhere, has to make a CapEx investment for IT. For example, Amazon has made a significant CapEx investment in its datacentres, and it is this investment that allows other businesses to avoid having to make CapEx investments. Is this profitable for Amazon? It’s hard to tell..prices keep going down, and efficiency is improving all the time. In the long run, some profitable players will emerge. For now, organisations choose to use Amazon only because it is cheaper than the capital-investment alternative, over the length of the investment cycle.

Software licensing as a revenue stream is under pressure: businesses may be happier to pay for services to support the software they are using rather than for some license that gives them the right to use a specific version of a software package for commercial purposes. In the days when software was distributed to consumers and deployed onto proprietary infrastructure, this made sense. But more and more generally useful software is being open-sourced, and more and more ‘proprietary’ software is being delivered as a service. So CapEx related to software licenses will (eventually) not be as profitable as it once was (at least for the major enterprise software players).

So, in a CapEx-free IT environment, the IT landscape looks like an interacting set of software services that collectively deliver business value (i.e., give the businesses what they need in a timely manner and at a manageable cost).

The obligation of the enterprise is also clear: either derive value from the service, or stop using the service. In other words, exercise your option to terminate use of the service if the value is not there.

In this environment, do organisations actually need an IT strategy in the conventional/traditional sense? Isn’t it all about sourcing solutions and ensuring the solutions providers interact appropriately to collectively provide value to the enterprise, and that the enterprise interacts with the providers in such a way as to optimise change for maximum value? In the extreme case, yes.

For this model to work, providers need to be agile. They will likely have a number of clients, all requiring different solutions (albeit in the same domain). They need to be responsive, and they need to be able to scale. It is these characteristics that will differentiate among different providers.

All of this has massive implications with respect to data management, security and compliance. A strong culture of data governance coupled with sensible security and compliance policies could (eventually) allow these hurdles to be overcome.

The question for firms then, is where does it make sense to have an internal IT capability? And, apart from some R&D capabilities, is it worth it? Especially as individual businesses could innovate with a number of providers at relatively low risk. Perhaps the model should be to treat each internal capability as potentially a future third party service provider? Give the team the ability to innovate on agility, responsiveness and scaleability. Retain the option in the future to let the ‘internal’ service provider serve other businesses, or to shut it down if it is not providing value.

It is worth bearing in mind that the concept of the traditional ‘end user’ is changing. The end-user is someone who is often quite adept at making technology do what they want, without being considered traditional developers. Providing end-users with innovative ways to connect and orchestrate various (enterprise) services is expected. The difference between ‘local’ and ‘enterprise’ development is then emphasised, with local development being reactive and using enterprise services to ensure compliance etc. while still meeting localised needs.

Such an approach would maximise enterprise agility and also provide skilled technologists and even end users the ability to flex their technical chops in ways that businesses can exploit to their fullest advantage. The cost of this agility is that other organisations can use the same ‘building blocks’ to greater competitive advantage. As has always been the case, it is not so much the technology, as what you do with it, that really matters.

To summarise, while not always possible or even desirable, minimising (direct) CapEx expenses for IT is becoming a feasible way to achieve IT agility.

What if IT had no CapEx? A gedanken experiment.

The CDO: Enterprise Architect’s best friend

[tl;dr The CDO’s agenda may be the best way to bring the Enterprise Architecture agenda to the business end of the C-suite table.]

Enterprise Architecture as a concept has for some time claimed intellectual ownership of a holistic, integrated view of architecture within an organisation. In that lofty position, enterprise architects have put a stake in the ground with respect to key architecture activities, and formalised them all within frameworks such as TOGAF.

In TOGAF, Data Architecture has been buried within the Information Systems Architecture phase of the Architecture Development Method, along with Application Architecture.

But the real world often gets in the way of (arguably) conceptually clean thinking: I note the rise of the CDO, or Chief Data Officer, usually as part of the COO function of an organisation. (This CDO is not to be confused with the Chief Digital Officer, who’s primary focus could critically be seen as to build cool apps…)

So, why has Data gotten all this attention, while Enterprise Architecture still, for the most part, languishes within IT?

Well, as with most things, its happening for all the wrong reasons: mostly its because regulators are holding businesses accountable for how they manage regulated data, and businesses need, in turn, to hold people to account for doing that. Hence the need for a CDO.

But in the act of trying to understand what data they are liable for managing, and to what extent, CDO’s necessarily have to do Data Architecture and Data Governance. And at this point the CDO’s activities starts dipping into new areas such as business process management, and business & IT strategy – and EA.

If CDOs choose to take an expansive view of their role (and I believe they should), then it would most definitely include all the domains of Enterprise Architecture – especially if one views data complexity and technology complexity as two sides of the same enterprise complexity coin: one as a consequence of business decisions, the other as a consequence of IT decisions.

The good news is that this would at long last put EAs into a role more aligned with business goals than with the goals of the IT organisation. This is not to say that the goals of the IT organisation are in some way wrong, but rather than because the business had abdicated its responsibility for many decisions that IT needs in order to do “the right thing”, IT has had to fill the gaps itself – and in ways which said abdicating businesses could understand/appreciate.

For anyone in the position of CTO, this should help a lot: without prescribing which technologies should be used, businesses then can give real context to their demand which the CTO can interpret strategically, and, in collaboration with his CDO colleague, agree a technology strategy that can deliver on those needs.

In this way, the CDO/CTO have a very symbiotic relationship: the boundaries should be fuzzy, although the responsibilities should be clear: the CDO provides the context, the CTO delivers the technology.

So collaboration is key: while in principle one should be able to describe what a business needs strictly in terms of process and data flows etc, the reality is that business needs and practices change in *response* to technology. In other words, it is not a one-way street. What technology is *capable* of doing can shape what the business *should* be doing. (In many cases, for example, technology may eliminate the need for entire processes or activities.)

Folks on the edge of this CDO/CTO collaboration have a choice: do I become predominantly business facing (CDO), or do I become predominantly technology facing (CTO)? Folks may choose to specialise, or some, like myself, may choose to focus on different perspectives according to the needs of the organisation.

One interesting impact from this approach may  address one of the biggest questions out there at the moment: how to absorb the services and capabilities offered by innovative, efficient, and nimble external businesses into the architecture of the enterprise, while retaining control, compliance and agility of the data and processes? But more on this later.

So, where does all this sit with respect to the ‘3 pillars of a digital strategy’? The points raised there are still valid. With the right collaboration and priorities, it is possible to have different folks fill the roles of ‘client centricity’, ‘chief data officer’ and ‘head of enterprise complexity’. But for the most part, all these roles begin and end with data. A proper, disciplined and thoughtful approach to how and why data is captured, processed, stored, retrieved and transmitted should properly address firm-wide themes such as improving client experience, integrating innovative services, keeping a lid on complexity and retaining an appropriate level of enterprise agility.

Some folks may be wondering where the CIO fits in all this: the CIO maintains responsibility for the management and operation of all IT processes (as defined in The Open Group’s IT4IT model). This is a big job. But it is not the job of the CTO, whose focus is to ensure the right technology is brought to bear to solve the right problem.

The CDO: Enterprise Architect’s best friend