The cloudy future of data management & governance

[tl;dr The cloud enables novel ways of handling an expected explosion in data store types and instances, allowing stakeholders to know exactly what data is where at all times without human process dependencies.]

Data management & governance is a big and growing concern for organizations of all sizes. Effective data management is critical for compliance, resilience, and innovation.

Data governance is necessary to know what data you have, when you got it, where it came from, where it is being used, and whether it is of good quality or not.

While the field is relatively mature, the rise of cloud-based services and service-enabled infrastructure will, I believe, fundamentally change the nature of how data is managed in the future and enable greater agility if leveraged effectively.

Data Management Meta-Data

Data and application architects are concerned about ensuring that applications use the most appropriate data storage solution for the problem being solved. To better manage cost and complexity, firms tend to converge on a handful of data management standards (such as Oracle or SQL Server for databases; NFS or NTFS for filesystems; Netezza or Teradata for data warehousing; Hadoop/HDFS for data processing; etc). Expertise is concentrated around central teams that manage provisioning, deployments, and operations for each platform. This introduces dependencies that project teams must plan around. This also requires forward planning and long-term commitment – so not particularly agile.

Keeping up with data storage technology is a challenge – technologies like key/value stores, graph databases, columnar databases, object stores, and document databases exist because they represent varying datasets in a way that is more natural for applications to consume, reducing or eliminating the ‘impedance mismatch’ between how applications view state and how that state is stored.

In particular, many datastore technologies are designed to scale up rather than out; i.e., the only way to make them perform faster is to add more CPU/memory, or faster IO hardware. While this keeps applications simpler, it requires significant forward planning and longer-term commitments to scale up, and is out of the control of application development teams. Cloud-based services can typically handle scale-out transparently, although applications may need to be aware of the data dimensions across which scale out happens (e.g., sharding by primary key, etc).
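For illustration, sharding by primary key can be sketched in a few lines – the hashing scheme and the in-memory ‘shards’ below are purely hypothetical stand-ins for real datastore instances:

```python
import hashlib

# Hypothetical illustration: route each record to one of N shards by
# hashing its primary key, so reads and writes spread across instances.
NUM_SHARDS = 4

def shard_for(primary_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a primary key to a stable shard index."""
    digest = hashlib.sha256(primary_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Tiny in-memory stand-ins for N independent datastore instances.
shards = [{} for _ in range(NUM_SHARDS)]

def put(key: str, value: dict) -> None:
    shards[shard_for(key)][key] = value

def get(key: str) -> dict:
    return shards[shard_for(key)][key]

put("order-1001", {"status": "filled"})
assert get("order-1001") == {"status": "filled"}
```

The application-visible point is that the shard key must be chosen up front; a managed cloud service hides the instance management, but not the data dimension along which it scales.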

On-premise, provisioning requests for a new datastore are mostly ticket-driven, and fulfillment is still largely performed by humans rather than software – which means an “infrastructure-as-code” approach is not feasible.

Data Store Manageability vs Application Complexity

Most firms decide that it is better to simplify the data landscape such that fewer datastore solutions are available, but to resource those solutions so that they are properly supported to handle business critical production workloads with maximum efficiency.

The trade-off is in the applications themselves: the available data storage solutions end up driving the application architecture, rather than the application architecture (i.e., requirements) dictating the most appropriate data store solution – which would result in the lowest impedance mismatch.

A typical example of an impedance mismatch is object-oriented applications (written in, say, C++ or Java) which use relational databases. Here, object/relational mapping technologies such as Hibernate or GigaSpaces are used to map the application view of the data (which likes to view data as in-memory objects) to the relational view. These middle layers, while useful for naturally relational data, can be overly expensive to maintain and operate if what your application really needs is a more appropriate type of datastore (e.g., graph).
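To make the mismatch concrete, here is a minimal, hypothetical sketch of the same nested business object stored relationally (flattened into rows joined by a foreign key) versus held as the application sees it – the re-assembly step is the work an ORM automates:

```python
# Hypothetical sketch of the 'impedance mismatch': the application's
# natural, nested view of an order versus its relational representation.
order = {
    "id": 1,
    "customer": "acme",
    "lines": [
        {"sku": "A-100", "qty": 2},
        {"sku": "B-200", "qty": 1},
    ],
}

def to_relational(order: dict) -> dict:
    """Flatten a nested order into two 'tables' joined by a foreign key."""
    orders_table = [{"id": order["id"], "customer": order["customer"]}]
    lines_table = [
        {"order_id": order["id"], "sku": l["sku"], "qty": l["qty"]}
        for l in order["lines"]
    ]
    return {"orders": orders_table, "order_lines": lines_table}

def from_relational(tables: dict, order_id: int) -> dict:
    """Re-assemble the object view – the mapping an ORM automates."""
    row = next(o for o in tables["orders"] if o["id"] == order_id)
    lines = [
        {"sku": l["sku"], "qty": l["qty"]}
        for l in tables["order_lines"]
        if l["order_id"] == order_id
    ]
    return {"id": row["id"], "customer": row["customer"], "lines": lines}

assert from_relational(to_relational(order), 1) == order
```

A document store would simply persist `order` as-is; the translation layer only earns its keep when the data is genuinely relational.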

This mismatch gets exacerbated in a microservices environment where each microservice is responsible for its own persistence, and individual microservices are written in the language most appropriate for the problem domain. Typical imperative, object-oriented languages implementing transactional systems will lean heavily towards relational databases and ORMs, whereas applications dealing with multi-media, graphs, very-large objects, or simple key/value pairs will not benefit from this architecture.

The rise of event-driven architectures (in particular, transactional ‘sagas’, and ‘aggregates’ from DDD) will also tend to move architectures away from ‘kitchen-sink’ business object definitions maintained in a single code-base into multiple discrete but overlapping schemas maintained by different code-bases, and triggered by common or related events. This will ultimately lead to an increase in the number of independently managed datastores in an organisation, all of which need management and governance across multiple environments.

For on-premise solutions, the pressure to keep the number of datastore options down, while dealing with an explosion in instances, is going to limit application data architecture choices, increase application complexity (to cope with datastore impedance mismatch), and reduce the benefits from migrating to a microservices architecture (shared datastores favor a monolithic architecture).

Cloud Changes Everything

So how does cloud fundamentally change how we deal with data management and governance? The most obvious benefit cloud brings is the variety of data storage services available, covering all the typical use cases applications need. Capacity and provisioning are no longer operational concerns, as they are handled by the cloud provider. So data store resource requirements can now be formulated in code (e.g., in CloudFormation, Terraform, etc).

This, in principle, allows applications (microservices) to choose the most appropriate storage solution for their problem domain, and to minimize the need for long-term forward planning.

Using code to specify and provision database services also has another advantage: cloud service providers typically offer the means to tag all instantiated services with your own meta-data. So you can define and implement your own data management tagging standards, and enforce these using tools provided by the cloud provider. These can be particularly useful when integrating with established data discovery tools, which depend on reliable meta-data sources. For example, tags can be defined based on a data ontology defined by the chief data office (see my previous article on CDO).
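As a sketch of what such a tagging standard might look like in practice – the tag keys and allowed values below are invented for illustration, not a real CDO ontology – a compliance check could be as simple as:

```python
# Hypothetical tagging standard: these keys and values are illustrative
# only. A check like this could run in a provisioning pipeline to reject
# non-compliant datastore instances before they are created.
REQUIRED_TAGS = {"data-owner", "data-classification", "retention-period"}
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential", "restricted"}

def validate_tags(tags: dict) -> list:
    """Return a list of compliance violations (empty means compliant)."""
    violations = []
    missing = REQUIRED_TAGS - tags.keys()
    if missing:
        violations.append(f"missing required tags: {sorted(missing)}")
    classification = tags.get("data-classification")
    if classification and classification not in ALLOWED_CLASSIFICATIONS:
        violations.append(f"unknown classification: {classification}")
    return violations

assert validate_tags({
    "data-owner": "cdo-team",
    "data-classification": "internal",
    "retention-period": "7y",
}) == []
```

The same schema can then feed data discovery tools, since every provisioned resource carries machine-readable governance meta-data from birth.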

These mechanisms can be highly automated via service catalogs (such as AWS Service Catalog or ServiceNow), which allow compliant stacks to be provisioned without requiring developers to directly access the cloud provider’s APIs.

Let a thousand flowers bloom

The obvious downside to letting teams select their own storage solutions is the likely explosion of data stores – even if they are selected from a managed service catalog. But the expectation is that each distinct store would be relatively simple – at least compared to relational stores, which support many application use cases and queries in a single database.

In on-premise situations, data integration is also a real challenge – usually addressed by a myriad of ad-hoc jobs and processes whose purpose is to extract data from one system and send it to another (i.e., ETL). Usually no meta-data exists around these processes, except that afforded by proprietary ETL systems.

In best case integration scenarios, ‘glue’ data flows are implemented in enterprise service buses that generally will have some form of governance attached – but which usually has the undesirable side-effect of introducing yet another dependency for development teams which needs planning and resourcing. Ideally, teams want to be able to use ‘dumb’ pipes for messaging, and be able to self-serve their message governance, such that enterprise data governance tools can still know what data is being published/consumed, and by whom.

Cloud provides two main game-changing capabilities for managing data complexity at scale. Specifically:

  • All resources that manage data can be tagged with appropriate meta-data – without needing to, for example, examine tables or know anything about the specifics of the data service. This can also extend to messaging services.
  • Serverless functions (e.g., AWS Lambda, Azure Functions, etc) can be used to implement ‘glue’ logic, and can themselves be tagged and managed in an automated way. Serverless functions can also be used to do more intelligent updates of data management meta-data – for example, update a specific repository when a particular service is instantiated, etc. Serverless functions can be viewed as on-demand microservices which may have their own data stores – usually provided via a managed service.
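The second point can be sketched as follows; the event shape and the in-memory registry are hypothetical stand-ins for a cloud configuration-change notification and a managed metadata store:

```python
# Hedged sketch: a serverless-style handler that keeps a data-management
# registry current. The event fields and the dict-based registry are
# invented stand-ins for a real configuration-change event and metadata
# repository.
registry = {}

def handle_resource_event(event: dict, registry: dict = registry) -> dict:
    """Record (or forget) a datastore instance when it is created/deleted."""
    resource_id = event["resource_id"]
    if event["action"] == "created":
        registry[resource_id] = {
            "type": event["resource_type"],
            "tags": event.get("tags", {}),
        }
    elif event["action"] == "deleted":
        registry.pop(resource_id, None)
    return registry

handle_resource_event({
    "action": "created",
    "resource_id": "orders-db",
    "resource_type": "document-store",
    "tags": {"data-owner": "orders-team"},
})
assert "orders-db" in registry
```

Because the handler is triggered by the platform itself on every configuration change, the registry stays current without any human process in the loop.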

Data, Data Everywhere

By adopting a cloud-enabled microservice architecture – using datastore services provisioned by code, applying event-driven architecture, leveraging serverless functions, and engaging with the chief data officer on meta-data standards – it becomes possible to have an unprecedented, up-to-date view of what data exists in an organization and where. It may even capture static views of data in motion (through tagging queue and notification topic resources). The data would be maintained via policies and rules implemented in service catalog templates and lambda functions triggered automatically by cloud configuration changes, so it would always be current and correct.

The CDO, as well as data and enterprise architects, would be the chief consumers of this metadata – either directly or as inputs into other applications, such as data governance tools, etc.

Conclusion

The ultimate goal is to avoid data management and governance processes which rely on reactive human (IT) input to maintain high-quality data management metadata. Reliable metadata can give rise to a whole new range of capabilities for stakeholders across the enterprise, and finally take IT out of the loop for business-as-usual data management queries, freeing up valuable resources for building even more data-driven applications.


The future of modularity is..serverless

[tl;dr As platform solutions evolve and improve, the pressure for firms to reduce costs, increase agility and be resilient to failure will drive teams to adopt modern infrastructure platform solutions, and in the process decompose and simplify monoliths, adopt microservices and ultimately pave the way to building naturally modular systems on serverless platforms.]

“Modularity” – the (de)composition of complex systems into independently composable or replaceable components without sacrificing performance, security or usability – is an architectural holy grail.

Businesses may be modular (commonly expressed through capability maps), and IT systems can be modular. IT modularity can also be described as SOA (Service Oriented Architecture), although because of many aborted attempts at (commercializing) SOA in the past, the term is no longer in fashion. Ideally, the relationship between business ‘modules’ and IT application modules should be fully aligned (assuming the business itself has a coherent underlying business architecture).

Microservices are the latest manifestation of SOA, but this is born from a fundamentally different way of thinking about how applications are developed, tested, deployed and operated – without the need for proprietary vendor software.

Serverless takes the microservices concept one step further, by removing the need for developers (or, indeed, operators) to worry about looking after individual servers – whether virtual or physical.

A brief history of microservices

Commercial manifestations of microservices have been around for quite a while – for example Spring Boot, or OSGi for Java – but these have commercial roots, and each implements a framework tied to a particular language. Firms may successfully implement these technologies, but they will need to have already gone through much of the microservices stone soup journey. It is not possible to ‘buy’ a microservices culture from a technology vendor.

Because microservices are intended to be independently testable and deployable components, a microservices architecture inherently rejects the notion of a common framework for implementing/supporting the microservices nature of an application. This therefore puts the onus on the infrastructure platform to provide all the capabilities needed to build and run microservices.

So, capabilities like naming, discovery, orchestration, encryption, load balancing, retries, tracing, logging, monitoring, etc which used to be handled by language-specific frameworks are now increasingly the province of the ‘platform’. This greatly reduces the need for complex, hard-to-learn frameworks, but places a lot of responsibility on the platform, which must handle these requirements in a language-neutral way.

Currently, the most popular ‘platforms’ are the major cloud providers (Azure, Google, AWS, Digital Ocean, etc), IaaS vendors (e.g., VMware, HPE), core platform building blocks such as Kubernetes, and platform solutions such as Pivotal Cloud Foundry, OpenShift and Mesosphere. (IBM’s BlueMix/Cloud is likely to be superseded by Red Hat’s OpenShift.)

The latter solutions previously had their own underlying platform solutions (e.g., OSGi for BlueMix, Bosh for PKS), but most platform vendors have now shifted to use Kubernetes under the hood. These solutions are intended to work in multiple cloud environments or on-premise, and therefore in principle allow developers to avoid caring about whether their applications are deployed on-premise or on-cloud in an IaaS-neutral way.

Decomposing Monolithic Architectures

With the capabilities these platforms offer, developers will be incentivized to decompose their applications into logical, distributed functional components, because the marginal additional cost of maintaining/monitoring each new process is relatively low (albeit definitely not zero). This approach is naturally amenable to supporting event driven architectures, as well as more conventional RESTful and RPC architectures (such as gRPC), as running processes can be mapped naturally to APIs, services and messages.

But not all processes need to be running constantly – and indeed, many processes are ‘out-of-band’ processes, which serve as ‘glue’ to connect events that happen in one system to another system: if events are relatively infrequent (e.g., less than one every few seconds), then no resources need to be used in-between events. So provisioning long-running Docker containers etc may be overkill for many of these processes – especially if the ‘state’ required by those processes can be made available in a low-latency, highly available long-running infrastructure service such as a high-performance database or cache.
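A minimal sketch of such an out-of-band ‘glue’ function – stateless between invocations, with shared state held in an external cache (here faked with a dict), and all field names invented for illustration:

```python
# Illustrative 'glue' function: invoked only when an event arrives, and
# holding no state of its own between invocations. PRICE_CACHE stands in
# for a low-latency managed cache; the event/output schemas are invented.
PRICE_CACHE = {"A-100": 9.99}

def enrich_trade_event(event: dict) -> dict:
    """Translate an upstream event into the downstream system's schema,
    enriching it from shared external state rather than local memory."""
    price = PRICE_CACHE[event["sku"]]
    return {
        "instrument": event["sku"],
        "quantity": event["qty"],
        "notional": round(price * event["qty"], 2),
    }

assert enrich_trade_event({"sku": "A-100", "qty": 3}) == {
    "instrument": "A-100", "quantity": 3, "notional": 29.97,
}
```

Because the function consumes no resources between events, it maps naturally onto an on-demand serverless runtime rather than a permanently provisioned container.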

Functions on Demand

Enter ‘serverless’, which aims to specify the resources required to execute a single piece of code (basically a functional monolith) on-demand in a single package – roughly the equivalent of, for example, a declarative service in OSGi. The runtime in which the code runs is not the concern of the developer in a serverless architecture. There are no VMs, containers or side-cars – only functions communicating via APIs and events.

Currently, the serverless offerings by the major cloud providers are really only intended for ‘significant’ functions which justify the separate allocation of compute, storage and network resources needed to run them. A popular use case is ‘transformational’ functions which convert binary data from one form to another – e.g., create a thumbnail image from a full image – which may temporarily require a lot of CPU or memory. In contrast, an OSGi Declarative Service, for example, could be instantiated by the runtime inside the same process/memory space as the calling service – a handy technique for validating a modular architecture without worrying about the increased failure modes of a distributed system, while allowing the system to be readily scaled out in the future.

Modular Architectures vs Distributed Architectures

Serverless functions can be viewed as ‘modules’ by another name – albeit modules that happen to require memory, compute and storage allocated separately from the calling component. While this is a natural fit for browser-based applications, it is not a great fit for monolithic applications that would benefit from modular architectures, but not necessarily benefit from distributed architectures. For legacy applications, the key architectural question is whether it is necessary or appropriate to modularize the application prior to distributing the application or migrating it to an orchestration platform such as Kubernetes, AWS ECS, etc.

As things currently stand, the most appropriate (lowest risk) migration route for complex monolithic applications is likely to be a migration of some form to one of the orchestrated platforms identified above. By allowing the platform to take care of ‘non-functional’ features (such as naming, resilience, etc), perhaps the monolith can be simplified. Over time, the monolith can then be decomposed into modular ‘microservices’ aligned by APIs or events, and perhaps eventually some functionality could decompose into true serverless functions.

Serverless and Process Ownership

Concurrently with decomposing the monolith, a (significant) subset of features – mainly those not built directly using the application code-base, or which straddle two applications – may be meaningfully moved to serverless solutions without depending on the functional decomposition of the monolith.

It’s interesting to note that such an architectural move may allow process owners to own these serverless functions, rather than relying on application owners, where often, in large enterprises, it isn’t even clear which application owner should own a piece of ‘glue’ code, or be accountable when such code breaks due to a change in a dependent system.

In particular, existing ‘glue’ code which relies on centralized enterprise service buses or equivalent would benefit massively from being migrated to a serverless architecture. This not only empowers teams that look after the processes the glue code supports, but also ensures optimal infrastructure resource allocation, as ESBs can often be heavy consumers of infrastructure resources. (Note that a centralized messaging system may still be needed, but this would be a ‘dumb pipe’, and should itself be offered as a service.)

Serverless First Architecture

Ultimately, nirvana for most application developers and businesses is a ‘serverless-first’ architecture, where delivery velocity is limited only by the capabilities of the development team, and solutions scale both in function and in usage seamlessly without significant re-engineering. It is fair to say that serverless is a long way from achieving this nirvana (technologies like ‘AIOps’ have a long way to go), and most teams still have to shift from monolithic to modular and distributed thinking, while knowing when a monolith is still the most appropriate solution for a given problem.

As platform solutions improve and mature, however, and the pressure mounts on businesses whose value proposition is not in the platform engineering space to reduce costs, increase agility and be increasingly resilient to failures of all kinds, the path from monolith to orchestrated microservices to serverless (and perhaps ‘low-code’) applications seems inevitable.


What I realized from studying AWS Services & APIs

[tl;dr The weakest link for firms wishing to achieve business agility is principally based around the financial and physical constraints imposed by managing datacenters and infrastructure. The business goals of agile, devops and enterprise architecture are fundamentally unachievable unless these constraints can be fully abstracted through software services.]

Background

Anybody who grew up with technology during the PC generation (1985-2005) will have developed software with a fairly deep understanding of how the software worked from an OS/CPU, network, and storage perspective. Much of that generation will have had some formal education in the basics of computer science.

Initially, the PC generation did not have to worry about servers and infrastructure: software ran on PCs. As PCs became more networked, dedicated PCs to run ‘server’ software needed to be connected to the desktop PCs. And folks tasked with building software to run on the servers would also have to buy higher-spec PCs for server-side, install (network) operating systems, connect them to desktop PCs via LAN cables, install disk drives and databases, etc. This would all form part of the ‘waterfall’ project plan to deliver working software, and would all be rather predictable in timeframes.

As organizations added more and more business-critical, network-based software to their portfolios, organization structures were created for datacenter management, networking, infrastructure/server management, storage and database provisioning and operation, middleware management, etc, etc. A bit like the mainframe structures that preceded the PC generation, in fact.

Introducing Agile

And so we come to Agile. While Agile was principally motivated by the flexibility HTML offered for GUI design (vs traditional GUI toolkits) – basically allowing development teams to iterate rapidly over, and improve on, different implementations of UI – ‘Agile’ quickly became more ‘enterprise’ oriented, as planning and coordinating demand across multiple teams, both infrastructure and application development, was rapidly becoming a massive bottleneck.

It was, and is, widely recognized that these challenges are largely cultural – i.e., that if only teams understood how to collaborate and communicate, everything would be much better for everyone – all the way from the top down. And so a thriving industry exists in coaching firms how to ‘improve’ their culture – aka the ‘agile industrial machine’.

Unfortunately, it turns out there is no silver bullet: the real goal – organizational or business agility – has been elusive. Big organizations still expend vast amounts of time and resources doing small incremental change, most activity is involved in maintaining/supporting existing operations, and truly transformational activities which bring an organization’s full capabilities together for the benefit of the customer still do not succeed.

The Reality of Agile

The basic tenet behind Agile is the idea of cross-functional teams. However, most teams in organizations are unable to align themselves perfectly with the demand they receive (i.e., the equivalent of providing a customer account manager), and even if they could, the number of participants in a typical agile ‘scrum’ or ‘scrum of scrums’ meeting would quickly exceed the consensus maximum of about 9 participants for a scrum to be effective.

So most agile teams resort to the only agile they know – i.e., developers, QA and maybe product owner and/or scrum-master participating in daily scrums. Every other dependency is managed as part of an overall program of work (with communication handled by a project/program manager), or through on-demand ‘tickets’ whereby teams can request a service from other teams.

The basic impact of this is that pre-planned work (resources) gets prioritized ahead of on-demand ‘tickets’ (excluding tickets relating to urgent operational issues), and so agile teams are forced to compromise the quality of their work (if they can proceed at all).

DevOps – Managing Infrastructure Dependencies

DevOps is a response to the widening communications/collaboration chasm between application development teams and infrastructure/operations teams in organizations. It recognizes that operational and infrastructural concerns are inherent characteristics of software, and software should not be designed without these concerns being first-class requirements along with product features/business requirements.

On the other hand, infrastructure/operations providers, being primarily concerned with stability, seek to offer a small number of efficient standardized services that they know they can support. Historically, infrastructure providers could only innovate and adapt as fast as hardware infrastructure could be procured, installed, supported and amortized – which is to say, innovation cycles measured in years.

In the meantime, application development teams are constantly pushing the boundaries of infrastructure – principally because most business needs can be realized in software, with sufficiently talented engineers, and those tasked with building software often assume that infrastructure can adapt as quickly.

Microservices – Managing AppDev Team to AppDev Team Dependencies

While DevOps is a response to friction in the engagement between application development and infrastructure/operations teams, microservices can be usefully seen as a response to how application development teams manage dependencies on each other.

In an ideal organization, an application development team can leverage/reuse capabilities provided by another team through their APIs, with minimum pre-planning and up-front communication. Teams would expose formal APIs with relevant documentation, and most engagement could be confined to service change requests from other teams and/or major business initiatives. Teams would not be required to test/deploy in lock-step with each other.

Such collaboration between teams would need to be formally recognized by business/product owners as part of the architecture of the platform – i.e., a degree of ‘mechanical sympathy’ is needed by those envisioning new business initiatives to know how best to leverage, and extend, software building blocks in the organization. This is best done by Product Management, who must steward the end-to-end business and data architecture of the organization or value-stream in partnership with business development and engineering.

Putting it all together

To date, most organizations have been fighting a losing battle. The desire to do agile and devops is strong, but the fundamental weakness in the chain is the ability of internal infrastructure providers and operators to move as fast as software development teams need them to – an issue as much related to financial management as to managing physical buildings, hardware, etc.

What cloud providers are doing is creating software-level abstractions of infrastructure services, allowing the potential of agile, devops and microservices to begin to be realized in practice.

Understanding these services and abstractions is like re-learning the basic principles of Computer Science and Engineering – but through a ‘service’ lens. The same issues need to be addressed, the same technical challenges exist. Except now some aspects of those challenges no longer need to be solved by organizations (e.g., how to efficiently abstract infrastructure services at scale), and businesses can focus on designing the infrastructure services that match the needs of application developers (rather than a compromise).

Conclusion

The AWS Service Catalog and APIs are an extraordinary achievement (as is similar work by other cloud providers, although they have yet to achieve the catalog breadth that AWS has). Architects need to know and understand these service abstractions and focus on matching application needs with business needs, and can worry less about the traditional constraints infrastructure organizations have had to work with.

In many respects, these abstractions will vary across providers only in syntax and features. Ultimately (probably at least 10 years from now) all commodity services will converge, or be available through efficient ‘cross-plane’ solutions which abstract providers. So that is why I am choosing to ‘go deep’ on the AWS APIs. This is, in my opinion, the most concrete starting point to helping firms achieve ‘agile’ nirvana.


The hidden costs of PaaS & microservice engineering innovation

[tl;dr The leap from monolithic application development into the world of PaaS and microservices highlights the need for consistent collaboration, disciplined development and a strong vision in order to ensure sustainable business value.]

The pace of innovation in the PaaS and microservice space is increasing rapidly. This, coupled with increasing pressure on ‘traditional’ organisations to deliver more value more quickly from IT investments, is causing a flurry of interest in PaaS enabling technologies such as Cloud Foundry (favoured by the likes of IBM and Pivotal), OpenShift (favoured by RedHat), Azure (Microsoft), Heroku (SalesForce), AWS, Google App Engine, etc.

A key characteristic of all these PaaS solutions is that they are ‘devops’ enabled – i.e., it is possible to automate both code and infrastructure deployment, paving the way for highly automated operational processes for applications built on these platforms.

For large organisations, or organisations that prefer to control their infrastructure (because of, for example, regulatory constraints), PaaS solutions that can be run in a private datacenter rather than the public cloud are preferable, as this preserves the option to deploy to external clouds in the future if needed/appropriate.

These PaaS environments are feature-rich and aim to provide a lot of the building blocks needed to build enterprise applications. Other framework initiatives, such as Spring Boot, Dropwizard and Vert.x, also aim to make it easier to build PaaS-based applications.

Combined, all of these promise to provide a dramatic increase in developer productivity: the marginal cost of developing, deploying and operating a complete application will drop significantly.

Due to the low capital investment required to build new applications, it becomes ever more feasible to move from a heavy-weight, planning intensive approach to IT investment to a more agile approach where a complete application can be built, iterated and validated (or not) in the time it takes to create a traditional requirements document.

However, this also has massive implications, as – left unchecked – the drift towards entropy will increase over time, and organisations could be severely challenged to effectively manage and generate value from the sheer number of applications and services that can be created on such platforms. So an eye on managing complexity should be in place from the very beginning.

Many of the above platforms aim to make it as easy as possible for developers to get going quickly: this is a laudable goal, and if more of the complexity can be pushed into the PaaS, then that can only be good. The consequence of this approach is that developers have less control over the evolution of key aspects of the PaaS, and this could cause unexpected issues as PaaS upgrades conflict with application lifecycles, etc. In essence, it could be quite difficult to isolate applications from some PaaS changes. How these frameworks help developers cope with such changes is something to closely monitor, as these platforms are not yet mature enough to have gone through a major upgrade with a significant number of deployed applications.

The relative benefit/complexity trade-off between established microservice frameworks such as OSGi and easier to use solutions such as described above needs to be tested in practice. Specifically, OSGi’s more robust dependency model may prove more useful in enterprise environments than environments which have a ‘move fast and break things’ approach to application development, especially if OSGi-based PaaS solutions such as JBoss Fuse on OpenShift and Paremus ServiceFabric gain more popular use.

So: all well and good from the technology side. But even if the pros and cons of the different engineering approaches are evaluated and a perfect PaaS solution emerges, that doesn’t mean Microservice Nirvana can be achieved.

A recent article on the challenges of building successful microservice applications, coupled with a presentation by Lisa van Gelder at a recent Agile meetup in New York City, has emphasised that even with the right enabling technologies, deploying microservices is a major challenge – but if done right, the rewards are well worth it.

Specifically, there are a number of factors that impact the success of a large-scale or enterprise microservice-based strategy, including but not limited to:

  • Shared ownership of services
  • Setting cross-team goals
  • Performing scrum of scrums
  • Identifying swim lanes – isolating component failure & eventually-consistent data
  • Provision of Circuit breakers & Timeouts (anti-fragile)
  • Service discoverability & clear ownership
  • Testing against stubs; customer driven contracts
  • Running fake transactions in production
  • SLOs and priorities
  • Shared understanding of what happens when something goes wrong
  • Focus on mean-time-to-repair (recover) rather than mean-time-to-failure
  • Use of common interfaces: deployment, health check, logging, monitoring
  • Tracing a user’s journey through the application
  • Collecting logs
  • Providing monitoring dashboards
  • Standardising common metric names
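Two of the factors above – circuit breakers and timeouts – lend themselves to a short sketch. The class below is a minimal, illustrative circuit breaker (not taken from any specific library): after a run of consecutive failures the circuit ‘opens’ and calls fail fast until a reset window has elapsed, protecting callers from hammering a broken downstream service. The thresholds and names are assumptions for illustration only.

```python
import time

class CircuitBreaker:
    """Minimal circuit-breaker sketch: after `max_failures` consecutive
    failures the circuit opens and calls fail fast until `reset_after`
    seconds have passed (names and thresholds are illustrative)."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                # Fail fast instead of waiting on a broken dependency
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

In practice the wrapped call would also carry its own timeout, so a slow dependency counts as a failure rather than blocking the caller indefinitely.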

Some of these can be technically provided by the chosen PaaS, but a lot is based around the best practices consistently applied within and across development teams. In fact, it is quite hard to capture these key success factors in traditional architectural views – something that needs to be considered when architecting large-scale microservice solutions.

In summary, the leap from monolithic application development into the world of PaaS and microservices highlights the need for consistent collaboration, disciplined development and a strong vision in order to ensure sustainable business value.

How ‘uncertainty’ can be used to create a strategic advantage

[TL;DR This post outlines a strategy for dealing with uncertainty in enterprise architecture planning, with specific reference to regulatory change in financial services.]

One of the biggest challenges for anyone involved in technology is balancing the need to address the immediate requirement against the need to prepare for future change, at the risk of over-engineering a solution.

The wrong balance over time results in complex, expensive legacy technology that ends up being an inhibitor to change, rather than an enabler.

It should not be unreasonable to expect that, over time, and with appropriate investment, any firm should have significant IT capability that can be brought to bear for a multitude of challenges or opportunities – even those not thought of at the time.

Unfortunately, most legacy systems are so optimised for the specific problem they were commissioned to solve that they often cannot be easily or cheaply adapted to new scenarios or problem domains.

In other words, as more functionality is added to a system, the ability to change it diminishes rapidly:

Agility vs Functionality

The net result is that the technology platform already in place is optimised to cope with existing business models and practices, but generally incapable of (cost effectively) adapting to new business models or practices.

Addressing this requires some forward thinking: specifically, what capabilities need to be developed to support where the business needs to be, given the large number of known unknowns? (Accepting that everybody is in the same boat when it comes to dealing with unknown unknowns.)

These capabilities are generally determined by external factors – trends in the specific sector, technology, society, economics, etc, coupled with internal forward-looking strategies.

An excellent example of where a lack of focus on capabilities has caused structural challenges is the financial industry. A recent conference at the Bank for International Settlements (BIS) has highlighted the following capability gaps in how banks do their IT – at least as it relates to regulators’ expectations:

  • Data governance and data architecture need to be optimised in order to enhance the quality, accuracy and integrity of data.
  • Analytical and reporting processes need to facilitate faster decision-making and direct availability of the relevant information.
  • Processes and databases for the areas of finance, control and risk need to be harmonised.
  • Increased automation of the data exchange processes with the supervisory authorities is required.
  • Fast and flexible implementation of supervisory requirements by business units and IT necessitates a modular and flexible architecture and appropriate project management methods.

The interesting aspect about the above capabilities is that they span multiple businesses, products and functional domains. Yet for the most part they do not fall into the traditional remit of typical IT organisations.

The current state of technology is capable of delivering these requirements from a purely technical perspective: these are challenging problems, but for the most part they have already been solved, or are being solved, in other industries or sectors – sometimes at an even larger scale than banks have to deal with. However, finding talent is, and remains, an issue.

The big challenge, rather, is in ‘business-technology’: that amorphous space that is not quite business but not quite (traditional) IT either. This is the capability that banks need to develop: the ability to interpret what outcomes a business requires, and map that not only to projects, but also to capabilities – both business capabilities and IT capabilities.

So, what core capabilities are being called out by the BIS? Here’s a rough initial pass (by no means complete, but hopefully indicative):

  • Data Governance: Increased focus on Data Ontologies, Semantic Modelling, Linked/Open Data (RDF), Machine Learning, Self-Describing Systems, Integration
  • Analytics & Reporting: Application of Big Data techniques for scaling timely analysis of large data sets, not only for reporting but also as part of feedback loops into automated processes. A data-science approach to analytics.
  • Processes & Databases: Use of meta-data in exposing capabilities that can be orchestrated by many business-aligned IT teams to support specific end-to-end business processes. Databases only exposed via prescribed services; model-driven product development; business architecture.
  • Automation of data exchange: Automation of all report generation, approval, publishing and distribution (i.e., throwing people at the problem won’t fix this)
  • Fast and flexible implementation: Adoption of modular-friendly practices such as portfolio planning, domain-driven design, enterprise architecture, agile project management, & microservice (distributed, cloud-ready, reusable, modular) architectures
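The ‘databases only exposed via prescribed services’ idea above can be sketched minimally: callers may only invoke named, pre-approved queries, never ad-hoc SQL, so the database owner governs exactly what is exposed. The query names and schema below are hypothetical, chosen purely for illustration.

```python
import sqlite3

# Only these named queries are exposed; ad-hoc SQL is rejected.
# Query names and schema are hypothetical.
PRESCRIBED_QUERIES = {
    "trades_by_desk": "SELECT id, desk, notional FROM trades WHERE desk = ?",
    "trade_count": "SELECT COUNT(*) FROM trades",
}

def run_prescribed(conn, query_name, params=()):
    """Run a pre-approved query by name; anything else is refused."""
    if query_name not in PRESCRIBED_QUERIES:
        raise PermissionError(f"query '{query_name}' is not a prescribed service")
    return conn.execute(PRESCRIBED_QUERIES[query_name], params).fetchall()

# In-memory demo database standing in for the governed store
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (id INTEGER, desk TEXT, notional REAL)")
conn.executemany("INSERT INTO trades VALUES (?, ?, ?)",
                 [(1, "rates", 1e6), (2, "fx", 5e5), (3, "rates", 2e6)])
```

In a real deployment the prescribed queries would sit behind a service endpoint rather than an in-process function, but the governance principle is the same: the data owner, not the caller, defines the access paths.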

It should be obvious from this list that it will not be possible or feasible to outsource these capabilities. Individual capabilities are not developed in isolation: they complement and support each other. Therefore they need to be developed and maintained in-house – although vendors will certainly have a role in successfully delivering these capabilities. And these skills are quite different from the skills existing business & IT folks have (although some are evolutionary).

Nobody can accurately predict what systems need to be built to meet the demands of any business in the next 6 months, let alone 3 years from now. But the capabilities that separate the winners from the losers in a given sector are easier to identify. Banks in particular are under massive pressure: regulatory demands, major shifts in market dynamics, competition from smaller, more nimble alternative financial service providers, and rapidly diminishing technology infrastructure costs that are levelling the playing field for new contenders.

Regulators have, in fact, given banks a lifeline: those that heed the regulators and take appropriate action will actually be in a strong position to deal competitively with significant structural change to the financial services industry over the next 10+ years.

The changes (client-centricity, digital transformation, regulatory compliance) that all knowledge-based industries (especially finance) will go through will depend heavily on all of the above capabilities. So this is an opportunity for financial institutions to get 3 for the price of 1 in terms of strategic business-IT investment.


THE FUTURE OF ESBs (AND SOA)

There are some interesting changes happening in technology, which will likely fundamentally change how IT approaches technology like Enterprise Service Buses (ESBs) and concepts like Service Oriented Architecture (SOA).

Specifically, those changes are:

  • An increased focus on data governance, and
  • Microservice technology

Let’s take each in turn, and conclude by suggesting how this will impact how ESBs and SOA will likely evolve.

Data Governance

Historically, IT has had an inconsistent record with respect to data governance. For sure, each application often had dedicated data modellers or designers, but its data architecture tended to be very inward focused. Integration initiatives tended to focus on specific projects with specific requirements, and data was governed only to the extent that it enabled individual project objectives to be achieved.

Sporadic attempts at creating standard message structures and dictionaries crumbled in the face of meeting tight deadlines for critical business deliverables.

ESBs, except in the most stable, controlled environments, failed to deliver the anticipated business benefits: heavy-weight ESBs turned out to be at least as un-agile as the applications they were intended to integrate, and since the requirements on the bus evolve continually, application teams tended to favour reliable (or at least predictable) point-to-point solutions over enterprise solutions.

But there are three new drivers for improving data governance across the enterprise, and not just at the application level. These are:

  • Security/Privacy
  • Digital Transformation
  • Regulatory Control

The security/privacy agenda is the most visible, as organisations are extremely exposed to reputational risk if there are security breaches. An organisation needs to know what data it has where, and who has access to it, in order to ensure it can protect it.
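Knowing what data an organisation has where, and who has access to it, is at heart an inventory problem. The sketch below shows one minimal shape such an inventory record could take; every field name and example value here is illustrative, not a reference to any real catalogue product.

```python
from dataclasses import dataclass, field

# Minimal data-inventory sketch for the security/privacy driver:
# record what data sets exist, where they live, and who may read them.
# All names and example values are illustrative.
@dataclass
class DataSetRecord:
    name: str
    location: str            # system or store holding the data
    classification: str      # e.g. "public", "internal", "confidential"
    owners: list = field(default_factory=list)
    readers: set = field(default_factory=set)

    def can_read(self, principal):
        # Owners implicitly have read access
        return principal in self.readers or principal in self.owners

inventory = {}

def register(record):
    inventory[record.name] = record

register(DataSetRecord(
    name="client_positions",
    location="warehouse.eu-west",
    classification="confidential",
    owners=["risk-team"],
    readers={"regulatory-reporting"},
))
```

The value of such a record only materialises if registration happens automatically at provisioning time rather than via a human process – which is exactly the cloud-enabled shift this post argues for.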

Digital transformation means that every process is a digital-first process (or ‘straight-through-processing’ in the parlance of financial services). Human intervention should only be required to handle exceptions. And it means that the capabilities of the entire enterprise need to be brought to bear in order to provide a consistent connected customer experience.

For regulated industries, government regulators are now insisting that firms govern their data throughout that data’s entire lifecycle, not only from a security/privacy compliance perspective, but also from the perspective of being able to properly aggregate and report on regulated data sets.

The same governance principles, policies, processes and standards within an enterprise should underpin all three drivers – hence the increasing focus on establishing the role of ‘chief data officer’ within organisations, and resourcing that role to materially improve how firms govern their data.

Microservice Technology

Microservice technology is an evolution of modularity in monolithic application design that started with procedures, and evolved through to object-oriented programming, and then to packages/modules (JARs and DLLs etc).

Along the way were attempts to extend the metaphor to distributed systems – e.g., RPC, CORBA, SOA/SOAP, and most recently RESTful APIs – in addition to completely different ‘message-driven’ approaches such as that advocated by the Reactive Development community.

Unfortunately, until fairly recently, most applications behind distributed end-points were architecturally monolithic – i.e., complex applications that needed to go through significant build-test-deploy processes for even minor changes, making it very difficult to adapt these applications in a timely manner to external change factors, such as integrations.

The microservices movement is a reaction to this, where the goal is to be able to deploy microservices as often as needed, without the risk of breaking the entire application (or with a simple rollback process if it does break). In addition, microservice architectures are inherently amenable to horizontal scaling, a key factor behind their use within internet-scale technology companies.

So, microservices are an architectural style that favours agile, distributed deployment.

As such, one benefit behind the use of microservices is that it allows teams, or individuals within teams, to take responsibility for all aspects of the microservice over its lifetime. In particular, where microservices are exposed to external teams, there is an implied commitment from the team to continue to support those external teams throughout the life of the microservice.

A key aspect of microservices is that they are fairly lightweight: the developer is in control. There is no need for specific heavyweight infrastructure – in fact, microservices favour anti-fragile architectures, with abundant low-cost infrastructure.

Open standards such as OSGi and abstractions such as Resource Oriented Computing allow microservices to participate in a governed, developer-driven context. And in the default (simplest) case, microservices can be exposed using plain-old RESTful standards, which every web application developer is at least somewhat familiar with.
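To make the ‘plain-old RESTful’ default concrete, here is a minimal sketch of a microservice endpoint using only the Python standard library, exposing a health-check route of the kind the earlier success factors call for. The routes and payloads are illustrative assumptions, not any particular framework’s conventions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle(path):
    """Route a GET path to a (status, payload) pair.
    Routes and payloads here are illustrative."""
    if path == "/health":
        return 200, {"status": "ok"}          # common health-check interface
    if path == "/greetings/hello":
        return 200, {"message": "hello"}      # a hypothetical resource
    return 404, {"error": "not found"}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, payload = handle(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve: HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()
```

Keeping routing in a plain function like `handle` also makes the endpoint trivially testable against stubs – another of the success factors listed earlier.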

Data Governance + Microservices = Enterprise Building Blocks

Combining the benefits of both data governance and microservices means that firms for the first time can start building up a real catalog of enterprise-reusable building blocks – but without the need for a traditional ESB, or traditional ESB governance. Microservices are developed in response to developer needs (perhaps influenced by Data Governance standards), and Data Standards can be used to describe, in an enterprise context, what those (exposed) microservices do.
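A minimal sketch of what such a catalog could look like: each exposed microservice is described by the governed data entities it serves, so stakeholders can discover what data is available where. The service names, URLs and entity names below are entirely hypothetical.

```python
# Catalog of enterprise building blocks: each exposed microservice is
# described by the governed data entities it exposes. All names and
# endpoints below are hypothetical.
catalog = {}

def publish(service, endpoint, entities):
    """Register an exposed microservice and the data entities it serves."""
    catalog[service] = {"endpoint": endpoint, "entities": set(entities)}

def services_for(entity):
    """Discover every registered service exposing a given data entity."""
    return sorted(name for name, desc in catalog.items()
                  if entity in desc["entities"])

publish("trade-capture", "https://api.example.internal/trades",
        entities=["Trade", "Counterparty"])
publish("client-master", "https://api.example.internal/clients",
        entities=["Counterparty", "LegalEntity"])
```

The entity names are where data governance meets microservice delivery: if they come from an enterprise data standard rather than each team’s private vocabulary, the catalog becomes genuinely searchable across the firm.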

Because microservices technologies allow ‘smart endpoints’ to be easily created and integrated into an application architecture, the need for a central ‘bus’ is eliminated. Developers can create many endpoints with limited complexity overhead, and over time can converge these into a small number of common services.

With respect to the Service Registry function provided by ESBs, the new breed of API Management tools may be sufficient to provide any lookup/resolution capabilities required (above and beyond those provided by the microservice architecture itself). API Management tools also keep complexity out of API development by taking care of monitoring, analytics, authentication, protocol conversion and basic throttling capabilities – for those APIs that require those capabilities.

Culturally, however, microservices require a collaborative approach to software development and evolution, with minimum top-down command-and-control intervention. Data governance, on the other hand, is necessarily driven top-down. So there is a risk of a cultural conflict between top-down data governance and bottom-up microservice delivery: both sides need to be sensitive to the needs of the other, and be prepared to make compromises occasionally.

In conclusion, the ESB is dead…but long live (m)SOA.


Microservice Principles and Enterprise IT Architecture

The idea that the benefits embodied in a microservices approach to solution architecture are relevant to enterprise architecture is a solid one.

In particular, it allows bottom-up, demand-driven solution architectures to evolve, while providing a useful benchmark to assess if those architectures are moving in a way that increases the organization’s ability to manage overall complexity (and hence business agility).

Microservices cannot be mandated top-down the same way Services were intended to be. But fostering a culture and developing an IT strategy that encourages the bottom-up development of microservices will have a significant sustainable positive impact on a business’s competitiveness in a highly digital environment.

I fully endorse all the points Gene has made in his blog post.

Form Follows Function

Julia Set Fractal

Ruth Malan is fond of noting that “design is fractal”. In a comment on her post “We Just Stopped Talking About Design”, she observed:

We need to get beyond thinking of design as just a do once, up-front sort of thing. If we re-orient to design as something we do at different levels (strategic, system-in-context, system, elements and mechanisms, algorithms, …), at different times (including early), and iteratively and throughout the development and evolution of systems, then we open up the option that we (can and should) design in different media.

This fractal nature is illustrated by the fact that software systems and systems of systems belonging to an organization exist within an ecosystem dominated by that organization which is itself a system of systems of the social kind operating within a larger ecosystem (i.e. the enterprise). Just as structure follows strategy then becomes a constraint on strategy…

