The SOA strategy and the request-response pattern, or what happens when it is abused
Resistance to change is one of the main obstacles to the adoption of real-time integration patterns in an interoperability strategy based on Service Orientation. The request-response pattern is the most common answer to an integration requirement, and although in some cases it is the best option, in most situations it brings more problems than advantages. In addition, in my opinion, the abuse of this approach reflects a flaw, widespread in the profession, in how the data models of information systems are designed.
In this post we will analyze both issues.
WHAT IS THE REQUEST-RESPONSE PATTERN?
In the request-response pattern, a system publishes a service, and the systems interested in the information it provides invoke it whenever they need it. The call is synchronous from end to end: the transaction in the requesting system does not finish until it receives the response from the system that provides the service.
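To make the blocking nature of the call concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the `CustomerService` class, the `get_customer` method and the `show_customer_screen` function stand in for whatever service and requesting application a real ecosystem would have.

```python
import time


class ProviderUnavailable(Exception):
    """Raised when the publishing system cannot answer the request."""


class CustomerService:
    """Hypothetical provider: the system that publishes the service."""

    def __init__(self, records, available=True, latency_s=0.0):
        self.records = records
        self.available = available
        self.latency_s = latency_s

    def get_customer(self, customer_id):
        if not self.available:
            raise ProviderUnavailable("provider is down")
        time.sleep(self.latency_s)  # the caller stays blocked for this long
        return self.records[customer_id]


def show_customer_screen(service, customer_id):
    """Requester-side transaction: it cannot finish until the provider answers."""
    try:
        customer = service.get_customer(customer_id)  # synchronous, end-to-end call
    except ProviderUnavailable:
        # The requesting user pays for a failure in a system that is not theirs.
        return "Your request could not be processed. Please try again later."
    return f"Customer: {customer['name']}"


service = CustomerService({"c1": {"name": "Ana"}})
screen_ok = show_customer_screen(service, "c1")

service.available = False
screen_failed = show_customer_screen(service, "c1")
```

In this sketch the requester holds no data of its own: both the response time and the outcome of `show_customer_screen` are decided entirely by the provider.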
Main characteristics of the request-response pattern
1. A user of the requesting system triggers a transaction in that system. For the duration of the transaction, the user must wait for another system (the one that publishes the service) to respond. Usually the user does not need to know this detail, because in theory whatever happens “behind the curtain” of the interface is transparent to them. In this case, however, the user experience depends on the availability and performance of an external system.
2. The system that provides the service normally does not publish it for one particular requesting system, but for every system that may be interested in the information it provides. Its ability to handle many concurrent threads or transactions is therefore crucial in order not to harm other applications and their users within the ecosystem. The infrastructure and complexity needed to ensure adequate performance make the solution more expensive.
3. Usually, the schema of the data that travels in the request and in the response is owned by the system that publishes the service. From that system’s point of view this is logical, because any transformation between receiving the request and processing it would only add time to the response. From the point of view of the ecosystem, however, it adds coupling between the requesting systems and the system that provides the service. If the provider modifies its schema, every system that invokes the service must adapt its software too. Either all of them make the change simultaneously, or the provider must maintain two versions of the service while the consumers migrate to the new one. Evolutionary maintenance of the ecosystem becomes complicated and expensive, and the probability of errors in production increases significantly.
4. The information about the entity that is distributed across the ecosystem at any given time does not respect one of the most sacred principles in information technology: referential integrity. To guarantee it, all affected systems would have to call the published service almost simultaneously. Otherwise, the data is inconsistent. And we can agree that this is serious.
5. Finally, think of a moderately large ecosystem in which many systems use this pattern. The result is neither efficient, nor scalable, nor sustainable.
But there is still room for further reflection.
The user’s point of view
Let us put ourselves in the shoes of a user of the requesting system. Imagine that I have learned how this works, probably alerted by the many times my interface has told me that my action could not be completed and that I should try again in a few moments. I could wonder why my application has to go to another system to get information that I need here and now. Why can’t I have that information in my own system, so that I can use it when I need it? Is it less important than the rest of the information I get quickly and reliably from my system? Probably not. So what is the justification?
The answer I would receive from the corresponding technician would be something like this: “because that system is responsible for the entity you are consulting, and therefore that system has to provide the information.”
If I were that user, I could answer something like this: “Excuse me, but I should not be affected by the design of other systems, or by which entities they maintain and which they do not. In my work I need that information as much as any other information that I get immediately and without failure, but that information often produces errors or cannot be consulted at all. Therefore, for me, my application is poorly designed.”
And the user would be right. Specifically, the application’s data model would be badly designed, and the error would be a serious and basic one: an entity is missing, or at least some attributes are missing from an entity. The solution the technical team has given that user is to request the data that is missing from the data model from another system that does have it and has published a service.
WHAT DOES THE SOA STRATEGY ADVISE AS AN ALTERNATIVE TO THE REQUEST-RESPONSE PATTERN?
In an interoperability strategy based on Service Orientation, the problem should be approached very differently. Let’s look at some of the most important differences:
1. Decisions should be based on the interests of the information ecosystem, not on the technical capabilities of one of the systems that compose it. This argument should no longer be used: “since that system has published a service that returns that data, we will use it”.
2. The data model of each information system should be complete. If a system has among its requirements the need to have certain information associated with a particular entity, that entity or at least the attributes that are needed should be part of the data model. And of course there should be a way to keep that data perfectly up-to-date, like the rest of the entities and data that the application uses.
3. The flow of information should not be dictated by the technical peculiarities of each system, but by the natural flow of business information. It should arise, for example, from the analysis of the business processes involved, comparing the current operation (“as is” modeling) with the desired one (“to be” modeling). This analysis should lead to decisions on the technical work to be undertaken.
4. All information systems that need the data must have it with the same reliability, speed and referential integrity as any other information that users of the ecosystem can consult or that their applications can process. Otherwise, there are serious deficiencies in the quality of the data, and therefore business decisions are ill-informed.
5. The role of the system responsible for the affected entity remains fundamental, but with a radically different approach: instead of publishing a service that can be invoked at any time by an indeterminate number of applications, the provider of that information must communicate it immediately, in real time, when the business event that generates such information occurs (this is called an Event Driven Architecture).
6. To keep the coupling loose, the provider should not care about the destination of that information: sending ends when the message is delivered to the corresponding ESB. It is the middleware that maintains the list of systems subscribed to that event, guarantees delivery with the established retry policies and, in short, orchestrates the flow of information according to how the business processes operate.
7. The schema used to send the information should be based on a messaging standard. If none exists in the industry, it should be based on a corporate messaging design. In the worst case, the ESB would be responsible for the necessary data transformations. This “extra work” does not have the negative effect it had before, because the transactions are no longer synchronous from end to end, but asynchronous. There is no user waiting with a hand on the mouse, counting the seconds since the click. On the contrary, it helps maintain loose coupling.
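The points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the API of any real ESB product: the `Bus` class, the `customer.updated` event name, the handlers and the message fields are all hypothetical, and the bus is a simple in-memory queue with an immediate-retry policy.

```python
from collections import defaultdict, deque


class Bus:
    """Hypothetical in-memory stand-in for the ESB (not a real product's API)."""

    def __init__(self, max_retries=3):
        self.subscriptions = defaultdict(list)  # event name -> [(handler, transform)]
        self.queue = deque()                    # events accepted but not yet delivered
        self.max_retries = max_retries
        self.dead_letters = []                  # deliveries that exhausted their retries

    def subscribe(self, event_name, handler, transform=None):
        self.subscriptions[event_name].append((handler, transform))

    def publish(self, event_name, message):
        # The provider's work ends here; it never learns who the subscribers are.
        self.queue.append((event_name, dict(message)))

    def dispatch(self):
        # Run by the middleware, outside the provider's transaction.
        while self.queue:
            event_name, message = self.queue.popleft()
            for handler, transform in self.subscriptions[event_name]:
                self._deliver(event_name, handler, transform, message)

    def _deliver(self, event_name, handler, transform, message):
        for _ in range(self.max_retries):
            try:
                payload = transform(message) if transform else dict(message)
                handler(payload)
                return
            except Exception:
                continue  # simplistic retry policy: try again immediately
        self.dead_letters.append((event_name, message))


# Each subscribing system keeps the attributes it needs in its own data model.
billing_customers = {}
crm_customers = {}


def billing_handler(payload):
    billing_customers[payload["customer_id"]] = payload["full_name"]


def crm_handler(payload):
    crm_customers[payload["id"]] = {"name": payload["name"], "email": payload["email"]}


def to_billing_format(message):
    # Transformation done by the middleware, not by the provider.
    return {"customer_id": message["id"], "full_name": message["name"]}


bus = Bus()
bus.subscribe("customer.updated", billing_handler, transform=to_billing_format)
bus.subscribe("customer.updated", crm_handler)

# The provider emits the business event once, when it happens.
bus.publish("customer.updated", {"id": "c1", "name": "Ana", "email": "ana@example.com"})
bus.dispatch()
```

After `dispatch()` runs, both subscribers hold their own local copy of the customer data, each in its own format, and the provider never had to know about either of them. Adding a third system would mean one more `subscribe` call in the middleware and no change to the provider.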
Advantages of the asynchronous event-driven model
1. Referential integrity throughout the ecosystem: the quality of the data is as it should be.
2. Loose coupling in the information ecosystem: modifications in a system do not have that domino effect that makes projects more expensive and even blocks their feasibility.
3. Scalable ecosystem: Incorporating new systems into the data flow is as easy as adding them to the event subscription list in the middleware.
4. Sustainability: Corrective maintenance and evolutionary maintenance are simplified, reducing associated costs and expanding the capacity of the ecosystem to grow and adapt to the real needs of the business.
5. The systems that form the information ecosystem are aligned with the actual operation of the business, with its events and its required response times.
6. The user experience improves significantly.
We will not end this post without adding a nuance to what has been said so far: there are, of course, cases where the request-response pattern is fully justified. Sometimes the volume of data associated with a particular entity is too large to incorporate into an application’s database. In these scenarios, it is legitimate to adopt the request-response pattern so that the application retrieves the data only when it knows it will use it.
A clear example can be found in hospitals. Departmental systems may offer a specialized service whose use is limited to certain diagnoses and certain patients. It therefore makes no sense for these systems to subscribe to the hospital’s census services and receive information on the entire population admitted at any given time. Instead, when they receive a request for one of the diagnostic or analytical tests they perform, they invoke the patient data retrieval service, since they only need to know about the patients whose tests they must perform, not the complete hospital census.
You should always analyze each case. But in my experience, the SOA strategy often encounters an obstacle in the request-response pattern, in its widespread use as a solution to all integration requirements within an information ecosystem. It is difficult to recognize that abuse of this pattern is among the main causes of spaghetti architectures. It is difficult to change the mindset of the ICT leaders of the organizations, and the ICT professionals of many companies. Resistance to change weighs heavily, and one of the most paralyzing phrases in the evolution of ICTs often appears: “it’s always been done this way.”