Microservices Architecture (MSA): Design & Development Approaches

In the recent past there has been a flood of requests from major retailers running premium ecommerce packages such as Oracle ATG, IBM WCS, SAP Hybris and Magento to decouple functionality from the monolithic deployment by moving to a microservices architecture. The decision to separate has to be made with a lot of care, because many factors are at play: hosting (traditional + cloud), scaling, data modeling, the replication approach, session and cache management etc. Apart from these factors, the custom implementation of the existing solution also comes into play as we design the approach for an incremental microservices transformation journey for the enterprise.

As we are all going through the learning curve of developing and deploying distributed applications, this journey provides a lot of insight into design & development approaches for various scenarios & problems. I have tried to summarize some key notes which can be useful during MSA implementations. Please share your learnings and experiences to make this more comprehensive.

Below are the focus areas when designing the incremental microservices journey; the list can be improved further based on the experience we gain on the ground:

  • Domain Architecture
  • CQRS (Command Query Responsibility Segregation) : Separates reads & writes to the data
  • Event Sourcing : Event based architecture
  • Hexagonal Architecture : Logic Kernel & Adaptor based
  • Resilience & Stability
  • Technical Architecture

While focusing on the above areas, some of the topics that will come to light during team discussions are listed below…

  • How to manage the service versions?
  • Designing for performance
  • Designing for integrity
  • Designing for failures
  • Service Boundaries
  • Designing for adapting to third party integrations/Products
  • Operational Effectiveness
  • Health checks
  • Performance Metrics & Alerts
  • Resources utilization & effectiveness
  • Logging & Error, Exception Handling
  • Event & Transaction Tracking
  • CICD for packaging & deployment
  • Hybrid Infra model support ( cloud + physical )

Domain Architecture

  • Defines how microservices will implement core domain based functionalities
  • No rigid approach (allows teams to decide the design & structure enabling them to work independently)
  • Easy to understand, Simple to maintain & replace
  • Defines how the entire domain is split into various areas (leading to mapping one team for the changes)
  • Efficient domain architecture will require low co-ordination and communication among teams

Cohesion

  • Domain architecture of overall system influences the domain architecture of each microservice
  • Microservices should be loosely coupled with each other and have high internal cohesion
  • With respect to domain it should follow SRP (Single Responsibility Principle)
  • Loose coupling & high cohesion

Encapsulation

  • Internal details are hidden
  • Accessed via defined interfaces
  • Helps in easy modification
  • In microservices context one microservice will never allow other microservice to access internal data
  • Each microservice understands only the interfaces of the other microservices

Domain Driven Design

  • Functionally structures the overall design of microservices

Transactions

  • Bundle of actions (Execute Everything or Nothing)
  • One transaction one microservice
  • Transaction across microservices (by leveraging messaging based arch or design)
  • Very hard to achieve the one transaction one microservice level with domain design

CQRS (Command Query Responsibility Segregation)

CQRS is an architectural pattern in which the system is segregated into operations that read data and operations that write data. This is very different from the old monolithic CRUD architecture, where a single component owns both reads & writes.

  • Every system has state that can be saved, which requires reading (queries) & writing (commands) the data
  • An operation should be either a command or a query, never both; this principle is known as CQS (Command Query Separation)
  • Caches support the query operations

CQRS

Principles of CQRS Arch

  • Event Sourcing
  • Event Driven Architecture
  • Loose Coupling
  • High Cohesion
  • Domain Driven Design
  • SoC (Separation of Concern)

Benefits of CQRS Arch

  • Domain/Business Focus
  • Scalability
  • Simplification
  • Flexibility
  • Command & Query can use different technologies
  • Read & Write separation

Challenges of CQRS Arch

  • Data consistency
  • Expensive (Development & Infrastructure cost will go up)
  • Transactions with read & write steps are difficult to implement


Microservices with CQRS

  • The communication infrastructure can implement the Command Queue when a messaging solution is used. In case of approaches like REST a Microservice has to forward the commands to all interested Command Handlers and implement the Command Queue that way.
  • Each Command Handler can be a separate Microservice. It can handle the commands with its own logic. Thereby logic can very easily be distributed to multiple Microservices.
  • Likewise, a Query Handler can be a separate Microservice. The changes to the data which the Query Handler uses can be introduced by a Command Handler in the same Microservice. However, the Command Handler can also be a separate Microservice. In that case the Query Handler has to offer a suitable interface for accessing the database so that the Command Handler can change the data.
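
The split between a Command Handler and a Query Handler described above can be sketched in a few lines of plain Java. This is only a minimal, in-process illustration: the class names (AddItemCommand, CartCommandHandler, CartQueryHandler) and the shopping-cart domain are assumptions made for the example, and the direct method call from the command side to the query side stands in for what would normally be a message queue or a REST call between two microservices.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical command: a request to change state; it returns nothing.
final class AddItemCommand {
    final String cartId;
    final String sku;

    AddItemCommand(String cartId, String sku) {
        this.cartId = cartId;
        this.sku = sku;
    }
}

// Query side: keeps its own read model (item count per cart) and offers
// an interface through which a command handler can change that data.
final class CartQueryHandler {
    private final Map<String, Integer> itemCounts = new HashMap<>();

    void applyItemAdded(String cartId) {
        itemCounts.merge(cartId, 1, Integer::sum);
    }

    int itemCount(String cartId) {
        return itemCounts.getOrDefault(cartId, 0);
    }
}

// Command side: owns the write model and informs the query side of changes.
final class CartCommandHandler {
    private final Map<String, List<String>> carts = new HashMap<>();
    private final CartQueryHandler readSide;

    CartCommandHandler(CartQueryHandler readSide) {
        this.readSide = readSide;
    }

    void handle(AddItemCommand command) {
        carts.computeIfAbsent(command.cartId, id -> new ArrayList<>()).add(command.sku);
        // Assumption: in a distributed setup this call would be a message or REST request.
        readSide.applyItemAdded(command.cartId);
    }
}

class CqrsDemo {
    public static void main(String[] args) {
        CartQueryHandler queries = new CartQueryHandler();
        CartCommandHandler commands = new CartCommandHandler(queries);
        commands.handle(new AddItemCommand("cart-1", "sku-42"));
        commands.handle(new AddItemCommand("cart-1", "sku-43"));
        System.out.println(queries.itemCount("cart-1")); // prints 2
    }
}
```

Because the two handlers share nothing but the small update interface, each one could be moved into its own microservice and given its own storage technology.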

When to choose CQRS?

  • When event based integration is to be developed & maintained in the long run
  • If there is a heavy imbalance between the number of reads & writes in the given system
  • While building multi-view based systems (Web, Mobile, API etc.)
  • When the existing system has complex CRUD operations
  • If the business process has a lot of events and their records would be required for further analysis
  • If the system can live with eventual consistency

Event Sourcing

Event Sourcing is an architectural pattern related to CQRS: state is recorded as the sequence of events that result from actions/steps in the domain. Replaying the events helps in rebuilding the state of the actors within the domain at any point of time.

CRUD stores only the current state of an entity, not the sequence or history of the events that led to it.

  • Events are messages
  • Can be modelled in separate classes
  • Event history is maintained (persists events)
  • Event stores provide read and write capabilities
  • Used for interaction between components
  • Events are immutable ( no modification or deletion)

 

Principles of Event Sourcing

  • Persisting state of the events
  • Event Driven Architecture
  • Loose Coupling
  • High Cohesion
  • Domain Driven Design
  • SoC (Separation of Concern)

Benefits of Event Sourcing

  • Simplicity
  • Easy Integration
  • Event state traceability (audit trail)
  • Performance
  • Flexibility

Challenges of Event Sourcing

  • Consistency (Business drives the level of consistency not the technology)
  • Parallel updates
  • Validation

Event Sourcing
  • Event Queue : Sends all the events to their recipients (mostly implemented with messaging middleware)
  • Event Store : Saves all events, so that the state can be reconstructed from them
  • Event Handler : Business logic which responds to particular event(s)
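
To make the idea of rebuilding state from stored events concrete, here is a minimal sketch in plain Java. The account domain and the names AmountChanged, EventStore and BalanceProjection are hypothetical, and the in-memory list stands in for a real event store; the point is only that events are appended, never changed, and that the state at any point of time falls out of replaying a prefix of the history.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable event: an amount deposited (+) or withdrawn (-).
final class AmountChanged {
    final String accountId;
    final long delta;

    AmountChanged(String accountId, long delta) {
        this.accountId = accountId;
        this.delta = delta;
    }
}

// Event store: append-only, events are never modified or deleted.
final class EventStore {
    private final List<AmountChanged> events = new ArrayList<>();

    void append(AmountChanged event) {
        events.add(event);
    }

    List<AmountChanged> history() {
        return Collections.unmodifiableList(events);
    }
}

// Rebuilds the state by replaying the first `upToEventCount` events.
final class BalanceProjection {
    static long balanceOf(String accountId, List<AmountChanged> history, int upToEventCount) {
        long balance = 0;
        int limit = Math.min(upToEventCount, history.size());
        for (int i = 0; i < limit; i++) {
            AmountChanged event = history.get(i);
            if (event.accountId.equals(accountId)) {
                balance += event.delta;
            }
        }
        return balance;
    }
}

class EventSourcingDemo {
    public static void main(String[] args) {
        EventStore store = new EventStore();
        store.append(new AmountChanged("acc-1", 100));
        store.append(new AmountChanged("acc-1", -30));
        store.append(new AmountChanged("acc-1", 50));
        // Current state: replay everything.
        System.out.println(BalanceProjection.balanceOf("acc-1", store.history(), 3)); // prints 120
        // State as it was after the second event.
        System.out.println(BalanceProjection.balanceOf("acc-1", store.history(), 2)); // prints 70
    }
}
```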

Hexagonal Architecture

 “Focuses on the core logic of the application i.e. business functionalities with well-defined interfaces which are available for the users & administrators”

  • Alternative to the layered architecture which has UI, Business Logic, Persistence
  • Has adaptors which interact with logic via ports
  • Inbound & Outbound communication
  • Some of the adaptors available for users, data, events & administrators are….
    • UI Adaptor
    • REST Adaptor
    • Test Adaptor (Enables isolated testing harness)
    • Database Adaptor
    • Custom Adaptor (Resilience & Stability)
Adapter (in the GoF design patterns it is also called Wrapper)
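
The relationship between the logic kernel, a port and its adaptors can be sketched as follows. The pricing domain and all names (ProductRepository, PricingService, InMemoryProductRepository) are assumptions chosen for illustration; the in-memory class plays the role of a test adaptor that could later be swapped for a database adaptor without touching the kernel.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Outbound port: what the kernel needs from persistence, free of technology details.
interface ProductRepository {
    void save(String sku, long priceInCents);

    Optional<Long> findPrice(String sku);
}

// Logic kernel: business rules only, depends on the port and nothing else.
final class PricingService {
    private final ProductRepository repository;

    PricingService(ProductRepository repository) {
        this.repository = repository;
    }

    long discountedPrice(String sku, int percentOff) {
        long price = repository.findPrice(sku)
                .orElseThrow(() -> new IllegalArgumentException("Unknown SKU: " + sku));
        return price - (price * percentOff / 100);
    }
}

// Test adaptor: an in-memory stand-in for a database adaptor.
final class InMemoryProductRepository implements ProductRepository {
    private final Map<String, Long> prices = new HashMap<>();

    @Override
    public void save(String sku, long priceInCents) {
        prices.put(sku, priceInCents);
    }

    @Override
    public Optional<Long> findPrice(String sku) {
        return Optional.ofNullable(prices.get(sku));
    }
}

class HexagonalDemo {
    public static void main(String[] args) {
        ProductRepository repository = new InMemoryProductRepository();
        repository.save("sku-42", 2000);
        PricingService pricing = new PricingService(repository);
        System.out.println(pricing.discountedPrice("sku-42", 10)); // prints 1800
    }
}
```

A REST or UI adaptor would sit on the inbound side in the same way: it translates a request into a call on PricingService and never reaches into the repository directly.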

Resilience & Stability

  • Because a microservices architecture is distributed, the system always runs the risk of a cascading failure, which leaves the applications accessing the services in a hostage situation
  • A distributed architecture relies on the network & cloud servers, which can be highly unreliable. Considering this risk, the system should be designed so that failures can be masked, i.e. resilience is built into the system/application to ensure availability
  • Patterns that can ensure the availability of the services are:
  • Bulkhead
    • Bulkheads have been used in shipbuilding for several centuries: they create watertight compartments that contain the water in case of a leak and keep the ship from sinking
    • Applied to software architecture, the concept allows building sub-systems which are well shielded from cascading problems such as overloads and response failures
  • Test Harness
    • An approach for inducing specific situations in order to study the behavior of the application in those contexts (TCP/IP, HTTP headers etc., i.e. OS or network level issues). Most of the time these situations are not considered, but addressing them helps the overall architecture
  • Steady State
    • The system should stay simple and manageable at any point of time. Systems that are not designed for a steady state may run into problems at some point: heavy database storage, log & caching operations are capable of bringing down the system by creating storage & operational challenges. Hence it is recommended to have automated control over storage, caching & log operations
  • Uncoupling Via Messaging
    • Asynchronous communication helps: the approach should be to proceed with other activities without waiting on the response
  • Timeout
    • An individual thread is started which can be terminated after a defined timeout
    • Easy to implement compared to other patterns
    • Can be plugged into monitoring systems as well
  • Stability based patterns
    • Use the Circuit Breaker, Timeout & Bulkhead patterns to safeguard against failures and tight coupling
    • Apply the fail fast mechanism to all the microservices to reduce communication wait times
    • Steady State, Fail Fast, Test Harness, Handshaking etc. have to be implemented by each microservice
    • Decoupling via shared communication is the way to design
  • Handshaking
    • A protocol based feature which, once leveraged, helps to detect an overloaded system & prevent cascading failures. Socket based protocols implement it very well, but HTTP doesn’t support this mechanism, so it becomes the responsibility of the application being built, which has to introduce some type of health check or signaling
  • Circuit Breaker
    • Circuit breakers are not complex to implement in the microservices code either, but special attention is required during design to make them work in high load scenarios & meet operational needs such as monitoring
  • Fail Fast
    • This approach lets the consumer detect the failed state as quickly as possible through a defined error state instead of waiting on the response. Validating the incoming request up front confirms an error state early and avoids further processing, similar to the timeout model
  • Resilience in Reactive Context
    • The Reactive Manifesto considers “resilience” one of the core properties of a reactive application
    • Asynchronous processing with error handling & monitoring
  • Hystrix
    • Very good support for resilience
    • It is a Java library from Netflix, shared under the Apache license
    • Internally it uses Reactive extensions for Java (RxJava)
    • Implements timeout & Circuit Breaker
    • Hystrix dashboard shares info about state of the thread pools & circuit breaker
    • Can be embedded in commands implementation
    • Library provides rich annotations
    • Leverages thread pools (for each microservice) – ensuring the support for bulkhead
    • Non-Java implementations can use Hystrix via a sidecar
    • Can be configured for monitoring single hosts as well as clusters
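
To show how Circuit Breaker and Fail Fast fit together, below is a deliberately small hand-rolled sketch in plain Java. It is not the Hystrix API: the class SimpleCircuitBreaker, its parameters and the injected clock are assumptions made for the example, and the sketch is single-threaded (a production breaker would need thread safety, metrics and a proper half-open state).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical, non-thread-safe circuit breaker for illustration only.
final class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures that open the circuit
    private final long openMillis;      // how long the circuit stays open
    private final LongSupplier clock;   // injected so the behaviour can be tested without sleeping
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    <T> T call(Supplier<T> remoteCall, T fallback) {
        if (openedAt >= 0) {
            if (clock.getAsLong() - openedAt < openMillis) {
                return fallback; // fail fast: the remote service is not even contacted
            }
            openedAt = -1;       // open period elapsed: allow one trial call
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong();
            }
            return fallback;
        }
    }

    boolean isOpen() {
        return openedAt >= 0;
    }
}

class CircuitBreakerDemo {
    public static void main(String[] args) {
        long[] now = {0}; // fake clock in milliseconds
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 1000, () -> now[0]);
        Supplier<String> failing = () -> { throw new IllegalStateException("service down"); };

        breaker.call(failing, "fallback");
        breaker.call(failing, "fallback");    // second failure opens the circuit
        System.out.println(breaker.isOpen()); // prints true

        now[0] = 1500;                        // the open period has elapsed
        System.out.println(breaker.call(() -> "ok", "fallback")); // prints ok
    }
}
```

Giving each downstream service its own breaker instance (and, in a threaded setup, its own thread pool) is what adds the bulkhead effect on top.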

Technical Architecture

The technical architecture of each microservice can be designed independently: the tech stack or frameworks need not be common, i.e. the approach is truly polyglot. Operational or non-functional needs can be designed collaboratively to support & maintain the system.
  • Process Engines
    • SOA based orchestrated process engines can be leveraged for microservices as well
    • Microservices are designed for one domain i.e. one Bounded Context
    • Microservices should not be designed just for integration or orchestration, else they will violate the SRP and changes to the microservices will become complex to make and maintain over time
    • In case of a collective business context, multiple microservices should be implemented, where each microservice supports one business functionality and they are used in conjunction to support multiple functionalities
    • Integration only microservices should be strictly avoided
  • Statelessness
    • In a distributed architecture stateless microservices are very useful
    • Microservices should not save any state in their business logic (State can be maintained in database or client side)
    • Having the stored state helps to recover from failures by replicating/recreating the state from database
    • Helps in creating multiple instances
    • Easy replication
    • Load distribution to the instances
    • Versioning : Old version can be shutdown or replaced without migrating state
  • Reactive
    • Reactive properties fit very well into the microservices approaches/needs
    • An application/system which is reactive should have some defined properties like:
      • Resilience : Ensure the availability & failure tolerance
      • Elastic : Dynamically scalable at runtime
      • Message Driven : Message based communication
      • Responsive : Fast response times with fail fast setup
    • Actors/Processes can send messages to each other

      • Non-blocking I/O communication(Asynchronous)
    • Reactive Technologies

      • Vert.x ( Java JVM based but also supports various languages like Ruby, JavaScript, Groovy, Scala, Clojure & Python)
      • Reactive Extensions : RxJava & RxJS
      • Scala ( Reactive framework Akka & Web framework Play)
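
The non-blocking, asynchronous style listed above does not strictly need one of these frameworks to try out; the JDK's own CompletableFuture is enough for a first sketch. The class AsyncStockLookup and its simulated remote call are assumptions for illustration, and a real reactive stack such as Vert.x or RxJava offers far more (back-pressure, event loops, operators).

```java
import java.util.concurrent.CompletableFuture;

final class AsyncStockLookup {
    // Hypothetical remote call, simulated locally on a background thread.
    static CompletableFuture<Integer> fetchStock(String sku) {
        return CompletableFuture.supplyAsync(() -> {
            if (sku.isEmpty()) {
                throw new IllegalArgumentException("empty SKU");
            }
            return sku.length() * 10;
        });
    }

    // Composes the next step without blocking; errors become a defined fallback answer.
    static CompletableFuture<String> describeStock(String sku) {
        return fetchStock(sku)
                .thenApply(stock -> sku + ": " + stock + " in stock")
                .exceptionally(error -> sku + ": stock unknown");
    }
}

class ReactiveDemo {
    public static void main(String[] args) {
        CompletableFuture<String> answer = AsyncStockLookup.describeStock("abc");
        // The caller is free to do other work here; join() is used only to end the demo.
        System.out.println(answer.join()); // prints abc: 30 in stock
    }
}
```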


 

Reactive is just one of the approaches available for microservices, not the only option. Microservices can also be implemented without “reactive” approaches, using traditional programming models: resilience can be achieved with various libraries, and elasticity is implemented using VMs or containers. Messaging is already used in traditional models. Reactive systems have a real advantage in the case of “responsive” implementations.


 

Some best practices for Java collections usage

Use ArrayList, HashMap etc. instead of Vector, Hashtable etc. wherever possible to avoid synchronization overhead. Even better is to use plain arrays where possible. If multiple threads access a collection concurrently and at least one of the threads adds or deletes an entry, then the collection must be externally synchronized. This can be achieved by:

Map sampleMap = Collections.synchronizedMap(new HashMap());

List sampleList = Collections.synchronizedList(new ArrayList());

Note : Collections is a utility class, different from the Collection interface.

Set the initial capacity of a collection appropriately (e.g. ArrayList, HashMap etc.). Collection classes like ArrayList and HashMap must grow periodically to accommodate new elements, so if you have a very large collection and you know its size in advance, you can speed things up by setting the initial size appropriately.

For example, HashMaps/Hashtables need to be created with a sufficiently large capacity to minimize rehashing (which happens every time the table grows). HashMap has two parameters, initial capacity & load factor, that affect its performance and space requirements. Higher load factor values reduce the space cost but increase the lookup cost of the sampleMap.get(…) & sampleMap.put(…) methods; the default load factor of 0.75 provides a good trade-off between performance and space. When the number of entries in the HashMap exceeds the current capacity * load factor, the table is rehashed and the capacity of the HashMap is roughly doubled. It is also very important not to set the initial capacity too high or the load factor too low if iteration performance or reduction in space is important.
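
The capacity * load factor rule above can be turned around to pick an initial capacity for a known number of entries. The helper below is a small sketch under that assumption; the class and method names (CapacityHint, initialCapacityFor) are made up for the example and are not part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;

final class CapacityHint {
    // Smallest capacity for which expectedEntries does not exceed capacity * loadFactor.
    static int initialCapacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }
}

class CapacityDemo {
    public static void main(String[] args) {
        int capacity = CapacityHint.initialCapacityFor(1000, 0.75f);
        System.out.println(capacity); // prints 1334

        // Sized up front, so inserting 1000 entries should not trigger a rehash.
        Map<Integer, String> sampleMap = new HashMap<>(capacity, 0.75f);
        for (int i = 0; i < 1000; i++) {
            sampleMap.put(i, "value-" + i);
        }
        System.out.println(sampleMap.size()); // prints 1000
    }
}
```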

Program in terms of the interface, not the implementation: for example, you might decide a LinkedList is the best choice for some application, but later decide that ArrayList is a better choice for performance reasons.

Please use:

List sampleList = new ArrayList(100); //program in terms of interface & set the initial size.

In the place of (avoid this):

ArrayList sampleArrlist = new ArrayList();

Avoid storing unrelated or different types of objects in the same collection: this is analogous to storing items in pigeonholes without any labeling. To store items, use value objects or data objects (as opposed to storing every attribute in an ArrayList or HashMap). Provide wrapper classes around your collection API classes like ArrayList, HashMap etc. as shown in better approach column. Also, wherever applicable, consider using the composite design pattern, where an object may represent a single object or a collection of objects.
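
A value object plus a thin wrapper around the collection class could look like the sketch below. The Customer and Customers classes and their fields are hypothetical, chosen only to contrast with putting a name and a points total side by side into an untyped ArrayList.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Value object: every attribute has a name and a type instead of a list position.
final class Customer {
    private final String name;
    private final int loyaltyPoints;

    Customer(String name, int loyaltyPoints) {
        this.name = Objects.requireNonNull(name);
        this.loyaltyPoints = loyaltyPoints;
    }

    String name() {
        return name;
    }

    int loyaltyPoints() {
        return loyaltyPoints;
    }
}

// Wrapper around the collection class: callers only see domain operations.
final class Customers {
    private final List<Customer> items = new ArrayList<>();

    void add(Customer customer) {
        items.add(Objects.requireNonNull(customer));
    }

    int totalLoyaltyPoints() {
        int total = 0;
        for (Customer customer : items) {
            total += customer.loyaltyPoints();
        }
        return total;
    }

    List<Customer> asList() {
        return Collections.unmodifiableList(items);
    }
}

class ValueObjectDemo {
    public static void main(String[] args) {
        Customers customers = new Customers();
        customers.add(new Customer("Asha", 100));
        customers.add(new Customer("Ravi", 50));
        System.out.println(customers.totalLoyaltyPoints()); // prints 150
    }
}
```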

This brings us to the details of the composite design pattern… I will take that up some time later; for now let's keep koding… :-)

Session Management

Session management is a critical part of web programming and we use it regularly during web application development. It has to be handled very carefully, else it can compromise the security of the application & data. Cookies are provided to store simple user-related data in the browser, but this poses a high security risk if sensitive data is maintained in them.

A good alternative for handling user related information in web programming is HttpSession, which is more secure:

User data can be kept in “session scope”, and it exists on the server, not in the client browser. The server side is the better place to handle sensitive data. Since the session is under our (the developers') control, we can discard the data once it is no longer needed.

Sessions should be handled very carefully because:

  • They deal with sensitive data.
  • They demand server resources.

Here are a few tips for handling sessions:

  • Always use a <%@ page session="false" %> directive at the top of every JSP that doesn’t use a session.
  • Disable URL rewriting.
  • Create a new session only when the user logs in, and remember to invalidate it once the user logs out.
  • Set the timeout value in web.xml to a reasonable value, neither too high nor too low.
  • Validate all the requests to defend against dupe attacks.

Happy Programming 🙂