Camunda Fault Tolerance Evaluation
This project is part of the evaluation of a Saga pattern implementation using Camunda. Compared to the original Saga Pattern Realization with Camunda, additional sections have been included that simulate different failure scenarios for particular inputs.
Start the Application
- Run ./gradlew clean build
- Execute docker-compose up --no-start
- Execute docker-compose start mysql
- Execute docker-compose start travelservice
- Execute docker-compose up
- Requesting trip bookings is now possible. Either use curl commands, the provided TravelApplication.json Insomnia file, which includes different trip booking requests, or access the Swagger UI of the different services:
TravelService | http://localhost:8090/swagger-ui.html |
HotelService | http://localhost:8081/swagger-ui.html |
FlightService | http://localhost:8082/swagger-ui.html |
An example for such a request:
{
"duration":
{
"start":"2021-12-01",
"end":"2021-12-12"
},
"start":
{
"country":"Scotland",
"city":"Stirling"
},
"destination":
{
"country":"Sweden",
"city":"Stockholm"
},
"travellerName": "Max Mustermann",
"boardType":"breakfast",
"customerId":"1"
}
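Such a request can also be sent from the command line. The following is a minimal bash sketch, not part of the project: the function names trip_request_json and book_trip are made up for illustration, and the booking endpoint is deliberately left as a parameter because its path is not stated here and has to be looked up in the TravelService Swagger UI.

```shell
#!/usr/bin/env bash
# Print a trip booking request body. The first argument becomes the
# destination country (default: Sweden). No JSON escaping is done,
# so the argument must not contain double quotes or backslashes.
trip_request_json() {
  local country="${1:-Sweden}"
  cat <<EOF
{
  "duration": { "start": "2021-12-01", "end": "2021-12-12" },
  "start": { "country": "Scotland", "city": "Stirling" },
  "destination": { "country": "${country}", "city": "Stockholm" },
  "travellerName": "Max Mustermann",
  "boardType": "breakfast",
  "customerId": "1"
}
EOF
}

# POST the body to the booking endpoint passed as the first argument.
# The URL is an assumption: take the real path from the Swagger UI.
book_trip() {
  local url="$1" country="$2"
  trip_request_json "$country" |
    curl -sS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "$url"
}

# Usage (placeholder path, replace before running):
# book_trip "http://localhost:8090/<booking-path>" "Sweden"
```

Passing the destination country as an argument keeps the rest of the body fixed, so the same helper can be reused for every request variant.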
To simulate a Saga that fails because no hotel or no flight is available, use one of the following strings as the destination country in the trip booking request:
"Provoke hotel failure"
"Provoke flight failure"
Additionally, the Camunda Cockpit can be accessed via http://localhost:8090/ with the credentials:
Username: admin | Password: admin
The services also provide a health and an info endpoint that report the state of the system, for example whether the database is up and running. These endpoints can be accessed via:
TravelService | http://localhost:8090/api/travel/monitor/health | http://localhost:8090/api/travel/monitor/info |
HotelService | http://localhost:8081/api/hotels/monitor/health | http://localhost:8081/api/hotels/monitor/info |
FlightService | http://localhost:8082/api/flights/monitor/health | http://localhost:8082/api/flights/monitor/info |
If you are on Windows or Mac, you sometimes have to replace localhost with the default IP of your docker machine (use docker-machine ip default to get this default IP).
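Before sending requests, it can help to wait until all three health endpoints respond. The following bash sketch shows one way to do that. The helper name wait_until is made up for illustration, and the usage assumes that each health endpoint answers with a 2xx status once its service is ready, which is not confirmed here.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempts run out.
# Arguments: <max attempts> <seconds between attempts> <command...>
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Usage: poll each service for up to 30 attempts, 2 seconds apart.
# curl -f turns HTTP error statuses into a non-zero exit code.
# for url in \
#   http://localhost:8090/api/travel/monitor/health \
#   http://localhost:8081/api/hotels/monitor/health \
#   http://localhost:8082/api/flights/monitor/health; do
#   wait_until 30 2 curl -fsS "$url" || echo "not healthy: $url"
# done
```

The same helper can be reused after restarting a service in the failure scenarios, to detect when it is reachable again.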
Stop the Application
To stop the application and remove the created containers, execute the following command:
docker-compose down --remove-orphans
Provoke Failure Scenarios
To provoke a failure scenario, use the respective string as the destination country in the trip booking request.
An example for such a request:
{
"duration":
{
"start":"2021-12-01",
"end":"2021-12-12"
},
"start":
{
"country":"Scotland",
"city":"Stirling"
},
"destination":
{
"country":"Provoke orchestrator failure while starting trip booking",
"city":"Bamberg"
},
"travellerName": "Orchestrator Start",
"boardType":"All-inclusive",
"customerId":"4"
}
1. Saga Participant Failure
- Provoke a failure of the FlightService participant before it starts to execute a local transaction with the following string as destination country: "Provoke participant failure before receiving task"
The HotelService then terminates the Docker container of the FlightService while it is executing the bookHotel request. Afterwards, the FlightService has to be restarted manually to investigate what happens as soon as the service is running again. This can be done using one of the following commands:
docker-compose start flightservice
docker start flightservice_camundaFailurePerf
If the container name of the FlightService has been changed in the docker-compose.yml file, the container has to be started using this name.
- Provoke a termination failure of the FlightService participant while executing a local transaction of the BookTripSaga with the following string as destination country: "Provoke participant failure while executing"
The FlightService then forces its JVM to terminate after booking a flight but before informing the orchestrator about it, in order to simulate a sudden failure of the system. Afterwards, the FlightService again has to be restarted using the same commands as above.
- Provoke an exception in the FlightService participant while executing a local transaction of the BookTripSaga with the following string as destination country: "Provoke exception while executing"
The FlightService then throws a RuntimeException while booking a flight to simulate unexpected behaviour of the system. Afterwards, the behaviour of the service can be observed. The easiest way is to look at the log of the FlightService during that time. This can be done using the following command:
docker logs flightservice_camundaFailurePerf --follow
2. Saga Orchestrator Failure
The Camunda Engine within the TravelService plays the orchestrator role in this example application. Consequently, observing the system's behaviour during orchestrator failures involves failures of the TravelService.
- A failure of the TravelService while a trip booking is being started is not considered, since the TravelService is needed in order to make booking requests.
- Provoke a failure of the TravelService while executing a local transaction of the BookTripSaga with the following string as destination country: "Provoke orchestrator failure while executing"
The FlightService then terminates the Docker container of the TravelService after booking a flight, but before informing the orchestrator about it. Afterwards, the TravelService has to be restarted manually to investigate what happens as soon as the service is running again. This can be done using one of the following commands:
docker-compose start travelservice
docker start travelservice_camundaFailurePerf
If the container name of the TravelService has been changed in the docker-compose.yml file, the container has to be started using this name.
3. Breach of Saga Protocol
A participant might send the same message twice to the orchestrator, or even send an old one. Therefore, two scenarios have been added to the implementation that provoke sending either an old or a duplicate message to the orchestrator in order to evaluate how an implementation using Camunda handles this situation.
- Provoke the HotelService to send a duplicate message to the TravelService with the following string as destination country: "Provoke duplicate message to orchestrator"
The HotelService then instructs Camunda's ExternalTaskService to complete the ExternalTask again with the respective message, in this case the BookHotelResponse.
- Provoke the HotelService to send an old message to the TravelService with the following string as destination country: "Provoke sending old message to orchestrator"
The HotelService then creates a new thread that waits for five minutes before it sends the same answer as before to the TravelService again. To achieve this, the HotelService sends the old message, in this case the BookHotelResponse, to the provided endpoint /external-task/{taskId}/complete. The service's logs document when it sends the old message.
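Both scenarios come down to a repeated POST to the completion endpoint of an external task. The following bash sketch reproduces such a call by hand and is not the HotelService's own code. The function names complete_url and complete_task are made up for illustration, as are the REST base URL, task id and worker id used in the usage comment. The variables that would carry the BookHotelResponse are omitted.

```shell
#!/usr/bin/env bash
# Build the URL of the completion endpoint for one external task.
# A single trailing slash on the base URL is tolerated.
complete_url() {
  local base="${1%/}" task_id="$2"
  printf '%s/external-task/%s/complete\n' "$base" "$task_id"
}

# Complete an external task and print only the HTTP status code.
# Calling this twice for the same task id imitates the duplicate
# message; calling it again much later imitates the old message.
complete_task() {
  local base="$1" task_id="$2" worker_id="$3"
  curl -sS -o /dev/null -w '%{http_code}\n' -X POST \
    -H 'Content-Type: application/json' \
    -d "{\"workerId\": \"${worker_id}\"}" \
    "$(complete_url "$base" "$task_id")"
}

# Usage (all three values are placeholders):
# complete_task "http://localhost:8090/<camunda-rest-base>" "<taskId>" "<workerId>"
# complete_task "http://localhost:8090/<camunda-rest-base>" "<taskId>" "<workerId>"
```

Comparing the status code of the first and the second call shows how the engine reacts to the repeated completion.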
Code Link
Camunda_Implementations/Camunda_FailurePerf-Evaluation