Pages

Saturday, June 1, 2013

Case Study on SymmetricDS

Back then on 2011, I've the occasion to build yet another distributed application. But this time, it's not a control system application, it is a business application. My client is an upstream oil and gas company which operates offshore oil field in the Java Sea area. They need to track personnel mob/demob from main land to remote offshore station.

The Application

The application being developed is some kind of kiosk application, located at Jetty and each remote station. Each passenger going to offshore will be inspected for security requirements and as a final check they have to show their personal ID card. This ID card then scanned by barcode scanner and the application will validate this person based on an approved passenger list. If listed, application will record the timestamp and mark this person status as "On Board". When arrived in offshore station, this person is requested to show his personal ID card again. The application will record the timestamp and update the status as "Arrived" and mark this person's trip as completed.

The picture below shows the display panel of this kiosk application.

The Challenge

The application logic is not that sophisticated, It's just a CRUD application which have been done so many times. The challenge lies on the distributed-ness of the whole system, covering onshore and mostly offshore remote stations, which should have synchronized data store with that in Head Office.



One simplistic solution would be using single data store that accessed by all kiosk nodes. But the nature of kiosk application which demand fast response will not perform well across network which occasionally severe outage. Every kiosk nodes should have its own local data store to ensure fast operation, and that local data store should always in sync with other nodes and especially with Head Office. We have Oracle 10g on the Head Office side. But we use Oracle XE in remote station side which don't have replication feature.
As usual, I asked Uncle Google for solution and he brought SymmetricDS on the surface.

SymmetricDS

Here is what SymmetricDS people said about their product :
SymmetricDS is open source software for multi-master database replication, filtered synchronization, or transformation across the network in a heterogeneous environment. It supports multiple subscribers with one direction or bi-directional asynchronous data replication. It uses web and database technologies to replicate tables between relational databases, in near real time if desired. The software was designed to scale for a large number of databases, work across low-bandwidth connections, and withstand periods of network outage.
Their proposition is so promising and I give it a try. Documentation is very good, even they provide a short tutorial which I really recommend to follow. This hands-on tutorial is the key to understand the principles about how SymmetricDS works.

Configuring the engine seems complicated at first, but once you understand the concept of Trigger and Router, it gives you all the tools to implement whatever you need. Details about setting up SymmetricDS are best learned from the documentation itself. Spend enough time to study the Appendix A. Many details provided there, with a little experimentation you will get the optimal setting for your case.

The following are some of my application use case and how I configure SymmetricDS to meet those use case effectively.

No. Use Case SymmetricDS Setup
1. There are 1 head office node, 5 onshore remote stations, and 11 offshore remote stations.
  • Create two Node Group : 'headoffice' and 'stations'. Head Office node is a member of 'headoffice' group. All the rest are member of 'stations' group.
  • Set link from 'stations' to 'headoffice' as Push.
  • Set link from 'headoffice' to 'stations' as Wait for Pull.
2. Master data are all set from Head Office. All stations should have the same master data set. Stations could not modify master data.
  • Define a Channel named 'chMasterData'.
  • Create a Router named 'rtrHeadOffice_Station', with direction from 'headoffice' to 'stations' node group.
  • Setup trigger for all master data tables and assign to the Channel 'chMasterData'.
  • Bind those triggers to 'rtrHeadOffice_Station' Router.
The above steps will establish single directional synchronization from 'headoffice' Group to 'stations' Group.

3. PERSONNEL table contains a LOB column which hold photograph image of particular personnel. This image should also be propagated to all remote stations.
  • Modify 'chMasterData', set property contains_big_lob to 1.
  • Modify trigger attached to PERSONNEL table,  set the property use_stream_lobs to 1.
These steps are necessary to ensure binary data propagated properly.
4. When personnel depart from Jetty, application records the timestamps and mark this person status to "On Board". This status changes should be propagated to the destination station. This use case involves two data transfer phase.
First, data changes at Jetty are replicated to Head Office.
Second, Head Office propagates further this changes to other stations.

Let's say the records of personnel departure/arrival are stored in PERSONNEL_TRIP table. To enable bidirectional synchronization, set a trigger in this table with property sync_on_incoming_batch set to 1.


For day-to-day system maintenance, I wrote a simple monitoring console using ExtJS grid widgets, showing the content of SYM_NODE and SYM_NODE_HOST tables. I use the value of HEARTBEAT_TIME column to detect whether a node is alive. If heartbeat time has elapsed more than 2 hours, I mark that node as "troubled" and put a red mark on its row. As seen below, network could down for days at some unfortunate locations. But after network recovered, SymmetricDS could resume to sync all pending batches within minutes.



Evaluation

In general, SymmetricDS works as expected. It proves its own promise, especialy the one that impressed me most, "... withstand periods of network outage...".
But no system without its quirks. I still have a spurious problem which I guess there still incorrect configuration in my setup. Randomly after two to four weeks continuous operations, the root node stop working with the following message found in log :

Cannot get a connection, pool error Timeout waiting for idle object
Timeout waiting for idle object

I'm sure this is a sizing problem. I haven't find the solution yet. For the mean time, I wrote a simple shell script to automatically restart SymmetricDS service every week.

Conclusion

Despite the spurious problem about the connection pool, SymmetricDS is great! I will use it again for my next projects.