Sunday, October 21, 2018

Zuul use cases

This project explains three different use cases with code:
1. Simple routing
2. Url rewriting
3. Implementing facade API

Here is the link https://github.com/santkk/zuulusecases

Wednesday, March 15, 2017

Add blurred border background to image to modify aspect ratio

This article discusses how to convert images of 1:1 or 4:3 aspect ratio to 16:9 aspect ratio (and vice versa) by adding a blurred background border matching the colour theme of the image.

When a vertical list of images is displayed, images of different aspect ratios spoil the uniformity of the page. For example, consider a YouTube-style video listing: if the thumbnails come in assorted aspect ratios (4:3, 1:1, 3:4, etc.), the page will not look tidy.

One approach to solving this problem is to add left-right / top-bottom borders to the image to match the required aspect ratio, as shown below.

Original image with 4:3 aspect ratio

Converted image with 16:9 aspect ratio and matching borders
       
How can borders that match the colour scheme of the image be added? Here is a simple technique:
  • Stretch the image to the 16:9 aspect ratio
    • new width = image height * (16/9)
  • Apply a blur / Gaussian blur as needed
  • Place the original image at the centre, on top of the stretched image

Using ImageMagick, all of this can be done with a single command:

convert <image_name> \( -clone 0 -resize 345x194! -blur 0x20 \) \( -clone 0 \) -delete 0 -gravity center -composite <output_image_name>


I hope your app or web page will look neater now!

A client-server sync algorithm making use of object diff and patch

Introduction

This article is a detailed explanation of a client-server data sync implementation. The algorithm used is a customised version of operational transformation. In addition to the benefits of an incremental sync over a complete data transfer, this approach facilitates adding a sync layer to an existing application without changing the underlying data layer or business models.


Diff and patch of Java objects?!! You may be familiar with client-server sync and the various ways of doing it. But how can diff and patch be done on Java objects? If you are curious to know more, continue reading.

The problem

Requirement

It all began when our REST-API-backed Android app had a major new feature request: “Guided training on various topics for users”. The app should present a set of training materials to the user. Each training material (let’s call it a course) is an ordered list of steps. Each step is a collection of different types of tasks, for example a test, a reading material, or a video. The user can select a course of his/her interest. The selected course gets added to the user profile. The user can open a step of the course and work on any task in that step. To complete a step and move to the next one, the user should complete all the tasks in the given step. Each step has a number of social attributes, for example ‘the number of users who completed the step’. Similarly, each task in a step also carries a few social attributes, for example ‘the user’s rank for this task in comparison with the scores of other users’.
To give an idea of the data size: a course will have around 100 steps, each step will have around 5 tasks, and a course’s metadata can be around 3 MB (500 KB after compression).
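
To make the shape of the data concrete, here is a minimal sketch of the course model described above. All class and field names are hypothetical, not the app's actual types:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the course model: a course is an ordered list
// of steps, and each step is a collection of tasks plus social attributes.
class Task {
    String id;          // globally unique identifier
    String type;        // e.g. "TEST", "READING", "VIDEO"
    boolean completed;
    int rank;           // social attribute: the user's rank for this task
}

class Step {
    String id;
    List<Task> tasks = new ArrayList<>();
    int usersCompleted; // social attribute: users who completed this step

    // A step is complete only when every task in it is complete.
    boolean isComplete() {
        return tasks.stream().allMatch(t -> t.completed);
    }
}

class Course {
    String id;
    List<Step> steps = new ArrayList<>(); // ordered; ~100 steps in practice
}
```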


Note that the app is a fully functional live app. The course is a new feature request for the app from the product team. There are a number of other features which the app should continue to support.

10000ft view of the proposed implementation.
  1. Courses can be managed by an expert using the Content Management Portal.
  2. The app can pull all the available courses from the serving layer and make them available to the user.
  3. Once the user selects a course, it becomes part of his profile. The user’s profile should be updated on the server.
  4. The app allows a user to perform tasks and make progress. Any progress made by the user should be saved to his profile, both locally and on the server. Data is saved on the server so that a device reset or device change does not result in loss of the user’s progress.
  5. If the admin changes a course, then the user should get the update.
  6. Each task’s social information updates should be fetched frequently. Similarly, each step’s social information updates should be fetched frequently.

Data flow

As you can see, there are three possible ways a user’s course data can change:
  1. The user performing a task.
  2. The expert updating the course, say adding a few steps or removing a task from a step.
  3. Other users making progress in the same course / step / task.

Conventional Solution

The app and its other features are backed by a set of REST APIs. If we continue to use REST APIs to manage the course, then the following APIs would be needed:
  1. Create and pull a user course
    1. Once a user selects a course, create a snapshot of the course, save it as user course at the server. Return the complete user course data to the app.
  2. Save the user progress
    1. Either a set of APIs to save task progress. One each for each type of task and its data.
    2. Single API with polymorphic payload to save any task’s data
  3. Pull the course updates and social data updates
    1. Whenever the admin changes the course or some social attribute of the course changes, the server updates this data. The client should pull the data and show the updated data to the user.
Moreover, after each course data pull, the client will have to flush the current data and save the new one.

Disadvantages of the conventional approach

  1. Bandwidth usage
    1. A good fraction of the users are on 2G networks. Pulling the complete data from the server to the app every time is not desirable.
  2. Performance
    1. Whenever there is an update to the course, replacing the entire course data wastes CPU cycles and memory.
  3. Offline usage.
    1. The user should be able to perform tasks offline. Once the user connects to the internet, the progress should be saved to the server. Using a set of APIs to save task-completion progress would make the offline implementation quite complex.

Is there a better solution?

Yes: sync the data between the app and the server. Considering the nature of the problem and the performance requirements, sync is the way to go. With sync, the app works independently, irrespective of whether the user is online or offline. The app occasionally syncs data with the server. The sync process updates the user data on the server and pulls the latest course updates and social updates back, all in a single operation / API call.


The sync process shall be implemented as follows
  1. The app (the client here) sends the user progress to the server.
  2. The server persists the user progress received.
  3. The server applies course changes: both changes by the admin and social changes by other users.
  4. The server sends back the course changes to the app.
  5. The app updates the course and presents it to the user.


What tool or technology best fits these requirements?


Evaluating the data sync implementation options

Option 1: Couchbase. Out of the box solution?

Couchbase is excellent software for syncing data between two agents. But it is not an independent library which can be used with any system (at least at the time of this development): you need to use CouchDB as your data store. In this case, a sync of domain models is needed, not of database entities. As mentioned earlier, the app is a fully functional live app where the course is the new offering. Using CouchDB for the course implementation would require:
  1. At the app side - Adding CouchDB support to the app architecture. The pain of refactoring and modifying already working code, plus an APK size increase because of the new library.
  2. At the app side - Using CouchDB for the course feature while continuing to use SQLite for all other features. Here the major challenge is that some of the tasks are supported in the old version of the app; that means the user can do these tasks outside the course context. So you would end up rewriting the implementation of these tasks. In addition, the app upgrade would require a data migration.
  3. At the server side - Using Couch for course data while continuing to use MongoDB for everything else. As mentioned in point 2, common tasks would have to be re-implemented in Couch and their data migrated.
  4. Both the client and the server development teams would have to build CouchDB expertise.


Option 2: Build a custom sync library

Build a sync library which can sync the domain objects or data transfer objects used at the service layer and in the app. The same library can reside on both the client and the server. Implement or reuse an operational transformation algorithm which best fits the needs of this app. The challenge, however, would be handling the huge number of use cases which the sync process may introduce. Maintaining state at the client and the server and syncing in a single operation / API call can result in a huge number of cases, including failure and retry handling. Maintaining data integrity is very important.


After weighing the required modifications, the time allotted for the project, and the future maintenance requirements, it was clear that option 2 was the preferred one.

Building a sync algorithm

Yes, a custom sync library is to be implemented to sync Java objects (POJOs). The idea is to customise the operational transformation algorithm to match this particular requirement. Here is a very high-level explanation of the algorithm. To start with, let’s understand the entities involved.

Entities

  1. Client Copy - the copy of the object that the client holds.
  2. Server Copy - the copy of the object that the server holds.
  3. Client Shadow - the Client Copy last received by the server. At any point in time, this is the Client Copy most recently sent to the server.
  4. Server Shadow - the Server Copy last received by the client. At any point in time, this is the Server Copy most recently sent to the client.
  5. Client Delta - the changes on the client which the client intends to sync to the server.
  6. Server Delta - the changes on the server which the server intends to sync to the client.


A number of operations have to be performed on these entities to carry out a sync. The following are the operations which the sync algorithm will use.

Operations

  1. Diff operation - find the delta of changes between two objects.
     XDelta = Diff(XCopy1, XCopy2)
  2. Patch operation - apply a delta to an object to update it.
     XCopy2 = Patch(XCopy1, XDelta)
  3. Merge operation - merge two copies of an object.
     XMergedCopy = Merge(XCopy1, XCopy2)
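
As a minimal sketch, the three operations can be expressed as a generic contract. The toy implementation below uses a Map<String, String> in place of a real object, purely to illustrate the round-trip property that patching XCopy2 with Diff(XCopy1, XCopy2) recreates XCopy1; all names here are illustrative, not the actual library API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// The three primitives as a generic contract.
interface SyncOps<T, D> {
    D diff(T copy1, T copy2);   // XDelta = Diff(XCopy1, XCopy2)
    T patch(T copy, D delta);   // XCopy2 = Patch(XCopy1, XDelta)
    T merge(T copy1, T copy2);  // XMergedCopy = Merge(XCopy1, XCopy2)
}

// Toy implementation where an "object" is a map of field name -> value.
class MapSyncOps implements SyncOps<Map<String, String>, Map<String, String>> {
    // Delta = entries of copy1 that differ from (or are missing in) copy2.
    public Map<String, String> diff(Map<String, String> copy1, Map<String, String> copy2) {
        Map<String, String> delta = new HashMap<>();
        for (Map.Entry<String, String> e : copy1.entrySet())
            if (!Objects.equals(copy2.get(e.getKey()), e.getValue()))
                delta.put(e.getKey(), e.getValue());
        return delta;
    }

    // Overlay the delta on a copy to recreate the newer object.
    public Map<String, String> patch(Map<String, String> copy, Map<String, String> delta) {
        Map<String, String> result = new HashMap<>(copy);
        result.putAll(delta);
        return result;
    }

    // Trivial merge: copy2's values win on conflict.
    public Map<String, String> merge(Map<String, String> copy1, Map<String, String> copy2) {
        return patch(copy1, copy2);
    }
}
```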


Unix users will be familiar with these operations/commands. But how can these operations be performed on Java objects? The following section answers this question in detail.


Diff operation

How can two objects be compared to generate a delta? Reflection comes in handy: using reflection, the fields of a class and their values can be extracted. Both objects are of the same type, so compare them one field at a time to find the differences. When two fields are compared, there are three possible cases:
  1. The data is not changed
  2. The data is modified partially
    1. This is the case when the field is of a type which is to be synced incrementally.
  3. The data is completely changed


These cases also indicate how the data is to be modified to recreate the object; hence each case can be treated as a data change command. The delta should carry two pieces of information for each field:
  1. Data change command
    1. NO_CHANGE
    2. UPDATE
    3. REPLACE
  2. Data (in case of UPDATE and REPLACE)
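
A minimal reflection-based diff can be sketched as below, under the simplifying assumption that every changed field is replaced wholesale (a real implementation would recurse into incremental types and emit UPDATE). The TaskProgress class is purely illustrative:

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Sketch of a reflection-based diff: compare two objects of the same
// class field by field and record a data change command per field.
// This toy version only distinguishes NO_CHANGE from REPLACE.
class FieldDiff {
    enum Command { NO_CHANGE, REPLACE }

    static Map<String, Command> diff(Object left, Object right) {
        Map<String, Command> delta = new LinkedHashMap<>();
        try {
            for (Field f : left.getClass().getDeclaredFields()) {
                f.setAccessible(true);
                delta.put(f.getName(),
                        Objects.equals(f.get(left), f.get(right))
                                ? Command.NO_CHANGE : Command.REPLACE);
            }
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        }
        return delta;
    }
}

// Illustrative data class used in the example below.
class TaskProgress {
    String taskId;
    int score;
    TaskProgress(String taskId, int score) { this.taskId = taskId; this.score = score; }
}
```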


How to represent the delta of a type?
For each type which has to be incrementally synced, there should be a paired type to represent the delta. The ‘Delta’ type has the same field names as the ‘Data’ type. Since each field has to carry two pieces of information, command and data, the type of the field cannot be the same. The type of each field can be classified into one of two categories:
  1. Incremental update supported type
    1. Any custom business domain type which has to be incrementally synced. The diff has to be done recursively for these fields.
  2. Non-incremental update type
    1. Types which change as a whole. Primitive types, and any business domain type which should update as a whole, belong to this category.


For non-incremental update types, the delta can be represented using a generic type, say Delta<T> where T is the actual type. This Delta<T> will have two fields:
  1. The command field
  2. The actual data
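
A minimal sketch of such a Delta<T> wrapper (field and type names are illustrative):

```java
// A command plus the new data, where the data matters only for
// UPDATE / REPLACE. Hypothetical shape, not the library's actual type.
class Delta<T> {
    enum Command { NO_CHANGE, UPDATE, REPLACE }

    final Command command;
    final T data;

    Delta(Command command, T data) {
        this.command = command;
        this.data = data;
    }

    // For a non-incremental type, UPDATE and REPLACE both swap the value.
    T apply(T current) {
        return command == Command.NO_CHANGE ? current : data;
    }
}
```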


For incremental update supported types, each field of the type should be incrementally updated. This is a recursive process; the recursion ends when a non-incremental type field is encountered. Keeping the same field names for the data class and the delta class makes it easier to find the matching fields at the time of patching.


Consider this example:

class DataClass {
    NonIncrementalUpdateType1 field1;
    NonIncrementalUpdateType2 field2;
    IncrementalUpdateType1 field3;
}

class DataDeltaClass {
    DeltaCommand cmd;
    NonIncrementalDelta<NonIncrementalUpdateType1> field1;
    NonIncrementalDelta<NonIncrementalUpdateType2> field2;
    IncrementalUpdateType1Delta field3;
}

enum DeltaCommand {
    NO_CHANGE, UPDATE, REPLACE
}


Introducing a data object version
A diff of two objects can be error-prone if the order of the objects chosen is not right! Consider a diff of the Left Object from the Right Object:
Delta = Diff(Left Object, Right Object)


What if the Left Object is older than the Right Object? If that happens, applying the delta will result in a data rollback. So the version, or age, of the diffed objects matters. Every class which has to be incrementally updated should carry a version number. If the version number of the Left Object is less than that of the Right Object, the diff should fail.
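
The version guard can be sketched as a simple pre-check performed before the field-by-field comparison (names are illustrative):

```java
// Hypothetical version guard: a diff of Left from Right must fail when
// the Left Object is older, otherwise patching would roll data back.
class VersionGuard {
    static void checkDiffOrder(long leftVersion, long rightVersion) {
        if (leftVersion < rightVersion) {
            throw new IllegalStateException(
                    "diff refused: left copy (v" + leftVersion
                    + ") is older than right copy (v" + rightVersion + ")");
        }
    }
}
```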


Syncing collections and maps - the need for a unique object identifier
A data class can have fields of type unordered collection, ordered collection, or map. Comparing two collections is nearly impossible, or at least error-prone, unless there is a unique way to identify an item in the collection. Moreover, having a unique way to identify each object across the entire system helps avoid a number of error scenarios. One such case is applying the delta of Object1 on Object2: if both parties in the sync can uniquely identify an object, this mistake can be avoided. This applies only to incremental update types; for non-incremental update types, the object’s equals() itself is the equality check.


To solve this, each incremental update type object should have a globally unique identifier, unique for each object in the namespace. The Diff and Patch operations can use the identifier as a pre-validation before performing the operation.


In the case of a collection diff, the following changes can happen to a collection item:
  1. Its position is changed
  2. It is updated
  3. It is removed
  4. Both 1 and 2


The algorithm examines the collection for the above changes and generates a collection delta. A collection delta consists of a sequence of commands; the changes to a collection are recreated by applying the commands in the same order at patch time. Each command carries one of the above actions to be performed.
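
Replaying such a command sequence can be sketched as below, with items addressed by their unique id. The Item shape and method names are illustrative, not the actual library types:

```java
import java.util.List;

// Sketch of replaying a collection delta: each command (move / update /
// remove) addresses an item by its unique id and is applied in order.
class CollectionPatcher {
    static class Item {
        final String id;
        String value;
        Item(String id, String value) { this.id = id; this.value = value; }
    }

    static int indexOf(List<Item> items, String id) {
        for (int i = 0; i < items.size(); i++)
            if (items.get(i).id.equals(id)) return i;
        return -1;
    }

    // Change the item's position.
    static void move(List<Item> items, String id, int newIndex) {
        items.add(newIndex, items.remove(indexOf(items, id)));
    }

    // Update the item's content in place.
    static void update(List<Item> items, String id, String newValue) {
        items.get(indexOf(items, id)).value = newValue;
    }

    // Remove the item entirely.
    static void remove(List<Item> items, String id) {
        items.remove(indexOf(items, id));
    }
}
```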


Patch operation

The diff operation and the structure of the delta type are now clear. The following algorithm explains the patch operation.


  1. Start
  2. For each field of the data class:
    1. Get the delta field value from the delta object
    2. If the command is NO_CHANGE
      1. Then continue
    3. If the command is UPDATE
      1. If the type of the field is a non-incremental update type
        1. Then replace the object
      2. If the type of the field is an incremental update type
        1. Then recursively call the algorithm for the subtype
    4. If the command is REPLACE
      1. Then replace the object with the value in the delta
  3. End
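
The loop above can be sketched with reflection, the delta simplified to a map of field name to {command, new value}. UPDATE on a non-incremental field behaves like REPLACE, and the recursive branch is omitted; the Profile class and all names are illustrative:

```java
import java.lang.reflect.Field;
import java.util.Map;

// Sketch of the patch loop: for each field, look up the command in the
// delta and apply it. NO_CHANGE (or an absent entry) is skipped.
class Patcher {
    enum Command { NO_CHANGE, UPDATE, REPLACE }

    static void patch(Object target, Map<String, Object[]> delta) {
        try {
            for (Field f : target.getClass().getDeclaredFields()) {
                Object[] entry = delta.get(f.getName()); // { Command, new value }
                if (entry == null || entry[0] == Command.NO_CHANGE) continue;
                f.setAccessible(true);
                f.set(target, entry[1]); // UPDATE on a non-incremental field == REPLACE
            }
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }
}

// Illustrative data class used in the example below.
class Profile {
    String name = "old";
    int score = 1;
}
```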

Merge operation

Need for merge - Merge is required when there are two copies of an object, both having valid changes. Merge combines both sets of valid changes into a single object.


How to merge?
When two objects are merged, there can be conflicts. Say field2 is replaced with null in ObjectCopy1 whereas field2 has an update in ObjectCopy2. Resolution of merge conflicts depends on the business logic. Technically, though, how can these conflicts be resolved automatically? There are multiple approaches; specifying a merge strategy for the required fields is a simple solution. Consider that ObjectCopy1 is to be merged with ObjectCopy2. Based on the business need, one copy shall be considered the ABSORBER and the other the ABSORBED.


The ABSORBER is the copy which is accepting changes.
The ABSORBED is the copy from which changes are being pulled out.
At the end of the merge process, the ABSORBER will have both sets of changes merged into it.


Based on the business needs, one of the following merge strategies can be used.
  1. ABSORBER_WINS - the Absorber retains its field value.
  2. ABSORBED_WINS - the Absorber’s field value is replaced with the Absorbed’s field value.
  3. COMBINE_COLLECTION - add the items from both collections.
There can be cases where none of these is desirable, or where ABSORBER_WINS / ABSORBED_WINS depends on some precondition. To handle such cases, custom merge policies can be defined. A merge policy can be associated with a field using annotations.
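
The three strategies can be sketched as a single field-level merge function (names are illustrative; a real implementation would pick the strategy from a field annotation rather than pass it in):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the three merge strategies applied to one field value.
// The ABSORBER is the copy accepting changes.
class Merger {
    enum Strategy { ABSORBER_WINS, ABSORBED_WINS, COMBINE_COLLECTION }

    static Object mergeField(Strategy s, Object absorber, Object absorbed) {
        switch (s) {
            case ABSORBER_WINS:
                return absorber;                  // keep the absorber's value
            case ABSORBED_WINS:
                return absorbed;                  // take the absorbed's value
            case COMBINE_COLLECTION:
                List<Object> combined = new ArrayList<>((List<?>) absorber);
                combined.addAll((List<?>) absorbed); // items from both collections
                return combined;
            default:
                throw new AssertionError();
        }
    }
}
```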

The Sync Algorithm

The following are the steps in a client-initiated sync operation. It uses the Diff, Patch, and Merge operations defined above.


  1. @Client : Client Delta = Diff(Client Copy, Server Shadow)
  2. @Client : send the Client Delta to the server by invoking the sync API
  3. @Server : recreate the client copy: Client Copy = Patch(Client Shadow, Client Delta)
  4. @Server : merge the client copy with the latest server copy to get the final copy: Final Copy = Merge(Server Copy, Client Copy)
  5. @Server : Server Delta = Diff(Final Copy, Client Copy)
  6. @Server : send back the Server Delta to the client in the API response
  7. @Client : recreate the final copy at the client: Final Copy = Patch(Client Copy, Server Delta)
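
The seven steps can be walked through end to end with a toy object, a Map<String, String> standing in for a real course. Here diff keeps only changed or added keys, patch overlays them, and merge lets the client copy's entries win on conflict (one possible strategy, not the only one); everything below is illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy diff / patch / merge used to trace the client-initiated sync flow.
class SyncFlow {
    // Entries of target that differ from (or are missing in) base.
    static Map<String, String> diff(Map<String, String> target, Map<String, String> base) {
        Map<String, String> delta = new HashMap<>();
        for (Map.Entry<String, String> e : target.entrySet())
            if (!Objects.equals(base.get(e.getKey()), e.getValue()))
                delta.put(e.getKey(), e.getValue());
        return delta;
    }

    // Overlay the delta on a base copy.
    static Map<String, String> patch(Map<String, String> base, Map<String, String> delta) {
        Map<String, String> result = new HashMap<>(base);
        result.putAll(delta);
        return result;
    }

    // Merge server and client copies; the client's entries win here.
    static Map<String, String> merge(Map<String, String> server, Map<String, String> client) {
        Map<String, String> result = new HashMap<>(server);
        result.putAll(client);
        return result;
    }
}
```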


The first time, the client will have to pull the complete data from the server; let’s call this a full sync. In this case the client sends an empty Client Delta to the server. Since the server does not have any Client Shadow yet, it creates an empty Client Shadow and performs the sync operation. This results in the complete data being sent back as a delta. A full sync can also be used as a recovery mechanism whenever the client detects data corruption.
Note: A REST API with a JSON payload can be used for the delta transfer.

Freezing fields - to avoid unnecessary overrides

A common case in merge is ‘don’t replace this object if conditions X and Y are met’. A practical use case: ‘don’t replace stepX of the course if the user has purchased it; otherwise, replace it with the newly provided stepX’.


Such fields can be called freezable fields. Each freezable field can have a freeze condition, which the merger evaluates before performing the merge.
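
A freezable field can be sketched as a small wrapper whose merge checks the freeze condition first. This is a hypothetical shape; the actual implementation associates the condition with a field via annotations:

```java
import java.util.function.BooleanSupplier;

// Sketch: a field whose value is replaced during merge only when its
// freeze condition (e.g. "the user purchased this step") does not hold.
class FreezableField<T> {
    private T value;
    private final BooleanSupplier frozen; // the freeze condition

    FreezableField(T initial, BooleanSupplier frozen) {
        this.value = initial;
        this.frozen = frozen;
    }

    void mergeIn(T incoming) {
        if (!frozen.getAsBoolean()) value = incoming; // skip replace when frozen
    }

    T get() { return value; }
}
```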

Addressing the Challenges

  1. Partial syncs
The algorithm takes care of most of the cases. Consider the following:
    1. Sync fails at the server end
      1. The client knows that the sync is incomplete and retries.
    2. Sync fails at the client end
      1. The client re-attempts the sync, and a duplicate patch happens on the server. Duplicate patching of a field does not change the object unless the field is a collection of non-incremental update types; such fields require special handling.
  2. Testing all possible use cases
A generic sync algorithm results in a huge number of use cases, and after each minor change to the algorithm all possible cases must be verified. Test-driven development was the answer to this problem: all possible sync scenarios were written as JUnit test cases, and any change to the sync utility was validated against more than a hundred test cases.

Conclusion

Consider a case where sync support has to be added to an existing system and migrating the data to an available sync solution/utility is not easy. In such a case, implementing a custom sync algorithm is the better idea. Yes, there are a good number of use cases and possible error scenarios to handle, but it is possible to implement a custom algorithm which does all that is necessary. In this case study, implementation of the sync utility took 25 person-days. Another 10 person-days were required for client- and server-side integration. The legacy code and the app and server architecture were left untouched. The system has been live and running without errors for the past 6 months.


I hope this document serves as a good reference for those who want to implement a custom sync algorithm. All the best!


Note:
The final algorithm also deals with more complex cases like multi-client sync and out-of-order sync. Those details are excluded from this document to avoid unnecessary complexity.

References

  1. Google. [Search keyword client server data sync :)]
  2. Stackoverflow. [Search keywords client server data sync, operational transformation :) :)]