JSON-JEP

Full featured JSON Parser, Data Model, and JSON Path integration

This project is maintained by xmljim

Merging JSON Data

Overview

Merging functionality is provided by the org.ghotibeaun.json.merge package. The MergeProcess interface orchestrates the merging of JSONNode instances and defines a single method for performing the merge.

There are additional methods that provide support for handling merge conflicts. For more about conflicts, see Managing Conflicts.

Finally, the getMergeResultStrategy() method defines whether the merge result is returned as a new JSON instance or as the updated primary JSON instance. Here there be dragons: updating the primary is powerful and useful, but the original primary JSON instance is replaced with the merged instance, so know your data.
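To make the difference between the two result strategies concrete, here is a minimal sketch that uses plain java.util maps rather than the library's classes. The ResultStrategySketch class, the ResultStrategy enum, and its constant names are hypothetical stand-ins chosen for illustration; they are not the JSON-JEP API. The sketch assumes Java 8 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ResultStrategySketch {
    // Hypothetical names; the library's own result strategy type may differ.
    enum ResultStrategy { NEW_INSTANCE, UPDATE_PRIMARY }

    static Map<String, Object> merge(Map<String, Object> primary,
                                     Map<String, Object> secondary,
                                     ResultStrategy strategy) {
        // NEW_INSTANCE copies the primary first; UPDATE_PRIMARY writes into it directly.
        Map<String, Object> target = strategy == ResultStrategy.NEW_INSTANCE
                ? new LinkedHashMap<>(primary)
                : primary;
        // Add only the keys the target lacks; conflict handling is out of scope here.
        secondary.forEach(target::putIfAbsent);
        return target;
    }

    public static void main(String[] args) {
        Map<String, Object> primary = new LinkedHashMap<>();
        primary.put("key1", "value1");
        Map<String, Object> secondary = new LinkedHashMap<>();
        secondary.put("key3", "value3");

        Map<String, Object> copy = merge(primary, secondary, ResultStrategy.NEW_INSTANCE);
        System.out.println(copy != primary && !primary.containsKey("key3")); // true: primary untouched

        Map<String, Object> same = merge(primary, secondary, ResultStrategy.UPDATE_PRIMARY);
        System.out.println(same == primary && primary.containsKey("key3")); // true: primary was modified
    }
}
```

In the second call the caller's original data is gone after the merge, which is the risk the warning above describes.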

The MergeProcessor is a built-in implementation of the MergeProcess interface and includes several static merge methods that cover different merge scenarios. Some of these scenarios allow you to define specific conflict strategies for handling merge conflicts for JSONArrays and JSONObjects.

Basic Merging Scenarios

There are cases when you need to merge data from two JSON instances. For example:

JSON Instance 1:

{
    "key1": "value1",
    "key2": "value2"
}

JSON Instance 2:

{
    "key3": "value3",
    "key4": "value4"
}

Merged Instance:

{
    "key1": "value1",
    "key2": "value2",
    "key3": "value3",
    "key4": "value4"
}

The same merge process works for a JSONArray:

[
    "value1",
    "value2"
]
[
    "value3",
    "value4"
]

Results in:

[
    "value1",
    "value2",
    "value3",
    "value4"
]

Merging is also relatively straightforward when either of the following is true:

  1. Both JSONObject instances have equivalent properties (i.e., the keys and values match).
  2. Both JSONArray instances have equivalent values at the same index position within both arrays.

If either of these conditions is true, then the merged instance simply incorporates the property (for JSONObjects) or the value at the indexed position (for JSONArrays) once. For example, these two instances:

{
    "pi": 3.14,
    "phi": 1.62,
    "theodorus": 1.73
}
{
    "pi": 3.14,
    "phi": 1.62,
    "pythag": 1.414
}

Would return a merged result of:

{
    "pi": 3.14,
    "phi": 1.62,
    "theodorus": 1.73,
    "pythag": 1.414
}
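The conflict-free object merge shown above can be sketched with plain java.util maps. This is an illustration of the rule only, not the library's implementation: BasicObjectMergeSketch is a hypothetical class, maps stand in for JSONObject, and the sketch simply throws when it meets a key whose values differ, since conflict handling is covered later. It assumes Java 9 or later for Map.of.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

class BasicObjectMergeSketch {
    static Map<String, Object> merge(Map<String, Object> primary, Map<String, Object> secondary) {
        // Start from a copy of the primary so neither input is modified.
        Map<String, Object> merged = new LinkedHashMap<>(primary);
        for (Map.Entry<String, Object> entry : secondary.entrySet()) {
            String key = entry.getKey();
            if (!merged.containsKey(key)) {
                // Key exists only in the secondary: incorporate it.
                merged.put(key, entry.getValue());
            } else if (!Objects.equals(merged.get(key), entry.getValue())) {
                // Same key, different value: a conflict this sketch does not resolve.
                throw new IllegalStateException("Conflict at key: " + key);
            }
            // Same key, equal value: nothing to do, the property appears once.
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> first = Map.of("pi", 3.14, "phi", 1.62, "theodorus", 1.73);
        Map<String, Object> second = Map.of("pi", 3.14, "phi", 1.62, "pythag", 1.414);
        System.out.println(merge(first, second)); // four entries; key order may vary
    }
}
```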

Similarly, with a JSONArray merge, if the values are equivalent at the same index position, the merged array includes that value once. Values that differ at the same index position (here, 1.73 and 1.414) are both kept:

[
    3.14,
    1.62,
    1.73
]
[
    3.14,
    1.62,
    1.414
]

Would return a merged result of:

[
    3.14,
    1.62,
    1.73,
    1.414
]

A Few More Complex Examples

Up until now, we’ve discussed merging simple primitives (e.g., Strings and numbers). So what would a merge look like in the case of nested JSONObject or JSONArray values?

{
    "team": {
        "name": "Boston Red Sox",
        "league": "American League"
    }
}
{
    "team": {
        "name": "Boston Red Sox",
        "stadium": "Fenway Park",
        "yearBuilt": 1912
    }
}

Since both JSON instances have a team property AND the values in both are the same value type (JSONObject), then we can merge the team property into a single JSONObject value:

{
    "team": {
        "name": "Boston Red Sox",
        "league": "American League",
        "stadium": "Fenway Park",
        "yearBuilt": 1912        
    }
}

Let’s look at a JSONArray example:

[
    {
        "name": "Boston Red Sox",
        "league": "American League"
    },
    {
        "name": "New York Yankees",
        "league": "American League"
    }
]
[
    {
        "name": "Boston Red Sox",
        "stadium": "Fenway Park",
        "yearBuilt": 1912        
    },
    {
        "name": "New York Yankees",
        "stadium": "Yankee Stadium",
        "yearBuilt": 2009
    }
]

Since the values at each index position in both JSONArray instances are the same type (JSONObject), we can merge the values:


[
    {
        "name": "Boston Red Sox",
        "league": "American League",
        "stadium": "Fenway Park",
        "yearBuilt": 1912        
    },
    {
        "name": "New York Yankees",
        "league": "American League",
        "stadium": "Yankee Stadium",
        "yearBuilt": 2009
    }
]    
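The nested examples above amount to a recursive merge: objects are merged key by key, arrays index by index, and primitives must be equal. The following sketch shows that recursion with java.util maps and lists standing in for JSONObject and JSONArray. NestedMergeSketch is a hypothetical class, not part of JSON-JEP, and it makes two simplifying assumptions: values are never null, and any mismatch (unequal primitives, different types, or arrays of different length) is reported as an exception instead of being handed to a conflict strategy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class NestedMergeSketch {
    @SuppressWarnings("unchecked")
    static Object mergeValue(Object primary, Object secondary) {
        if (primary instanceof Map && secondary instanceof Map) {
            // Objects: copy the primary, then merge each secondary property into it.
            Map<String, Object> merged = new LinkedHashMap<>((Map<String, Object>) primary);
            ((Map<String, Object>) secondary).forEach(
                    (key, value) -> merged.merge(key, value, NestedMergeSketch::mergeValue));
            return merged;
        }
        if (primary instanceof List && secondary instanceof List) {
            // Arrays: merge the values found at the same index position.
            List<Object> p = (List<Object>) primary;
            List<Object> s = (List<Object>) secondary;
            if (p.size() != s.size()) {
                throw new IllegalStateException("Array lengths differ");
            }
            List<Object> merged = new ArrayList<>();
            for (int i = 0; i < p.size(); i++) {
                merged.add(mergeValue(p.get(i), s.get(i)));
            }
            return merged;
        }
        if (Objects.equals(primary, secondary)) {
            return primary; // equal primitives collapse to one value
        }
        throw new IllegalStateException("Conflict: " + primary + " vs " + secondary);
    }

    public static void main(String[] args) {
        Object merged = mergeValue(
                Map.of("team", Map.of("name", "Boston Red Sox", "league", "American League")),
                Map.of("team", Map.of("name", "Boston Red Sox", "stadium", "Fenway Park")));
        System.out.println(merged); // one "team" object with name, league and stadium
    }
}
```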

A Brief Discussion On Equivalency

There are three levels of equivalency in use for JSON nodes. This is important as this sets the core assumptions for how we handle non-equivalencies and conflicts.

  1. A JSONValue is equivalent if:

    • Both values are the same type (JSONValueType)
    • For primitive values, they can be evaluated as equal by the Object.equals() method or by the compareTo() method from the Comparable<T> interface. For example, true == true, 234 == 234, "foo".equals("foo"), null == null
    • For JSONObject or JSONArray values, the isEquivalent(JSONNode) method returns true.
  2. A JSONObject is equivalent to another if:

    • Both JSONObjects contain exactly the same keys
    • The values (JSONValue) for each key are equivalent
    • The order of the keys does not matter
    • Example: {"a": "foo", "b": "bar"} == {"b": "bar", "a": "foo"}
  3. A JSONArray is equivalent to another if:

    • Both JSONArray instances contain the same number of elements
    • The values (JSONValue) at each index position are equivalent
    • The order of the elements in the array does matter
    • Example: [1, 3, 5] == [1, 3, 5]. However, [1, 3, 5] != [1, 5, 3]

IMPORTANT: Equivalency in JSONObjects and JSONArrays is recursive. This means that the entire structure of each of the JSONNodes compared must be equivalent using the conditions set out above. If any of these conditions at any point in the structure is not true, then the entire structure is not equivalent.
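The three rules and their recursive nature can be sketched as a single method over plain java.util values. EquivalencySketch is a hypothetical class used only for illustration, with Map and List standing in for JSONObject and JSONArray; it is not the library's isEquivalent implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

class EquivalencySketch {
    static boolean isEquivalent(Object a, Object b) {
        if (a instanceof Map && b instanceof Map) {
            Map<?, ?> first = (Map<?, ?>) a;
            Map<?, ?> second = (Map<?, ?>) b;
            // Same set of keys (order is irrelevant), and equivalent values per key.
            if (!first.keySet().equals(second.keySet())) {
                return false;
            }
            for (Object key : first.keySet()) {
                if (!isEquivalent(first.get(key), second.get(key))) {
                    return false;
                }
            }
            return true;
        }
        if (a instanceof List && b instanceof List) {
            List<?> first = (List<?>) a;
            List<?> second = (List<?>) b;
            // Same length, and equivalent values at every index (order matters).
            if (first.size() != second.size()) {
                return false;
            }
            for (int i = 0; i < first.size(); i++) {
                if (!isEquivalent(first.get(i), second.get(i))) {
                    return false;
                }
            }
            return true;
        }
        // Primitives, or values of different types: fall back to equals().
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(isEquivalent(Map.of("a", "foo", "b", "bar"), Map.of("b", "bar", "a", "foo"))); // true
        System.out.println(isEquivalent(List.of(1, 3, 5), List.of(1, 3, 5))); // true
        System.out.println(isEquivalent(List.of(1, 3, 5), List.of(1, 5, 3))); // false
    }
}
```

A single non-equivalent value anywhere in a nested structure makes the whole comparison return false, which matches the note above.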

Managing Conflicts

Up to this point, we’ve discussed merge scenarios where properties were either distinctly equivalent, or distinctly different. For example, in the case of a JSONObject, we might have key/value properties where the values are the same:

{"foo": "bar"}

{"foo": "bar"}

Returns:

{"foo": "bar"}

Likewise, we’ve discussed how a merge handles distinctly different keys:

{"foo": "bar"}

{"bar": "baz"}

Returns:

{"foo": "bar", "bar": "baz"}

However, what if we attempt to merge two JSON instances with the same keys but non-equivalent values? Remember that we defined equivalency in terms of both type (via the getType() value on the JSONValue interface) and the underlying value. When both values are JSONObjects, or both are JSONArrays, we merge the two values; however, if the types are different, or the values are primitive types AND are not equal, then we need to apply a conflict strategy to address them.
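The decision described above, applied to the two values found under one shared key, can be sketched as a small classification function. ConflictCheckSketch and its Outcome enum are hypothetical names for illustration, and java.util maps and lists stand in for JSONObject and JSONArray.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

class ConflictCheckSketch {
    enum Outcome { EQUIVALENT, MERGE_NESTED, CONFLICT }

    static Outcome classify(Object primaryValue, Object secondaryValue) {
        boolean bothObjects = primaryValue instanceof Map && secondaryValue instanceof Map;
        boolean bothArrays = primaryValue instanceof List && secondaryValue instanceof List;
        if (bothObjects || bothArrays) {
            return Outcome.MERGE_NESTED; // same container type: merge the two values
        }
        if (Objects.equals(primaryValue, secondaryValue)) {
            return Outcome.EQUIVALENT;   // equal primitives: keep one copy
        }
        return Outcome.CONFLICT;         // different types or unequal primitives
    }

    public static void main(String[] args) {
        System.out.println(classify("bar", "bar"));                  // EQUIVALENT
        System.out.println(classify("bar", "baz"));                  // CONFLICT
        System.out.println(classify(Map.of("a", 1), Map.of("b", 2))); // MERGE_NESTED
        System.out.println(classify(List.of(1), Map.of("a", 1)));     // CONFLICT
    }
}
```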

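For JSONObject instances with the same key but different values, we could choose a conflict strategy that declares one of the following:

  1. The value from the primary instance always wins.
  2. The value from the secondary instance always wins.
  3. Both values are kept, with the secondary value stored under a new key.

These are different strategies that you might choose depending on your data and the underlying business requirements.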

Likewise, we could apply different strategies for conflicts in JSONArrays. Here a conflict occurs when a value exists at a given index position in both the primary and secondary arrays but the two values are different, or when a value at a given index position in the primary does not exist in the secondary, or vice versa. Since order matters within a merged JSONArray, there are several ways to handle conflicts:

  1. Append the conflicting values to the end of the merged array.
  2. Insert each conflicting value before the primary value at the same index position.
  3. Insert each conflicting value after the primary value at the same index position.

There’s a fourth option, which really is a special case: we could choose to deduplicate values so that our merged array contains only distinct values.

The ConflictStrategy Interface

ConflictStrategy is the base interface for handling conflicts for either a JSONArray or JSONObject instance and defines a single method to resolve conflicts.

The interface also defines the getMergeProcessor() method that returns the MergeProcess instance that created this instance.

Each ConflictStrategy is type-specific, meaning that it applies only to a JSONObject or JSONArray context. For this, there are distinct subinterfaces for each type: JSONObjectConflictStrategy and JSONArrayConflictStrategy.

For each of these subinterfaces, there are abstract implementations that handle the basic blocking and tackling of creating instances that return the MergeProcess.

JSONObject Conflict Strategies

There are three concrete implementations of the JSONObjectConflictStrategy interface. Each addresses merge conflicts differently:

AcceptPrimaryConflictStrategy

This strategy will use the value from the primary JSONObject instance in the event of a merge conflict for a given property key. Example:

//primary
{
    "foo": "bar"
}

//secondary
{
    "foo": "baz"
}

returns value from the primary:

{
    "foo": "bar"
}

AcceptSecondaryConflictStrategy

This strategy uses the value from the secondary JSONObject instance in the event of a merge conflict for a given property key. It’s the reverse of the AcceptPrimaryConflictStrategy. Example:

//primary
{
    "foo": "bar"
}

//secondary
{
    "foo": "baz"
}

returns value from the secondary:

{
    "foo": "baz"
}

AppendObjectConflictStrategy

In the event of a merge conflict, this strategy keeps the primary value under the original key and adds the secondary value under a new key. The new key is a concatenation of the original key and the value of the MERGE_APPEND_KEY factory setting. The default value is _append. You can change this value using the FactorySettings.applySetting(Setting.MERGE_APPEND_KEY, "[new_value]") method. Example:

//primary
{
    "foo": "bar"
}

//secondary
{
    "foo": "baz"
}

returns:

{
    "foo": "bar",
    "foo_append": "baz"
}
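The behavior of the three JSONObject strategies can be compared side by side in a small sketch. It uses java.util maps in place of JSONObject, and the ObjectConflictSketch class, its Resolver interface, and the constant and method names are hypothetical; they only mirror the behavior described above and are not the library's ConflictStrategy types. The append suffix is passed in as a plain string instead of being read from FactorySettings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

class ObjectConflictSketch {
    /** Hypothetical stand-in for an object conflict strategy. */
    interface Resolver {
        void resolve(String key, Object primaryValue, Object secondaryValue, Map<String, Object> merged);
    }

    static final Resolver ACCEPT_PRIMARY = (key, p, s, merged) -> merged.put(key, p);
    static final Resolver ACCEPT_SECONDARY = (key, p, s, merged) -> merged.put(key, s);

    /** Keeps the primary under the original key and the secondary under key + suffix. */
    static Resolver append(String suffix) {
        return (key, p, s, merged) -> {
            merged.put(key, p);
            merged.put(key + suffix, s);
        };
    }

    static Map<String, Object> merge(Map<String, Object> primary,
                                     Map<String, Object> secondary,
                                     Resolver resolver) {
        Map<String, Object> merged = new LinkedHashMap<>(primary);
        secondary.forEach((key, value) -> {
            if (!primary.containsKey(key)) {
                merged.put(key, value); // no counterpart in the primary: just add it
            } else if (!Objects.equals(primary.get(key), value)) {
                resolver.resolve(key, primary.get(key), value, merged); // conflict
            }
        });
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> primary = Map.of("foo", "bar");
        Map<String, Object> secondary = Map.of("foo", "baz");
        System.out.println(merge(primary, secondary, ACCEPT_PRIMARY));    // {foo=bar}
        System.out.println(merge(primary, secondary, ACCEPT_SECONDARY));  // {foo=baz}
        System.out.println(merge(primary, secondary, append("_append"))); // {foo=bar, foo_append=baz}
    }
}
```

The sketch only handles primitive values at the top level; a fuller version would recurse into nested objects and arrays before deciding that two values conflict.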

JSONArray Conflict Strategies

There are four defined conflict strategies that can be applied to JSONArray instances that implement the JSONArrayConflictStrategy interface:

AppendArrayConflictStrategy

Merge conflicts are appended to the end of the new array. Example:

//primary
[
    1,
    2,
    3
]
//secondary
[
    1,
    3,
    4
]

returns an array where the conflicting values from the secondary (3, 4) are appended to the end. Note that the value 3 is included twice:

[
    1,
    2,
    3,
    3,
    4
]

This also applies to arrays of different size:

//primary
[
    1,
    2,
    3
]

//secondary
[
    3,
    4
]

returns:

[
    1,
    2,
    3,
    3,
    4
]
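Both append examples are consistent with a simple rule: keep the primary array as is, then append every secondary value that is not equivalent to the primary value at the same index. The sketch below implements that reading with java.util lists. AppendArraySketch is a hypothetical class, and the rule itself is inferred from the examples above, so treat it as an approximation of the strategy and not as its actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class AppendArraySketch {
    static List<Object> merge(List<?> primary, List<?> secondary) {
        // The primary values keep their positions.
        List<Object> merged = new ArrayList<>(primary);
        for (int i = 0; i < secondary.size(); i++) {
            boolean equivalent = i < primary.size()
                    && Objects.equals(primary.get(i), secondary.get(i));
            if (!equivalent) {
                // Differs from, or has no counterpart in, the primary: append it.
                merged.add(secondary.get(i));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of(1, 2, 3), List.of(1, 3, 4))); // [1, 2, 3, 3, 4]
        System.out.println(merge(List.of(1, 2, 3), List.of(3, 4)));    // [1, 2, 3, 3, 4]
    }
}
```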

InsertBeforeConflictStrategy

The conflicting secondary value at each index position is inserted before the primary value. Example:

//primary
[
    1,
    2,
    3
]
//secondary
[
    1,
    3,
    4
]

returns:

[
    1,
    3,
    2,
    4,
    3
]

InsertAfterConflictStrategy

The conflicting secondary value at each index position is inserted after the primary value. Example:

//primary
[
    1,
    2,
    3
]
//secondary
[
    1,
    4,
    6
]

returns:

[
    1,
    2,
    4,
    3,
    6
]
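The insert-before and insert-after strategies differ only in which value comes first when two values at the same index conflict. A minimal sketch with java.util lists follows. InsertArraySketch and its secondaryFirst flag are hypothetical, and the handling of arrays with different lengths (the lone value is simply emitted) is an assumption, since the examples above only cover arrays of equal length.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class InsertArraySketch {
    /** secondaryFirst = true mimics insert-before; false mimics insert-after. */
    static List<Object> merge(List<?> primary, List<?> secondary, boolean secondaryFirst) {
        List<Object> merged = new ArrayList<>();
        int length = Math.max(primary.size(), secondary.size());
        for (int i = 0; i < length; i++) {
            boolean hasPrimary = i < primary.size();
            boolean hasSecondary = i < secondary.size();
            if (hasPrimary && hasSecondary && Objects.equals(primary.get(i), secondary.get(i))) {
                merged.add(primary.get(i)); // equivalent values: keep one copy
                continue;
            }
            // Conflict: emit both values in the order the strategy dictates.
            if (hasSecondary && secondaryFirst) merged.add(secondary.get(i));
            if (hasPrimary) merged.add(primary.get(i));
            if (hasSecondary && !secondaryFirst) merged.add(secondary.get(i));
        }
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of(1, 2, 3), List.of(1, 3, 4), true));  // [1, 3, 2, 4, 3]
        System.out.println(merge(List.of(1, 2, 3), List.of(1, 4, 6), false)); // [1, 2, 4, 3, 6]
    }
}
```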

DeduplicateArrayConflictStrategy

Creates a merged array that includes only distinct values from both arrays. Note that the merged values are not necessarily sorted. Example:

//primary
[
    1,
    2,
    3
]
//secondary
[
    1,
    3,
    4
]

returns:

[
    1,
    2,
    3,
    4
]
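Deduplication can be sketched with a set that remembers insertion order. DeduplicateSketch is a hypothetical class built on java.util collections, not the library's strategy. Note one difference in guarantees: this sketch always returns values in first-seen order (primary first, then secondary), whereas the strategy described above makes no promise about ordering.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DeduplicateSketch {
    static List<Object> merge(List<?> primary, List<?> secondary) {
        // LinkedHashSet drops repeated values while keeping first-seen order.
        Set<Object> distinct = new LinkedHashSet<>(primary);
        distinct.addAll(secondary);
        return new ArrayList<>(distinct);
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of(1, 2, 3), List.of(1, 3, 4))); // [1, 2, 3, 4]
    }
}
```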