This is the direct comparison (left: local data from three receivers at two sites 60 km apart; right: local data combined with OpenSky data).
The OpenSky data extends my own coverage area by adding more low-altitude aircraft, especially around the airports in Frankfurt and Amsterdam.
The history files are copies of the aircraft.json file and are written every 30 seconds (I first tried 10 s). Up to 60 history files are kept (I first tried 120). A new aircraft.json is generated every 10 s. receiver.json is modified to accept this longer refresh interval without a timeout warning.
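As an illustration only (this is not the actual script), the history rotation and the receiver.json change could be sketched as below. The file names history_N.json and the receiver.json fields "refresh" (milliseconds) and "history" (number of files) follow the dump1090 layout as far as I understand it and should be treated as assumptions; the function and constant names are made up for this example.

```python
import json
import shutil
from pathlib import Path

HISTORY_SLOTS = 60       # number of history files kept
REFRESH_MS = 10_000      # aircraft.json is regenerated every 10 s


def rotate_history(data_dir: Path, counter: int) -> Path:
    """Copy the current aircraft.json into the next history slot.

    Slots are reused in a ring, so at most HISTORY_SLOTS files exist.
    The caller is expected to invoke this every 30 seconds.
    """
    slot = counter % HISTORY_SLOTS
    target = data_dir / f"history_{slot}.json"
    shutil.copyfile(data_dir / "aircraft.json", target)
    return target


def patch_receiver(data_dir: Path) -> dict:
    """Write the longer refresh interval and history count to receiver.json.

    Assumption: the front end reads "refresh" and "history" from this file.
    """
    path = data_dir / "receiver.json"
    receiver = json.loads(path.read_text())
    receiver["refresh"] = REFRESH_MS
    receiver["history"] = HISTORY_SLOTS
    path.write_text(json.dumps(receiver))
    return receiver
```

Calling rotate_history with a counter that increases by one on every 30 second tick overwrites the oldest file once all 60 slots are in use.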
I filter the OpenSky data with latitude and longitude thresholds to limit it to Europe (an API option to request data within a certain radius would be helpful). Datasets without position, altitude and course are removed. In addition, flights with 'DLH' in the callsign are shown even when they are outside this area.
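The filter described above might look roughly like this. It is a sketch under assumptions, not the real code: the bounding box values are placeholders for "Europe", the dictionary keys ("lat", "lon", "altitude", "track", "flight") are hypothetical names for already-parsed OpenSky records, and a record is dropped here if any one of position, altitude or course is missing.

```python
# Placeholder thresholds for "Europe"; the real values are not given above.
LAT_MIN, LAT_MAX = 35.0, 72.0
LON_MIN, LON_MAX = -25.0, 45.0

# Callsign fragments that are kept regardless of position.
ALWAYS_SHOW = ("DLH",)


def keep(ac: dict) -> bool:
    """Return True if a parsed aircraft record should be shown."""
    lat, lon = ac.get("lat"), ac.get("lon")
    # Drop records lacking position, altitude or course.
    # "is None" is used so that a legitimate value of 0 is not discarded.
    if lat is None or lon is None:
        return False
    if ac.get("altitude") is None or ac.get("track") is None:
        return False
    # Selected callsigns bypass the area limit.
    callsign = ac.get("flight") or ""
    if any(tag in callsign for tag in ALWAYS_SHOW):
        return True
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def filter_aircraft(records: list[dict]) -> list[dict]:
    """Apply keep() to a list of records."""
    return [ac for ac in records if keep(ac)]
```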
I am still working on the efficiency of the merge. Generating human-readable JSON files took a few seconds longer than generating files without any spaces or newlines. I also need to reconsider my approach: since I feed all my dump1090-fa ADS-B data (without MLAT) to OpenSky, merging only the local MLAT data with the OpenSky data should give the same result with fewer or even no comparisons. The next idea is to merge OpenSky and ADSBexchange data (which contains MLAT) to improve coverage further.
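To make the two efficiency points concrete, here is a minimal sketch, assuming each aircraft record is a dictionary with a "hex" key as in aircraft.json. Keying both sources by that address in one dictionary avoids pairwise comparisons, and local records simply overwrite OpenSky records for the same aircraft. The function names are hypothetical; json.dumps with separators=(",", ":") is the standard way to get output without spaces or newlines.

```python
import json


def merge_aircraft(local: list[dict], remote: list[dict]) -> list[dict]:
    """Merge two aircraft lists by ICAO hex address; local entries win."""
    merged = {ac["hex"].lower(): ac for ac in remote}
    # Overwrite remote records with local ones for the same aircraft.
    merged.update({ac["hex"].lower(): ac for ac in local})
    return list(merged.values())


def dump_compact(doc: dict) -> str:
    """Serialize without any spaces or newlines."""
    return json.dumps(doc, separators=(",", ":"))


def dump_pretty(doc: dict) -> str:
    """Serialize in human-readable form (larger and slower to write)."""
    return json.dumps(doc, indent=2)
```

If only the local MLAT records are passed in as `local`, the overwrite step rarely finds an existing key, which is the "fewer or even no comparisons" case.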
I can publish a short howto after tidying up the code.