Optimal GIT storage for blueprints with minimal diffs (relative IDs)
Posted: Tue Jul 12, 2022 7:19 pm
I would like to introduce FaTul. It stores blueprints in git so that a new version changes just a few lines, even if every entity ID has changed and some entities have been moved.
* Works directly with clipboard
* Minimizes text changes between blueprint versions
* Uses human-readable but compact JSON format
* Sorts entities by their x,y coordinates
* Does not store entity_number (IDs) in the text files
* Uses relative entity position instead of entity_id
* Normalizes entity x,y position
* Stores blueprint books as directories
To test, I compared the git history of the popular Brian's blueprints (trains, city, ...). In some cases, FaTul stored a new git revision with over 1,000 times (!) fewer changed lines.
Raw Factorio JSON is a poor fit for git because the entity IDs and their order can change on every export, creating a lot of meaningless text changes. Every entity in a blueprint has an x,y position, so FaTul can replace an ID reference with a relative link to the target entity:
* Entity 1 is at {x: 2, y: 5}
* Entity 2 is at {x: 3, y: -1}
* If entity 2 references entity 1, the relative link from 2 to 1 would be "-1,6" (computed as 2-3, 5-(-1)). FaTul will replace all "entity_id": 1 inside entity 2 with "entity_rel": "-1,6".
* The same approach is used for the neighbours field (an array of entity IDs)
* If anything else uses a list of entity IDs (rather than an entity_id field), please create an issue.
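The relative-link idea above can be sketched in a few lines of Python. This is only an illustration, not FaTul's actual code: the function names (`rel_link`, `relativize`) are made up, and keeping the `neighbours` key while turning its IDs into "dx,dy" strings is an assumption about the output format.

```python
def _fmt(v):
    # Print whole numbers without a trailing ".0", so 6.0 becomes "6".
    return str(int(v)) if float(v).is_integer() else str(v)


def rel_link(source, target):
    """Offset from source to target as "dx,dy" (target minus source)."""
    dx = target["x"] - source["x"]
    dy = target["y"] - source["y"]
    return f"{_fmt(dx)},{_fmt(dy)}"


def relativize(entities):
    """Return copies of the entities with ID references replaced by offsets."""
    positions = {e["entity_number"]: e["position"] for e in entities}

    def walk(node, origin):
        if isinstance(node, dict):
            out = {}
            for key, value in node.items():
                if key == "entity_id":
                    # "entity_id": 1 becomes "entity_rel": "-1,6"
                    out["entity_rel"] = rel_link(origin, positions[value])
                elif key == "neighbours":
                    # Assumption: the list of IDs becomes a list of offsets.
                    out[key] = [rel_link(origin, positions[n]) for n in value]
                else:
                    out[key] = walk(value, origin)
            return out
        if isinstance(node, list):
            return [walk(item, origin) for item in node]
        return node

    result = []
    for e in entities:
        # Drop entity_number so renumbering on export never shows up in a diff.
        body = {k: v for k, v in e.items() if k != "entity_number"}
        result.append(walk(body, e["position"]))
    return result
```

With the two entities from the example, `rel_link({"x": 3, "y": -1}, {"x": 2, "y": 5})` gives "-1,6", and that string stays the same no matter how Factorio renumbers the entities.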
Sorting is another source of large diffs. Factorio may emit entities in a different order on every export, so FaTul re-sorts them by their x,y position along a Z-order curve. This way, entities that are close together in a blueprint are more likely to stay close together in the list.
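A Z-order (Morton) sort can be sketched like this. Again, this is a rough illustration under assumptions rather than FaTul's real implementation: the function names are made up, and shifting coordinates to start at zero and doubling them (so half-tile positions become integers) is just one possible way to get non-negative integer keys.

```python
def interleave_bits(x, y):
    """Morton code: bit i of x lands on bit 2i, bit i of y on bit 2i+1."""
    code = 0
    i = 0
    while x >> i or y >> i:
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
        i += 1
    return code


def z_order_sorted(entities):
    """Return the entities sorted along a Z-order curve over their positions."""
    if not entities:
        return []
    min_x = min(e["position"]["x"] for e in entities)
    min_y = min(e["position"]["y"] for e in entities)

    def key(e):
        # Assumption: shift to non-negative and double, so 0.5 steps are integers.
        gx = int(round((e["position"]["x"] - min_x) * 2))
        gy = int(round((e["position"]["y"] - min_y) * 2))
        return interleave_bits(gx, gy)

    return sorted(entities, key=key)
```

Because the key depends only on positions, the result is the same regardless of the order in which Factorio exported the entities, as long as no two entities share the same position.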
Git stats for one commit from Brian's blueprints, without FaTul (-) and with it (+):
2022-07-06 | SE: fixed light oil cracking in Refinery [Brian White]
-5a5fcd2 4 files changed, 45658 insertions(+), 45711 deletions(-)
+2d30d15 4 files changed, 34 insertions(+), 46 deletions(-)