Increasing Parallelization by Adapted Thread Structure



Post by Premu »

TL;DR
A possible change to the game's multi-core architecture could open up more options for utilizing multiple cores.

What?
One important caveat: I don't know Factorio's actual software architecture or its overall multi-core approach. Based on dev messages, I just assume it looks like this (visualized with the second best tool for everything):

Image

There is one main thread which splits up into parallel threads when possible. After all of these terminate, they are synchronized and the main thread continues until it reaches the end. Should there be time left, the game will wait until it starts the main thread over for the next tick.
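To make the assumed model concrete, here is a minimal fork-join sketch in Python. It is not Factorio's code: `update_chunk`, `run_tick` and `run_ticks` are hypothetical names, and waiting out the rest of the tick is left out.

```python
from concurrent.futures import ThreadPoolExecutor


def update_chunk(chunk):
    # Hypothetical stand-in for work that is independent per chunk.
    return [value + 1 for value in chunk]


def run_tick(chunks, pool):
    # Fork: hand every independent chunk to a worker thread.
    futures = [pool.submit(update_chunk, chunk) for chunk in chunks]
    # Join: the main thread blocks here until every worker has finished.
    return [future.result() for future in futures]


def run_ticks(chunks, ticks, workers=4):
    # The main thread drives the ticks; workers only exist inside a tick.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(ticks):
            chunks = run_tick(chunks, pool)
    return chunks
```

The point to notice is the join: if one worker is slow, the main thread and all other workers have to wait for it before the tick can continue.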

Once a variable is overwritten, any later read returns the new value; a read before that returns the value set in the last tick. You should typically not read a variable owned by a parallel thread, though, unless you have protection mechanisms like semaphores in place, or weird things can happen.

Now there is another approach to multi-core that I know from my work. I work for an automotive supplier, developing software for an electronic brake system. It even has some similarities to Factorio. Just as Factorio has ticks, we have "loops". And just as in Factorio, they always run the same routine and should finish in time. In Factorio's case, not finishing in time means the game gets laggy. In our case it forces a controller reset, and a lot of warning lamps will light up on the car's dashboard.

When I started over 10 years ago, we still had only single-core software. But soon we came into the situation that we had to use multi-core processors, as the frequency of MCUs can't simply be increased, and new cores were added instead. So our single-core software had to be split up. But here a different approach is used than the one above, and it might actually work for Factorio. The key difference - we actually accept that one core is using "outdated" data calculated by the other core in the last loop:

Image

Right at the beginning of our loop, a central synchronization is done, so that each core gets the data it needs from the other core, calculated in the last loop. Now both loops will run in parallel without any further synchronization. If a core uses a variable belonging to a component allocated to the same core, it can access it directly. If it wants to read a value belonging to the other core, it can only access a copy of it that was updated in the sync phase, while the other core might have already changed the original.

This adds some latency to signals - but importantly it is a deterministic latency! Things might happen a little bit later, but in many cases this is good enough. The behaviour stays deterministic, and we don't have to idle around waiting for the other core to finish its job.
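The sync-then-run scheme can be sketched in a few lines of Python. This is only an illustration under assumptions: `Core`, `sync`, `step_a`, `step_b` and `run` are made-up names, and the `order` argument stands in for whatever timing the two cores happen to have within one loop.

```python
from dataclasses import dataclass


@dataclass
class Core:
    value: int = 0      # state owned by this core
    peer_copy: int = 0  # copy of the other core's value, taken in the sync phase


def sync(a, b):
    # Sync phase: exchange last loop's results before either core runs.
    a.peer_copy, b.peer_copy = b.value, a.value


def step_a(core):
    core.value += 1  # core A simply counts loops


def step_b(core):
    # Core B only ever sees A's value from the previous loop.
    core.value = core.peer_copy * 10


def run(loops, order):
    a, b = Core(), Core()
    steps = {"A": (a, step_a), "B": (b, step_b)}
    history = []
    for _ in range(loops):
        sync(a, b)
        for name in order:  # stand-in for arbitrary parallel timing
            core, step = steps[name]
            step(core)
        history.append((a.value, b.value))
    return history
```

Both `run(3, ("A", "B"))` and `run(3, ("B", "A"))` give `[(1, 0), (2, 10), (3, 20)]`: B lags A by exactly one loop, no matter which core gets there first.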

Caution needs to be applied, though - defining a group of components to move to a different core/thread is tricky. This group should have only limited interfaces to the outside, as the synchronization obviously creates additional overhead, and every new interface that crosses the group boundary later on means a headache when changing the implementation.
Application in Factorio
Now - can we apply this to Factorio as well? I believe so, as many things don't really matter if they happen one tick later. Some potential ideas where you could use such a technique, just based on my observations in the game:

- Pollution: Probably the safest bet here. The player will not care and not even detect that the pollution cloud moves and is absorbed with one tick delay.
- Biters: These have only very limited interactions with your factory. So if they move with a slight delay and their attacks only show results one tick later, it could work.
- Trains: Trains, signals, stations and rails might be handled in their own thread as well. They move pretty independently from the rest of the factory. Some conditions for stations might only be detected one tick later, but these should typically not be so volatile that it makes any visible difference.
- Fluids: Fluids and everything handling those are pretty much independent from all the "solid" stuff. There might be potential to move that out. (Although as fluid handling is already pretty difficult this is probably not a good candidate to start with...)
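Taking pollution as the example, a rough sketch with real threads could look like this. All names and numbers here (`factory_step`, `pollution_step`, 5 units emitted, 1 unit absorbed) are invented for illustration and have nothing to do with the actual game code.

```python
from concurrent.futures import ThreadPoolExecutor


def factory_step(own, snapshot):
    # Hypothetical factory group: emits a fixed amount of pollution per tick.
    return {"emitted": 5}


def pollution_step(own, snapshot):
    # Hypothetical pollution group: absorbs 1 unit, then adds what the
    # factory emitted in the LAST tick (read from the frozen snapshot).
    cloud = max(own["cloud"] - 1, 0) + snapshot["factory"]["emitted"]
    return {"cloud": cloud}


def simulate(ticks):
    state = {"factory": {"emitted": 0}, "pollution": {"cloud": 0}}
    groups = {"factory": factory_step, "pollution": pollution_step}
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        for _ in range(ticks):
            # Sync phase: freeze a copy of every group's state.
            snapshot = {name: dict(values) for name, values in state.items()}
            # Both groups run in parallel and only read the snapshot.
            futures = {
                name: pool.submit(step, snapshot[name], snapshot)
                for name, step in groups.items()
            }
            # Each group returns its own new state; nothing shared is written.
            state = {name: future.result() for name, future in futures.items()}
    return state
```

After one tick the cloud is still 0 because the first emission has not been synced yet; after three ticks it is 9. The result does not depend on which thread finishes first, since each group only reads the snapshot and only writes its own state.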
Risks and Chances
Obviously I'm aware - while the concept itself might not sound too difficult, changing the architecture of the whole game is a very challenging task. And if things go wrong with multi-core handling, weird things happen.

Still, distributing the running software significantly better across all cores could offer massive potential for a performance increase. Because the performance of a single core won't increase much in the foreseeable future, any performance increase can only come from making the existing software even more efficient - something where a lot has already been squeezed out - or from allowing more parallelization.

In case there's actual interest in taking that approach I might give some more hints, ideas and experiences.


Re: Increasing Parallelization by Adapted Thread Structure

Post by TheKillerChicken »

Not just parallelization; I feel SMT would be great for this game as well. There are gaming PCs with 64 cores/128 threads now, so I am on board with this setup. It never made sense to me why Wube did not implement such a thing in the first place, as there were machines with 12 cores/24 threads readily available at the time of the original production. I would also love to see this game utilise NUMA nodes.
