March 12, 2019
Fellow frontend/web developer, unless you are still using tech from the ’90s, you have used Node at some point for development purposes. The most popular choice is a nice combo of the following:
Part of it has to do with the child_process module, which lets us create processes in 4 different ways: spawn, fork, exec and execFile.
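Please note that all of those child processes implement Node’s EventEmitter API, so we can listen to them and react to events. But remember: an EventEmitter calls its handlers synchronously, in the order they were registered. A handler can kick off asynchronous work, but the emitter will not wait for it to finish. Wawn wawn…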
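To make the handler behaviour concrete, here is a minimal sketch using a plain EventEmitter (the `ping` event name and the `log` array are made up for the illustration). It only shows the order in which things happen; a child process object emits its events the same way.

```javascript
const { EventEmitter } = require('events');

const emitter = new EventEmitter();
const log = [];

// Listeners run synchronously, in registration order, inside emit().
emitter.on('ping', () => log.push('first listener'));

// A listener may start asynchronous work, but emit() does not wait for it.
emitter.on('ping', () => {
  setImmediate(() => log.push('async work finished'));
  log.push('second listener');
});

emitter.emit('ping');
log.push('after emit');

// At this point only the synchronous entries are in the log.
console.log(log);

// The asynchronous entry shows up later, once the event loop gets to it.
setImmediate(() => console.log(log));
```

So if a handler needs the result of something asynchronous, it has to deal with that itself (callback, promise, another event), because whoever emitted the event has already moved on.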
IMHO, the use of spawn is preferable to exec or execFile. With exec you run a command as if you were literally typing it into a shell, so you can use pipes just like in bash. Unfortunately, it buffers the output and only hands it over once the command has finished (execFile buffers in the same way, but skips the shell by default). With spawn, on the other hand, you have to wire things up yourself through the child's streams (with the pipe method, for example), but the output is not buffered: it comes out straight away.
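A rough side-by-side sketch of the two styles, assuming nothing more than the Node binary itself as the command to run (`viaExec` and `viaSpawn` are hypothetical helper names, and `node -p "6 * 7"` is just a stand-in for a real command):

```javascript
const { exec, spawn } = require('child_process');

// exec: the command string goes through a shell and the output is buffered;
// the callback receives all of stdout at once when the command has finished.
function viaExec() {
  return new Promise((resolve, reject) => {
    exec(`"${process.execPath}" -p "6 * 7"`, (error, stdout) => {
      if (error) reject(error);
      else resolve(stdout.trim());
    });
  });
}

// spawn: no shell by default, arguments are passed as an array, and
// stdout is a stream that emits 'data' chunks as soon as they are written.
function viaSpawn() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-p', '6 * 7']);
    const chunks = [];
    child.stdout.on('data', (chunk) => chunks.push(chunk));
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve(Buffer.concat(chunks).toString().trim());
      else reject(new Error(`child exited with code ${code}`));
    });
  });
}

viaExec().then((out) => console.log('exec :', out));
viaSpawn().then((out) => console.log('spawn:', out));
```

If all you want is to forward the child's output to the terminal, `child.stdout.pipe(process.stdout)` could replace the `'data'` handler in the spawn version.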
The most interesting function in child_process is fork.
Yup I know. Anyway, whenever you want to start creating microservices or handling way more requests, you may want to invest your time (and potentially money) in forking processes. fork lets you run Node processes, detached or not (depending on configuration), that stay connected to a parent process. Any child process can communicate with its parent and vice versa by using the send method.
How cool is that?
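Here is a small sketch of that parent/child conversation. To keep it in one piece, the child's code is written to a temporary file first; in a real project it would simply be its own module. The `askChild` helper and the message shape (`text`, `reply`) are made up for the example.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Demo only: the child's source is written to a temp file so the sketch is
// self-contained. Normally this would be a separate file in your project.
const childSource = `
  process.on('message', (msg) => {
    // Reply to the parent over the IPC channel that fork() sets up.
    process.send({ reply: msg.text.toUpperCase(), pid: process.pid });
  });
`;
const childFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-')), 'child.js');
fs.writeFileSync(childFile, childSource);

// Hypothetical helper: fork a child, send it one message, wait for one reply.
function askChild(text) {
  return new Promise((resolve, reject) => {
    const child = fork(childFile);
    child.once('error', reject);
    child.once('message', (msg) => {
      child.disconnect(); // close the IPC channel so both processes can exit
      resolve(msg);
    });
    child.send({ text }); // parent -> child
  });
}

askChild('hello from the parent').then((msg) => console.log(msg));
```

The parent talks through `child.send()` and the child answers through `process.send()`; both sides receive on a `'message'` event.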
However, if something goes wrong on one of the servers, you may want to actually restart the process, right? The parent process can always be aware of the forks it created: we can keep an array of child processes that relate to a single parent. In order to clone as many as we want, we can actually use the cluster module.
Node's cluster module helps us spin processes up or down, detect when a process has gone down, and restart it or create a new child process to replace it.
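A minimal sketch of the "restart it when it dies" idea, under a few assumptions: the worker is a throwaway script that crashes on purpose, `superviseWorker` and `maxRestarts` are hypothetical names, and the restart limit is only there so the demo terminates. Normally `cluster.fork()` re-runs the current script and you branch on whether you are the primary or a worker; here the worker lives in a separate temp file to keep the example self-contained.

```javascript
const cluster = require('cluster');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Demo worker: simulates a crash by exiting with code 1 shortly after starting.
const workerFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'cluster-demo-')), 'worker.js');
fs.writeFileSync(workerFile, 'setTimeout(() => process.exit(1), 50);');

// setupPrimary is called setupMaster on older Node versions.
const setup = cluster.setupPrimary || cluster.setupMaster;
setup.call(cluster, { exec: workerFile });

// Hypothetical helper: start one worker and replace it every time it dies,
// giving up after maxRestarts replacements.
function superviseWorker(maxRestarts) {
  return new Promise((resolve) => {
    const pids = [];
    const start = () => {
      const worker = cluster.fork();
      pids.push(worker.process.pid);
      worker.on('exit', (code) => {
        console.log(`worker ${worker.process.pid} exited with code ${code}`);
        if (pids.length <= maxRestarts) start(); // spin up a replacement
        else resolve(pids);
      });
    };
    start();
  });
}

superviseWorker(2).then((pids) => console.log('workers started:', pids.length));
```

A real supervisor would probably add a delay or a back-off between restarts, so that a worker that crashes on startup does not get respawned in a tight loop.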
Ah… the good old state manager… Because we can now run processes in a cluster, each with different responsibilities, the forked processes (child processes) may not be aware of the general state of the app. Actually, unless a child communicates back to the parent and the parent then communicates this to the other children, there is no way to update the general app state. That is why we should keep the state in a database, or some other small store, that is detached from the remaining processes.
It would be better to allocate one process to deal with the state, have it communicate with the parent or master node, and then broadcast the state to the remaining child processes. At that point we can be sure that the state is consistent across master nodes, microservices, and any other processes.
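As a sketch of that broadcast, here is a simplified version where the parent itself holds the state and relays it (a dedicated state process or a database would sit behind the parent in a fuller setup). One child plays the "writer" and sends an update; the parent applies it and pushes the new state to every child. All names here (`broadcastDemo`, the `update`/`state`/`ack` message types, the `visits` field) are invented for the illustration.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Demo child: keeps a local copy of the shared state and acknowledges updates.
const childSource = `
  let state = null;
  process.on('message', (msg) => {
    if (msg.type === 'state') {
      state = msg.state;
      process.send({ type: 'ack', state });
    }
  });
  // Only the "writer" child asks the parent to change the state.
  if (process.argv[2] === 'writer') {
    process.send({ type: 'update', patch: { visits: 1 } });
  }
`;
const childFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'state-demo-')), 'child.js');
fs.writeFileSync(childFile, childSource);

function broadcastDemo(workerCount) {
  return new Promise((resolve) => {
    let state = { visits: 0 };
    const children = [];
    const acks = [];
    const broadcast = () => children.forEach((c) => c.send({ type: 'state', state }));

    for (let i = 0; i < workerCount; i++) {
      const child = fork(childFile, [i === 0 ? 'writer' : 'reader']);
      child.on('message', (msg) => {
        if (msg.type === 'update') {
          state = { ...state, ...msg.patch }; // the parent owns the state
          broadcast();                        // and tells every child about it
        } else if (msg.type === 'ack') {
          acks.push(msg.state);
          if (acks.length === workerCount) {
            children.forEach((c) => c.disconnect()); // let everything exit
            resolve(acks);
          }
        }
      });
      children.push(child);
    }
  });
}

broadcastDemo(3).then((acks) => console.log('every child now sees:', acks));
```

The point of routing every change through a single owner is that the children never have to agree among themselves; they only ever receive the latest state.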
The last bit would be to actually do something about the master node. What would happen if the master node crashes? We need to deal with this type of situation. Because Node processes implement event emitters, we can react to those events, and because we were clever enough to keep an array of child processes, we can terminate them and then restart them as soon as possible. This approach gives us highly available clusters and close to zero downtime if something happens.
All of this is just words. You need to experiment, test and try. PM2 is the perfect example that covers the topics in this post (clusters, child processes, …) and lets you easily double-check the state of each process… So yes, play around with this, and welcome to the deep dive into Node.js.