For starters, auto-restart scripts are your friend here. Schedule them with cron (assuming Linux) and have them check process names/IDs to make sure the right processes are up. That’s a decent first line of defense when a process crashes. If you’re game to add another server, the Elastic Stack has a lot of tools for collecting logs and monitoring your servers (there’s also Prometheus, which I’ve never used, so I’d personally lean towards Elastic).
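To make the auto-restart idea concrete, here is a minimal sketch of a watchdog that cron could run every minute. It is only one way to do it, and it assumes a POSIX system; the pid file path and start command are hypothetical placeholders, not anything from a real setup.

```python
#!/usr/bin/env python3
"""Cron-driven watchdog sketch: restart a server process if it is not running."""
import os
import subprocess
import sys

# Hypothetical values: replace with your own pid file and start command.
PID_FILE = "/var/run/gameserver.pid"
START_CMD = ["/opt/gameserver/start.sh"]


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it is missing or malformed."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def ensure_running(pid_file, start_cmd):
    """Start the process and record its PID if it is not already up.

    Returns True if a restart was performed, False if nothing had to be done.
    """
    if is_running(read_pid(pid_file)):
        return False
    proc = subprocess.Popen(start_cmd, start_new_session=True)
    with open(pid_file, "w") as f:
        f.write(str(proc.pid))
    return True


if __name__ == "__main__":
    try:
        print("restarted" if ensure_running(PID_FILE, START_CMD) else "ok")
    except OSError as exc:
        print(f"watchdog failed: {exc}", file=sys.stderr)
```

A crontab entry along the lines of `* * * * * /usr/bin/python3 /opt/gameserver/watchdog.py` (path again hypothetical) would run the check once a minute. A PID check alone can be fooled if the OS reuses the PID for an unrelated process, which is why also comparing the process name is worth adding.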
If you want to go crazy, you can even run a Docker or Kubernetes cluster, but I wouldn’t recommend it. Those tend to be a huge pain and come with such a long list of caveats that it’s frankly rarely worth it unless you’ve got a large operation and the staff to handle it. Containers are a pain to deal with even outside a cluster, and everything gets harder as soon as you move to a clustered environment.
Personally, I’d probably start with auto-restart scripts + regular email health updates + immediate email alerts when something goes wrong. If you want to get fancy you can even use a service like Twilio to send yourself an SMS, though you’ll probably need to do some scripting to pull that off.
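As a rough illustration of the email side, the sketch below builds an immediate alert and a periodic health report using only the Python standard library. The SMTP host, port, and addresses are placeholder assumptions, and it presumes a mail relay that accepts unauthenticated mail from the server, which many setups will not.

```python
import smtplib
import socket
from datetime import datetime, timezone
from email.message import EmailMessage

# Hypothetical settings: point these at your own mail relay and mailbox.
SMTP_HOST = "localhost"
SMTP_PORT = 25
SENDER = "watchdog@example.com"
RECIPIENT = "admin@example.com"


def build_message(subject, body, sender=SENDER, recipient=RECIPIENT):
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(body)
    return msg


def build_alert(process_name, detail):
    """Immediate alert for a single failed process."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    subject = f"[ALERT] {process_name} down on {socket.gethostname()}"
    return build_message(subject, f"{now}: {detail}")


def build_health_report(statuses):
    """Periodic summary; statuses maps a process name to True (up) or False (down)."""
    lines = [f"{name}: {'up' if up else 'DOWN'}" for name, up in sorted(statuses.items())]
    return build_message(f"[health] {socket.gethostname()}", "\n".join(lines))


def send(msg, host=SMTP_HOST, port=SMTP_PORT):
    """Deliver the message through the configured SMTP relay."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.send_message(msg)
```

A restart script would call `send(build_alert("gameserver", "process missing, restart attempted"))` when it has to intervene, while a separate daily cron job would call `send(build_health_report({...}))`. Keeping message building apart from sending makes it easy to swap the delivery step for an SMS service later.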
I come from a background of dealing with large Java EE applications and large server deployments, so I tend to stick to technology I am familiar with. That said, I do use Docker for my server applications to keep my deployments clean, and I use elastic scaling on my clusters. I deploy my apps in AWS.
I build all of the apps for my JME servers as WildFly applications and manage the cluster with a custom JEE management platform I built. This keeps my technology consistent, as I am also using Vue on JEE for my websites. This way I only have two services to worry about: WildFly and PostgreSQL (for my databases). My JME app uses the pgjdbc-ng driver to talk to the databases through a connection pool, which cuts down on the number of connections being created and destroyed between the JME cluster and my DB cluster.
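For anyone unfamiliar with why pooling helps, here is a language-agnostic illustration written in Python. It is not the pgjdbc-ng or WildFly API; in a stack like the one described, the pool would normally come from the driver or the application server. The `factory` argument is a hypothetical stand-in for whatever opens a real database connection.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and then reused."""

    def __init__(self, factory, size):
        # factory is a hypothetical callable that opens one connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        """Borrow a connection, blocking until one is free, then hand it back."""
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # returned to the pool rather than closed
```

With a pool of size 2, a thousand requests still only ever open two connections; callers wait briefly for a free one instead of paying the connect and teardown cost each time.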