First, an aside:
Are you familiar with Erlang? It has the property that all Erlang “processes” run in the same UNIX process. And yet Erlang processes can fail independently of each other, and Erlang applications can run for a long, long time. How is this possible? Well, two things:
- Erlang processes share no memory (except immutable (refcounted, IIRC) strings). Data is copied between processes.
- Erlang has something called “process supervision” where, when a process takes a fault, a supervisor process can wake up and clean up after it. And since Erlang processes share no memory, it’s possible to recover completely.
But even with this, Erlang processes can fail: the runtime can take errors, foreign C modules can fail, etc. It’s just not common, IIUC.
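To make the supervision idea concrete, here is a minimal sketch in Python rather than Erlang. It is only an analogy: each worker is a separate OS process (so it shares no memory with the supervisor), and the supervisor just restarts it when it dies. The names `WORKER_CODE` and `supervise` are made up for this illustration, and the "fault" is simulated with a non-zero exit.

```python
import subprocess
import sys

# Hypothetical worker (an assumption for illustration). It simulates a fault on
# its first two attempts by exiting non-zero, then succeeds. The attempt number
# is passed in as an argument, because a restarted worker remembers nothing.
WORKER_CODE = """
import sys
attempt = int(sys.argv[1])
if attempt < 3:
    sys.exit(1)  # simulated fault: the whole process dies
print("worker ok on attempt", attempt)
"""

def supervise(max_restarts=5):
    """Run the worker in its own OS process and restart it when it dies."""
    for attempt in range(1, max_restarts + 1):
        result = subprocess.run(
            [sys.executable, "-c", WORKER_CODE, str(attempt)],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return attempt, result.stdout.strip()
        # Nothing to clean up in the supervisor: the dead worker's memory
        # disappeared together with its process.
    raise RuntimeError("worker kept failing; giving up")
```

Calling `supervise()` runs the worker up to five times and returns the attempt number and the worker's output once it exits cleanly. The point is that recovery is trivial because the supervisor and the worker never shared any state.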
Now, to your question:
No, this would not be a good design for a “service” that is designed to have high reliability. What you describe is where J2EE started, right? And it’s fair to say that design failed.
There’s a saying in systems work:
Six weeks in the lab can save you half an hour in the library.
The history and development of fault-tolerant systems shows pretty clearly that the design you describe is not going to yield high-reliability systems. Now, the J2EE designers ought to have known this too (the books about it were close to going out of print by the time J2EE came along), but they didn’t spend that half-hour in the library.
P.S. There’s nothing inherently wrong with a multithreaded runtime. Look at Apache 2.0. The thing is, you don’t run one such runtime; instead, you run N of them (N >= 2), and when one takes a fault, it crashes completely while the others keep serving. (And again, I’m talking about non-business-logic faults.)
BTW, if you’re having enough faults that the startup time of these runtimes is getting prohibitively expensive, that means the faults are common enough that you can go debug them. Jim Gray called these “Bohrbugs”: solid, reproducible bugs, as opposed to transient “Heisenbugs”.
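A rough sketch of the "run N of them" pattern, again in Python and again only under assumptions: the runtime here is a toy that upper-cases each request line, and `RUNTIME_CODE`, `start_runtime`, and `Pool` are names invented for this example. A real dispatcher would also balance load and use timeouts; this one always tries the first runtime and falls over to the next when it finds a dead one.

```python
import subprocess
import sys

# Hypothetical runtime (an assumption for illustration): reads one request per
# line on stdin and answers with one line on stdout.
RUNTIME_CODE = """
import sys
for line in sys.stdin:
    print(line.strip().upper(), flush=True)
"""

def start_runtime():
    """Start one runtime as its own OS process, talking over pipes."""
    return subprocess.Popen(
        [sys.executable, "-c", RUNTIME_CODE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )

class Pool:
    def __init__(self, n=2):
        if n < 2:
            raise ValueError("need at least two runtimes for failover")
        self.runtimes = [start_runtime() for _ in range(n)]
        self.restarts = 0

    def call(self, request):
        """Return the reply from the first runtime that answers."""
        for i, rt in enumerate(self.runtimes):
            try:
                rt.stdin.write(request + "\n")
                rt.stdin.flush()
                reply = rt.stdout.readline()  # "" means EOF: the runtime is gone
            except OSError:  # includes BrokenPipeError
                reply = ""
            if reply:
                return reply.strip()
            # The runtime crashed. Do not try to repair it in place: reap it,
            # start a fresh one in its slot, and try the next runtime.
            rt.kill()
            rt.wait()
            self.runtimes[i] = start_runtime()
            self.restarts += 1
        raise RuntimeError("request failed on every runtime")

    def close(self):
        """Shut down every runtime in the pool."""
        for rt in self.runtimes:
            rt.kill()
            rt.wait()
```

If you kill one runtime from outside and then call `pool.call("again")`, the request is answered by a surviving runtime and the dead one is replaced in the background of that same call. Note that the failed runtime is never "recovered"; it is thrown away whole, which is the whole point.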
P.P.S. If you’re interested in learning the history of TP, Jim Gray and Andreas Reuter’s book Transaction Processing: Concepts and Techniques is good. But also, reading “the Tandem papers” is absolutely essential. Tandem was probably the apex of “designed to be reliable” software and hardware, and reading about them is eye-opening. A small example: modern InfiniBand networks are direct descendants of Tandem’s networks.