I have never seen a problem this serious in any of my SC applications. It is important to understand the cause, so I hope you post more info on whatever you find.
It seems like you will have to log info to a file or DB table about what is happening before and after this condition occurs. That type of test is going to be hard to automate, but you could store and monitor the state and have it flag the change, or even email you as admin when the fault occurs.
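Here is a rough sketch of that idea in plain PHP, assuming the condition can be reduced to a small array of values you can read on each request. The function name check_state_change, the file paths, and the state keys are all made up for illustration, so swap in whatever actually describes the fault.

```php
<?php
// Hypothetical helper: compare the current state with the last snapshot
// saved to disk, log any difference, and return true if something changed.
function check_state_change(array $current, string $stateFile, string $logFile): bool
{
    $previous = null;
    if (is_file($stateFile)) {
        $previous = json_decode(file_get_contents($stateFile), true);
    }

    // Always save the newest snapshot for the next comparison.
    file_put_contents($stateFile, json_encode($current), LOCK_EX);

    // First run (or unreadable snapshot): nothing to compare against yet.
    if ($previous === null) {
        return false;
    }

    // Same keys and values as last time: no change to report.
    if ($previous == $current) {
        return false;
    }

    // Record what the state looked like before and after the change.
    $line = date('Y-m-d H:i:s')
        . ' STATE CHANGE before=' . json_encode($previous)
        . ' after=' . json_encode($current) . PHP_EOL;
    file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX);

    return true;
}
```

When it returns true you could also notify yourself, for example with PHP's mail() function, assuming mail is set up on your server.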
If rr is correct and you are somehow blowing past some server capability, you could also monitor CPU or memory and log that as well. You may get that info from your host or server, but it is hard to correlate with SC actions. If you log both yourself you might see what is happening.
I use code like this to monitor my server CPU (I have some processes that can use a lot of processor power on a quad CPU):
$load = sys_getloadavg(); // Linux load averages over the last 1, 5 and 15 minutes
$load1 = $load[0]; // 1 minute ave
$load5 = $load[1]; // 5 minute ave
$load15 = $load[2]; // 15 minute ave
$core_nums = trim(shell_exec("grep -P '^processor' /proc/cpuinfo|wc -l")); // count the CPU cores
$percentageload1 = round($load1/($core_nums)*100, 2);
$percentageload5 = round($load5/($core_nums)*100, 2);
$percentageload15 = round($load15/($core_nums)*100, 2);
[performance] = "Load (".$core_nums." CPU Cores) " .
" -- 1 Min Ave= ".$load1." / ".$percentageload1."%" .
" -- 5 Min Ave= ".$load5." / ".$percentageload5."%" .
" -- 15 Min Ave= ".$load15." / ".$percentageload15."%";
You could display or log [performance].
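If you would rather write it to a file than use [performance], something like the sketch below might do. It is only an illustration: the function name build_perf_line, the log path, and the hard-coded core count of 4 are my assumptions, and memory_get_usage() reports the memory of the PHP script itself, not of the whole server.

```php
<?php
// Hypothetical helper: build one timestamped log line with load and memory.
function build_perf_line(array $load, int $cores, int $memBytes): string
{
    $cores = max(1, $cores); // avoid division by zero
    $parts = [];
    // sys_getloadavg() returns the 1, 5 and 15 minute averages, in that order.
    foreach ([1, 5, 15] as $i => $minutes) {
        $pct = round($load[$i] / $cores * 100, 2);
        $parts[] = sprintf('%dmin=%.2f (%s%%)', $minutes, $load[$i], $pct);
    }
    return sprintf(
        '%s cores=%d %s mem=%.1fMB',
        date('Y-m-d H:i:s'),
        $cores,
        implode(' ', $parts),
        $memBytes / 1048576
    );
}

// sys_getloadavg() is not available on Windows and can return false.
$load = function_exists('sys_getloadavg') ? sys_getloadavg() : false;
if (is_array($load)) {
    // Assumed values: 4 cores and a log file in the system temp directory.
    $line = build_perf_line($load, 4, memory_get_usage(true));
    $logFile = sys_get_temp_dir() . '/sc_perf.log';
    file_put_contents($logFile, $line . PHP_EOL, FILE_APPEND | LOCK_EX);
}
```

Calling this at the start and end of the suspect event gives you timestamped lines you can line up against what SC was doing at the time.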
I often use the SC built-in log and then use the macro to add custom info as needed.
Again, I would be interested to see what you ultimately find. Since you are using the ‘stable’ SC 8.1, same as most of us, I am hoping it is not a system issue. Maybe somehow you have some code in one app that causes this? And again, perhaps rr is correct - it has to do with the server.
Just thinking - could an error during generation lead to a ‘random’ thing like this? I would just make sure all apps regenerate normally.
The trick is catching the error and collecting enough telemetry to debug it. Intermittent errors are always the worst. But they have to do the same with Mars Rover software glitches and stuff a million miles away, so you will find it.