tests/doc/platform_specific_problems
author J.C. Jones <jjones@mozilla.com>
Fri, 21 Jun 2019 14:39:01 -0700
branchNSS_3_36_BRANCH
changeset 15182 de60f2b7f0c3fac0537346f1077f03d6d849edc5
parent 10685 6c43fe3ab5dd41803bbd6705979f73275d7668f6
child 11765 653076b2a38b8260df4dad443e701559657c9043
permissions -rw-r--r--
Added tag NSS_3_36_8_RTM for changeset df8917878ea6

I will, eventually convert all files here to html - just right now I have no
time to do it. Anyone who'd like to - please feel free, mail me the file and 
I will check it in 
sonmi@netscape.com


The NSS 3.1 SSL Stress Tests fail for me on FreeBSD 3.5.  The end of the output
of './ssl.sh stress' looks like this:

********************* Stress Test  ****************************
********************* Stress SSL2 RC4 128 with MD5 ****************************
selfserv -p 8443 -d
/local/llennox/NSS-PSM/mozilla/tests_results/security/conrail.20/server -n
conrail.cs.columbia.edu -w nss   -i /tmp/tests_pid.5505  & strsclnt -p 8443 -d . -w nss -c 1000 -C A  conrail.cs.columbia.edu 
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_NewTCPSocket returned error -5974:
Insufficient system resources.
Terminated 
********************* Stress SSL3 RC4 128 with MD5 ****************************
selfserv -p 8443 -d
/local/llennox/NSS-PSM/mozilla/tests_results/security/conrail.20/server -n
conrail.cs.columbia.edu -w nss   -i /tmp/tests_pid.5505  & strsclnt -p 8443 -d . -w nss -c 1000 -C c  conrail.cs.columbia.edu 
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: PR_NewTCPSocket returned error -5974:
Insufficient system resources.
Terminated 

Running ktrace on the process (ktrace is a system-call tracer, the equivalent of
Linux's strace) reveals that socket() failed with ENOBUFS after it was called
for the 953rd time for the first test, and it failed after the 27th time it was
called for the second test.

The failure is consistent, both for debug and optimized builds; I haven't tested
to see whether the count of socket() failures is consistent.

All the other NSS tests pass successfully.


------- Additional Comments From Nelson Bolyard 2000-11-01 23:08 -------

I see no indication of any error on NSS's part from this description.
It sounds like an OS kernel configuration problem on the 
submittor's system.  The stress test is just that.  It stresses
the server by pounding it with SSL connections.  Apparently this 
test exhausts some kernel resource on the submittor's system.

The only change to NSS that might be beneficial to this test 
would be to respond to this error by waiting and trying again
for some limited number of times, rather than immediately 
treating it as a fatal error.  

However, while such a change might make the test appear to pass,
it would merely be hiding a very serious problem, namely,
chronic system resource exhaustion.

So, I suggest that, in this case, the failure serves the useful
purpose of revealing the system problem, which needs to be 
cured apart from any changes to NSS.

I'll leave this bug open for a few more days, to give others
a chance to persuade me that some NSS change would and should
solve this problem.  


------- Additional Comments From Jonathan Lennox 2000-11-02 13:13 -------

Okay, some more investigation leads me to agree with you.  What's happening is
that the TCP connections from the stress test stick around in TIME_WAIT for two
minutes; my kernel is only configured to support 1064 simultaneous open sockets,
which isn't enough for the 2K sockets opened by the stress test plus the 100 or
so normally in use on my system.

So I'd just suggest adding a note to the NSS test webpage to the effect of "The
SSL stress test opens 2,048 TCP connections in quick succession.  Kernel data
structures may remain allocated for these connections for up to two minutes. 
Some systems may not be configured to allow this many simulatenous connections
by default; if the stress tests fail, try increasing the number of simultaneous
sockets supported."

On FreeBSD, you can display the number of simultaneous sockets with the command
	sysctl kern.ipc.maxsockets
which on my system returns 1064.

It looks like this can be fixed with the kernel config option
	options NMBCLUSTERS=[something-large]
or by increasing the 'maxusers' parameter.

It looks like more recent FreeBSD implementations still have this limitation,
and the same solutions apply, plus you can alternatively specify the maxsockets
parameter in the boot loader.


---------------------------------

hpux HP-UX hp64 B.11.00 A 9000/800 2014971275 two-user license

we had to change following kernelparameters to make our tests pass

1. maxfiles.  old value = 60.  new value = 100.
2. nkthread.  old value = 499.  new value = 1328.
3. max_thread_proc.  old value = 64.  new value = 512.
4. maxusers.  old value = 32.  new value = 64.
5. maxuprc.  old value = 75.  new value = 512.
6. nproc.  old formula = 20+8*MAXUSERS, which evaluated to 276.
   new value (note: not a formula) = 750.

A few other kernel parameters were also changed automatically
as a result of the above changes.