Really not having any logs makes it difficult to know what is really
happening, but if that's your thing, this will align debug and release
modes to just have ERR and USER if you give WITH_NO_LOGS.
Until now we set metadata value pointers into the onward wsi ah data
area... that's OK until we hit a situation where the wsi has gone away
before we have a chance to deliver the metadata over the proxy link.
Add a variant lws_ss_alloc_set_metadata() that allocates space on the heap
and takes a copy of the input metadata. Change ss-h1 to alloc copies of
its metadata so we no longer race the wsi ah lifetime.
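As a rough illustration of the difference between pointing at borrowed
storage and taking a heap copy, here is a minimal sketch. struct md_slot
and the md_*() helpers are hypothetical names made up for this example;
they are not the lws_ss implementation or its signatures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical metadata slot; not the real lws_ss structure */
struct md_slot {
	const void *value;	/* what the consumer eventually reads */
	size_t len;
	void *heap_copy;	/* non-NULL only when the slot owns the storage */
};

/* Reference variant: valid only while the caller's buffer stays alive */
static void
md_set_ref(struct md_slot *m, const void *value, size_t len)
{
	assert(m);
	free(m->heap_copy);
	m->heap_copy = NULL;
	m->value = value;
	m->len = len;
}

/* Copying variant: returns 0 on success, nonzero if the allocation failed */
static int
md_alloc_set(struct md_slot *m, const void *value, size_t len)
{
	void *copy;

	assert(m && (value || !len));
	copy = malloc(len ? len : 1);
	if (!copy)
		return 1;
	if (len)
		memcpy(copy, value, len);
	free(m->heap_copy);
	m->heap_copy = copy;
	m->value = copy;
	m->len = len;

	return 0;
}

/* Drop any owned copy and reset the slot */
static void
md_release(struct md_slot *m)
{
	assert(m);
	free(m->heap_copy);
	memset(m, 0, sizeof(*m));
}
```

With the copying variant the source buffer can be reused or freed right
after the call; the cost is one allocation that can itself fail.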
lws_ss_set_metadata() can fail... eg, due to a transient OOM situation...
if it does, the caller must take appropriate action like disconnect and
retry.
So mark the api as requiring the result checking, and make sure all the
examples do it.
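One common way to make the compiler insist on result checking is the
GCC / Clang warn_unused_result attribute. The sketch below assumes that
approach; MUST_CHECK, set_value() and prepare_request() are hypothetical
names for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical macro name; expands to nothing on other compilers */
#if defined(__GNUC__) || defined(__clang__)
#define MUST_CHECK __attribute__((warn_unused_result))
#else
#define MUST_CHECK
#endif

/* Hypothetical setter: returns 0 on success, nonzero if src does not fit */
static int
set_value(char *dst, size_t dst_len, const char *src) MUST_CHECK;

static int
set_value(char *dst, size_t dst_len, const char *src)
{
	size_t n;

	assert(dst && src);
	n = strlen(src);
	if (n >= dst_len)
		return 1;
	memcpy(dst, src, n + 1);

	return 0;
}

/* Caller pattern: on failure, give up on this attempt so it can be retried */
static int
prepare_request(char *hdr, size_t hdr_len, const char *token)
{
	if (set_value(hdr, hdr_len, token))
		return -1; /* caller disconnects and retries later */

	return 0;
}
```

A bare "set_value(hdr, hdr_len, token);" statement would then draw a
warning on compilers that support the attribute.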
It's perfectly possible we will have destroyed the wsi and report that
back in the return code. So let's not dumbly dereference the wsi to
make a log in between.
Found with fault injection and valgrind.
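The general pattern is to copy whatever the log line needs before making
the call that may destroy the object. A minimal sketch, where struct conn
and conn_service() are hypothetical stand-ins and not lws code:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical connection object standing in for a wsi */
struct conn {
	char name[32];
};

/* May free c; returns nonzero when it has done so */
static int
conn_service(struct conn *c, int fatal)
{
	if (!fatal)
		return 0;
	free(c);

	return 1;
}

/* Log the outcome without touching c after it may have been freed */
static int
service_and_log(struct conn *c, int fatal, char *log, size_t log_len)
{
	char name[sizeof(c->name)];
	int destroyed;

	assert(c && log);
	/* copy what the log line needs while c is certainly valid */
	memcpy(name, c->name, sizeof(name));
	name[sizeof(name) - 1] = '\0';

	destroyed = conn_service(c, fatal);
	/* from here on, c is only usable if destroyed == 0 */
	snprintf(log, log_len, "%s: %s", name, destroyed ? "destroyed" : "ok");

	return destroyed;
}
```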
In the case that we try ipv6 that isn't routable, we get a POLLHUP, that
marks the wsi as unusable (for writes, not pending reads), that's what
we want.
But in the case we go around and retry other dns results that are
routable, we have to clear the wsi unusable flag. Otherwise we will
connect and find that we can't write on the connection...
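A simplified model of that retry loop, assuming one "unusable" flag per
connection. struct sock_state, struct addr_result and both functions are
hypothetical and only mirror the behaviour described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection state; "unusable" models the POLLHUP flag */
struct sock_state {
	int unusable;
	int connected;
};

/* Hypothetical DNS result: only records whether the address is routable */
struct addr_result {
	int routable;
};

/* One attempt: an unroutable address ends in a hangup that marks s unusable */
static int
try_connect(struct sock_state *s, const struct addr_result *a)
{
	if (!a->routable) {
		s->unusable = 1;
		return -1;
	}
	s->connected = 1;

	return 0;
}

/* Walk the results; returns the index that connected, or -1 if none did */
static int
connect_any(struct sock_state *s, const struct addr_result *res, size_t count)
{
	size_t i;

	assert(s && res);
	for (i = 0; i < count; i++) {
		/* a failed earlier attempt must not poison this one */
		s->unusable = 0;
		if (!try_connect(s, &res[i]))
			return (int)i;
	}

	return -1;
}
```

Without the reset at the top of the loop, a successful second attempt
would still carry the flag set by the first one.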
If the DNS lookup fails, we just sit out the remaining connect time.
This adapts it to reuse the wsi->sul_connect_timeout to schedule DNS lookup
retries until we're out of time.
Eventually we want to try other things as well; this is aligned with that.
Found with fault injection.
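The retry-until-deadline idea can be sketched with a simulated clock.
The real code reschedules a timer rather than looping; lookup_fn,
fake_lookup() and lookup_until_deadline() are hypothetical names, and the
retry interval is an assumption since the text does not give one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lookup callback: returns 0 on success, nonzero on failure */
typedef int (*lookup_fn)(void *opaque);

/* Fake resolver for illustration: fails *opaque times, then succeeds */
static int
fake_lookup(void *opaque)
{
	int *failures_left = (int *)opaque;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -1;
	}

	return 0;
}

/*
 * Retry fn every interval_us until it succeeds or the next retry would land
 * at or past deadline_us.  Time is simulated by advancing now_us.
 * Returns 0 on success, -1 when the time budget is used up.
 */
static int
lookup_until_deadline(lookup_fn fn, void *opaque, uint64_t now_us,
		      uint64_t deadline_us, uint64_t interval_us, int *attempts)
{
	assert(fn && attempts && interval_us);
	*attempts = 0;
	for (;;) {
		(*attempts)++;
		if (!fn(opaque))
			return 0;
		if (now_us + interval_us >= deadline_us)
			return -1;
		now_us += interval_us;
	}
}
```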
There are a few build options that are trying to keep and report
various statistics:
- DETAILED_LATENCY
- SERVER_STATUS
- WITH_STATS
Remove all those and establish a generic replacement, lws_metrics.
lws_metrics makes its stats available via an lws_system ops function
pointer that the user code can set.
OpenMetrics export is supported for, eg, Prometheus scraping.
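For reference, the OpenMetrics text form is line-oriented: a "# TYPE"
line, the sample lines, and a closing "# EOF". The sketch below formats a
single gauge that way. The om_*() helpers and the metric name used in the
comment are hypothetical; this is not how lws_metrics itself produces
its output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one gauge in OpenMetrics text form at buf + used, for example
 * "# TYPE conn_active gauge" followed by "conn_active 3".
 * Returns the new used length, or -1 if it does not fit.
 */
static int
om_append_gauge(char *buf, size_t len, size_t used, const char *name,
		unsigned long long value)
{
	int n;

	assert(buf && name && used <= len);
	n = snprintf(buf + used, len - used, "# TYPE %s gauge\n%s %llu\n",
		     name, name, value);
	if (n < 0 || (size_t)n >= len - used)
		return -1;

	return (int)(used + (size_t)n);
}

/* Append the "# EOF" line that terminates an OpenMetrics exposition */
static int
om_append_eof(char *buf, size_t len, size_t used)
{
	static const char eof[] = "# EOF\n";

	assert(buf && used <= len);
	if (sizeof(eof) > len - used)
		return -1;
	memcpy(buf + used, eof, sizeof(eof));	/* includes the NUL */

	return (int)(used + sizeof(eof) - 1);
}
```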
For SMP case, it was desirable to have a netlink listener per pt so they
could deal with pt-level changes in the pt's local service thread. But
Linux restricts the process to just one netlink listener.
We worked around it by only listening on pt[0]; this aligns us a bit more
with the reality and moves to a single routing table in the context.
There's still more to do for SMP case locking.
If the client library loses the proxy connection, it can receive
an endless stream of 0 length rx instead of understanding that
the UDS peer has gone.
Handle that correctly so the client reacts to the loss of the
proxy link by trying to reacquire it.
Adapt the sspc state to be suitable for retry in that case,
by dropping any dsh and letting the logical ss know that he
is DISCONNECTED, if he thought he was CONNECTED.
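On a stream socket, a read that returns 0 (for a nonzero requested
length) means the peer has closed, not "no data yet". A minimal sketch of
treating it that way follows; struct link and the enums are hypothetical
and much simpler than the real sspc state.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

enum rx_result { RX_DATA, RX_PEER_GONE, RX_RETRY, RX_ERROR };
enum link_state { LINK_DISCONNECTED, LINK_CONNECTED };

/* Hypothetical client-side view of the proxy link */
struct link {
	enum link_state state;
	int reconnect_wanted;
};

/* Classify the result of read() / recv() on a stream socket */
static enum rx_result
classify_rx(ssize_t n, int err)
{
	if (n > 0)
		return RX_DATA;
	if (n == 0)
		return RX_PEER_GONE;	/* EOF: the peer closed its end */
	if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR)
		return RX_RETRY;

	return RX_ERROR;
}

/* React to one read result; returns 1 if a state change should be reported */
static int
link_on_rx(struct link *l, ssize_t n, int err)
{
	enum rx_result r;

	assert(l);
	r = classify_rx(n, err);
	if (r != RX_PEER_GONE && r != RX_ERROR)
		return 0;

	l->reconnect_wanted = 1;
	if (l->state != LINK_CONNECTED)
		return 0;	/* nothing to tell a stream that was not connected */
	l->state = LINK_DISCONNECTED;

	return 1;
}
```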
The state tracking and violation detection is very powerful at enforcing
only legal transitions, but if it's busy, we don't get to see which stream
had the problem. Add a pointer to the handle lc tag, and pass that rather
than just the handle so we can deal with ss and sspc handles cleanly.
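To show why carrying the tag helps, here is a small transition checker
that names the offending stream in its error text. The states, the table
and struct st_tracker are hypothetical and far smaller than the real ss
state set.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, much reduced state set; not the real lws_ss states */
enum st { ST_UNCONNECTED, ST_CONNECTING, ST_CONNECTED, ST_DISCONNECTED,
	  ST_COUNT };

static const char * const st_names[ST_COUNT] = {
	"UNCONNECTED", "CONNECTING", "CONNECTED", "DISCONNECTED"
};

/* legal_next[s] is a bitmask of the states that may follow s */
static const unsigned int legal_next[ST_COUNT] = {
	/* UNCONNECTED  */ 1u << ST_CONNECTING,
	/* CONNECTING   */ (1u << ST_CONNECTED) | (1u << ST_DISCONNECTED),
	/* CONNECTED    */ 1u << ST_DISCONNECTED,
	/* DISCONNECTED */ 1u << ST_CONNECTING,
};

/* Tracker that carries a pointer to the owner's tag, not the owner itself */
struct st_tracker {
	const char *tag;
	enum st state;
};

/* Returns 0 and moves on a legal transition; else fills err and returns 1 */
static int
st_transition(struct st_tracker *t, enum st next, char *err, size_t err_len)
{
	assert(t && t->tag && err && (unsigned int)next < ST_COUNT);
	if (!(legal_next[t->state] & (1u << next))) {
		snprintf(err, err_len, "%s: illegal transition %s -> %s",
			 t->tag, st_names[t->state], st_names[next]);
		return 1;
	}
	t->state = next;

	return 0;
}
```

Because the checker only needs the tag string, the same code can serve
two different handle types without knowing their layout.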
openssl v3-alpha11 has marked EC_KEY pieces as deprecated... we use it in
LWS_WITH_GENCRYPTO but the related RSA etc pieces were already deprecated
for that. We use EC_KEY pieces in vhost init...
The apis are not removed but deprecated, we should have a way to keep
trucking, but as it is the deprecation warning is promoted to an error.
Let's add an LWS_SUPPRESS_DEPRECATED_API_WARNINGS option, off by default.
If enabled at cmake, external deprecated api warnings are suppressed. This
gives a general workaround for now for openssl v3.
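The text does not say how the suppression is implemented; a build-wide
compiler flag such as -Wno-deprecated-declarations is one possibility.
A scoped, source-level equivalent for GCC / Clang can be sketched as
below. OLD_API, legacy_sum(), checksum() and the unprefixed
SUPPRESS_DEPRECATED_API_WARNINGS macro are all hypothetical names, and
the sketch switches suppression on itself purely so it builds cleanly.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define OLD_API __attribute__((deprecated))
#else
#define OLD_API
#endif

/* Hypothetical stand-in for a deprecated external function */
OLD_API static unsigned int
legacy_sum(const unsigned char *p, size_t len)
{
	unsigned int sum = 0;
	size_t i;

	assert(p || !len);
	for (i = 0; i < len; i++)
		sum += p[i];

	return sum;
}

/* Assumption: enabled here for the sketch; the real option is off by default */
#ifndef SUPPRESS_DEPRECATED_API_WARNINGS
#define SUPPRESS_DEPRECATED_API_WARNINGS 1
#endif

#if SUPPRESS_DEPRECATED_API_WARNINGS && \
	(defined(__GNUC__) || defined(__clang__))
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#endif

/* Still uses the deprecated call; the warning is silenced only in this span */
static unsigned int
checksum(const unsigned char *p, size_t len)
{
	return legacy_sum(p, len);
}

#if SUPPRESS_DEPRECATED_API_WARNINGS && \
	(defined(__GNUC__) || defined(__clang__))
#pragma GCC diagnostic pop
#endif
```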
In addition, even if you don't do that, let's notice we are on openssl v3
and not build the EC curve selection stuff; I don't think anyone is
actually using it anyway.