lws_tls_restrict_borrow() returns error when tls restriction limit is
reached. However lws_ssl_close() still calls lws_tls_restrict_return()
to decrease simultaneous_ssl. Thus LWS accepts more than allowed ssl
links, making simultaneous_ssl_restriction useless.
Fix it by tracking lws_tls_restrict_borrow() return value and only
calling lws_tls_restrict_return() if lws_tls_restrict_borrow() is
successful.
This causes the blocking dns lookup to treat EAI_NONAME as immediately
fatal, this is usually caused by an assertive NXDOMAIN from the DNS server
or similar.
Not being able to reach the server should continue to retry.
In order to make the problem visible, it reports the situation using
CLIENT_CONNECTION_ERROR, even though it is still inside the outer client
creation call.
A second chunk of ss / sspc handling did not get cleaned up
along with the other patch from a few weeks ago, it wrongly
treats sspc the same as ss. This can cause the wrong thing
to be zeroed down, 64-bit and 32-bit builds end up with
different victims.
This patch makes it understand the difference and treat them
accordingly, same as the main for_ss handling.
variable 'n' is being set but it is not used anywhere, latest clang is
able to detect this and flags it
Fixes
lib/core-net/route.c:41:6: error: variable 'n' set but not used [-Werror,-Wunused-but-set-variable]
| int n = 0;
| ^
If facing a captive portal, we may seem to get a tcp level connection okay
but find that communication is silently dropped, leading to us timing out
in LRS_WAITING_SERVER_REPLY.
If so, we need to handle it as a connection fail in order to satisfy at
least Captive Portal detection.
This fixes the proxy rx flow by adding an lws_dsh helper to hide the
off-by-one in the "kind" array (kind 0 is reserved for tracking the
unallocated dsh blocks).
For testing, it adds a --blob option on minimal-secure-streams[-client]
which uses a streamtype "bulkproxflow" from here
https://warmcat.com/policy/minimal-proxy-v4.2-v2.json
"bulkproxflow": {
"endpoint": "warmcat.com",
"port": 443,
"protocol": "h1",
"http_method": "GET",
"http_url": "blob.bin",
"proxy_buflen": 32768,
"proxy_buflen_rxflow_on_above": 24576,
"proxy_buflen_rxflow_off_below": 8192,
"tls": true,
"retry": "default",
"tls_trust_store": "le_via_dst"
}
This downloads a 51MB blob of random data with the SHA256sum
ed5720c16830810e5829dfb9b66c96b2e24efc4f93aa5e38c7ff4150d31cfbbf
The minimal-secure-streams --blob example client delays the download by
50ms every 10KiB it sees to force rx flow usage at the proxy.
It downloads the whole thing and checks the SHA256 is as expected.
Logs about rxflow status are available at LLL_INFO log level.
This provides a way to get ahold of LWS_WITH_CONMON telemetry from Secure
Streams, it works the same with direct onward connections or via the proxy.
You can mark streamtypes with a "perf": true policy attribute... this
causes the onward connections on those streamtypes to collect information
about the connection performance, and the unsorted DNS results.
Streams with that policy attribute receive extra data in their rx callback,
with the LWSSS_FLAG_PERF_JSON flag set on it, containing JSON describing the
performance of the onward connection taken from CONMON data, in a JSON
representation. Streams without the "perf" attribute set never receive
this extra rx.
The received JSON is based on the CONMON struct info and looks like
{"peer":"46.105.127.147","dns_us":596,"sockconn_us":31382,"tls_us":28180,"txn_resp_us:23015,"dns":["2001:41d0:2:ee93::1","46.105.127.147"]}
A new minimal example minimal-secure-streams-perf is added that collects
this data on an HTTP GET from warmcat.com, and is built with a -client
version as well if LWS_WITH_SECURE_STREAMS_PROXY_API is set, that operates
via the ss proxy and produces the same result at the client.
This was commented during the metrics patch for some reason...
commenting it breaks UDS -> web serving proxying.
Uncomment it and see what the other problem is..
This provides a build option LWS_WITH_CONMON that lets user code recover
detailed connection stats on client connections with the LCCSCF_CONMON
flag.
In addition to latencies for dns, socket connection, tls and first protocol
response where possible, it also provides the user code an unfiltered list
of DNS responses that the client received, and the peer it actually
succeded to connect to.
Really not having any logs makes it difficult to know what is really
happening, but if that's you're thing this will align debug and release
modes to just have ERR and USER if you give WITH_NO_LOGS
It's perfectly possible we will have destroyed the wsi and report that
back in the return code. So let's not dumbly defreference the wsi to
make a log inbetweentimes.
Found with fault injection and valgrind.
In the case that we try ipv6 that isn't routable, we get a POLLHUP, that
marks the wsi as unusable (for writes, not pending reads), that's what
we want.
But in the case we go around and retry other dns results that are
routable, we have to clear the wsi unusable flag. Otherwise we will
connect and find that we can't write on the connection...
If the DNS lookup fails, we just sit out the remaining connect time.
The adapts it to reuse the wsi->sul_connect_timeout to schedule DNS lookup
retries until we're out of time.
Eventually we want to try other things as well, this is aligned with that.
Found with fault injection.
There are a few build options that are trying to keep and report
various statistics
- DETAILED_LATENCY
- SERVER_STATUS
- WITH_STATS
remove all those and establish a generic rplacement, lws_metrics.
lws_metrics makes its stats available via an lws_system ops function
pointer that the user code can set.
Openmetrics export is supported, for, eg, prometheus scraping.
For SMP case, it was desirable to have a netlink listener per pt so they
could deal with pt-level changes in the pt's local service thread. But
Linux restricts the process to just one netlink listener.
We worked around it by only listening on pt[0], this aligns us a bit more
with the reality and moves to a single routing table in the context.
There's still more to do for SMP case locking.