Adapt the event lib support slighly so we can pass an event lib "plugin"
header in at context creation time, and direct all event loop handling to
go via that.
This can then be lightly adapted to interface to an existing custom event
loop cleanly, without the problems of EXTERNAL_POLL.
The external loop must consult with us about the max wait timeout as shown
in the added minimal-http-server-eventlib-custom example.
The example shows a complete implementation working with a custom poll()
loop cleanly while only needing 5 ops in the custom event lib handler.
variable 'n' is being set but it is not used anywhere, latest clang is
able to detect this and flags it
Fixes
lib/core-net/route.c:41:6: error: variable 'n' set but not used [-Werror,-Wunused-but-set-variable]
| int n = 0;
| ^
At init-time, PROTOCOL_INIT needs to be sent to each vhost-protocol
combination to give them a chance to instantiate themselves.
PROTOCOL_INIT can be deferred a bit, and since its subject is per vhost,
which has no tsi affinity, in SMP case, its current use of context->pt[0]
fakewsi can clash with other stuff happening simultaneously, eg,
CANCEL_SERVICE broadcast, which happens on each pt.
Solve this by changing PROTOCOL_INIT to use an on-stack fakewsi that cannot
clash with any other service loop use of them.
Take care about using a full fake wsi or an lws_a fakewsi if PLAT_FREERTOS.
If facing a captive portal, we may seem to get a tcp level connection okay
but find that communication is silently dropped, leading to us timing out
in LRS_WAITING_SERVER_REPLY.
If so, we need to handle it as a connection fail in order to satisfy at
least Captive Portal detection.
This fixes the proxy rx flow by adding an lws_dsh helper to hide the
off-by-one in the "kind" array (kind 0 is reserved for tracking the
unallocated dsh blocks).
For testing, it adds a --blob option on minimal-secure-streams[-client]
which uses a streamtype "bulkproxflow" from here
https://warmcat.com/policy/minimal-proxy-v4.2-v2.json
"bulkproxflow": {
"endpoint": "warmcat.com",
"port": 443,
"protocol": "h1",
"http_method": "GET",
"http_url": "blob.bin",
"proxy_buflen": 32768,
"proxy_buflen_rxflow_on_above": 24576,
"proxy_buflen_rxflow_off_below": 8192,
"tls": true,
"retry": "default",
"tls_trust_store": "le_via_dst"
}
This downloads a 51MB blob of random data with the SHA256sum
ed5720c16830810e5829dfb9b66c96b2e24efc4f93aa5e38c7ff4150d31cfbbf
The minimal-secure-streams --blob example client delays the download by
50ms every 10KiB it sees to force rx flow usage at the proxy.
It downloads the whole thing and checks the SHA256 is as expected.
Logs about rxflow status are available at LLL_INFO log level.
This provides a way to get ahold of LWS_WITH_CONMON telemetry from Secure
Streams, it works the same with direct onward connections or via the proxy.
You can mark streamtypes with a "perf": true policy attribute... this
causes the onward connections on those streamtypes to collect information
about the connection performance, and the unsorted DNS results.
Streams with that policy attribute receive extra data in their rx callback,
with the LWSSS_FLAG_PERF_JSON flag set on it, containing JSON describing the
performance of the onward connection taken from CONMON data, in a JSON
representation. Streams without the "perf" attribute set never receive
this extra rx.
The received JSON is based on the CONMON struct info and looks like
{"peer":"46.105.127.147","dns_us":596,"sockconn_us":31382,"tls_us":28180,"txn_resp_us:23015,"dns":["2001:41d0:2:ee93::1","46.105.127.147"]}
A new minimal example minimal-secure-streams-perf is added that collects
this data on an HTTP GET from warmcat.com, and is built with a -client
version as well if LWS_WITH_SECURE_STREAMS_PROXY_API is set, that operates
via the ss proxy and produces the same result at the client.
This was commented during the metrics patch for some reason...
commenting it breaks UDS -> web serving proxying.
Uncomment it and see what the other problem is..
This provides a build option LWS_WITH_CONMON that lets user code recover
detailed connection stats on client connections with the LCCSCF_CONMON
flag.
In addition to latencies for dns, socket connection, tls and first protocol
response where possible, it also provides the user code an unfiltered list
of DNS responses that the client received, and the peer it actually
succeded to connect to.
Really not having any logs makes it difficult to know what is really
happening, but if that's you're thing this will align debug and release
modes to just have ERR and USER if you give WITH_NO_LOGS
It's perfectly possible we will have destroyed the wsi and report that
back in the return code. So let's not dumbly defreference the wsi to
make a log inbetweentimes.
Found with fault injection and valgrind.
In the case that we try ipv6 that isn't routable, we get a POLLHUP, that
marks the wsi as unusable (for writes, not pending reads), that's what
we want.
But in the case we go around and retry other dns results that are
routable, we have to clear the wsi unusable flag. Otherwise we will
connect and find that we can't write on the connection...
If the DNS lookup fails, we just sit out the remaining connect time.
The adapts it to reuse the wsi->sul_connect_timeout to schedule DNS lookup
retries until we're out of time.
Eventually we want to try other things as well, this is aligned with that.
Found with fault injection.
There are a few build options that are trying to keep and report
various statistics
- DETAILED_LATENCY
- SERVER_STATUS
- WITH_STATS
remove all those and establish a generic rplacement, lws_metrics.
lws_metrics makes its stats available via an lws_system ops function
pointer that the user code can set.
Openmetrics export is supported, for, eg, prometheus scraping.
For SMP case, it was desirable to have a netlink listener per pt so they
could deal with pt-level changes in the pt's local service thread. But
Linux restricts the process to just one netlink listener.
We worked around it by only listening on pt[0], this aligns us a bit more
with the reality and moves to a single routing table in the context.
There's still more to do for SMP case locking.
Also prioritize LD_LIBRARY_PATH check for plugins first
Iterate through paths in LD_LIBRARY_PATH in order
Warn on failed plugins init but continue protocol init