Add some exports so the api test can inject results into the parser for
live queries, suppressing asking the server but otherwise following the
flow.
Provide two new suspect responses for injection and parsing in ctest.
Add a --cos option to minimal-http-client to force a close after the
connection has started the async dns.
This adds an indication of dns disposition to the conmon results,
and for http, if it gets that far a protocol-specific indication
of http response code.
This causes the blocking dns lookup to treat EAI_NONAME as immediately
fatal, this is usually caused by an assertive NXDOMAIN from the DNS server
or similar.
Not being able to reach the server should continue to retry.
In order to make the problem visible, it reports the situation using
CLIENT_CONNECTION_ERROR, even though it is still inside the outer client
creation call.
Until now although we can follow redirects, and they can promote the
protocol from h1->h2, we couldn't handle h2 wsi reuse since there are many
states in the wsi affected by being h2.
This wipes the related states in lws_wsi_reset() and follows the generic
wsi close flow before deviating into the redirect really close to the end,
ensuring we cleaned out evidence of our previous life properly.
h2->h2 redirects work properly after this.
The max number of redirects is increased from 3 -> 4 since this was seen in
the wild with www and then geographic-based redirects.
This provides a build option LWS_WITH_CONMON that lets user code recover
detailed connection stats on client connections with the LCCSCF_CONMON
flag.
In addition to latencies for dns, socket connection, tls and first protocol
response where possible, it also provides the user code an unfiltered list
of DNS responses that the client received, and the peer it actually
succeded to connect to.
It's perfectly possible we will have destroyed the wsi and report that
back in the return code. So let's not dumbly defreference the wsi to
make a log inbetweentimes.
Found with fault injection and valgrind.
There are a few build options that are trying to keep and report
various statistics
- DETAILED_LATENCY
- SERVER_STATUS
- WITH_STATS
remove all those and establish a generic rplacement, lws_metrics.
lws_metrics makes its stats available via an lws_system ops function
pointer that the user code can set.
Openmetrics export is supported, for, eg, prometheus scraping.
For SMP case, it was desirable to have a netlink listener per pt so they
could deal with pt-level changes in the pt's local service thread. But
Linux restricts the process to just one netlink listener.
We worked around it by only listening on pt[0], this aligns us a bit more
with the reality and moves to a single routing table in the context.
There's still more to do for SMP case locking.
This is a huge patch that should be a global NOP.
For unix type platforms it enables -Wconversion to issue warnings (-> error)
for all automatic casts that seem less than ideal but are normally concealed
by the toolchain.
This is things like passing an int to a size_t argument. Once enabled, I
went through all args on my default build (which build most things) and
tried to make the removed default cast explicit.
With that approach it neither change nor bloat the code, since it compiles
to whatever it was doing before, just with the casts made explicit... in a
few cases I changed some length args from int to size_t but largely left
the causes alone.
From now on, new code that is relying on less than ideal casting
will complain and nudge me to improve it by warnings.
This adds some new objects and helpers for keeping and logging
info on grouped allocations, a group is, eg, SS handles or client
wsis.
Allocated objects get a context-unique "tag" string intended to replace
%p / wsi pointers etc. Pointers quickly become confusing when
allocations are freed and reused, the tag string won't repeat
until you produce 2^64 objects in a context.
In addition the tag string documents the object group, with prefixes
like "wsi-" or "vh-" and contain object-specific additional
information like the vhost name, address / port or the role of the wsi.
At creation time the lws code can use a format string and args
to add whatever group-specific info makes sense, eg, a wsi bound
to a secure stream can also append the guid of the secure stream,
it's copied into the new object tag and so is still available
cleanly after the stream is destroyed if the wsi outlives it.
If getaddrinfo() is not able to reach the server, there may be
a connectivity problem downstream of the device that has not
been recognized by the Captive Portal Detect pieces yet.
If it looks like that might have happened, used the getaddrinfo()
return to provoke a new CPD scan.
RFC6724 defines an ipv6-centric DNS result sorting algorithm, that
takes route and source address route information for the results
given by the DNS resolution, and sorts them in order of preferability,
which defines the order they should be tried in.
If LWS_WITH_NETLINK, then lws takes care about collecting and monitoring
the interface, route and source address information, and uses it to
perform the RFC6724 sorting to re-sort the DNS before trying to make
the connections.
They have been in lib/roles/http for historical reasons, and all
ended up in client-handshake.c that doesn't describe what they
actually do any more. Separate out the staged client connect
related stage functions into
lib/core-net/client/client2.c: lws_client_connect_2_dnsreq()
lib/core-net/client/client3.c: lws_client_connect_3_connect()
lib/core-net/client/client4.c: lws_client_connect_4_established()
Move a couple of other functions from there that don't belong out to
tls-client.c and client-http.c, which is related to http and remains
in the http role dir.