This adds an indication of dns disposition to the conmon results,
and for http, if it gets that far a protocol-specific indication
of http response code.
Add a way to confirm that the ss handle recovered from a ss wsi is still
valid, by walking the pt ss list and confirming it is on there before using
it with conmon.
If it isn't, it will assert.
Add 9 fault injection cases in SS creation flow, and 5 of those
instantiate in the minimal examples ctests. The other 4 relate
to static policy and server, I tested the server ones by hand.
These tests confirm the recent change to unpick create using
lws_ss_destroy.
The late_bail discrete unpick flow is missing some pieces compared
to lws_ss_destroy. Unify the creation fail flow to also use
lws_ss_destroy so everything in one place.
Make lws_ss_destroy() not issue any states if the creation flow
didn't get as far as issuing CREATING.
Normally when doing a Client Connection Error handling,
we can action any ss relationship straight away since
we are in a wsi callback without any ss-aware parents
in the call stack.
But in the specific case we're doing the initial onward
wsi connection part on behalf of a ss, in fact the call
stack does have earlier parents holding references on
the related ss.
For example
secstream_h1 (ss-h1.c:470) CCE
lws_inform_client_conn_fail (close.c:319) fails early
lws_client_connect_2_dnsreq (connect2.c:349)
lws_http_client_connect_via_info2 (connect.c:71)
lws_header_table_attach (parsers.c:291)
rops_client_bind_h1 (ops-h1.c:1001)
lws_client_connect_via_info (connect.c:429) start onward connect
_lws_ss_client_connect (secure-streams.c:859)
_lws_ss_request_tx (secure-streams.c:1577)
lws_ss_request_tx (secure-streams.c:1515) request tx
ss_cpd_state (captive-portal-detect.c:50)
lws_ss_event_helper (secure-streams.c:408)
lws_ss_create (secure-streams.c:1256) SS Create
Under these conditions, we can't action the DESTROY_ME that
is coming when the CCE exhausts the retries.
This patch adds a flag that is set during the SS's onward wsi
connection attempt and causes it to stash rather than action
the result code.
The result code is brought out from the stash when we return to
_lws_ss_client_connect level, and passed up in the SS flow until
it is actioned, cleanly aborting the ss create.
Add -Wextra (with -Wno-unused-parameter) to unix builds in addition to
-Wall -Werror.
This can successfully build everything in Sai without warnings / errors.
This causes the blocking dns lookup to treat EAI_NONAME as immediately
fatal, this is usually caused by an assertive NXDOMAIN from the DNS server
or similar.
Not being able to reach the server should continue to retry.
In order to make the problem visible, it reports the situation using
CLIENT_CONNECTION_ERROR, even though it is still inside the outer client
creation call.
For both ss and sspc, enforce at runtime that user code cannot call
lws_ss[pc]_destroy on a handle from a callback.
The error indicates the remedy (return DESTROY_ME) and asserts.
Trying to use the opaque pointer in the handle to point to the conn isn't
going to work when we need it to point to the ss handle.
Move it to have its on place in the handle.
Currently the lws_cancel_service() api only manifests itself at lws level.
This adds a state LWSSSCS_EVENT_WAIT_CANCELLED that is broadcast to all
SS in the event loop getting the cancel service api call, and allows
SS-level user code to pick up handling events from other threads.
There's a new example minimal-secure-streams-threads which shows the
pattern for other threads to communicate with and trigger the event in the
lws service thread.
If the larger application is defining vhosts using lejp-conf JSON, it's
often more convenient to describe the vhost for ss server binding to
that.
If the server policy endpoint (usually used to describe the server
interface bind) begins with '!', take the remainder of the endpoint
string as the name of a preexisting vhost to bind ss server to at
creation-time.
This provides a way to get ahold of LWS_WITH_CONMON telemetry from Secure
Streams, it works the same with direct onward connections or via the proxy.
You can mark streamtypes with a "perf": true policy attribute... this
causes the onward connections on those streamtypes to collect information
about the connection performance, and the unsorted DNS results.
Streams with that policy attribute receive extra data in their rx callback,
with the LWSSS_FLAG_PERF_JSON flag set on it, containing JSON describing the
performance of the onward connection taken from CONMON data, in a JSON
representation. Streams without the "perf" attribute set never receive
this extra rx.
The received JSON is based on the CONMON struct info and looks like
{"peer":"46.105.127.147","dns_us":596,"sockconn_us":31382,"tls_us":28180,"txn_resp_us:23015,"dns":["2001:41d0:2:ee93::1","46.105.127.147"]}
A new minimal example minimal-secure-streams-perf is added that collects
this data on an HTTP GET from warmcat.com, and is built with a -client
version as well if LWS_WITH_SECURE_STREAMS_PROXY_API is set, that operates
via the ss proxy and produces the same result at the client.
lws_ss_set_metadata can fail... eg, due to transient OOM situation... if it does,
caller must take appropriate action like disconnect and retry.
So mark the api as requiring the result checking, and make sure all the
examples do it.
There are a few build options that are trying to keep and report
various statistics
- DETAILED_LATENCY
- SERVER_STATUS
- WITH_STATS
remove all those and establish a generic rplacement, lws_metrics.
lws_metrics makes its stats available via an lws_system ops function
pointer that the user code can set.
Openmetrics export is supported, for, eg, prometheus scraping.
The state tracking and violation detection is very powerful at enforcing
only legal transitions, but if it's busy, we don't get to see which stream
had to problem. Add a pointer to the handle lc tag, do that rather than
just pass the handle so we can deal with ss and sspc handles cleanly.
The various stream transitions for direct ss, SSPC, smd, and
different protocols are all handled in different code, let's
stop hoping for the best and add a state transition validation
function that is used everywhere we pass a state change to a
user callback, and knows what is valid for the user state()
callback to see next, given the last state it was shown.
Let's assert if lws manages to violate that so we can find
where the problem is and provide a stricter guarantee about
what user state handler will see, no matter if ss or sspc
or other cases.
To facilitate that, move the states to start from 1, where
0 indicates the state unset.
Let's add a byte on the first message that sspc clients send,
indicating the version of the serialization protocol that the
client was built with.
Start the version at 1, we will add some more changes in other
patches and call v1 (now it has the versioning baked in)
the first real supported serialization version, this patch must
be applied with the next patches to actually represent v1
protocol changes.
This doesn't require user setting, the client is told what version
it supports in LWS_SSS_CLIENT_PROTOCOL_VERSION. The proxy knows
what version(s) it can support and loudly hangs up on the client
if it doesn't understand its protocol version.
This is a huge patch that should be a global NOP.
For unix type platforms it enables -Wconversion to issue warnings (-> error)
for all automatic casts that seem less than ideal but are normally concealed
by the toolchain.
This is things like passing an int to a size_t argument. Once enabled, I
went through all args on my default build (which build most things) and
tried to make the removed default cast explicit.
With that approach it neither change nor bloat the code, since it compiles
to whatever it was doing before, just with the casts made explicit... in a
few cases I changed some length args from int to size_t but largely left
the causes alone.
From now on, new code that is relying on less than ideal casting
will complain and nudge me to improve it by warnings.