Coverity does not understand that once we checked that the header has
a non-zero length, the associated pointer can never be NULL. Add a
pointless check to make it happy.
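
As a rough illustration of the kind of redundant check meant here (the
struct and function names are hypothetical, not the lws ones), assuming a
header record where a non-zero length already implies a valid pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header fragment: by construction len > 0 implies ptr != NULL */
struct hdr_frag {
	const char *ptr;
	size_t len;
};

/* Copies the header into dest as a NUL-terminated string, returns its length */
size_t copy_header(const struct hdr_frag *h, char *dest, size_t dest_len)
{
	assert(h && dest);

	if (!h->len || dest_len < h->len + 1)
		return 0;

	/* Redundant for us, but lets the static analyzer see ptr is checked */
	if (!h->ptr)
		return 0;

	memcpy(dest, h->ptr, h->len);
	dest[h->len] = '\0';

	return h->len;
}
```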
This adds an indication of the DNS disposition to the conmon results and,
for http, if the connection gets that far, a protocol-specific indication
of the http response code.
Add a way to confirm that the ss handle recovered from a ss wsi is still
valid, by walking the pt ss list and confirming it is on there before using
it with conmon.
If the handle isn't on the list, it will assert.
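
The list walk can be sketched roughly as below. The types are simplified,
hypothetical stand-ins for the pt ss list, not the lws structs; the point
is only that the recovered pointer is compared against live list members
and never dereferenced before it is confirmed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-thread (pt) list of ss handles */
struct ss_handle {
	struct ss_handle *next;
};

struct per_thread {
	struct ss_handle *ss_head;
};

/* Returns 1 if h is currently on pt's ss list, else 0; h is not dereferenced */
int ss_handle_is_listed(const struct per_thread *pt, const struct ss_handle *h)
{
	const struct ss_handle *p;

	assert(pt);

	for (p = pt->ss_head; p; p = p->next)
		if (p == h)
			return 1;

	return 0;
}
```

A caller in the spirit of this patch would then do something like
assert(ss_handle_is_listed(pt, h)) before handing h to the conmon code.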
Normally when handling a Client Connection Error (CCE),
we can action any ss relationship straight away since
we are in a wsi callback without any ss-aware parents
in the call stack.
But in the specific case where we're doing the initial onward
wsi connection part on behalf of a ss, the call
stack does in fact have earlier parents holding references on
the related ss.
For example:

  secstream_h1 (ss-h1.c:470) CCE
  lws_inform_client_conn_fail (close.c:319) fails early
  lws_client_connect_2_dnsreq (connect2.c:349)
  lws_http_client_connect_via_info2 (connect.c:71)
  lws_header_table_attach (parsers.c:291)
  rops_client_bind_h1 (ops-h1.c:1001)
  lws_client_connect_via_info (connect.c:429) start onward connect
  _lws_ss_client_connect (secure-streams.c:859)
  _lws_ss_request_tx (secure-streams.c:1577)
  lws_ss_request_tx (secure-streams.c:1515) request tx
  ss_cpd_state (captive-portal-detect.c:50)
  lws_ss_event_helper (secure-streams.c:408)
  lws_ss_create (secure-streams.c:1256) SS Create

Under these conditions, we can't action the DESTROY_ME that
is coming when the CCE exhausts the retries.
This patch adds a flag that is set during the SS's onward wsi
connection attempt and causes it to stash rather than action
the result code.
The result code is brought out from the stash when we return to
_lws_ss_client_connect level, and passed up in the SS flow until
it is actioned, cleanly aborting the ss create.
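
The stash-then-action idea can be sketched as follows. All names here are
hypothetical simplifications (the real code works with
lws_ss_state_return_t and the ss handle); "destroyed" just stands in for
freeing the handle.

```c
#include <assert.h>

/* Hypothetical stand-ins, not the lws types */
enum ss_ret { SS_RET_OK, SS_RET_DESTROY_ME };

struct ss {
	int connecting;       /* set while the onward connect is in progress */
	enum ss_ret stashed;  /* result code held back while connecting */
	int destroyed;        /* stands in for actually freeing the handle */
};

/* Called from the CCE path once the retries are exhausted */
void ss_cce_retries_exhausted(struct ss *h)
{
	if (h->connecting) {
		/* parents in the call stack still reference h: stash it */
		h->stashed = SS_RET_DESTROY_ME;
		return;
	}
	h->destroyed = 1; /* no ss-aware parents: action it straight away */
}

/* Starts the onward connection; fail_early simulates a synchronous CCE */
enum ss_ret ss_client_connect(struct ss *h, int fail_early)
{
	enum ss_ret r;

	h->connecting = 1;
	h->stashed = SS_RET_OK;
	if (fail_early)
		ss_cce_retries_exhausted(h);
	h->connecting = 0;

	assert(!h->destroyed); /* h must still be valid at this level */

	r = h->stashed;        /* bring the result out of the stash */
	h->stashed = SS_RET_OK;

	return r;              /* passed up until a caller can action it */
}
```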
Add -Wextra (with -Wno-unused-parameter) to unix builds in addition to
-Wall -Werror.
This can successfully build everything in Sai without warnings / errors.
In sai, on Xenial (only...), noticed that on ss-testsfail the wsi is
sometimes still bound to the ss handle, and can reference it even after
the ss has been destroyed.
Leave the handle knowing its wsi and able to detach it later during close.
User reports problems with the close / retry flow not happening if we don't
pass thru the nwsi close... it may be happening before the sid1 migration.
Just log it and don't end the handling before the passthru. Logging it
because there was a reason for the change to not passing it through...
Defer recording the ss metrics histogram until wsi close, so it has a
chance to collect all the tags that apply.
Defer dumping metrics until the FINALIZE phase of context destroy, so we
have had a chance to get any metrics recorded.
This provides a way to get hold of LWS_WITH_CONMON telemetry from Secure
Streams; it works the same with direct onward connections or via the proxy.
You can mark streamtypes with a "perf": true policy attribute... this
causes the onward connections on those streamtypes to collect information
about the connection performance, and the unsorted DNS results.
Streams with that policy attribute receive extra data in their rx callback,
with the LWSSS_FLAG_PERF_JSON flag set on it, containing a JSON
representation of the onward connection's performance taken from CONMON
data. Streams without the "perf" attribute set never receive this extra rx.
The received JSON is based on the CONMON struct info and looks like
{"peer":"46.105.127.147","dns_us":596,"sockconn_us":31382,"tls_us":28180,"txn_resp_us":23015,"dns":["2001:41d0:2:ee93::1","46.105.127.147"]}
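
For a quick look at one of the timing fields, something like the helper
below may be enough. It is a deliberately naive sketch: the function name
is hypothetical, it only handles integer members, and it assumes the flat
layout shown above. Real code would more likely run the buffer through a
proper JSON parser.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns the integer value of the named member, or -1 if it is missing.
 * Naive: looks for the literal text "key": and converts what follows.
 */
long perf_json_get_us(const char *json, const char *key)
{
	char pat[64];
	const char *p;
	int n;

	assert(json && key);

	n = snprintf(pat, sizeof(pat), "\"%s\":", key);
	if (n < 0 || (size_t)n >= sizeof(pat))
		return -1;

	p = strstr(json, pat);
	if (!p)
		return -1;

	return strtol(p + n, NULL, 10);
}
```

A user rx callback would only call this on buffers that arrive with
LWSSS_FLAG_PERF_JSON set, and treat everything else as normal payload.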
A new minimal example minimal-secure-streams-perf is added that collects
this data on an HTTP GET from warmcat.com, and is built with a -client
version as well if LWS_WITH_SECURE_STREAMS_PROXY_API is set, that operates
via the ss proxy and produces the same result at the client.
Set the CONNECTED state only when SUBACK is received, if the stream has
defined a subscription topic. This stops SS from sending out SUBSCRIBE
right after CONNACK, even when the connection is not valid.
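
The decision about when to report CONNECTED can be pictured like this.
The enum and function are hypothetical and only model the rule stated
above, not the actual ss-mqtt code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two MQTT events of interest */
enum mqtt_event { EV_CONNACK, EV_SUBACK };

/* Returns 1 if CONNECTED should be reported to the stream on this event */
int should_report_connected(enum mqtt_event ev, const char *subscribe_topic)
{
	int has_topic = subscribe_topic && subscribe_topic[0];

	assert(ev == EV_CONNACK || ev == EV_SUBACK);

	if (ev == EV_CONNACK)
		return !has_topic; /* nothing to subscribe to: connected now */

	return has_topic;          /* SUBACK completes the connection */
}
```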
Until now we set metadata value pointers into the onward wsi ah data
area... that's OK until we get a situation where the wsi has gone away
before we have a chance to deliver the metadata over the proxy link.
Add a variant lws_ss_alloc_set_metadata() that allocates space on the heap
and takes a copy of the input metadata. Change ss-h1 to alloc copies of
its metadata so we no longer race the wsi ah lifetime.
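
The difference between pointing at the source buffer and taking a heap
copy can be sketched as below. The struct and function are hypothetical
and much simpler than the real metadata handling; they only show why the
copy survives the original buffer changing or going away.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical metadata slot, not the lws one */
struct md_entry {
	void *value;
	size_t len;
	int heap; /* nonzero if value was allocated by md_alloc_set() */
};

/* Stores a private heap copy of value in e; returns 0 if OK, 1 on OOM */
int md_alloc_set(struct md_entry *e, const void *value, size_t len)
{
	void *copy;

	assert(e);

	copy = malloc(len ? len : 1);
	if (!copy)
		return 1;
	if (len)
		memcpy(copy, value, len);

	if (e->heap)
		free(e->value); /* drop any earlier copy we own */

	e->value = copy;
	e->len = len;
	e->heap = 1;

	return 0;
}
```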
lws_ss_set_metadata can fail... eg, due to a transient OOM situation... if it
does, the caller must take appropriate action like disconnect and retry.
So mark the api as requiring result checking, and make sure all the
examples do it.
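
One common way to make the compiler enforce result checking on gcc and
clang is the warn_unused_result attribute. The macro and function below
are hypothetical illustrations of that technique, not the lws
declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper macro; expands to nothing on other compilers */
#if defined(__GNUC__) || defined(__clang__)
#define MUST_CHECK __attribute__((warn_unused_result))
#else
#define MUST_CHECK
#endif

/* Copies value into slot; returns 0 if OK, 1 if it does not fit */
MUST_CHECK int set_metadata(char *slot, size_t slot_len, const char *value)
{
	size_t n;

	assert(slot && value);

	n = strlen(value);
	if (n + 1 > slot_len)
		return 1;

	memcpy(slot, value, n + 1);

	return 0;
}
```

With this in place, a bare call that ignores the return value draws a
warning on those compilers, which becomes an error under -Werror.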
There are a few build options that are trying to keep and report
various statistics
- DETAILED_LATENCY
- SERVER_STATUS
- WITH_STATS
Remove all those and establish a generic replacement, lws_metrics.
lws_metrics makes its stats available via an lws_system ops function
pointer that the user code can set.
Openmetrics export is supported, for, eg, prometheus scraping.
The various stream transitions for direct ss, SSPC, smd, and
different protocols are all handled in different code, let's
stop hoping for the best and add a state transition validation
function that is used everywhere we pass a state change to a
user callback, and knows what is valid for the user state()
callback to see next, given the last state it was shown.
Let's assert if lws manages to violate that so we can find
where the problem is and provide a stricter guarantee about
what the user state handler will see, no matter if ss or sspc
or other cases.
To facilitate that, move the states to start from 1, where
0 indicates the state unset.
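
A table-driven validator along these lines is one way to picture it. The
states and the allowed transitions below are hypothetical and far smaller
than the real set; only the shape (states from 1, 0 meaning unset, a
bitmap of what may legally come next) follows the description above.

```c
#include <assert.h>

/* Hypothetical states: 0 means the user has not been shown any state yet */
enum st {
	ST_UNSET = 0,
	ST_CREATING = 1,
	ST_CONNECTING,
	ST_CONNECTED,
	ST_DISCONNECTED,
	ST_DESTROYING,
	ST_COUNT
};

#define ST_BIT(s) (1u << (s))

/* For each last-seen state, a bitmap of states the user may see next */
static const unsigned int allowed_next[ST_COUNT] = {
	[ST_UNSET]        = ST_BIT(ST_CREATING),
	[ST_CREATING]     = ST_BIT(ST_CONNECTING) | ST_BIT(ST_DESTROYING),
	[ST_CONNECTING]   = ST_BIT(ST_CONNECTED) | ST_BIT(ST_DISCONNECTED) |
			    ST_BIT(ST_DESTROYING),
	[ST_CONNECTED]    = ST_BIT(ST_DISCONNECTED) | ST_BIT(ST_DESTROYING),
	[ST_DISCONNECTED] = ST_BIT(ST_CONNECTING) | ST_BIT(ST_DESTROYING),
	[ST_DESTROYING]   = 0,
};

/* Returns 1 if next is a valid successor of prev, else 0 */
int st_transition_ok(enum st prev, enum st next)
{
	if ((unsigned int)prev >= ST_COUNT || (unsigned int)next >= ST_COUNT)
		return 0;

	return !!(allowed_next[prev] & ST_BIT(next));
}

/* Call before every user state() callback: asserts on a bad transition */
void st_set(enum st *prev, enum st next)
{
	assert(st_transition_ok(*prev, next));
	*prev = next;
}
```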
This is a huge patch that should be a global NOP.
For unix type platforms it enables -Wconversion to issue warnings (-> error)
for all automatic casts that seem less than ideal but are normally concealed
by the toolchain.
This is things like passing an int to a size_t argument. Once enabled, I
went through all args on my default build (which builds most things) and
tried to make the removed default cast explicit.
With that approach it neither changes nor bloats the code, since it compiles
to whatever it was doing before, just with the casts made explicit... in a
few cases I changed some length args from int to size_t but largely left
the causes alone.
From now on, new code that is relying on less than ideal casting
will complain and nudge me to improve it by warnings.
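
As a small illustration of the kind of change involved (the function is
hypothetical), passing the int n straight to memcpy() is the sort of
implicit conversion gcc reports in C once -Wconversion is on; the explicit
casts silence it without changing the generated code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies n bytes of src into dest if they fit; returns bytes copied */
size_t copy_n(char *dest, size_t dest_len, const char *src, int n)
{
	assert(dest && src);

	/* range-check first, so the casts below cannot change the value */
	if (n < 0 || (size_t)n > dest_len)
		return 0;

	/* memcpy(dest, src, n) would warn: int converted to size_t */
	memcpy(dest, src, (size_t)n);

	return (size_t)n;
}
```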
This adds some new objects and helpers for keeping and logging
info on grouped allocations; a group is, eg, SS handles or client
wsis.
Allocated objects get a context-unique "tag" string intended to replace
%p / wsi pointers etc. Pointers quickly become confusing when
allocations are freed and reused, whereas the tag string won't repeat
until you produce 2^64 objects in a context.
In addition the tag string documents the object group, with prefixes
like "wsi-" or "vh-", and contains object-specific additional
information like the vhost name, address / port or the role of the wsi.
At creation time the lws code can use a format string and args
to add whatever group-specific info makes sense, eg, a wsi bound
to a secure stream can also append the guid of the secure stream;
it's copied into the new object tag and so is still available
cleanly after the stream is destroyed if the wsi outlives it.
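
A tag generator of this general shape could look like the sketch below.
The names and the exact tag layout are hypothetical (the real lws tags are
formatted differently); what it shows is a context-wide 64-bit serial plus
printf-style group-specific info copied into the object's own buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical context-wide state: one serial shared by all groups */
struct tag_ctx {
	uint64_t next_serial;
};

/*
 * Writes "<prefix>-<serial>|<formatted info>" into tag.  Returns 0 if OK,
 * 1 if even the prefix and serial do not fit; extra info may be truncated.
 */
int make_tag(struct tag_ctx *ctx, char *tag, size_t tag_len,
	     const char *prefix, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(ctx && tag && prefix && fmt);

	n = snprintf(tag, tag_len, "%s-%llu|", prefix,
		     (unsigned long long)ctx->next_serial++);
	if (n < 0 || (size_t)n >= tag_len)
		return 1;

	va_start(ap, fmt);
	vsnprintf(tag + n, tag_len - (size_t)n, fmt, ap);
	va_end(ap);

	return 0;
}
```

Because the formatted info is copied into the tag at creation time, the
tag stays meaningful in logs even after the objects it mentions are gone.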
Since client_connect and request_tx can be called from code that expects
the ss handle to be in scope, these calls can't deal with destroying the
ss handle and must pass the lws_ss_state_return_t disposition back to
the caller to handle.