On h2 server POST, there's a race over whether the POST body arrives
coalesced with the headers.
The problem is that on h2, we can't action the stream http request or body
until the stream is writeable, since we may start issuing the response right
away; there's already a DEFERRING_ACTION state to manage this. And indeed,
the coalesced, not-immediately-actionable POST body is buflisted properly.
However, when we come to action the POST using the buflisted data, we don't
follow the same pattern as when we deal with the incoming data immediately.
This patch aligns the buflist-draining path with that pattern: track the
expected rx_content_length, handle BODY_COMPLETION if we reached the end of
it, and remove the wsi from the pt list of wsi with pending buflists once
the buflist is used up.
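The drain side then follows the usual buflist pattern; a minimal sketch of
the idea using the public lws_buflist helpers (simplified, not the actual
patch):

  uint8_t *p;
  size_t len;

  /* consume stashed body segments until the buflist is empty */
  while ((len = lws_buflist_next_segment_len(&wsi->buflist, &p))) {
          /* ...process len bytes at p as body, accounting against
           * rx_content_length and issuing BODY_COMPLETION at the
           * end... */
          lws_buflist_use_segment(&wsi->buflist, len);
  }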
When we have to defer http_action for a stream because we may not have any
writeability yet, we stash any incoming body on the rx buflist for the wsi,
which is good.
But when we resume under some conditions, we neither issue the _HTTP cb nor
drain the stashed body; it gets cleaned out in the close flow, but the
behaviour is broken.
This makes the deferred resume flow do the right thing under those
conditions.
This is a huge patch that should be a global NOP.
For unix-type platforms it enables -Wconversion to issue warnings (-> error)
for all implicit casts that seem less than ideal but are normally concealed
by the toolchain.
This covers things like passing an int to a size_t argument. Once enabled,
I went through all the args on my default build (which builds most things)
and tried to make the previously-implicit casts explicit.
With that approach it neither changes nor bloats the code, since it compiles
to whatever it was doing before, just with the casts made explicit... in a
few cases I changed some length args from int to size_t but largely left the
underlying causes alone.
From now on, new code that relies on less-than-ideal casting will trigger
warnings and nudge me to improve it.
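For example, this kind of pattern (a minimal standalone sketch, not from
the lws source) now warns until the narrowing is spelled out:

  #include <stddef.h>

  static void eat(size_t len) { (void)len; }

  static void demo(int n)
  {
          eat(n);          /* -Wconversion: int may be negative,
                            * size_t cannot represent that */
          eat((size_t)n);  /* explicit cast documents the intent:
                            * no warning */
  }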
This adds some new objects and helpers for keeping and logging info on
grouped allocations; a group is, eg, SS handles or client wsis.
Allocated objects get a context-unique "tag" string intended to replace
%p / wsi pointers etc in logging. Pointers quickly become confusing when
allocations are freed and reused; the tag string won't repeat until you
have produced 2^64 objects in a context.
In addition the tag string documents the object group, with prefixes like
"wsi-" or "vh-", and contains object-specific additional information like
the vhost name, address / port, or the role of the wsi.
At creation time the lws code can use a format string and args to add
whatever group-specific info makes sense; eg, a wsi bound to a secure
stream can also append the guid of the secure stream. It's copied into the
new object's tag and so is still available cleanly after the stream is
destroyed, if the wsi outlives it.
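Illustratively (renderings invented here for the example, not exact lws
output), log lines can then identify objects like

  [wsi|294|h2]            a wsi, serial 294, in the h2 role
  [vh|2|default|443]      a vhost with its name and port appended
  [wsi|295|h1|myssguid]   a wsi carrying the guid of its bound ss

instead of by raw pointer.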
A few different places want to create wsis and basically repeat their own
versions of the flow. Let's unify it into one helper in wsi.c.
Also require the context lock to be held (this only impacts
LWS_MAX_SMP > 1).
http content-length is not mandatory on POST, which creates a whole bunch
of difficulties. On h2, the client will set the stream half-closed via a
DATA frame with the END_STREAM flag when it is done.
Improve the POST data tracking to understand that situation properly.
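In other words, without content-length the body length is only knowable
when the half-close arrives:

  client -> HEADERS  :method: POST   (no content-length)
  client -> DATA     (partial body)
  client -> DATA     + END_STREAM    <- body is only now known complete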
Chrome has started issuing frame type 0x42; we were dropping the connection
before realizing we wanted to ignore it (RFC7540 requires unknown frame
types to be ignored and discarded).
This explicitly ignores it a bit earlier.
role ops are usually only sparsely filled: there are currently 20 function
pointers, but several roles only fill in two, and no single role has more
than 14 of the ops. On a 32- / 64-bit build this part of the ops struct
then takes a fixed 80 / 160 bytes per role.
First, reduce the type of the callback reason part from uint16_t to
uint8_t; this saves 12 bytes unconditionally.
Then change to a separate function pointer array with a nybble index
array; this costs 10 bytes for the index plus a pointer to the separate
array, so net of the 12 bytes saved above, for 32-bit the cost is

  2 + (4 x ops_used)

and for 64-bit

  6 + (8 x ops_used)

For a role with 2 ops_used that means 32-bit: 10 vs 80, and 64-bit:
22 vs 160.
For a typical system with h1 (9), h2 (14), listen (2), netlink (2),
pipe (1), raw_skt (3) and ws (12), ie, 43 ops_used out of 140, the .rodata
for this reduces from 560 to 174 on 32-bit (386 byte saving) and from
1120 to 350 on 64-bit (770 byte saving).
This doesn't account for the changed function ops calling code; two ways
were tried, a preprocessor macro and explicit accessor functions.
For an x86_64 gcc 10 build with most options enabled, release mode,
.text + .rodata:

  before patch:       553282
  accessor macro:     552714 (568 byte saving)
  accessor functions: 553674 (392 bytes worse than without the patch)

therefore we went with the macros.
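The idea of the nybble index is sketched below (names and layout
illustrative, not the exact lws implementation): nybble 0 means "op not
implemented", and nybble n means entry n - 1 in the compact per-role
function pointer table.

  #include <stdint.h>

  typedef int (*rop_fn)(void *arg);

  struct role_ops_sketch {
          uint8_t       rops_idx[10];  /* 20 x 4-bit indexes */
          const rop_fn *rops_table;    /* only the ops the role fills */
  };

  /* recover the 4-bit index for op number "op" (0..19) */
  #define ROPS_FIDX(ops, op) \
          ((((op) & 1) ? ((ops)->rops_idx[(op) >> 1] >> 4) : \
                          (ops)->rops_idx[(op) >> 1]) & 0xf)

  /* valid only after checking ROPS_FIDX(ops, op) is nonzero */
  #define ROPS_FUNC(ops, op) \
          ((ops)->rops_table[ROPS_FIDX(ops, op) - 1])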
RFC6724 defines an ipv6-centric DNS result sorting algorithm that takes
route and source address information for the results given by DNS
resolution, and sorts them in order of preferability, which defines the
order they should be tried in.
If LWS_WITH_NETLINK, then lws takes care of collecting and monitoring the
interface, route and source address information, and uses it to perform the
RFC6724 sorting to re-sort the DNS results before trying to make the
connections.
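For example (addresses from the documentation ranges; the real ordering
depends on the local route and source address state), with a native IPv6
default route available, mixed results might be re-sorted from

  192.0.2.1, 2001:db8::1

to

  2001:db8::1, 192.0.2.1

so the natively-routable IPv6 destination is tried first.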
OSX changed to segfault on writes to .rodata, exposing that we were
dropping a NUL into what can be .rodata when setting the environment
manually. We don't typically do this on Linux because we take the code path
where execvpe() is available to do the env for us.
Adapt the code to treat the env strings as const, and underscore that by
changing their type to const char ** in the info struct.
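The underlying pattern is plain C undefined behaviour; a minimal sketch of
the class of bug (not the lws code itself):

  char *p = "NAME=value"; /* string literal: may live in .rodata */
  p[4] = '\0';            /* UB: now traps on OSX, silently "worked"
                           * on toolchains that made it writable */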
This creates a role for RFC3549 Netlink monitoring.
If the OS supports it (currently, Linux) then each pt creates a wsi with
the netlink role and dumps the current routing table at pt init. It then
maintains a cache of the routing table in each pt.
Upon routing table changes, an SMD message is issued as an event, and
Captive Portal Detection is triggered.
All of the pt's current connections are reassessed for routability under
the changed routing table; those that no longer have a valid route or
gateway are closed.
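A consumer can observe these events via lws_smd; a minimal sketch, assuming
the lws_smd_register() api and the LWSSMDCL_NETWORK class (check the
current public headers for the exact signatures):

  static int
  net_ev_cb(void *opaque, lws_smd_class_t _class, lws_usec_t ts,
            void *buf, size_t len)
  {
          /* buf / len describe the event, eg, a route change */
          lwsl_notice("%s: %.*s\n", __func__, (int)len,
                      (const char *)buf);

          return 0;
  }

  /* at init, after context creation */
  if (!lws_smd_register(context, NULL, 0, LWSSMDCL_NETWORK, net_ev_cb))
          lwsl_err("%s: smd register failed\n", __func__);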
If we connect out to an IP address, or we adopt a connected socket, from
now on we want to hold the peer sockaddr in the wsi.
Adapt ACCESS_LOG to use this new copy rather than keeping the stringified
version.
These have been in lib/roles/http for historical reasons, and all ended up
in client-handshake.c, a name that no longer describes what they actually
do. Separate out the staged client connect functions into
lib/core-net/client/client2.c: lws_client_connect_2_dnsreq()
lib/core-net/client/client3.c: lws_client_connect_3_connect()
lib/core-net/client/client4.c: lws_client_connect_4_established()
Move a couple of other functions that don't belong there out to
tls-client.c and client-http.c, which are related to http and remain in the
http role dir.
Teach lws how to deal with date: and retry-after:.
Add a quick selftest into api-test-lws_tokenize.
Expand lws_retry_sul_schedule_retry_wsi() to check for retry_after and
increase the backoff if a larger one is found.
Finally, change the SS h1 protocol to handle 503 + retry-after: as a
failure, and apply any increased backoff from retry-after automatically.
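For example, a server response like

  HTTP/1.1 503 Service Unavailable
  retry-after: 120

is now treated as a failure, with at least 120s applied to the backoff
before the next attempt (per RFC7231, retry-after may also be an http
date).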
At the moment you can define and set per-stream metadata at the client,
which will be string-substituted and, if configured in the policy, set in
related outgoing protocol-specific content like h1 headers.
This patch extends the metadata concept to also check incoming protocol-
specific content like h1 headers and, where it matches the binding in the
streamtype's metadata entry, make it available to the client by name, via
a new lws_ss_get_metadata() api.
Currently warmcat.com has additional headers for
server: lwsws (well-known header name)
test-custom-header: hello (custom header name)
The minimal-secure-streams test is updated to try to recover both of these,
in the direct and the -client (via proxy) versions. The corresponding
metadata part of the "mintest" stream policy from warmcat.com is
  {
    "srv":  "server:"
  }, {
    "test": "test-custom-header:"
  },
If built direct, or at the proxy, the stream has access to the static
policy metadata definitions and can store the rx metadata in the stream
metadata allocation, with a heap-allocated value. For the client side that
talks to a proxy, only the proxy knows the policy, and it returns rx
metadata inside the serialized link to the client, which stores it on the
heap attached to the stream.
In addition, an optimization for mapping static policy metadata definitions
to individual stream handle metadata is changed to match by name.
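Client code can then pick up the received values by their policy names; a
minimal sketch, assuming the lws_ss_get_metadata() signature used below
(check the current public headers):

  const void *value;
  size_t len;

  /* h is the stream's lws_ss_handle; "srv" is the metadata name
   * bound to "server:" in the policy above */
  if (!lws_ss_get_metadata(h, "srv", &value, &len))
          lwsl_notice("server header: %.*s\n", (int)len,
                      (const char *)value);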
This is complicated by the fact that extern on a function declaration
implies visibility... we have to make LWS_EXTERN empty when building
static. And setting target_compile_definitions() doesn't work inside cmake
macros, so it has to be set explicitly for the plugins.
Checking the symbol status needs nm -C -D, as per
https://stackoverflow.com/questions/37934388/symbol-visibility-not-working-as-expected
After this patch, libwebsockets.a shows no symbols when checked like that,
and the static-linked minimal examples only show U for their other dynamic
imports.
In a handful of cases we use LWS_EXTERN on extern data declarations; those
then need to change to explicit extern.
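The data-declaration case matters because of plain C linkage rules; a
minimal sketch (identifier illustrative):

  /* with LWS_EXTERN defined empty for the static build, this header
   * line stops being a pure declaration and becomes a (tentative)
   * definition in every translation unit including it: */
  LWS_EXTERN const struct lws_protocols protocols[];

  /* so data declarations must spell it out instead: */
  extern const struct lws_protocols protocols[];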
Formalize the LWSSSSRET_ enums into a type "lws_ss_state_return_t",
returned by the rx, tx and state callbacks and by some private helpers
lws_ss_backoff() and lws_ss_event_helper().
Remove the LWSSSSRET_SS_HANDLE_DESTROYED concept... the two helpers that
could have destroyed the ss and returned that now return
LWSSSSRET_DESTROY_ME for the caller to perform, or to pass up to its own
caller, instead.
Handle helper returns in all the ss protocols and update the rx / tx calls
so their returns from rx / tx / event helper and ss backoff are all handled
by unified code.
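The unified calling pattern looks something like this sketch (simplified,
not the exact lws code):

  lws_ss_state_return_t r;

  r = h->info.rx(userobj, buf, len, flags); /* userobj: the user's
                                             * per-stream object */
  if (r == LWSSSSRET_DESTROY_ME) {
          /* the callback asked for destruction: perform it here at
           * the caller, rather than deep inside a helper */
          lws_ss_destroy(&h);
          return 1;
  }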
Event lib support as it has been isn't scaling well: at the low level, the
libevent and libev headers have a namespace conflict, so they can't both be
built into the same image; and at the distro level, binding all the event
libs to libwebsockets.so makes a bloaty situation for packaging, since lws
will drag in all the event libs every time.
This patch implements the plan discussed here
https://github.com/warmcat/libwebsockets/issues/1980
and refactors the event lib support so each is built into an isolated
plugin and bound at runtime according to what the application says it wants
to use. The event lib plugins can be packaged individually so that only the
needed sets of support are installed (perhaps none of them, if the user
code is OK with the default poll() loop). And dependent user code can mark
the specific event loop plugin package as required, so pieces are added as
needed.
The eventlib-foreign example is also refactored to build the selected lib
support in isolation.
A readme is added detailing the changes and how to use them:
https://libwebsockets.org/git/libwebsockets/tree/READMEs/README.event-libs.md
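From the application side, selection looks the same as before; a minimal
sketch (lws binds the matching evlib plugin at runtime):

  struct lws_context_creation_info info;
  struct lws_context *context;

  memset(&info, 0, sizeof info);
  info.options = LWS_SERVER_OPTION_LIBUV; /* ask for the libuv loop */

  context = lws_create_context(&info);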
Move the common plugin dir scanning stuff to be based on lws_dir, which
already builds for windows. Previously this was done via dirent for unix
and libuv for windows.
Reduce the dl plat stuff to just wrap instantiation and destruction of
dynlibs, and establish common code in lib/misc/dir.c for the plugin
scanning itself.
Migrate the libuv windows dl stuff to windows-plugins.c, so that it's
available even if libuv loop support later becomes an event lib plugin.
Remove the existing api exports scheme for plugins; just export a single
const struct now, which has a fixed header type followed by whatever you
want, depending on the class / purpose of the plugin. Place a "class"
string in the header so there can be different kinds of plugins implying
different exported types.
Make the plugin apis public, and add support for filtering by class string
plus per-instantiation / destruction callbacks, so the subclassed header
type can do its thing for the plugin class. The user provides a linked-list
base for his class of plugins, so he can manage them completely separately
and in user code / user export types.
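Conceptually the exported object looks like this sketch (field names
illustrative; see the public headers for the real lws_plugin_header
layout):

  struct plugin_header_sketch {
          const char   *name;      /* plugin name */
          const char   *_class;    /* eg, protocol plugin vs evlib */
          unsigned int  api_magic; /* sanity / version check */
  };

  struct evlib_plugin_sketch {
          struct plugin_header_sketch hdr; /* fixed header first */
          /* ...then class-specific exports follow... */
  };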
Rip out some last hangers-on from generic sessions / tables.
This is all aimed at making the plugin support general enough that it can
provide event lib plugins later.