The FREERTOS plat has its own h2 SETTINGS, but although they are used, they
get overridden with the lws default SETTINGS during pt init.
Let's not do that if someone else has already touched the context set.
Until now although we can follow redirects, and they can promote the
protocol from h1->h2, we couldn't handle h2 wsi reuse since there are many
states in the wsi affected by being h2.
This wipes the related states in lws_wsi_reset() and follows the generic
wsi close flow before deviating into the redirect really close to the end,
ensuring we cleaned out evidence of our previous life properly.
h2->h2 redirects work properly after this.
The max number of redirects is increased from 3 -> 4 since this was seen in
the wild with www and then geographic-based redirects.
There are a few build options that are trying to keep and report
various statistics
- DETAILED_LATENCY
- SERVER_STATUS
- WITH_STATS
remove all those and establish a generic rplacement, lws_metrics.
lws_metrics makes its stats available via an lws_system ops function
pointer that the user code can set.
Openmetrics export is supported, for, eg, prometheus scraping.
On h2 server POST, there's a race to see if the POST body is going to be
received coalesced with the headers.
The problem is on h2, we can't action the stream http request or body until
the stream is writeable, since we may start issuing the response right away;
there's already DEFERRING_ACTION state to manage this. And indeed, the
coalesced, not-immediately-actionable POST body is buflisted properly.
However when we come to action the POST using buflisted data, we don't follow
the same pattern as dealing with the incoming data immediately.
This patch aligns the pattern dumping the buflist content to track
expected rx_content_length and handle BODY_COMPLETION if we got to
the end of it, along with removal from the pt list of wsi with pending
buflists if we used it up.
When we have to defer http_action for a stream because we may not have
any writeability, we stash any incoming body on the rx buflist for the wsi
which is good.
But when we resume under some conditions, we don't issue the _HTTP cb and
don't drain the stashed body. It's cleaned out in the close flow, but it's
broken.
This makes the deferred resume flow do the right thing under those conditions.
This is a huge patch that should be a global NOP.
For unix type platforms it enables -Wconversion to issue warnings (-> error)
for all automatic casts that seem less than ideal but are normally concealed
by the toolchain.
This is things like passing an int to a size_t argument. Once enabled, I
went through all args on my default build (which build most things) and
tried to make the removed default cast explicit.
With that approach it neither change nor bloat the code, since it compiles
to whatever it was doing before, just with the casts made explicit... in a
few cases I changed some length args from int to size_t but largely left
the causes alone.
From now on, new code that is relying on less than ideal casting
will complain and nudge me to improve it by warnings.
This adds some new objects and helpers for keeping and logging
info on grouped allocations, a group is, eg, SS handles or client
wsis.
Allocated objects get a context-unique "tag" string intended to replace
%p / wsi pointers etc. Pointers quickly become confusing when
allocations are freed and reused, the tag string won't repeat
until you produce 2^64 objects in a context.
In addition the tag string documents the object group, with prefixes
like "wsi-" or "vh-" and contain object-specific additional
information like the vhost name, address / port or the role of the wsi.
At creation time the lws code can use a format string and args
to add whatever group-specific info makes sense, eg, a wsi bound
to a secure stream can also append the guid of the secure stream,
it's copied into the new object tag and so is still available
cleanly after the stream is destroyed if the wsi outlives it.
role ops are usually only sparsely filled, there are currently 20
function pointers but several roles only fill in two. No single
role has more than 14 of the ops. On a 32/64 bit build this part
of the ops struct takes a fixed 80 / 160 bytes then.
First reduce the type of the callback reason part from uint16_t to
uint8_t, this saves 12 bytes unconditionally.
Change to a separate function pointer array with a nybble index
array, it costs 10 bytes for the index and a pointer to the
separate array, for 32-bit the cost is
2 + (4 x ops_used)
and for 64-bit
6 + (8 x ops_used)
for 2 x ops_used it means 32-bit: 10 vs 80 / 64-bit: 22 vs 160
For a typical system with h1 (9), h2 (14), listen (2), netlink (2),
pipe (1), raw_skt (3), ws (12), == 43 ops_used out of 140, it means
the .rodata for this reduced from 32-bit: 560 -> 174 (386 byte
saving) and 64-bit: 1120 -> 350 (770 byte saving)
This doesn't account for the changed function ops calling code, two
ways were tried, a preprocessor macro and explicit functions
For an x86_64 gcc 10 build with most options, release mode,
.text + .rodata
before patch: 553282
accessor macro: 552714 (568 byte saving)
accessor functions: 553674 (392 bytes worse than without patch)
therefore we went with the macros
They have been in lib/roles/http for historical reasons, and all
ended up in client-handshake.c that doesn't describe what they
actually do any more. Separate out the staged client connect
related stage functions into
lib/core-net/client/client2.c: lws_client_connect_2_dnsreq()
lib/core-net/client/client3.c: lws_client_connect_3_connect()
lib/core-net/client/client4.c: lws_client_connect_4_established()
Move a couple of other functions from there that don't belong out to
tls-client.c and client-http.c, which is related to http and remains
in the http role dir.
We can't get here without testing for COLON_PATH existing in http2.c as part of
the h2spec pass code.
if (!lws_hdr_total_length(h2n->swsi, WSI_TOKEN_HTTP_COLON_PATH) ||
!lws_hdr_total_length(h2n->swsi, WSI_TOKEN_HTTP_COLON_METHOD) ||
!lws_hdr_total_length(h2n->swsi, WSI_TOKEN_HTTP_COLON_SCHEME) ||
lws_hdr_total_length(h2n->swsi, WSI_TOKEN_HTTP_COLON_STATUS) ||
lws_hdr_extant(h2n->swsi, WSI_TOKEN_CONNECTION)) {
lws_h2_goaway(wsi, H2_ERR_PROTOCOL_ERROR,
"Pseudoheader checks");
break;
}
So there is no issue. But show Coverity what it wants so we don't keep getting this
false positive reported by different coverity users.
Currently we always reserve a fakewsi per pt so events that don't have a related actual
wsi, like vhost-protocol-init or vhost cert init via protocol callback can make callbacks
that look reasonable to user protocol handler code expecting a valid wsi every time.
This patch splits out stuff that user callbacks often unconditionally expect to be in
a wsi, like context pointer, vhost pointer etc into a substructure, which is composed
into struct lws at the top of it. Internal references (struct lws is opaque, so there
are only internal references) are all updated to go via the substructre, the compiler
should make that a NOP.
Helpers are added when fakewsi is used and referenced.
If not PLAT_FREERTOS, we continue to provide a full fakewsi in the pt as before,
although the helpers improve consistency by zeroing down the substructure. There is
a huge amount of user code out there over the last 10 years that did not always have
the minimal examples to follow, some of it does some unexpected things.
If it is PLAT_FREERTOS, that is a newer thing in lws and users have the benefit of
being able to follow the minimal examples' approach. For PLAT_FREERTOS we don't
reserve the fakewsi in the pt any more, saving around 800 bytes. The helpers then
create a struct lws_a (the substructure) on the stack, zero it down (but it is only
like 4 pointers) and prepare it with whatever we know like the context.
Then we cast it to a struct lws * and use it in the user protocol handler call.
In this case, the remainder of the struct lws is undefined. However the amount of
old protocol handlers that might touch things outside of the substructure in
PLAT_FREERTOS is very limited compared to legacy lws user code and the saving is
significant on constrained devices.
User handlers should not be touching everything in a wsi every time anyway, there
are several cases where there is no valid wsi to do the call with. Dereference of
things outside the substructure should only happen when the callback reason shows
there is a valid wsi bound to the activity (as in all the minimal examples).
This leads to problems at the moment with sticky mux.requested_POLLOUT
causing writeable to not be sent.
Remove it and always set writeable on parents for now.
This adds support for POST in both h1 and h2 queues / stream binding.
The previous queueing tried to keep the "leader" wsi who made the
actual connection around and have it act on the transaction queue
tail if it had done its own thing.
This refactors it so instead, who is the "leader" moves down the
queue and the queued guys inherit the fd, SSL * and queue from the
old leader as they take over.
This lets them operate in their own wsi identity directly and gets
rid of all the "effective wsi" checks, which was applied incompletely
and getting out of hand considering the separate lws_mux checks for
h2 and other muxed protocols alongside it.
This change also allows one wsi at a time to own the transaction for
POST. --post is added as an option to lws-minimal-http-client-multi
and 6 extra selftests with POST on h1/h2, pipelined or not and
staggered or not are added to the CI.
Add selectable event lib support to minimal-http-client-multi and
clean up context destroy flow so we can use lws_destroy_context() from
inside the callback to indicate we want to end the event loop, without
using the traditional "interrupted" flag and in a way that works no
matter which event loop backend is being used.
This provides support to build lws using the linkit 7697 public SDK
from here https://docs.labs.mediatek.com/resource/mt7687-mt7697/en/downloads
This toolchain has some challenges, its int32_t / uint32_t are long,
so assumptions about format strings for those being %u / %d / %x all
break. This fixes all the cases for the features enabled by the
default cmake settings.
There are some minor public api type improvements rather than cast everywhere
inside lws and user code to work around them... these changed from int to
size_t
- lws_buflist_use_segment() return
- lws_tokenize_t .len and .token_len
- lws_tokenize_cstr() length
- lws_get_peer_simple() namelen
- lws_get_peer_simple_fd() namelen, int fd -> lws_sockfd_type fd
- lws_write_numeric_address() len
- lws_sa46_write_numeric_address() len
These changes are typically a NOP for user code
This changes the approach of tx credit management to set the
initial stream tx credit window to zero. This is the only way
with RFC7540 to gain the ability to selectively precisely rx
flow control incoming streams.
At the time the headers are sent, a WINDOW_UPDATE is sent with
the initial tx credit towards us for that specific stream. By
default, this acts as before with a 256KB window added for both
the stream and the nwsi, and additional window management sent
as stuff is received.
It's now also possible to set a member in the client info
struct and a new option LCCSCF_H2_MANUAL_RXFLOW to precisely
manage both the initial tx credit for a specific stream and
the ongoing rate limit by meting out further tx credit
manually.
Add another minimal example http-client-h2-rxflow demonstrating how
to force a connection's peer's initial budget to transmit to us
and control it during the connection lifetime to restrict the amount
of incoming data we have to buffer.
This should be a NOP for h2 support and only affects internal
apis. But it lets us reuse the working and reliable h2 mux
arrangements directly in other protocols later, and share code
so building for h2 + new protocols can take advantage of common
mux child handling struct and code.
Break out common mux handling struct into its own type.
Convert all uses of members that used to be in wsi->h2 to wsi->mux
Audit all references to the members and break out generic helpers
for anything that is useful for other mux-capable protocols to
reuse wsi->mux related features.
This affects max header size since we use the latter half
of the pt_serv_buf to prepare the (possibly huge) auth token.
Adapt the pt_serv_buf_size in the hugeurl example.
Refactor everything around ping / pong handling in ws and h2, so there
is instead a protocol-independent validity lws_sul tracking how long it
has been since the last exchange that confirms the operation of the
network connection in both directions.
Clean out periodic role callback and replace the last two role users
with discrete lws_sul for each pt.
It was already correct but add helpers to isolate and deduplicate
processing adding and closing a generically immortal stream.
Change the default 31s h2 network connection timeout to be settable
by .keepalive_timeout if nonzero.
Add a public api allowing a client h2 stream to transition to
half-closed LOCAL (by sending a 0-byte DATA with END_STREAM) and
mark itself as immortal to create a read-only long-poll stream
if the server allows it.
Add a vhost server option flag LWS_SERVER_OPTION_VH_H2_HALF_CLOSED_LONG_POLL
which allows the vhost to treat half-closed remotes as immortal long
poll streams.
With http, the protocol doesn't indicate where the headers end and the
next transaction or body begin. Until now, we handled that for client
header response parsing by reading from the tls buffer bytewise.
This modernizes the code to read in up to 256-byte chunks and parse
the chunks in one hit (the parse API is already set up for doing this
elsewhere).
Now we have a generic input buflist, adapt the parser loop to go through
that and arrange that any leftovers are placed on there.
Improve the code around stash, getting rid of the strdups for a net
code reduction. Remove the special destroy helper for stash since
it becomes a one-liner.
Trade several stack allocs in the client reset function for a single
sized brief heap alloc to reduce peak stack alloc by around 700 bytes.