This adds some new objects and helpers for keeping and logging
info on grouped allocations, a group is, eg, SS handles or client
wsis.
Allocated objects get a context-unique "tag" string intended to replace
%p / wsi pointers etc. Pointers quickly become confusing when
allocations are freed and reused, the tag string won't repeat
until you produce 2^64 objects in a context.
In addition the tag string documents the object group, with prefixes
like "wsi-" or "vh-" and contain object-specific additional
information like the vhost name, address / port or the role of the wsi.
At creation time the lws code can use a format string and args
to add whatever group-specific info makes sense, eg, a wsi bound
to a secure stream can also append the guid of the secure stream,
it's copied into the new object tag and so is still available
cleanly after the stream is destroyed if the wsi outlives it.
A few different places want to create wsis and basically repeat their
own versions of the flow. Let's unify it into one helper in wsi.c
Also require the context lock held (this only impacts LWS_MAX_SMP > 1)
role ops are usually only sparsely filled, there are currently 20
function pointers but several roles only fill in two. No single
role has more than 14 of the ops. On a 32/64 bit build this part
of the ops struct takes a fixed 80 / 160 bytes then.
First reduce the type of the callback reason part from uint16_t to
uint8_t, this saves 12 bytes unconditionally.
Change to a separate function pointer array with a nybble index
array, it costs 10 bytes for the index and a pointer to the
separate array, for 32-bit the cost is
2 + (4 x ops_used)
and for 64-bit
6 + (8 x ops_used)
for 2 x ops_used it means 32-bit: 10 vs 80 / 64-bit: 22 vs 160
For a typical system with h1 (9), h2 (14), listen (2), netlink (2),
pipe (1), raw_skt (3), ws (12), == 43 ops_used out of 140, it means
the .rodata for this reduced from 32-bit: 560 -> 174 (386 byte
saving) and 64-bit: 1120 -> 350 (770 byte saving)
This doesn't account for the changed function ops calling code, two
ways were tried, a preprocessor macro and explicit functions
For an x86_64 gcc 10 build with most options, release mode,
.text + .rodata
before patch: 553282
accessor macro: 552714 (568 byte saving)
accessor functions: 553674 (392 bytes worse than without patch)
therefore we went with the macros
If we connect out to an IP address, or we adopt a connected socket,
from now on we want to hold the peer sockaddr in the wsi.
Adapt ACCESS_LOG to use this new copy rather than keep the
stringified version.
Event lib support as it has been isn't scaling well, at the low level
libevent and libev headers have a namespace conflict so they can't
both be built into the same image, and at the distro level, binding
all the event libs to libwebsockets.so makes a bloaty situation for
packaging, lws will drag in all the event libs every time.
This patch implements the plan discussed here
https://github.com/warmcat/libwebsockets/issues/1980
and refactors the event lib support so they are built into isolated
plugins and bound at runtime according to what the application says
it wants to use. The event lib plugins can be packaged individually
so that only the needed sets of support are installed (perhaps none
of them if the user code is OK with the default poll() loop). And
dependent user code can mark the specific event loop plugin package
as required so pieces are added as needed.
The eventlib-foreign example is also refactored to build the selected
lib support isolated.
A readme is added detailing the changes and how to use them.
https://libwebsockets.org/git/libwebsockets/tree/READMEs/README.event-libs.md
Currently we always reserve a fakewsi per pt so events that don't have a related actual
wsi, like vhost-protocol-init or vhost cert init via protocol callback can make callbacks
that look reasonable to user protocol handler code expecting a valid wsi every time.
This patch splits out stuff that user callbacks often unconditionally expect to be in
a wsi, like context pointer, vhost pointer etc into a substructure, which is composed
into struct lws at the top of it. Internal references (struct lws is opaque, so there
are only internal references) are all updated to go via the substructre, the compiler
should make that a NOP.
Helpers are added when fakewsi is used and referenced.
If not PLAT_FREERTOS, we continue to provide a full fakewsi in the pt as before,
although the helpers improve consistency by zeroing down the substructure. There is
a huge amount of user code out there over the last 10 years that did not always have
the minimal examples to follow, some of it does some unexpected things.
If it is PLAT_FREERTOS, that is a newer thing in lws and users have the benefit of
being able to follow the minimal examples' approach. For PLAT_FREERTOS we don't
reserve the fakewsi in the pt any more, saving around 800 bytes. The helpers then
create a struct lws_a (the substructure) on the stack, zero it down (but it is only
like 4 pointers) and prepare it with whatever we know like the context.
Then we cast it to a struct lws * and use it in the user protocol handler call.
In this case, the remainder of the struct lws is undefined. However the amount of
old protocol handlers that might touch things outside of the substructure in
PLAT_FREERTOS is very limited compared to legacy lws user code and the saving is
significant on constrained devices.
User handlers should not be touching everything in a wsi every time anyway, there
are several cases where there is no valid wsi to do the call with. Dereference of
things outside the substructure should only happen when the callback reason shows
there is a valid wsi bound to the activity (as in all the minimal examples).
lejp_parse() return type is an int... but in the function, the temp
for it is a char. This leads to badness that is currently worked
around by casting the return through a signed char type.
But that leads to more badness since if there's >127 bytes of buffer
left after the end of the JSON object, we misreport it.
Bite the bullet and fix the temp type, and fix up all the guys
who were working around it at the caller return casting to use the
resulting straight int.
If you are using this api, remove any casting you may have cut-
and-pasted like this
n = (int)(signed char)lejp_parse(...);
... to just be like this...
n = lejp_parse(...);
If a user sets a default filename for a http mount (.def in lws_http_mount),
eg. 'default.html', then a GET request for '/' correctly forwards to
'/default.html'.
However, without this commit the default filename is not taken into account for subdirectories. Thus,
GET subdir/
will forward to
'subdir/index.html'
instead of the expected
'subdir/default.html'
This commit changes the behavior such that the user-provided default filename is also used for subdirectories.
Secure Streams is an optional layer on top of lws that separates policy
like endpoint selection and tls cert validation into a device JSON
policy document.
Code that wants to open a client connection just specifies a streamtype name,
and no longer deals with details like the endpoint, the protocol (!) or anything
else other than payloads and optionally generic metadata; the JSON policy
contains all the details for each streamtype. h1, h2, ws and mqtt client
connections are supported.
Logical secure streams outlive any particular connection and supports "nailed-up"
connectivity regardless of underlying connection stability.
Headers related to ws or h2 are now elided if the ws or h2 role
is not enabled for build. In addition, a new build-time option
LWS_WITH_HTTP_UNCOMMON_HEADERS on by default allows removal of
less-common http headers to shrink the parser footprint.
Minilex is adapted to produce 8 different versions of the lex
table, chosen at build-time according to which headers are
included in the build.
If you don't need the unusual headers, or aren't using h2 or ws,
this chops down the size of the ah and the rodata needed to hold
the parsing table from 87 strings / pointers to 49, and the
parsing table from 1177 to 696 bytes.
The vfork optimized spawn, stdxxx and terminal handling in the cgi
implementation is quite mature and sophisticated, and useful for
other things unrelated to cgi. Break it out into its own public
api under LWS_WITH_SPAWN, off by default.
Expand it so the parent wsi is optional, and the role and protocol
bindings for stdxxx pipes can be set. Allow optional sul timeout
and external lws_dll2 owner for extant children.
Remove inline style from minimal http-server-cgi
This provides support to build lws using the linkit 7697 public SDK
from here https://docs.labs.mediatek.com/resource/mt7687-mt7697/en/downloads
This toolchain has some challenges, its int32_t / uint32_t are long,
so assumptions about format strings for those being %u / %d / %x all
break. This fixes all the cases for the features enabled by the
default cmake settings.
There are some minor public api type improvements rather than cast everywhere
inside lws and user code to work around them... these changed from int to
size_t
- lws_buflist_use_segment() return
- lws_tokenize_t .len and .token_len
- lws_tokenize_cstr() length
- lws_get_peer_simple() namelen
- lws_get_peer_simple_fd() namelen, int fd -> lws_sockfd_type fd
- lws_write_numeric_address() len
- lws_sa46_write_numeric_address() len
These changes are typically a NOP for user code
This changes the approach of tx credit management to set the
initial stream tx credit window to zero. This is the only way
with RFC7540 to gain the ability to selectively precisely rx
flow control incoming streams.
At the time the headers are sent, a WINDOW_UPDATE is sent with
the initial tx credit towards us for that specific stream. By
default, this acts as before with a 256KB window added for both
the stream and the nwsi, and additional window management sent
as stuff is received.
It's now also possible to set a member in the client info
struct and a new option LCCSCF_H2_MANUAL_RXFLOW to precisely
manage both the initial tx credit for a specific stream and
the ongoing rate limit by meting out further tx credit
manually.
Add another minimal example http-client-h2-rxflow demonstrating how
to force a connection's peer's initial budget to transmit to us
and control it during the connection lifetime to restrict the amount
of incoming data we have to buffer.
This should be a NOP for h2 support and only affects internal
apis. But it lets us reuse the working and reliable h2 mux
arrangements directly in other protocols later, and share code
so building for h2 + new protocols can take advantage of common
mux child handling struct and code.
Break out common mux handling struct into its own type.
Convert all uses of members that used to be in wsi->h2 to wsi->mux
Audit all references to the members and break out generic helpers
for anything that is useful for other mux-capable protocols to
reuse wsi->mux related features.