role ops are usually only sparsely filled, there are currently 20
function pointers but several roles only fill in two. No single
role has more than 14 of the ops. On a 32/64 bit build this part
of the ops struct takes a fixed 80 / 160 bytes then.
First reduce the type of the callback reason part from uint16_t to
uint8_t, this saves 12 bytes unconditionally.
Change to a separate function pointer array with a nybble index
array, it costs 10 bytes for the index and a pointer to the
separate array, for 32-bit the cost is
2 + (4 x ops_used)
and for 64-bit
6 + (8 x ops_used)
for 2 x ops_used it means 32-bit: 10 vs 80 / 64-bit: 22 vs 160
For a typical system with h1 (9), h2 (14), listen (2), netlink (2),
pipe (1), raw_skt (3), ws (12), == 43 ops_used out of 140, it means
the .rodata for this reduced from 32-bit: 560 -> 174 (386 byte
saving) and 64-bit: 1120 -> 350 (770 byte saving)
This doesn't account for the changed function ops calling code, two
ways were tried, a preprocessor macro and explicit functions
For an x86_64 gcc 10 build with most options, release mode,
.text + .rodata
before patch: 553282
accessor macro: 552714 (568 byte saving)
accessor functions: 553674 (392 bytes worse than without patch)
therefore we went with the macros
RFC6724 defines an ipv6-centric DNS result sorting algorithm, that
takes route and source address route information for the results
given by the DNS resolution, and sorts them in order of preferability,
which defines the order they should be tried in.
If LWS_WITH_NETLINK, then lws takes care about collecting and monitoring
the interface, route and source address information, and uses it to
perform the RFC6724 sorting to re-sort the DNS before trying to make
the connections.
Some general debugging advice but also really clarify the official way of how to dump
what is going out and coming in directly from the tls tunnel, so you can see the
actual data unencrypted.
Add initial support for defining servers using Secure Streams
policy and api semantics.
Serving h1, h2 and ws should be functional, the new minimal
example shows a combined http + SS server with an incrementing
ws message shown in the browser over tls, in around 200 lines
of user code.
NOP out anything to do with plugins, they're not currently used.
Update the docs correspondingly.
Currently we always reserve a fakewsi per pt so events that don't have a related actual
wsi, like vhost-protocol-init or vhost cert init via protocol callback can make callbacks
that look reasonable to user protocol handler code expecting a valid wsi every time.
This patch splits out stuff that user callbacks often unconditionally expect to be in
a wsi, like context pointer, vhost pointer etc into a substructure, which is composed
into struct lws at the top of it. Internal references (struct lws is opaque, so there
are only internal references) are all updated to go via the substructre, the compiler
should make that a NOP.
Helpers are added when fakewsi is used and referenced.
If not PLAT_FREERTOS, we continue to provide a full fakewsi in the pt as before,
although the helpers improve consistency by zeroing down the substructure. There is
a huge amount of user code out there over the last 10 years that did not always have
the minimal examples to follow, some of it does some unexpected things.
If it is PLAT_FREERTOS, that is a newer thing in lws and users have the benefit of
being able to follow the minimal examples' approach. For PLAT_FREERTOS we don't
reserve the fakewsi in the pt any more, saving around 800 bytes. The helpers then
create a struct lws_a (the substructure) on the stack, zero it down (but it is only
like 4 pointers) and prepare it with whatever we know like the context.
Then we cast it to a struct lws * and use it in the user protocol handler call.
In this case, the remainder of the struct lws is undefined. However the amount of
old protocol handlers that might touch things outside of the substructure in
PLAT_FREERTOS is very limited compared to legacy lws user code and the saving is
significant on constrained devices.
User handlers should not be touching everything in a wsi every time anyway, there
are several cases where there is no valid wsi to do the call with. Dereference of
things outside the substructure should only happen when the callback reason shows
there is a valid wsi bound to the activity (as in all the minimal examples).
When lws_write as many bytes as user can until function returns not all sent,
the next user`s lws_write call will write wrong frame to the other end. This
will cause connection be close by the other side.
Secure Streams is an optional layer on top of lws that separates policy
like endpoint selection and tls cert validation into a device JSON
policy document.
Code that wants to open a client connection just specifies a streamtype name,
and no longer deals with details like the endpoint, the protocol (!) or anything
else other than payloads and optionally generic metadata; the JSON policy
contains all the details for each streamtype. h1, h2, ws and mqtt client
connections are supported.
Logical secure streams outlive any particular connection and supports "nailed-up"
connectivity regardless of underlying connection stability.
This adds support for POST in both h1 and h2 queues / stream binding.
The previous queueing tried to keep the "leader" wsi who made the
actual connection around and have it act on the transaction queue
tail if it had done its own thing.
This refactors it so instead, who is the "leader" moves down the
queue and the queued guys inherit the fd, SSL * and queue from the
old leader as they take over.
This lets them operate in their own wsi identity directly and gets
rid of all the "effective wsi" checks, which was applied incompletely
and getting out of hand considering the separate lws_mux checks for
h2 and other muxed protocols alongside it.
This change also allows one wsi at a time to own the transaction for
POST. --post is added as an option to lws-minimal-http-client-multi
and 6 extra selftests with POST on h1/h2, pipelined or not and
staggered or not are added to the CI.
There are some minor public api type improvements rather than cast everywhere
inside lws and user code to work around them... these changed from int to
size_t
- lws_buflist_use_segment() return
- lws_tokenize_t .len and .token_len
- lws_tokenize_cstr() length
- lws_get_peer_simple() namelen
- lws_get_peer_simple_fd() namelen, int fd -> lws_sockfd_type fd
- lws_write_numeric_address() len
- lws_sa46_write_numeric_address() len
These changes are typically a NOP for user code
This should be a NOP for h2 support and only affects internal
apis. But it lets us reuse the working and reliable h2 mux
arrangements directly in other protocols later, and share code
so building for h2 + new protocols can take advantage of common
mux child handling struct and code.
Break out common mux handling struct into its own type.
Convert all uses of members that used to be in wsi->h2 to wsi->mux
Audit all references to the members and break out generic helpers
for anything that is useful for other mux-capable protocols to
reuse wsi->mux related features.
Remove LWS_LATENCY.
Add the option LWS_WITH_DETAILED_LATENCY, allowing lws to collect very detailed
information on every read and write, and allow the user code to provide
a callback to process events.