RFC6724 defines an ipv6-centric DNS result sorting algorithm, that
takes route and source address route information for the results
given by the DNS resolution, and sorts them in order of preferability,
which defines the order they should be tried in.
If LWS_WITH_NETLINK, then lws takes care about collecting and monitoring
the interface, route and source address information, and uses it to
perform the RFC6724 sorting to re-sort the DNS before trying to make
the connections.
This creates a role for RFC3549 Netlink monitoring.
If the OS supports it (currently, linux) then each pt creates a wsi
with the netlink role and dumps the current routing table at pt init.
It then maintains a cache of the routing table in each pt.
Upon routing table changes an SMD message is issued as an event, and
Captive Portal Detection is triggered.
All of the pt's current connections are reassessed for routability under
the changed routing table, those that no longer have a valid route or
gateway are closed.
Currently we always reserve a fakewsi per pt so events that don't have a related actual
wsi, like vhost-protocol-init or vhost cert init via protocol callback can make callbacks
that look reasonable to user protocol handler code expecting a valid wsi every time.
This patch splits out stuff that user callbacks often unconditionally expect to be in
a wsi, like context pointer, vhost pointer etc into a substructure, which is composed
into struct lws at the top of it. Internal references (struct lws is opaque, so there
are only internal references) are all updated to go via the substructre, the compiler
should make that a NOP.
Helpers are added when fakewsi is used and referenced.
If not PLAT_FREERTOS, we continue to provide a full fakewsi in the pt as before,
although the helpers improve consistency by zeroing down the substructure. There is
a huge amount of user code out there over the last 10 years that did not always have
the minimal examples to follow, some of it does some unexpected things.
If it is PLAT_FREERTOS, that is a newer thing in lws and users have the benefit of
being able to follow the minimal examples' approach. For PLAT_FREERTOS we don't
reserve the fakewsi in the pt any more, saving around 800 bytes. The helpers then
create a struct lws_a (the substructure) on the stack, zero it down (but it is only
like 4 pointers) and prepare it with whatever we know like the context.
Then we cast it to a struct lws * and use it in the user protocol handler call.
In this case, the remainder of the struct lws is undefined. However the amount of
old protocol handlers that might touch things outside of the substructure in
PLAT_FREERTOS is very limited compared to legacy lws user code and the saving is
significant on constrained devices.
User handlers should not be touching everything in a wsi every time anyway, there
are several cases where there is no valid wsi to do the call with. Dereference of
things outside the substructure should only happen when the callback reason shows
there is a valid wsi bound to the activity (as in all the minimal examples).
FreeRTOS only supports nonmonotonic time, when we correct it by, eg,
ntpclient, we offset all the existing sul timeouts. This adds an
internal helper function to correct existing sul timeouts by the
step amount, and call it in lws ntpclient implementation when
adjusting the gettimeofday() time.
- Add low level system message distibution framework
- Add support for local Secure Streams to participate using _lws_smd streamtype
- Add apit test and minimal example
- Add SS proxy support for _lws_smd
See minimal-secure-streams-smd README.md
Adapt the pt sul owner list to be an array, and define two different lists,
one that acts like before and is the default for existing users, and another
that has the ability to cooperate with systemwide suspend to restrict the
interval spent suspended so that it will wake in time for the earliest
thing on this wake-suspend sul list.
Clean the api a bit and add lws_sul_cancel() that only needs the sul as the
argument.
Add a flag for client creation info to indicate that this client connection
is important enough that, eg, validity checking it to detect silently dead
connections should go on the wake-suspend sul list. That flag is exposed in
secure streams policy so it can be added to a streamtype with
"swake_validity": true
Deprecate out the old vhost timer stuff that predates sul. Add a flag
LWS_WITH_DEPRECATED_THINGS in cmake so users can get it back temporarily
before it will be removed in a v4.2.
Adapt all remaining in-tree users of it to use explicit suls.
Establish a new distributed CMake architecture with CMake code related to
a source directory moving to be in the subdir in its own CMakeLists.txt.
In particular, there's now one in ./lib which calls through to ones
further down the directory tree like ./lib/plat/xxx, ./lib/roles/xxx etc.
This cuts the main CMakelists.txt from 98KB -> 33KB, about a 66% reduction,
and it's much easier to maintain sub-CMakeLists.txt that are in the same
directory as the sources they manage, and conceal all the details that that
level.
Child CMakelists.txt become responsible for:
- include_directories() definition (this is not supported by CMake
directly, it passes it back up via PARENT_SCOPE vars in helper
macros)
- Addition child CMakeLists.txt inclusion, for example toplevel ->
role -> role subdir
- Source file addition to the build
- Dependent library path resolution... this is now a private thing
in the child CMakeLists.txt, it just passes back any adaptations
to include_directories() and the LIB_LIST without filling the
parent namespace with the details
Trying to use a remote pool is very variable with CI, the builder can
force a local ntpd this way cleanly.
When enabled all the test apps use ntpclient, so this lets us tell them all to
go to the local ntpd in one hit.
Secure Streams is an optional layer on top of lws that separates policy
like endpoint selection and tls cert validation into a device JSON
policy document.
Code that wants to open a client connection just specifies a streamtype name,
and no longer deals with details like the endpoint, the protocol (!) or anything
else other than payloads and optionally generic metadata; the JSON policy
contains all the details for each streamtype. h1, h2, ws and mqtt client
connections are supported.
Logical secure streams outlive any particular connection and supports "nailed-up"
connectivity regardless of underlying connection stability.
Actually we are scheduling the first retry in case nothing comes
back from the server, it won't fail since it will allow at least
one retry, this being udp.
In the case code is composed into a single process, but it isn't monolithic in the
sense it's made up of modular "applications" that are written separate projects,
provide a way for the "applications" to request a callback from the lws event loop
thread context safely.
From the callback the applications can set up their operations on the lws event
loop and drop their own thread.
Since it requires system-specific locking to be threadsafe, provide a non-threadsafe
helper and then indirect the actual usage through a user-defined lws_system ops
function pointer that wraps the unsafe api with the system locking to make it safe.
Remove the auth lws_system stuff and redo it using generic blobs
with separate namespaces. Support pointing to already-in-memory
blobs without using heap as well as multi-fragment appened blobs
eg, parsed out of JSON chunk by chunk and chained in heap.
Support auth the new way, along with client cert + key in DER
namespaces.
Handle the situation that we are told to use a CNAME, but the CNAME is not resolved
by the remote server... adapt the query to resolve the CNAME and restart it, while
retaining the original query name for the cache entry generation.
"Recursion" doesn't mean function-calling-a-function type recursion, it remains
completely asynchronous on the event loop.
Generic lws_system IPv4 DHCP client
- netif and route control via lib/plat apis
- linux plat pieces implemented
- Uses raw ip socket for UDP broadcast and rx
- security-aware
- usual stuff plus up to 4 x dns server
If it's enabled for build, it holds the system
state at DHCP until at least one registered interface
has acquired a set of IP / mask / router / DNS server
It uses PF_PACKET which is Linux-only atm. But those
areas are isolated into plat code.
TODOs
- lease timing and reacquire
- plat pieces for other than Linux
This affects max header size since we use the latter half
of the pt_serv_buf to prepare the (possibly huge) auth token.
Adapt the pt_serv_buf_size in the hugeurl example.
Introduce a generic lws_state object with notification handlers
that may be registered in a chain.
Implement one of those in the context to manage the "system state".
Allow other pieces of lws and user code to register notification
handlers on a context list. Handlers can object to or take over
responsibility to move forward and retry system state changes if
they know that some dependent action must succeed first.
For example if the system time is invalid, we cannot move on to
a state where anything can do tls until that has been corrected.
This adds the option to have lws do its own dns resolution on
the event loop, without blocking. Existing implementations get
the name resolution done by the libc, which is blocking. In
the case you are opening client connections but need to carefully
manage latency, another connection opening and doing the name
resolution becomes a big problem.
Currently it supports
- ipv4 / A records
- ipv6 / AAAA records
- ipv4-over-ipv6 ::ffff:1.2.3.4 A record promotion for ipv6
- only one server supported over UDP :53
- nameserver discovery on linux, windows, freertos
It also has some nice advantages
- lws-style paranoid response parsing
- random unique tid generation to increase difficulty of poisoning
- it's really integrated with the lws event loop, it does not spawn
threads or use the libc resolver, and of course no blocking at all
- platform-specific server address capturing (from /etc/resolv.conf
on linux, windows apis on windows)
- it has LRU caching
- piggybacking (multiple requests before the first completes go on
a list on the first request, not spawn multiple requests)
- observes TTL in cache
- TTL and timeout use lws_sul timers on the event loop
- ipv6 pieces only built if cmake LWS_IPV6 enabled