This big patch replaces the malloc / realloc per header
approach used until now with a single three-level struct
that gets malloc'd during the header union phase and freed
in one go when we transition to a different union phase.
It's more expensive in that we malloc a bit more than 4Kbytes,
but it's a lot cheaper in terms of malloc, frees, heap fragmentation,
no reallocs, nothing to configure. It also moves from arrays of
pointers (8 bytes on x86_64) to unsigned short offsets into the
data array, (2 bytes on all platforms).
The 3-level thing is all in one struct
- array indexed by the header enum, pointing to first "fragment" index
(ie, header type to fragment lookup, or 0 for none)
- array of fragments indexes, enough for 2 x the number of known headers
(fragment array... note that fragments can point to a "next"
fragment if the same header is spread across multiple entries)
- linear char array where the known header payload gets written
(fragments point into null-terminated strings stored in here,
only the known header content is stored)
http headers can legally be split over multiple headers of the same
name which should be concatenated. This scheme does not linearly
conatenate them but uses a linked list in the fragment structs to
link them. There are apis to get the total length and copy out a
linear, concatenated version to a buffer.
Signed-off-by: Andy Green <andy.green@linaro.org>
A new protocol member is defined that controls the size of rx
buffer allocation per connection. For compatibility 0 size
allocates 4096, but you should adapt your protocol definition
array in the user code to declare an appropriate value.
See the changelog for more detail.
The advantage is the rx frame buffer size is now tailored to
what is expected from the protocol, rather than being fixed
to a default of 4096. If your protocol only sends frames of
a dozen bytes this allows you to only allocate an rx frame
buffer of the same size.
For example the per-connection allocation (excluding headers)
for the test server fell from ~4500 to < 750 bytes with this.
Signed-off-by: Andy Green <andy.green@linaro.org>
Libwebsockets is fundamentally singlethreaded... the existence of the
fork and broadcast support, especially in the sample server is
giving the wrong idea about how to use it.
This replaces broadcast in the sample server with
libwebsocket_callback_on_writable_all_protocol(). The whole idea of
'broadcast' is removed.
All of the broadcast proxy stuff is removed: data must now be sent
from the callback only. Doing othherwise is not reliable since the
service loop may close the socket and free the wsi at any time,
invalidating a wsi pointer held by another thread (don't do that!)
Likewise the confirm_legit_wsi api added recently does not help the
other thread case, since if the wsi has been freed dereferencing the
wsi to study if it is legit or not will segfault in that case. So
this is removed too.
The overall effect is to push user code to only operate inside the
protocol callbacks or external poll loops, ie, single thread context.
Signed-off-by: Andy Green <andy.green@linaro.org>
This puts some numbers of library size with the various --without
and --disable options and about dynamic memory allocation
performance
Signed-off-by: Andy Green <andy.green@linaro.org>
The new --without-extensions config flag completely removes all code
and data related to extensions from the build throughout the library
when given.
Signed-off-by: Andy Green <andy.green@linaro.org>