The header name buffer and its max length handling have actually
been unused since the minilex parser was introduced. We hold
parsing state in the lex-type parts and don't need to store or
worry about max length, since the parser will let us know as
soon as it can't be a match for the valid header names.
This strips it out, reducing the per-connection allocation for
x86_64 with default configure from 224 to 160 bytes.
Signed-off-by: Andy Green <andy.green@linaro.org>
- Define LWS_DLL and LWS_INTERNAL when websockets_shared is compiled.
- The websocket_shared target compiles to websocket.lib / websocket.dll
(websocket.lib contains the exported functions for websocket.dll, and is
the file that is linked to when a program wants to use the dll)
- The websocket target compiles to websocket_static.lib on windows.
- Replaced any "extern" with "LWS_EXTERN" on libwebsockets.h for proper
DLL function exports.
- Created a LIB_LIST with all the libwebsocket dependencies. Instead of
multiple calls to target_link_libraries, only one call is made for both
the static and shared library version. This makes it easy to add other
variants if wanted in the future.
- Added ZLIB as a dependency for the libs, so that the build order will be
correct at all times.
- Added a dependency for the websockets lib to the test apps, so it is
built before them.
- Fixed the test-server-extpoll app to include the emulated_poll, and link
to winsock on Windows.
- Removed the global export of libwebsocket_internal_extensions, and added
a function libwebsocket_get_internal_extensions() that returns it
instead. Using the global would not work with the DLL export on Windows.
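The accessor change in the last item can be sketched roughly as below.
Only the function name libwebsocket_get_internal_extensions() comes from
this change; the struct contents, the table entry and the simplified
LWS_EXTERN definition are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the real export macro (assumption). */
#if defined(_WIN32) && defined(LWS_DLL)
#ifdef LWS_INTERNAL
#define LWS_EXTERN extern __declspec(dllexport)
#else
#define LWS_EXTERN extern __declspec(dllimport)
#endif
#else
#define LWS_EXTERN extern
#endif

/* Placeholder for the real extension struct (assumption). */
struct libwebsocket_extension {
	const char *name;
};

/* The table is private to the library: no exported data symbol. */
static struct libwebsocket_extension internal_extensions[] = {
	{ "example-extension" },	/* hypothetical entry */
	{ NULL }			/* terminator */
};

LWS_EXTERN struct libwebsocket_extension *
libwebsocket_get_internal_extensions(void);

/* Only a function crosses the DLL boundary, never the array itself. */
struct libwebsocket_extension *
libwebsocket_get_internal_extensions(void)
{
	assert(internal_extensions[0].name != NULL);
	return internal_extensions;
}
```

User code then calls the function where it previously referenced the
global directly.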
If the SSL connection failed before the headers came, we were not
dealing with deallocating the header malloc. This takes care of it.
Using CyaSSL, we are then valgrind-clean for ssl client and server.
With OpenSSL, there are 88 bytes lost at init that never changes or
gets recovered. AFAIK there's nothing to do about that.
OpenSSL also blows these during operation
==1059== Conditional jump or move depends on uninitialised value(s)
==1059== at 0x4A0B131: bcmp (mc_replace_strmem.c:935)
==1059== by 0x3014CDDBA8: ??? (in /usr/lib64/libcrypto.so.1.0.1c)
==1059== by 0x3015430852: tls1_enc (in /usr/lib64/libssl.so.1.0.1c)
==1059== by 0x3015428CEC: ssl3_read_bytes (in /usr/lib64/libssl.so.1.0.1c)
==1059== by 0x30154264C5: ??? (in /usr/lib64/libssl.so.1.0.1c)
==1059== by 0x4C3C596: lws_server_socket_service (server.c:153)
==1059== by 0x4C32C1E: libwebsocket_service_fd (libwebsockets.c:927)
==1059== by 0x4C33270: libwebsocket_service (libwebsockets.c:1225)
==1059== by 0x401C84: main (in /usr/bin/libwebsockets-test-server)
However googling around
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/60021
http://www.openssl.org/support/faq.html#PROG13
(also the next FAQ down)
it seems OpenSSL have a relaxed attitude to this and it's expected.
It's interesting CyaSSL works fine but doesn't have that problem...
Signed-off-by: Andy Green <andy.green@linaro.org>
This brings the library sources into compliance with checkpatch
style except for three or four exceptions like WIN32 related stuff
and one long string constant I don't want to break into multiple
sprintf calls.
There should be no functional or compilability change from all
this (hopefully).
Signed-off-by: Andy Green <andy.green@linaro.org>
This removes all the direct wsi members specific to clients,
most of them are moved to being fake headers in the next 3-layer
header scheme, c_port moves to being a member of the u.hdr
unionized struct.
It gets rid of a lot of fiddly mallocs and frees. Although it
adds a small internal API to create the fake headers, the patch
actually deletes more than it adds...
Signed-off-by: Andy Green <andy.green@linaro.org>
This big patch replaces the malloc / realloc per header
approach used until now with a single three-level struct
that gets malloc'd during the header union phase and freed
in one go when we transition to a different union phase.
It's more expensive in that we malloc a bit more than 4Kbytes,
but it's a lot cheaper in terms of malloc, frees, heap fragmentation,
no reallocs, nothing to configure. It also moves from arrays of
pointers (8 bytes on x86_64) to unsigned short offsets into the
data array, (2 bytes on all platforms).
The 3-level thing is all in one struct
- array indexed by the header enum, pointing to first "fragment" index
(ie, header type to fragment lookup, or 0 for none)
- array of fragments indexes, enough for 2 x the number of known headers
(fragment array... note that fragments can point to a "next"
fragment if the same header is spread across multiple entries)
- linear char array where the known header payload gets written
(fragments point into null-terminated strings stored in here,
only the known header content is stored)
http headers can legally be split over multiple headers of the same
name which should be concatenated. This scheme does not linearly
concatenate them but uses a linked list in the fragment structs to
link them. There are apis to get the total length and copy out a
linear, concatenated version to a buffer.
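As a rough sketch of the three-level layout described above (not the
actual struct from the patch): every name, array size and helper below is
an assumption chosen for illustration. It shows the header-to-fragment
lookup, the fragment chain for repeated headers, and the two kinds of
accessor (total length, linear copy).

```c
#include <assert.h>
#include <string.h>

/* All names and sizes below are illustrative assumptions. */
enum { HDR_COUNT = 4, MAX_FRAGS = HDR_COUNT * 2, DATA_SIZE = 256 };

struct frag {
	unsigned short offset;	/* start of this piece in data[] */
	unsigned short len;	/* length, excluding the NUL */
	unsigned char next;	/* next fragment index, 0 = end of chain */
};

struct hdr_store {
	unsigned char frag_index[HDR_COUNT];	/* header -> first frag, 0 = none */
	struct frag frags[MAX_FRAGS + 1];	/* slot 0 is reserved as "none" */
	char data[DATA_SIZE];			/* NUL-terminated payloads */
	unsigned char next_frag;		/* next free fragment slot */
	unsigned short pos;			/* next free byte in data[] */
};

static void hdr_init(struct hdr_store *s)
{
	memset(s, 0, sizeof(*s));
	s->next_frag = 1;
}

/* Append one occurrence of header h; returns 0 on success, -1 if full. */
static int hdr_add(struct hdr_store *s, int h, const char *val)
{
	size_t len = strlen(val);
	unsigned char n, f;

	assert(h >= 0 && h < HDR_COUNT);
	if (s->next_frag > MAX_FRAGS || s->pos + len + 1 > (size_t)DATA_SIZE)
		return -1;

	n = s->next_frag++;
	s->frags[n].offset = s->pos;
	s->frags[n].len = (unsigned short)len;
	s->frags[n].next = 0;
	memcpy(&s->data[s->pos], val, len + 1);
	s->pos = (unsigned short)(s->pos + len + 1);

	if (!s->frag_index[h]) {
		s->frag_index[h] = n;
		return 0;
	}
	/* same header seen again: link onto the end of its chain */
	for (f = s->frag_index[h]; s->frags[f].next; f = s->frags[f].next)
		;
	s->frags[f].next = n;
	return 0;
}

/* Total payload length across every fragment of header h. */
static int hdr_total_length(const struct hdr_store *s, int h)
{
	int len = 0;
	unsigned char f;

	for (f = s->frag_index[h]; f; f = s->frags[f].next)
		len += s->frags[f].len;
	return len;
}

/* Copy out a linear, concatenated version; returns length or -1. */
static int hdr_copy(const struct hdr_store *s, int h, char *dest, int max)
{
	int n = 0;
	unsigned char f;

	if (hdr_total_length(s, h) >= max)
		return -1;
	for (f = s->frag_index[h]; f; f = s->frags[f].next) {
		memcpy(dest + n, &s->data[s->frags[f].offset], s->frags[f].len);
		n += s->frags[f].len;
	}
	dest[n] = '\0';
	return n;
}
```

Because everything lives in one struct, a single malloc and a single free
cover the whole header phase.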
Signed-off-by: Andy Green <andy.green@linaro.org>
Also max protocols to 5 (overridable by configure) and max extensions
from 10 to 3 by default (also overridable by configure).
wsi is down to 608 on x86_64 with this.
Signed-off-by: Andy Green <andy.green@linaro.org>
A new protocol member is defined that controls the size of rx
buffer allocation per connection. For compatibility 0 size
allocates 4096, but you should adapt your protocol definition
array in the user code to declare an appropriate value.
See the changelog for more detail.
The advantage is the rx frame buffer size is now tailored to
what is expected from the protocol, rather than being fixed
to a default of 4096. If your protocol only sends frames of
a dozen bytes this allows you to only allocate an rx frame
buffer of the same size.
For example the per-connection allocation (excluding headers)
for the test server fell from ~4500 to < 750 bytes with this.
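A minimal sketch of the sizing rule, assuming a cut-down protocol struct
(the struct, member and helper names here are hypothetical and are not
the library's own definitions):

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_RX_BUFFER 4096	/* compatibility default from the text */

/* Cut-down, hypothetical protocol definition. */
struct protocol_def {
	const char *name;
	size_t rx_buffer_size;	/* 0 means "use the 4096 default" */
};

/* Size of the rx frame buffer to allocate for one connection. */
static size_t rx_alloc_size(const struct protocol_def *p)
{
	assert(p != NULL);
	return p->rx_buffer_size ? p->rx_buffer_size : DEFAULT_RX_BUFFER;
}

static const struct protocol_def protocols[] = {
	{ "legacy-protocol", 0 },	/* old-style entry, gets 4096 */
	{ "tiny-protocol", 12 },	/* only sends frames of a dozen bytes */
};
```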
Signed-off-by: Andy Green <andy.green@linaro.org>
This gets rid of the stack buffer while serving files, and the
PATH_MAX char array that used to hold the filepath in the wsi.
It holds an extra file descriptor open while serving the file,
however it attempts to stuff the socket with as much of the
file as it can take. For files of a few KB, that typically
completes (without blocking) in the call to
libwebsockets_serve_http_file() and then closes the file
descriptor before returning.
Signed-off-by: Andy Green <andy.green@linaro.org>
This reduces the size of struct libwebsocket from 4840 to 4552
on x86_64.
There are also big benefits on malloc pool fragmentation and
allocation, the header allocations only exist between the first
peer communication and websocket connection establishment for
both server and client.
Signed-off-by: Andy Green <andy.green@linaro.org>
- For some reason the "extern int pid_daemon" usage in libwebsockets.c would cause an "undefined symbols" linker error for the test-apps. This only happens with the CMake project, not the normal Makefiles. I have no clue why this is. Fixed it by getting the pid via a function instead.
- Added test-server-extpoll
- Renamed the library from libwebsocket -> libwebsockets
- Finalized CMake support (tested on windows only so far).
- Uses a generated lws_config.h that is included in
private-libwebsocket to pass defines, only used if CMAKE_BUILD is set.
- Support for SSL on Windows.
- Initial support for CyaSSL replacement of OpenSSL (This has been added
to my older CMake-fork but haven't been tested on this version yet).
- Fixed windows build (see below for details).
- Fixed at least the 32-bit Debug build for the existing Visual Studio
Project. (Not too keen on fixing all the others when we have CMake support
anyway (which can generate much better project files)...)
- BUGFIXES:
- handshake.c
- used C99 definition of handshake_0405 function
- libwebsocket.c
- syslog not available on windows, put in ifdefs.
- Fixed previous known crash bug on Windows where WSAPoll in
Ws2_32.dll would not be present, causing the poll function pointer
being set to NULL.
- Uninitialized variable context->listen_service_extraseen would
result in stack overflow because of infinite recursion. Fixed by
initializing in libwebsocket_create_context
- SO_REUSEADDR means something different on Windows compared to Unix.
- Setting a socket to nonblocking is done differently on Windows.
(This should probably be broken out into a helper function instead)
- lwsl_emit_syslog -> lwsl_emit_stderr on Windows.
- private-libwebsocket.h
- PATH_MAX is not available on Windows, define as MAX_PATH
- Always define LWS_NO_DAEMONIZE on windows.
- Don't define lws_latency as an inline that does nothing. inline is not
supported by the Microsoft compiler, replaced with an empty define
instead. (It's __inline in MSVC)
- server.c
- Fixed nonblock call on windows
- test-ping.c
- Don't use C99 features (Microsoft compiler does not support it).
- Move non-win32 headers into ifdefs.
- Skip use of sighandler on Windows.
- test-server.c
- ifdef syslog parts on Windows.
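The helper suggested in the libwebsocket.c notes above for nonblocking
setup could look roughly like this. The function name is hypothetical and
this is only a sketch of the idea, not code from this commit; it uses
ioctlsocket() with FIONBIO on Windows and fcntl() with O_NONBLOCK
elsewhere.

```c
#include <assert.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <fcntl.h>
#endif

/* Hypothetical helper; returns 0 on success, nonzero on failure. */
static int set_socket_nonblocking(int fd)
{
#ifdef _WIN32
	u_long on = 1;

	assert(fd >= 0);
	return ioctlsocket((SOCKET)fd, FIONBIO, &on);
#else
	int flags;

	assert(fd >= 0);
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
#endif
}
```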
Libwebsockets is fundamentally singlethreaded... the existence of the
fork and broadcast support, especially in the sample server is
giving the wrong idea about how to use it.
This replaces broadcast in the sample server with
libwebsocket_callback_on_writable_all_protocol(). The whole idea of
'broadcast' is removed.
All of the broadcast proxy stuff is removed: data must now be sent
from the callback only. Doing otherwise is not reliable since the
service loop may close the socket and free the wsi at any time,
invalidating a wsi pointer held by another thread (don't do that!)
Likewise the confirm_legit_wsi api added recently does not help the
other thread case, since if the wsi has been freed dereferencing the
wsi to study if it is legit or not will segfault in that case. So
this is removed too.
The overall effect is to push user code to only operate inside the
protocol callbacks or external poll loops, ie, single thread context.
Signed-off-by: Andy Green <andy.green@linaro.org>
Comes in handy if the original application poll loop is the boss,
in this case libwebsockets is optional and can't be the boss poll
loop
Requested-by: ajandhyala@wms.com
Signed-off-by: Andy Green <andy.green@linaro.org>
Large chunks of struct libwebsocket members actually have a mutually
exclusive lifecycle, eg, once the http headers are finished they sit
there unused until the instance is destroyed.
This makes a big improvement in memory efficiency by making four
categories of member: always needed, needed for header processing,
needed for http processing, and needed for ws processing. The last
three are mutually exclusive and bound into a union inside the wsi.
Care needs taking now at "union transitions": although we zeroed down
the struct at init, the other union siblings have been writing the
same memory by the time later member siblings start to use it. So
it must be cleared down appropriately when we cross from one
mutually-exclusive use to another.
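A sketch of the idea, with made-up member names (the real struct has many
more members and its layout is not reproduced here): the always-needed
members sit outside the union, and a transition helper clears the shared
memory before the next phase uses it.

```c
#include <assert.h>
#include <string.h>

/* Member names are illustrative assumptions, not the library's layout. */
enum wsi_mode { MODE_HDR, MODE_HTTP, MODE_WS };

struct wsi {
	/* always needed */
	int sock;
	enum wsi_mode mode;

	/* mutually exclusive: only one of these is live at any time */
	union {
		struct { int parser_state; int current_token; } hdr;
		struct { int fd; unsigned long filepos; } http;
		struct { unsigned char *rx_user_buffer; int rx_len; } ws;
	} u;
};

/*
 * Cross from one union phase to another. The previous phase has been
 * writing the same memory, so clear the whole union before the new
 * members are used.
 */
static void wsi_union_transition(struct wsi *wsi, enum wsi_mode mode)
{
	assert(wsi != NULL);
	memset(&wsi->u, 0, sizeof(wsi->u));
	wsi->mode = mode;
}
```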
Signed-off-by: Andy Green <andy.green@linaro.org>
Since v13 was defined as the released ietf version the older versions
are deprecated. This patch strips out everything to do with the older
versions and gets rid of the option to send stuff unmasked.
The in-tree md5 implementation is then also deleted as nothing needs
it any more, 1280 loc are shed in all
Signed-off-by: Andy Green <andy.green@linaro.org>
The whole thing about count_protocols + 1 broadcast sockets and
associated dummy wsis is a workaround for getting a broadcast from
a different process context, if we are running with --enable-no-fork
then we don't need any of it in.
Signed-off-by: Andy Green <andy.green@linaro.org>
The new --without-extensions config flag completely removes all code
and data related to extensions from the build throughout the library
when given.
Signed-off-by: Andy Green <andy.green@linaro.org>
Move server-only stuff into their own files and make building
that depend on not having --without-server on the configure
Make fragments in other places conditional as well
Remove client-related members from struct libwebsocket when
building LWS_NO_CLIENT
Apps:
normal: build test server, client, fraggle, ping
--without-client: build test server
--without-server: build test client, ping
Signed-off-by: Andy Green <andy.green@linaro.org>
Profiling what happens during the ab test, one of the hotspots
was strcasecmp in a loop looking for header name matches each time.
This patch introduces a lexical parser that creates a state machine
in 276 bytes that encodes all the known header names. The fsm is
walked bytewise as characters come in... most states do not need any
recursion to match or fail.
The state machine output is cut-and-pasted into parsers.c as an
unsigned char array.
The fsm generator is a bit rough and ready, included in the tree but
not built since normal mortals won't need to touch it.
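To illustrate the bytewise walk only: the sketch below builds a small
trie at runtime instead of using the generated unsigned char table, so it
is a different and less compact technique than the one in the patch. All
names are hypothetical. Header names are stored with their trailing ':'
so that a match is only reported when the whole name has arrived.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum { MAX_NODES = 64, NO_MATCH = -1, PENDING = -2 };

struct node {
	char c;			/* lowercase character that leads here */
	short child, sibling;	/* first child / next alternative, 0 = none */
	short token;		/* header index if a name ends here, else -1 */
};

static struct node trie[MAX_NODES];	/* node 0 is the root */
static short node_count = 1;

/* Add one known header name (including its ':') to the trie. */
static void lex_add(const char *name, short token)
{
	short cur = 0, n;

	for (; *name; name++) {
		char c = (char)tolower((unsigned char)*name);

		for (n = trie[cur].child; n && trie[n].c != c;
		     n = trie[n].sibling)
			;
		if (!n) {
			assert(node_count < MAX_NODES);
			n = node_count++;
			trie[n].c = c;
			trie[n].token = -1;
			trie[n].sibling = trie[cur].child;
			trie[cur].child = n;
		}
		cur = n;
	}
	trie[cur].token = token;
}

/*
 * Feed one byte; *state starts at 0 for each header line.
 * Returns the token on a complete match, PENDING while still
 * matching, or NO_MATCH as soon as no known name can fit.
 */
static int lex_step(short *state, char in)
{
	char c = (char)tolower((unsigned char)in);
	short n;

	for (n = trie[*state].child; n && trie[n].c != c; n = trie[n].sibling)
		;
	if (!n)
		return NO_MATCH;
	*state = n;
	return trie[n].token >= 0 ? trie[n].token : PENDING;
}
```

No per-header string compare is needed: each incoming byte costs one
short scan of the current node's alternatives.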
Signed-off-by: Andy Green <andy.green@linaro.org>
Problems with rx flow control implementation were the underlying cause
of the connection stalling issue that was covered up with the udelay()
patch that was removed recently.
This gets rx flow control working properly and corrects problems with
fifo management in the test server mirror protocol code too.
The rxflow control api has been changed to just set a flag, so it's very cheap
to call from user code. After the callbacks that might use the rxflow control
api the flag is checked and any pending actions done.
rx flow control now stops any rx packet coming immediately; with compressed
connections, "just what was left in the pipe" might be hundreds of KBytes. To
implement that, the current packet being decoded is copied into a malloc'd buffer
by the rx processing code now.
When rxflow is allowed to come again, the buffer is drained and freed before any
new packet content is accepted.
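The flag-then-apply scheme might be sketched like this. The struct and
function names are hypothetical, and the "drain" step simply copies the
saved bytes out to a caller buffer where the real code would feed them
back through rx processing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down connection state. */
struct conn {
	int rx_enabled;		/* state currently applied by the service loop */
	int rxflow_pending;	/* 1 when user code requested a change */
	int rxflow_wanted;	/* the requested state */
	unsigned char *rxflow_buf;	/* rest of the packet being decoded */
	size_t rxflow_len;
};

/* Cheap user-facing call: only records the request. */
static void rx_flow_control(struct conn *c, int enable)
{
	c->rxflow_wanted = enable;
	c->rxflow_pending = 1;
}

/*
 * Run after user callbacks; rest/len is the undelivered part of the
 * current packet. Returns 0 on success, -1 on allocation failure.
 */
static int rx_flow_apply(struct conn *c, const unsigned char *rest, size_t len)
{
	if (!c->rxflow_pending)
		return 0;
	c->rxflow_pending = 0;
	c->rx_enabled = c->rxflow_wanted;
	if (c->rx_enabled || !len)
		return 0;
	assert(c->rxflow_buf == NULL);
	c->rxflow_buf = malloc(len);
	if (!c->rxflow_buf)
		return -1;
	memcpy(c->rxflow_buf, rest, len);
	c->rxflow_len = len;
	return 0;
}

/* Once rx is enabled again: hand back the saved bytes, free the buffer. */
static size_t rx_flow_drain(struct conn *c, unsigned char *out, size_t max)
{
	size_t n;

	if (!c->rx_enabled || !c->rxflow_buf)
		return 0;
	assert(max >= c->rxflow_len);
	n = c->rxflow_len;
	memcpy(out, c->rxflow_buf, n);
	free(c->rxflow_buf);
	c->rxflow_buf = NULL;
	c->rxflow_len = 0;
	return n;
}
```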
Signed-off-by: Andy Green <andy.green@linaro.org>
This rips out the connection hashtable implementation along with
MAX_CLIENTS and replaces it with a dynamically allocated fds array
and lookup table along the same lines as the new extpoll implementation
from Edwin van den Oetelaar.
It detects the max number of file descriptors possible at context init
time and allocates accordingly; this can be externally controlled by
ulimit and the server run as a specific user to facilitate targeting
specific ulimit rules at it.
Many operations that translated between socket descriptors and struct
websocket or pollfd objects have had iteration removed by this patch
and under load will be a lot faster.
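Roughly, the replacement data structure can be pictured as below. This is
a sketch under assumptions: the names are hypothetical, and the maximum
fd count is passed in as a parameter rather than detected at context
init. A dense pollfd array is paired with an fd-indexed table, so going
from a descriptor to its slot needs no iteration.

```c
#include <assert.h>
#include <poll.h>
#include <stdlib.h>

/* Hypothetical context: dense pollfd array plus an fd-indexed lookup. */
struct ctx {
	struct pollfd *fds;	/* dense array handed to poll() */
	int *fd_to_slot;	/* indexed by fd, -1 when not tracked */
	int fds_count;
	int max_fds;
};

static int ctx_init(struct ctx *c, int max_fds)
{
	int n;

	c->fds = calloc((size_t)max_fds, sizeof(*c->fds));
	c->fd_to_slot = malloc((size_t)max_fds * sizeof(*c->fd_to_slot));
	if (!c->fds || !c->fd_to_slot) {
		free(c->fds);
		free(c->fd_to_slot);
		return -1;
	}
	for (n = 0; n < max_fds; n++)
		c->fd_to_slot[n] = -1;
	c->fds_count = 0;
	c->max_fds = max_fds;
	return 0;
}

/* Track a new descriptor; O(1). */
static int ctx_add(struct ctx *c, int fd, short events)
{
	if (fd < 0 || fd >= c->max_fds || c->fd_to_slot[fd] >= 0)
		return -1;
	assert(c->fds_count < c->max_fds);
	c->fd_to_slot[fd] = c->fds_count;
	c->fds[c->fds_count].fd = fd;
	c->fds[c->fds_count].events = events;
	c->fds[c->fds_count].revents = 0;
	c->fds_count++;
	return 0;
}

/* O(1) removal: move the last entry into the freed slot. */
static int ctx_remove(struct ctx *c, int fd)
{
	int slot;

	if (fd < 0 || fd >= c->max_fds || c->fd_to_slot[fd] < 0)
		return -1;
	slot = c->fd_to_slot[fd];
	c->fds_count--;
	c->fds[slot] = c->fds[c->fds_count];
	c->fd_to_slot[c->fds[slot].fd] = slot;
	c->fd_to_slot[fd] = -1;
	return 0;
}
```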
Signed-off-by: Andy Green <andy.green@linaro.org>
From an idea by Edwin van den Oetelaar <oetelaar.automatisering@gmail.com>
When testing libwebsockets with ab, Edwin found an unexpected bump in
the distribution of latencies, some connections were held back almost
the whole test duration.
http://ml.libwebsockets.org/pipermail/libwebsockets/2013-January/000006.html
Studying the problem revealed that when there are mass pending connections
amongst many active connections, we do not service the listen socket often
enough to clear the backlog, some seem to get stale violating FIFO ordering.
This patch introduces listen socket service "piggybacking", where every n
normal socket service actions we also check the listen socket and deal with
pending connections there.
Normally, it checks the listen socket gratuitously every 10 normal socket
services. However, if it finds something waiting, it forces a check on the
next normal socket service too by keeping stats on how often something was
waiting. If the probability of something waiting each time becomes high,
it will allow up to two waiting connections to be serviced for each normal
socket service.
In that way it has low burden in the normal case, but rapidly adapts by
detecting mass connection loads as found in ab.
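One way to picture the adaptive check is shown below. It is a sketch, not
the logic from the patch: the names are hypothetical, and the "three hits
in a row" threshold used to decide that load is high is an assumption
standing in for the statistics mentioned above.

```c
#include <assert.h>

/* Hypothetical state; names and thresholds are illustrative only. */
struct listen_stats {
	int modulo;	/* normal services between gratuitous checks */
	int count;	/* normal services since the last check */
	int forced;	/* 1 = check again on the next normal service */
	int hits;	/* consecutive checks that found something waiting */
};

/*
 * Called once per normal socket service. Returns how many pending
 * connections may be accepted on the listen socket right now
 * (0 means do not check the listen socket this time).
 */
static int listen_piggyback_budget(struct listen_stats *s)
{
	assert(s->modulo > 0);
	if (!s->forced && ++s->count < s->modulo)
		return 0;
	s->count = 0;
	return s->hits >= 3 ? 2 : 1;	/* up to two under heavy load */
}

/* Report whether the check found a connection waiting. */
static void listen_piggyback_result(struct listen_stats *s, int found)
{
	s->forced = found;
	s->hits = found ? s->hits + 1 : 0;
}
```

With modulo set to 10 and nothing waiting, the listen socket is checked
on every tenth normal service only.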
Signed-off-by: Andy Green <andy.green@linaro.org>
Default remains at SOMAXCONN, you can force it at configure time
along these lines
./configure CFLAGS="-DLWS_SOMAXCONN=16384"
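On the source side, an override like this is typically wired up with a
guarded define. The sketch below assumes that pattern; the wrapper
function is hypothetical and only shows where the constant ends up.

```c
#include <assert.h>
#include <sys/socket.h>

/* Keep the platform default unless the build overrides it with -D. */
#ifndef LWS_SOMAXCONN
#define LWS_SOMAXCONN SOMAXCONN
#endif

/* Hypothetical wrapper: the constant becomes the listen() backlog. */
static int start_listening(int sockfd)
{
	assert(sockfd >= 0);
	return listen(sockfd, LWS_SOMAXCONN);
}
```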
Signed-off-by: Andy Green <andy.green@linaro.org>
Previously we sat and looped to dump a file over http protocol.
Actually that's a source of blocking to the other sockets being serviced.
This patch breaks up the file service into a roundtrip around the poll()
loop for each 512-byte packet. It doesn't make much difference if the
server is idle, but if it's busy it makes sure everyone else is getting
service while the file is sent.
It doesn't try to optimize multiple users of the file or to keep the
descriptor open, the point of this patch is to establish the breaking up
of the file send action into the poll loop.
On the user side, there are two differences:
- context is now needed in the first argument to libwebsockets_serve_http_file()
that's not too bad since we provide context in the callback.
- file send is now asynchronous to the user code, you get a new callback coming
in protocol 0 when it's done, LWS_CALLBACK_HTTP_FILE_COMPLETION
libwebsockets-test-server is updated accordingly.
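The "one packet per trip around the poll loop" idea can be sketched as
follows. Names are hypothetical, a FILE * stands in for the socket so the
sketch stays portable, and partial writes are not handled. The zero
return marks the point where a completion callback such as
LWS_CALLBACK_HTTP_FILE_COMPLETION would be delivered.

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK 512	/* packet size mentioned in the text */

/* Hypothetical per-connection state while a file is being served. */
struct file_send {
	FILE *in;	/* file being served */
	FILE *out;	/* stands in for the socket in this sketch */
};

/*
 * Send at most one chunk, then return so the poll loop can service
 * other sockets. Returns 1 if more remains, 0 when the file is
 * complete, -1 on error.
 */
static int file_send_service(struct file_send *fs)
{
	unsigned char buf[CHUNK];
	size_t n;

	assert(fs && fs->in && fs->out);
	n = fread(buf, 1, sizeof(buf), fs->in);
	if (n > 0 && fwrite(buf, 1, n, fs->out) != n)
		return -1;
	if (n == sizeof(buf))
		return 1;	/* come back on the next writable event */
	if (ferror(fs->in))
		return -1;
	fclose(fs->in);
	fs->in = NULL;
	return 0;
}
```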
Signed-off-by: Andy Green <andy.green@linaro.org>
"4b0e01f Retry SSL_connect when SSL_get_error requests it. " from David Galeano
noticed the problem that client connect may receive SSL_ERROR_WANT_* from
SSL_connect, which is basically WOULDBLOCK. That patch tried to deal with it
by blocking in a while(1) until the condition went away.
That's problematic because it blocks service of anything else (including
the host application sockets in the external socket poll sharing case) for
up to 5s controlled by conditions at one client.
After fiddling with and researching this, the actual problem with the code is
we are not getting the SSL layer error correctly, it is not contained in the
code returned from the Connect api directly.
I was unable to get a renegotiation forced on my modern SSL libs, it complained
about protocol error and reopened the connection instead. So I think the stuff
found in the docs and the web about the SSL_ERROR_WANT_ is probably not something
we will see in reality (if we check the right error code...)
Signed-off-by: Andy Green <andy.green@linaro.org>
This patch allows control of the main compiletime constants in libwebsockets
from the configure commandline.
README is updated with documentation on what's available, how to set them
and the defaults.
The constants are logged with "info" severity (not visible by default) at
context create time.
The zlib constant previously exposed like this is moved to private-libwebsockets.h
so it can be printed along with the rest.
Signed-off-by: Andy Green <andy.green@linaro.org>