diff --git a/READMEs/README.event-loops-intro.md b/READMEs/README.event-loops-intro.md
new file mode 100644
index 000000000..35b516101
--- /dev/null
+++ b/READMEs/README.event-loops-intro.md
@@ -0,0 +1,285 @@
+# Considerations around Event Loops
+
+Much of the software we use is written around an **event loop**.  Some examples
+
+ - Chrome / Chromium, transmission, tmux, ntp SNTP... [libevent](https://libevent.org/)
+ - node.js / cjdns / Julia / cmake ... [libuv](https://archive.is/64pOt)
+ - Gstreamer, Gnome / GTK apps ... [glib](https://people.gnome.org/~desrt/glib-docs/glib-The-Main-Event-Loop.html)
+ - SystemD ... sdevent
+ - OpenWRT ... uloop
+
+Many applications roll their own event loop using poll() or epoll() or similar,
+using the same techniques.  Another set of apps use message dispatchers that
+take the same approach, but are for cases that don't need to support sockets.
+Event libraries provide crossplatform abstractions for this functoinality, and
+provide the best backend for their event waits on the platform automagically.
+
+libwebsockets networking operations require an event loop, it provides a default
+one for the platform (based on poll() for Unix) if needed, but also can natively
+use any of the event loop libraries listed above, including "foreign" loops
+already created and managed by the application.
+
+## What is an 'event loop'?
+
+Event loops have the following characteristics:
+
+ - they have a **single thread**, therefore they do not require locking
+ - they are **not threadsafe**
+ - they require **nonblocking IO**
+ - they **sleep** while there are no events (aka the "event wait")
+ - if one or more event seen, they call back into user code to handle each in
+   turn and then return to the wait (ie, "loop")
+
+### They have a single thread
+
+By doing everything in turn on a single thread, there can be no possibility of
+conflicting access to resources from different threads... if the single thread
+is in callback A, it cannot be in two places at the same time and also in
+callback B accessing the same thing: it can never run any other code
+concurrently, only sequentially, by design.
+
+It means that all mutexes and other synchronization and locking can be
+eliminated, along with the many kinds of bugs related to them.
+
+### They are not threadsafe
+
+Event loops mandate doing everything in a single thread.  You cannot call their
+apis from other threads, since there is no protection against reentrancy.
+
+Lws apis cannot be called safely from any thread other than the event loop one,
+with the sole exception of `lws_cancel_service()`.
+
+### They have nonblocking IO
+
+With blocking IO, you have to create threads in order to block them to learn
+when your IO could proceed.  In an event loop, all descriptors are set to use
+nonblocking mode, we only attempt to read or write when we have been informed by
+an event that there is something to read, or it is possible to write.
+
+So sacrificial, blocking discrete IO threads are also eliminated, we just do
+what we should do sequentially, when we get the event indicating that we should
+do it.
+
+### They sleep while there are no events
+
+An OS "wait" of some kind is used to sleep the event loop thread until something
+to do.  There's an explicit wait on file descriptors that have pending read or
+write, and also an implicit wait for the next scheduled event.  Even if idle for
+descriptor events, the event loop will wake and handle scheduled events at the
+right time.
+
+In an idle system, the event loop stays in the wait and takes 0% CPU.
+
+### If one or more event, they handle them and then return to sleep
+
+As you can expect from "event loop", it is an infinite loop alternating between
+sleeping in the event wait and sequentially servicing pending events, by calling
+callbacks for each event on each object.
+
+The callbacks handle the event and then "return to the event loop".  The state
+of things in the loop itself is guaranteed to stay consistent while in a user
+callback, until you return from the callback to the event loop, when socket
+closes may be processed and lead to object destruction.
+
+Event libraries like libevent are operating the same way, once you start the
+event loop, it sits in an inifinite loop in the library, calling back on events
+until you "stop" or "break" the loop by calling apis.
+
+## Why are event libraries popular?
+
+Developers prefer an external library solution for the event loop because:
+
+ - the quality is generally higher than self-rolled ones.  Someone else is
+   maintaining it, a fulltime team in some cases.
+ - the event libraries are crossplatform, they will pick the most effective
+   event wait for the platform without the developer having to know the details.
+   For example most libs can conceal whether the platform is windows or unix,
+   and use native waits like epoll() or WSA accordingly.
+ - If your application uses a event library, it is possible to integrate very
+   cleanly with other libraries like lws that can use the same event library.
+   That is extremely messy or downright impossible to do with hand-rolled loops.
+
+Compared to just throwing threads on it
+
+ - thread lifecycle has to be closely managed, threads must start and must be
+   brought to an end in a controlled way.  Event loops may end and destroy
+   objects they control at any time a callback returns to the event loop.
+
+ - threads may do things sequentially or genuinely concurrently, this requires
+   locking and careful management so only deterministic and expected things
+   happen at the user data.
+
+ - threads do not scale well to, eg, serving tens of thousands of connections;
+   web servers use event loops.
+
+## Multiple codebases cooperating on one event loop
+
+The ideal situation is all your code operates via a single event loop thread.
+For lws-only code, including lws_protocols callbacks, this is the normal state
+of affairs.
+
+When there is other code that also needs to handle events, say already existing
+application code, or code handling a protocol not supported by lws, there are a
+few options to allow them to work together, which is "best" depends on the
+details of what you're trying to do and what the existing code looks like.
+In descending order of desirability:
+
+### 1) Use a common event library for both lws and application code
+
+This is the best choice for Linux-class devices.  If you write your application
+to use, eg, a libevent loop, then you only need to configure lws to also use
+your libevent loop for them to be able to interoperate perfectly.  Lws will
+operate as a guest on this "foreign loop", and can cleanly create and destroy
+its context on the loop without disturbing the loop.
+
+In addition, your application can merge and interoperate with any other
+libevent-capable libraries the same way, and compared to hand-rolled loops, the
+quality will be higher.
+
+### 2) Use lws native wsi semantics in the other code too
+
+Lws supports raw sockets and file fd abstractions inside the event loop.  So if
+your other code fits into that model, one way is to express your connections as
+"RAW" wsis and handle them using lws_protocols callback semantics.
+
+This ties the application code to lws, but it has the advantage that the
+resulting code is aware of the underlying event loop implementation and will
+work no matter what it is.
+
+### 3) Make a custom lws event lib shim for your custom loop
+
+Lws provides an ops struct abstraction in order to integrate with event
+libraries, you can find it in ./includes/libwebsockets/lws-eventlib-exports.h.
+
+Lws uses this interface to implement its own event library plugins, but you can
+also use it to make your own customized event loop shim, in the case there is
+too much written for your custom event loop to be practical to change it.
+
+In other words this is a way to write a customized event lib "plugin" and tell
+the lws_context to use it at creation time.  See [minimal-http-server.c](https://libwebsockets.org/git/libwebsockets/tree/minimal-examples/http-server/minimal-http-server-eventlib-custom/minimal-http-server.c)
+
+### 4) Cooperate at thread level
+
+This is less desirable because it gives up on unifying the code to run from a
+single thread, it means the codebases cannot call each other's apis directly.
+
+In this scheme the existing threads do their own thing, lock a shared
+area of memory and list what they want done from the lws thread context, before
+calling `lws_cancel_service()` to break the lws event wait.  Lws will then
+broadcast a `LWS_CALLBACK_EVENT_WAIT_CANCELLED` protocol callback, the handler
+for which can lock the shared area and perform the requested operations from the
+lws thread context.
+
+### 5) Glue the loops together to wait sequentially (don't do this)
+
+If you have two or more chunks of code with their own waits, it may be tempting
+to have them wait sequentially in an outer event loop.  (This is only possible
+with the lws default loop and not the event library support, event libraries
+have this loop inside their own `...run(loop)` apis.)
+
+```
+	while (1) {
+		do_lws_wait(); /* interrupted at short intervals */
+		do_app_wait(); /* interrupted at short intervals */
+	}
+```
+
+This never works well, either:
+
+ - the whole thing spins at 100% CPU when idle, or
+
+ - the waits have timeouts where they sleep for short periods, but then the
+   latency to service on set of events is increased by the idle timeout period
+   of the wait for other set of events
+
+## Common Misunderstandings
+
+### "Real Men Use Threads"
+
+Sometimes you need threads or child processes.  But typically, whatever you're
+trying to do does not literally require threads.  Threads are an architectural
+choice that can go either way depending on the goal and the constraints.
+
+Any thread you add should have a clear reason to specifically be a thread and
+not done on the event loop, without a new thread or the consequent locking (and
+bugs).
+
+### But blocking IO is faster and simpler
+
+No, blocking IO has a lot of costs to conceal the event wait by blocking.
+
+For any IO that may wait, you must spawn an IO thread for it, purely to handle
+the situation you get blocked in read() or write() for an arbitrary amount of
+time.  It buys you a simple story in one place, that you will proceed on the
+thread if read() or write() has completed, but costs threads and locking to get
+to that.
+
+Event loops dispense with the threads and locking, and still provide a simple
+story, you will get called back when data arrives or you may send.
+
+Event loops can scale much better, a busy server with 50,000 connections active
+does not have to pay the overhead of 50,000 threads and their competing for
+locking.
+
+With blocked threads, the thread can do no useful work at all while it is stuck
+waiting.  With event loops the thread can service other events until something
+happens on the fd.
+
+### Threads are inexpensive
+
+In the cases you really need threads, you must have them, or fork off another
+process.  But if you don't really need them, they bring with them a lot of
+expense, some you may only notice when your code runs on constrained targets
+
+ - threads have an OS-side footprint both as objects and in the scheduler
+
+ - thread context switches are not slow on modern CPUs, but have side effects
+   like cache flushing
+
+ - threads are designed to be blocked for arbitrary amounts of time if you use
+   blocking IO apis like write() or read().  Then how much concurrency is really
+   happening?  Since blocked threads just go away silently, it is hard to know
+   when in fact your thread is almost always blocked and not doing useful work.
+
+ - threads require their own stack, which is on embedded is typically suffering
+   from a dedicated worst-case allocation where the headroom is usually idle
+
+ - locking must be handled, and missed locking or lock order bugs found
+
+### But... what about latency if only one thing happens at a time?
+
+ - Typically, at CPU speeds, nothing is happening at any given time on most
+   systems, the event loop is spending most of its time in the event wait
+   asleep at 0% cpu.
+
+ - The POSIX sockets layer is disjoint from the actual network device driver.
+   It means that once you hand off the packet to the networking stack, the POSIX
+   api just returns and leaves the rest of the scheduling, retries etc to the
+   networking stack and device, descriptor queuing is driven by interrupts in
+   the driver part completely unaffected by the event loop part.
+
+ - Passing data around via POSIX apis between the user code and the networking
+   stack tends to return almost immediately since its onward path is managed
+   later in another, usually interrupt, context.
+
+ - So long as enough packets-worth of data are in the network stack ready to be
+   handed to descriptors, actual throughput is completely insensitive to jitter
+   or latency at the application event loop
+
+ - The network device itself is inherently serializing packets, it can only send
+   one thing at a time.  The networking stack locking also introduces hidden
+   serialization by blocking multiple threads.
+
+ - Many user systems are decoupled like the network stack and POSIX... the user
+   event loop and its latencies do not affect backend processes occurring in
+   interrupt or internal thread or other process contexts
+
+## Conclusion
+
+Event loops have been around for a very long time and are in wide use today due
+to their advantages.  Working with them successfully requires understand how to
+use them and why they have the advantages and restrictions they do.
+
+The best results come from all the participants joining the same loop directly.
+Using a common event library in the participating codebases allows completely
+different code can call each other's apis safely without locking.