Event Loop
This page gives a concise, developer-focused description of how the event loop implemented in ev.c and ev.h operates, covering architecture, watcher mechanics, lifecycle, role separation, callback pipelines, and planned enhancements.
High-Level Architecture
The event loop sits in a continuous wait cycle, monitoring event sources (I/O readiness, timer expirations, or delivered signals), dispatching the appropriate handler when an event occurs, and invoking a user-registered callback.
- Backends abstract the OS-specific wait or I/O mechanism used. The backend can be configured in the configuration file with the `ev_backend` variable.
- Watchers encapsulate interest in one type of event (I/O, signal, or periodic) and carry the callback and context required.
- Lifecycle routines manage setup, execution, interruption, and teardown of the loop.
- Each process has its own event loop: There is a clear distinction between main (accepting connections) and worker (handling established connections) event loops.
Data Structures
`struct event_loop` centralizes all loop state, holding:
- A `running` flag that governs the main loop.
- A `sigset` that tracks which signals the loop intercepts.
- An array of generic `event_watcher_t*` pointers representing active watchers.
- Backend handles (`io_uring` ring, `epoll` fd, or `kqueue` fd) that interface directly with the kernel.
- A scratch `buffer` or `io_uring` buffer ring used to stage data transfers efficiently (the memory is defined elsewhere in pgagroal and used here as the buffer).
Watcher Types and Responsibilities
Every watcher embeds a small common header containing its type, enabling the loop to iterate over mixed watcher arrays.
I/O Watchers monitor one or two file descriptors.
- Main watchers listen for new client connections and accept them.
- Worker watchers handle serial request/response flows, blocking on receive then send.
Signal Watchers wrap POSIX signals into file descriptors. The loop unblocks these signals globally, then watches the FD for delivery events, invoking the registered callback.
Periodic Watchers fire at fixed millisecond intervals.
Event Loop Lifecycle
- Initialization (`pgagroal_event_loop_init`)
- Running (`pgagroal_event_loop_run`)
- Breaking (`pgagroal_event_loop_break`)
- Destruction (`pgagroal_event_loop_destroy`)
- Fork Handling (`pgagroal_event_loop_fork`)
Main vs Worker I/O watchers
To simplify connection handling, the code forks a Worker process for each accepted client. Both processes run the same loop, but with different watchers registered:
Main Process:
- Watches `listen_fd` for new connections.
- On accept, forks a Worker and continues listening.

Worker Process:
- Registers I/O watchers on `rcv_fd` and `snd_fd`.
- Waits for `rcv_fd` to signal incoming data, then invokes a pipeline callback to process it.
- Sends responses on `snd_fd`.
Pipelines & Callback Flow
The loop’s generic I/O handler delegates to a pipeline based on watcher type and context. A typical flow:
- I/O event: `backend -> loop -> io_watcher.handler`
- Dispatch: the handler inspects the messages in the buffer and selects the next pipeline stage function.
- Processing and returning: the pipeline stage validates the message payload and returns control to the loop.
This approach separates generic loop mechanics from application-specific message handling.
Enhancements
First, the main enhancement would be to improve initial connection time. This could be done by caching event loops beforehand and letting a new connection pick one up. Further examination with ftrace is required here.
Second, a series of compile-time flags mark areas for performance tuning. In my experience, none of these have greatly improved performance (iovecs remain untested), but they may still require correct implementation and evaluation:
- Zero Copy (`MSG_ZEROCOPY` via io_uring) — reduce CPU overhead by skipping buffer copies.
- Fast Poll (`EPOLLET`) — edge-triggered epoll mode for high-throughput scenarios.
- Huge Pages (`IORING_SETUP_NO_MMAP`) — leverage large page mappings for buffer rings.
- Multishot Recv — one SQE to deliver multiple receive completions.
- IOVecs — scatter/gather I/O arrays for fewer system calls.