Summary
When CPPHTTPLIB_USE_NON_BLOCKING_GETADDRINFO is defined on Linux/glibc, getaddrinfo_with_timeout() has a use-after-free bug that causes heap corruption and SIGSEGV. The function uses getaddrinfo_a(GAI_NOWAIT) with a stack-local struct gaicb, and when gai_suspend() times out, it calls gai_cancel() but returns immediately without waiting for the cancellation to complete. The async DNS worker thread continues to reference the destroyed stack frame.
Version
httplib 0.43.1
Root Cause
In httplib.h, getaddrinfo_with_timeout() (around line 5880):
#elif defined(_GNU_SOURCE) && defined(__GLIBC__) && \
    (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 2))
  struct gaicb request; // ← stack-local
  struct gaicb *requests[1] = {&request};
  // ...
  int start_result = getaddrinfo_a(GAI_NOWAIT, requests, 1, &sevp);
  // ...
  int wait_result = gai_suspend(..., &timeout);
  // ...
  } else if (wait_result == EAI_AGAIN) {
    gai_cancel(&request); // ← may return EAI_NOTCANCELED
    return EAI_AGAIN;     // ← destroys 'request' while DNS thread still uses it
  }
gai_cancel() is non-blocking. Per the man page, it can return:
- EAI_CANCELED: the request was canceled successfully
- EAI_NOTCANCELED: the request is still in progress and could not be canceled
- EAI_ALLDONE: the request has already completed
When it returns EAI_NOTCANCELED, the async DNS thread spawned by getaddrinfo_a is still running and still references request.ar_name, request.ar_request, etc. The function returns anyway, its stack frame is released, and the DNS thread later writes its result into memory that no longer belongs to request.
This corrupts the heap. The symptom is typically SIGSEGV in an unrelated thread's malloc/free, often many seconds later when the corrupted heap metadata is traversed.
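To make the distinction between the three return values concrete, here is a minimal sketch of the check the timeout path is missing. The helper name worker_may_still_use_request is hypothetical and not part of httplib; treating unknown values as "still in use" is a conservative assumption, not documented behavior.

```cpp
#include <netdb.h> // EAI_CANCELED, EAI_NOTCANCELED, EAI_ALLDONE (glibc extensions)

// Hypothetical helper, not part of httplib: returns true when the value
// returned by gai_cancel() means a worker thread may still access the
// struct gaicb, so the caller must not let it go out of scope yet.
inline bool worker_may_still_use_request(int cancel_result) {
  switch (cancel_result) {
  case EAI_CANCELED: // request was canceled successfully
  case EAI_ALLDONE:  // request had already completed
    return false;
  case EAI_NOTCANCELED: // lookup is in progress on a worker thread
    return true;
  default: // unexpected value: assume the request is still in use
    return true;
  }
}
```

Only when this returns false is it safe to return from a function that owns the request on its stack.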
How to Trigger
DNS resolution must take longer than connection_timeout_sec. This happens when:
- The network path to the DNS server is disrupted (interface toggle, cable unplug)
- The DNS server is unreachable or slow
- System DNS timeout (typically 5-30s) exceeds the httplib connection timeout (typically 2-5s)
Evidence
With ASAN's LeakSanitizer, the orphaned DNS threads leak resolver memory:
Direct leak of 84 byte(s) in 3 object(s) allocated from:
    #0 malloc
    #1 __GI___res_context_send resolv/res_send.c:325
    #2 __GI___res_context_query resolv/res_query.c:218
    #3 __res_context_querydomain resolv/res_query.c:629
    #4 __GI___res_context_search resolv/res_query.c:385
    #5 __GI__nss_dns_gethostbyname4_r nss_dns/dns-host.c:418
    #6 get_nss_addresses nss/getaddrinfo.c:652
    #7 gaih_inet nss/getaddrinfo.c:1185
    #8 __GI_getaddrinfo nss/getaddrinfo.c:2390
    #9 handle_requests resolv/gai_misc.c:329 ← getaddrinfo_a worker thread
    #10 start_thread nptl/pthread_create.c:448
These are the same threads that access the destroyed struct gaicb on the caller's stack.
Suggested Fix
After calling gai_cancel(), wait for the async operation to actually complete before returning:
  } else if (wait_result == EAI_AGAIN) {
    // gai_cancel is non-blocking: on EAI_NOTCANCELED the async thread is still
    // running, so wait until it finishes before destroying the stack-local request.
    if (gai_cancel(&request) == EAI_NOTCANCELED) {
      while (gai_error(&request) == EAI_INPROGRESS) {
        struct timespec wait = {0, 1000000}; // 1ms
        nanosleep(&wait, nullptr);
      }
    }
    if (request.ar_result) { freeaddrinfo(request.ar_result); }
    return EAI_AGAIN;
  } else {
    if (gai_cancel(&request) == EAI_NOTCANCELED) {
      while (gai_error(&request) == EAI_INPROGRESS) {
        struct timespec wait = {0, 1000000}; // 1ms
        nanosleep(&wait, nullptr);
      }
    }
    if (request.ar_result) { freeaddrinfo(request.ar_result); }
    return wait_result;
  }
Alternative: use gai_suspend again (with a short timeout) instead of busy-polling gai_error.
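The gai_suspend-based alternative could look roughly like the sketch below. cancel_and_drain is a hypothetical helper name, not part of httplib. It assumes the struct gaicb was zero-initialized before getaddrinfo_a() was called, and the 100 ms wait slice is an arbitrary choice.

```cpp
#include <netdb.h> // getaddrinfo_a, gai_suspend, gai_cancel, gai_error (glibc)
#include <ctime>   // struct timespec

// Hypothetical helper, not part of httplib: cancel `request` and block until
// no glibc worker thread can touch it any more. Returns gai_cancel()'s result.
// Assumes *request was zero-initialized before getaddrinfo_a() was called.
inline int cancel_and_drain(struct gaicb *request) {
  const int cancel_result = gai_cancel(request);
  if (cancel_result == EAI_NOTCANCELED) {
    // The lookup is running on a worker thread. Sleep inside gai_suspend
    // (instead of polling with nanosleep) until it leaves EAI_INPROGRESS.
    const struct gaicb *const list[1] = {request};
    while (gai_error(request) == EAI_INPROGRESS) {
      struct timespec slice = {0, 100 * 1000 * 1000}; // 100 ms per wait
      gai_suspend(list, 1, &slice); // timeout or EAI_INTR: re-check and retry
    }
  }
  // A lookup that completed leaves a result list that must be released.
  if (request->ar_result != nullptr) {
    freeaddrinfo(request->ar_result);
    request->ar_result = nullptr;
  }
  return cancel_result;
}
```

With such a helper, both error branches in getaddrinfo_with_timeout() would reduce to one call followed by the return. The trade-off is that draining blocks the caller until the system resolver gives up, so the effective timeout can exceed connection_timeout_sec.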
Environment
- Linux x86_64, glibc 2.40 (Debian trixie)
- httplib 0.38.0
- Compiled with -DCPPHTTPLIB_OPENSSL_SUPPORT -DCPPHTTPLIB_USE_NON_BLOCKING_GETADDRINFO
- g++ 14, -std=gnu++20