Skip to content

OkhttpClient with Proxy does not handle Network changes #7698

@Dohbedoh

Description

@Dohbedoh

When using an OkHttpClient instance with a Proxy / ProxySelector, if a network change happens so that the proxy address resolves to a different IP address, the client is not able to reach out to the request URL anymore... In such a case, we typically see a SocketTimeoutException:

And note that:

  • If you do not reuse your instance of OkhttpClient but create a new one, this issue does not happen.
  • If you do NOT use the proxy, and change the scenario so that it is the request host DNS resolution that changes. The problem does not happen neither.

This tells me that this might be a bug very specific to the use of ProxySelector in OkHttp. Unless consumers are expected to handle this in which case I would be interested to know how exactly ? (Interceptors, ...)


How to Reproduce (Scenario)

To reproduce this, you need to have a proxy setup and you need to be able to change the address that the proxy hostname resolves to.

  • Spin up a Proxy and a DNS to serve it with a specific hostname
  • Create an OkHttpClient instance with a ProxySelector
  • Execute a request with the OkHttpClient against a URL reachable through the proxy
  • Make a change to the Proxy DNS resolution: typically the proxy host remain the same but the IP of the proxy - that the DNS will resolves to - changes
  • Execute the same request with the same instance of the OkHttpClient
    --> it now fails to reach out to the proxy. Yo ucan try again and again and it always fails...

I am using kubernetes to do this. My proxy runs in a pod and has a service in front of it. So both the service and the POD have a specific IP. What I do to provoke that network change is to remove the service and the pods and recreate them. The hostname remains - the service DNS squid-proxy.squid.svc.cluster.local - but the IP it resolves to is now pointing to the new pod that is different.


Reproduction Project

I dod not find a way to create a simple test form this. But I created a small Java project https://github.com/Dohbedoh/test-clients that do the same request in a loop at 90s interval. The 90s interval is enough to provoke that network change. Also using an interceptor to get more logging. See https://github.com/Dohbedoh/test-clients/blob/f72cd1a3fdbd2b930b1297ee2d4a82f4237d7f26/src/main/java/com/dohbedoh/Main.java.

What I do is

  • I run against https://api.github.com/rate_limit/:
java \
  -Dcom.dohbedoh.proxyHost="squid-proxy.squid.svc.cluster.local" \
  -Dcom.dohbedoh.proxyPort=3128 \
  -cp test-clients-1.0-SNAPSHOT-jar-with-dependencies.jar com.dohbedoh.Main "https://api.github.com/rate_limit"
  • I wait for the first request to succeed:
Feb 17, 2023 2:25:19 AM com.dohbedoh.Main$LoggingInterceptor intercept
INFO: [APPLICATION] Sending request http://api.github.com/rate_limit on null

Feb 17, 2023 2:25:19 AM com.dohbedoh.Main$LoggingInterceptor intercept
INFO: [NETWORK] Sending request http://api.github.com/rate_limit on Connection{api.github.com:80, proxy=HTTP @ squid-proxy.squid.svc.cluster.local/10.30.130.211:3128 hostAddress=/10.30.130.211:3128 cipherSuite=none protocol=http/1.1}
Host: api.github.com
Connection: Keep-Alive
Accept-Encoding: gzip
User-Agent: okhttp/4.10.0

Feb 17, 2023 2:25:19 AM com.dohbedoh.Main$LoggingInterceptor intercept
INFO: [NETWORK] Received response for http://api.github.com/rate_limit in 74.0ms
Content-Length: 0
Location: https://api.github.com/rate_limit
Date: Fri, 17 Feb 2023 02:25:19 GMT
X-Cache: MISS from squid-proxy-bdb677474-scv2j
X-Cache-Lookup: MISS from squid-proxy-bdb677474-scv2j:3128
Via: 1.1 squid-proxy-bdb677474-scv2j (squid/5.6)
Connection: keep-alive

Feb 17, 2023 2:25:20 AM com.dohbedoh.Main$LoggingInterceptor intercept
INFO: [NETWORK] Sending request https://api.github.com/rate_limit on Connection{api.github.com:443, proxy=HTTP @ squid-proxy.squid.svc.cluster.local/10.30.130.211:3128 hostAddress=/10.30.130.211:3128 cipherSuite=TLS_AES_128_GCM_SHA256 protocol=h2}
Host: api.github.com
Connection: Keep-Alive
Accept-Encoding: gzip
User-Agent: okhttp/4.10.0

Feb 17, 2023 2:25:20 AM com.dohbedoh.Main$LoggingInterceptor intercept
INFO: [NETWORK] Received response for https://api.github.com/rate_limit in 74.8ms
access-control-allow-origin: *
access-control-expose-headers: ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset
cache-control: no-cache
content-security-policy: default-src 'none'
content-type: application/json; charset=utf-8
referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin
x-content-type-options: nosniff
x-frame-options: deny
x-github-media-type: github.v3; format=json
x-ratelimit-limit: 60
x-ratelimit-remaining: 59
x-ratelimit-reset: 1676602459
x-ratelimit-resource: core
x-ratelimit-used: 1
x-xss-protection: 1; mode=block
date: Fri, 17 Feb 2023 02:25:20 GMT
content-length: 465
accept-ranges: bytes
x-github-request-id: 1801:3F23:A07AFF:1481DEF:63EEE58F

Feb 17, 2023 2:25:20 AM com.dohbedoh.Main$LoggingInterceptor intercept
INFO: [APPLICATION] Received response for https://api.github.com/rate_limit in 665.3ms
access-control-allow-origin: *
access-control-expose-headers: ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset
cache-control: no-cache
content-security-policy: default-src 'none'
content-type: application/json; charset=utf-8
referrer-policy: origin-when-cross-origin, strict-origin-when-cross-origin
x-content-type-options: nosniff
x-frame-options: deny
x-github-media-type: github.v3; format=json
x-ratelimit-limit: 60
x-ratelimit-remaining: 59
x-ratelimit-reset: 1676602459
x-ratelimit-resource: core
x-ratelimit-used: 1
x-xss-protection: 1; mode=block
date: Fri, 17 Feb 2023 02:25:20 GMT
content-length: 465
accept-ranges: bytes
x-github-request-id: 1801:3F23:A07AFF:1481DEF:63EEE58F

Feb 17, 2023 2:25:20 AM com.dohbedoh.Main testUrl
INFO: {"resources":{"core":{"limit":60,"remaining":59,"reset":1676602459,"used":1,"resource":"core"},"graphql":{"limit":0,"remaining":0,"reset":1676604320,"used":0,"resource":"graphql"},"integration_manifest":{"limit":5000,"remaining":5000,"reset":1676604320,"used":0,"resource":"integration_manifest"},"search":{"limit":10,"remaining":10,"reset":1676600780,"used":0,"resource":"search"}},"rate":{"limit":60,"remaining":59,"reset":1676602459,"used":1,"resource":"core"}}

Feb 17, 2023 2:25:20 AM com.dohbedoh.Main testDns
INFO: Lookup DNS for api.github.com
Feb 17, 2023 2:25:20 AM com.dohbedoh.Main lambda$testDns$0
INFO:   api.github.com/140.82.112.5
Feb 17, 2023 2:25:20 AM com.dohbedoh.Main testDns
INFO: Lookup DNS for squid-proxy.squid.svc.cluster.local
Feb 17, 2023 2:25:20 AM com.dohbedoh.Main lambda$testDns$1
INFO:   squid-proxy.squid.svc.cluster.local/10.30.130.211
  • I provoke the network change:
k delete svc squid-proxy; k delete pods --all;k apply -f svc.yaml;
  • the next requests fail with SocketTimeoutException. And see how OkHttpClient.dns().lookup(PROXY_HOST) actually correctly resolves to the new IP:
Feb 17, 2023 2:26:50 AM com.dohbedoh.Main$LoggingInterceptor intercept
INFO: [APPLICATION] Sending request http://api.github.com/rate_limit on null

Feb 17, 2023 2:27:00 AM com.dohbedoh.Main testUrl
SEVERE: Failure executing request
java.net.SocketTimeoutException: connect timed out
        at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412)
        at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255)
        at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237)
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.base/java.net.Socket.connect(Socket.java:609)
        at okhttp3.internal.platform.Platform.connectSocket(Platform.kt:128)
        at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:295)
        at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:207)
        at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)
        at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
        at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74)
        at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255)
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at com.dohbedoh.Main$LoggingInterceptor.intercept(Main.java:187)
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
        at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
        at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154)
        at com.dohbedoh.Main.testUrl(Main.java:99)
        at com.dohbedoh.Main.main(Main.java:55)

Feb 17, 2023 2:27:00 AM com.dohbedoh.Main testDns
INFO: Lookup DNS for api.github.com
Feb 17, 2023 2:27:00 AM com.dohbedoh.Main lambda$testDns$0
INFO:   api.github.com/140.82.112.5
Feb 17, 2023 2:27:00 AM com.dohbedoh.Main testDns
INFO: Lookup DNS for squid-proxy.squid.svc.cluster.local
Feb 17, 2023 2:27:00 AM com.dohbedoh.Main lambda$testDns$1
INFO:   squid-proxy.squid.svc.cluster.local/10.30.130.88

If I run the same tests but I create a new client every time instead, this issue never happens...

Using this tool, the simplest way to reproduce this in K8s:

  • deploy a simple squid proxy kubectl apply -f https://raw.githubusercontent.com/Dohbedoh/test-clients/1474885fef7cba34fac37855f308593308acbd75/examples/proxy.yaml
  • wait for it to be running
  • deploy the test tool with kubectl apply -f https://raw.githubusercontent.com/Dohbedoh/test-clients/1474885fef7cba34fac37855f308593308acbd75/examples/test-clients.yaml that will start sending a request to https://api.github.com/rate_limit/ every 90s through the proxy
  • remove the proxy with kubectl delete -f https://raw.githubusercontent.com/Dohbedoh/test-clients/1474885fef7cba34fac37855f308593308acbd75/examples/proxy.yaml
  • recreate it with kubectl apply -f https://raw.githubusercontent.com/Dohbedoh/test-clients/1474885fef7cba34fac37855f308593308acbd75/examples/proxy.yaml
  • check the log of the test-clients-xxx pod

I hope this is enough.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugBug in existing code

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions