Android network calls sometimes get stuck at IOException : java.net.SocketTimeoutException: timeout

1

I have an app that uploads some files (png, mp4). The full stack is Nginx + Node server behind it. Sometimes, it seems that some users experience bizarre behavior in production when the internet is problematic, and then the Android app cannot send further Upload requests cause they all get : IOException : java.net.SocketTimeoutException: timeout.

However, the strangest thing is that we see the server does send a response (409 Conflict, meaning the files being uploaded already exist and done uploading), but the Android app simply doesn't get those responses and just keeps on trying to upload those files, but the connection keeps failing because of SocketTimeoutException every time again and again, instead of receiving the 409 response.

And then, after some time, it gets back to normal and suddenly can receive that 409 response and continue as if everything is fine.

I tried using the most basic HttpUrlConnection, as well as Square's OkHttp library, but it results in the same behavior.

When I debug it with Okhttp + HTTP/2 + a Proxy to monitor the requests, in the frame logging I see (with a Proxy enabled):

02-16 13:30:16.047  2719  6597 D okhttp.Http2: << 0x00000003     4 RST_STREAM    
02-16 13:30:16.049  2719  3758 D okhttp.TaskRunner: Q10117 canceled              : OkHttp ConnectionPool
02-16 13:30:16.051  2719  6597 D okhttp.Http2: >> 0x00000000     8 GOAWAY        
02-16 13:30:16.051  2719  6597 D okhttp.TaskRunner: Q10121 finished run in  60 s : <our server>

at this point my code already received an error from my server implementation. Then I see :

02-16 13:30:20.129  2719  3758 D okhttp.Http2: >> CONNECTION 505249202a20485454502f322e300d0a0d0a534d0d0a0d0a
02-16 13:30:20.131  2719  3758 D okhttp.Http2: >> 0x00000000     6 SETTINGS      
02-16 13:30:20.132  2719  3758 D okhttp.Http2: >> 0x00000000     4 WINDOW_UPDATE 
02-16 13:30:20.133  2719  3758 D okhttp.TaskRunner: Q10157 scheduled after   0 µs: OkHttp <our server>
02-16 13:30:20.135  2719  3758 D okhttp.TaskRunner: Q10153 scheduled after   0 µs: OkHttp ConnectionPool
02-16 13:30:20.135  2719  7381 D okhttp.TaskRunner: Q10157 starting              : OkHttp <our server>
02-16 13:30:20.136  2719  7365 D okhttp.TaskRunner: Q10153 starting              : OkHttp ConnectionPool
02-16 13:30:20.137  2719  7365 D okhttp.TaskRunner: Q10153 run again after 300 s : OkHttp ConnectionPool
02-16 13:30:20.137  2719  7365 D okhttp.TaskRunner: Q10153 finished run in   1 ms: OkHttp ConnectionPool
02-16 13:30:20.138  2719  3758 D okhttp.Http2: >> 0x00000003  1063 HEADERS       END_HEADERS
02-16 13:30:20.145  2719  7381 D okhttp.Http2: << 0x00000000    18 SETTINGS      
02-16 13:30:20.146  2719  7381 D okhttp.TaskRunner: Q10154 scheduled after   0 µs: OkHttp <our server> applyAndAckSettings
02-16 13:30:20.147  2719  7381 D okhttp.Http2: << 0x00000000     4 WINDOW_UPDATE 
02-16 13:30:20.147  2719  7365 D okhttp.TaskRunner: Q10154 starting              : OkHttp <our server> applyAndAckSettings
02-16 13:30:20.149  2719  7365 D okhttp.TaskRunner: Q10156 scheduled after   0 µs: OkHttp <our server> onSettings
02-16 13:30:20.151  2719  6597 D okhttp.TaskRunner: Q10156 starting              : OkHttp <our server> onSettings
02-16 13:30:20.151  2719  6597 D okhttp.TaskRunner: Q10156 finished run in   1 ms: OkHttp <our server> onSettings
02-16 13:30:20.152  2719  7365 D okhttp.Http2: >> 0x00000000     0 SETTINGS      ACK
02-16 13:30:20.153  2719  3758 D okhttp.Http2: >> 0x00000003 65535 DATA          
02-16 13:30:20.153  2719  7365 D okhttp.TaskRunner: Q10154 finished run in   6 ms: OkHttp <our server> applyAndAckSettings
02-16 13:30:20.156  2719  3758 D okhttp.Http2: >> 0x00000003     1 DATA          
02-16 13:30:20.231  2719  7381 D okhttp.Http2: << 0x00000000     0 SETTINGS      ACK

Following @Yuri Schimke's suggestion, I added PrintingEventListener, here's what I get :

2021-02-17 09:18:50.056 28203-28286/<package name> I/System.out: (HTTPLog)-Static: isSBSettingEnabled false
2021-02-17 09:18:53.378 28203-28286/<package name>  I/System.out: 0.000 callStart
2021-02-17 09:18:53.460 28203-28286/<package name>  I/System.out: 0.081 proxySelectStart
2021-02-17 09:18:53.462 28203-28286/<package name>  I/System.out: 0.084 proxySelectEnd
2021-02-17 09:18:53.462 28203-28286/<package name>  I/System.out: 0.084 dnsStart
2021-02-17 09:18:53.463 28203-28286/<package name>  I/System.out: 0.085 dnsEnd
2021-02-17 09:18:53.475 28203-28286/<package name>  I/System.out: 0.096 connectStart
2021-02-17 09:18:53.650 28203-28286/<package name>  I/System.out: 0.271 secureConnectStart
2021-02-17 09:18:53.796 28203-28286/<package name>  I/System.out: 0.418 secureConnectEnd
2021-02-17 09:18:53.834 28203-28286/<package name>  I/System.out: 0.456 connectEnd
2021-02-17 09:18:53.839 28203-28286/<package name>  I/System.out: 0.461 connectionAcquired
2021-02-17 09:18:53.870 28203-28286/<package name>  I/System.out: 0.492 requestHeadersStart
2021-02-17 09:18:53.891 28203-28286/<package name>  I/System.out: 0.513 requestHeadersEnd
2021-02-17 09:18:53.891 28203-28286/<package name>  I/System.out: 0.513 requestBodyStart
2021-02-17 09:19:03.382 28203-28286/<package name>  I/System.out: 10.003 requestFailed
2021-02-17 09:19:03.391 28203-28311/<package name>  I/System.out: 10.012 canceled
2021-02-17 09:19:03.399 28203-28286/<package name>  I/System.out: 10.021 connectionReleased
2021-02-17 09:19:03.400 28203-28286/<package name>  I/System.out: 10.022 callFailed
java
android
networking
okhttp
asked on Stack Overflow Feb 15, 2021 by BVtp • edited Feb 17, 2021 by YO_USA

1 Answer

0

Ignoring your specific question for a bit. The first challenge is to ask a question that the Stackoverflow responders can easily help you progress on. Everyone here wants to help with your question.

Read up on https://stackoverflow.com/help/how-to-ask and maybe http://sscce.org

For OkHttp use the debugging tools it provides to get additional data that explains what is going on, even if it's not immediately clear. Are you getting multiple routes back and the first one is slow and then gets marked as bad?

Event Logging - https://square.github.io/okhttp/events/

HTTP/2 Frame Logging (if you can enable) - https://square.github.io/okhttp/debug_logging/#http2-frame-logging

Run a debugger and capture stack traces of what is going on? Are the IO threads still reading and blocking the network?

answered on Stack Overflow Feb 16, 2021 by Yuri Schimke

User contributions licensed under CC BY-SA 3.0