I have an app that uploads some files (png, mp4).
The full stack is Nginx + Node server behind it.
Sometimes, it seems that some users experience bizarre behavior in production when the internet is problematic, and then the Android app cannot send further Upload requests cause they all get :
IOException : java.net.SocketTimeoutException: timeout
.
However, the strangest thing is that we see the server does send a response (409 Conflict, meaning the files being uploaded already exist and done uploading), but the Android app simply doesn't get those responses and just keeps on trying to upload those files, but the connection keeps failing because of SocketTimeoutException every time again and again, instead of receiving the 409 response.
And then, after some time, it gets back to normal and suddenly can receive that 409 response and continue as if everything is fine.
I tried using the most basic HttpUrlConnection, as well as Square's OkHttp library, but it results in the same behavior.
When I debug it with Okhttp + HTTP/2 + a Proxy to monitor the requests, in the frame logging I see (with a Proxy enabled):
02-16 13:30:16.047 2719 6597 D okhttp.Http2: << 0x00000003 4 RST_STREAM
02-16 13:30:16.049 2719 3758 D okhttp.TaskRunner: Q10117 canceled : OkHttp ConnectionPool
02-16 13:30:16.051 2719 6597 D okhttp.Http2: >> 0x00000000 8 GOAWAY
02-16 13:30:16.051 2719 6597 D okhttp.TaskRunner: Q10121 finished run in 60 s : <our server>
at this point my code already received an error from my server implementation. Then I see :
02-16 13:30:20.129 2719 3758 D okhttp.Http2: >> CONNECTION 505249202a20485454502f322e300d0a0d0a534d0d0a0d0a
02-16 13:30:20.131 2719 3758 D okhttp.Http2: >> 0x00000000 6 SETTINGS
02-16 13:30:20.132 2719 3758 D okhttp.Http2: >> 0x00000000 4 WINDOW_UPDATE
02-16 13:30:20.133 2719 3758 D okhttp.TaskRunner: Q10157 scheduled after 0 µs: OkHttp <our server>
02-16 13:30:20.135 2719 3758 D okhttp.TaskRunner: Q10153 scheduled after 0 µs: OkHttp ConnectionPool
02-16 13:30:20.135 2719 7381 D okhttp.TaskRunner: Q10157 starting : OkHttp <our server>
02-16 13:30:20.136 2719 7365 D okhttp.TaskRunner: Q10153 starting : OkHttp ConnectionPool
02-16 13:30:20.137 2719 7365 D okhttp.TaskRunner: Q10153 run again after 300 s : OkHttp ConnectionPool
02-16 13:30:20.137 2719 7365 D okhttp.TaskRunner: Q10153 finished run in 1 ms: OkHttp ConnectionPool
02-16 13:30:20.138 2719 3758 D okhttp.Http2: >> 0x00000003 1063 HEADERS END_HEADERS
02-16 13:30:20.145 2719 7381 D okhttp.Http2: << 0x00000000 18 SETTINGS
02-16 13:30:20.146 2719 7381 D okhttp.TaskRunner: Q10154 scheduled after 0 µs: OkHttp <our server> applyAndAckSettings
02-16 13:30:20.147 2719 7381 D okhttp.Http2: << 0x00000000 4 WINDOW_UPDATE
02-16 13:30:20.147 2719 7365 D okhttp.TaskRunner: Q10154 starting : OkHttp <our server> applyAndAckSettings
02-16 13:30:20.149 2719 7365 D okhttp.TaskRunner: Q10156 scheduled after 0 µs: OkHttp <our server> onSettings
02-16 13:30:20.151 2719 6597 D okhttp.TaskRunner: Q10156 starting : OkHttp <our server> onSettings
02-16 13:30:20.151 2719 6597 D okhttp.TaskRunner: Q10156 finished run in 1 ms: OkHttp <our server> onSettings
02-16 13:30:20.152 2719 7365 D okhttp.Http2: >> 0x00000000 0 SETTINGS ACK
02-16 13:30:20.153 2719 3758 D okhttp.Http2: >> 0x00000003 65535 DATA
02-16 13:30:20.153 2719 7365 D okhttp.TaskRunner: Q10154 finished run in 6 ms: OkHttp <our server> applyAndAckSettings
02-16 13:30:20.156 2719 3758 D okhttp.Http2: >> 0x00000003 1 DATA
02-16 13:30:20.231 2719 7381 D okhttp.Http2: << 0x00000000 0 SETTINGS ACK
Following @Yuri Schimke's suggestion, I added PrintingEventListener, here's what I get :
2021-02-17 09:18:50.056 28203-28286/<package name> I/System.out: (HTTPLog)-Static: isSBSettingEnabled false
2021-02-17 09:18:53.378 28203-28286/<package name> I/System.out: 0.000 callStart
2021-02-17 09:18:53.460 28203-28286/<package name> I/System.out: 0.081 proxySelectStart
2021-02-17 09:18:53.462 28203-28286/<package name> I/System.out: 0.084 proxySelectEnd
2021-02-17 09:18:53.462 28203-28286/<package name> I/System.out: 0.084 dnsStart
2021-02-17 09:18:53.463 28203-28286/<package name> I/System.out: 0.085 dnsEnd
2021-02-17 09:18:53.475 28203-28286/<package name> I/System.out: 0.096 connectStart
2021-02-17 09:18:53.650 28203-28286/<package name> I/System.out: 0.271 secureConnectStart
2021-02-17 09:18:53.796 28203-28286/<package name> I/System.out: 0.418 secureConnectEnd
2021-02-17 09:18:53.834 28203-28286/<package name> I/System.out: 0.456 connectEnd
2021-02-17 09:18:53.839 28203-28286/<package name> I/System.out: 0.461 connectionAcquired
2021-02-17 09:18:53.870 28203-28286/<package name> I/System.out: 0.492 requestHeadersStart
2021-02-17 09:18:53.891 28203-28286/<package name> I/System.out: 0.513 requestHeadersEnd
2021-02-17 09:18:53.891 28203-28286/<package name> I/System.out: 0.513 requestBodyStart
2021-02-17 09:19:03.382 28203-28286/<package name> I/System.out: 10.003 requestFailed
2021-02-17 09:19:03.391 28203-28311/<package name> I/System.out: 10.012 canceled
2021-02-17 09:19:03.399 28203-28286/<package name> I/System.out: 10.021 connectionReleased
2021-02-17 09:19:03.400 28203-28286/<package name> I/System.out: 10.022 callFailed
Ignoring your specific question for a bit. The first challenge is to ask a question that the Stackoverflow responders can easily help you progress on. Everyone here wants to help with your question.
Read up on https://stackoverflow.com/help/how-to-ask and maybe http://sscce.org
For OkHttp use the debugging tools it provides to get additional data that explains what is going on, even if it's not immediately clear. Are you getting multiple routes back and the first one is slow and then gets marked as bad?
Event Logging - https://square.github.io/okhttp/events/
HTTP/2 Frame Logging (if you can enable) - https://square.github.io/okhttp/debug_logging/#http2-frame-logging
Run a debugger and capture stack traces of what is going on? Are the IO threads still reading and blocking the network?
User contributions licensed under CC BY-SA 3.0