Irregular internet disruption: certain images and JS not loading

First time on ServerFault, and I've got a nice little conundrum.

For a few months now, we've been having issues with our internet connectivity.

Environment:

Servers: 2 Terminal Servers as an RDSFarm running Windows Server 2008 R2
Browser: Internet Explorer 9
Test/debug browser: Chrome
AntiVirus: Avast 7.0.1455

Problem:

At irregular intervals, websites refuse to load, giving an error saying the page was not accessible, or some images don't load completely. On closer inspection, several .js files also fail to load.

Findings & What we tried:

First impression:

When I use Chrome during that interval, the site returns Error 101 (net::ERR_CONNECTION_RESET) or Error 103 (net::ERR_CONNECTION_ABORTED) after some refreshes. At other times, when it isn't giving the error, several images aren't visible and display an X placeholder instead. IE just says the page cannot be displayed.

Using Chrome Developer Tools:

It shows in the console that several resources are unavailable, but when I right-click the missing images and select "Show Picture", they show. When I open up the pictures via direct URL, they also show.

Audit via Chrome Developer Tools:

I ran an audit on a page while it was in its buggy state, and found that some .js files didn't load, along with some .png, .jpg and .gif files. Different images fail to load in Chrome and IE.
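
To catch these failures outside the browser, the same resources can be requested repeatedly from a script, recording which ones fail and how. What follows is only a minimal sketch, assuming Python 3 is available on the affected terminal server. The URL is a placeholder, and `fetch_status`, `summarize` and `main` are hypothetical helper names rather than part of any tool mentioned above.

```python
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url, timeout=10):
    """Return 'ok', or a short failure description, for one GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()  # download the whole body, not just the headers
            return "ok"
    except urllib.error.HTTPError as exc:
        return f"http {exc.code}"
    except Exception as exc:  # resets, timeouts, DNS failures, truncated bodies
        return type(exc).__name__


def summarize(results):
    """Count outcomes per URL from a list of (url, status) pairs."""
    summary = {}
    for url, status in results:
        summary.setdefault(url, Counter())[status] += 1
    return summary


def main():
    # Placeholder URL: replace with the .js and image URLs the audit reported.
    urls = ["http://example.com/static/app.js"]
    results = [(u, fetch_status(u)) for u in urls for _ in range(20)]
    for url, counts in summarize(results).items():
        print(url, dict(counts))
```

Calling `main()` on both terminal servers during a bad interval would show whether the failures are connection resets, timeouts or HTTP errors, independently of IE or Chrome.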

Obfuscated JS Files & Avast:

After checking that out, I found that most of those .js files are obfuscated, and since we're running Avast 7.0.1455, I wondered whether the Web Shield was interfering.

Then again, it's only happening on the first TS, not the second.

So I turned off the Web Shield for a day to see whether anything improved. It didn't. Back to square one.

No cache expiration on files:

Several of the files that fail to load were flagged as having no cache expiration.
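
Whether a given response declares an expiry can be checked directly from its headers. The sketch below assumes the headers have already been collected into a plain dictionary (for example, copied from the Network tab); `cache_expiry` is a hypothetical helper name. A missing expiry normally only changes how long a browser reuses its cached copy, so on its own it seems unlikely to explain resources that fail to load at all.

```python
def cache_expiry(headers):
    """Describe how a response declares freshness, or return None if it does not.

    `headers` is a mapping of header name to value; names are matched
    case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    # Cache-Control max-age / s-maxage take precedence over Expires.
    for directive in lowered.get("cache-control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() in ("max-age", "s-maxage") and value:
            return f"{name.lower()}={value}"

    if "expires" in lowered:
        return f"expires={lowered['expires']}"
    return None
```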

Caching:

One of our sysadmins changed the IE cache size to 10MB a while back, which I thought might have been the source of the problem. He changed it back to 65MB or so, but people still run into trouble with their images. It also still happens on only one TS, and in Chrome as well. The Group Policy that sets the IE cache size wouldn't affect Chrome, would it?

Network Issue:

I also thought it might be a network or routing issue, but both TS servers are on the same teamed NIC, and the other one is working just fine.

Help!

If anyone has tips on where to look, or needs more info, please help me out. This has been bothering me for several weeks now.

EDIT & UPDATE

The problem still persists, and only on our 2 Terminal Servers.

Here's what a colleague and I have done so far:

  • Turned off the antivirus for a day on one server, to see whether the problem went away. Problem still occurred.

  • Checked the MTU size.
    It's the default setting (forgot the exact value :P). Problem still occurred.

  • Installed Windows Updates and IE10. Problem still occurred.

  • Checked if there were any proxies.
    The AV puts in a proxy as a so-called WebShield. We disabled the service and the program on one server for a day. Problem still occurred.

  • Reinstalled the NIC team, as it was getting messed up. (Also reinstalled the NIC drivers.) Problem still occurred.

  • Checked Group Policies.
    Apparently on both Terminal Servers there was a Local Machine Policy that enabled Preference Mode in IE, which had some weird customisation done. Disabled that, and... Problem still occurred.
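
For the MTU check in the list above, reading the configured value is only half the picture. The usual test is to ping an external host with the Don't Fragment flag set and find the largest payload that still gets a reply. The usable path MTU is then that payload plus 28 bytes of IPv4 and ICMP headers. The following is a rough sketch, assuming the Windows `ping` command (`-n` count, `-f` Don't Fragment, `-l` payload size). `largest_passing` and `ping_fits` are hypothetical names, and treating "TTL=" in the output as the sign of a real echo reply is an assumption about the ping output format.

```python
import subprocess

IP_ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP header


def largest_passing(fits, low=0, high=1472):
    """Binary-search the largest payload size in [low, high] for which fits(size) is true.

    Assumes fits is monotonic: if a size passes, every smaller size passes too.
    Returns None if even `low` fails.
    """
    if not fits(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2
        if fits(mid):
            low = mid
        else:
            high = mid - 1
    return low


def ping_fits(host, size):
    """Send one unfragmentable echo request with `size` payload bytes (Windows ping)."""
    cmd = ["ping", "-n", "1", "-f", "-l", str(size), host]
    output = subprocess.run(cmd, capture_output=True).stdout
    # Assumption: a genuine echo reply line contains "TTL=".
    return b"TTL=" in output
```

Something like `largest_passing(lambda size: ping_fits("example.com", size))` could be run on each terminal server and the results compared. A result of 1472 corresponds to a full 1500-byte path MTU, while a smaller value on only one server would point at that server's network path.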

It has now gone so far that people are having problems uploading and downloading files from SharePoint, and a lot of the sites we use aren't working because of this.

Hunches

It could be the WebShield breaking the connection when it finds something peculiar, but then it shouldn't happen while the AV is turned off.

It could also be that redirects are messed up somehow, or that there's something wrong with the cache. Strange, though, that the same issue occurs in Chrome as well as in IE9 and IE10.

If anyone has any ideas, it'd be greatly appreciated.

Thanks go out to HopelessN00b for helping me out!

UPDATE:

We are getting errors like this in Event Viewer on one of our original TSs:

Error: (04/04/2013 08:44:42 AM) (Source: Application Error) (User: )
Description: Faulting application name: iexplore.exe, version: 9.0.8112.16470, time stamp: 0x510c8801
Faulting module name: MSHTML.dll, version: 9.0.8112.16470, time stamp: 0x510c9046
Exception code: 0xc0000005
Fault offset: 0x002d0174
Faulting process id: 0x21728
Faulting application start time: 0xiexplore.exe0
Faulting application path: iexplore.exe1
Faulting module path: iexplore.exe2
Report Id: iexplore.exe3

And sometimes this pops up, but apparently that's because some of the WYSE terminals are too old (hopefully replacing them with Raspberry Pis soon).

Error: (04/04/2013 11:21:46 AM) (Source: TermDD) (User: )
Description: The Terminal Server security layer detected an error in the protocol stream and has disconnected the client.
Client IP: [IP REDACTED].

Hope this helps.

windows-server-2008-r2
internet
terminal-server
cache
rds
asked on Server Fault Mar 7, 2013 by blaa • edited Apr 4, 2013 by blaa

3 Answers

Try without bonding the NICs. Set up just one NIC and see if things work. If they do, make sure that your switch port configuration and teaming configuration line up.

answered on Server Fault Apr 3, 2013 by Grim76

To diagnose the problem without an accurate error message, you need to run:

  • tcpdump on client side (wireshark has a nice display)
  • tcpdump on server side (see what the server is actually sending).
  • wait for the problem to occur
  • examine the packets, and see where the communication is breaking down. If you need help examining the trace, write it to a file.

I suspect you will find an unanswered DNS query. If your ISP is filtering your traffic through a proxy, you should be able to find traces of it in the traffic, especially by comparing the server side capture to the client side capture.
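
Before setting up a full capture, the DNS suspicion can be tested on its own by resolving the affected hostnames in a loop and logging slow or failed lookups. This is only a sketch, assuming Python 3 on the terminal server. `probe` and `slow_or_failed` are hypothetical names, the 2-second threshold is arbitrary, and the operating system's resolver cache may hide some failures.

```python
import socket
import time


def probe(hostnames, resolve=socket.getaddrinfo, clock=time.monotonic):
    """Resolve each hostname once; return (hostname, seconds, error_or_None) tuples."""
    results = []
    for name in hostnames:
        started = clock()
        error = None
        try:
            resolve(name, 80)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            error = str(exc)
        results.append((name, clock() - started, error))
    return results


def slow_or_failed(results, threshold=2.0):
    """Keep only lookups that failed or took longer than `threshold` seconds."""
    return [r for r in results if r[2] is not None or r[1] > threshold]
```

Running `slow_or_failed(probe([...]))` every few seconds with the hostnames of the sites that misbehave, and writing the output to a file, would show whether failed page loads line up with failed lookups.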

If there is a network quality problem, you may be able to observe it more straightforwardly with traceroute. If the network dump shows that communications went smoothly, but the browser cannot display the data provided, then your problem is desktop funnies on the terminal server.

You should run the packet capture on the terminal server that is making the browser connection that is not working.

answered on Server Fault Apr 18, 2013 by Des Cent

The issue has been "resolved" by the ISP. All images, JS and such have been appearing normally for a good week now. The one external site that could not be reached was fixed by the ISP placing a proxy in between.

Unfortunately, exactly why or how this happened remains a mystery, but it's a safe bet that something my ISP changed did the trick.

Thanks all for the support, and although a lot of answers have been very useful, I can't choose one of them to be the correct one, hence my own.

Thanks again for all your time and effort, and I hope no one else will have to cope with such networking strangeness.

answered on Server Fault Apr 22, 2013 by blaa

User contributions licensed under CC BY-SA 3.0