string url = "http://www.example.com/feed.xml";
var settings = new XmlReaderSettings();
settings.IgnoreComments = true;
settings.IgnoreProcessingInstructions = true;
settings.IgnoreWhitespace = true;
settings.XmlResolver = null;
settings.DtdProcessing = DtdProcessing.Parse;
settings.CheckCharacters = false;
var request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 900000;
request.KeepAlive = true;
request.IfModifiedSince = lastModified;
var response = (HttpWebResponse)request.GetResponse();
Stream stream;
stream = response.GetResponseStream();
stream.ReadTimeout = 600000;
var xmlReader = XmlReader.Create(stream, settings);
while (!xmlReader.EOF)
{
...
When I try this on a large XML file (which is also very slow to download), my Azure web app returns a blank page after a couple of minutes.
I saw this in Azure's Failed Request Tracing logs:
ModuleName: DynamicCompressionModule
Notification: SEND_RESPONSE
HttpStatus: 500
HttpReason: Internal Server Error
HttpSubStatus: 19
ErrorCode: An operation was attempted on a nonexistent network connection. (0x800704cd)
As you can see, I have been "playing around" with the timeout settings. I also tried catching all exceptions, but none are caught.
Also, this works without problems when debugging the web app locally on my computer. It could be that the internet connection at my office is better than Azure's, so the XML file is read fast enough to avoid the problem.
Any possible workarounds? Edit: I want to keep streaming the XML file (I'm avoiding downloading the whole file because the user has an option to read only the first N entries of the feed). If the problem described above can't be avoided, I would be happy if someone could at least help me display a meaningful message to the user instead of a blank page.
Try using the WebClient class to get the XML file.
string xmlAsString;
using (var xmlWebClient = new WebClient())
{
    xmlWebClient.Encoding = Encoding.UTF8;
    xmlAsString = xmlWebClient.DownloadString(url);
}
XmlDocument currentXml = new XmlDocument();
// LoadXml parses XML text; Load would treat the string as a file path or URL.
currentXml.LoadXml(xmlAsString);
You could just use
string url = "http://www.example.com/feed.xml";
using (var reader = XmlReader.Create(url)) {
This should work because XmlReader.Create has an overload that accepts a URI string. The entries can then be streamed to the caller through yield return. This is probably your best bet, since you can let the native component handle the streaming the way it wants. You could even read large text values in chunks via the ReadValueChunk method.
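As a rough illustration of the yield return idea, here is a minimal sketch that lazily yields elements from an XmlReader and stops after the first N. The class name FeedStreaming, the method names, and the element name "entry" are assumptions made for the example; they are not from the question's feed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class FeedStreaming
{
    // Lazily yields each element named elementName as the reader reaches it.
    // Nothing further is parsed until the caller asks for the next item.
    public static IEnumerable<XElement> StreamElements(XmlReader reader, string elementName)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
            {
                // ReadFrom consumes the whole element and leaves the reader on
                // the node that follows it, so Read() must not be called here.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }

    // Reads at most maxEntries elements, then stops pulling from the source.
    public static List<XElement> ReadFirst(TextReader source, string elementName, int maxEntries)
    {
        using (var reader = XmlReader.Create(source))
        {
            // ToList runs inside the using block, while the reader is still open.
            return StreamElements(reader, elementName).Take(maxEntries).ToList();
        }
    }
}
```

For a remote feed, the same StreamElements method could be given a reader created from the URL or from the response stream instead of a TextReader.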
Another consideration, and the one I would guess is the issue, is the size of your Azure instance. Azure instances have a notoriously small amount of memory unless you are on the highest tier.
I also do not see you disposing of any of your streams, which can lead to memory leaks and excessive memory usage.
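A minimal sketch of the disposal pattern, assuming the stream is handed to a helper that owns it from then on. The class and method names are made up for illustration. CloseInput is a real XmlReaderSettings property; it makes disposing the reader close the underlying stream as well.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class DisposalSketch
{
    // Counts elements with the given name. The using block releases the
    // reader (and, through CloseInput, the stream) even if parsing throws.
    public static int CountElements(Stream stream, string elementName)
    {
        var settings = new XmlReaderSettings
        {
            IgnoreWhitespace = true,
            CloseInput = true // disposing the reader also closes the stream
        };

        using (var reader = XmlReader.Create(stream, settings))
        {
            int count = 0;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    count++;
                }
            }
            return count;
        }
    }
}
```

In the question's code, the HttpWebResponse returned by GetResponse is also disposable, so it would be wrapped in its own using block around the stream and the reader.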
Considering that it works on your machine, that most personal computers are at least as powerful as an A3 instance (one tier below the top), and that an IDE helps clean up memory leaks locally, it seems plausible that the Azure instance is the issue.
One potential solution would be to use file streaming. Memory streaming and file streaming behave very similarly beyond a certain size: one uses the file system, while the other uses a system file (IIRC pagefile.sys). Converting to a file stream would therefore have little impact on performance, with the drawback of having to clean up the file when you are done. But when dollars are a consideration, disk streaming is cheaper in the Azure world.
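One way the file streaming idea might look, as a sketch only: the download is copied to a temporary file and parsed from there. TempFileSpool and SpoolToTempFile are hypothetical names. FileOptions.DeleteOnClose handles the cleanup mentioned above by removing the file when the stream is disposed. Note that this downloads the whole file first, which is the trade-off of this approach.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class TempFileSpool
{
    // Copies the source stream to a temporary file and returns a read-only
    // stream over that file. The file is deleted when the stream is disposed.
    public static FileStream SpoolToTempFile(Stream source)
    {
        string path = Path.GetTempFileName();
        try
        {
            using (var writer = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.Read))
            {
                source.CopyTo(writer);
            }
        }
        catch
        {
            // Do not leave a partial file behind if the copy fails.
            File.Delete(path);
            throw;
        }

        return new FileStream(path, FileMode.Open, FileAccess.Read,
                              FileShare.None, 4096, FileOptions.DeleteOnClose);
    }
}
```

The returned stream can be passed to XmlReader.Create like any other stream; wrapping it in a using block is what triggers the deletion.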
Try this:
static IEnumerable<XElement> StreamCustomerItem(string uri)
{
    using (XmlReader reader = XmlReader.Create(uri))
    {
        XElement name = null;
        XElement item = null;
        reader.MoveToContent();
        // Parse the file, save header information when encountered, and yield the
        // Item XElement objects as they are created.
        // loop through Customer elements
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element
                && reader.Name == "Customer")
            {
                // move to Name element
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element &&
                        reader.Name == "Name")
                    {
                        name = XElement.ReadFrom(reader) as XElement;
                        break;
                    }
                }
                // loop through Item elements
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.EndElement)
                        break;
                    if (reader.NodeType == XmlNodeType.Element
                        && reader.Name == "Item")
                    {
                        item = XElement.ReadFrom(reader) as XElement;
                        if (item != null)
                        {
                            XElement tempRoot = new XElement("Root",
                                new XElement(name)
                            );
                            tempRoot.Add(item);
                            yield return item;
                        }
                    }
                }
            }
        }
    }
}
static void Main(string[] args)
{
    XStreamingElement root = new XStreamingElement("Root",
        from el in StreamCustomerItem("Source.xml")
        select new XElement("Item",
            new XElement("Customer", (string)el.Parent.Element("Name")),
            new XElement(el.Element("Key"))
        )
    );
    root.Save("Test.xml");
    Console.WriteLine(File.ReadAllText("Test.xml"));
}
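The XML below is the output this code writes to Test.xml. The input, Source.xml, needs Customer elements that each contain a Name element followed by Item elements.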
<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Item>
    <Customer>A. Datum Corporation</Customer>
    <Key>0001</Key>
  </Item>
  <Item>
    <Customer>A. Datum Corporation</Customer>
    <Key>0002</Key>
  </Item>
  <Item>
    <Customer>A. Datum Corporation</Customer>
    <Key>0003</Key>
  </Item>
  <Item>
    <Customer>A. Datum Corporation</Customer>
    <Key>0004</Key>
  </Item>
  <Item>
    <Customer>Fabrikam, Inc.</Customer>
    <Key>0005</Key>
  </Item>
  <Item>
    <Customer>Fabrikam, Inc.</Customer>
    <Key>0006</Key>
  </Item>
  <Item>
    <Customer>Fabrikam, Inc.</Customer>
    <Key>0007</Key>
  </Item>
  <Item>
    <Customer>Fabrikam, Inc.</Customer>
    <Key>0008</Key>
  </Item>
  <Item>
    <Customer>Southridge Video</Customer>
    <Key>0009</Key>
  </Item>
  <Item>
    <Customer>Southridge Video</Customer>
    <Key>0010</Key>
  </Item>
</Root>