A VB.NET Windows Forms application uses the WebBrowser control with mshtml's HTMLDocument, HTMLTable and HTMLTableRow to retrieve the innerText of HTML table cells by row and column. It works the first time but fails on every subsequent attempt.
Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    Dim stockNo As String = ""
    Dim stockName As String
    Dim doc As mshtml.HTMLDocument
    Dim table As mshtml.HTMLTable
    Dim rows As mshtml.HTMLTableRow
    doc = WebBrowser1.Document.DomDocument
    table = doc.getElementsByTagName("TABLE").item(0)
    For r = 3 To table.rows.length - 1
        rows = table.rows.item(r)
        Try
            stockNo = Replace(rows.cells(0).innerText, " ", "")
            stockName = Replace(rows.cells(1).innerText, " ", "")
        Catch ex As Exception
            Console.WriteLine("Error here: =====> " & ex.ToString)
            Console.WriteLine(rows.cells(0))
        End Try
    Next r
End Sub
Here is the error raised when executing "rows.cells(0).innerText":
Error here: =====> System.NotSupportedException: Exception from HRESULT: 0x800A01B6
Microsoft.VisualBasic.CompilerServices.LateBinding.LateGet(Object o, Type objType, String name, Object[] args, String[] paramnames, Boolean[] CopyBack)
Microsoft.VisualBasic.CompilerServices.NewLateBinding.LateGet(Object Instance, Type Type, String MemberName, Object[] Arguments, String[] ArgumentNames, Type[] TypeArguments, Boolean[] CopyBack)
I also tried handling WebBrowser1_ProgressChanged, but it still does not work. Any clue would help. Thanks.
Here are two examples that perform the same task: one uses the mshtml.HTMLDocument interface, the other uses the WebBrowser Document object.
When handling the DocumentCompleted event, first check the WebBrowser's ReadyState. If it is not WebBrowserReadyState.Complete, the current Document is not yet ready to be parsed. Note that a single HTML page can contain more than one HtmlDocument (Frames and IFrames each have their own Document), so this event can be raised multiple times per page.
If WebBrowser1.ReadyState <> WebBrowserReadyState.Complete Then Return
To avoid the late-binding warning or error, assign the WebBrowser's HtmlDocument to a strongly typed local variable. Do the same if you're using the mshtml.HTMLDocument interface:
Dim wbDoc As HtmlDocument = DirectCast(sender, WebBrowser).Document
Dim htmlDoc As mshtml.HTMLDocument = DirectCast(wbDoc.DomDocument, mshtml.HTMLDocument)
As the two code snippets show, in this case there is almost no difference between using one object or the other:
Using the mshtml.HTMLDocument interface:
Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    If WebBrowser1.ReadyState <> WebBrowserReadyState.Complete Then Return
    Dim startingRow As Integer = 3
    Dim wbDoc As HtmlDocument = DirectCast(sender, WebBrowser).Document
    Dim htmlDoc As mshtml.HTMLDocument = DirectCast(wbDoc.DomDocument, mshtml.HTMLDocument)
    Dim firstTable As mshtml.HTMLTable = htmlDoc.getElementsByTagName("TABLE").OfType(Of mshtml.HTMLTable)().FirstOrDefault()
    If firstTable IsNot Nothing Then
        For tableRow As Integer = startingRow To firstTable.rows.length - 1
            Dim row As mshtml.HTMLTableRow = DirectCast(firstTable.rows.item(tableRow), mshtml.HTMLTableRow)
            For col As Integer = 0 To 1
                Dim rowCell = DirectCast(row.cells.item(col), mshtml.HTMLTableCell)
                If rowCell IsNot Nothing Then
                    rowCell.innerText = rowCell.innerText?.Replace(" ", "")
                Else
                    'Decide what to do if the cell content is null
                End If
            Next
        Next
    End If
End Sub
Using the WebBrowser.Document directly:
Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
    If WebBrowser1.ReadyState <> WebBrowserReadyState.Complete Then Return
    Dim startingRow As Integer = 3
    Dim doc As HtmlDocument = DirectCast(sender, WebBrowser).Document
    Dim firstTable As HtmlElement = doc.GetElementsByTagName("TABLE").OfType(Of HtmlElement)().FirstOrDefault()
    If firstTable?.Children.Count > 0 Then
        For tableRow As Integer = startingRow To firstTable.Children.Count - 1
            Dim rowCells As HtmlElementCollection = firstTable.Children(tableRow).Children
            If rowCells Is Nothing Then Continue For
            For col As Integer = 0 To 1
                If Not String.IsNullOrEmpty(rowCells(col).InnerText) Then
                    rowCells(col).InnerText = rowCells(col).InnerText.Replace(" ", "")
                Else
                    'Decide what to do if the cell content is null
                End If
            Next
        Next
    End If
End Sub
Finally, I think jmcilhinney's advice, "make sure all casts and conversions are done explicitly", is right.
rows.cells(0).innerText ===> fails on subsequent calls, though I do not know why the first call succeeds
rows = table.rows.item(r) ===> OK, as long as all casts and conversions are done explicitly
cell0 = rows.cells.item(0)
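For reference, here is a minimal sketch of the original loop with every cast made explicit, so that nothing goes through late binding. It assumes Option Strict On and a reference to the Microsoft.mshtml interop assembly. The module name StockTableReader and the function ReadStockRows are hypothetical names introduced only for this illustration; they are not part of the original code.

```vbnet
Option Strict On

Imports System.Collections.Generic
Imports System.Windows.Forms

' Assumes a project reference to the Microsoft.mshtml interop assembly.
Module StockTableReader

    ' Hypothetical helper (not from the original post): reads the first two
    ' cells of each row of the first TABLE, starting at startingRow, and
    ' returns them as (stockNo, stockName) pairs.
    Public Function ReadStockRows(browser As WebBrowser, startingRow As Integer) As List(Of KeyValuePair(Of String, String))
        Dim result As New List(Of KeyValuePair(Of String, String))()
        If browser.ReadyState <> WebBrowserReadyState.Complete Then Return result
        If browser.Document Is Nothing Then Return result

        ' Every COM object is cast to its interface before its members are used.
        Dim doc As mshtml.HTMLDocument = DirectCast(browser.Document.DomDocument, mshtml.HTMLDocument)
        Dim table As mshtml.HTMLTable = TryCast(doc.getElementsByTagName("TABLE").item(0), mshtml.HTMLTable)
        If table Is Nothing Then Return result

        For r As Integer = startingRow To table.rows.length - 1
            Dim row As mshtml.HTMLTableRow = TryCast(table.rows.item(r), mshtml.HTMLTableRow)
            If row Is Nothing OrElse row.cells.length < 2 Then Continue For

            Dim cell0 As mshtml.HTMLTableCell = TryCast(row.cells.item(0), mshtml.HTMLTableCell)
            Dim cell1 As mshtml.HTMLTableCell = TryCast(row.cells.item(1), mshtml.HTMLTableCell)
            If cell0 Is Nothing OrElse cell1 Is Nothing Then Continue For

            ' innerText can be Nothing for an empty cell, so fall back to "".
            Dim stockNo As String = If(cell0.innerText, "").Replace(" ", "")
            Dim stockName As String = If(cell1.innerText, "").Replace(" ", "")
            result.Add(New KeyValuePair(Of String, String)(stockNo, stockName))
        Next

        Return result
    End Function

End Module
```

From the DocumentCompleted handler this could be called as ReadStockRows(DirectCast(sender, WebBrowser), 3). With Option Strict On, the compiler rejects a late-bound call such as rows.cells(0).innerText at build time instead of letting it fail at run time.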
Thanks...