MSXML2.XMLHTTP generating different results than if I simply visited the page in a web browser

StackOverflow https://stackoverflow.com/questions/13691414

  •  04-12-2021

Question

I use Excel VBA to download the page source of certain websites the following way:

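    ' Requires a reference to a "Microsoft XML" library (Tools > References).
    Public Function GetPage(URL As String) As String
        Dim oX As New MSXML2.XMLHTTP
        ' Synchronous request (third argument False); HTTP method names are conventionally upper case.
        oX.Open "GET", URL, False
        oX.Send
        GetPage = oX.responseText
    End Function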

However, on careful inspection I noticed that the HTML source this code returns differs from the HTML source I get when I visit the site with a web browser.

This is the website I'm visiting. At the bottom of the page, it lists two results. Now, if I visit that URL using my GetPage function, it returns the main page's HTML but it doesn't include any of the results--the source specifically says "0 result(s) found." What gives? It doesn't appear to be using JavaScript to replace certain HTML elements, so I'm at a loss. I do notice that when I visit the site myself it goes slowly, but when I run the VBA HTTP request it seems almost instantaneous. Perhaps I have to wait for something?

Does anyone know why I can't find these results if I do an HTTP request through VBA?

Solution

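Try removing the fragment identifier #results from the end of the URL. Although a fragment is perfectly legal URL syntax, it is meant for the client only, and a browser strips it before sending the request. Here it seems to reach the server and end up in the site's SQL query, because I see this in the response: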

    ERROR:  column "results" does not exist
    LINE 1: ... = 'O') AND (parent_part_info IS NULL))) LIMIT 10#results OF...

When I request the URL without the #results part, the response comes back after a couple of seconds with the same 2 results shown when visiting the URL in a browser.
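If the URLs come from somewhere you don't control, the fragment can be removed in code before the request is sent. The following is only a minimal sketch of that idea: `StripFragment` and `GetPageNoFragment` are hypothetical names introduced here, not part of the original question, and the code assumes the project has a reference to a "Microsoft XML" library.

```vba
' Hypothetical helper: returns the URL with everything from the first "#" removed.
Public Function StripFragment(ByVal URL As String) As String
    Dim hashPos As Long
    hashPos = InStr(1, URL, "#")
    If hashPos > 0 Then
        StripFragment = Left$(URL, hashPos - 1)
    Else
        StripFragment = URL   ' no fragment present, nothing to strip
    End If
End Function

' Hypothetical variant of GetPage that never sends the fragment to the server.
Public Function GetPageNoFragment(ByVal URL As String) As String
    Dim oX As New MSXML2.XMLHTTP
    oX.Open "GET", StripFragment(URL), False   ' False = synchronous request
    oX.send
    GetPageNoFragment = oX.responseText
End Function
```

For example, `StripFragment("http://example.com/search?q=1#results")` returns `http://example.com/search?q=1` (the example.com address is a placeholder, not the site from the question).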

Also, MSXML2.XMLHTTP is a synonym for MSXML2.XMLHTTP30, i.e. the version originally from the "Microsoft XML, v3.0" library. If your project references the "Microsoft XML, v6.0" library, it is usually advisable to declare the object as MSXML2.XMLHTTP60 instead. See this Microsoft blog post.
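As a rough illustration of the version-specific class, the function from the question could be declared against the v6.0 library as below. This is a sketch under assumptions: `GetPage60` is a hypothetical name, the status check and the error number are additions for illustration and not something the original code did, and the project must have "Microsoft XML, v6.0" ticked under Tools > References.

```vba
' Requires a reference to "Microsoft XML, v6.0".
Public Function GetPage60(ByVal URL As String) As String
    Dim oX As MSXML2.XMLHTTP60
    Set oX = New MSXML2.XMLHTTP60

    oX.Open "GET", URL, False   ' False = synchronous request
    oX.send

    If oX.Status = 200 Then
        GetPage60 = oX.responseText
    Else
        ' Illustrative error handling: surface the HTTP status to the caller.
        Err.Raise vbObjectError + 513, "GetPage60", _
                  "HTTP " & oX.Status & " " & oX.statusText
    End If
End Function
```

Checking the status makes a server-side failure visible instead of silently returning whatever body came back, although a server may also report an error inside a page it serves with status 200.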

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow