How do I get the coordinates of HTML elements using C#?

https://stackoverflow.com/questions/1547614

20-09-2019

Question

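I'm planning to develop a web crawler that extracts the coordinates of HTML elements from web pages. I've found that it is possible to get an HTML element's coordinates by using the "mshtml" assembly. What I'd like to know now is whether, and how, I can take only the necessary information (HTML, CSS) from a web page and then use the appropriate mshtml classes to get the correct coordinates of all its HTML elements.

Thanks!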

Solution

I use these C# functions to determine an element's position. You need to pass in a reference to the HTML element in question.

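// Returns the element's horizontal position by summing offsetLeft
// for the element and each of its offset ancestors.
public static int findPosX( mshtml.IHTMLElement obj ) 
{
  int curleft = 0;
  // Walk up the offsetParent chain; the loop stops at the root element,
  // which has no offsetParent.
  while (obj.offsetParent != null ) 
  {
    curleft += obj.offsetLeft;
    obj = obj.offsetParent;
  }

  return curleft;
}

// Returns the element's vertical position by summing offsetTop
// for the element and each of its offset ancestors.
public static int findPosY( mshtml.IHTMLElement obj ) 
{
  int curtop = 0;
  while (obj.offsetParent != null ) 
  {
    curtop += obj.offsetTop;
    obj = obj.offsetParent;
  }

  return curtop;
}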
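The offset-parent walk in these two functions can be tried out without a running browser. The following is only a minimal sketch: `OffsetNode` and `ElementPosition` are hypothetical names, not part of mshtml, and `OffsetNode` stands in for the three `IHTMLElement` members the functions read (`offsetLeft`, `offsetTop`, `offsetParent`).

```csharp
// Hypothetical stand-in for the three mshtml.IHTMLElement members
// that the walk reads; it is not an mshtml type.
public sealed class OffsetNode
{
    public int OffsetLeft { get; }
    public int OffsetTop { get; }
    public OffsetNode OffsetParent { get; }

    public OffsetNode(int offsetLeft, int offsetTop, OffsetNode offsetParent = null)
    {
        OffsetLeft = offsetLeft;
        OffsetTop = offsetTop;
        OffsetParent = offsetParent;
    }
}

public static class ElementPosition
{
    // Sums the offsets of the node and its offset ancestors in one pass.
    // Like the functions above, it stops at the root node (the one with
    // no offset parent) and does not add that node's own offsets.
    public static (int X, int Y) Find(OffsetNode node)
    {
        int x = 0, y = 0;
        while (node != null && node.OffsetParent != null)
        {
            x += node.OffsetLeft;
            y += node.OffsetTop;
            node = node.OffsetParent;
        }
        return (x, y);
    }
}
```

For a chain where a span at (5, 7) sits inside a div at (10, 20), which sits inside a body at (0, 0), `ElementPosition.Find(span)` returns (15, 27). Computing both coordinates in one pass avoids walking the chain twice, which may matter when each property access is a COM call.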

I get the HTML element from the current document like this:

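// Start an instance of IE
SHDocVw.InternetExplorerClass ie = new SHDocVw.InternetExplorerClass();
ie.Visible = true;

// Load a URL (url is a string holding the address of the page)
object flags = null, targetFrameName = null, postData = null, headers = null;
ie.Navigate( url, ref flags, ref targetFrameName, ref postData, ref headers );

// Wait until IE is no longer busy (Thread requires "using System.Threading;")
while( ie.Busy )
{
  Thread.Sleep( 500 );
}

// Get an element from the loaded document.
// getElementById returns null if no element has the given id.
mshtml.HTMLDocumentClass document = ((mshtml.HTMLDocumentClass)ie.Document);
mshtml.IHTMLElement element = document.getElementById("myelementsid");

// element can now be passed to findPosX and findPosY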

Other tips

I don't know how to do this in C#, since it isn't my language of choice, but it can be done in JavaScript, specifically with jQuery's offset() function.
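For comparison, the arithmetic behind a document-relative offset can be sketched in C#. This assumes the approach that, as far as I know, recent jQuery versions take in offset(): the element's bounding rectangle relative to the viewport plus the page's current scroll offsets. `OffsetMath` and its parameter names are hypothetical; the viewport rectangle and scroll values would have to come from the browser (mshtml appears to expose the rectangle through `IHTMLElement2.getBoundingClientRect`).

```csharp
public static class OffsetMath
{
    // Converts viewport-relative coordinates into document-relative ones
    // by adding how far the page has been scrolled.
    //   viewportLeft, viewportTop: top-left corner of the element's
    //                              bounding rectangle, relative to the viewport
    //   scrollX, scrollY:          the page's horizontal and vertical scroll offsets
    public static (double Left, double Top) ToDocumentOffset(
        double viewportLeft, double viewportTop,
        double scrollX, double scrollY)
    {
        return (viewportLeft + scrollX, viewportTop + scrollY);
    }
}
```

An element drawn 50 pixels below the top of the viewport on a page scrolled down by 300 pixels therefore has a document-relative top of 350.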

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow