Question
I have a GWT application that I am trying to index.
I am using HtmlUnit to get the content of the generated HTML:
WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);
HtmlPage refDesing = webClient.getPage("http://localhost:8080/MyGWTApp/#page2");
FileOutputStream fos1 = new FileOutputStream("D:\\work\\out\\page2.html");
fos1.write(refDesing.asXml().getBytes());
fos1.close();
But I get the following errors, and the returned page is approximately empty!
Dec 22, 2010 6:16:25 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Expected content type of 'application/javascript' or 'application/ecmascript' for remotely loaded JavaScript element at 'http://xxxxxxxxxxxx/xxxxxxxx/xxxxxxxx/xxxxxxxxxx.nocache.js', but got 'application/x-javascript'.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [485:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [485:29] Error in style rule. Invalid token "\n". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [485:29] Ignoring the following declarations in this rule.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [518:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [518:29] Error in style rule. Invalid token "\n ". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [518:29] Ignoring the following declarations in this rule.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [541:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [541:29] Error in style rule. Invalid token "\n ". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [541:29] Ignoring the following declarations in this rule.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [951:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [951:29] Error in style rule. Invalid token "\n". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [951:29] Ignoring the following declarations in this rule.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [977:24] Error in expression. Invalid token "=". Was expecting one of: <S>, <COMMA>, "/", <PLUS>, "-", <HASH>, <STRING>, ")", <URI>, "inherit", <EMS>, <EXS>, <LENGTH_PX>, <LENGTH_CM>, <LENGTH_MM>, <LENGTH_IN>, <LENGTH_PT>, <LENGTH_PC>, <ANGLE_DEG>, <ANGLE_RAD>, <ANGLE_GRAD>, <TIME_MS>, <TIME_S>, <FREQ_HZ>, <FREQ_KHZ>, <DIMENSION>, <PERCENTAGE>, <NUMBER>, <FUNCTION>, <IDENT>.
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler error
WARNING: CSS error: null [977:29] Error in style rule. Invalid token "\n". Was expecting one of: "}", ";".
Dec 22, 2010 6:16:27 PM com.gargoylesoftware.htmlunit.DefaultCssErrorHandler warning
WARNING: CSS warning: null [977:29] Ignoring the following declarations in this rule.
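Edit:
Note that HtmlUnit does not return all of the data that appears in the original page. Also, what does the "?" mean? I don't think it indicates an encoding error, since all the words are plain ASCII characters.
What I mean by approximately empty: here is a snapshot of the returned HTML: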
<td align="center" style="vertical-align: top;">
<table class="refDesignGrid" cellspacing="5">
<colgroup>
<col/>
</colgroup>
<tbody align="left">
<tr>
<td align="left" style="vertical-align: top;">
<table cellpadding="0" class="categoryItem" cellspacing="0">
<tbody align="left">
<tr>
<td align="left" style="vertical-align: top;">
<div class="header4">
C++
</div>
</td>
</tr>
</tbody>
</table>
</td>
<td align="left" style="vertical-align: top;">
<table cellpadding="0" class="categoryItem" cellspacing="0">
<tbody align="left">
<tr>
<td align="left" style="vertical-align: top;">
<div class="header4">
Java
</div>
</td>
</tr>
</tbody>
</table>
</td>
<td align="left">
<table cellpadding="0" class="categoryItem" cellspacing="0">
<tbody align="left">
<tr>
<td align="left" style="vertical-align: top;">
<div class="header4">
C#
</div>
</td>
</tr>
</tbody>
</table>
</td>
<td>
?
</td>
</tr>
<tr>
<td>
?
</td>
<td>
?
</td>
<td>
?
</td>
<td>
?
</td>
</tr>
<tr>
<td>
?
</td>
<td>
?
</td>
<td>
?
</td>
<td>
?
</td>
</tr>
<tr>
<td>
?
</td>
<td>
?
</td>
<td>
?
</td>
<td>
?
</td>
</tr>
<tr>
<td>
?
</td>
<td>
?
</td>
<td>
?
</td>
<td>
?
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</div>
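Solution 2
The answer is here: http://htmlunit.sourceforge.net/faq.html#ajaxdoesnotwork
The main thread using HtmlUnit may finish executing before the background threads are allowed to run. You have a few options: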
1. webClient.setAjaxController(new NicelyResynchronizingAjaxController()); tells your WebClient instance to re-synchronize asynchronous XHR.
2. webClient.waitForBackgroundJavaScript(10000); or webClient.waitForBackgroundJavaScriptStartingBefore(10000); called right after getting the page and before manipulating it.
3. Explicitly wait for a condition that is expected to be fulfilled once your JavaScript has run, e.g.:
// try 20 times, waiting 0.5 seconds each time, for the page to be filled
for (int i = 0; i < 20; i++) {
    if (condition_to_happen_after_js_execution) {
        break;
    }
    synchronized (page) {
        page.wait(500);
    }
}
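The explicit-wait option can be wrapped in a small reusable helper. The following is only a sketch under stated assumptions: the class name ConditionPoller and the method waitFor are hypothetical (they are not part of HtmlUnit), and Thread.sleep is used in place of the page.wait(500) call from the loop above so that the example depends only on the JDK.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HtmlUnit: polls a condition a bounded number of times.
public class ConditionPoller {

    /**
     * Checks the condition up to maxAttempts times, sleeping sleepMillis after each
     * failed check, then checks once more. Returns true as soon as the condition holds,
     * false if it never held or the thread was interrupted.
     */
    public static boolean waitFor(BooleanSupplier condition, int maxAttempts, long sleepMillis) {
        for (int i = 0; i < maxAttempts; i++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            try {
                // Assumption: a plain sleep stands in for synchronized (page) { page.wait(500); }
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
                return false;
            }
        }
        return condition.getAsBoolean();
    }
}
```

With HtmlUnit, the condition would be whatever signals that the GWT module has finished rendering, for example a check that an expected element or piece of text is present in the page. The boolean result lets the caller tell a timeout apart from success before writing the page to disk.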
Other tips
HtmlUnit can be a bit chatty, and in particular it can make things look worse than they really are.
Create these two classes:
import org.w3c.css.sac.CSSException;
import org.w3c.css.sac.CSSParseException;
import com.gargoylesoftware.htmlunit.DefaultCssErrorHandler;

/*
 * get rid of warnings... and provide a place to hang a break point
 */
public class QuietCssErrorHandler
    extends DefaultCssErrorHandler
{
    @Override public void error( CSSParseException e ) throws CSSException
    {
        super.error( e ) ;
    }

    @Override public void fatalError( CSSParseException e ) throws CSSException
    {
        super.fatalError( e ) ;
    }

    @Override public void warning( CSSParseException e ) throws CSSException
    {
    }
}
and
import com.gargoylesoftware.htmlunit.IncorrectnessListener;

public class SilentIncorrectnessListener
    implements IncorrectnessListener
{
    @Override public void notify( String message, Object origin )
    {
        // do nuttin' honey!
    }
}
Then, when you create your WebClient...
wc.setIncorrectnessListener( new SilentIncorrectnessListener() ) ;
wc.setCssErrorHandler( new QuietCssErrorHandler() ) ;
Then you should get far fewer warnings.
As for "approximately empty"... what does that mean?
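A different way to quiet the output, shown here only as a sketch, is to turn off the logger itself. This assumes HtmlUnit's messages end up in java.util.logging, which the date-then-"WARNING:" format of the log in the question suggests but does not prove; if another logging backend is on the classpath, this will have no effect. The class name HtmlUnitLogSilencer is hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: silences java.util.logging output for the HtmlUnit package.
public class HtmlUnitLogSilencer {

    // Keep a strong reference: java.util.logging holds loggers weakly, so a logger
    // configured only through a temporary reference may be collected and lose its level.
    private static final Logger HTMLUNIT_LOGGER =
            Logger.getLogger("com.gargoylesoftware.htmlunit");

    /** Turns off all log output under the com.gargoylesoftware.htmlunit package. */
    public static void silence() {
        HTMLUNIT_LOGGER.setLevel(Level.OFF);
    }
}
```

Calling HtmlUnitLogSilencer.silence() before creating the WebClient hides every message from that package, including genuine errors, so the two handler classes above are the more selective choice.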