Extract Text from web page displayed in a TWebBrowser
23-09-2019
Question
I use Delphi 7 and would like to extract only the text (no images, etc.) from a web page displayed in a TWebBrowser. Can this be done, and if so, how?
Solution
I used the following...
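// Requires MSHTML in the uses clause (for IHTMLDocument2).
procedure TForm1.WebBrowser1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
var
  Document: IHTMLDocument2;
begin
  Edit1.Text := URL;
  Document := WebBrowser1.Document as IHTMLDocument2;
  // Nothing to read if no document or body is available.
  if (Document = nil) or (Document.body = nil) then
    Exit;
  Memo2.Lines.Add(Trim(Document.body.innerHTML)); // to get HTML
  Memo1.Lines.Add(Trim(Document.body.innerText)); // to get text
end;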
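One thing to watch for: OnDocumentComplete also fires once for every frame on the page, so the handler above may add text to the memos several times. A minimal sketch of one way around that, assuming the same TForm1, WebBrowser1 and Memo1 as above and that your imported TWebBrowser wrapper exposes DefaultInterface, is to act only when pDisp is the top-level browser object:

```pascal
// Sketch only: needs SHDocVw and MSHTML in the uses clause.
procedure TForm1.WebBrowser1DocumentComplete(Sender: TObject;
  const pDisp: IDispatch; var URL: OleVariant);
var
  Document: IHTMLDocument2;
begin
  // pDisp is the browser object (top-level window or frame) that finished
  // loading. Comparing the IUnknown pointers tells us whether it is the
  // top-level one; if not, a frame finished and the page is not done yet.
  if (pDisp as IUnknown) <> (WebBrowser1.DefaultInterface as IUnknown) then
    Exit;

  Document := WebBrowser1.Document as IHTMLDocument2;
  if (Document = nil) or (Document.body = nil) then
    Exit;

  // Replace the memo contents with the visible text of the whole page.
  Memo1.Lines.Text := Trim(Document.body.innerText);
end;
```

I have not tested this against every Delphi 7 type library import, so treat the DefaultInterface call as an assumption and check what your SHDocVw unit provides.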
OTHER TIPS
If you want to load this into a TRichEdit, I suggest looking at the WPTools component, which can load data from an HTML stream and export it as RTF. I use this component for my internal email editor (which appears to be what you're after).
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow