Only getting half the content #16

@harvest316

Trying NReadability on http://www.propertysearch4u.com.au/buyers-agent-sydney gives me just the following InnerHtml: http://pastebin.com/fuA2QJsH

The first line of content on the actual webpage is "The Buyers Agent Sydney Services that can help Sydneysiders, Interstate investors and Australian Expatriates cost effectively acquire their Sydney property." but the TranscodingResult.ExtractedContent I'm getting starts with "Bidding at auction can be intimidating." which is actually halfway down the page.

Here's how I'm calling it:

using System;

// Fetch the page at strURL and let NReadability extract its main content.
var transcodingInput = new WebTranscodingInput(strURL);
var transcoder = new NReadability.NReadabilityWebTranscoder();
var transcodingResult = transcoder.Transcode(transcodingInput);
if (!transcodingResult.ContentExtracted)
    throw new InvalidOperationException("NReadability did not extract any content.");
// Parse the extracted HTML and select the content container.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(transcodingResult.ExtractedContent);
var bodyNode = doc.DocumentNode.SelectSingleNode("//div[@id='readInner']");

A copy of the original page, in case it changes: http://pastebin.com/20nZ2qRa
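
For anyone trying to reproduce this, a small check like the one below may help confirm whether a given sentence from the page survived extraction. It is only a sketch: `ExtractionCheck`, `ToPlainText` and `ContainsSentence` are hypothetical helper names, not part of NReadability or HtmlAgilityPack, and the tag stripping is a crude regex rather than a real HTML parse.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of NReadability) for checking extracted content.
public static class ExtractionCheck
{
    // Reduces an HTML fragment to plain text: drops tags, decodes entities,
    // and collapses runs of whitespace into single spaces.
    public static string ToPlainText(string html)
    {
        string withoutTags = Regex.Replace(html ?? string.Empty, "<[^>]+>", " ");
        string decoded = WebUtility.HtmlDecode(withoutTags);
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }

    // True when the expected sentence appears anywhere in the extracted content,
    // ignoring case, markup, and whitespace differences.
    public static bool ContainsSentence(string extractedHtml, string expectedSentence)
    {
        string haystack = ToPlainText(extractedHtml);
        string needle = ToPlainText(expectedSentence);
        return haystack.IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With the call above, `ExtractionCheck.ContainsSentence(transcodingResult.ExtractedContent, "The Buyers Agent Sydney Services")` would be expected to return false for this page if the first half of the article is being dropped.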
