Hidden section returned instead of the main article body #21

@razvangoga

Hi,

I've been using your component for some time with good results, but lately we have encountered more and more cases like this one: http://ir.tcfbank.com/file/Index?KeyFile=32068838. Here NR extracts the terms and conditions text (hidden, and shown only in a popup when you click the "terms and conditions" link at the bottom of the article) instead of the actual article body.
Technically the decision is correct, as the "T&C" text is larger and more compact than the main article body.

In other cases the (now) omnipresent "this website uses cookies" text is chosen instead of the article on the same grounds.

Do you have any plans to address such issues in the near future?

For the moment we have resolved it by using an in-house modified version of NR in which we can tweak the algorithm's regexes on a case-by-case basis to exclude the irrelevant content.
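
To illustrate the kind of exclusion described above, here is a minimal sketch in Python. It is not NR's code or API: it is a standalone pre-filter that drops hidden blocks and blocks whose `id`/`class` match a pattern before the HTML is handed to any extractor. The names `EXCLUDE_NAME`, `HIDDEN_STYLE`, `BlockFilter` and `strip_irrelevant_blocks`, as well as the patterns themselves, are assumptions made for this example and would need tuning per site.

```python
import re
from html.parser import HTMLParser

# Hypothetical patterns for this sketch; adjust them case by case.
EXCLUDE_NAME = re.compile(r"cookie|consent|terms|disclaimer|popup|modal", re.I)
HIDDEN_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlockFilter(HTMLParser):
    """Re-emits HTML while skipping excluded subtrees.

    Limitations of this sketch: comments and the doctype are dropped, and
    unclosed tags inside a skipped subtree can confuse the depth counter.
    """

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside an excluded subtree

    def _excluded(self, attrs):
        a = {k: (v or "") for k, v in attrs}
        name = a.get("id", "") + " " + a.get("class", "")
        return bool(EXCLUDE_NAME.search(name)
                    or HIDDEN_STYLE.search(a.get("style", ""))
                    or "hidden" in a)  # boolean `hidden` attribute

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if self._excluded(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the depth.
        if not self.skip_depth and not self._excluded(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_irrelevant_blocks(html: str) -> str:
    """Return the HTML with hidden and pattern-matched blocks removed."""
    f = BlockFilter()
    f.feed(html)
    f.close()
    return "".join(f.out)


page = (
    '<div id="article"><p>Earnings rose 5%.<br>More soon.</p></div>'
    '<div class="terms-popup" style="display:none">'
    '<p>Long legal text <img src="x.png"> here.</p></div>'
    '<div id="cookie-banner">This website uses cookies</div>'
)
print(strip_irrelevant_blocks(page))
```

In this made-up page, only the `article` block survives, so a size-based scorer no longer sees the larger hidden text. The obvious drawback is the same as with our regex tweaks: the patterns are heuristics and can remove legitimate content on some sites.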

Thanks and best regards,
Razvan Goga
