list[0] mangles XPath results #3

@sjehuda

Consider using the str.join(iterable) method:

descriptions = node.xpath('.//p/descendant-or-self::text()')
description = '' if len(descriptions) == 0 else '<br>'.join(descriptions)

This is where the problem currently lies:

descriptions = node.xpath('.//p/descendant-or-self::text()')
description = '' if len(descriptions) == 0 else descriptions[0]

Our example target is <p>We are currently looking at a node with a <b>child node</b> within it.</p>, and we use .//p/descendant-or-self::text() to capture every text node between <p> and </p>.

Printing the list shows:
[u'We are currently looking at a node with a ', u'child node', u' within it.']

This means that list[0] contains "We are currently looking at a node with a ", list[1] contains "child node", and list[2] contains " within it.".

| list | content |
| --- | --- |
| list[0] | We are currently looking at a node with a |
| list[1] | child node |
| list[2] | within it. |

By taking only list[0], we corrupt the desired XPath result: the output contains only the text between <p> and the first <b>, and everything after it is dropped.
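The truncation can be reproduced with a minimal sketch. To keep it runnable without extra packages, it uses the standard library's xml.etree.ElementTree, whose itertext() stands in for the lxml call node.xpath('.//p/descendant-or-self::text()'); the wrapping <div> and the variable names first_only and joined are assumptions made for illustration, not the project's code.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; the <div> wrapper is an assumption so that <p> has a parent.
html = ('<div><p>We are currently looking at a node with a '
        '<b>child node</b> within it.</p></div>')
node = ET.fromstring(html)

# itertext() yields every text node under <p> in document order, which is what
# './/p/descendant-or-self::text()' returns in lxml for this fragment.
descriptions = [text for p in node.iter('p') for text in p.itertext()]

# Current behaviour: only the first text node survives.
first_only = '' if len(descriptions) == 0 else descriptions[0]

# Proposed behaviour: keep every text node.
joined = '' if len(descriptions) == 0 else '<br>'.join(descriptions)

print(descriptions)
print(repr(first_only))
print(repr(joined))
```

The separator is a design choice: '<br>' follows the proposal in this issue, while ''.join(descriptions) would reproduce the paragraph text verbatim.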

This also makes it significantly harder to use XPath string functions such as substring-before, substring-after and substring, and may force us into excessive for loops to do work that XPath's own functions already handle.
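As a rough sketch of what those string functions offer, the following assumes lxml (which the node.xpath(...) calls above suggest) and the same illustrative fragment. The XPath 1.0 string() function converts the first matching <p> to its full text, so substring-before and substring-after operate on the whole paragraph without indexing a list or looping.

```python
from lxml import etree

# Illustrative fragment; the <div> wrapper is an assumption so that <p> has a parent.
html = ('<div><p>We are currently looking at a node with a '
        '<b>child node</b> within it.</p></div>')
node = etree.fromstring(html)

# string() concatenates all descendant text of the first <p> in document order.
full_text = node.xpath('string(.//p)')

# XPath string functions then work on the complete text, not on one fragment.
before = node.xpath('substring-before(string(.//p), " within")')
after = node.xpath('substring-after(string(.//p), "looking at ")')

print(repr(full_text))
print(repr(before))
print(repr(after))
```

Note that string() only considers the first <p> it finds; documents with several paragraphs would still need one call per paragraph node.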
