HTML: Get (text) in XPath

Notice: This page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/5453422/

Time: 2020-08-29 07:37:56  Source: igfitidea

Tags: html, dom, xpath, html-parsing

Asked by snoofkin

I have the following DOM structure / HTML, and I want to get (just practicing...) the marked data. [Image: screenshot of the DOM structure, not reproduced here]

It is the text that sits under the h2 element. The div[@class="coordsAgence"] element has more div children below it and more h2 elements, so doing:

div[@class="coordsAgence"]

will get that value, but with additional unneeded text. UPDATE: The value I want from this example is the "GALLIER Dennis" text.

Answered by Liza Daly

It seems you want the first text node in that div:

div[@class="coordsAgence"]/text()[1]

should do it.

Note that this assumes that there is actually no whitespace between those comments inside <div class="coordsAgence">; otherwise that whitespace will constitute additional text nodes that you'll have to account for.
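
To see the whitespace caveat concretely, here is a minimal sketch using Python's standard xml.dom.minidom. minidom has no XPath engine, so the helper first_text_child (a hypothetical name) only mimics what text()[1] selects: the first text-node child of the div. The markup is an assumed stand-in for the question's screenshot, which is not reproduced here.

```python
from xml.dom import minidom


def first_text_child(element):
    """Return the data of the first text-node child (like text()[1]), or None."""
    for child in element.childNodes:
        if child.nodeType == child.TEXT_NODE:
            return child.data
    return None


# Assumed markup: an h2, two adjacent comments, then the wanted text.
compact = (
    '<div class="coordsAgence"><h2>Agence</h2>'
    '<!-- a --><!-- b -->GALLIER Dennis<div>more</div></div>'
)
# Same assumed markup, but with a newline between the two comments.
spaced = (
    '<div class="coordsAgence"><h2>Agence</h2>'
    '<!-- a -->\n<!-- b -->GALLIER Dennis<div>more</div></div>'
)

compact_result = first_text_child(minidom.parseString(compact).documentElement)
spaced_result = first_text_child(minidom.parseString(spaced).documentElement)

print(repr(compact_result))  # 'GALLIER Dennis'
print(repr(spaced_result))   # '\n' (the whitespace-only node comes first)
```

In the second case the whitespace between the comments becomes the first text node, so the wanted text would be text()[2] instead.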

Answered by Wayne

Get the first text node following the first h2 in the div with class "coordsAgence":

div[@class='coordsAgence']/h2[1]/following-sibling::text()[1]

Note that this first expression returns the first text node after the first h2 even when some other node appears between the two. If you want to return the text only when it is the node that immediately follows the first h2, then try something like this:

div[@class='coordsAgence']/h2[1][following-sibling::node()[1][self::text()]]/following-sibling::text()[1]
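
The difference between the two expressions can be sketched with Python's standard xml.dom.minidom. Because minidom cannot evaluate XPath, the two helpers below (hypothetical names) walk nextSibling by hand to mimic each expression. The markup is an assumed example, not the page from the question.

```python
from xml.dom import minidom


def first_following_text(node):
    """Mimic following-sibling::text()[1]: first later sibling that is a text node."""
    sibling = node.nextSibling
    while sibling is not None:
        if sibling.nodeType == sibling.TEXT_NODE:
            return sibling.data
        sibling = sibling.nextSibling
    return None


def immediately_following_text(node):
    """Mimic the stricter form: return text only if the very next sibling is text."""
    sibling = node.nextSibling
    if sibling is not None and sibling.nodeType == sibling.TEXT_NODE:
        return sibling.data
    return None


# Assumed markup: a comment sits between the first h2 and the wanted text.
markup = (
    '<div class="coordsAgence"><h2>Agence</h2>'
    '<!-- note -->GALLIER Dennis<h2>Adresse</h2></div>'
)
div = minidom.parseString(markup).documentElement
first_h2 = div.getElementsByTagName("h2")[0]

loose = first_following_text(first_h2)
strict = immediately_following_text(first_h2)

print(repr(loose))   # 'GALLIER Dennis'
print(repr(strict))  # None, because a comment comes right after the h2
```

Here the looser form skips over the comment and still finds the text, while the stricter form returns nothing because the node right after the h2 is not a text node.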