Introduction to XPath
XPath (XML Path Language) is a language for locating and selecting nodes in XML documents. It provides a concise, efficient way to query and extract XML data. XPath was originally designed for XSLT, but it has since become the standard query language of many XML processing tools and APIs.
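What XPath Is Used For
- Locating specific nodes in an XML document
- Selecting node sets that satisfy given conditions
- Extracting the content and attributes of nodes
- Filtering nodes and, in combination with a host language such as XSLT, sorting them
- Matching templates and transforming data in XSLT
- Parsing and processing XML in programming languages such as JavaScript and Python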
XPath Versions
- XPath 1.0: released in 1999; currently the most widely supported version
- XPath 2.0: released in 2007; extends XPath 1.0 with support for sequences, regular expressions, and more
- XPath 3.0: released in 2014; further extends XPath 2.0
- XPath 3.1: released in 2017; adds support for JSON data
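XPath Features
- Based on path expressions, with a concise and readable syntax
- Supports several node types and selection methods
- Provides a rich function library for string, numeric, and boolean operations
- Supports axes, which allow flexible navigation between nodes
- Supports predicates, which filter nodes by condition
- Expressions can be nested to build complex queries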
XPath Node Types
In XPath, an XML document is treated as a tree of nodes, and every node has a specific type. XPath defines seven node types:
| Node type | Description | Example |
|---|---|---|
| Document node | The root of the entire XML document | / |
| Element node | An XML element | <book>, <title> |
| Attribute node | An attribute of an element | id="B001", name="title" |
| Text node | The text content of an element | text content |
| Comment node | An XML comment | <!-- This is a comment --> |
| Processing instruction node | An XML processing instruction (the XML declaration <?xml version="1.0"?> is not one) | <?xml-stylesheet type="text/xsl" href="style.xsl"?> |
| Namespace node | A namespace in scope for an element | xmlns="http://example.com" |
<?xml version="1.0" encoding="UTF-8"?>
<!-- This is a book list -->
<books>
  <book id="B001">
    <title>XML Basics Tutorial</title>
    <author>张三</author>
    <year>2025</year>
    <price>99.00</price>
  </book>
  <book id="B002">
    <title>XML Advanced Applications</title>
    <author>李四</author>
    <year>2025</year>
    <price>129.00</price>
  </book>
</books>
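To make the node types concrete, the following is a minimal Python sketch that walks a trimmed copy of the sample document with the standard library's `xml.dom.minidom` and reports the type of each node. The names `BOOKS_XML`, `NODE_TYPE_NAMES`, and `walk` are made up for this illustration, and DOM node types are used here only as a stand-in for the XPath node types (DOM has no namespace nodes, for instance).

```python
from xml.dom import Node, minidom

# Trimmed copy of the sample document (one book only), for illustration.
BOOKS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<!-- This is a book list -->
<books>
  <book id="B001">
    <title>XML Basics Tutorial</title>
  </book>
</books>"""

# Map DOM node-type constants to the node-type names used in the table.
NODE_TYPE_NAMES = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}


def walk(node, depth=0, out=None):
    """Collect (depth, type name, node name) for a node and its descendants."""
    if out is None:
        out = []
    # Skip whitespace-only text nodes that come from indentation.
    if node.nodeType == Node.TEXT_NODE and not node.data.strip():
        return out
    out.append((depth, NODE_TYPE_NAMES.get(node.nodeType, "other"), node.nodeName))
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out


doc = minidom.parseString(BOOKS_XML.encode("utf-8"))
nodes = walk(doc)
for depth, kind, name in nodes:
    print("  " * depth + f"{kind}: {name}")

# Attributes are not children of their element; they are reached separately.
book = doc.getElementsByTagName("book")[0]
id_attr = book.getAttributeNode("id")
print("attribute:", id_attr.name, "=", id_attr.value)
```

Note that the XML declaration does not show up as a node, while the comment before the root element does.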
XPath Expressions
An XPath expression selects a node or a node set from an XML document. An expression can be a simple path or a complex conditional expression.
Basic Syntax
The basic syntax of XPath expressions resembles file system paths:
| Expression | Description |
|---|---|
| / | Selects from the root node |
| // | Selects all matching nodes, wherever they are in the document |
| . | Selects the current node |
| .. | Selects the parent of the current node |
| @ | Selects attributes |
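Path Expression Examples
<!-- Using the books.xml example above -->
<!-- Select the book children of the root books element -->
/books/book
<!-- Select all book elements in the document -->
//book
<!-- Select all title elements in the document -->
//title
<!-- Select the id attributes of all book elements -->
//book/@id
<!-- Select the book element whose id attribute is "B001" -->
//book[@id="B001"]
<!-- Select the title child of that book element -->
//book[@id="B001"]/title
<!-- Select the parent of the book elements -->
//book/..
<!-- Select the book children of the current node -->
./book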
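As a rough illustration, some of these path expressions can be tried from Python with the standard library's `xml.etree.ElementTree`. This is a sketch under two assumptions: ElementTree implements only a limited subset of XPath, and its paths are relative to the element they are called on, so `//book` is written as `.//book`. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<books>
  <book id="B001">
    <title>XML Basics Tutorial</title>
    <author>张三</author>
    <year>2025</year>
    <price>99.00</price>
  </book>
  <book id="B002">
    <title>XML Advanced Applications</title>
    <author>李四</author>
    <year>2025</year>
    <price>129.00</price>
  </book>
</books>"""

root = ET.fromstring(BOOKS_XML)  # root is the <books> element

# /books/book: book elements that are children of the root element
books = root.findall("./book")

# //title: title elements at any depth
titles = [t.text for t in root.findall(".//title")]

# //book/@id: ElementTree cannot return attribute nodes,
# so the attribute values are read from the matched elements instead
ids = [b.get("id") for b in root.findall(".//book")]

# //book[@id="B001"]/title
first_title = root.find(".//book[@id='B001']/title").text

# //book/..: the parent of the book elements (here, the root itself)
parents = root.findall(".//book/..")

print(len(books), titles, ids, first_title, [p.tag for p in parents])
```

A full XPath 1.0 engine would accept the expressions exactly as written in the examples; the rewriting here is only needed because of ElementTree's subset.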
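Predicates
Predicates filter nodes and are written inside square brackets []. A predicate can contain any kind of conditional expression and selects only the nodes that satisfy it.
<!-- Select the first book element -->
//book[1]
<!-- Select the last book element -->
//book[last()]
<!-- Select the first two book elements -->
//book[position() < 3]
<!-- Select the book element whose id attribute is "B001" -->
//book[@id="B001"]
<!-- Select the book elements that have an author child -->
//book[author]
<!-- Select the book elements whose author is "张三" -->
//book[author="张三"]
<!-- Select the book elements whose price is greater than 100 -->
//book[price > 100]
<!-- Select the book elements whose id attribute starts with "B" -->
//book[starts-with(@id, "B")]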
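The sketch below tries the same predicates from Python with `xml.etree.ElementTree`. It assumes ElementTree's XPath subset, which handles positional, attribute, and child-element predicates but not comparisons such as `price > 100` or functions such as `starts-with()`; those two are emulated with ordinary Python filtering. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<books>
  <book id="B001">
    <title>XML Basics Tutorial</title>
    <author>张三</author>
    <price>99.00</price>
  </book>
  <book id="B002">
    <title>XML Advanced Applications</title>
    <author>李四</author>
    <price>129.00</price>
  </book>
</books>"""

root = ET.fromstring(BOOKS_XML)

first = root.find("./book[1]")                     # //book[1]
last = root.find("./book[last()]")                 # //book[last()]
by_id = root.findall("./book[@id='B001']")         # //book[@id="B001"]
with_author = root.findall("./book[author]")       # //book[author]
by_author = root.findall("./book[author='张三']")  # //book[author="张三"]

# //book[price > 100]: not supported by ElementTree, emulated in Python
expensive = [b for b in root.findall("./book")
             if float(b.findtext("price")) > 100]

# //book[starts-with(@id, "B")]: also emulated in Python
b_prefixed = [b for b in root.findall("./book")
              if b.get("id", "").startswith("B")]

print(first.get("id"), last.get("id"), [b.get("id") for b in expensive])
```

One point worth keeping in mind: `//book[1]` selects every book that is the first book child of its parent, whereas `(//book)[1]` selects the first book in the whole document. In this sample the two coincide.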
XPath Axes
An XPath axis defines the relationship between the selected nodes and the current node and is used to navigate the XML document. Axes can be combined with node tests and predicates to build complex XPath expressions.
Common Axes
| Axis | Description | Example |
|---|---|---|
| ancestor | Selects all ancestors of the current node (parent, grandparent, and so on) | //title/ancestor::book |
| ancestor-or-self | Selects the current node and all of its ancestors | //title/ancestor-or-self::*[name()="title"] |
| child | Selects all children of the current node | //book/child::title |
| descendant | Selects all descendants of the current node (children, grandchildren, and so on) | //books/descendant::title |
| descendant-or-self | Selects the current node and all of its descendants | //books/descendant-or-self::* |
| following | Selects all nodes after the current node in document order, excluding its descendants | //book[1]/following::book |
| following-sibling | Selects all siblings after the current node | //book[1]/following-sibling::book |
| parent | Selects the parent of the current node | //title/parent::book |
| preceding | Selects all nodes before the current node in document order, excluding its ancestors | //book[last()]/preceding::book |
| preceding-sibling | Selects all siblings before the current node | //book[2]/preceding-sibling::book |
| self | Selects the current node | //book[1]/self::book |
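Axis Examples
<!-- Select all ancestors of the book elements -->
//book/ancestor::*
<!-- Select the parent of each title element -->
//title/parent::*
<!-- Select all siblings after the first book element -->
//book[1]/following-sibling::*
<!-- Select all siblings before the last book element -->
//book[last()]/preceding-sibling::*
<!-- Select all descendant elements of the books element -->
/books/descendant::*
<!-- Select all child elements of the book element whose id is "B001" -->
//book[@id="B001"]/child::*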
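To show what a few of these axes compute, here is a small Python sketch that emulates them over an `xml.etree.ElementTree` tree. ElementTree does not support axis syntax, so this is an emulation rather than an XPath evaluation; the helpers `ancestors`, `following_siblings`, and `preceding_siblings` are hypothetical names written for this example, and they consider element nodes only.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<books>
  <book id="B001">
    <title>XML Basics Tutorial</title>
    <author>张三</author>
    <year>2025</year>
    <price>99.00</price>
  </book>
  <book id="B002">
    <title>XML Advanced Applications</title>
    <author>李四</author>
    <year>2025</year>
    <price>129.00</price>
  </book>
</books>"""

root = ET.fromstring(BOOKS_XML)

# ElementTree elements keep no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(elem):
    """Emulate ancestor::* (returned nearest ancestor first)."""
    found = []
    while elem in parent_of:
        elem = parent_of[elem]
        found.append(elem)
    return found


def following_siblings(elem):
    """Emulate following-sibling::*."""
    parent = parent_of.get(elem)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(elem) + 1:]


def preceding_siblings(elem):
    """Emulate preceding-sibling::* (returned in document order)."""
    parent = parent_of.get(elem)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(elem)]


first_book = root.find("./book[1]")
last_book = root.find("./book[last()]")
title = root.find(".//title")

title_ancestors = [e.tag for e in ancestors(title)]              # //title/ancestor::*
after_first = [e.get("id") for e in following_siblings(first_book)]
before_last = [e.get("id") for e in preceding_siblings(last_book)]
descendants = list(root.iter())[1:]                               # /books/descendant::*
children = [e.tag for e in first_book]                            # child::*

print(title_ancestors, after_first, before_last, len(descendants), children)
```

The parent map is the main design choice here: it trades one pass over the tree for constant-time parent lookups afterwards.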
XPath Functions
XPath provides a rich function library for working with strings, numbers, boolean values, and nodes. These functions can be used inside XPath expressions to make queries more powerful.
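Common Functions
1. Node functions
- last(): returns the position of the last node in the context node set
- position(): returns the position of the context node within the context node set
- count(node-set): returns the number of nodes in the node set
- id(object): returns the elements with the specified ID
- local-name(node-set?): returns the local name of a node (without the namespace prefix)
- name(node-set?): returns the full name of a node (including the namespace prefix)
- namespace-uri(node-set?): returns the namespace URI of a node
2. String functions
- string(object?): converts an object to a string
- concat(string, string, ...): concatenates two or more strings
- substring(string, start, length?): extracts a substring (positions start at 1)
- substring-before(string, string): returns the part of the first string that precedes the second string
- substring-after(string, string): returns the part of the first string that follows the second string
- normalize-space(string?): strips leading and trailing whitespace and collapses runs of whitespace into a single space
- translate(string, string, string): replaces each character of the first string that occurs in the second string with the character at the same position in the third string
- contains(string, string): checks whether the first string contains the second string
- starts-with(string, string): checks whether the first string starts with the second string
- ends-with(string, string): checks whether the first string ends with the second string (XPath 2.0+)
- string-length(string?): returns the length of a string
3. Numeric functions
- number(object?): converts an object to a number
- sum(node-set): returns the sum of the numeric values of all nodes in the node set
- floor(number): returns the largest integer less than or equal to the number
- ceiling(number): returns the smallest integer greater than or equal to the number
- round(number): rounds the number to the nearest integer
4. Boolean functions
- boolean(object): converts an object to a boolean value
- not(boolean): returns the negation of a boolean value
- true(): returns the boolean value true
- false(): returns the boolean value false
- lang(string): checks whether the language of the current node matches the specified language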
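Function Examples
<!-- Select the last book element -->
//book[last()]
<!-- Select every book element except the first -->
//book[position() > 1]
<!-- Count the book elements -->
count(//book)
<!-- Select the book elements whose title contains "XML" -->
//book[contains(title, "XML")]
<!-- Select the book elements whose title starts with "XML" -->
//book[starts-with(title, "XML")]
<!-- Select the book elements whose author is longer than 2 characters -->
//book[string-length(author) > 2]
<!-- Join the title and author of the first book -->
concat(//book[1]/title, " by ", //book[1]/author)
<!-- Sum the prices of all books -->
sum(//book/price)
<!-- Select the book elements priced above the average -->
//book[price > sum(//book/price) div count(//book)]
<!-- Select the book elements whose id is "B001" or "B002" -->
//book[@id="B001" or @id="B002"]
<!-- Select the book elements whose id is not "B001" -->
//book[not(@id="B001")]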
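The following Python sketch mirrors several of these function calls on the two-book sample. It assumes the standard library's `xml.etree.ElementTree`, which has no XPath function library, so `count()`, `sum()`, `contains()`, `concat()`, and `not()` are emulated with plain Python; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

BOOKS_XML = """<books>
  <book id="B001">
    <title>XML Basics Tutorial</title>
    <author>张三</author>
    <price>99.00</price>
  </book>
  <book id="B002">
    <title>XML Advanced Applications</title>
    <author>李四</author>
    <price>129.00</price>
  </book>
</books>"""

root = ET.fromstring(BOOKS_XML)
books = root.findall(".//book")

# count(//book)
book_count = len(books)

# sum(//book/price)
total_price = sum(float(b.findtext("price")) for b in books)

# //book[contains(title, "XML")]
with_xml = [b.get("id") for b in books if "XML" in b.findtext("title")]

# concat(//book[1]/title, " by ", //book[1]/author)
label = books[0].findtext("title") + " by " + books[0].findtext("author")

# //book[price > sum(//book/price) div count(//book)]
average = total_price / book_count
above_average = [b.get("id") for b in books
                 if float(b.findtext("price")) > average]

# //book[not(@id="B001")]
not_b001 = [b.get("id") for b in books if b.get("id") != "B001"]

print(book_count, total_price, with_xml, label, above_average, not_b001)
```

With prices of 99.00 and 129.00 the average is 114.00, so only B002 is above it.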
Hands-On Example: Querying an XML Document with XPath
Description
Create an XML document, then use XPath expressions to query and extract its data.
Steps
- Create an XML file named library.xml
- Add book information to the XML file
- Write XPath expressions to query the book information
- Test the XPath expressions with the xmllint tool
Final Code
<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book id="B001" category="计算机">
    <title>XML Basics Tutorial</title>
    <author>张三</author>
    <year>2025</year>
    <price>99.00</price>
  </book>
  <book id="B002" category="计算机">
    <title>XML Advanced Applications</title>
    <author>李四</author>
    <year>2025</year>
    <price>129.00</price>
  </book>
  <book id="B003" category="文学">
    <title>小说集</title>
    <author>王五</author>
    <year>2024</year>
    <price>89.00</price>
  </book>
  <book id="B004" category="文学">
    <title>散文集</title>
    <author>赵六</author>
    <year>2023</year>
    <price>79.00</price>
  </book>
</library>
XPath Query Examples
# Test XPath expressions with the xmllint tool
# Select all book elements
xmllint --xpath "//book" library.xml
# Select the title child elements of all book elements
xmllint --xpath "//book/title" library.xml
# Select the id attributes of all book elements
xmllint --xpath "//book/@id" library.xml
# Select the book element whose id attribute is "B001"
xmllint --xpath "//book[@id='B001']" library.xml
# Select the text content of the title child of the book element whose id attribute is "B001"
xmllint --xpath "//book[@id='B001']/title/text()" library.xml
# Select the book elements whose category attribute is "计算机"
xmllint --xpath "//book[@category='计算机']" library.xml
# Select the book elements whose price child element is greater than 100
xmllint --xpath "//book[price > 100]" library.xml
# Select the book elements whose author child element is "张三"
xmllint --xpath "//book[author='张三']" library.xml
# Count the book elements
xmllint --xpath "count(//book)" library.xml
# Sum the prices of all book elements
xmllint --xpath "sum(//book/price)" library.xml
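If xmllint is not available, a few of the same queries can be approximated in Python. The sketch below is one possible approach, assuming the standard library's `xml.etree.ElementTree` and its limited XPath subset: it writes a condensed copy of the sample document (year elements omitted) to a temporary `library.xml` and reads values back. The helper `load_library` is a hypothetical name for this example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Condensed copy of library.xml (year elements omitted for brevity).
LIBRARY_XML = """<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book id="B001" category="计算机"><title>XML Basics Tutorial</title><author>张三</author><price>99.00</price></book>
  <book id="B002" category="计算机"><title>XML Advanced Applications</title><author>李四</author><price>129.00</price></book>
  <book id="B003" category="文学"><title>小说集</title><author>王五</author><price>89.00</price></book>
  <book id="B004" category="文学"><title>散文集</title><author>赵六</author><price>79.00</price></book>
</library>
"""


def load_library(directory):
    """Write the sample document to library.xml in directory and parse it."""
    path = os.path.join(directory, "library.xml")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(LIBRARY_XML)
    return ET.parse(path).getroot()


with tempfile.TemporaryDirectory() as tmp:
    root = load_library(tmp)

# //book/@id
all_ids = [b.get("id") for b in root.findall("./book")]

# //book[@id='B001']/title/text()
b001_title = root.findtext("./book[@id='B001']/title")

# //book[@category='计算机']
computer_ids = [b.get("id") for b in root.findall("./book[@category='计算机']")]

# //book[author='张三']
by_author = [b.get("id") for b in root.findall("./book[author='张三']")]

print(all_ids, b001_title, computer_ids, by_author)
```

Queries that rely on comparisons or functions, such as `//book[price > 100]` or `count(//book)`, fall outside ElementTree's subset and would need either Python-side filtering or a full XPath engine.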
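Interactive Exercises
Exercise 1: Write XPath Expressions
- Select all book elements whose category attribute is "文学" (literature)
- Select the book elements whose year child element is 2025
- Select the book elements whose price child element is less than 90
- Select the second book element
- Select the author child elements of all book elements
The XPath expressions are as follows:
- //book[@category='文学']
- //book[year='2025']
- //book[price < 90]
- //book[2] or (//book)[2]
- //book/author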
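Exercise 2: Analyze XPath Results
- //book[last()]/title
- //book[position()=3]/author
- //book[contains(title, "集")]
- count(//book[@category='计算机'])
- sum(//book[price < 100]/price)
The results of the XPath expressions are as follows:
- Returns the title child element of the last book element: <title>散文集</title>
- Returns the author child element of the third book element: <author>王五</author>
- Returns the book elements whose title contains "集": the third and fourth book elements
- Returns the number of book elements whose category attribute is "计算机" (computer): 2
- Returns the sum of the prices of the book elements whose price is less than 100: 99.00 + 89.00 + 79.00 = 267.00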
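The Exercise 2 results can be cross-checked with a short Python sketch. It assumes a condensed copy of library.xml and the standard library's `xml.etree.ElementTree`; `contains()`, `count()`, and `sum()` are emulated in Python because ElementTree does not implement them. Note that B001, priced at 99.00, also satisfies `price < 100`, so three prices enter the sum.

```python
import xml.etree.ElementTree as ET

# Condensed copy of library.xml (year elements omitted for brevity).
LIBRARY_XML = """<library>
  <book id="B001" category="计算机"><title>XML Basics Tutorial</title><author>张三</author><price>99.00</price></book>
  <book id="B002" category="计算机"><title>XML Advanced Applications</title><author>李四</author><price>129.00</price></book>
  <book id="B003" category="文学"><title>小说集</title><author>王五</author><price>89.00</price></book>
  <book id="B004" category="文学"><title>散文集</title><author>赵六</author><price>79.00</price></book>
</library>"""

root = ET.fromstring(LIBRARY_XML)
books = root.findall("./book")

# //book[last()]/title
last_title = root.findtext("./book[last()]/title")

# //book[position()=3]/author
third_author = root.findtext("./book[3]/author")

# //book[contains(title, "集")]
titles_with_ji = [b.get("id") for b in books if "集" in b.findtext("title")]

# count(//book[@category='计算机'])
computer_count = len(root.findall("./book[@category='计算机']"))

# sum(//book[price < 100]/price)
cheap_prices = [float(b.findtext("price")) for b in books
                if float(b.findtext("price")) < 100]
cheap_total = sum(cheap_prices)

print(last_title, third_author, titles_with_ji, computer_count, cheap_total)
```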