Python ElementTree - 杰瑞科技汇

This is a guide to using xml.etree.ElementTree, the module in Python's standard library for working with XML data.


What is `ElementTree`?

ElementTree is a built-in Python library that provides a simple and efficient API for parsing and creating XML data. It treats an XML document as a tree of elements, where each element can have:

A tag (e.g., <book>)
Attributes (e.g., category="fiction")
Text content (e.g., The Hobbit)
Child elements

There are two main implementations you might encounter:

xml.etree.ElementTree: The standard, built-in version that comes with Python. It's fast and good for most use cases.
lxml.etree: A third-party library with a largely compatible API. It is often faster and adds features such as full XPath 1.0 support, schema validation, and more detailed error reporting, but you need to install it separately (pip install lxml).

This guide will focus on the built-in xml.etree.ElementTree.
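To make the tag / attributes / text / children model concrete, here is a minimal sketch. The one-book document is made up for illustration; it simply shows where each of the four pieces ends up on an Element object.

```python
import xml.etree.ElementTree as ET

# A one-book document, made up for illustration
book = ET.fromstring('<book category="fiction"><title>The Hobbit</title></book>')

print(book.tag)                # tag name: book
print(book.attrib)             # attributes as a dict: {'category': 'fiction'}
title = book[0]                # child elements can be indexed like a list
print(title.tag, title.text)   # title The Hobbit
print(len(book))               # number of direct children: 1
```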

Parsing XML from a String

This is the most common starting point. You have an XML string and you want to convert it into a Python object tree.


import xml.etree.ElementTree as ET
# A sample XML string
xml_string = """
<library>
    <book category="fiction">
        <title lang="en">The Hobbit</title>
        <author>J.R.R. Tolkien</author>
        <year>1937</year>
    </book>
    <book category="science">
        <title lang="en">A Brief History of Time</title>
        <author>Stephen Hawking</author>
        <year>1988</year>
    </book>
</library>
"""
# Parse the string
# ET.fromstring() parses a string and returns the root element of the tree
root = ET.fromstring(xml_string)
# The root element is an Element object
print(f"Root tag: {root.tag}")  # Output: Root tag: library
# Accessing attributes
print(f"Root attributes: {root.attrib}") # Output: Root attributes: {}
# Find a direct child element
first_book = root.find('book')
print(f"First book category: {first_book.attrib['category']}") # Output: First book category: fiction

Parsing XML from a File

If your XML data is in a file (e.g., library.xml), you should use ET.parse().

File: library.xml

<library>
    <book category="fiction">
        <title lang="en">The Hobbit</title>
        <author>J.R.R. Tolkien</author>
        <year>1937</year>
    </book>
    <book category="science">
        <title lang="en">A Brief History of Time</title>
        <author>Stephen Hawking</author>
        <year>1988</year>
    </book>
</library>

Python Code:

import xml.etree.ElementTree as ET
# Parse the file
# ET.parse() returns an ElementTree object, not the root element directly
tree = ET.parse('library.xml')
root = tree.getroot() # Get the root element from the tree
print(f"Root tag from file: {root.tag}") # Output: Root tag from file: library
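ET.parse() accepts an already open file object as well as a filename. The sketch below uses an in-memory io.BytesIO as a stand-in for a real file, so it runs without library.xml on disk; the tiny document inside it is an assumption for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for an open file; in real code this could be open('library.xml', 'rb')
fake_file = io.BytesIO(b"<library><book category='fiction'/></library>")

tree = ET.parse(fake_file)       # returns an ElementTree, as with a filename
root = tree.getroot()
print(root.tag)                  # library
print(root[0].get('category'))   # fiction
```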

Navigating the Element Tree

Once you have the root, you can navigate the tree.

Finding Elements

find(path): Returns the first subelement that matches the path, or None if nothing matches.
findall(path): Returns a list of all matching subelements, in document order.
iter(tag): Returns an iterator over the element itself and all of its descendants that have the given tag.

import xml.etree.ElementTree as ET
xml_string = """
<library>
    <book category="fiction">
        <title lang="en">The Hobbit</title>
        <author>J.R.R. Tolkien</author>
    </book>
    <book category="science">
        <title lang="en">A Brief History of Time</title>
        <author>Stephen Hawking</author>
    </book>
</library>
"""
root = ET.fromstring(xml_string)
# --- find() ---
# Find the first 'title' element
first_title = root.find('book/title')
print(f"First title: {first_title.text}") # Output: First title: The Hobbit
# --- findall() ---
# Find all 'author' elements
all_authors = root.findall('.//author') # .// means search anywhere in the tree
print("\nAll authors:")
for author in all_authors:
    print(f"- {author.text}") # Output: - J.R.R. Tolkien, - Stephen Hawking
# Find all 'book' elements
all_books = root.findall('book')
print(f"\nFound {len(all_books)} books.")
# --- iter() ---
# Loop over all 'title' elements in the entire document
print("\nAll titles using iter():")
for title in root.iter('title'):
    print(f"- Language: {title.attrib['lang']}, Text: {title.text}")

XPath-like Expressions: The path argument in find() and findall() supports a simplified subset of XPath:

tag: Direct child tag.
parent/child: A specific child of a parent.
.//tag: Any descendant tag at any depth (very useful!).
[@attrib]: Elements with a specific attribute.
[@attrib='value']: Elements with an attribute matching a specific value.

# Find the author of the book in the 'science' category
science_author = root.find('.//book[@category="science"]/author')
print(f"\nScience author: {science_author.text}") # Output: Science author: Stephen Hawking
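The path forms listed above can be tried side by side. This is a small sketch, not an exhaustive tour of the supported syntax; the third book without a category attribute is a hypothetical addition so that [@attrib] has something to exclude.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<library>
    <book category="fiction"><title lang="en">The Hobbit</title></book>
    <book category="science"><title lang="en">A Brief History of Time</title></book>
    <book><title>Untitled Draft</title></book>
</library>
""")

# tag: direct children only
print(len(root.findall('book')))              # 3
# parent/child: titles that sit directly under a book
print(len(root.findall('book/title')))        # 3
# .//tag: descendants at any depth
print(len(root.findall('.//title')))          # 3
# [@attrib]: books that have a category attribute at all
print(len(root.findall('book[@category]')))   # 2
# [@attrib='value']: books whose category is exactly 'science'
match = root.find("book[@category='science']/title")
print(match.text)                             # A Brief History of Time
# A plain tag does not reach nested elements
print(root.find('title'))                     # None
```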

Accessing Element Data

element.tag: The tag name (e.g., 'title').
element.text: The text content inside the tag (e.g., 'The Hobbit').
element.attrib: A dictionary of the element's attributes (e.g., {'lang': 'en'}).

book = root.find('book[@category="fiction"]')
print(f"Tag: {book.tag}")
print(f"Attributes: {book.attrib}")
print(f"Text content of the first child (title): {book.find('title').text}")
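In real documents an attribute or child may be missing, and then attrib['key'] raises KeyError while find(...).text fails on None. One way to guard against that, sketched here with a deliberately incomplete second book (made up for illustration), is to use get() and findtext() with defaults:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<library>
    <book category="fiction">
        <title lang="en">The Hobbit</title>
        <author>J.R.R. Tolkien</author>
    </book>
    <book>
        <title>A Brief History of Time</title>
    </book>
</library>
""")

records = []
for book in root.findall('book'):
    records.append({
        # get() returns the default instead of raising KeyError
        'category': book.get('category', 'unknown'),
        # findtext() returns the default when the child is missing
        'title': book.findtext('title', default=''),
        'author': book.findtext('author', default='(no author)'),
    })

for record in records:
    print(record)
```

The default strings ('unknown', '(no author)') are placeholders; choose whatever makes sense for your data.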

Modifying XML

You can easily modify the element tree in memory.

import xml.etree.ElementTree as ET
# Let's use the tree from the file example
tree = ET.parse('library.xml')
root = tree.getroot()
# --- Add a new book ---
new_book = ET.Element('book')
new_book.set('category', 'fantasy') # Add an attribute
new_book_title = ET.SubElement(new_book, 'title')
new_book_title.text = "Dune"
new_book_author = ET.SubElement(new_book, 'author')
new_book_author.text = "Frank Herbert"
new_book_year = ET.SubElement(new_book, 'year')
new_book_year.text = "1965"
# Add the new book to the root
root.append(new_book)
# --- Modify an existing element ---
# Change the year of the first book
first_book_year = root.find('book/year')
first_book_year.text = "1937" # It was already 1937, but you get the idea
# --- Remove an attribute ---
# Remove the 'category' attribute from the first book
first_book = root.find('book')
first_book.attrib.pop('category', None) # The second argument prevents an error if key doesn't exist
# --- Save the changes to a new file ---
tree.write('library_modified.xml', encoding='utf-8', xml_declaration=True)
print("Modified XML saved to library_modified.xml")

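Output (library_modified.xml), indented here for readability. ElementTree keeps the original file's whitespace and writes the appended book on a single line unless the tree is indented first: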

<?xml version='1.0' encoding='utf-8'?>
<library>
    <book>
        <title lang="en">The Hobbit</title>
        <author>J.R.R. Tolkien</author>
        <year>1937</year>
    </book>
    <book category="science">
        <title lang="en">A Brief History of Time</title>
        <author>Stephen Hawking</author>
        <year>1988</year>
    </book>
    <book category="fantasy">
        <title>Dune</title>
        <author>Frank Herbert</author>
        <year>1965</year>
    </book>
</library>
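Removing a whole element, rather than an attribute, goes through the parent's remove() method. The following is a minimal sketch on an inline copy of the library data, so it does not depend on library.xml:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<library>
    <book category="fiction"><title>The Hobbit</title></book>
    <book category="science"><title>A Brief History of Time</title></book>
</library>
""")

# findall() returns a list, so removing while looping over it is safe
for book in root.findall("book[@category='science']"):
    root.remove(book)   # remove() must be called on the direct parent

remaining = [b.findtext('title') for b in root.findall('book')]
print(remaining)   # ['The Hobbit']
```

Elements do not keep a reference to their parent, so to remove a deeply nested element you first need to find its parent and call remove() there.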

Creating XML from Scratch

You can build an XML tree programmatically using Element and SubElement.

import xml.etree.ElementTree as ET
# Create the root element
root = ET.Element('catalog')
# Create sub-elements and add them to the root
product1 = ET.SubElement(root, 'product')
product1.set('id', 'p1')
name1 = ET.SubElement(product1, 'name')
name1.text = 'Super Widget'
price1 = ET.SubElement(product1, 'price')
price1.text = '19.99'
product2 = ET.SubElement(root, 'product')
product2.set('id', 'p2')
name2 = ET.SubElement(product2, 'name')
name2.text = 'Mega Gadget'
price2 = ET.SubElement(product2, 'price')
price2.text = '49.50'
# Create an ElementTree object
tree = ET.ElementTree(root)
# Write to a file
tree.write('products.xml', encoding='utf-8', xml_declaration=True)
print("New XML file 'products.xml' created.")

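Output (products.xml), indented here for readability. Without an explicit indentation step, everything after the XML declaration is written on a single line: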

<?xml version='1.0' encoding='utf-8'?>
<catalog>
    <product id="p1">
        <name>Super Widget</name>
        <price>19.99</price>
    </product>
    <product id="p2">
        <name>Mega Gadget</name>
        <price>49.50</price>
    </product>
</catalog>
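When the data already lives in Python structures, the same catalog can be built in a loop instead of element by element. This is a sketch under the assumption that the products arrive as a list of dicts (the products list below is hypothetical); it also uses ET.tostring() to get the XML as a string rather than writing a file.

```python
import xml.etree.ElementTree as ET

# Hypothetical input data, mirroring the catalog above
products = [
    {'id': 'p1', 'name': 'Super Widget', 'price': '19.99'},
    {'id': 'p2', 'name': 'Mega Gadget', 'price': '49.50'},
]

root = ET.Element('catalog')
for item in products:
    # Keyword arguments to SubElement become attributes
    product = ET.SubElement(root, 'product', id=item['id'])
    ET.SubElement(product, 'name').text = item['name']
    ET.SubElement(product, 'price').text = item['price']

# tostring() returns bytes by default; encoding='unicode' gives a str
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)
```

Note that text and attribute values must be strings, so numeric prices need str() before being assigned.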

Best Practices and Important Notes

Namespaces: XML namespaces can be tricky. If your XML uses them (e.g., <ns0:book>), you need to handle them by defining a mapping.

# For an XML like: <ns0:library xmlns:ns0="http://example.com/library">
ns = {'ns0': 'http://example.com/library'}
# You must use the prefix in your find calls
root.find('ns0:book', namespaces=ns)
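For a version of the namespace snippet that can be run on its own, here is a small sketch. The namespace URI is the one from the comment above; the document body is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Same namespace URI as above; the book content is an assumption
xml_string = """
<ns0:library xmlns:ns0="http://example.com/library">
    <ns0:book><ns0:title>The Hobbit</ns0:title></ns0:book>
</ns0:library>
"""
root = ET.fromstring(xml_string)

# Internally, tags are stored as {uri}localname
print(root.tag)   # {http://example.com/library}library

ns = {'ns0': 'http://example.com/library'}
title = root.find('ns0:book/ns0:title', namespaces=ns)
print(title.text)   # The Hobbit

# Without the prefix, nothing matches
print(root.find('book'))   # None
```

The prefix used in the mapping does not have to match the one in the document; only the URI matters.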

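Pretty Printing: By default, ElementTree does not pretty-print (add indentation). It writes elements exactly as they are stored in the tree, so a tree built from scratch comes out as a single line. On Python 3.9 and later, call ET.indent(tree) before writing; on older versions you need a helper function or the lxml library.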
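A minimal sketch of indenting a tree with the standard library, assuming Python 3.9 or later (where ET.indent() was added). The small catalog is reused from the earlier example.

```python
import xml.etree.ElementTree as ET

root = ET.Element('catalog')
product = ET.SubElement(root, 'product', id='p1')
ET.SubElement(product, 'name').text = 'Super Widget'

print(ET.tostring(root, encoding='unicode'))   # everything on one line

ET.indent(root, space='    ')   # modifies the tree in place (Python 3.9+)
pretty = ET.tostring(root, encoding='unicode')
print(pretty)
```

ET.indent() works by inserting whitespace into the tree's text and tail values, so call it just before serializing rather than before further processing.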
Error Handling: Always wrap your XML parsing in a try...except block, as malformed XML will raise a ParseError.
```
try:
    tree = ET.parse('bad_file.xml')
except ET.ParseError as e:
    print(f"Error parsing XML: {e}")
```
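The same exception is raised by ET.fromstring(), and it carries the location of the problem in its position attribute. A short sketch, using a deliberately broken string made up for illustration:

```python
import xml.etree.ElementTree as ET

bad_xml = "<library><book></library>"   # <book> is never closed

try:
    ET.fromstring(bad_xml)
except ET.ParseError as e:
    line, column = e.position   # where the parser gave up
    print(f"Malformed XML at line {line}, column {column}: {e}")
```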
Performance: ET.parse() and ET.fromstring() build the entire tree in memory, which becomes a problem for very large XML files (for example, many gigabytes). In such cases, consider ET.iterparse(), which yields elements as they are parsed so that you can process and discard them incrementally.
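A minimal sketch of incremental parsing with iterparse. An in-memory io.BytesIO stands in for a large file here (an assumption to keep the example self-contained); passing a filename works the same way.

```python
import io
import xml.etree.ElementTree as ET

# Stand-in for a large file on disk
source = io.BytesIO(b"""
<library>
    <book><title>The Hobbit</title></book>
    <book><title>A Brief History of Time</title></book>
</library>
""")

titles = []
# 'end' events fire once an element and all its children have been read
for event, elem in ET.iterparse(source, events=('end',)):
    if elem.tag == 'book':
        titles.append(elem.findtext('title'))
        elem.clear()   # drop the element's children and text to free memory

print(titles)   # ['The Hobbit', 'A Brief History of Time']
```

The cleared book elements stay attached to the root as empty shells, so for extremely long documents you may also want to detach them from the root periodically.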
