How Do You Parse XML Schema in Python? - 杰瑞科技汇

Working with XML Schema (XSD) in Python is a common task for data validation, especially when dealing with configuration files, data interchange formats, or checking the structure of XML files generated by your application.

This guide covers the main approaches, from what the standard library can and cannot do to the most capable third-party options.

The Recommended Library: `xmlschema`

Python's standard library (xml.etree.ElementTree) can parse XML but has no built-in support for validating against an XSD schema. For this, you need a third-party library.
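To make that gap concrete, here is a minimal sketch that uses only xml.etree.ElementTree. The inline document and the helper name read_books are illustrative assumptions, not part of the example files used later. ElementTree only checks that the XML is well-formed: every value comes back as a string, and a price that is not a number is accepted without complaint.

```python
import xml.etree.ElementTree as ET

# Illustrative inline document: well-formed, but the second price is not a number.
XML_TEXT = """\
<library>
  <book>
    <title>The Hobbit</title>
    <author>J.R.R. Tolkien</author>
    <year>1937</year>
    <price>12.50</price>
  </book>
  <book>
    <title>Dune</title>
    <author>Frank Herbert</author>
    <year>1965</year>
    <price>twelve pounds</price>
  </book>
</library>
"""


def read_books(xml_text):
    """Parse the document and return one dict of raw strings per <book>."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in book} for book in root.findall("book")]


books = read_books(XML_TEXT)
for book in books:
    # No schema is consulted, so nothing flags the bad price, and
    # every value (including the year) is a plain str.
    print(book["title"], repr(book["year"]), repr(book["price"]))
```

Catching the bad price, or getting the year as an int, would require hand-written checks for every field, which is the work a schema-aware library takes over.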

The most popular and feature-rich library for this is xmlschema.

Installation

First, install the library:

pip install xmlschema

Key Features of `xmlschema`

Full XSD 1.0 & 1.1 Support: Supports almost all features of the XML Schema standard.
Data Conversion: It decodes XML data into Python types according to the schema (e.g., xs:integer becomes an int, and xs:decimal becomes a decimal.Decimal by default).
JSON Conversion: XML data can be converted to and from JSON with the help of the schema (xmlschema.to_json and xmlschema.from_json).
Easy API: Provides a simple and intuitive API for validation.

Practical Examples with `xmlschema`

Let's set up a simple example. We have an XML file and a corresponding XSD schema to validate it.

File Structure

project/
├── library.xsd
└── books.xml

`library.xsd` (The Schema)

This schema defines the rules for our XML file.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="author" type="xs:string"/>
              <xs:element name="year" type="xs:positiveInteger"/>
              <xs:element name="price" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

`books.xml` (The Data to Validate)

This file conforms to the schema.

<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book>
    <title>The Hobbit</title>
    <author>J.R.R. Tolkien</author>
    <year>1937</year>
    <price>12.50</price>
  </book>
  <book>
    <title>Dune</title>
    <author>Frank Herbert</author>
    <year>1965</year>
    <price>18.99</price>
  </book>
</library>

Example 1: Basic Validation

This is the simplest use case: check if an XML file is valid against an XSD schema.

import xmlschema
# Paths to the schema and the XML file
schema_path = 'library.xsd'
xml_path = 'books.xml'
try:
    # 1. Create a schema object by loading the XSD file.
    #    Building the schema is relatively expensive, so do it once and reuse it.
    schema = xmlschema.XMLSchema(schema_path)
    # 2. Validate the XML file.
    #    If the file is valid, this completes without an exception.
    #    If it's invalid, it raises xmlschema.XMLSchemaValidationError.
    schema.validate(xml_path)
    print(f"✅ Success: {xml_path} is a valid XML document according to {schema_path}")
except xmlschema.XMLSchemaValidationError as e:
    print(f"❌ Validation Error: {xml_path} is NOT valid.")
    print(f"   Reason: {e.reason}")
    print(f"   Path: {e.path}")
    # The error stores the offending object in its `obj` attribute.
    print(f"   Offending element: {e.obj!r}")
except OSError as e:
    # Missing or unreadable files surface as OSError subclasses
    # (FileNotFoundError or a library-specific error, depending on the version).
    print(f"❌ Could not read file: {e}")

Example 2: Parsing with Data Conversion

A powerful feature of xmlschema is its ability to parse the XML into Python data structures, respecting the types defined in the XSD.

import json
import xmlschema
schema = xmlschema.XMLSchema('library.xsd')
# `to_dict` validates the XML and decodes it into a Python dictionary,
# converting values according to the types declared in the XSD.
# xs:decimal values are decoded as decimal.Decimal by default; passing
# decimal_type=float keeps the result serializable with json.dumps.
try:
    data = schema.to_dict('books.xml', decimal_type=float)
    print("--- Parsed Data (as Python dict) ---")
    print(json.dumps(data, indent=2))
    print("\n--- Accessing Data ---")
    # The returned dict holds the content of the root element,
    # so there is no top-level 'library' key.
    first_book_title = data['book'][0]['title']
    print(f"The first book's title is: '{first_book_title}'")
    first_book_year = data['book'][0]['year']
    print(f"The first book's year is: {first_book_year} (type: {type(first_book_year)})")
except xmlschema.XMLSchemaValidationError as e:
    print(f"❌ Parsing failed: {e.reason}")

Output of Example 2:

--- Parsed Data (as Python dict) ---
{
  "book": [
    {
      "title": "The Hobbit",
      "author": "J.R.R. Tolkien",
      "year": 1937,
      "price": 12.5
    },
    {
      "title": "Dune",
      "author": "Frank Herbert",
      "year": 1965,
      "price": 18.99
    }
  ]
}
--- Accessing Data ---
The first book's title is: 'The Hobbit'
The first book's year is: 1937 (type: <class 'int'>)
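One detail worth knowing when dumping decoded data to JSON: xmlschema decodes xs:decimal values to decimal.Decimal unless told otherwise, and the standard json module refuses to serialize Decimal. Converting to float is one way around that; another, which keeps the exact decimal text, is to give json.dumps a default hook. The sketch below uses a hand-written dict as a stand-in for decoded data, and the helper name encode_decimal is an illustrative assumption.

```python
import json
from decimal import Decimal

# Hand-written stand-in for decoded data that contains Decimal prices.
decoded = {
    "book": [
        {"title": "The Hobbit", "year": 1937, "price": Decimal("12.50")},
        {"title": "Dune", "year": 1965, "price": Decimal("18.99")},
    ]
}


def encode_decimal(value):
    """json.dumps fallback: emit Decimal as a string so no precision is lost."""
    if isinstance(value, Decimal):
        return str(value)
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")


try:
    json.dumps(decoded)
except TypeError as exc:
    # Without a fallback, the Decimal values cannot be serialized.
    print(f"Plain json.dumps fails: {exc}")

json_text = json.dumps(decoded, default=encode_decimal, indent=2)
print(json_text)
```

Emitting strings preserves values such as 12.50 exactly, at the cost of the consumer having to parse them back into numbers.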

Example 3: Handling Invalid XML

Let's create an invalid XML file to see how the error handling works.

`invalid_books.xml`

<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book>
    <title>The Hobbit</title>
    <author>J.R.R. Tolkien</author>
    <year>1937</year>
    <price>twelve pounds</price> <!-- Invalid: price must be a decimal -->
  </book>
</library>

Now, run the validation code from Example 1 against invalid_books.xml:

import xmlschema
schema = xmlschema.XMLSchema('library.xsd')
xml_path = 'invalid_books.xml'
try:
    schema.validate(xml_path)
    print(f"✅ Success: {xml_path} is valid.")
except xmlschema.XMLSchemaValidationError as e:
    print(f"❌ Validation Error: {xml_path} is NOT valid.")
    print(f"   Reason: {e.reason}") # The error message
    print(f"   Path: {e.path}")     # The XPath to the error location
    print(f"   Offending value: {e.obj!r}") # The object that caused the error

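Output of Example 3 (illustrative; the exact wording of the reason and the form of the path depend on the xmlschema version):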

❌ Validation Error: invalid_books.xml is NOT valid.
   Reason: invalid literal for Decimal: 'twelve pounds'
   Path: /library/book[1]/price
   Offending value: 'twelve pounds'

Alternative Libraries

While xmlschema is the recommended choice for most use cases, it's good to know about alternatives.

`xsdvalidate`

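This is presented as a much simpler option if you only need validation and don't require the data conversion features. It is far less established than xmlschema or lxml, so confirm on PyPI that the package and the API shown below match what you actually install before depending on it.

Pros: Lightweight, simple API.
Cons: Lacks advanced features like typed data conversion or JSON export.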

Installation:

pip install xsdvalidate

Example:

from xsdvalidate import validate
# The validate function returns True or False
is_valid = validate('library.xsd', 'books.xml')
if is_valid:
    print("✅ The XML is valid.")
else:
    print("❌ The XML is invalid.")

`lxml`

The lxml library is a high-performance Pythonic binding for the C libraries libxml2 and libxslt. It has excellent XSD 1.0 validation capabilities, but its API is larger and lower-level than that of xmlschema.

Pros: Extremely fast, powerful XPath and XSLT support.
Cons: Steeper learning curve; depends on C libraries (bundled in the prebuilt wheels for common platforms, otherwise needed at build time).

Example with lxml:

from lxml import etree
# Parse the schema
schema = etree.XMLSchema(etree.parse('library.xsd'))
# Parse the XML document
xml_doc = etree.parse('books.xml')
# Validate
is_valid = schema.validate(xml_doc)
if is_valid:
    print("✅ The XML is valid.")
else:
    print("❌ The XML is invalid.")
    for error in schema.error_log:
        print(f"   Line {error.line}: {error.message}")
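Besides validate(), lxml validators also offer assertValid(), which raises etree.DocumentInvalid instead of returning False. That fits code that prefers exceptions over status flags. The sketch below assumes a tiny inline schema rather than library.xsd, and the helper name check is illustrative.

```python
from lxml import etree

# Tiny, hypothetical schema kept inline so the sketch is self-contained.
XSD_BYTES = b"""\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="price" type="xs:decimal"/>
</xs:schema>
"""

schema = etree.XMLSchema(etree.fromstring(XSD_BYTES))


def check(xml_bytes):
    """Return None if the document is valid, otherwise the validation message."""
    doc = etree.fromstring(xml_bytes)
    try:
        schema.assertValid(doc)
    except etree.DocumentInvalid as exc:
        return str(exc)
    return None


print(check(b"<price>12.50</price>"))
print(check(b"<price>twelve pounds</price>"))
```

The full list of problems remains available through schema.error_log after a failed check, as in the example above.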

Summary and Recommendation

Feature	`xmlschema`	`xsdvalidate`	`lxml`
Primary Use	Validation & Data Binding	Simple Validation	High-Performance Parsing & Validation
Ease of Use	Excellent	Very Easy	Good
Data Conversion	Yes (Best)	No	No
Performance	Good	Good	Excellent
XSD Features	Full 1.0 & 1.1 support	Basic	Good (1.0)
Recommendation	Best for most projects	Quick scripts, validation-only	High-performance, complex XML needs

Final Recommendation:

For any new project in Python, start with xmlschema. It provides the best balance of power, ease of use, and features. Its ability to convert XML to Python objects with the correct types is a huge productivity booster and prevents a lot of manual type-checking and conversion code in your application. Use xsdvalidate for simple, one-off validation tasks, and lxml if you have extreme performance requirements or are already using it for other XML processing.
