Python XML SAX

xml.sax is a module for parsing XML.

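This parser uses the SAX (Simple API for XML) interface: it streams through the document and reports each element and run of text to a handler as it is encountered, rather than building a tree in memory.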
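
Because SAX is event-driven, the order in which handler methods are called mirrors the order of the document. The following is a minimal sketch of that event stream; the `EventLogger` class and the sample document are illustrative assumptions, not part of the module.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class EventLogger(ContentHandler):
    """Hypothetical handler that records each SAX event in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, data):
        # Skip whitespace-only chunks so the log stays readable.
        if data.strip():
            self.events.append(("text", data.strip()))


logger = EventLogger()
# parseString takes the document as bytes plus a ContentHandler instance.
xml.sax.parseString(b"<root><page>Hello</page></root>", logger)
for event in logger.events:
    print(event)
```

Each opening tag, run of text, and closing tag arrives as a separate callback, which is why a handler usually has to keep its own state between calls.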


Usage

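from xml.sax import handler, make_parser

class MyHandler(handler.ContentHandler):
    """Print the text content of every <page> element."""

    def __init__(self):
        super().__init__()
        self.in_page = False
        self.character_buffer = ""

    def startElement(self, name, attrs):
        if name == "page":
            # Start collecting text for this page.
            self.in_page = True
            self.character_buffer = ""

    def endElement(self, name):
        if name == "page":
            # The page is complete: emit its text and reset the state.
            self.in_page = False
            print(self.character_buffer)
            self.character_buffer = ""

    def characters(self, data):
        # The parser may deliver one run of text in several chunks,
        # so accumulate them, but only while inside a <page> element.
        if self.in_page:
            self.character_buffer += data

def parse(filename):
    parser = make_parser()
    # Named content_handler to avoid shadowing the imported handler module.
    content_handler = MyHandler()
    parser.setContentHandler(content_handler)
    parser.parse(filename)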
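
To try the same pattern without a file on disk, `xml.sax.parseString` accepts the document as bytes. The sketch below is a variant under stated assumptions: `PageCollector`, `collect_pages`, and the sample XML are hypothetical names, and the page texts are returned in a list instead of being printed.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PageCollector(ContentHandler):
    """Hypothetical handler that gathers the text of each <page> element."""

    def __init__(self):
        super().__init__()
        self.in_page = False
        self.buffer = []   # text chunks of the page currently being read
        self.pages = []    # completed page texts, in document order

    def startElement(self, name, attrs):
        if name == "page":
            self.in_page = True
            self.buffer = []

    def endElement(self, name):
        if name == "page":
            self.in_page = False
            self.pages.append("".join(self.buffer))

    def characters(self, data):
        # Text outside <page> elements is ignored.
        if self.in_page:
            self.buffer.append(data)


def collect_pages(xml_bytes):
    """Parse an in-memory document and return the list of page texts."""
    collector = PageCollector()
    xml.sax.parseString(xml_bytes, collector)
    return collector.pages


sample = b"<wiki><page>First</page><title>skip</title><page>Second</page></wiki>"
print(collect_pages(sample))
```

Returning a list keeps the handler easy to test, and joining the chunks once at the end avoids repeated string concatenation on large pages.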


See also

Python xml.sax module documentation


CategoryRicottone

Python/XmlSax (last edited 2023-04-11 14:35:15 by DominicRicottone)