196.html ( File view )

  • By 2010-08-21
  • View(s):12
  • Download(s):0
  • Point(s): 1
			






Safari | Python Developer's Handbook -> Summary





< BACKMake Note | BookmarkCONTINUE >
152015024128143245168232148039199167010047123209178152124239215162148043124058173185002141

Summary

This chapter provides information concerning how to use Python for data parsing and manipulation. You learned how to interpret XML, SGML, and HTML documents and how to parse and manipulate email messages, among other things. As you might already know, Python can be used as a very effective and productive tool to parse and manipulate information from the Web.

Extensible Markup Language describes a class of data objects called XML documents and partially describes the behavior of computer programs that process them. For those who want to play around with XML in Python, there is a Python/XML package to serve several purposes at once. This package contains everything required for basic XML applications, along with documentation and sample code.

Besides that, the xmllib module serves as the basis for parsing text files formatted in XML. Note that xmllib is not XML 1.0 compliant, and it doesn't provide any Unicode support. It provides just simple XML support for ASCII only element and attribute names.

Many XML-based technologies are available for Python/XML development, such as

SAX This is a common event-based interface for object-oriented XML parsers.

The Document Object Model (DOM)This is a standard interface for manipulating XML and HTML documents developed by the World Wide Web Consortium. 4DOM is a Python library for XML and HTML processing and manipulation using the W3C's Document Object Model for interface.

XSLT This is an XML transformation processor based on the W3C's specification.

XML Bookmark Exchange Language (XBEL) This is an Internet "bookmarks" interchange format.

SOAP This is an XML/HTTP-based protocol for accessing services, objects, and servers in a platform-independent manner. Scarab is a minimal Python SOAP implementation.

PythonPoint This has a simple XML markup language for doing presentation slides and converting them to PDF documents.

Pyxie This is an Open Source XML processing library for Python.

XML-RPC This is a specification and a set of implementations that allow software running on different operating systems and different environments to make procedure calls over the Internet. It is important to say that Python has its own implementation of XML-RPC.

XDR This is a standard for data description and encoding. Protocols such as RPC and NFS use XDR to describe the format of their data.

But Python is not just XML. It also provides support for other markup languages.

The sgmllib module is an SGML (Standard Generalized Markup Language) parser subset. Although it has a simple implementation, it is powerful enough to build the HTML parser.

The htmllib module defines a parser class that can serve as a base for parsing text files formatted in HTML. Two helper modules are used by htmllib:

  • The htmlentitydefs module is a dictionary that contains all the definitions for the general entities defined by HTML 2.0.

  • The formatter module is used for generic output formatting by the HTMLPARSER class of the htmllib module.

Apart from markup languages, this chapter also covers mail messages manipulation.

MIME (Multipurpose Internet Mail Extensions) is a standard for sending multi-part multimedia data through Internet mail. This standard exposes mechanisms for specifying and describing the format of Internet message bodies. Python provides many modules to support MIME messages, including the following:

mimetools Provides utility tools for parsing and manipulation of MIME multi-part and encoded messages.

MimeWriter Implements a generic file-writing class that is used to create MIME encoded multi-part files (messages).

multifile Enables you to treat distinct parts of a text file as file-like input objects.

mailcap Reads mailcap files and configures how MIME-aware applications react to files with different MIME types.

mimetypes Supports conversions between a filename or URL and the MIME type associated with the filename extension.

quopri Performs quoted-printable transport encoding and decoding of MIME quoted-printable data.

mailbox Implements classes that allow easy and uniform access to read various mailbox formats in a UNIX system.

mimify Contains functions to convert and process simple and multi-part mail messages to/from MIME format.

rfc822 Parses mail headers that are defined by the Internet standard RFC 822.

Python uses the following modules for general data conversions:

netrc Parses, processes, and encapsulates the .netrc configuration file format used by UNIX FTP program and other FTP clients.

mhlib Provides a Python interface to access MH folders, mailboxes, and their contents.

base64 Performs base64 encoding and decoding of arbitrary binary strings into text string that can be safely emailed or posted.

binhex Encodes and decodes files in binhex4 format. This format is commonly used to represent files on Macintosh systems.

uu Encodes and decodes files in uuencode format.

binascii Implements methods to convert data between binary and various ASCII-encoded binary representations, including binhex,uu, and base64.


Last updated on 1/30/2002
Python Developer's Handbook, © 2002 Sams Publishing

< BACKMake Note | Bookmark
...
Expand> <Close
Sponsored links

File list

Tips: You can preview the content of files by clicking file names^_^
Name Size Date
0672319942.html3.37 kB01-06-02 20:06
1.html5.10 kB01-06-02 20:06
10.html3.69 kB01-06-02 20:06
100.html5.41 kB01-06-02 20:06
101.html7.96 kB01-06-02 20:06
102.html3.75 kB01-06-02 20:06
103.html3.75 kB01-06-02 20:06
104.html5.81 kB01-06-02 20:06
105.html16.46 kB01-06-02 20:06
106.html25.87 kB01-06-02 20:06
107.html7.44 kB01-06-02 20:06
108.html20.95 kB01-06-02 20:06
109.html10.02 kB01-06-02 20:06
11.html3.66 kB01-06-02 20:06
110.html9.58 kB01-06-02 20:06
111.html9.96 kB01-06-02 20:06
112.html11.34 kB01-06-02 20:06
113.html7.87 kB01-06-02 20:06
114.html13.56 kB01-06-02 20:06
115.html3.76 kB01-06-02 20:06
116.html3.76 kB01-06-02 20:06
117.html4.27 kB01-06-02 20:06
118.html4.27 kB01-06-02 20:06
119.html9.45 kB01-06-02 20:06
12.html3.66 kB01-06-02 20:06
120.html8.62 kB01-06-02 20:06
121.html65.99 kB01-06-02 20:06
122.html28.42 kB01-06-02 20:06
123.html14.92 kB01-06-02 20:06
124.html7.17 kB01-06-02 20:06
125.html24.21 kB01-06-02 20:06
126.html10.82 kB01-06-02 20:06
127.html10.54 kB01-06-02 20:06
128.html3.86 kB01-06-02 20:06
129.html3.86 kB01-06-02 20:06
13.html12.82 kB01-06-02 20:06
130.html5.85 kB01-06-02 20:06
131.html5.30 kB01-06-02 20:06
132.html29.95 kB01-06-02 20:06
133.html61.54 kB01-06-02 20:06
134.html40.34 kB01-06-02 20:06
135.html9.11 kB01-06-02 20:06
136.html15.53 kB01-06-02 20:06
137.html3.86 kB01-06-02 20:06
138.html3.86 kB01-06-02 20:06
139.html5.80 kB01-06-02 20:06
14.html11.79 kB01-06-02 20:06
140.html14.32 kB01-06-02 20:06
141.html26.46 kB01-06-02 20:06
142.html21.93 kB01-06-02 20:06
143.html17.03 kB01-06-02 20:06
144.html7.95 kB01-06-02 20:06
145.html28.59 kB01-06-02 20:06
146.html52.14 kB01-06-02 20:06
147.html8.37 kB01-06-02 20:06
148.html3.63 kB01-06-02 20:06
149.html3.63 kB01-06-02 20:06
15.html10.93 kB01-06-02 20:06
150.html4.42 kB01-06-02 20:06
151.html16.23 kB01-06-02 20:06
152.html32.55 kB01-06-02 20:06
153.html13.13 kB01-06-02 20:06
154.html28.33 kB01-06-02 20:06
155.html40.15 kB01-06-02 20:06
156.html23.47 kB01-06-02 20:06
157.html7.73 kB01-06-02 20:06
158.html10.61 kB01-06-02 20:06
159.html3.72 kB01-06-02 20:06
16.html11.42 kB01-06-02 20:06
160.html3.72 kB01-06-02 20:06
161.html3.64 kB01-06-02 20:06
162.html3.64 kB01-06-02 20:06
163.html4.70 kB01-06-02 20:06
164.html60.18 kB01-06-02 20:06
165.html42.25 kB01-06-02 20:06
166.html17.91 kB01-06-02 20:06
167.html9.76 kB01-06-02 20:06
168.html13.52 kB01-06-02 20:06
169.html10.35 kB01-06-02 20:06
17.html17.40 kB01-06-02 20:06
170.html9.08 kB01-06-02 20:06
171.html3.61 kB01-06-02 20:06
172.html3.61 kB01-06-02 20:06
173.html6.95 kB01-06-02 20:06
174.html27.86 kB01-06-02 20:06
175.html28.55 kB01-06-02 20:06
176.html16.39 kB01-06-02 20:06
177.html24.60 kB01-06-02 20:06
178.html10.20 kB01-06-02 20:06
179.html3.62 kB01-06-02 20:06
18.html12.04 kB01-06-02 20:06
180.html3.62 kB01-06-02 20:06
181.html7.21 kB01-06-02 20:06
182.html11.83 kB01-06-02 20:06
183.html17.37 kB01-06-02 20:06
184.html87.57 kB01-06-02 20:06
185.html25.23 kB01-06-02 20:06
186.html6.62 kB01-06-02 20:06
187.html3.73 kB01-06-02 20:06
188.html3.73 kB01-06-02 20:06
189.html4.76 kB01-06-02 20:06
19.html5.49 kB01-06-02 20:06
190.html74.34 kB01-06-02 20:06
191.html9.56 kB01-06-02 20:06
192.html31.86 kB01-06-02 20:06
193.html67.73 kB01-06-02 20:06
194.html69.48 kB01-06-02 20:06
195.html32.75 kB01-06-02 20:06
196.html10.74 kB01-06-02 20:06
197.html3.54 kB01-06-02 20:06
198.html3.54 kB01-06-02 20:06
199.html3.81 kB01-06-02 20:06
2.html5.10 kB01-06-02 20:06
20.html9.25 kB01-06-02 20:06
200.html3.81 kB01-06-02 20:06
201.html8.82 kB01-06-02 20:06
202.html7.07 kB01-06-02 20:06
203.html50.32 kB01-06-02 20:06
204.html8.07 kB01-06-02 20:06
205.html7.53 kB01-06-02 20:06
206.html3.60 kB01-06-02 20:06
207.html3.60 kB01-06-02 20:06
208.html7.22 kB01-06-02 20:06
209.html21.63 kB01-06-02 20:06
21.html5.84 kB01-06-02 20:06
210.html24.67 kB01-06-02 20:06
211.html30.17 kB01-06-02 20:06
212.html159.15 kB01-06-02 20:06
213.html18.72 kB01-06-02 20:06
214.html6.54 kB01-06-02 20:06
215.html7.22 kB01-06-02 20:06
216.html6.76 kB01-06-02 20:06
217.html3.11 kB01-06-02 20:06
218.html3.54 kB01-06-02 20:06
219.html4.14 kB01-06-02 20:06
22.html3.69 kB01-06-02 20:06
220.html4.14 kB01-06-02 20:06
221.html4.82 kB01-06-02 20:06
222.html50.92 kB01-06-02 20:06
223.html3.87 kB01-06-02 20:06
224.html57.67 kB01-06-02 20:06
225.html37.66 kB01-06-02 20:06
226.html5.04 kB01-06-02 20:06
227.html3.85 kB01-06-02 20:06
228.html3.85 kB01-06-02 20:06
229.html4.47 kB01-06-02 20:06
23.html3.69 kB01-06-02 20:06
230.html21.41 kB01-06-02 20:06
231.html19.56 kB01-06-02 20:06
232.html26.27 kB01-06-02 20:06
233.html10.07 kB01-06-02 20:06
234.html22.22 kB01-06-02 20:06
235.html36.83 kB01-06-02 20:06
236.html49.23 kB01-06-02 20:06
237.html16.63 kB01-06-02 20:06
238.html6.96 kB01-06-02 20:06
239.html3.09 kB01-06-02 20:06
24.html4.86 kB01-06-02 20:06
240.html3.43 kB01-06-02 20:06
241.html3.59 kB01-06-02 20:06
242.html3.59 kB01-06-02 20:06
243.html17.49 kB01-06-02 20:06
244.html9.38 kB01-06-02 20:06
245.html16.62 kB01-06-02 20:06
246.html9.81 kB01-06-02 20:06
247.html11.50 kB01-06-02 20:06
248.html8.95 kB01-06-02 20:06
249.html8.93 kB01-06-02 20:06
25.html11.48 kB01-06-02 20:06
250.html10.89 kB01-06-02 20:06
251.html8.21 kB01-06-02 20:06
252.html5.14 kB01-06-02 20:06
253.html3.61 kB01-06-02 20:06
254.html3.61 kB01-06-02 20:06
255.html4.61 kB01-06-02 20:06
256.html4.61 kB01-06-02 20:06
257.html43.07 kB01-06-02 20:06
258.html10.12 kB01-06-02 20:06
259.html7.99 kB01-06-02 20:06
26.html25.88 kB01-06-02 20:06
260.html17.17 kB01-06-02 20:06
261.html10.99 kB01-06-02 20:06
262.html15.70 kB01-06-02 20:06
263.html36.19 kB01-06-02 20:06
264.html61.93 kB01-06-02 20:06
265.html39.81 kB01-06-02 20:06
266.html15.13 kB01-06-02 20:06
267.html16.34 kB01-06-02 20:06
268.html3.55 kB01-06-02 20:06
269.html3.55 kB01-06-02 20:06
27.html27.92 kB01-06-02 20:06
270.html14.30 kB01-06-02 20:06
271.html14.21 kB01-06-02 20:06
272.html9.44 kB01-06-02 20:06
273.html8.41 kB01-06-02 20:06
274.html5.45 kB01-06-02 20:06
275.html5.45 kB01-06-02 20:06
276.html8.08 kB01-06-02 20:06
277.html9.78 kB01-06-02 20:06
278.html6.04 kB01-06-02 20:06
279.html6.89 kB01-06-02 20:06
28.html12.02 kB01-06-02 20:06
280.html11.77 kB01-06-02 20:06
281.html10.18 kB01-06-02 20:06
282.html4.98 kB01-06-02 20:06
283.html4.98 kB01-06-02 20:06
284.html4.12 kB01-06-02 20:06
285.html5.78 kB01-06-02 20:06
286.html12.42 kB01-06-02 20:06
287.html6.87 kB01-06-02 20:06
29.html45.69 kB01-06-02 20:06
3.html5.51 kB01-06-02 20:06
30.html10.03 kB01-06-02 20:06
31.html23.06 kB01-06-02 20:06
32.html22.26 kB01-06-02 20:06
33.html18.88 kB01-06-02 20:06
34.html18.25 kB01-06-02 20:06
35.html15.70 kB01-06-02 20:06
36.html5.50 kB01-06-02 20:06
37.html11.83 kB01-06-02 20:06
38.html3.69 kB01-06-02 20:06
39.html3.69 kB01-06-02 20:06
4.html5.51 kB01-06-02 20:06
40.html8.24 kB01-06-02 20:06
41.html12.92 kB01-06-02 20:06
42.html4.51 kB01-06-02 20:06
43.html3.90 kB01-06-02 20:06
44.html3.52 kB01-06-02 20:06
45.html5.99 kB01-06-02 20:06
46.html4.33 kB01-06-02 20:06
47.html4.19 kB01-06-02 20:06
48.html4.01 kB01-06-02 20:06
49.html3.95 kB01-06-02 20:06
5.html4.69 kB01-06-02 20:06
50.html3.63 kB01-06-02 20:06
51.html3.75 kB01-06-02 20:06
52.html5.56 kB01-06-02 20:06
53.html4.41 kB01-06-02 20:06
54.html6.99 kB01-06-02 20:06
55.html3.52 kB01-06-02 20:06
56.html3.95 kB01-06-02 20:06
57.html3.69 kB01-06-02 20:06
58.html4.27 kB01-06-02 20:06
59.html3.88 kB01-06-02 20:06
6.html4.69 kB01-06-02 20:06
60.html3.58 kB01-06-02 20:06
61.html4.17 kB01-06-02 20:06
62.html3.93 kB01-06-02 20:06
63.html3.68 kB01-06-02 20:06
64.html4.16 kB01-06-02 20:06
65.html4.04 kB01-06-02 20:06
66.html4.97 kB01-06-02 20:06
67.html3.91 kB01-06-02 20:06
68.html3.55 kB01-06-02 20:06
69.html3.51 kB01-06-02 20:06
7.html10.67 kB01-06-02 20:06
70.html3.52 kB01-06-02 20:06
71.html3.83 kB01-06-02 20:06
72.html4.02 kB01-06-02 20:06
73.html21.61 kB01-06-02 20:06
74.html12.94 kB01-06-02 20:06
75.html31.26 kB01-06-02 20:06
76.html11.98 kB01-06-02 20:06
77.html4.24 kB01-06-02 20:06
78.html4.26 kB01-06-02 20:06
79.html10.95 kB01-06-02 20:06
8.html10.67 kB01-06-02 20:06
80.html12.11 kB01-06-02 20:06
81.html4.88 kB01-06-02 20:06
82.html6.48 kB01-06-02 20:06
83.html5.64 kB01-06-02 20:06
84.html13.30 kB01-06-02 20:06
85.html7.00 kB01-06-02 20:06
86.html3.90 kB01-06-02 20:06
87.html4.59 kB01-06-02 20:06
88.html5.00 kB01-06-02 20:06
89.html17.71 kB01-06-02 20:06
9.html3.69 kB01-06-02 20:06
90.html6.00 kB01-06-02 20:06
91.html3.68 kB01-06-02 20:06
92.html3.68 kB01-06-02 20:06
93.html12.98 kB01-06-02 20:06
94.html10.60 kB01-06-02 20:06
95.html19.93 kB01-06-02 20:06
96.html10.34 kB01-06-02 20:06
97.html6.27 kB01-06-02 20:06
98.html6.77 kB01-06-02 20:06
99.html16.97 kB01-06-02 20:06
front_matter.html4.16 kB01-06-02 20:06
index.html3.37 kB01-06-02 20:06
new_toc.html15.79 kB01-06-02 20:06
rindex1.html53.02 kB01-06-02 20:05
rindex10.html5.16 kB01-06-02 20:05
rindex11.html3.96 kB01-06-02 20:05
rindex12.html15.07 kB01-06-02 20:05
rindex13.html84.43 kB01-06-02 20:05
rindex14.html10.85 kB01-06-02 20:05
rindex15.html26.04 kB01-06-02 20:05
rindex16.html58.33 kB01-06-02 20:05
rindex17.html3.52 kB01-06-02 20:05
rindex18.html16.83 kB01-06-02 20:05
rindex19.html52.89 kB01-06-02 20:05
rindex2.html13.47 kB01-06-02 20:05
rindex20.html17.55 kB01-06-02 20:05
rindex21.html13.56 kB01-06-02 20:05
rindex22.html10.63 kB01-06-02 20:05
rindex23.html16.92 kB01-06-02 20:05
rindex24.html4.90 kB01-06-02 20:05
rindex25.html2.50 kB01-06-02 20:05
rindex3.html43.59 kB01-06-02 20:06
rindex4.html21.76 kB01-06-02 20:06
rindex5.html22.31 kB01-06-02 20:06
rindex6.html36.57 kB01-06-02 20:06
rindex7.html14.06 kB01-06-02 20:06
rindex8.html11.93 kB01-06-02 20:06
rindex9.html25.33 kB01-06-02 20:06
toc.html24.97 kB01-06-02 20:06
<(ebook>0.00 BPython) O'Reilly
<0672319942>0.00 B25-08-02 17:24
<graphics>0.00 B25-08-02 17:24
<images>0.00 B25-08-02 17:24
<oreillyi>0.00 B25-08-02 17:24
oreillyM.css4.42 kB31-05-02 17:16
oreillyN.css4.42 kB31-05-02 17:16
00.gif41.00 B31-05-02 17:16
01fig01.gif58.03 kB28-01-02 08:54
01fig02.gif39.76 kB28-01-02 08:54
01fig03.gif22.06 kB28-01-02 08:54
01fig04.gif64.75 kB28-01-02 08:54
02fig01.gif54.48 kB28-01-02 08:54
02fig02.gif16.29 kB28-01-02 08:54
02fig03.gif14.11 kB28-01-02 08:54
02fig04.gif47.82 kB28-01-02 08:54
06fig01.gif32.19 kB28-01-02 08:54
06fig02.gif58.39 kB28-01-02 08:54
06fig03.gif48.96 kB28-01-02 08:54
07fig01.gif47.96 kB28-01-02 08:54
07fig02.gif8.99 kB28-01-02 08:54
07fig03.gif9.46 kB28-01-02 08:54
07fig04.gif11.51 kB28-01-02 08:54
07fig05.gif8.27 kB28-01-02 08:54
07fig06.gif31.46 kB28-01-02 08:54
12fig01.gif34.54 kB28-01-02 08:54
15fig01.gif1.30 kB28-01-02 08:54
15fig02.gif1.50 kB28-01-02 08:54
15fig03.gif5.93 kB28-01-02 08:54
15fig04.gif924.00 B28-01-02 08:54
15fig05.gif1.55 kB28-01-02 08:54
15fig06.gif1.62 kB28-01-02 08:54
15fig07.gif1.69 kB28-01-02 08:54
15fig08.gif1.02 kB28-01-02 08:54
15fig09.gif2.54 kB28-01-02 08:54
15fig10.gif1.92 kB28-01-02 08:54
15fig11.gif2.44 kB28-01-02 08:54
15fig12.gif2.48 kB28-01-02 08:54
15fig13.gif2.36 kB28-01-02 08:54
15fig14.gif5.43 kB28-01-02 08:54
15fig15.gif5.51 kB28-01-02 08:54
15fig16.gif2.18 kB28-01-02 08:54
15fig17.gif12.22 kB28-01-02 08:54
16fig01.gif28.99 kB28-01-02 08:54
16fig02.gif23.02 kB28-01-02 08:54
16fig03.gif32.48 kB28-01-02 08:54
16fig04.gif17.27 kB28-01-02 08:54
16fig05.gif23.88 kB28-01-02 08:54
16fig06.gif23.85 kB28-01-02 08:54
16fig07.gif36.76 kB28-01-02 08:54
16fig08.gif20.66 kB28-01-02 08:54
16fig09.gif71.63 kB28-01-02 08:54
16fig10.gif23.23 kB28-01-02 08:54
16fig11.gif22.97 kB28-01-02 08:54
16fig12.gif22.06 kB28-01-02 08:54
16fig13.gif49.73 kB28-01-02 08:54
16fig14.gif60.34 kB28-01-02 08:54
18fig01.gif63.94 kB28-01-02 08:54
18fig02.gif24.07 kB28-01-02 08:54
ccc.gif109.00 B28-01-02 08:54
spacer.gif41.00 B31-05-02 17:16
view.gif41.00 B31-05-02 17:16
0672319942_s.jpg2.91 kB29-01-02 11:29
...
Sponsored links
  • Sent successfully!
  • 1 point

196.html (1.65 MB)

Need 1 point
Your Point(s)

Your Point isn't enough.

Get point immediately by PayPal

More(Debit card / Credit card / PayPal Credit / Online Banking)

Submit your source codes. Get more point

^_^"Oops ...

Sorry!This guy is mysterious, its blog hasn't been opened, try another, please!
OK

Warm tip!

CodeForge to FavoriteFavorite by Ctrl+D