Beautiful Soup Documentation
Release 4.4.0
Leonard Richardson
Dec 24, 2019
Contents
1 Getting help
2 Quick Start
3 Installing Beautiful Soup
3.1 Problems after installation
3.2 Installing a parser
4 Making the soup
5 Kinds of objects
5.1 Tag
5.2 NavigableString
5.3 BeautifulSoup
5.4 Comments and other special strings
6 Navigating the tree
6.1 Going down
6.2 Going up
6.3 Going sideways
6.4 Going back and forth
7 Searching the tree
7.1 Kinds of filters
7.2 find_all()
7.3 Calling a tag is like calling find_all()
7.4 find()
7.5 find_parents() and find_parent()
7.6 find_next_siblings() and find_next_sibling()
7.7 find_previous_siblings() and find_previous_sibling()
7.8 find_all_next() and find_next()
7.9 find_all_previous() and find_previous()
7.10 CSS selectors
8 Modifying the tree
8.1 Changing tag names and attributes
8.2 Modifying .string
8.3 append()
8.4 extend()
8.5 NavigableString() and .new_tag()
8.6 insert()
8.7 insert_before() and insert_after()
8.8 clear()
8.9 extract()
8.10 decompose()
8.11 replace_with()
8.12 wrap()
8.13 unwrap()
8.14 smooth()
9 Output
9.1 Pretty-printing
9.2 Non-pretty printing
9.3 Output formatters
9.4 get_text()
10 Specifying the parser to use
10.1 Differences between parsers
11 Encodings
11.1 Output encoding
11.2 Unicode, Dammit
12 Line numbers
13 Comparing objects for equality
14 Copying Beautiful Soup objects
15 Parsing only part of a document
15.1 SoupStrainer
16 Troubleshooting
16.1 diagnose()
16.2 Errors when parsing a document
16.3 Version mismatch problems
16.4 Parsing XML
16.5 Other parser problems
16.6 Miscellaneous
16.7 Improving Performance
17 Translating this documentation
18 Beautiful Soup 3
18.1 Porting code to BS4
Beautiful Soup is a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.
These instructions illustrate all major features of Beautiful Soup 4, with examples. I show you what the library is good for, how it works, how to use it, how to make it do what you want, and what to do when it violates your expectations.
This document covers Beautiful Soup version 4.8.1. The examples in this documentation should work the same way in Python 2.7 and Python 3.2.
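To give a flavor of what navigating, searching, and modifying the parse tree can look like, here is a minimal sketch. The markup, the variable names, and the `former-link` class are invented for illustration and do not come from the official examples; the sketch assumes the `bs4` package is installed and uses `html.parser`, the parser that ships with Python.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this illustration.
html_doc = """
<html><head><title>Example page</title></head>
<body>
<p class="intro">See <a href="http://example.com/one" id="link1">One</a> and
<a href="http://example.com/two" id="link2">Two</a>.</p>
</body></html>
"""

# "html.parser" is the parser from Python's standard library.
soup = BeautifulSoup(html_doc, "html.parser")

# Navigating: step down the tree by tag name.
page_title = soup.title.string

# Searching: collect the href attribute of every <a> tag.
links = [a["href"] for a in soup.find_all("a")]

# Modifying: rename a tag, drop one attribute, and add another.
first_link = soup.find("a", id="link1")
first_link.name = "span"
del first_link["href"]
first_link["class"] = "former-link"

print(page_title)
print(links)
print(first_link)
```

Each of these operations is covered in detail in the sections on navigating, searching, and modifying the tree.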
You might be looking for the documentation for Beautiful Soup 3. If so, you should know that Beautiful Soup 3 is no longer being developed and that support for it will be dropped on or after December 31, 2020. If you want to learn about the differences between Beautiful Soup 3 and Beautiful Soup 4, see Porting code to BS4.
This documentation has been translated into other languages by Beautiful Soup users:
- Este documento também está disponível em Português do Brasil.