BeautifulSoup Cheat Sheet (Python)
Source: dev.to
Commonly Used find and select Methods in BeautifulSoup
1. find()
- Purpose: Find the first occurrence of a tag. Returns None if nothing matches.
- Usage:
  soup.find('tag_name', {attributes}, string=optional_text)
  (older releases call the string argument text)
- Example:
  first_div = soup.find('div')
  p_with_class = soup.find('p', class_='example')
  a_tag = soup.find('a', href='/home')
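A minimal runnable sketch of the calls above. The sample HTML is invented for illustration, and the built-in "html.parser" backend is an assumption; any installed parser should behave the same here.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = """
<div id="first">one</div>
<div id="second">two</div>
<p class="example">hello</p>
<a href="/home">Home</a>
"""
soup = BeautifulSoup(html, "html.parser")

first_div = soup.find("div")                      # first <div> in document order
p_with_class = soup.find("p", class_="example")   # class_ avoids the Python keyword
a_tag = soup.find("a", href="/home")              # match on an attribute value
missing = soup.find("table")                      # no <table> in the sample

print(first_div["id"])
print(p_with_class.get_text())
print(a_tag["href"])
print(missing)
```

Because find() returns None on a miss, check the result before chaining further calls onto it.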
2. find_all()
- Purpose: Find all occurrences of a tag.
- Usage:
  soup.find_all('tag_name', {attributes}, limit=number)
- Example:
  all_p_tags = soup.find_all('p')
  all_links = soup.find_all('a', class_='link')
  first_five_divs = soup.find_all('div', limit=5)
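A small sketch of find_all() with an attribute filter and a limit. The markup is made up for the example, and the limit is lowered to 2 so the effect is visible on a short document.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = (
    "<div>1</div><div>2</div><div>3</div>"
    "<p>a</p>"
    "<a class='link' href='/x'>x</a><a href='/y'>y</a>"
)
soup = BeautifulSoup(html, "html.parser")

all_p_tags = soup.find_all("p")
all_links = soup.find_all("a", class_="link")    # only anchors with class "link"
first_two_divs = soup.find_all("div", limit=2)   # stop after two matches
no_tables = soup.find_all("table")               # empty result, not None

print(len(all_p_tags))
print([a["href"] for a in all_links])
print([d.get_text() for d in first_two_divs])
print(len(no_tables))
```

Unlike find(), a miss gives back an empty list-like result, so looping over it is always safe.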
3. select()
- Purpose: Find all tags matching a CSS selector.
- Usage:
  soup.select('CSS_selector')
- Example:
  divs_with_class = soup.select('div.example')
  links_in_divs = soup.select('div a')
  element_with_id = soup.select('#specific-id')
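A short sketch of the three selectors above on invented markup. Note that select() always returns a list-like result, even for an id selector that can match only one element.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = (
    '<div class="example"><a href="/a">A</a></div>'
    '<div><a href="/b">B</a></div>'
    '<p id="specific-id">target</p>'
)
soup = BeautifulSoup(html, "html.parser")

divs_with_class = soup.select("div.example")   # <div> elements with class "example"
links_in_divs = soup.select("div a")           # <a> elements anywhere inside a <div>
element_with_id = soup.select("#specific-id")  # still a list, with one element

print(len(divs_with_class))
print([a["href"] for a in links_in_divs])
print(element_with_id[0].get_text())
```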
4. find_parents() / find_parent()
- Purpose: Find the parent(s) of a tag. Call these on a tag you have already located, not on the soup object itself.
- Usage:
  tag.find_parent('tag_name') or tag.find_parents('tag_name')
- Example:
  parent_div = soup.find('span').find_parent('div')
  all_parents = soup.find('span').find_parents()
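A sketch of walking upward from a nested tag, using invented markup. The ancestors come back nearest first; only the first four names are printed here because the list typically ends with the document object itself.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = "<html><body><div id='outer'><p><span>text</span></p></div></body></html>"
soup = BeautifulSoup(html, "html.parser")

span = soup.find("span")
parent_div = span.find_parent("div")             # closest enclosing <div>
names = [p.name for p in span.find_parents()]    # all ancestors, nearest first

print(parent_div["id"])
print(names[:4])
```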
5. find_next_sibling() / find_previous_sibling()
- Purpose: Find the next or previous sibling of a tag (an element with the same parent).
- Usage:
  tag.find_next_sibling('tag_name') or tag.find_previous_sibling('tag_name')
- Example:
  next_sibling = soup.find('div').find_next_sibling()
  prev_sibling = soup.find('div').find_previous_sibling()
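A sketch of sibling navigation on invented markup. A tag name is passed in each call here to make the intended target explicit.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = "<section><h2>Title</h2><p>first</p><p>second</p></section>"
soup = BeautifulSoup(html, "html.parser")

first_p = soup.find("p")
next_p = first_p.find_next_sibling("p")           # the following <p> under <section>
prev_h2 = first_p.find_previous_sibling("h2")     # the <h2> before it
nothing = soup.find("h2").find_previous_sibling("p")  # <h2> is the first child

print(next_p.get_text())
print(prev_h2.get_text())
print(nothing)
```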
6. find_all_next() / find_all_previous()
- Purpose: Find all tags that come after or before a specific tag in document order, including nested tags, not only siblings.
- Usage:
  tag.find_all_next('tag_name') or tag.find_all_previous('tag_name')
- Example:
  next_p_tags = soup.find('h1').find_all_next('p')
  previous_div_tags = soup.find('h2').find_all_previous('div')
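A sketch showing that these methods follow document order rather than the sibling relationship. The markup and ids are invented for illustration.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = (
    "<div id='a'>a</div>"
    "<h1>Head</h1>"
    "<p>one</p>"
    "<div id='b'>b<p>two</p></div>"
    "<h2>Sub</h2>"
    "<p>three</p>"
)
soup = BeautifulSoup(html, "html.parser")

# Every <p> after the <h1>, including the one nested inside <div id="b">.
next_p_tags = soup.find("h1").find_all_next("p")

# Every <div> before the <h2>, walking backwards, so the nearest comes first.
previous_div_tags = soup.find("h2").find_all_previous("div")

print([p.get_text() for p in next_p_tags])
print([d["id"] for d in previous_div_tags])
```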
7. select_one()
- Purpose: Find the first element matching a CSS selector.
- Usage:
  soup.select_one('CSS_selector')
- Example:
  first_div_container = soup.select_one('div.container')
  first_link_in_main = soup.select_one('#main a')
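A sketch of select_one() on invented markup, including the miss case. The "#sidebar" selector is a hypothetical id that does not exist in the sample.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = (
    '<div class="container">c1</div>'
    '<div class="container">c2</div>'
    '<div id="main"><a href="/1">1</a><a href="/2">2</a></div>'
)
soup = BeautifulSoup(html, "html.parser")

first_div_container = soup.select_one("div.container")  # first match only
first_link_in_main = soup.select_one("#main a")
missing = soup.select_one("#sidebar a")                  # hypothetical id, no match

print(first_div_container.get_text())
print(first_link_in_main["href"])
print(missing)
```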
8. find_next() / find_previous()
- Purpose: Find the next or previous matching element in document order. Unlike the sibling methods, the match may be a descendant or an ancestor.
- Usage:
  tag.find_next('tag_name') or tag.find_previous('tag_name')
- Example:
  next_p_tag = soup.find('div').find_next('p')
  previous_div = soup.find('p').find_previous('div')
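A sketch contrasting find_next() with find_next_sibling() on invented markup: the first walks the whole document and can step into children, the second stays on the same level.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = "<div id='intro'><span>x</span></div><p>after</p>"
soup = BeautifulSoup(html, "html.parser")

first_div = soup.find("div")
next_p_tag = first_div.find_next("p")                   # the <p> after the <div>
nested_span = first_div.find_next("span")               # descends into the <div>
sibling_span = first_div.find_next_sibling("span")      # no <span> on the same level
previous_div = soup.find("p").find_previous("div")      # walks backwards to the <div>

print(next_p_tag.get_text())
print(nested_span.get_text())
print(sibling_span)
print(previous_div["id"])
```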
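9. find_all(string=...)
- Purpose: Find text nodes instead of tags. A plain string must match the whole text node exactly; use a function or a compiled regular expression for partial matches. string=True returns every text node.
- Usage:
  soup.find_all(string="text_to_find")
- Example:
  python_mentions = soup.find_all(string="Python")
  programming_mentions = soup.find_all(string=lambda text: "Programming" in text)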
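A sketch of the three matching styles for string searches, on invented markup. The helper name mentions_programming is hypothetical, and the None guard inside it is a defensive assumption rather than something the library is known to require.

```python
import re

from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = "<p>Python</p><p>Python is fun</p><p>Programming in Python</p>"
soup = BeautifulSoup(html, "html.parser")


def mentions_programming(text):
    # Hypothetical helper; the None check is purely defensive.
    return text is not None and "Programming" in text


exact = [str(s) for s in soup.find_all(string="Python")]              # whole-node match
by_function = [str(s) for s in soup.find_all(string=mentions_programming)]
by_regex = [str(s) for s in soup.find_all(string=re.compile("Python"))]  # substring match

print(exact)
print(by_function)
print(len(by_regex))
```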
10. find_all(True) (Find all tags)
- Purpose: Find all tags in the document.
- Usage:
  soup.find_all(True)
- Example:
  all_tags = soup.find_all(True)
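A sketch of find_all(True) on invented markup. Tags come back in document order, and text nodes are not included.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = "<html><body><h1>T</h1><p>text <b>bold</b></p></body></html>"
soup = BeautifulSoup(html, "html.parser")

all_tags = soup.find_all(True)
tag_names = [tag.name for tag in all_tags]

print(tag_names)
```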
Example Use Cases
- Find all links on a page:
  links = soup.find_all('a', href=True)
  for link in links:
      print(link['href'])
- Find all headings (h1 to h6):
  headings = soup.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6'])
  for heading in headings:
      print(heading.get_text())
- Extract text from a specific class using a CSS selector:
  # select_one() returns None on a miss, so this line raises AttributeError if the class is absent
  text_in_class = soup.select_one('.specific-class').get_text()
- Find all images on a page:
  images = soup.find_all('img')
  for image in images:
      print(image.get('src'))  # .get() avoids a KeyError when an <img> has no src
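The use cases above can be combined into one defensive sketch. The markup is invented, and filtering with src=True plus the explicit None check are suggested precautions, not part of the original examples.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = """
<h1>Title</h1>
<h2>Sub</h2>
<a href="/home">Home</a><a>no href</a>
<img src="a.png"><img alt="no src">
<p class="specific-class"> Hello </p>
"""
soup = BeautifulSoup(html, "html.parser")

# href=True / src=True keep only tags that actually carry the attribute.
links = [a["href"] for a in soup.find_all("a", href=True)]
image_sources = [img["src"] for img in soup.find_all("img", src=True)]

# A list of names matches any of them, in document order.
headings = [h.get_text() for h in soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"])]

# Guard against a missing element before calling get_text().
node = soup.select_one(".specific-class")
text_in_class = node.get_text(strip=True) if node is not None else ""

print(links)
print(image_sources)
print(headings)
print(text_in_class)
```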
Notes
- find() and find_all() are the go-to methods for finding elements based on tag names and attributes.
- select() and select_one() are very powerful if you're comfortable with CSS selectors.
- Navigational methods like find_next(), find_previous(), and find_parents() help when you need to move from a tag to its neighbors, siblings, or parents.
- find_all(string=...) is useful when searching for specific text rather than tags; find_all(string=True) returns every text node.
Additional Methods:
- find_all(True): Finds all tags in the document, useful when you want to iterate over everything.
- get_text(): Extracts the text from a tag, stripping away HTML tags.
  # Example of extracting text:
  text = soup.find('p').get_text()
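A sketch of get_text() with and without its separator and strip arguments, on invented markup. Without a separator, text from adjacent tags runs together.

```python
from bs4 import BeautifulSoup

# Invented sample markup, only for demonstration.
html = "<p>Hello, <b>world</b>!</p><div><span>a</span><span>b</span></div>"
soup = BeautifulSoup(html, "html.parser")

text = soup.find("p").get_text()                               # nested tags are flattened
run_together = soup.find("div").get_text()                     # no separator between spans
joined = soup.find("div").get_text(separator=" ", strip=True)  # one space between pieces

print(text)
print(run_together)
print(joined)
```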