Of course! It seems you're looking for information on how to find things within a Python string. This is a very common task.

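The most important thing to know is that Python strings are immutable. This means you cannot change them in place. Methods that "find" things only report a result (an index, True/False, or a match object) and leave the string untouched, while methods that "replace" things return a new string with the changes.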
Here is a comprehensive guide to the main methods and concepts for finding and manipulating strings in Python.
Finding Substrings (The Basics)
These methods are used to check if a substring exists and where it is located.
in Operator (The Easiest Check)
This is the most straightforward way to see if a substring exists within a string. It returns True or False.

text = "Hello, world!"
# Check if a substring exists
print("world" in text) # Output: True
print("python" in text) # Output: False
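One detail worth showing here is that in compares characters exactly, so the check is case-sensitive. The snippet below is a minimal sketch of a case-insensitive check; normalizing both sides with str.casefold() is one common approach, not the only one.

```python
text = "Hello, world!"

# `in` is case-sensitive: "World" (capital W) is not in the text.
print("World" in text)                        # Output: False

# Normalizing both sides first gives a case-insensitive check.
print("World".casefold() in text.casefold())  # Output: True
```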
str.find(substring)
This method returns the lowest index where the substring is found. If the substring is not found, it returns -1.
text = "Hello, world, welcome to the world of Python."
# Find the index of "world"
index = text.find("world")
print(f"Found 'world' at index: {index}") # Output: Found 'world' at index: 7
# Find a substring that appears multiple times
# find() only returns the first occurrence
index_second = text.find("world", index + 1) # Start searching after the first 'world'
print(f"Found second 'world' at index: {index_second}") # Output: Found second 'world' at index: 29
# Find a substring that doesn't exist (the search is case-sensitive: the text contains "Python", not "python")
not_found = text.find("python")
print(f"Index of 'python': {not_found}") # Output: Index of 'python': -1
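Since find() accepts a start index, it can be called in a loop to collect every occurrence. The helper below is a minimal sketch; the name find_all is hypothetical (it is not a built-in), and it assumes you want non-overlapping matches.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        return []  # an empty substring would otherwise match at every position
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search just past the match we just recorded.
        start = text.find(sub, start + len(sub))
    return positions

text = "Hello, world, welcome to the world of Python."
print(find_all(text, "world"))   # Output: [7, 29]
print(find_all(text, "python"))  # Output: []
```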
str.rfind(substring)
This is the same as find(), but it returns the highest index (the last occurrence).
text = "Hello, world, welcome to the world of Python."
# Find the last occurrence of "world"
last_index = text.rfind("world")
print(f"Last 'world' found at index: {last_index}") # Output: Last 'world' found at index: 29
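A typical use of rfind() is splitting a string at its last separator. The sketch below uses a hypothetical helper, split_extension, to split a filename at its final dot; it is for illustration only and does not handle special cases such as hidden files.

```python
def split_extension(filename):
    """Split at the last dot; return (filename, "") if there is no dot."""
    dot = filename.rfind(".")
    if dot == -1:
        return filename, ""
    return filename[:dot], filename[dot + 1:]

print(split_extension("archive.tar.gz"))  # Output: ('archive.tar', 'gz')
print(split_extension("README"))          # Output: ('README', '')
```

For real paths, the standard library's os.path.splitext() or pathlib.Path.suffix is usually the safer choice.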
str.index(substring)
This works just like find(), but with one crucial difference: if the substring is not found, it raises a ValueError instead of returning -1. This is useful if you consider the absence of the substring to be an error condition.
text = "Hello, world!"
# Substring exists
print(text.index("world")) # Output: 7
# Substring does NOT exist
try:
    text.index("python")
except ValueError as e:
    print(f"Error: {e}")
# Output: Error: substring not found
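The difference between -1 and an exception matters in practice because the result of find() is easy to misuse in a condition. The short sketch below illustrates the pitfall: index 0 is falsy and -1 is truthy, so the result should be compared against -1 explicitly.

```python
text = "Hello, world!"

# Pitfall: a match at index 0 is falsy, and "not found" (-1) is truthy.
print(bool(text.find("Hello")))    # Output: False (although "Hello" is present)
print(bool(text.find("python")))   # Output: True (although "python" is absent)

# Compare against -1 explicitly, or use `in` when only existence matters.
print(text.find("Hello") != -1)    # Output: True
print(text.find("python") != -1)   # Output: False
```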
Checking the Start and End of a String
These methods are the idiomatic way to check file extensions, URL schemes, and similar prefixes or suffixes. They read more clearly than slicing and comparing by hand.

str.startswith(prefix)
Returns True if the string starts with the specified prefix, otherwise False.
filename = "image_001.jpg"
url = "https://www.python.org"
print(filename.startswith("image")) # Output: True
print(filename.startswith("data")) # Output: False
print(url.startswith("https://")) # Output: True
str.endswith(suffix)
Returns True if the string ends with the specified suffix, otherwise False.
filename = "image_001.jpg"
csv_file = "data_report.csv"
print(filename.endswith(".jpg")) # Output: True
print(csv_file.endswith(".txt")) # Output: False
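Both startswith() and endswith() also accept a tuple of candidates, which avoids chaining several checks with or. The sketch below assumes a hypothetical tuple IMAGE_SUFFIXES and a hypothetical helper is_image; lowercasing the filename first is an extra assumption so that ".JPG" is treated like ".jpg".

```python
# Hypothetical set of extensions, chosen for illustration.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg")

def is_image(filename):
    """Return True if the filename ends with any of the image suffixes."""
    return filename.lower().endswith(IMAGE_SUFFIXES)

print(is_image("image_001.jpg"))    # Output: True
print(is_image("PHOTO.JPEG"))       # Output: True
print(is_image("data_report.csv"))  # Output: False
```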
Extracting Substrings (Slicing)
Once you've found a substring's position, you can extract it using slicing.
The syntax is string[start:end]. It extracts from start up to (but not including) end.
text = "Hello, Python world!"
# Get the word "Python"
# We know it starts at index 7 and ends at index 13 (7+6)
python_word = text[7:13]
print(python_word) # Output: Python
# Slicing is forgiving with out-of-bounds indices
# This will not cause an error
print(text[100:200]) # Output: (an empty string)
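In practice the indices usually come from find() rather than being counted by hand. The following is a minimal sketch combining the two; extract_between is a hypothetical helper name, and returning None when a delimiter is missing is a design choice made for this example.

```python
def extract_between(text, left, right):
    """Return the text between the first `left` and the next `right`, or None."""
    start = text.find(left)
    if start == -1:
        return None
    start += len(left)             # move past the left delimiter
    end = text.find(right, start)  # search only after the left delimiter
    if end == -1:
        return None
    return text[start:end]

print(extract_between("name=<Ada Lovelace>", "<", ">"))  # Output: Ada Lovelace
print(extract_between("no markers here", "<", ">"))      # Output: None
```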
Advanced Pattern Matching with Regular Expressions
For complex "find" operations (e.g., finding all email addresses, phone numbers, or patterns in text), Python's re module is the tool for the job.
re.search(pattern, string)
Scans through the string looking for the first location where the pattern matches. It returns a match object if found, otherwise None.
import re
text = "My email is john.doe@example.com and my phone is 123-456-7890."
# Find an email address
email_match = re.search(r'\S+@\S+', text) # A simple regex for email
if email_match:
    print(f"Found email: {email_match.group()}") # .group() gets the matched text
# Output: Found email: john.doe@example.com
# Find a phone number
phone_match = re.search(r'\d{3}-\d{3}-\d{4}', text)
if phone_match:
    print(f"Found phone: {phone_match.group()}")
# Output: Found phone: 123-456-7890
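A match object carries more than the matched text: it also knows where the match sits and can expose parts of it through groups. The sketch below reuses the phone pattern with named groups; the group names area, prefix, and line are made up for this example.

```python
import re

text = "My email is john.doe@example.com and my phone is 123-456-7890."

# Named groups (?P<name>...) let us pull out parts of the match by name.
match = re.search(r'(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})', text)
if match:
    print(match.group())         # Output: 123-456-7890
    print(match.group("area"))   # Output: 123
    # start() and end() are indices into the original string,
    # so they can be used directly for slicing.
    print(text[match.start():match.end()])  # Output: 123-456-7890
```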
re.findall(pattern, string)
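Finds all non-overlapping matches of the pattern in the string and returns them as a list of strings. If the pattern contains a capturing group, the list holds the group's text instead of the full match.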
import re
text = "Apples are sweet, and oranges are citrusy. Bananas are also sweet."
# Find every occurrence of 'sweet'
sweet_words = re.findall(r'sweet', text)
print(sweet_words) # Output: ['sweet', 'sweet']
# Find all words that start with 'a' and end with 'e'
ae_words = re.findall(r'\ba\w*e\b', text) # \b is a word boundary
print(ae_words) # Output: ['are', 'are', 'are']
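re.findall() returns only the matched text. When the positions matter too, re.finditer() yields a match object for each occurrence. A minimal sketch, using the same sample sentence:

```python
import re

text = "Apples are sweet, and oranges are citrusy. Bananas are also sweet."

# Each item from finditer() is a match object, so start() is available.
positions = [m.start() for m in re.finditer(r'sweet', text)]
for pos in positions:
    # Slice the original text at each position to confirm what was found.
    print(pos, text[pos:pos + len("sweet")])
```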
Summary Table
| Method/Operator | What It Does | Return Value on Success | Return Value on Failure |
|---|---|---|---|
| in | Checks for existence | True | False |
| str.find() | Finds the first index | Integer index (e.g., 7) | -1 |
| str.rfind() | Finds the last index | Integer index (e.g., 29) | -1 |
| str.index() | Finds the first index | Integer index (e.g., 7) | Raises ValueError |
| str.startswith() | Checks if string starts with a prefix | True | False |
| str.endswith() | Checks if string ends with a suffix | True | False |
| re.search() | Finds the first pattern match | Match object | None |
| re.findall() | Finds all pattern matches | List of strings | Empty list [] |
Complete Example
Let's put it all together. Imagine we want to extract all image filenames from a log file.
import re
log_data = """
INFO: Processing image1.png at 10:00:05
ERROR: Could not find image2.jpg
INFO: Successfully processed photo3.jpeg
INFO: Starting task for image4.png
"""
# 1. Find all lines that start with "INFO:"
info_lines = [line for line in log_data.splitlines() if line.startswith("INFO:")]
print("--- Info Lines ---")
for line in info_lines:
    print(line)
# Output:
# --- Info Lines ---
# INFO: Processing image1.png at 10:00:05
# INFO: Successfully processed photo3.jpeg
# INFO: Starting task for image4.png
# 2. Use a regular expression to find all image filenames (e.g., .png, .jpg, .jpeg)
# re.findall is perfect for this
image_filenames = re.findall(r'\w+\.(png|jpg|jpeg)', log_data)
print("\n--- Found Image Filenames ---")
print(image_filenames)
# Output:
# --- Found Image Filenames ---
# ['png', 'jpg', 'jpeg', 'png']
# The list above holds only the extensions: when the pattern contains a
# capturing group (...), re.findall returns the group's text, not the full match.
# To get the full filename, make the group non-capturing with (?:...)
full_filenames = re.findall(r'\w+\.(?:png|jpg|jpeg)', log_data)
print("\n--- Found Full Image Filenames ---")
print(full_filenames)
# Output:
# --- Found Full Image Filenames ---
# ['image1.png', 'image2.jpg', 'photo3.jpeg', 'image4.png']
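The two steps above can also be combined so that each filename stays paired with the log level of its line. This is only a sketch under the assumption that every line starts with an uppercase level followed by a colon; the group names level and file are made up for this example.

```python
import re

log_data = """
INFO: Processing image1.png at 10:00:05
ERROR: Could not find image2.jpg
INFO: Successfully processed photo3.jpeg
INFO: Starting task for image4.png
"""

# re.MULTILINE makes ^ match at the start of every line, not just the string.
pattern = re.compile(
    r'^(?P<level>[A-Z]+):.*?(?P<file>\w+\.(?:png|jpg|jpeg))\b',
    re.MULTILINE,
)

pairs = [(m.group("level"), m.group("file")) for m in pattern.finditer(log_data)]
for level, filename in pairs:
    print(level, filename)
# Output:
# INFO image1.png
# ERROR image2.jpg
# INFO photo3.jpeg
# INFO image4.png
```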
Choose the method that best fits your specific problem! For simple checks, in and startswith/endswith are great. For finding positions, find is perfect. For complex pattern matching, re is the way to go.
