Selenium Extract href by ID Using Python and Java

When performing web scraping or test automation using Selenium, a common requirement is to extract the href value of an anchor (<a>) element with a dynamic or patterned id like id="p1234".

In this guide, you’ll learn how to extract href values from anchor elements by targeting IDs that match a pattern, such as p123, p45678, etc., using Python and Java with Selenium WebDriver.

🔍 Problem Statement

You are trying to extract the href value from anchor tags like:

<a href="/profile/user-profile.html" id="p12345">User Profile</a>
<a href="/profile/another-profile.html" id="p67890">Another</a>

Where the id follows the format p followed by digits.

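You might reach for find_element_by_id() (deprecated in Selenium 4 and removed in 4.3 in favor of find_element(By.ID, ...)), but an ID lookup needs the exact value, and here the numeric part is dynamic and not known in advance.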


🐍 Solution in Python Selenium

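You can solve this with XPath's starts-with() function, which does a partial match on the id. The XPath 1.0 engine that browsers expose has no regular-expression support, so any stricter check has to happen in Python afterwards.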

✅ Step-by-Step Python Code

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("file:///path/to/your/file.html") # or your target URL

# Use XPath to find all <a> elements whose id starts with 'p' and that have an href
elements = driver.find_elements(By.XPATH, "//a[starts-with(@id, 'p') and @href]")

for element in elements:
    element_id = element.get_attribute("id")
    href = element.get_attribute("href")
    print(f"ID: {element_id}, HREF: {href}")

driver.quit()

🔎 Output (for a page served from https://example.com; Selenium returns href as an absolute URL):

ID: p12345, HREF: https://example.com/profile/user-profile.html
ID: p67890, HREF: https://example.com/profile/another-profile.html
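
The href values in the sample HTML are relative (/profile/...), while the output shows absolute URLs. The browser resolves each href against the page's address before Selenium hands it back. The following is a minimal sketch of that resolution using only the standard library. The base URL and the helper name resolve_href are assumptions chosen to match the sample output, not part of Selenium.

```python
from urllib.parse import urljoin

# Assumed page address, picked to match the sample output above.
BASE_URL = "https://example.com/index.html"


def resolve_href(raw_href, base_url=BASE_URL):
    """Resolve an href as written in the HTML against the page URL."""
    return urljoin(base_url, raw_href)


# The relative href from the sample anchor becomes an absolute URL.
print(resolve_href("/profile/user-profile.html"))
# prints https://example.com/profile/user-profile.html
```

If you load the page from a local file instead, expect the resolved value to begin with file:// rather than https://example.com.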

✅ Bonus Tip: Because starts-with(@id, 'p') also matches ids such as "profile", you can use Python's re module to keep only the ids that are p followed by digits.
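
As a rough sketch of that regex filter, the snippet below keeps only ids made of p followed by one or more digits. It works on plain (id, href) pairs so it can run without a browser. The function name filter_by_id_pattern and the sample data are illustrative assumptions. In a real script you would build the pairs from element.get_attribute("id") and element.get_attribute("href").

```python
import re

# 'p' followed by one or more ASCII digits, and nothing else.
ID_PATTERN = re.compile(r"p[0-9]+")


def filter_by_id_pattern(pairs):
    """Keep (id, href) pairs whose id fully matches ID_PATTERN."""
    return [
        (element_id, href)
        for element_id, href in pairs
        if element_id and ID_PATTERN.fullmatch(element_id)
    ]


# Stand-in for values collected from Selenium elements (hypothetical data).
collected = [
    ("p12345", "https://example.com/profile/user-profile.html"),
    ("profile", "https://example.com/profile/"),
    ("p67890", "https://example.com/profile/another-profile.html"),
    ("p12x", "https://example.com/other.html"),
]

for element_id, href in filter_by_id_pattern(collected):
    print(f"ID: {element_id}, HREF: {href}")
```

fullmatch() is used rather than match() so that an id like p12x, which only begins with the pattern, is rejected.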


☕ Java Solution Using Selenium WebDriver

Java also allows XPath filtering with Selenium using the same logic.

✅ Java Code

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

import java.util.List;

public class ExtractHrefById {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        driver.get("file:///path/to/your/file.html"); // or your target URL

        // Find elements with id starting with 'p' and get href
        List<WebElement> elements = driver.findElements(By.xpath("//a[starts-with(@id, 'p') and @href]"));

        for (WebElement el : elements) {
            String id = el.getAttribute("id");
            String href = el.getAttribute("href");
            System.out.println("ID: " + id + ", HREF: " + href);
        }

        driver.quit();
    }
}

📌 XPath Breakdown

//a[starts-with(@id, 'p') and @href]
  • //a – Selects all anchor tags.
  • starts-with(@id, 'p') – Filters where id starts with "p".
  • @href – Ensures the element has an href attribute.

If your id pattern is stricter (p followed only by digits), XPath 2.0's matches() function would allow a full regex. Browsers implement only XPath 1.0, though, so in Selenium use starts-with() and then filter the results in code.
