Identifying elements in the page | XPath functions

  In the Previous Tutorial, we learned the following –

  • An introduction to the mighty XPath for locating elements on the page.
  • Writing an XPath query that starts searching the XML from the root node by using a single forward slash ‘/’.
  • Using a double forward slash ‘//’ at the start of the XPath query so that it searches for the matching element anywhere in the XML.
  • Looking for an immediate child element by using a single forward slash ‘/’.
  • Jumping to any matching descendant (child, grandchild and so on) by using a double forward slash ‘//’.

In this tutorial, we’ll learn different XPath functions.

Writing full attribute value in XPath

Let us learn by going through another example.

  • Let us launch Google Chrome and navigate to http://www.imdb.com/
  • How about inspecting the IMDbPro image?

The following is the source code snippet that Chrome developer tools would display:

<img src="http://i.media-imdb.com/images/SF4a741137cf9a260e127fef64455ebfbc/navbar/imdbpro_logo_nb.png"
alt="IMDbPro Menu" style="background-color: transparent;">
</img>

As we can see, the element is an img tag with the following attributes: src, alt and style.

If we use the src attribute, this is what our XPath query would look like:

//img[@src='http://i.media-imdb.com/images/SF4a741137cf9a260e127fef64455ebfbc/navbar/imdbpro_logo_nb.png']

Hey, isn’t that XPath query awfully long?

Yes, and that is one of the reasons for constructing a partial XPath query. Also, since the ‘src’ attribute holds a URL, there is a good chance its value will change in later releases.

XPath Functions

Suppose we opened the login page of a website xyz.com. Its Login button’s id attribute has the value of 987submit123. We refresh the page and the id attribute’s value changes to 321submit987 and so on.

In other words, a part of the attribute’s value is static while the rest is dynamic.

In these scenarios, we would go for a partial XPath query.

contains()

Let us try to construct such an XPath query to locate the ‘IMDbPro’ object shown in the above steps. Here is the source code snippet of that element –

<img src="http://i.media-imdb.com/images/SF4a741137cf9a260e127fef64455ebfbc/navbar/imdbpro_logo_nb.png"
alt="IMDbPro Menu" style="background-color: transparent;">
</img>

We can use XPath’s contains() function so that we can pass a part of the value for the src attribute:

//img[contains(@src,'imdbpro')]

Wait… explain this XPath, man?

This XPath looks for an img element ANYWHERE (//) in the document whose src attribute contains the text imdbpro. It doesn’t care what the full value is, only that imdbpro appears somewhere in it. Here contains() is an XPath function that checks whether its first argument (the src attribute) CONTAINS its second argument. The check is case-sensitive, so 'IMDbPro' would not match this src value.
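To try contains() without opening a browser, here is a minimal sketch in Java that runs the query with the JDK's built-in XPath 1.0 engine (javax.xml.xpath) against a small hand-written stand-in for the page. The class name ContainsDemo, the helper countMatches and the second img are made up for illustration; only the IMDbPro img comes from the snippet above. Real pages are HTML rather than tidy XML, so treat this as a sanity check, not a guarantee of what Selenium will do.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContainsDemo {
    // Hypothetical, cut-down stand-in for the page. Only the first img is from the tutorial.
    static final String PAGE =
        "<div>"
        + "<img src=\"http://i.media-imdb.com/images/SF4a741137cf9a260e127fef64455ebfbc/navbar/imdbpro_logo_nb.png\""
        + " alt=\"IMDbPro Menu\" style=\"background-color: transparent;\"></img>"
        + "<img src=\"/images/other_logo.png\" alt=\"Other\"></img>"
        + "</div>";

    // Counts the nodes in PAGE that the given XPath 1.0 query matches.
    static int countMatches(String query) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                query, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + query, e);
        }
    }

    public static void main(String[] args) {
        // Partial match on src: only the IMDbPro image qualifies.
        System.out.println(countMatches("//img[contains(@src,'imdbpro')]")); // prints 1
        // contains() is case-sensitive: the src value never spells 'IMDbPro'.
        System.out.println(countMatches("//img[contains(@src,'IMDbPro')]")); // prints 0
    }
}
```

The second query is the one to remember: the alt attribute spells IMDbPro with capitals, but the src value is all lowercase, so the same word needs different casing depending on which attribute we match.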

starts-with()

Let us try to locate the same element but by using a different XPath function – starts-with():

//img[starts-with(@alt,'IMDbPro')]

This XPath query hunts for an img element ANYWHERE in the document that has an alt attribute. It identifies the element only if the value of its alt attribute STARTS WITH IMDbPro.
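When should we pick starts-with() over contains()? The sketch below, again using the JDK's XPath 1.0 engine on a hand-written stand-in page, compares the two. The class name StartsWithDemo, the helper countMatches and the button element are assumptions for illustration; the id 987submit123 is the login button id from the xyz.com scenario earlier, and the img is trimmed to its alt attribute.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class StartsWithDemo {
    // Hypothetical stand-in page: the tutorial's img (alt only) plus an assumed login button.
    static final String PAGE =
        "<div>"
        + "<img alt=\"IMDbPro Menu\"></img>"
        + "<button id=\"987submit123\">Login</button>"
        + "</div>";

    // Counts the nodes in PAGE that the given XPath 1.0 query matches.
    static int countMatches(String query) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                query, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + query, e);
        }
    }

    public static void main(String[] args) {
        // The alt value begins with 'IMDbPro', so starts-with() matches.
        System.out.println(countMatches("//img[starts-with(@alt,'IMDbPro')]"));  // prints 1
        // 'Menu' is in the value but not at the start.
        System.out.println(countMatches("//img[starts-with(@alt,'Menu')]"));     // prints 0
        // The static part of the id sits in the middle, so only contains() finds it.
        System.out.println(countMatches("//button[starts-with(@id,'submit')]")); // prints 0
        System.out.println(countMatches("//button[contains(@id,'submit')]"));    // prints 1
    }
}
```

In short, starts-with() is the stricter check: it suits values whose stable part is a prefix, while contains() is the fallback when the stable part can sit anywhere in the value.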

text()

Let me show you a way to locate an element using its displayed text.

Let us inspect the text ‘Opening This Week’ on the same IMDb page.

This is what the source code snippet should look like:

<h3 style="background-color: transparent;">
    Opening This Week
</h3>

We can use the XPath function text() to construct a query. Try this XPath:

//h3[text()='Opening This Week']
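One thing to watch with an exact text() comparison is whitespace. The snippet above shows the text on its own indented line; if the real text node carries that padding (and not just the developer tools' pretty-printing), the comparison fails. The sketch below, using the JDK's XPath 1.0 engine, shows the effect and one way around it with normalize-space(). The class name TextDemo, the helper countMatches and the padded ‘Coming Soon’ heading are assumptions invented for the demonstration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextDemo {
    // Hypothetical stand-in page: one tight heading, one padded with newlines and spaces.
    static final String PAGE =
        "<div>"
        + "<h3>Opening This Week</h3>"
        + "<h3>\n    Coming Soon\n</h3>"
        + "</div>";

    // Counts the nodes in PAGE that the given XPath 1.0 query matches.
    static int countMatches(String query) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                query, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + query, e);
        }
    }

    public static void main(String[] args) {
        // Exact comparison works when the text node has no padding.
        System.out.println(countMatches("//h3[text()='Opening This Week']"));            // prints 1
        // The padded heading does not equal 'Coming Soon' character for character.
        System.out.println(countMatches("//h3[text()='Coming Soon']"));                  // prints 0
        // normalize-space() trims the ends and collapses runs of whitespace first.
        System.out.println(countMatches("//h3[normalize-space(text())='Coming Soon']")); // prints 1
    }
}
```

If an exact text() query finds nothing on a live page even though the text looks right, padding like this is one of the first things worth checking.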

Wildcard (*)

If we do not know the tag name, or we suspect it is dynamic, we can use the wildcard character *. The above XPath can also be written like this:

//*[text()='Opening This Week']

It is saying: hey, catch ANY (*) element anywhere in the document, as long as it displays the text ‘Opening This Week’. The * means the tag name can be ANYTHING; the query does not care.
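The flip side of the wildcard is that it can match more than we intended. Here is a small sketch with the JDK's XPath 1.0 engine; the class name WildcardDemo, the helper countMatches and the extra span and h3 elements are assumptions added so that there is something to over-match.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class WildcardDemo {
    // Hypothetical stand-in page: the same text appears in an h3 and in a span.
    static final String PAGE =
        "<div>"
        + "<h3>Opening This Week</h3>"
        + "<span>Opening This Week</span>"
        + "<h3>Now Playing</h3>"
        + "</div>";

    // Counts the nodes in PAGE that the given XPath 1.0 query matches.
    static int countMatches(String query) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                query, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + query, e);
        }
    }

    public static void main(String[] args) {
        // With the tag name, only the h3 qualifies.
        System.out.println(countMatches("//h3[text()='Opening This Week']")); // prints 1
        // With the wildcard, the span carrying the same text qualifies too.
        System.out.println(countMatches("//*[text()='Opening This Week']"));  // prints 2
    }
}
```

So the wildcard buys tolerance for a changing tag name at the cost of precision; if it matches several elements, we have to narrow the query down again some other way.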

Using two XPath functions ‘contains()’ and ‘text()’ together

//h3[contains(text(),'This Week')]

This XPath query looks for an h3 element anywhere in the document whose text CONTAINS ‘This Week’.

Did you notice the missing @ before text()?

We did not use @ after [. In all previous XPaths, we used @ at this position. That is because ‘@’ goes before attributes (properties) like id, class, name, alt and src. text() is not an attribute of the element; it is part of XPath itself (strictly speaking a node test that selects the element’s text nodes, even though it is written like a function). If you look closely at the source code, you will not find text() there the way you find id, class, name etc.
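To make the @ versus text() difference concrete, the sketch below runs three queries with the JDK's XPath 1.0 engine. It also shows a subtlety of XPath 1.0: inside contains(), text() only contributes the element's first text node, so a heading with nested markup can slip through, whereas contains(., …) looks at the whole displayed text. The class name ContainsTextDemo, the helper countMatches and the second and third h3 elements are assumptions invented for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContainsTextDemo {
    // Hypothetical stand-in page: plain text, text split by a nested tag, and a title attribute.
    static final String PAGE =
        "<div>"
        + "<h3>Opening This Week</h3>"
        + "<h3>Opening <b>This</b> Week</h3>"
        + "<h3 title=\"This Week\">Coming Soon</h3>"
        + "</div>";

    // Counts the nodes in PAGE that the given XPath 1.0 query matches.
    static int countMatches(String query) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                query, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + query, e);
        }
    }

    public static void main(String[] args) {
        // text(): only the first text node is checked, so the split heading is missed.
        System.out.println(countMatches("//h3[contains(text(),'This Week')]"));  // prints 1
        // '.': the full text of the element, nested tags included.
        System.out.println(countMatches("//h3[contains(.,'This Week')]"));       // prints 2
        // '@title': an attribute, which is where the @ belongs.
        System.out.println(countMatches("//h3[contains(@title,'This Week')]"));  // prints 1
    }
}
```

The third h3 never matches the text queries even though ‘This Week’ appears in its markup, because that value lives in an attribute and not in the displayed text.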

Indexing in XPath

What if the page has five elements with identical properties and we need to locate the second one?

I explained it in the previous XPath Basics tutorial with this example –

//table/tbody/tr/td[2]/div/input[@id='gbqfq']

This is called XPath indexing. The index starts at 1, so td[2] matches the second td element inside its parent tr.
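An index counts among siblings under the same parent, not across the whole document, and that is easy to trip over. The sketch below, using the JDK's XPath 1.0 engine, compares //td[2] with (//td)[2] on a tiny two-row table. The class name IndexDemo, the helper matchedText and the table contents are assumptions invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class IndexDemo {
    // Hypothetical two-row table, invented for this sketch.
    static final String PAGE =
        "<table><tbody>"
        + "<tr><td>a</td><td>b</td></tr>"
        + "<tr><td>c</td><td>d</td></tr>"
        + "</tbody></table>";

    // Returns the text of every node in PAGE matched by the given XPath 1.0 query.
    static List<String> matchedText(String query) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                query, new InputSource(new StringReader(PAGE)), XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + query, e);
        }
    }

    public static void main(String[] args) {
        // Second td of EACH row: one hit per tr.
        System.out.println(matchedText("//td[2]"));       // prints [b, d]
        // Parentheses collect all td elements first, then pick the second overall.
        System.out.println(matchedText("(//td)[2]"));     // prints [b]
        // Index each step to pin down a single cell.
        System.out.println(matchedText("//tr[2]/td[1]")); // prints [c]
    }
}
```

When the goal is "the second match on the page", wrapping the query in parentheses before the index is usually the form we want.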

Summary

In this tutorial, we learned to construct partial XPath queries to locate objects. We also learned the following XPath functions: contains(), starts-with() and text(). We used * in place of the tag name where we were not sure about the exact tag name. We also revisited XPath indexing.

Assignments

  • Assignment#1 – Construct a partial XPath (use the function ‘starts-with()’ on the ‘class’ attribute) to locate the IMDb logo. Try highlighting it using Selenium IDE.
  • Assignment#2 – It is not actually one assignment. LOL. Go to the footer of the page and try to construct XPaths to locate ALL the links. You can use any method of your choice. Try highlighting them using Selenium IDE.

Post your magical XPaths in the comments. I would request that you not post an XPath as a working solution until you have tried it in Selenium IDE and it has highlighted the target object. Otherwise, it will confuse others.

If you have any issues constructing XPaths for the assignments or for any other objects, please feel free to ask for help. But make sure to post which combinations you have tried.

In the Next Tutorial, we will explore one more method of locating elements, i.e. the magical CSS locators.

27 thoughts on “Identifying elements in the page | XPath functions”

  1. first of all a big thanks for including assignments as suggested and that too 2 per post 🙂 🙂

    To highlight different things on http://www.imdb.com

    Over all Box of the imdb
    //*[starts-with(@class,'navbarSprite')]

    Assignment 1) answers

    imdb logo without using tag name
    //*[starts-with(@class,'navbarSprite h')]

    imdb logo using tag name
    //a[starts-with(@class,'navbarSprite h')]

    Assignment 2 )

    To search links in footer

    //a[contains(@onclick,'footer')]

    Its searching only first link, though the code is generic.

    Some Questions i have
    **********************

    Q1) Is there a way to do a case insensitive search ? like instead of 'IMDB' i can put 'imDb'

    Q2) how to search after the first occurence is found, as its able to search first everytime

    Also please let me know how you have moved from blogger to your own domain, i mean to ask where the site is hosted,, the thing(any tutorial to move) followed while moving

    Site is really nice , Thanks a lot ..

    Also if possible , give me some hint if i can improve the above answers for the assignments.

    Regards
    http://www.udzial.com
    Udzial Means Share

  2. Good work dude.

    Re Q1: Unfortunately XPath 1.0 does not provide a direct way. Version 2.0 has the lower-case() and upper-case() functions. But there is a workaround. Let us say the 'title' attribute has the value 'Home'. We can first instruct XPath to translate uppercase letters to lowercase and then apply the comparison:
    //a[translate(@title, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')='home']

    Re Q2: Do you mean locating the second, third… matching elements? If yes, use XPath indexing. Try this:
    //a[contains(@onclick,'footer')][1]
    //a[contains(@onclick,'footer')][2] etc

  3. q1) reply
    cant we give a range,, we need to type a till z , is there a way to a–Z(range kind of thing)

    q2) reply

    this worked,, but what if i don't know the number of elements in footer so is there a generic way to traverse all footer from say 1 … 100

    and yeah thanks for replying

    Regards
    Gaurav Khurana

    http://www.udzial.com
    Udzial Means Share

  4. q1) Unfortunately that is a limitation of XPath. It is better to stick to case-sensitive matching if you want to avoid the workaround.

    q2) Yes, we can do that. First we locate the footer object in which all the target links reside. Then inside that footer object we hunt for links. For example, say all links are under this locator: //div[@id='footer']

    We can go like this –
    List<WebElement> lstFooterLinks = driver.findElements(By.xpath("//div[@id='footer']//a"));

    then write a loop to iterate up to the list size, e.g. for (int i = 0; i < lstFooterLinks.size(); i++)
    then access each link with lstFooterLinks.get(i)

  5. Answer to assignment 1:
    //a[starts-with(@class,'navbarSprite h')]
    Answer to assignment 2:
    Home – //*[contains(@onclick,'footer')]
    Search – //a[2][contains(@onclick,'footer')]
    Site Index – //a[contains(@onclick,'siteindex')], //a[starts-with(@href,'/a2z')]
    //a[contains(@onclick,'intheaters')]
    //a[contains(@onclick,'comingsoon')]

  6. 1.//a[starts-with(@class,'navbarSprite')]

    2. List of links:
    //p[@class='footer'][1]
    Each link:
    //p[@class='footer'][1]/a[1]
    //p[@class='footer'][1]/a[2]
    //p[@class='footer'][1]/a[3]
    //p[@class='footer'][1]/a[4]

    //p[@class='footer'][1]/a[18]

  7. Hi Shadab,

    Great work man and writing skill is also superb..

    Now, let's come to the main point i have completed the assignment. But I was trying to implement one of your method, accessing all elements by using loop as you put in comments:

    "We can go like this –
    List<WebElement> lstFooterLinks = driver.findElements(By.xpath("//div[@id='footer']//a"));

    then write a loop to iterate up to the list size, e.g. for (int i = 0; i < lstFooterLinks.size(); i++)
    then access each link with lstFooterLinks.get(i)"

    When I tried, it ask me to import packages for List and there were many but i couldn't figure out the correct one.
    It would be great if you can put a complete sample code here for this( suppose we would like to click every link one by one) along with the packages need to import.

  8. <iframe src="http://www.facebook.com/plugins/likebox.php?href=www.facebook.com%2Fudzial&width=250&colorscheme=dark&show_faces=true&stream=false&header=false&height=238" scrolling="no" style="border:none; overflow:hidden; width:250px; height: 238px;" allowtransparency="true" frameborder="0"></iframe>

    http://udzial.com/2014/07/hello-world.html

    i tried the above thing with the below but its not working for

    //iframe[@starts-with(src,'http')]

    will it not catch iframe elements

  9. For frames you need to work differently…
    1. First, you need to switch to that frame:
    driver.switchTo().frame(driver.findElement(By.xpath("frame-locator")));

    In your case frame-locator could be //iframe[contains(@src,'likebox.php')]
    Though I didn't test it against your source code snippet… You can try other variations as well.

    2. Then perform actions on elements inside that frame:
    driver.findElement(By.xpath("element-locator")).click();

    You just gave me one more tutorial idea 🙂

  10. Assignment 1 :
    //div/span/a[starts-with(@class,'navbar')]
    or
    //a[starts-with(@class,'navbar')]

    Assignment 2:
    1.
    //span[contains(text(),'Amazon Video')]
    or
    //*[contains(text(),'Amazon Video')]

    2. //span[contains(text(),'Prime Video')]
    3.//span[contains(text(),'Amazon Germany')]
    4.//span[contains(text(),'Amazon Italy')]
    5.//span[contains(text(),'Amazon France')]
    6.//span[contains(text(),'Amazon India')]
    6.//span[contains(text(),'DPReview')]
    7.//span[contains(text(),'Audible')]

  11. following is the script for IMDB logo.

    Tried the following xpath query:
    logo = browser.find_element_by_xpath(“//a[starts-with(@title,’home’)]”) but getting no such element exception.
