Thursday, December 15, 2016

WebDriver / UI Testing: Where to Start

I’ve been asked this question enough times over the years that I thought I’d (finally!) write up a quick post to point folks in the right direction for getting started.

What are the best practices for getting started with WebDriver (or some other User Interface functional automation tool)?

So first off, forget “best practices.” There ain’t no such a thing. There are guidelines and lessons learned you need to adapt to your own situation. With that in mind, here are some of my thoughts and references you should consider as starting points for your own journey.

Whatever you do, do not jump straight into throwing code or tooling around in order to try and solve some poorly understood problem! Invest some time thinking, planning, and experimenting. You’ll be much happier with the end result. (Ask me how I know…)

Clearly Define The Problem You’re Trying to Solve

Why do you want to move to WebDriver or some other nifty UI functional test tool? Get together with your entire team (which includes support, product owners, stakeholders, etc.) and talk about the “why?” behind your thinking. Look above the “simple” technical issues to larger process and business problems. Some of those responses might include:

  • We’re seeing a huge spike in support tickets due to unclear functionality or outright defects after every major release
  • We’re missing ship dates due to the time needed for regression testing
  • Work items slip into following iterations due to the time needed for testing…
  • and the corollary: Testers get work items only a day or two before the iteration finishes
  • Regression defects get found a week or two after the root cause
  • We’re not building what I (the stakeholder/PO) wanted

There aren’t necessarily wrong answers or items for this list. Just make sure you understand the problem you’re trying to solve—and you may find different ones to solve after these discussions!

Choose a Tool That Works for Your Team and Situation

WebDriver is a wonderful tool that I love and use all the time. However, it’s one of many tools that might do the job for you. Consider the things you identified above, then have a look at all the potential solution toolsets.

Do you want a plain English/Domain Specific Language toolset that helps you clarify specifications and behaviors?

Do you want a record and playback tool that will help you get a starting point built quickly? (Dismissive of record and playback? They used to suck. They’ve grown up. A lot. Have a look and be open-minded.)

Do you want to just write thin wrappers around WebDriver and roll forward?

All of these are considerations you need to keep in mind.

Go read Elisabeth Hendrickson’s wonderful blogpost from 2011 “Selecting test automation tools.” Her post is still one of the best I’ve ever read on going through the selection process. You’ll need to update the list of actual frameworks, drivers, and tools, but still…

Plan Your Approach

Once you’ve selected your toolset, spend some time laying out your initial approach for your UI tests. Spend time working with your toolset to understand how it handles things like representing pages/views/etc. Figure out where you can tie in calls to your own custom code for doing things like test setup, invoking heuristics/oracles, environment configuration, etc.

Get clear on how the toolset works and what a test execution lifecycle looks like. For example, let’s say you’re using Cucumber and WebDriver on the Java stack. You’ll also need some form of test execution framework like TestNG, JUnit, etc.

In these cases JUnit (or TestNG) is generally the starting point for any test execution. Details vary, but JUnit likely starts a single test that calls out to the Cucumber runner. Your Cucumber step definitions make WebDriver statements that manipulate a browser while referencing your own custom page objects for details on each page/view/etc. Those step definitions may also invoke your own custom framework to create data, validate database conditions, and tear down test structures.
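If it helps to see that layering concretely, here is a framework-free sketch of the same structure in plain Java. The class and method names are hypothetical stand-ins, and the FakeDriver below simulates WebDriver so the sketch runs without a browser; in a real suite the step-definition methods would carry Cucumber annotations and drive a real WebDriver instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for WebDriver: records the actions a test performs
// so the layering can be demonstrated without a browser.
class FakeDriver {
    final List<String> actions = new ArrayList<>();
    void type(String locator, String text) { actions.add("type " + locator + "=" + text); }
    void click(String locator) { actions.add("click " + locator); }
}

// Page object layer: the ONLY place that knows this page's locators.
class LoginPage {
    static final String USERNAME = "id=username";
    static final String PASSWORD = "id=password";
    static final String SUBMIT   = "id=login";

    private final FakeDriver driver;
    LoginPage(FakeDriver driver) { this.driver = driver; }

    void logInAs(String user, String pass) {
        driver.type(USERNAME, user);
        driver.type(PASSWORD, pass);
        driver.click(SUBMIT);
    }
}

// Step-definition layer: in real Cucumber this method would carry a
// @When("...") annotation. It talks only to page objects, never locators.
class LoginSteps {
    private final FakeDriver driver = new FakeDriver();
    void whenILogInAs(String user, String pass) {
        new LoginPage(driver).logInAs(user, pass);
    }
    FakeDriver driver() { return driver; }
}

public class LifecycleSketch {
    public static void main(String[] args) {
        LoginSteps steps = new LoginSteps();
        steps.whenILogInAs("jim", "secret");
        System.out.println(steps.driver().actions);
    }
}
```

The point of the sketch is the direction of the dependencies: steps know pages, pages know the driver, and nothing above the page object ever touches a locator.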

Understanding this level of detail ahead of time is critical to getting things well-organized from the start. Of course you’ll learn things and adapt as you move forward, but at least you’ll be starting from a good point.

Understand Your Process

Successful test automation, especially functional/UI test automation, depends on a good process. Conversations around testing need to happen early, regularly, and frequently.

Begin conversations about functional testing at the release planning stage. Make sure your UI/DX designers are taking testability into account. Start discussing what workflows might look like, and which ones will be critical to automate. Lay out initial plans for test data.

Three Amigos

Three Amigos conversations have become my most-favorite tool for helping teams quickly improve their delivery behaviors. I wrote a post on it earlier this year, and had a great conversation with the folks at the Eat, Sleep, Code podcast on the topic.

Concurrent Testing, Not N+1 (or 2, or 3)

A great Three Amigos conversation is the single most important thing for getting your testing to happen in the same iteration the development work is completed. A good conversation lets the team start writing tests at the same time the feature is being developed. Mature teams do this as second nature: a quick discussion of the workflow, a quick scribble of field locators, and off everyone goes. It is literally that easy once you get past the very small learning curve.

Clarify Your Coverage

How many UI tests should we have?

As few as possible, and then ten percent less than that.

Seriously, many newer teams rely far, far too much on UI testing. Push as much testing as close to the code as possible. UI tests should not be checking multiple scenarios of algorithms. UI tests should not be checking every parameter for input validation. Those sorts of tests should be handled at the unit test level, either on server side code via something like JUnit, or in the browser’s JavaScript via something like Jasmine.

Focus your UI tests on high-value, critical flows: Can I create a new sales order? Does a failed timecard properly flag for audit if the number of hours worked are over the business’s max hour limit?

Those are the sorts of high-value scenarios we want validated at the UI level. Handle the other situations at more appropriate levels of unit or integration testing.
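As an example of pushing validation down the stack, a rule like "order quantity must be between 1 and 999" belongs in a plain unit test, not a browser test. A hypothetical sketch (the rule and names are invented for illustration):

```java
// Hypothetical input-validation rule for a sales order form:
// quantity must be between 1 and 999 inclusive.
class OrderValidation {
    static final int MIN_QTY = 1;
    static final int MAX_QTY = 999;

    static boolean isValidQuantity(int qty) {
        return qty >= MIN_QTY && qty <= MAX_QTY;
    }
}

public class OrderValidationDemo {
    public static void main(String[] args) {
        // Boundary cases exercised directly -- no browser, no WebDriver.
        System.out.println(OrderValidation.isValidQuantity(0));    // false
        System.out.println(OrderValidation.isValidQuantity(1));    // true
        System.out.println(OrderValidation.isValidQuantity(999));  // true
        System.out.println(OrderValidation.isValidQuantity(1000)); // false
    }
}
```

Four boundary checks run in milliseconds here; driving the same four cases through a browser form would take orders of magnitude longer and break every time the form's markup changes.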

Pair Your Developers and Testers

Want the best chance of success for your UI functional tests? Of course you do.

Then burn down silos between testers and developers. Get them pairing together on actual feature work. Have the entire team own the UI automation suite. The benefits from this approach are numerous:

  • Tests rarely have any coverage overlap with other test types because everyone knows what’s getting tested where
  • UI test suites are built and maintained with software craftsmanship principles in mind. Duplication, complexity, dependencies—all are mitigated when folks with different applicable skills are writing the tests.
  • Your system itself becomes more testable as the team designs and builds system code with things like good locators, asynch support, and configurability

Learn the Common Issues of UI Testing

So now you’re eight hours and 23,942 words into this blog post and I’m finally getting to some specifics around WebDriver. There’s a point to that. If you start way down here and ignore everything above this, well, frankly you’re screwed. Work on all the critical things above first, then when it comes time to handle a few specific things you’ll be in much better shape.

Locators, Locators, Locators

WebDriver (and other UI tools) need a way to find and interact with elements on a page or view. Some tools use different terminology, but the idea’s the same: Find a thing on the page’s DOM (or mobile device’s view, or whatever), then inspect/inject/interact with it somehow.

Here’s a list of things to consider when working with locators:

  • Store locator definitions in a Page Object class or similar abstraction where they’re defined once in the entire codebase. Avoid duplication. All other tests/steps/whatever refer to those abstractions.
  • Prefer using “id” attributes when working with HTML. IDs are fast to work with and are unique on valid HTML pages.
  • Beware of controls that auto-generate IDs. They may be tied to that element’s location, and may not be constant.
  • Look for ways to customize IDs for auto-generated widgets/controls/whatever.
  • Consider CSS selector (jQuery-style) locators next when IDs aren’t appropriate.
  • Avoid XPath locators except when they make sense. When you do use XPath, start your expressions as close to the element as possible, never from the root of the document.
  • Consider using text content instead of attribute-based locators when dealing with dynamic content, e.g.
    FindElement(By.XPath("//table[@id='myDataTable']//tr[contains(.,'Some unique text')]"))
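To make the first bullet concrete, here is a minimal, framework-free sketch of locators centralized in one page-object-style class. The names are hypothetical, and plain strings stand in for Selenium's By objects so the example runs anywhere:

```java
// Hypothetical sketch: every locator for this page is defined exactly
// once, in one class. In a real Selenium suite these would be
// By.id(...) / By.xpath(...) fields on a page object.
final class OrdersPageLocators {
    static final String SAVE_BUTTON   = "id=save";
    static final String DATA_TABLE_ID = "myDataTable";

    // Text-content locator for a dynamic row: the XPath starts at the
    // table, close to the element, not at the document root.
    static String rowContaining(String uniqueText) {
        return "//table[@id='" + DATA_TABLE_ID + "']//tr[contains(.,'" + uniqueText + "')]";
    }

    private OrdersPageLocators() {}
}

public class LocatorDemo {
    public static void main(String[] args) {
        System.out.println(OrdersPageLocators.rowContaining("Some unique text"));
    }
}
```

When the markup changes, this is the one file that changes with it; every test that refers to `rowContaining(...)` keeps working untouched.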


Handling Asynch Situations

Brittle, intermittently failing tests due to asynchronous behavior are often what cause teams to lose faith in their test suites—or abandon them altogether.

Learning to deal with asynch situations is critical. The first, best step is ensuring a really good Three Amigos conversation as the work item is being discussed. The group needs to explicitly discuss any potential async situations before the tests or system code are written. This ensures the best chance for a stable, reliable test.

Sam: “You’re working this feature to display search results as a user keys in criteria, right?”
Reena: “Yes. The search results change as each character is typed. I know some word wheel searches require two characters to narrow; this one will be on every character.”
Sam: “OK, so that is a callback from each typed character. Does the async action (the callback) complete when the result list updates with the narrowed list, or is there some other action/event?”
Reena: “The list updating signals the callback is complete.”
Sam: “Simple! And this user story is only on narrowing the results, correct? Selecting a result is a different story?”
Reena: “Correct. Do you need any custom IDs for your locators, or will the text content of the list be good enough?”
Sam: “The list will be fine as is, but can you please label the input field with an ID of ‘search’ ?”

And so it goes. Other things to consider to mitigate as much asynch pain as possible (you’ll never eliminate it entirely):

  • WebDriverWait is your best friend for async situations. Learn how it works, and learn the different ExpectedConditions. (Learn its equivalent for other toolsets.)
  • Determine the exact condition you need in order to move forward with your test, then set a wait on that condition, e.g. (wait here being a WebDriverWait instance):
    • A button must be enabled: wait.until( ExpectedConditions.elementToBeClickable( someLocator ) )
    • An element contains text: wait.until( ExpectedConditions.textToBePresentInElementLocated( someLocator, "some text" ) )
    • An element is present on the DOM: wait.until( ExpectedConditions.presenceOfElementLocated( someLocator ) )
    • and so on
  • Avoid using Thread.Sleep() or any variant of it except when troubleshooting. Use of Thread.Sleep() in production tests should be limited to the most exceptional of situations, and should never be allowed without discussion with other team members. I’ve had suites of over 15,000 tests with less than five Thread.Sleep() statements. That’s your benchmark.
  • Situations with multiple asynch calls do not need complex logic. Simply list out each condition separately. The timing will work out no matter what. Honest.
    wait.until( ExpectedConditions.textToBePresentInElementLocated( someLocator, "some text" ) )
    wait.until( ExpectedConditions.textToBePresentInElementLocated( someOtherLocator, "some other text" ) )
    wait.until( ExpectedConditions.elementToBeClickable( someElementLocator ) )
  • When working with complex async situations, consider building flags that appear on the DOM when an action is complete. The flow might look something like this:
    • Click submit button
    • wait.until( ExpectedConditions.presenceOfElementLocated( By.id("update_completed") ) )
    • Complex events begin on server side and in browser
    • Complex events complete
    • Page updates with a hidden <div id='update_completed'>
    • Wait condition triggers, test moves on

You can see an example of that flag in action on line 23 of this KendoGrid example in my GitHub repo.
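Incidentally, there is no magic in WebDriverWait: it simply polls a condition until the condition is true or a timeout expires. Here is a framework-free sketch of that polling loop applied to the flag idea above, with a background thread standing in for the async server/browser work (all names are hypothetical):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Minimal polling wait -- the same idea WebDriverWait implements:
// re-check a condition every pollMillis until it's true or time runs out.
class SimpleWait {
    static boolean until(BooleanSupplier condition, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) return true;
            Thread.sleep(pollMillis);
        }
        return condition.getAsBoolean();
    }
}

public class WaitDemo {
    public static void main(String[] args) throws InterruptedException {
        // Simulate the hidden update_completed flag appearing on the DOM
        // after some async server-side and in-browser work finishes.
        AtomicBoolean flagOnDom = new AtomicBoolean(false);
        new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) {}
            flagOnDom.set(true); // the <div id='update_completed'> shows up
        }).start();

        boolean appeared = SimpleWait.until(flagOnDom::get, 2000, 50);
        System.out.println("flag appeared: " + appeared);
    }
}
```

This is also why explicit waits beat Thread.Sleep(): the test moves on the moment the condition is met, instead of always paying the worst-case delay.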

Go Practice

Talk, experiment, fail, cuss, learn.

One of the best places to go practice the simple parts of UI automation is the awesome site set up by some of WebDriver’s community leaders.


Unknown said...

Great post, Jim. Thanks for taking the time to put it together. You helped clarify some things I'm running into now.

Jordan said...

Hi Jim,

Thanks for the great article. One question if you have a moment though:

Do you think the Page Object Model falls down on larger products? I realize that keeping UI testing to a minimum is important, but the cost of standing up POM classes carries some significant overhead. This is especially true in larger, more complex applications.

Are there any patterns more suited to these complex types of apps or is the overall belief that if you've hit the point where POM classes are out of control you have too many UI automation tests?


Jim Holmes said...


Thanks for your comment!

POM is the way to go, period. Large systems are complex with lots of screens. Even a disciplined team focusing on high-value tests will still end up with lots of PO classes. I regularly end up with many more POs than actual screens/views/pages because I'll decompose complex pages into subclasses--think of an order entry page. That may have sections for customer info, shipping, billing, shopping cart, etc. Those are all distinct domain objects, so they merit their own PO classes.

I'd love to hear more detail about the overhead you're running in to. Feel free to drop me a mail at and we can discuss in more detail.
