Tuesday, May 25, 2010

Exploring Selenium 2: Introduction & Lessons Learned in Selenium 1

We use Selenium (version 1.x) at work as a regression / functional test tool. We’ve gone through a number of iterations in the year and a half since I first started pimping Selenium, and we’ve learned some valuable lessons along the way. Now we’re in the process of migrating to Selenium 2/WebDriver and I’ve been really pleased with the results so far.

I thought I’d write a series of short articles showing some of the things we’re learning about Selenium 2 / WebDriver. To kick off the series I’ll lay out an overview of the journey we’ve taken so far, and some rationale for why I knowingly chose a less-than-optimal solution at the start.

First, a quick overview of the bits and pieces of Selenium v1, the toolset we started out with:

  • Selenium IDE (SIDE): A Firefox plug-in that lets you record browser actions (navigation, clicks, input) and play them back. You can write assertions to check for things like text, elements, etc. Scripts can be saved out as HTML or exported to Java, Ruby, C#, and a few others. You can write your own templates for outputting scripts.
  • Selenium Remote Control (Selenium RC): A Java-based server app that launches, manages, and kills browser sessions. You start up RC, then run tests written in one of the various supported languages (lots of client libraries around), and RC does the rest. RC lets you run your tests against different browser platforms. If your scripts are simple, you can run one test suite against Firefox, IE, and Chrome without re-writing. Theoretically…
  • Selenium Core: This is the basic framework for both SIDE and RC. When you spin up RC you’re getting Core as part of the environment.
  • Selenium Grid: Manages splitting off test suites to run your tests in parallel, thereby massively speeding up your tests. (I haven’t used Grid.)
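
For readers who haven't seen one, a SIDE script saved as HTML is essentially a table in which each row holds a command, a target, and a value. The sketch below shows a made-up example of that shape and reads it back with Python's standard HTML parser. The page path, locators, and credentials are assumptions for illustration, and `parse_selenese` is a hypothetical helper, not part of Selenium.

```python
from html.parser import HTMLParser

# Hypothetical Selenese test: each table row is command / target / value.
# The page path, locators, and credentials are made up for illustration.
SELENESE_HTML = """
<table>
  <tr><td>open</td><td>/login</td><td></td></tr>
  <tr><td>type</td><td>id=username</td><td>testuser</td></tr>
  <tr><td>type</td><td>id=password</td><td>secret</td></tr>
  <tr><td>clickAndWait</td><td>id=submit</td><td></td></tr>
  <tr><td>verifyTextPresent</td><td>Welcome</td><td></td></tr>
</table>
"""


class SeleneseParser(HTMLParser):
    """Collects (command, target, value) triples from a Selenese table."""

    def __init__(self):
        super().__init__()
        self.steps = []        # finished three-cell rows
        self._row = None       # cells of the row being read, or None
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._row.append("")
            self._in_cell = True

    def handle_data(self, data):
        # Only text inside a cell matters; whitespace between tags is ignored.
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if len(self._row) == 3:
                self.steps.append(tuple(cell.strip() for cell in self._row))
            self._row = None


def parse_selenese(html):
    """Return the list of (command, target, value) steps found in `html`."""
    parser = SeleneseParser()
    parser.feed(html)
    return parser.steps


if __name__ == "__main__":
    for command, target, value in parse_selenese(SELENESE_HTML):
        print(command, target, value)
```

Because every step is a literal row like this, a shared sequence such as logging in has to be repeated in each file that needs it.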

Initially we started out using Selenium IDE, the record and playback tool, for writing our tests. Some caveats should be noted about our test goals: we were building out an automated test suite from the ground up and were focusing on very fundamental, simple functionality tests. Think CRUD operations, navigation confirmation, etc. No fancy UI manipulation, no dealing with Ajax or JavaScript. With this in mind, we could use SIDE, save off the scripts as HTML files, and feed those to RC. This approach had some pros and cons.


Pros:

  • At the time, our Program Managers were our primary testers. SIDE’s record/playback made it easy to get some tests written. (Some tests > no tests.)
  • HTML-based scripts made it easy for folks writing tests to alter the scripts as needed. That mattered because SIDE’s recorder didn’t deal well with dynamic content or a number of other things, and we had to clean all that up manually.
  • RC takes HTML scripts as an input, so we could automate whatever tests we were able to come up with.
  • HTML files are easily checked into our source control system. Treat your tests like production code, because they are production code. Get your test files, regardless of format, into your source control. Go on, I’ll wait here while you do it.


Cons:

  • Record and playback tools are generally failures for anything above a trivial one-off script. The tools create fragile, wonky scripts which require a lot of tweaking to get right and more tweaking to maintain. (Yes, yes, yes, you may have had glorious success with tools from Rational or Mercury. 99% of the rest of the testing world can’t afford the hundreds of thousands of dollars for those tools.)
  • SIDE and Selenium don’t do well with relative links in their HTML files. This made it impossible to centralize common functions like logging in to and out of the system, creating users, etc. This is perhaps the biggest strike – if ANY link changes anywhere, you’ve got to fix up every occurrence of it. Massive violation of DRY, massive fail.
  • Managing Selenium RC’s server app during a build cycle can get tedious.
  • RC turned out to be even more finicky about timing when running HTML scripts. Any web UI script is already very sensitive to timing issues around page loading and rendering; HTML scripts were even more so.
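
One common way to tame the kind of timing sensitivity described in the last bullet is to poll for a condition with a timeout instead of sleeping for a fixed interval. The following is a minimal sketch of that idea and is not tied to Selenium. `wait_until` and its parameters are names invented for illustration, and `element_is_present` is a stand-in for a real "is the element on the page yet?" check.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.25):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value; raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a page check: it only succeeds on the third attempt.
attempts = {"count": 0}


def element_is_present():
    attempts["count"] += 1
    return attempts["count"] >= 3


if __name__ == "__main__":
    wait_until(element_is_present, timeout=2.0, interval=0.01)
    print("condition met after", attempts["count"], "checks")
```

A test written this way waits only as long as the page actually needs, and fails with a clear timeout rather than a confusing missing-element error.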

I’d known about some of the SIDE limitations going into the effort; however, I needed something that would be a quick win to show how functional tests could help the entire team. We had some great success with those initial tests, but I had always wanted to push to “real” Selenium tests via Ruby or .NET. I knew we’d have much better results, and it was also a way to get me back into writing code on a regular basis. (Cue evil mastermind’s laugh.)

Making the case for moving our Selenium tests to code was greatly simplified because I’d already shown the value proposition of test automation. We’d caught a number of regression bugs simply through the automated tests chugging away and finding things like “Whoops. The logon link isn’t on the main page anymore…” Sometimes you need to start really small and get a few victories on the board before pressing on to the better solution.

With all this in mind, I started writing Selenium tests using the .NET libraries and running those via RC. I started a migration path converting the old HTML tests to C#-based tests and passing those off to RC for execution. More pros and cons:


Pros:

  • I get to write real code. I’ve been in a Practice Lead, Program Manager, or Quality Lead role for several years now and haven’t been able to crack open Visual Studio more than once every few months. Writing code makes me very happy, so quite frankly I’m not ashamed to list this as a pro.
  • Using a “real” language to write tests instead of HTML scripts lets us start to build up a common library for frequently used calls. I can centralize things like log in / log out, content creation, etc. (I’m not centralizing tests for those steps, mind you, just the actions which might be prerequisites for other tests.) Building up infrastructure/an API for your test environment is critical to your success – speed, maintainability, readability, the benefits just go on and on.
  • I can now use a language’s tools like iterators, branches, etc. to start writing much more powerful tests and support functions.
  • Did I mention I get to write real code again?
  • Tests in native code run through RC are much, much more stable and performant. We saw a significant reduction in false failures and tests ran faster.


Cons:

  • Your tests are now in real code. If you were hoping to leverage non-coders to help write your tests then you’ve cut off that resource. Now you’re looking at finding time from your dev team to write your tests, or you’re looking at finding QA resources who can write code well enough to write your tests. (I found a great one of the latter. No, you can’t have her.)
  • Selenium RC is still in the picture. You have to spin it up, and you have to deal with false failures injected into your tests because RC sits in the middle as a broker.
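
The common library of frequently used actions mentioned in the pros above might be shaped roughly like this. It is a sketch under assumptions, not our actual code: the URLs, element ids, and the `SiteActions` class are hypothetical, and the driver calls (`get`, `find_element`, `send_keys`, `click`) follow the current Selenium Python bindings, whereas our real tests use the .NET libraries. The fake driver exists only so the example runs without a browser.

```python
class SiteActions:
    """Prerequisite actions shared by many tests (hypothetical site)."""

    def __init__(self, driver, base_url):
        self.driver = driver
        self.base_url = base_url.rstrip("/")

    def log_in(self, username, password):
        # If the login page changes, this is the only place to fix.
        d = self.driver
        d.get(self.base_url + "/login")
        d.find_element("id", "username").send_keys(username)
        d.find_element("id", "password").send_keys(password)
        d.find_element("id", "submit").click()

    def log_out(self):
        self.driver.get(self.base_url + "/logout")


class FakeElement:
    """Records interactions instead of touching a real page."""

    def __init__(self, log, locator):
        self._log = log
        self._locator = locator

    def send_keys(self, text):
        self._log.append(("send_keys", self._locator, text))

    def click(self):
        self._log.append(("click", self._locator))


class FakeDriver:
    """Stand-in for a browser driver; keeps a log of every call."""

    def __init__(self):
        self.log = []

    def get(self, url):
        self.log.append(("get", url))

    def find_element(self, by, value):
        return FakeElement(self.log, f"{by}={value}")


if __name__ == "__main__":
    driver = FakeDriver()
    site = SiteActions(driver, "http://example.test")
    site.log_in("testuser", "secret")
    site.log_out()
    for entry in driver.log:
        print(entry)
```

Individual tests then call `site.log_in(...)` as a setup step and keep their own assertions, so the actions are centralized without centralizing the tests themselves.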

At this point I’d been fortunate enough to get a couple more full-time, dedicated QA resources on board. This gave me the flexibility to start looking at other tools. Based on a couple of comments from Jeremy Miller and Adam Goucher I decided to look into Selenium 2.

Selenium 2 merges Selenium with WebDriver, a nifty pairing that lets you drop the separate Selenium RC server and manipulate the browser directly. Fewer moving parts is a huge win in my book. The API is a lot cleaner, much more readable, and solves a number of technical issues around timing. The tooling is officially in “Alpha” status; however, it’s quite full-featured already and quite stable. We’ve been having great success with it so far.

I’ll follow this introductory post with several others walking through our move from Selenium 1 to Selenium 2/WebDriver.


Mike Kvintus said...

Great post! I can't wait to read your follow-up posts.

Unknown said...

This is a really helpful post - a great combo of big picture, rationale, and practical detail.
I'm looking forward to the next post(s) on experience with Selenium 2.

Unknown said...

Hey, I thought this post looked very promising, but a quick search hasn't turned up follow-up posts. Any chance of an update? I'm just considering embarking on the same migration from Selenium to Selenium 2 with .NET tests and I'd love to hear about any gotchas.

Anonymous said...

With Selenium 1 you could do "real" programming with custom (JavaScript) commands in the user-extensions.js. As you alluded to, right now we have testers create and maintain the Selenium 1 tests because they are simple HTML format. Selenium 2 isn't progress if HTML format tests and JavaScript custom commands aren't available. Is that the case?

Frank Cohen said...

Good post. Have you considered using TestMaker Object Designer: http://www.pushtotest.com/designer

