How to scale Selenium?
Introduction
Selenium/Webdriver is a powerful test automation framework for testing web applications. It is the de facto standard for web application test automation.
Selenium/Webdriver is great because of:
- Real world experience: Selenium/Webdriver launches and drives a real full GUI browser to simulate the user behavior.
- Multiple Browser/OS support: Selenium/Webdriver supports almost all popular browsers and operating systems.
- Free: Selenium/Webdriver is free.
- Technical support: The Selenium/Webdriver community is very active. Selenium is supported by Sauce Labs. ChromeDriver is supported by Google.
However, using Selenium/Webdriver is not easy, because of:
- Flaky tests: Selenium/Webdriver may not generate 100% accurate results. Flaky tests happen from time to time.
- Scalability: The Selenium/Webdriver architecture does not scale well to a big cluster, such as 10+ nodes running 1000+ tests at the same time. I do not have data to back this up, so please do not quote me. In my personal experience, though, it is already pretty painful to use about 4 nodes and run 30+ test suites at the same time.
Next, I am going to talk about the challenges of dealing with scalability and flaky tests.
Challenges
Based on my experience, the major challenges come from two places:
- Many moving parts
- An architecture that does not scale
If you investigate the full lifecycle of a Selenium call, you may find the following path:
Selenium Client -> Selenium Hub -> Selenium Node -> Webdriver (ChromeDriver/SafariDriver/IEdriver) -> Plugin/Extension in Browser -> Browser
A Selenium command is a JSON object. It is sent from the client to the Selenium Hub over HTTP. The hub then relays this request to a Selenium Node. The Selenium Node relays the request to the Webdriver. The Webdriver interprets the request and converts it into a few calls to the plugin/extension in an open browser. The plugin/extension in the browser executes the calls to do the real work, like extracting text from a DOM node or changing the text of a DOM node.
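To make the "JSON object over HTTP" idea concrete, here is a minimal sketch of what one command might look like when the client builds it. The hub address and session id are made-up placeholders, and the request is only built, not sent. The "url" command shown here is just one example of the many commands a test sends.

```python
import json
from urllib.request import Request

# Hypothetical hub address; replace with your own grid.
HUB_URL = "http://selenium-hub.example:4444/wd/hub"


def build_command(session_id, command, payload):
    """Build (but do not send) the HTTP request for one Selenium command."""
    url = "{}/session/{}/{}".format(HUB_URL, session_id, command)
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# One "open this page" command; "abc123" is a placeholder session id.
request = build_command("abc123", "url", {"url": "http://example.com/"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Every click, lookup and keystroke in a test becomes a request like this one, and each request travels through the hub and the node before it reaches the browser.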
The Selenium protocol is quite verbose. For each test case, there could be hundreds of Selenium commands sent to the Selenium Grid. You will see this when you watch the console of a Selenium Node.
The Selenium Hub is a bottleneck. There can be only one Selenium Hub in a Selenium Grid. When it is overwhelmed, the whole Selenium Grid can be impacted.
Running web application tests is also resource intensive. Each browser session may take quite a lot of CPU time and memory when rendering web pages and running client-side JavaScript code. When you run multiple browser sessions at the same time, the Selenium Node may become busy. When the Selenium Node is even a little busy (it does not need to be at 90% or 100%; about 70% CPU utilization is enough), random failures can start to happen.
Many moving parts, plus many commands, plus a bottleneck, plus resource-intensive web application tests. Do you still expect every test case to run accurately every time? I do not believe so.
So, what can we do? Obviously, changing the architecture is impossible for users. What we can do is to properly design the Selenium Grid and tune different parts to achieve the best result we can get.
Good practices
1. Set up Selenium Grids
Using Selenium Grid is the recommended way to scale Selenium. However, as I mentioned, the Selenium Grid itself may become the bottleneck and cause a lot of problems. Therefore, I do not believe that it is feasible to manage a Selenium Grid with 10+ Selenium Nodes. If you do have a use case for a big Selenium Grid, it may be better to set up a few smaller Grids.
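If you go with several smaller Grids, the test runner needs some way to decide which suite goes to which hub. A minimal sketch, assuming a plain round-robin split is good enough (the hub addresses and suite names below are hypothetical):

```python
from itertools import cycle


def assign_suites(suites, hub_urls):
    """Spread test suites across several small grids in round-robin order."""
    assignment = {hub: [] for hub in hub_urls}
    for suite, hub in zip(suites, cycle(hub_urls)):
        assignment[hub].append(suite)
    return assignment


# Hypothetical hub addresses and suite names.
hubs = [
    "http://grid-a.example:4444/wd/hub",
    "http://grid-b.example:4444/wd/hub",
]
plan = assign_suites(["login", "search", "checkout", "profile", "admin"], hubs)
for hub, suites in plan.items():
    print(hub, suites)
```

Round-robin ignores how long each suite takes. If your suites differ a lot in run time, you may want to balance by duration instead.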
2. Tune the Selenium parameters
You need to add the -Xms and -Xmx Java options to increase the Java heap size. The default heap size (64M on older client JVMs; newer JVMs derive it from the physical memory) can be too small. When there are many concurrent tests running, the JVM may do quite a few GCs, which will slow down or screw up the tests. A good practice is to use VisualVM to monitor the heap size and GC activity while tests are running. That way, you will know what heap size is appropriate. Do not forget to add -Xms and -Xmx for the Selenium Hub as well.
You also need to plan the total number of sessions and the number of sessions for each browser type. By doing this, you limit the number of browser sessions a Selenium Node can run at the same time. As a result, you can keep a Selenium Node from becoming super busy. To find the right numbers, you have to run multiple tests and observe when the system resource utilization reaches the threshold (for example 70%).
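As a rough sketch, the start command for a hub or a node could be assembled like this. The jar file name and the heap sizes are placeholders, not recommendations, and the -role flag is the one I know from the standalone server jar; please verify it against the Selenium version you run. A real node would also need to be told where the hub is, which is left out here.

```python
def grid_command(role, heap_min="512m", heap_max="2g",
                 jar="selenium-server-standalone.jar"):
    """Return the argv list for starting a hub or node with an explicit heap."""
    if role not in ("hub", "node"):
        raise ValueError("role must be 'hub' or 'node'")
    return [
        "java",
        "-Xms" + heap_min,  # initial heap size
        "-Xmx" + heap_max,  # maximum heap size
        "-jar", jar,        # placeholder jar name
        "-role", role,
    ]


print(" ".join(grid_command("hub")))
print(" ".join(grid_command("node", heap_max="4g")))
```

Pick the actual values from what VisualVM shows under load rather than from this example.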
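The planning itself is simple arithmetic once you have measured what one browser session costs. A minimal sketch, assuming you have measured the average CPU share and memory of a single session on your node (the numbers in the example are made up):

```python
def max_sessions(cpu_per_session, mem_per_session_mb, total_mem_mb,
                 threshold=0.70):
    """Estimate how many browser sessions keep a node under the threshold.

    cpu_per_session is the fraction of the node's total CPU that one
    session uses (0.0 to 1.0), as measured by you. The function only
    does the arithmetic; it always allows at least one session.
    """
    by_cpu = int(threshold / cpu_per_session)
    by_mem = int(threshold * total_mem_mb / mem_per_session_mb)
    return max(1, min(by_cpu, by_mem))


# Made-up measurements: each session uses about 12% CPU and 600 MB
# on a node with 8 GB of memory.
print(max_sessions(cpu_per_session=0.12, mem_per_session_mb=600,
                   total_mem_mb=8192))
```

With these example numbers the CPU is the limiting factor, so the result is 5 sessions. Treat the output as a starting point and confirm it with real test runs.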
3. Tune the host of the Selenium Node
It is often the case that we focus only on Selenium itself and forget about the host. If the host has 1 CPU and 4G of memory, it is no surprise that the tests are slow and error-prone. Again, web application tests can be resource intensive. Make sure you have the proper hardware and keep an eye on the resource utilization.
Meanwhile, you also need to tune the OS level parameters. I found this because of my recent troubleshooting experience.
Because Selenium Protocol is "verbose", the ephemeral ports can be used up very quickly. The problem was mentioned by the Selenium/Webdriver team.
https://code.google.com/p/selenium/wiki/ScalingWebDriver
A good practice is to decrease the TIME_WAIT timeout so that ports in the TIME_WAIT state can be reused sooner.
For Linux, edit the file sysctl.conf by adding
net.ipv4.tcp_fin_timeout = 30
For Windows, please refer to this page: https://msdn.microsoft.com/en-us/library/aa560610(v=bts.20).aspx
For more background on the TIME_WAIT state on Linux, please refer to this page: http://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux.html
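Before and after changing any OS setting, it helps to check how many sockets are actually sitting in TIME_WAIT on the Selenium Node. A minimal sketch for Linux, assuming the usual /proc/net/tcp layout where the fourth column is the socket state and 06 means TIME_WAIT (this file covers IPv4 sockets only):

```python
TIME_WAIT = "06"  # state code used in /proc/net/tcp on Linux


def count_time_wait(proc_net_tcp_text):
    """Count sockets in TIME_WAIT from the text of /proc/net/tcp."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3 and fields[3] == TIME_WAIT:
            count += 1
    return count


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            print("TIME_WAIT sockets:", count_time_wait(f.read()))
    except OSError:
        print("/proc/net/tcp is not available on this host")
```

Running this a few times while the tests are executing gives a rough picture of how quickly ports are being consumed.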
4. Update the Selenium jar and Webdriver binary files
Newer releases usually contain better features and more bug fixes. There could be performance fixes or changes that improve the functionality/performance for a specific browser. Meanwhile, older Selenium/Webdriver code may NOT be compatible with the latest browser versions. So, check the SeleniumHQ download page from time to time. When your browsers have a major release (for example, Chrome 44 to Chrome 45), you should check whether a new Selenium jar or webdriver binary has been released.
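One way to avoid forgetting this check is to let a script notice when a browser's major version has moved. A minimal sketch, assuming you record the browser version string from the last known-good run somewhere (the version strings below are only examples):

```python
import re


def major_version(version_string):
    """Extract the major version number from a string like 'Chrome 45.0.2454.85'."""
    match = re.search(r"(\d+)\.", version_string)
    if match is None:
        raise ValueError("no version number in %r" % version_string)
    return int(match.group(1))


def needs_driver_check(last_tested, current):
    """True when the browser's major version moved since the last tested run."""
    return major_version(current) != major_version(last_tested)


print(needs_driver_check("Chrome 44.0.2403.157", "Chrome 45.0.2454.85"))
```

The script only tells you that it is time to look at the download page. Whether a new jar or driver binary is really needed still has to be checked by hand.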
5. Retry!
All of the above good practices may help to improve the test stability from X% to 90%. However, they still cannot ensure that the tests are 100% stable. The best strategy here is to retry the failed test cases.
It is easier to do this if you are using Cucumber. Cucumber can save the failed test cases into a rerun file so that you can rerun them later. For other test frameworks, it is not too difficult either. As long as your script goes through the test results of the first run, you should be able to pick out the failed ones and rerun them.
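For frameworks without a built-in rerun feature, the retry loop can be sketched in a few lines. This is only an illustration: run_test stands for whatever your script uses to execute one test and report pass or fail, and the "flaky" function below just simulates a test that fails on its first attempt.

```python
def run_with_retries(test_names, run_test, max_attempts=3):
    """Run tests, then rerun only the failures, for up to max_attempts rounds.

    run_test is any callable that takes a test name and returns True on pass.
    Returns the list of tests that still fail after the last round.
    """
    pending = list(test_names)
    for _ in range(max_attempts):
        pending = [name for name in pending if not run_test(name)]
        if not pending:
            break
    return pending


# Simulated runner: "checkout" fails on its first attempt only.
attempts = {}


def flaky(name):
    attempts[name] = attempts.get(name, 0) + 1
    return name != "checkout" or attempts[name] >= 2


print(run_with_retries(["login", "checkout"], flaky))
```

A test that is still in the returned list after all rounds is worth investigating as a real failure rather than as flakiness.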
Summary
Selenium/Webdriver is a powerful web application test framework and is super popular. A Selenium Grid is often set up to support multiple concurrent test sessions. That way, the tests can finish quicker and can be spread across different Browser/OS combinations. However, it is not easy to properly scale a Selenium Grid. There are multiple moving parts and some architectural concerns involved. You may be able to apply a few of the good practices mentioned in this document to make your Selenium Grid more reliable and scalable. Although 100% test reliability cannot be achieved, we can reach a good enough level, hopefully 90%.
Last but not least, Selenium/Webdriver is great. Enjoy it.