Wednesday, November 27, 2013

Automation with Selenium WebDriver and Selenium Grid for multiple browser drivers

What is Selenium Grid?

Selenium Grid is a part of the Selenium Suite that specializes in running multiple tests across different browsers, operating systems, and machines in parallel.

With the release of Selenium 2.0, the Selenium Server now has built-in Grid functionality.

The selenium-server-standalone package includes the Hub, WebDriver, and legacy RC needed to run the grid. Ant is no longer required!

Selenium Grid uses a hub-node concept: test cases are loaded on a single machine called the hub, but execution is done by different machines called nodes. In other words, the nodes hold the browser drivers, and the hub passes your test cases to each node for execution.

Why use Selenium Grid?

  • Run your tests against different browsers, operating systems, and machines all at the same time. This ensures that the application you are testing is fully compatible with a wide range of browser-OS combinations.
  • Save time executing your test suites. If you set up Selenium Grid to run, say, 4 tests at a time, you can finish the whole suite around 4 times faster.
What is a Hub and Node?

The Hub
  • The hub is the central point where you load your tests into.
  • There should only be one hub in a grid.
  • The hub is launched only on a single machine, say, a computer whose OS is Windows XP/7/Vista/8 and whose browser is IE.
  • The machine containing the hub is where the tests will be run, but you will see the browser being automated on the node.
The Nodes
  • Nodes are the Selenium instances that will execute the tests that you loaded on the hub.
  • There can be one or more nodes in a grid.
  • Nodes can be launched on multiple machines with different platforms and browsers.
  • The machines running the nodes need not be the same platform as that of the hub.

example :

HUB :   Machine H
NODE1 : Machine IE (which has the IE driver)
NODE2 : Machine CHROME (which has the Chrome driver)
NODE3 : Machine FIREFOX (which has the Firefox driver)

How to configure the Selenium server for remote/virtual-machine WebDrivers?

Quick Start:

1. Download the respective WebDriver binaries and keep them in a proper location.
2. Download the Selenium server/client jar files.
3. Run the Selenium server jar as the hub, as shown below:

java -jar selenium-server-standalone-2.37.0.jar -role hub

This hub acts as the central access point for all remote WebDriver sessions, and it monitors all the nodes.


4. Run the Selenium server jar as a node, which will provide the WebDrivers.

Change localhost to the hub's IP address if the hub is running on a different box/VM.

java -jar selenium-server-standalone-2.37.0.jar -role node -hub http://localhost:4444/grid/register -browser browserName="internet explorer"

5. Running tests from the grid

Selenium selenium = new DefaultSelenium("localhost", 4444, "*firefox", "");

We should use RemoteWebDriver with a DesiredCapabilities object to define which browser, version, and platform we wish to use.

DesiredCapabilities capability = DesiredCapabilities.firefox();

Create the RemoteWebDriver object:

WebDriver driver = new RemoteWebDriver(new URL("http://localhost:4444/wd/hub"), capability);

A node matches only if it meets all the requested capabilities. To request specific capabilities on the grid, specify them before passing the capabilities into the RemoteWebDriver constructor.



6. Now that both the hub and node are running, we implement the code that obtains the driver from the hub.
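Putting steps 1-6 together, a minimal sketch of such a test looks like this. It assumes the Selenium Java client jars are on the classpath and a hub is listening on localhost:4444; http://www.google.com stands in for whatever page your test actually drives:

```java
import java.net.URL;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;

public class GridTest {
    public static void main(String[] args) throws Exception {
        // Ask the hub for a Firefox session; the hub forwards it to
        // whichever registered node matches these capabilities.
        DesiredCapabilities capability = DesiredCapabilities.firefox();
        WebDriver driver = new RemoteWebDriver(
                new URL("http://localhost:4444/wd/hub"), capability);
        try {
            driver.get("http://www.google.com");
            System.out.println(driver.getTitle());
        } finally {
            driver.quit(); // release the browser back to the grid
        }
    }
}
```

Swap DesiredCapabilities.firefox() for internetExplorer() or chrome() to route the same test to a different node.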

A full Selenium JUnit sample is available on my GitHub.


  • Selenium Grid is used to run multiple tests simultaneously in different browsers and platforms.
  • Grid uses the hub-node concept.
  • The hub is the central point wherein you load your tests.
  • Nodes are the Selenium instances that will execute the tests that you loaded on the hub.
  • There are two ways to verify that the hub is running: through the command prompt, or through a browser.
  • To run test scripts on the Grid, you should use the DesiredCapabilities and the RemoteWebDriver objects.
  • DesiredCapabilities is used to set the type of browser and OS that we will automate.
  • RemoteWebDriver is used to specify which node (or machine) our test will run against.

Friday, September 27, 2013


ClientAbortException

Generally, you can just ignore it. This exception is thrown when the client abruptly aborts the HTTP request while the page is still loading, or amid continuous requests/clicks. It occurs when the client pressed Esc, hastily navigated away, closed the browser, lost the network, or even caught fire. All of this is totally out of your control.

Catch the exception to suppress the error in the server log (ClientAbortException here is Tomcat's org.apache.catalina.connector.ClientAbortException, and response is your HttpServletResponse):

try {
    response.getOutputStream().flush();
} catch (ClientAbortException e) {
    // client gave up; nothing useful to do, so swallow it
}

Friday, August 23, 2013

Refactor package change on whole project (JAVA) or Change package/import of whole project

Sometimes in application development we come across a situation like changing the package structure of a project. As all classes in the project carry the old package and import statement declarations, we have to change them manually or by using some editor.
Eclipse has no option to change the package/import statements dynamically on a structure change, so we have to write some code or a script to do the conversion.
I have written an Ant target which will do the job ;)

old package : xxx.yyy

new package :

After running the Ant target below, the package/import statements of all Java classes are changed to the new structure. Ant Script :
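The original script was lost from this post; the target below is a sketch of what the refactor can look like using Ant's built-in replaceregexp task. Here aaa.bbb is a placeholder for the new package (left blank above) and src for your source root:

```xml
<!-- Sketch: rewrite package/import declarations in every .java file. -->
<!-- "aaa.bbb" is a placeholder; substitute your real new package.    -->
<target name="change-package">
    <replaceregexp flags="g" byline="true">
        <regexp pattern="xxx\.yyy"/>
        <substitution expression="aaa.bbb"/>
        <fileset dir="src" includes="**/*.java"/>
    </replaceregexp>
</target>
```

After running it, move the source files into the matching new directory structure (Ant's move task with a regexpmapper can help there too).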

Thursday, July 18, 2013

RESTful Web Service

What Are RESTful Web Services?

Representational State Transfer (REST) is an architectural style that specifies constraints, such as the uniform interface, that if applied to a web service induce desirable properties, such as performance, scalability, and modifiability, that enable services to work best on the Web.
In the REST architectural style, data and functionality are considered resources and are accessed using Uniform Resource Identifiers (URIs), typically links on the Web. The resources are acted upon by using a set of simple, well-defined operations. The REST architectural style constrains an architecture to a client/server architecture and is designed to use a stateless communication protocol, typically HTTP. In the REST architecture style, clients and servers exchange representations of resources by using a standardized interface and protocol.

Principles of RESTful Web Services

Resource identification through URI: A RESTful web service exposes a set of resources that identify the targets of the interaction with its clients. Resources are identified by URIs, which provide a global addressing space for resource and service discovery.

Uniform interface: Resources are manipulated using a fixed set of four create, read, update, delete operations: PUT, GET, POST, and DELETE. PUT creates a new resource, which can be then deleted by using DELETE. GET retrieves the current state of a resource in some representation. POST transfers a new state onto a resource.

Self-descriptive messages: Resources are decoupled from their representation so that their content can be accessed in a variety of formats, such as HTML, XML, plain text, PDF, JPEG, JSON, and others.

Stateful interactions through hyperlinks: Every interaction with a resource is stateless; that is, request messages are self-contained. Stateful interactions are based on the concept of explicit state transfer. Several techniques exist to exchange state, such as URI rewriting, cookies, and hidden form fields. State can be embedded in response messages to point to valid future states of the interaction.
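The uniform interface above maps directly onto HTTP methods. As a minimal sketch using Java's built-in HttpClient API (the orders URI is a made-up example resource), each verb becomes a request builder call:

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

public class UniformInterface {
    // Build (but do not send) one request per REST operation,
    // targeting a hypothetical resource identified by its URI.
    static HttpRequest get(String uri) {          // read current state
        return HttpRequest.newBuilder(URI.create(uri)).GET().build();
    }

    static HttpRequest put(String uri, String body) {   // create/replace
        return HttpRequest.newBuilder(URI.create(uri))
                .PUT(BodyPublishers.ofString(body)).build();
    }

    static HttpRequest post(String uri, String body) {  // transfer new state
        return HttpRequest.newBuilder(URI.create(uri))
                .POST(BodyPublishers.ofString(body)).build();
    }

    static HttpRequest delete(String uri) {       // remove the resource
        return HttpRequest.newBuilder(URI.create(uri)).DELETE().build();
    }
}
```

Because every request names its target URI and carries its full state, each message is self-contained, which is exactly the statelessness constraint described above.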

Spring-ws provider implementation sample code

Spring + RESTful Jersey Sample

Monday, June 24, 2013

Hadoop ?

What is Hadoop?

Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation. Hadoop makes it possible to run applications on systems with thousands of nodes involving thousands of terabytes. Its distributed file system facilitates rapid data transfer rates among nodes and allows the system to continue operating uninterrupted in case of a node failure. This approach lowers the risk of catastrophic system failure, even if a significant number of nodes become inoperative.

Hadoop was inspired by Google's MapReduce, a software framework in which an application is broken down into numerous small parts. Any of these parts (also called fragments or blocks) can be run on any node in the cluster. Doug Cutting, Hadoop's creator, named the framework after his child's stuffed toy elephant. The current Apache Hadoop ecosystem consists of the Hadoop kernel, MapReduce, the Hadoop distributed file system (HDFS) and a number of related projects such as Apache Hive, HBase and Zookeeper. The Hadoop framework is used by major players including Google, Yahoo and IBM, largely for applications involving search engines and advertising. The preferred operating systems are Windows and Linux but Hadoop can also work with BSD and OS X.
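To make the "broken down into numerous small parts" idea concrete, here is a tiny sketch of the MapReduce pattern in plain Java, without the Hadoop API: each input fragment is mapped to (word, 1) pairs, and the counts are then reduced per word. In real Hadoop the map and reduce steps run on different nodes; here everything runs in one JVM purely for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class WordCountSketch {
    // "Map" step: split each fragment into words emitting a count of 1;
    // "reduce" step: merge the partial counts per word into a total.
    public static Map<String, Integer> wordCount(String[] fragments) {
        Map<String, Integer> counts = new HashMap<>();
        for (String fragment : fragments) {   // each fragment could live on a different node
            for (String word : fragment.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum); // reduce: combine counts
                }
            }
        }
        return counts;
    }
}
```

Hadoop's value is running exactly this shape of computation when the fragments are terabyte-scale blocks spread across thousands of machines.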

Why Hadoop? What is BigData?

Big data is a general term used to describe the voluminous amount of unstructured and semi-structured data a company creates: data that would take too much time and cost too much money to load into a relational database for analysis. (Big data doesn't refer to any specific quantity; the term is often used when speaking about petabytes and exabytes of data.)

A primary goal of looking at big data is to discover repeatable business patterns. It's generally accepted that unstructured data, most of it located in text files, accounts for at least 80% of an organization's data. If left unmanaged, the sheer volume of unstructured data generated each year within an enterprise can be costly in terms of storage. Unmanaged data can also pose a liability if information cannot be located in the event of a compliance audit or lawsuit.

Big data analytics is often associated with cloud computing because the analysis of large data sets in real time requires a framework like MapReduce to distribute the work among tens, hundreds, or even thousands of computers.