Technology Investigation Outline - Servlets
The purpose of this document is to explore the capabilities of servlets in the context of developing web-based applications.
The document provides a broad overview of standard servlet technology, including:
A list of related online resources is provided at the end of the document.
What is a Servlet?
A Servlet is a Java component that can be plugged into a Java-enabled web server to provide custom services.
Servlets can be used to implement a variety of protocols, such as File Transfer Protocol (FTP) and HyperText Transfer Protocol (HTTP). This technical investigation focuses on the implementation of HTTP servlets, which are the most commonly used variety.
Servlets work within a request/response model. This model consists of a client sending a request message to a server (for example, to fetch a particular web page) and the server responding by returning a reply message (for example the content of the requested web page).
Servlets are Java classes based on Sun's Servlet API. They are not self-contained programs, but rather must be loaded into and executed by a servlet engine. This servlet engine is usually part of a traditional web server (such as Apache or Netscape FastTrack).
Servlets can be used in the implementation of an n-tier architecture. For example, in a three-tier system the client browser provides the user interface, a servlet-enabled web server performs the application processing, and a database acts as the back-end.
A traditional approach to allow the execution of web applications is Common Gateway Interface (CGI). CGI works by allowing clients to invoke programs or scripts on the server, with the results being sent back via web pages. Servlets differ from CGI applications in two main ways. Firstly, traditional CGI applications are self-contained programs, with a new instance spawned every time a request is made. As a result there is a considerable overhead associated with each invocation. Secondly, CGIs are specific to the HTTP protocol.
The advantages of servlets over CGI applications are summarised in "Java servlets could save the day". Compared to CGI applications, servlets are:
Servlets should also be distinguished from applets, which are also a Java-based technology. Servlets are executed on the server, with only the results being passed back to the client. Applets are downloaded to and executed on the client. Servlets thus avoid the problems associated with incompatibilities in Java implementations on different client platforms. Performance should also improve, since typically less data needs to be transferred over the network.
How Servlets Work
As mentioned in the previous section, servlets operate within the request/response paradigm. Below is a diagram providing an overview of servlet operation.
Derived from: Beyond CGI: Developing Java Servlets, Sun courseware
HTTP servlets are subclasses of the standard Java extension class javax.servlet.http.HttpServlet. This class implements the basic framework for servlets, specified in the javax.servlet.Servlet interface. This interface comprises three main methods: init(), service() and destroy().
The init() method is invoked after a servlet is loaded for the first time. Any initialisation activities can be performed at this time. An example activity is the establishment of a connection to a database. The ServletConfig object, which is passed into the init() method, can be used to store configuration details, which are then retrievable using the servlet's getServletConfig() method.
The service() method is the main part of the servlet. It is invoked whenever a client makes a request to the specific servlet. The service() method reads the request, passed in as a ServletRequest object, and produces the response message, represented as a ServletResponse object. Since there is only a single instance of a servlet, and multiple threads may call that instance's service() method at the same time, the method needs to be implemented in a thread-safe manner.
Once a servlet has performed its tasks, it can be disposed of properly by invoking the destroy() method. This method performs any steps necessary to clean up after the servlet, such as closing database connections and freeing up any other resources that the servlet used. Note that this method must also be thread-safe.
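The thread-safety requirement can be illustrated without a servlet engine. The sketch below is a simplified model and does not use the servlet API: HitCounter and its service() method are hypothetical stand-ins for a single servlet instance, and simulate() stands in for an engine that calls that instance from several request threads at once. Shared state is held in an AtomicInteger so that concurrent calls do not lose updates.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a servlet: one shared instance, many caller threads.
class HitCounter {
    // Instance state is shared by every thread that calls service(),
    // so it uses an atomic type rather than a plain int.
    private final AtomicInteger hits = new AtomicInteger();

    // Plays the role of service(): called once per simulated request.
    int service() {
        return hits.incrementAndGet();
    }

    int total() {
        return hits.get();
    }
}

class HitCounterDemo {
    // Simulates an engine dispatching requests to one instance from several threads.
    static int simulate(int threads, int requestsPerThread) {
        HitCounter servlet = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < requestsPerThread; j++) {
                    servlet.service();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return servlet.total();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 1000)); // prints 8000
    }
}
```

If the counter were a plain int incremented with ++, two threads could read the same value before either wrote it back, and the total could come out lower than the number of requests.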
Servlets support both the GET and POST methods of form submission. The programmer simply overrides the appropriate method, doGet() or doPost(), to define how the request is to be processed.
A servlet is invoked via the mapping of a URL to a specific class. The servlet engine parses the incoming request to determine the desired servlet class, and passes it any parameters encoded in the URL. If necessary the servlet will be loaded, and the init() method invoked. The service() method of the loaded servlet is then invoked, and any response is passed back to the client which triggered the original request.
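The invocation sequence described above can be sketched as a small model. None of the types below belong to the servlet API: MiniServlet, MiniEngine and GreetingServlet are hypothetical names that only mimic the shape of the lifecycle (URL mapping, loading on first use, init(), service() dispatching to doGet() or doPost(), and destroy()). A real engine also handles URL decoding, threading and error reporting, which are omitted here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Simplified model only: mimics the lifecycle, but is NOT javax.servlet.
interface MiniServlet {
    void init();
    String service(String method, Map<String, String> params);
    void destroy();
}

class GreetingServlet implements MiniServlet {
    int initCalls = 0;
    boolean destroyed = false;

    public void init() { initCalls++; }

    // Dispatches on the request method, as HttpServlet does for doGet()/doPost().
    public String service(String method, Map<String, String> params) {
        if ("GET".equals(method)) return doGet(params);
        if ("POST".equals(method)) return doPost(params);
        return "405 Method Not Allowed";
    }

    String doGet(Map<String, String> params) {
        return "Hello, " + params.getOrDefault("name", "world");
    }

    String doPost(Map<String, String> params) {
        return "Posted " + params.size() + " parameter(s)";
    }

    public void destroy() { destroyed = true; }
}

class MiniEngine {
    private final Map<String, Supplier<MiniServlet>> mappings = new HashMap<>();
    private final Map<String, MiniServlet> loaded = new HashMap<>();

    // Associates a URL path with a way of creating the servlet.
    void map(String path, Supplier<MiniServlet> factory) {
        mappings.put(path, factory);
    }

    String handle(String method, String url) {
        int q = url.indexOf('?');
        String path = q < 0 ? url : url.substring(0, q);
        Map<String, String> params = parseQuery(q < 0 ? "" : url.substring(q + 1));
        Supplier<MiniServlet> factory = mappings.get(path);
        if (factory == null) return "404 Not Found";
        MiniServlet servlet = loaded.get(path);
        if (servlet == null) {          // first request: load, then init() once
            servlet = factory.get();
            servlet.init();
            loaded.put(path, servlet);
        }
        return servlet.service(method, params);
    }

    // Calls destroy() on every loaded servlet, as an engine does on shutdown.
    void shutdown() {
        for (MiniServlet servlet : loaded.values()) servlet.destroy();
        loaded.clear();
    }

    // Splits "a=1&b=2" into pairs; no URL decoding is attempted in this sketch.
    private static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new HashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) params.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return params;
    }
}

class MiniEngineDemo {
    public static void main(String[] args) {
        MiniEngine engine = new MiniEngine();
        engine.map("/greet", GreetingServlet::new);
        System.out.println(engine.handle("GET", "/greet?name=Ann"));  // prints Hello, Ann
        System.out.println(engine.handle("POST", "/greet?a=1&b=2")); // prints Posted 2 parameter(s)
        engine.shutdown();
    }
}
```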
Developing a Form Using Servlets
To demonstrate how to use servlets to develop a form, the following example servlet will be discussed. It's a rather simple servlet that prompts the user to enter a string, which is then encoded into a format that is URL-friendly.
There are two parts to the servlet:
It is not necessary that forms be handled this way. In fact for complex forms this may not be sensible. However it will simplify things for the purposes of this document.
When the servlet is invoked via a GET request, for example when the user clicks on a hyperlink or types in the servlet's URL, the doGet() method of the servlet will be called, thereby displaying the form (lines 18 to 42). When this form has been submitted by the user, a POST request will be sent to the web server, causing the doPost() method of the servlet to be executed, thereby processing the form (lines 43 to 77).
A line-by-line description will now be provided.
Lines 1 to 10 contain some comments regarding the program.
Lines 11 to 15 contain statements to import the required packages:
Line 16 declares that the URLEncodingServlet class is a subclass of javax.servlet.http.HttpServlet.
Lines 18 to 42 override the doGet() method of the HttpServlet class:
Lines 43 to 77 override the doPost() method of the HttpServlet class:
Lines 78 to 81 override the getServletInfo() method of the HttpServlet class, to provide descriptive information about the servlet.
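The two pieces of work described in this walkthrough can be sketched without a servlet engine. The code below is not the URLEncodingServlet listing; it is a minimal illustration under stated assumptions. The class name UrlEncodingSketch, the form field name "text" and the choice of UTF-8 are all hypothetical. formHtml() shows the kind of markup a doGet() implementation might write, and encode() shows the conversion a doPost() implementation might apply to the submitted string, using the standard java.net.URLEncoder class.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class; it only illustrates the two steps of the form.
class UrlEncodingSketch {
    // Markup a doGet() implementation might write: a form that posts back
    // to the given URL. The field name "text" is an assumption.
    static String formHtml(String actionUrl) {
        return "<html><body>"
             + "<form method=\"POST\" action=\"" + actionUrl + "\">"
             + "<input type=\"text\" name=\"text\">"
             + "<input type=\"submit\" value=\"Encode\">"
             + "</form></body></html>";
    }

    // Work a doPost() implementation might do with the submitted value.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("Hello World & more")); // prints Hello+World+%26+more
    }
}
```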
Servlets and State
HTTP is a stateless protocol. That is, it consists of isolated transactions with no way of keeping track of requests and responses. This has the advantage that a server can go off-line for a brief period of time without the browser client even noticing that it was down.
HTTP servlets are by default stateless. This means that there is no way of keeping track of workflow beyond a single web page. Only the information presented within that page can be used to process the request.
To overcome this problem, servlets can be made "stateful" - that is, be made to remember state. The servlet API provides a couple of mechanisms to preserve state:
A session provides the appearance of a continuous connection from the same client over a set period of time. The HttpSession interface provides the necessary functionality. Once a session is created, it can be used to maintain a set of key/value pairs for the duration of the session. There is a default value for the duration of a session; however, it is possible to modify this value.
[API: javax.servlet.http.HttpSession interface]
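To make the idea of a session concrete, the following is a simplified model of what a servlet engine does on the server side. It is not the HttpSession API: SessionStore, Session and their methods are hypothetical names. The model keeps key/value pairs per session ID and discards a session once it has been idle for longer than a configurable interval. The current time is passed in explicitly so that the behaviour is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Simplified model of server-side sessions; NOT the HttpSession API.
class SessionStore {
    // One session: key/value pairs plus the time it was last used.
    static class Session {
        final Map<String, Object> values = new HashMap<>();
        long lastAccessedMillis;
    }

    private final Map<String, Session> sessions = new HashMap<>();
    private final long maxInactiveMillis;

    SessionStore(long maxInactiveMillis) {
        this.maxInactiveMillis = maxInactiveMillis;
    }

    // Creates a session and returns the ID the client must send back.
    synchronized String create(long nowMillis) {
        String id = UUID.randomUUID().toString();
        Session session = new Session();
        session.lastAccessedMillis = nowMillis;
        sessions.put(id, session);
        return id;
    }

    // Returns the live session for an ID, or null if unknown or expired.
    synchronized Session find(String id, long nowMillis) {
        Session session = sessions.get(id);
        if (session == null) return null;
        if (nowMillis - session.lastAccessedMillis > maxInactiveMillis) {
            sessions.remove(id);   // idle for too long: forget its state
            return null;
        }
        session.lastAccessedMillis = nowMillis;  // each use extends the session
        return session;
    }
}
```

In this model a session created with a 1000 ms limit and touched at 500 ms is still available at 1200 ms, but is gone if the next request arrives at 2300 ms.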
Cookies provide a way of maintaining state information across multiple browser sessions. An example where this is useful would be for maintaining user preferences.
[API: javax.servlet.http.Cookie class]
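The cookie round trip can be sketched at the level of HTTP headers. The code below is a simplified model and does not use javax.servlet.http.Cookie: CookieSketch and its methods are hypothetical, and quoting, expiry dates, paths and domains are ignored. It shows the server asking the browser to store a value, and the server later reading back the values the browser returns.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of the cookie round trip; NOT the servlet Cookie class.
class CookieSketch {
    // Builds the response header that asks the browser to store a value
    // for the given number of seconds.
    static String setCookieHeader(String name, String value, int maxAgeSeconds) {
        return "Set-Cookie: " + name + "=" + value + "; Max-Age=" + maxAgeSeconds;
    }

    // Parses the value of the Cookie request header that a browser sends
    // back, for example "theme=dark; lang=en".
    static Map<String, String> parseCookieHeader(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : headerValue.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                cookies.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return cookies;
    }

    public static void main(String[] args) {
        System.out.println(setCookieHeader("theme", "dark", 3600));
        // prints Set-Cookie: theme=dark; Max-Age=3600
        System.out.println(parseCookieHeader("theme=dark; lang=en").get("lang"));
        // prints en
    }
}
```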
The following table provides a brief comparison between cookies and sessions:
Source: Beyond CGI: Developing Java Servlets, Sun courseware
An important consideration regarding the usefulness of cookies is that users often disable them in their browsers. Sessions, which are generated on the server, are therefore more generally applicable.
Developing Object Frameworks based on Servlets
Part of a servlet's function is to generate a response to a request. This is typically done via the generation of Hypertext Markup Language (HTML) data.
A servlet has access to a ServletOutputStream or PrintWriter which provides the medium for returning data. The standard means by which "pure" servlets generate HTML is by invoking a series of println() methods on the output stream. This can be very awkward and error-prone, especially when dealing with complex HTML markup. Even simple examples like the one provided earlier can be cumbersome to code.
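The println() style described above looks roughly like the following sketch. To keep it runnable without a servlet engine, the PrintWriter here writes into an in-memory StringWriter rather than a servlet response; the class name PrintlnPageSketch and its methods are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class PrintlnPageSketch {
    // Writes a small page one println() at a time, in the way a servlet
    // would on the PrintWriter obtained from its response object.
    static void writePage(PrintWriter out, String title, String message) {
        out.println("<html>");
        out.println("<head><title>" + title + "</title></head>");
        out.println("<body>");
        out.println("<h1>" + title + "</h1>");
        out.println("<p>" + message + "</p>");
        out.println("</body>");
        out.println("</html>");
    }

    // Captures the output in memory so the sketch runs without an engine.
    static String render(String title, String message) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        writePage(out, title, message);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render("Result", "Encoding complete"));
    }
}
```

Even for this small page the markup is scattered across string literals, so a missing closing tag or quote is not caught until the page is viewed.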
To overcome this, it is possible to use an object framework (set of classes) which abstracts over HTML. For example, it is possible to create a HtmlPage class to represent the HTML page, which contains a Head object and a Body object. In turn these objects can contain other elements as defined by the HTML standard. An example of such a framework is the Element Construction Set (ECS) developed by the Java Apache Project. The PIRSA Toolkit also provides a framework for HTML abstraction.
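A very small version of such a framework might look like the sketch below. These classes are hypothetical and are not the API of ECS or the PIRSA Toolkit; they only illustrate the idea of an HtmlPage containing a Head and a Body, each of which renders its own tags. Escaping of special characters is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical element class: a tag name plus nested content.
class HtmlElement {
    private final String tag;
    private final List<Object> children = new ArrayList<>();

    HtmlElement(String tag) {
        this.tag = tag;
    }

    // Adds text or another element; returns this element to allow chaining.
    HtmlElement add(Object child) {
        children.add(child);
        return this;
    }

    // Renders the opening tag, the children in order, then the closing tag.
    // No HTML escaping is performed in this sketch.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("<" + tag + ">");
        for (Object child : children) sb.append(child);
        return sb.append("</").append(tag).append(">").toString();
    }
}

class Head extends HtmlElement {
    Head(String title) {
        super("head");
        add(new HtmlElement("title").add(title));
    }
}

class Body extends HtmlElement {
    Body() {
        super("body");
    }
}

class HtmlPage extends HtmlElement {
    HtmlPage(Head head, Body body) {
        super("html");
        add(head);
        add(body);
    }
}

class HtmlPageDemo {
    public static void main(String[] args) {
        Body body = new Body();
        body.add(new HtmlElement("p").add("there"));
        System.out.println(new HtmlPage(new Head("Hi"), body));
        // prints <html><head><title>Hi</title></head><body><p>there</p></body></html>
    }
}
```

Because every element closes its own tag, unbalanced markup cannot be produced, which is the main practical gain over hand-written println() calls.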
However, such frameworks fail to address a more fundamental issue, namely the separation of presentation logic from application logic. Because the presentation of information is tightly bound within the servlet, any change to layout requires editing the Java code, recompiling it, restarting the server, then finally retesting the servlet. Another point worth considering is that web pages are typically designed using powerful tools to provide a desired look-and-feel. The HTML markup then needs to be translated into a series of Java statements using the object framework. This can be automated, but adds extra steps which slow down the development process and can contribute to errors.
There are ways to achieve a separation between presentation and application logic while still taking advantage of servlet technology. Two such approaches are Java Server Pages (JSP) and servlet template engines (for example WebMacro). An additional benefit is that they remove the requirement to translate HTML to Java statements. They also facilitate the division of labour between graphic designers and application developers. However, such technologies are beyond the scope of this technical investigation.
Other Relevant Issues
Servlets support three types of authentication, namely:
Source: Beyond CGI: Developing Java Servlets, Sun courseware
With respect to basic authentication, web servers can be set up to resolve a protected URL to a servlet. This servlet can check the Authorization header set by the browser to determine whether the user is already authenticated (using the getHeader() method from HttpServletRequest). If not, the servlet sends back a 401 error page with the appropriate headers set to prompt the user for a username/password combination. It is also possible to have the servlet initiate the authentication challenge by directly setting the WWW-Authenticate header (using setHeader() from HttpServletResponse) and generating a 401 error by setting the status (using setStatus()).
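The decision logic behind the basic authentication check can be sketched independently of the servlet API. In the model below, BasicAuthSketch, Result and the realm name "example" are hypothetical, and the expected username and password are passed in directly, whereas a real application would consult a user store. Inside a servlet, the header value would come from getHeader("Authorization"), and the result would be applied with setHeader() and setStatus().

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Simplified model of the basic authentication check; NOT servlet API code.
class BasicAuthSketch {
    // Outcome of inspecting a request: the status code to send, plus the
    // WWW-Authenticate header value when a challenge is needed.
    static class Result {
        final int status;
        final String wwwAuthenticate; // null when no challenge is needed

        Result(int status, String wwwAuthenticate) {
            this.status = status;
            this.wwwAuthenticate = wwwAuthenticate;
        }
    }

    // authorizationHeader is the raw Authorization header value,
    // or null if the browser did not send one.
    static Result check(String authorizationHeader, String expectedUser, String expectedPassword) {
        String challenge = "Basic realm=\"example\""; // realm name is an assumption
        if (authorizationHeader == null || !authorizationHeader.startsWith("Basic ")) {
            return new Result(401, challenge);
        }
        String decoded;
        try {
            byte[] raw = Base64.getDecoder().decode(authorizationHeader.substring(6).trim());
            decoded = new String(raw, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return new Result(401, challenge); // credentials were not valid Base64
        }
        int colon = decoded.indexOf(':');
        if (colon < 0) {
            return new Result(401, challenge); // expected "username:password"
        }
        String user = decoded.substring(0, colon);
        String password = decoded.substring(colon + 1);
        boolean ok = user.equals(expectedUser) && password.equals(expectedPassword);
        return ok ? new Result(200, null) : new Result(401, challenge);
    }
}
```

The credentials in a basic authentication header are only Base64-encoded, not encrypted, so this scheme protects the password only when the connection itself is secured.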