
Technology Investigation Outline - Servlets

Introduction

The purpose of this document is to explore the capabilities of servlets in the context of developing web-based applications.

The document provides a broad overview of standard servlet technology, including:

  1. Broad overview of servlet operation
  2. Development steps for a form
  3. Stateful vs Stateless servlets
  4. Consideration of object frameworks based on servlets
  5. Other issues

A list of related online resources is provided at the end of the document.

 

What is a Servlet?

A Servlet is a Java component that can be plugged into a Java-enabled web server to provide custom services.
Source: Fundamentals of Java Servlets

Servlets can be used to implement a variety of protocols, such as file transfer protocol (FTP) and HyperText Transfer Protocol (HTTP). This technical investigation will focus on the implementation of HTTP servlets, which are the most commonly used variety.

Servlets work within a request/response model. This model consists of a client sending a request message to a server (for example, to fetch a particular web page) and the server responding by returning a reply message (for example the content of the requested web page).

Servlets are Java classes based on Sun's Servlet API. They are not self-contained programs, but rather must be loaded into and executed by a servlet engine. This servlet engine is usually part of a traditional web server, such as Apache or Netscape FastTrack.

Servlets can be used in the implementation of an n-tier architecture. For example, in a three-tier system the client browser provides the user interface, a servlet-enabled web server performs the application processing, and a database forms the back-end.

A traditional approach to allow the execution of web applications is Common Gateway Interface (CGI). CGI works by allowing clients to invoke programs or scripts on the server, with the results being sent back via web pages. Servlets differ from CGI applications in two main ways. Firstly, traditional CGI applications are self-contained programs, with a new instance spawned every time a request is made. As a result there is a considerable overhead associated with each invocation. Secondly, CGIs are specific to the HTTP protocol.

The advantages of servlets over CGI applications are summarised in "Java servlets could save the day". Compared to CGI applications, servlets are:

  • far more efficient because they're only loaded once instead of each time they're executed;
  • much safer because Java's superb memory management makes the dreaded CGI "memory leak" a thing of the past;
  • cross-platform because Java really does run anywhere when it's on the server;
  • arguably easier to write, at least once you've mastered Java, because of the superior tools, the Sun API, and the greater readability of Java code when compared to the more esoteric Perl and Tcl, for example.

Servlets should also be distinguished from applets, which are also a Java-based technology. Servlets are executed on the server with only the results being passed back to the client. Applets are downloaded to and executed on the client. Servlets thus avoid the problems associated with incompatibilities in Java implementations on different client platforms. Also, performance should be improved since typically less data needs to be transferred over the network.

 

How Servlets Work

As mentioned in the previous section, servlets operate within the request/response paradigm. Below is a diagram providing an overview of servlet operation.

Derived from: Beyond CGI: Developing Java Servlets, Sun courseware

HTTP servlets are subclasses of the standard Java extension class javax.servlet.http.HttpServlet. This class implements the basic framework for servlets, specified in the javax.servlet.Servlet interface. This interface comprises three main methods:

  • init()
  • service()
  • destroy()

In addition, the following ancillary methods are available:

  • getServletConfig()
  • getServletInfo()

The init() method is invoked after a servlet is loaded for the first time. Any initialisation activities can be performed at this time. An example activity is the establishment of a connection to a database. The ServletConfig object, which is passed into the init() method, can be used to store configuration details, which are then retrievable using the servlet's getServletConfig() method.

The service() method is the main part of the servlet. It is invoked whenever a client makes a request to the specific servlet. The service() method reads the request, passed in as a ServletRequest object, and produces the response message, represented as a ServletResponse object. Since there is only a single instance of a servlet, and multiple threads may call that instance's service() method at the same time, the method needs to be implemented in a thread-safe manner.

Once a servlet has performed its tasks, it can be disposed of properly by invoking the destroy() method. This method performs any steps necessary to clean up after the servlet, such as closing database connections and freeing up any other resources that the servlet used. Note that this method must also be thread-safe.
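
The life cycle and the thread-safety requirement described above can be sketched in plain Java. The javax.servlet classes are not part of the JDK, so this sketch does not use them: HitCounterSketch and its simulate() method are hypothetical stand-ins for a servlet and a servlet engine, and the code assumes Java 8 or later. It shows a single shared instance being initialised once, serviced from several threads at the same time, and then destroyed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a servlet: one shared instance, many threads.
final class HitCounterSketch {
    // Shared state must be thread-safe because service() runs concurrently.
    private final AtomicInteger hits = new AtomicInteger();
    private volatile boolean initialised;

    // Called once after the instance is loaded.
    void init() {
        initialised = true;
    }

    // Called once per request, possibly from many threads at the same time.
    String service(String requestPath) {
        if (!initialised) {
            throw new IllegalStateException("init() has not been called");
        }
        int n = hits.incrementAndGet();
        return "Request " + n + " for " + requestPath;
    }

    // Called once when the instance is unloaded.
    void destroy() {
        initialised = false;
    }

    int totalHits() {
        return hits.get();
    }

    // Simulates an engine: load once, serve concurrently, then unload.
    static int simulate(int threads, int requestsPerThread) {
        HitCounterSketch servlet = new HitCounterSketch();
        servlet.init();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    servlet.service("/counter");
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        servlet.destroy();
        return servlet.totalHits();
    }

    public static void main(String[] args) {
        System.out.println("Total hits: " + simulate(8, 1000));
    }
}
```

Replacing the AtomicInteger with a plain int field incremented via hits++ would allow updates to be lost under load, which is the kind of defect the thread-safety requirement is meant to prevent.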

For more detailed information regarding the basic servlet architecture, refer to the Servlet Essentials tutorial, or Sun's Java Servlet Whitepaper.

Servlets support both the GET and POST methods of form submission. The programmer simply overrides the appropriate method, doGet() or doPost(), to define how the request is to be processed.

A servlet is invoked via the mapping of a URL to a specific class. The servlet engine parses the incoming request to determine the desired servlet class and any parameters encoded in the URL are passed to it. If necessary the servlet will be loaded, and the init() method invoked. The service() method of the loaded servlet is then invoked, and any response is passed back to the client which triggered the original request.
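
The invocation steps above (map the URL to a class, load and initialise on first use, pass the URL parameters in) can be sketched as follows. ToyServletEngine, Handler and EchoHandler are hypothetical names invented for this illustration and are not part of the Servlet API; a real engine does considerably more. The sketch is single-threaded and assumes Java 10 or later for URLDecoder.decode(String, Charset).

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, single-threaded model of URL-to-servlet dispatch.
final class ToyServletEngine {
    // Stand-in for a servlet; not the javax.servlet.Servlet interface.
    interface Handler {
        void init();
        String service(Map<String, String> params);
    }

    // Example handler that echoes its parameters and counts init() calls.
    static final class EchoHandler implements Handler {
        int initCalls;
        @Override public void init() { initCalls++; }
        @Override public String service(Map<String, String> params) {
            return params.toString();
        }
    }

    private final Map<String, Supplier<Handler>> mappings = new HashMap<>();
    private final Map<String, Handler> loaded = new HashMap<>();

    // Registers which handler serves a given URL path.
    void map(String path, Supplier<Handler> factory) {
        mappings.put(path, factory);
    }

    // Resolves the path, loads the handler if necessary, then services the request.
    String handle(String url) {
        int q = url.indexOf('?');
        String path = q < 0 ? url : url.substring(0, q);
        String query = q < 0 ? "" : url.substring(q + 1);

        Supplier<Handler> factory = mappings.get(path);
        if (factory == null) {
            return "404 Not Found";
        }
        Handler handler = loaded.get(path);
        if (handler == null) {
            handler = factory.get();   // first request: load ...
            handler.init();            // ... and initialise exactly once
            loaded.put(path, handler);
        }
        return handler.service(parseQuery(query));
    }

    // Splits "a=1&b=2" into decoded name/value pairs.
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ToyServletEngine engine = new ToyServletEngine();
        engine.map("/echo", EchoHandler::new);
        System.out.println(engine.handle("/echo?input_string=hello+world"));
    }
}
```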

 

Developing a Form Using Servlets

To demonstrate how to use servlets to develop a form, the following example servlet will be discussed. It's a rather simple servlet that prompts the user to enter a string, which is then encoded into a format that is URL-friendly.

There are two parts to the servlet:

  • display form
  • form processor

Forms need not be handled this way, and for complex forms it may not be sensible. However, it simplifies things for the purposes of this document.

When the servlet is invoked via a GET request, for example when the user clicks on a hyperlink or types in the servlet's URL, the doGet() method of the servlet will be called, thereby displaying the form (lines 18 to 42). When this form has been submitted by the user, a POST request will be sent to the web server, causing the doPost() method of the servlet to be executed, thereby processing the form (lines 43 to 77).

 

     1  /**
     2   * Sample Java Servlet
     3   * Name: URLEncodingServlet.java
     4   * Purpose: Present user with a form for entering a single string.
     5   *          When form is submitted, contents of string are encoded as URL,
     6   *          which is then returned to the user.
     7   * @author: brunoa
     8   * Date created:  05 April 2000 14:30
     9   * Last modified: 06 April 2000 09:45
    10   */

    11  import java.io.*;
    12  import java.net.*;
    13  import java.util.*;
    14  import javax.servlet.*;
    15  import javax.servlet.http.*;

    16  public class URLEncodingServlet extends HttpServlet
    17  {
    18    public void doGet(HttpServletRequest request,
    19                      HttpServletResponse response)
    20      throws IOException, ServletException
    21    {
    22      response.setContentType("text/html");
    23      PrintWriter output = response.getWriter();
    24      output.println("<HTML>");
    25      output.println("<HEAD><TITLE>URLEncodingServlet Form</TITLE></HEAD>");
    26      output.println("<BODY BGCOLOR=\"gainsboro\">");
    27      output.println("<H2>URLEncodingServlet Form</H2>");
    28      output.println("<FORM NAME=\"URLEncoder\" ACTION=\"URLEncodingServlet\"");
    29      output.println(" METHOD=\"POST\">");
    30      output.println("<P>String to encode: <INPUT TYPE=\"TEXT\"");
    31      output.println(" NAME=\"input_string\" VALUE=\"\" SIZE=\"60\">");
    32      output.println("</P>");
    33      output.println("<INPUT TYPE=\"RESET\">");
    34      output.println("<INPUT TYPE=\"SUBMIT\">");
    35      output.println("</FORM>");
    36      output.println("<P>&nbsp;</P>");
    37      output.println("<P><SMALL><I>Script: " + getServletInfo() + "<BR>");
    38      output.println("Date: " + new Date() + "</I></SMALL></P>");
    39      output.println("</BODY>");
    40      output.println("</HTML>");
    41      output.close();
    42    }


    43    public void doPost(HttpServletRequest request,
    44                       HttpServletResponse response)
    45      throws IOException, ServletException
    46    {
    47      response.setContentType("text/html");
    48      PrintWriter output = response.getWriter();
    49      output.println("<HTML>");
    50      output.println("<HEAD><TITLE>URLEncodingServlet Output</TITLE></HEAD>");
    51      output.println("<BODY BGCOLOR=\"gainsboro\">");
    52      output.println("<H2>URLEncodingServlet Output</H2>");
    53      output.println("<P>Input string: ");
    54      String inputString = request.getParameter("input_string");
    55      output.println(inputString);
    56      output.println("<P>URL Encoded: ");
    57      output.println(URLEncoder.encode(inputString));
    58      output.println("</P>");
    59      output.println("<P>&nbsp;</P>");
    60      output.println("<HR>");
    61      output.println("<H2>URLEncodingServlet Form</H2>");
    62      output.println("<FORM NAME=\"URLEncoder\" ACTION=\"URLEncodingServlet\"");
    63      output.println(" METHOD=\"POST\">");
    64      output.println("<P>String to encode: ");
    65      output.println("<INPUT TYPE=\"TEXT\" NAME=\"input_string\" VALUE=\"");
    66      output.println(inputString + "\" SIZE=\"60\">");
    67      output.println("</P>");
    68      output.println("<INPUT TYPE=\"RESET\">");
    69      output.println("<INPUT TYPE=\"SUBMIT\">");
    70      output.println("</FORM>");
    71      output.println("<P>&nbsp;</P>");
    72      output.println("<P><SMALL><I>Script: " + getServletInfo() + "<BR>");
    73      output.println("Date: " + new Date() + "</I></SMALL></P>");
    74      output.println("</BODY>");
    75      output.println("</HTML>");
    76      output.close();
    77    }

    78    public String getServletInfo()
    79    {
    80      return "URLEncodingServlet 0.1 by Bruno Andrighetto";
    81    }

    82  }  // URLEncodingServlet

 

A line-by-line description will now be provided.

Lines 1 to 10 contain some comments regarding the program.

Lines 11 to 15 contain statements to import the required packages:

  • java.io package is required for generating output
  • java.net package is required for providing the URL-encoding functionality
  • java.util package is required for displaying the date
  • javax.servlet package is required for the implementation of servlets
  • javax.servlet.http package is required for the implementation of HTTP servlets

Line 16 declares that the URLEncodingServlet class is a subclass of javax.servlet.http.HttpServlet.

Lines 18 to 42 override the doGet() method of the HttpServlet class:

Lines 18 and 19: Note that it requires two parameters, namely objects implementing the javax.servlet.http.HttpServletRequest interface (allowing access to the request) and the javax.servlet.http.HttpServletResponse interface (allowing access to the response).

Line 20 declares that the doGet() method can throw IOExceptions and ServletExceptions.

Line 22 sets the content type of the response to "text/html", the standard for HTML web pages.

Line 23 obtains a PrintWriter from the response, through which character data is written to the client.

Lines 24 to 40 send the contents of the web page to the output stream.

Lines 28 and 29 specify the FORM tag. Note that the action attribute is set to the servlet processing the form (in this case, the same as the servlet displaying the form). Also, the form will be submitted using the POST method.

Line 37 calls the servlet's getServletInfo() method to obtain descriptive information about the servlet.

Line 38 uses the Date class to get the current date.

Line 41 closes the output stream.

Lines 43 to 77 override the doPost() method of the HttpServlet class:

Lines 43 and 44: Note that it requires two parameters, namely objects implementing the javax.servlet.http.HttpServletRequest interface (the request) and the javax.servlet.http.HttpServletResponse interface (the response).

Line 45 declares that the doPost() method can throw IOExceptions and ServletExceptions.

Line 47 sets the content type of the response to "text/html", the standard for HTML web pages.

Line 48 obtains a PrintWriter from the response, through which character data is written to the client.

Lines 49 to 75 send the contents of the web page to the output stream.

Line 54 extracts the "input_string" parameter from the submitted form, using the getParameter() method invoked upon the request object.

Line 57 performs the URL-encoding upon the input string, using the java.net.URLEncoder class.

Line 72 calls the servlet's getServletInfo() method to obtain descriptive information about the servlet.

Line 73 uses the Date class to get the current date.

Line 76 closes the output stream.

Lines 78 to 81 override the getServletInfo() method of the HttpServlet class, to provide descriptive information about the servlet.
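
The encoding step at line 57 can be tried outside a servlet engine with a few lines of plain Java. This is a minimal sketch, not part of the original servlet: EncodingSketch and its two helper methods are hypothetical names. It assumes Java 10 or later for URLEncoder.encode(String, Charset); the single-argument encode(String) used in the listing is deprecated in later JDK releases. The null checks reflect that getParameter() returns null when the parameter is absent, and escapeHtml() illustrates one way the input string could be made safe before being echoed back into the page at lines 55 and 66.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helpers mirroring what the sample servlet does with its input.
final class EncodingSketch {
    // URL-encodes the input; a missing parameter is treated as empty.
    static String encodeForUrl(String input) {
        if (input == null) {
            return "";
        }
        return URLEncoder.encode(input, StandardCharsets.UTF_8);
    }

    // Escapes characters that are significant in HTML text and attribute values.
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "fish & chips <tasty>";
        System.out.println(encodeForUrl(input)); // fish+%26+chips+%3Ctasty%3E
        System.out.println(escapeHtml(input));   // fish &amp; chips &lt;tasty&gt;
    }
}
```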

 

Servlets and State

HTTP is a stateless protocol. That is, it consists of isolated transactions with no way of keeping track of requests and responses. This has the advantage that a server can go off-line for a brief period of time without the browser client even noticing that it was down.

HTTP servlets are by default stateless. This means that there is no way of keeping track of workflow beyond a single web page. Only the information presented within that page can be used to process the request.

To overcome this problem, servlets can be made "stateful" - that is, be made to remember state. The servlet API provides a couple of mechanisms to preserve state:

  • Sessions
  • Cookies

A session provides the appearance of a continuous connection from the same client over a set period of time. The HttpSession interface provides the necessary functionality. Once a session is created, it can be used to maintain a set of name/value pairs for the duration of the session. Sessions have a default duration, but this value can be modified.

[API: javax.servlet.http.HttpSession interface]
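
The idea behind sessions (a server-side set of name/value pairs, looked up by an identifier and discarded after a period of inactivity) can be sketched in plain Java. This is only a model under stated assumptions: SessionStoreSketch and its nested Session class are hypothetical and do not show how any particular servlet engine implements HttpSession. The current time is passed in explicitly so that expiry can be demonstrated without waiting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical model of server-side session storage with an inactivity timeout.
final class SessionStoreSketch {
    static final class Session {
        final String id;
        private final Map<String, Object> attributes = new HashMap<>();
        private long lastAccessedMillis;

        Session(String id, long nowMillis) {
            this.id = id;
            this.lastAccessedMillis = nowMillis;
        }

        void setAttribute(String name, Object value) {
            attributes.put(name, value);
        }

        Object getAttribute(String name) {
            return attributes.get(name);
        }
    }

    private final Map<String, Session> sessions = new HashMap<>();
    private final long maxInactiveMillis;

    SessionStoreSketch(long maxInactiveMillis) {
        this.maxInactiveMillis = maxInactiveMillis;
    }

    // Creates a session with a random identifier.
    Session create(long nowMillis) {
        Session session = new Session(UUID.randomUUID().toString(), nowMillis);
        sessions.put(session.id, session);
        return session;
    }

    // Returns the session, or null if it is unknown or has been idle too long.
    Session find(String id, long nowMillis) {
        Session session = sessions.get(id);
        if (session == null) {
            return null;
        }
        if (nowMillis - session.lastAccessedMillis > maxInactiveMillis) {
            sessions.remove(id);
            return null;
        }
        session.lastAccessedMillis = nowMillis;
        return session;
    }

    public static void main(String[] args) {
        SessionStoreSketch store = new SessionStoreSketch(30 * 60 * 1000L);
        long start = System.currentTimeMillis();
        Session session = store.create(start);
        session.setAttribute("user", "brunoa");
        System.out.println(store.find(session.id, start + 1000).getAttribute("user"));
    }
}
```

Unlike a cookie, the attribute values here are ordinary Java objects held in server memory; only the session identifier would need to travel between client and server.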

Cookies provide a way of maintaining state information across multiple browser sessions. An example where this is useful would be for maintaining user preferences.

[API: javax.servlet.http.Cookie class]
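
At the HTTP level a cookie is a name/value pair carried in headers: the server sends it in a Set-Cookie response header and the browser returns it in a Cookie request header. The following sketch shows both directions in plain Java. CookieHeaderSketch and its methods are hypothetical helpers, not the Servlet API's Cookie class, and the sketch assumes simple values that need no quoting or encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helpers showing what cookie headers look like on the wire.
final class CookieHeaderSketch {
    // Builds a Set-Cookie response header value with a lifetime in seconds.
    static String setCookie(String name, String value, int maxAgeSeconds) {
        return name + "=" + value + "; Max-Age=" + maxAgeSeconds + "; Path=/";
    }

    // Parses a Cookie request header value such as "theme=dark; lang=en".
    static Map<String, String> parseCookieHeader(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        if (header == null || header.trim().isEmpty()) {
            return cookies;
        }
        for (String part : header.split(";")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue; // ignore fragments that are not name=value
            }
            cookies.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
        }
        return cookies;
    }

    public static void main(String[] args) {
        System.out.println(setCookie("theme", "dark", 3600));
        System.out.println(parseCookieHeader("theme=dark; lang=en"));
    }
}
```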

The following table provides a brief comparison between cookies and sessions:

Cookies                                                     | Sessions
------------------------------------------------------------+-------------------------------------------------------------
Memory size limited to 2K                                   | Memory limit depends on server memory
Multiple cookies required - one name/value pair per cookie  | Single session can maintain as much information as required
Number of cookies on client is limited                      | Number of sessions depends on server memory
Contain text only                                           | Contain Java technology objects

Source: Beyond CGI: Developing Java Servlets, Sun courseware

An important consideration regarding the usefulness of cookies is that users often disable them in their browsers. Sessions, which are generated on the server, are therefore more generally applicable.

 

Developing Object Frameworks based on Servlets

Part of a servlet's function is to generate a response to a request. This is typically done via the generation of Hypertext Markup Language (HTML) data.

A servlet has access to a ServletOutputStream or PrintWriter, which provides the medium for returning data. The standard means by which "pure" servlets generate HTML is by invoking a series of println() methods on the output stream. This can be very awkward and error-prone, especially when dealing with complex HTML markup. Even simple examples like the one provided earlier can be cumbersome to code.

To overcome this, it is possible to use an object framework (set of classes) which abstracts over HTML. For example, it is possible to create a HtmlPage class to represent the HTML page, which contains a Head object and a Body object. In turn these objects can contain other elements as defined by the HTML standard. An example of such a framework is the Element Construction Set (ECS) developed by the Java Apache Project. The PIRSA Toolkit also provides a framework for HTML abstraction.
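
A minimal sketch of such a framework is given below. The class names HtmlSketch, HtmlPage and Element are hypothetical and are not taken from ECS or the PIRSA Toolkit, whose actual APIs are not shown in this document. The point is only that the page is assembled from objects and rendered in one step, rather than written out as a sequence of println() calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical object model for HTML; not the ECS or PIRSA Toolkit API.
final class HtmlSketch {
    // A generic element with a tag name and child elements or text.
    static final class Element {
        private final String tag;
        private final List<Object> children = new ArrayList<>();

        Element(String tag) {
            this.tag = tag;
        }

        // Adds a nested element and returns this element for chaining.
        Element add(Element child) {
            children.add(child);
            return this;
        }

        // Adds text content, escaped so it cannot be mistaken for markup.
        Element text(String text) {
            children.add(escape(text));
            return this;
        }

        String render() {
            StringBuilder out = new StringBuilder("<" + tag + ">");
            for (Object child : children) {
                out.append(child instanceof Element ? ((Element) child).render() : child);
            }
            return out.append("</").append(tag).append(">").toString();
        }
    }

    // A page consisting of a head and a body, as described in the text.
    static final class HtmlPage {
        final Element head = new Element("HEAD");
        final Element body = new Element("BODY");

        HtmlPage(String title) {
            head.add(new Element("TITLE").text(title));
        }

        String render() {
            return "<HTML>" + head.render() + body.render() + "</HTML>";
        }
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        HtmlPage page = new HtmlPage("URLEncodingServlet Output");
        page.body.add(new Element("H2").text("URLEncodingServlet Output"));
        page.body.add(new Element("P").text("Input string: fish & chips"));
        System.out.println(page.render());
    }
}
```

Because closing tags and escaping are handled by the framework, a whole class of markup errors that are easy to make with println() calls cannot occur.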

However, such frameworks fail to address a more fundamental issue, namely the separation of presentation logic from application logic. When the presentation of information is tightly bound within the servlet, any change to layout requires editing the Java code, recompiling it, restarting the server, and finally retesting the servlet. Another point worth considering is that web pages are typically designed using powerful tools to provide a desired look-and-feel. The HTML markup then needs to be translated into a series of Java statements using the object framework. This can be automated, but adds extra steps which slow down the development process and can contribute to errors.

There are ways to achieve a separation between presentation and application logic while still taking advantage of servlet technology. Two such approaches are JavaServer Pages (JSP) and servlet template engines (for example WebMacro). An additional benefit is that they remove the requirement to translate HTML to Java statements. They also facilitate the division of labour between graphic designers and application developers. However, such technologies are beyond the scope of this technical investigation.

 

Other Relevant Issues

Authentication

Servlets support three types of authentication, namely:

  • Basic authentication
    • username/password
    • supported via access control lists (ACLs)
    • not very secure - passwords sent as clear text
  • Digest authentication
    • small portion of password (fingerprint) exchanged
  • Secure sockets layer (SSL) server authentication
    • establishes authentication between client and server - https protocol
    • often via Certificates, using encryption

Source: Beyond CGI: Developing Java Servlets, Sun courseware

With respect to basic authentication, web servers can be set up to resolve a protected URL to a servlet. This servlet can check the Authorization header set by the browser (using the getHeader() method from HttpServletRequest) to determine whether the user is already authenticated. If not, the servlet sends back a 401 error page with the appropriate headers set to prompt the user for a username/password combination. It is also possible to have the servlet initiate the authentication challenge by directly setting the WWW-Authenticate header (using setHeader() from HttpServletResponse) and generating a 401 error by setting the status (using setStatus()).
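
The header handling involved in basic authentication can be sketched without a servlet engine. BasicAuthSketch and its methods are hypothetical helpers written for this illustration, and the sketch assumes Java 8 or later for java.util.Base64. In a servlet, the string passed to parseAuthorization() would come from getHeader("Authorization"), and the string returned by challenge() would be set as the WWW-Authenticate header alongside a 401 status. Note that the credentials are only Base64-encoded, not encrypted, which is why the scheme is described above as not very secure.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helpers for the two headers used by basic authentication.
final class BasicAuthSketch {
    // Value for the WWW-Authenticate header sent with a 401 response.
    static String challenge(String realm) {
        return "Basic realm=\"" + realm + "\"";
    }

    // Returns {username, password}, or null if the header is missing or malformed.
    static String[] parseAuthorization(String header) {
        if (header == null || !header.regionMatches(true, 0, "Basic ", 0, 6)) {
            return null;
        }
        byte[] decoded;
        try {
            decoded = Base64.getDecoder().decode(header.substring(6).trim());
        } catch (IllegalArgumentException e) {
            return null; // not valid Base64
        }
        String credentials = new String(decoded, StandardCharsets.UTF_8);
        int colon = credentials.indexOf(':');
        if (colon < 0) {
            return null;
        }
        return new String[] {
            credentials.substring(0, colon),
            credentials.substring(colon + 1)
        };
    }

    public static void main(String[] args) {
        System.out.println(challenge("Staff"));
        String[] parts = parseAuthorization("Basic dXNlcjpwYXNz");
        System.out.println(parts[0] + " / " + parts[1]); // user / pass
    }
}
```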


Chunking

According to the servlet API, HTTP 1.1 chunked encoding (chunking) means that the response has a Transfer-Encoding header. To implement this, the API suggests not setting the Content-Length header. This is described in the documentation for the HttpServlet abstract class.
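
For reference, the wire format that chunked encoding produces is simple: each chunk is preceded by its size in hexadecimal and followed by a line break, and a zero-length chunk marks the end of the body. The sketch below builds such a body in plain Java. ChunkedEncodingSketch is a hypothetical helper written for this illustration; in practice the servlet engine or web server performs this encoding, not the servlet itself.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper showing the HTTP 1.1 chunked body format.
final class ChunkedEncodingSketch {
    private static final String CRLF = "\r\n";

    // Encodes each string as one chunk, then appends the terminating chunk.
    static byte[] encode(List<String> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String chunk : chunks) {
            byte[] data = chunk.getBytes(StandardCharsets.UTF_8);
            if (data.length == 0) {
                continue; // a zero-length chunk would signal the end of the body
            }
            writeAscii(out, Integer.toHexString(data.length) + CRLF);
            out.write(data, 0, data.length);
            writeAscii(out, CRLF);
        }
        writeAscii(out, "0" + CRLF + CRLF); // last chunk, no trailers
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] body = encode(Arrays.asList("Hello, ", "world"));
        // Show the line breaks explicitly so the structure is visible.
        System.out.println(new String(body, StandardCharsets.UTF_8)
                .replace("\r\n", "\\r\\n"));
    }
}
```

Because the size of each chunk is sent as it is produced, the total length of the response need not be known in advance, which is why the Content-Length header is left unset.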


Online Resources

 

Date: Thursday, April 6, 2000 3:39 PM