Internet Programming Languages
Brian Harrington, Yuhua Zhang, and Jason Christian
CSCI5250.001
Abstract: This paper is a brief overview of some of the dynamic scripting technologies that are available, as well as some of the common XML specifications and protocols in use today. Perl, PHP, JSP/Servlets, and ASP/Web Classes are briefly covered in terms of their particular features and advantages. The XML specifications XHTML, XForms, and XSLT are covered in terms of how they can be used with or instead of legacy technology such as HTML. Finally, we discuss some of the XML protocols used for language- and architecture-neutral services, including XML-RPC, SOAP, WSDL, and UDDI.
Introduction
There is a wide variety of programming languages used for the Internet. Some of the most popular languages for dynamic scripting include the Practical Extraction and Report Language (Perl), PHP: Hypertext Preprocessor (PHP), Java Server Pages (JSP)/Servlets, and Active Server Pages (ASP)/Web Classes. Each of these languages has its own specialties, allowing it to do some tasks extremely well. In addition, XML has become a pervasive technology for the Internet because it allows you to easily create protocols that are neutral with respect to programming language and machine architecture. Some common XML specifications and protocols used today include XHTML, XForms, XSLT, XML-RPC, SOAP, WSDL, and UDDI.
Dynamic Scripting
Perl
Practical Extraction and Report Language (Perl) allows programmers who are familiar with tools and languages frequently used on UNIX/Linux, such as the Bourne shell, csh, awk, sed, grep, and C, to begin web programming fairly quickly. Perl is an interpreted language, so it can be used on any system that has an interpreter. Interpreters are currently available for most UNIX/Linux variants, Windows, and Macintosh.
One of the main advantages of Perl is that it has a large set of built-in functions that you can use in your programs. It has built-ins for almost everything in section 2 of the UNIX manuals, which makes it very popular with system administrators. One of the really nice features of Perl is its ability to work with strings: Perl takes care of all of the allocation, deallocation, concatenation, and so on. Furthermore, Perl has built-in regular expression support that developers can take advantage of. Since Internet programming usually involves manipulating strings, this is a very important aspect.
The primary drawback of Perl is speed on very busy sites. Perl is interpreted and therefore will not execute as fast as compiled CGI programs. However, for most applications Perl will be fast enough to get the job done, and performance will probably improve as newer interpreters become available.
Information/Tools
- http://www.possibility.com/Perl/
- http://www.perl.com/
- http://perldoc.perl.com/
- http://www.perl.com/pub/a/language/info/software.html
- http://archive.ncsa.uiuc.edu/General/Training/PerlIntro/
PHP
PHP: Hypertext Preprocessor (PHP) is a language that is becoming very popular for Internet programming. PHP borrows most of its syntax from C, although it also has some elements of Perl, so Perl or C programmers should have no trouble learning it. PHP, like Perl, is interpreted and available on a wide variety of systems. PHP also has a large built-in function library and easy string manipulation.
PHP was designed specifically for developing web applications, so it has a number of advantages over Perl and CGI for Internet programming. First, PHP allows you to embed your code within HTML using special tags, whereas in Perl you would have to use print statements to output the HTML. In addition, PHP has native support for a large number of popular databases. Overall, PHP provides an easy-to-learn environment for creating and debugging web applications.
Information/Tools
- http://www.php.net/
- http://webreference.com/new/php.html
- http://www.gimpster.com/php/tutorial/links.php
JSP/Servlets
Java Server Pages (JSP) and Servlets allow you to take advantage of the huge set of Java APIs in your web applications. One notable API is JDBC. Java database access is abstracted very well, so you can convert your applications to work with a variety of databases with very few code modifications. You do not need a different API for each particular database.
Another benefit is that there is a clear separation between the backend code and content and the display that is shown to the user. JSP pages make it easy to control the display and to use a little Java code for simple tasks or for fetching the content. Servlets and Beans can be used to provide more complex functionality without having to worry about how the output is displayed. Keeping the display in JSP pages also makes it easier to edit, which matters because the display will probably change more often than the backend code that does the work.
Furthermore, there are several tag libraries and JSP-style clones that can be used with standard servlet containers. Two examples are GNU Server Pages (GSP), which is used by WalMart, and Tea, which is used by Disney. Both provide servlets that compile and run pages similar to JSP, with particular enhancements. For example, Tea pages are very simple for people with no programming experience.
Performance
Performance can vary greatly between different implementations of the JSP/Servlet specification. There are several open source and commercial implementations that have fairly good performance for most applications. JSP/Servlets are used by a number of organizations, including:
- AOL
- BEA
- Delta Air Lines
- Dow Jones Indexes
- New Jersey Transit
- Oracle
- Sun Microsystems
- US Department of Education
There are also some companies such as WalMart that use alternatives based on JSP/Servlets such as GSP.
Comparisons
- http://istlab.dmst.aueb.gr/~george/articles/dynweb/present.pdf
- http://www.cs.uow.edu.au/people/nabg/399/jsp.pdf
- http://www.caucho.com/articles/benchmark.xtp
JSP Alternatives
- http://gsp.sourceforge.net/
- http://teatrove.sourceforge.net/
- http://www.bitmechanic.com/projects/gsp/
Performance Tools
- http://www.bea.com/products/weblogic/jrockit/index.shtml
- http://www.excelsior-usa.com/jet.html
- http://www.instantiations.com/jove/product/thejovesystem.htm
- http://www.borland.com/optimizeit/index.html
Servlet Engines
- http://jakarta.apache.org/tomcat/index.html
- http://www.w3.org/Jigsaw/
- http://jetty.mortbay.org/jetty/index.html
Application Servers (J2EE)
- http://commerce.bea.com/downloads/weblogic_platform.jsp
- http://www.borland.com/besappserver/index.html
- http://www.jboss.org/
- http://www.macromedia.com/software/jrun/
- http://www-3.ibm.com/software/webservers/appserv/
- http://www.oracle.com/ip/deploy/ias/
- http://wwws.sun.com/software/products/appsrvr/home_appsrvr.html
ASP/Web Classes
Active Server Pages (ASP) and Web Classes are Microsoft's equivalent of JSP/Servlets. ASP pages allow you to use VBScript or JScript to create dynamic pages similar to PHP or JSP pages. Web Classes can be used for more intensive processing and to obtain more flexibility. A Web Class is a DLL compiled from a language such as Visual Basic (VB) or Visual C++ (VC++). VB is especially well suited for creating Web Classes because the VB Web Designer automatically takes care of a lot of the interfacing that you have to do by hand in VC++. Through Web Classes you can do pretty much anything you can do with VB.
Furthermore, with Microsoft's new .NET platform, ASP and related web technology are now available in several more languages and with a larger set of APIs. In particular, the XML APIs needed for creating web services have improved. The primary drawback of ASP/Web Classes technology is that it is tied to Microsoft's IIS web server and may not be available on some systems. However, there is an ASP library called ChiliSoft from Sun Microsystems that allows you to run ASP pages on other platforms.
Information/Tools
- http://wwws.sun.com/software/chilisoft/
- http://msdn.microsoft.com/library/default.asp?url=/nhp/default.asp?contentid=28000522
- http://msdn.microsoft.com/netframework/default.asp
- http://www.programmersheaven.com/
XML Protocols
XHTML
Extensible Hypertext Markup Language (XHTML) is an XML DTD that describes an HTML-like markup. It is intended by the W3C to eventually replace HTML as the dominant document format on the Internet. XHTML has a number of benefits over HTML because it does not allow sloppy markup, whereas current web browsers are very forgiving about the HTML that they will accept. XHTML requires that all documents be well-formed, that element and attribute names be lower case, and that attribute values be properly quoted.
The biggest problem for XHTML is that it has been slow to be adopted. One reason is probably that most WYSIWYG editors that export HTML output sloppy code, and adoption will remain slow until common editors output proper XHTML. Furthermore, users are often very slow to upgrade to newer browsers, and some older browsers have problems viewing XHTML because of the small differences between XML syntax and HTML.
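The well-formedness requirement can be checked mechanically with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module. The two markup strings and the helper name is_well_formed are illustrative assumptions, and the check covers well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Markup a forgiving browser would accept: unquoted attribute value,
# <p> and <br> never closed.
sloppy_html = "<html><body><p class=intro>Hello<br></body></html>"

# The same content written as XHTML: lower-case names, quoted values,
# every element closed.
clean_xhtml = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p class="intro">Hello<br /></p></body></html>'
)

sloppy_ok = is_well_formed(sloppy_html)
clean_ok = is_well_formed(clean_xhtml)
print(sloppy_ok, clean_ok)
```

An XML parser rejects the first string outright, which is the strictness that XHTML asks of browsers and authoring tools.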
Information/Tools
- http://www.w3c.org/
- http://www.w3c.org/MarkUp/
- http://www.w3c.org/MarkUp/#tidy
- http://www.w3schools.com/xhtml/xhtml_html.asp
XForms
XForms are the next generation of web forms. They are richer and much more flexible than current forms created using HTML. The goal is to separate the description of what the form does from how the form looks. An XForms Model defines a form, which can then be displayed using a variety of methods including traditional XHTML forms, the XForms User Interface, WML, or other proprietary displays.
XForms also includes the XForms Submit Protocol which defines how the data is to be sent and received. Furthermore, it allows you to specify how to suspend, resume, and complete forms. This is useful if a form will not be completed in one session. XForms provides an easy way to create forms that can be used from a variety of devices and with a variety of technologies.
Information
XSLT
Extensible Stylesheet Language Transformations (XSLT) allows you to transform an XML document into another XML document. For example, if all of my documentation is in DocBook, I can use XSLT to convert it to XHTML (or HTML) for display in browsers, or to XSL-FO so that I can generate a PDF file.
Another use might be to create XHTML or HTML files that are optimized for a particular browser. For example, I have an XML file with the actual content; when a user visits the site, their user-agent is detected and the XML file is transformed dynamically into HTML that is optimized for that particular browser. This can be extremely useful if the site is visited by a wide variety of users with different needs, such as graphical browsers like Mozilla and IE, text-based browsers such as Lynx, and PDA browsers with small screen resolutions. In combination with scripting languages such as Perl, PHP, JSP/Servlets, or ASP, this can be used to cleanly separate content from display.
In addition, this can greatly simplify the task of updating and maintaining a site. For example, if you have online documentation, you just update the XML file and you are done. All of the pages can be generated from the XML dynamically. Tables of contents, figure lists, indexes, and so on can all be generated on demand from the original file. Furthermore, this can save a lot of hard drive space, since you don't have to store all the individual files in all of the different formats.
The main drawback to using XSLT dynamically is performance. For every request, the server has to parse both the XML document and the stylesheet and then apply the transformation rules. If you have a lot of requests this could be too slow, especially if the documents are large. For example, generating a PDF file for a large manual would take a lot of time and processing power. However, the results could be cached so that the transformation happens only once after the XML document changes, or XSLT could be used to regenerate a static site whenever the content changes.
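The user-agent dispatch described above might be sketched as follows. The stylesheet file names, the substring rules, and the sample User-Agent strings are all hypothetical. A real site would pass the chosen stylesheet to an XSLT processor rather than just returning its name.

```python
# Hypothetical rules mapping a User-Agent substring to a stylesheet file.
# Order matters: the first matching rule wins, so the more specific
# markers come before the generic "Mozilla" token.
STYLESHEET_RULES = [
    ("Lynx", "text.xsl"),           # text-based browsers
    ("Windows CE", "pda.xsl"),      # small-screen PDA browsers
    ("Mozilla", "graphical.xsl"),   # graphical browsers (Mozilla, IE)
]
DEFAULT_STYLESHEET = "plain.xsl"

def choose_stylesheet(user_agent: str) -> str:
    """Pick the stylesheet to apply for a given User-Agent header value."""
    for marker, stylesheet in STYLESHEET_RULES:
        if marker in user_agent:
            return stylesheet
    return DEFAULT_STYLESHEET

for agent in (
    "Lynx/2.8.4rel.1 libwww-FM/2.14",
    "Mozilla/2.0 (compatible; MSIE 3.02; Windows CE; 240x320)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
):
    print(agent, "->", choose_stylesheet(agent))
```

Keeping the rules in a single ordered list means that supporting a new class of device only requires a new stylesheet and one extra entry.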
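The caching idea can be sketched in a few lines. Python's standard library has no XSLT processor, so fake_transform below is a placeholder standing in for one. The class name TransformCache and its interface are likewise assumptions made for illustration.

```python
import hashlib

class TransformCache:
    """Re-run a costly transformation only when its inputs change (sketch)."""

    def __init__(self, transform):
        self._transform = transform   # callable(xml_text, xslt_text) -> str
        self._results = {}            # digest of the inputs -> rendered output
        self.misses = 0               # number of times the real transform ran

    def render(self, xml_text: str, xslt_text: str) -> str:
        # Key on the content of both inputs, so editing either the document
        # or the stylesheet invalidates the cached result.
        key = hashlib.sha256(
            xml_text.encode("utf-8") + b"\0" + xslt_text.encode("utf-8")
        ).hexdigest()
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._transform(xml_text, xslt_text)
        return self._results[key]

def fake_transform(xml_text: str, xslt_text: str) -> str:
    # Placeholder for a real XSLT processor; it just wraps the input.
    return "<html>" + xml_text + "</html>"

cache = TransformCache(fake_transform)
first = cache.render("<doc>v1</doc>", "<xsl/>")
second = cache.render("<doc>v1</doc>", "<xsl/>")   # served from the cache
third = cache.render("<doc>v2</doc>", "<xsl/>")    # document changed, re-run
print(cache.misses)
```

Hashing the inputs is far cheaper than parsing and transforming them. A production cache would also need an eviction policy, which is omitted here.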
Information/Tools
- http://www.w3.org/TR/xslt
- http://www.w3.org/TR/xslt20/
- http://www.xslt.com/
- http://xml.apache.org/xalan-j/index.html
- http://xml.apache.org/xalan-c/index.html
- http://java.sun.com/xml/jaxp/
- http://www.renderx.com/
- http://xml.apache.org/fop/
PDF Rendering Examples
XML-RPC
XML-RPC is a way of making Remote Procedure Calls that are encoded as XML for transport over the network, usually as an HTTP request. This has a number of benefits, including portability across languages and architectures. As long as both sides exchange the same XML requests and responses, it doesn't really matter what language the program on the other end is written in or what architecture it runs on. Also, the text-based calls are easy to read and debug compared with binary data, especially given the simple syntax.
The main problem with XML-RPC is the overhead involved. The receiver has to parse and interpret the XML, which is much more costly than decoding most binary formats. Also, binary data must be encoded using base64, which increases its size, and it must be decoded again on the other end.
XML-RPC is much simpler and more limited than SOAP. However, the protocol is very easy to implement, and libraries that support it are much smaller than the corresponding SOAP libraries. In addition, the format is easier for programmers to understand. The other benefit of its simplicity is that it uses less bandwidth than SOAP, so it should be faster for users who are behind modems or other slow connections.
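The size cost of base64 is easy to measure: every 3 bytes of input become 4 characters of output. The payload below is arbitrary data chosen for illustration.

```python
import base64

payload = bytes(range(256)) * 3        # 768 bytes of arbitrary binary data
encoded = base64.b64encode(payload)    # text form carried in a <base64> value
decoded = base64.b64decode(encoded)    # extra step the receiver must perform

overhead = len(encoded) / len(payload)
print(len(payload), len(encoded), round(overhead, 2))
```

This roughly one-third increase comes on top of the XML markup that surrounds each value.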
Example Call
<?xml version="1.0"?>
<methodCall>
  <methodName>examples.getStateName</methodName>
  <params>
    <param>
      <value><i4>41</i4></value>
    </param>
  </params>
</methodCall>
Example Response
<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value><string>South Dakota</string></value>
    </param>
  </params>
</methodResponse>
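To show how little code a client library needs, here is a minimal sketch using Python's standard xmlrpc.client module to build the example call and decode the example response shown above. Nothing is sent over the network, because no server address is given here. The response string is the example response written without line breaks.

```python
import xmlrpc.client

# Encode the call. dumps() writes <int>, which XML-RPC treats as
# equivalent to the <i4> tag used in the example call.
request_xml = xmlrpc.client.dumps((41,), methodname="examples.getStateName")

# Decode the example response.
response_xml = (
    '<?xml version="1.0"?>'
    "<methodResponse><params><param>"
    "<value><string>South Dakota</string></value>"
    "</param></params></methodResponse>"
)
params, method = xmlrpc.client.loads(response_xml)
state_name = params[0]   # a response carries no method name, so method is None

# Decoding the request again recovers the method name and its arguments.
call_params, call_method = xmlrpc.client.loads(request_xml)

print(request_xml)
print(state_name)
```

Against a live server, the same module's ServerProxy class would perform the HTTP POST and the encoding and decoding in one step.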
Information/Tools
- http://www.xmlrpc.com/
- http://java.sun.com/webservices/docs/1.0/tutorial/doc/JAXRPC.html
- http://java.sun.com/xml/jaxrpc/faq.html
- http://xml.apache.org/xmlrpc/
- http://weblog.masukomi.org/writings/xml-rpc_vs_soap.htm
- http://www.w3.org/2000/03/xp65435/brent.ppt
SOAP
Simple Object Access Protocol (SOAP) is a mechanism similar to XML-RPC for performing remote procedure calls. However, SOAP is currently more widely used, and toolkits are available for many languages. The primary reason it has been adopted more widely than XML-RPC is that it is much more flexible.
SOAP uses a “Call-Response” mechanism to make calls. The HTTP POST method is used to send an XML document containing the information for the procedure call. A program on the server receives the request, executes the function, and then sends the reply back. This is similar to calling objects from transaction servers. SOAP wraps the data in an envelope that has two main parts: a header and a body. Furthermore, SOAP makes extensive use of XML namespaces.
SOAP is being extended to provide richer functionality and security. Items that should be available in the near future include messages with attachments and security extensions such as digital signatures and encryption. SOAP continues to be used for creating web services and will become more flexible and more secure.
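The envelope structure can be illustrated with a short sketch that builds a request document and reads it back, using Python's standard xml.etree.ElementTree module. The envelope namespace is the one defined by SOAP 1.1. The service namespace urn:example:states, the method getStateName, and the parameter stateNumber are hypothetical names chosen to mirror the XML-RPC example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope
APP_NS = "urn:example:states"                            # hypothetical service

ET.register_namespace("soap", SOAP_ENV)

# Client side: an Envelope containing an (empty) Header and a Body.
envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")

# The procedure call is an element in the service's own namespace.
call = ET.SubElement(body, f"{{{APP_NS}}}getStateName")
ET.SubElement(call, "stateNumber").text = "41"

request_bytes = ET.tostring(envelope, encoding="utf-8")   # the HTTP POST body

# Server side: locate the Body and read the call out of it.
parsed = ET.fromstring(request_bytes)
parsed_call = parsed.find(f"{{{SOAP_ENV}}}Body")[0]
method_name = parsed_call.tag.split("}")[1]   # strip the "{namespace}" prefix
argument = parsed_call.findtext("stateNumber")

print(request_bytes.decode("utf-8"))
print(method_name, argument)
```

The namespaces are what let the envelope elements and the application's own elements coexist in one document without name clashes.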
Information/Tools
- http://www.w3c.org/2000/xp/Group/
- http://msdn.microsoft.com/library/default.asp?url=/nhp/default.asp?contentid=28000523
- http://xml.apache.org/soap/index.html
- http://www.soapware.org/
- http://www.vbxml.com/soap/
- http://www.w3.org/2002/ws/
WSDL and UDDI
Web Services Description Language (WSDL) defines the structure and available methods of a web service. Before you can use SOAP to make remote calls, you must know what methods are available and what parameters they take. In addition, WSDL can specify the port that you must connect to in order to access a particular procedure.
WSDL is normally generated automatically by a program. It is a very complicated specification, and in general you will not need to write it yourself. However, you may need to know the basic syntax so that you can adjust the files if needed.
Universal Description, Discovery, and Integration (UDDI) is similar to DNS in that it is a central repository, in this case for web service definitions. A user can search a UDDI database to find all web services that provide the particular features they are looking for.
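As a rough illustration of the basic syntax, the sketch below reads the operation names and the endpoint address out of a heavily trimmed WSDL 1.1 document. The service itself (StateService, getStateName, the example.com address) is hypothetical, and the message and binding sections that a complete WSDL file requires are omitted for brevity.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",       # WSDL 1.1
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",  # WSDL SOAP binding
}

# Hypothetical, trimmed service description.
WSDL_TEXT = """<?xml version="1.0"?>
<definitions name="StateService"
    targetNamespace="urn:example:states"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="urn:example:states">
  <portType name="StatePortType">
    <operation name="getStateName">
      <input message="tns:getStateNameRequest"/>
      <output message="tns:getStateNameResponse"/>
    </operation>
  </portType>
  <service name="StateService">
    <port name="StatePort" binding="tns:StateBinding">
      <soap:address location="http://example.com/soap/states"/>
    </port>
  </service>
</definitions>
"""

root = ET.fromstring(WSDL_TEXT)

# Which methods does the service offer?
operations = [
    op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)
]

# Where must the client connect to call them?
endpoints = [
    addr.get("location")
    for addr in root.findall("wsdl:service/wsdl:port/soap:address", NS)
]

print(operations, endpoints)
```

Even when a toolkit generates the file, the portType and service sections are the two places most likely to need inspection or adjustment by hand.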
Information/Tools
- http://www.w3.org/TR/wsdl
- http://www.oasis-open.org/cover/wsdl.html
- http://www.learnxmlws.com/tutors/wsdl/wsdl.aspx
- http://www.uddi.org/
- http://www.uddi.org/find.html
- http://uddi.ibm.com/
- http://uddi.microsoft.com/
- http://www.vbws.com/services/WeatherRetriever.asmx?op=GetWeather
Conclusion
As you have seen, there is a wide variety of programming languages used for Internet programming today. In general the functionality of these languages overlaps; however, depending on environment and needs, one may be a better choice than another. XML is now a widely used technology on the Internet and continues to grow because it provides a simple solution to many of the problems faced by today's applications.