Servers & Scripting
Web Server as Hardened File Server: A web server gets its name because it 'serves' web pages. The word 'server' comes from file server, which in networking terms is a computer dedicated to storing files used by a group and giving people access to those files. The files are available to all, and can be protected and backed up from one (hopefully) safe place. A web server is therefore a specialized server. In fact, it is frequently hardened (protected), since a web server can be attacked from anywhere in the world: Web Server Security FAQ
The 'web world' can be a pretty cold place: Defacement Archive Gets Defaced
When we use the word 'server' in class, we'll likely mean 'web server' from now on.
3 Contexts For Client: Wherever there is a server, there is likely a client. When you read the word client, it may mean one of three overlapping contexts: the user, the user's computer or the user's browser. The user's browser is probably the most important context. For us, the client will always have a local connotation. The server we'll consider remote.
Browser As User Agent: The browser is important because it is the program that interacts with our web pages. Just as programs handle files in different ways, browsers have differences. How a user agent processes an HTML file, an image, a CSS file or a JavaScript file can vary considerably. Once the related files are delivered to the browser from the server, there is usually no active connection between the server & client. The browser is left to create the page from the related files on its own.
A browser is one of a larger group of web page 'clients' called user agents. There are user agents that don't even process visual information: Braille Browser
Web Page: Whenever a person clicks a link to a web page, the browser sends a page request through their Internet Service Provider (ISP), which queries its resources to find an address that matches that request. The web address could look like this:
http://www.example.com/
The ISP will send this to a special server called a Domain Name System (DNS) server, which has software to translate the human friendly "example.com" to a number, as all internet addresses are really numbers. Used in place of the name, the numeric address looks like this:
http://198.15.34.56/
Your request bounces through the internet on special devices called routers and switches (both specialized computers) until it reaches the server that is hosting the page you requested. When it reaches the server, your page is generated and then bounced back to you, again through the internet, until the string of ones and zeros is recombined into the web page you can see.
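The parts of a web address described above can be pulled apart in a few lines. This is only a sketch in Python (not the language used in class), and the function names describe_url and build_request are made up for illustration. It shows which part of the address needs a DNS lookup and roughly what the text of a page request looks like; it does not contact any server.

```python
from ipaddress import ip_address
from urllib.parse import urlsplit


def describe_url(url):
    """Split a web address into the parts a browser works with."""
    parts = urlsplit(url)
    host = parts.hostname
    try:
        ip_address(host)  # succeeds only for a numeric address
        host_kind = "numeric address"
    except ValueError:
        host_kind = "domain name (needs a DNS lookup)"
    return {
        "scheme": parts.scheme,
        "host": host,
        "host_kind": host_kind,
        "path": parts.path or "/",
    }


def build_request(url):
    """Build the text of a minimal HTTP/1.1 GET request for the URL."""
    info = describe_url(url)
    return (
        f"GET {info['path']} HTTP/1.1\r\n"
        f"Host: {info['host']}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    print(describe_url("http://www.example.com/"))
    print(describe_url("http://198.15.34.56/"))
    print(build_request("http://www.example.com/"))
```

Note that the request asks for the path "/" and names the host separately, which is why the server still has to decide which file to send.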
Index Files: Note the above web address ends with a 'slash', and not a file name. If no specific HTML or other file is identified, the server needs to decide what to do. Most servers are configured to search for an appropriate index file, for example, index.html or index.php. Having one of each (an .html and a .php file) can create a conflict, and the server's configuration, not the developer, will determine which file to show, the PHP or the HTML file! It's usually not a good idea to have two competing index files in a folder.
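How a server might choose between competing index files can be sketched as a simple ordered search. This is a Python illustration, not how any particular server is written; the function name pick_index_file and the default order are assumptions. On Apache the real order comes from the DirectoryIndex setting in the server's configuration.

```python
def pick_index_file(folder_files, index_order=("index.html", "index.php")):
    """Return the first configured index file present in the folder, or None."""
    present = set(folder_files)
    for name in index_order:  # earlier names win over later ones
        if name in present:
            return name
    return None


if __name__ == "__main__":
    files = ["index.php", "index.html", "logo.png"]
    print(pick_index_file(files))
    print(pick_index_file(files, index_order=("index.php", "index.html")))
```

The same folder gives a different answer when the configured order changes, which is the conflict described above.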
Directory Browsing: If we don't have an index file in a folder, some servers will allow directory browsing, which means the server shows a list of all files in the folder. This is a security risk. Later we'll learn to turn off directory browsing with an Apache configuration file called .htaccess, but for now, we can place an index.php file in every folder. Yes, even your images folder!
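Checking every folder by hand is easy to get wrong, so a small script can report the folders that still lack an index file. This is a sketch in Python; the function name folders_missing_index and the list of index file names are assumptions to adjust for your own site.

```python
import os
import tempfile

INDEX_NAMES = ("index.php", "index.html")  # assumed index file names


def folders_missing_index(root):
    """Return folders under root (root included) that contain no index file."""
    missing = []
    for folder, _subfolders, files in os.walk(root):
        if not any(name in files for name in INDEX_NAMES):
            missing.append(folder)
    return sorted(missing)


if __name__ == "__main__":
    # Build a throwaway site with an unprotected images folder.
    with tempfile.TemporaryDirectory() as site:
        os.makedirs(os.path.join(site, "images"))
        open(os.path.join(site, "index.php"), "w").close()
        print(folders_missing_index(site))
```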
Page vs File: When we see a 'web page', is it ever really a single page? Almost all 'pages' we see include images or CSS files. Aren't these part of our 'web page'? Once the HTML file is delivered to our browser, the browser requests all contingent (dependent) CSS and image files as well. As we move forward, the distinction between 'page' and 'file' will be important. A page is a visual representation created by the interaction of separate files.
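To see how many files stand behind one 'page', we can list the dependent files named in an HTML file. The sketch below uses Python's standard HTML parser; the class name ResourceFinder, the function dependent_files and the sample page are all invented for illustration, and a real browser looks at more tags than these three.

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collect the files a page depends on: images, style sheets, scripts."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(attrs["href"])


def dependent_files(html):
    """Return the dependent files in the order they appear in the HTML."""
    finder = ResourceFinder()
    finder.feed(html)
    return finder.resources


PAGE = """<html><head>
<link rel="stylesheet" href="styles.css">
<script src="menu.js"></script>
</head><body>
<img src="images/logo.png" alt="Logo">
</body></html>"""

if __name__ == "__main__":
    print(dependent_files(PAGE))
```

Here one HTML file leads to three further requests, so this 'page' is built from four files.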
Operating Systems: Most web servers are running operating systems based on Unix, an operating system created in the early 1970s. It is noted for being stable, quick and obscure. The source code of the operating system has been available for many years, which has led to many "versions" (flavors) of Unix being developed.
UNIX is "command line" based, meaning that it is built to run without any graphical elements whatsoever. Companies that build their own versions of Unix (flavors) sometimes build a graphical component as well (a GUI, or Window environment). Linux is a popular variation of a Unix based operating system.
Microsoft technologies such as ASP (Active Server Pages) and its successor, ASP.NET, are usually hosted on Windows servers. Windows servers are primarily visually based, and integrate well with Microsoft networks, making ASP.NET the primary choice for a customer that uses many Microsoft products internally.
Web Server Software: Just as there is a program on the client machine (the browser) there is a program on the server that handles the web page requests. The web server software used can vary, based on the operating system of the server. Two of the most common operating systems used for web servers are UNIX and Windows. For UNIX based systems, the web server software is usually Apache, a name often said to be a bad pun on 'a patchy' program, built of several smaller programs.
Microsoft has its own web server software, called Internet Information Services (IIS). IIS has been included in one version or another of Microsoft Server software since NT 4.0 (1996). The advantages of using Microsoft server products include GUI (Windows) configuration, ease of deployment and industry giant support.
Dynamic Web Pages: Web pages written in pure HTML are considered static pages, since they are unable to change according to user input. Web pages can be made more dynamic (flexible and changeable) using one of two approaches: client side or server side programming. The differences between them are significant to web developers.
Client Side Scripting: Programs can be designed to run independent of the server. In this case, the program code is built into the web page, and is run by the user's browser. Since the user is the client of the server, this is called client side scripting. The advantage of this process is that the server is not tied up unnecessarily running scripts for the user. The disadvantage is that the user can view and alter the code and any data sent to the browser, making it insecure. JavaScript is the client side scripting language of choice, as it is the language most widely supported by browsers.
Server Side Scripting: Programs that run in the course of serving up web pages, and are housed on the server, use server side scripting. It has the advantage of being potentially more secure than client side scripting, and can be used to access data stores such as text files and databases.
Server side scripting originally was handled through the Common Gateway Interface (CGI). Many server side languages still use this means, including Perl. However, CGI's limited ability to handle a large volume of traffic led to specialized server programs, called pre-processors, which intercept a request for a web page and pass the request to a more capable server side program.
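The idea behind CGI can be sketched briefly: the web server starts a program, hands it details of the request as environment variables, and sends whatever the program prints back to the browser. The Python sketch below assumes a query string such as name=Ada; the function name cgi_response and the "name" parameter are invented for illustration.

```python
import os
from html import escape
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build a CGI-style response from the environment the server provides."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["visitor"])[0]
    # Escape user input so it is shown as text rather than run as HTML.
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>"
    # A CGI program writes headers, then a blank line, then the page body.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    print(cgi_response(os.environ), end="")
```

Because the server starts a fresh program for each request, this approach gets expensive as traffic grows.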
Server-side scripting allows us to provide dynamic content based on user interaction and our business logic requirements. Unlike a typical HTML page, which displays static information, a page that incorporates server-side scripting can change dynamically over time, interact with databases and other data sources, and provide content and transactions to users. In our class, we will focus on server side scripting. Server side scripting languages include Perl, PHP, VBScript, and Java.
Server Side Preprocessor: Most dynamic page environments use a special program that intercepts the user's request for a page, and processes the page in advance of handing it to the server software. This "pre-processor" allows the server to serve up straightforward pages (as far as it knows), because the pre-processor handles the request and feeds the finished page to the server to create the dynamic effect. The way to see this is to look at the source code generated by a dynamic processor, like the PHP pre-processor, or the ASP pre-processor (Microsoft). The source code looks like any other static page (only uglier, potentially much uglier, since the code is dynamically produced by a machine) and is only the OUTPUT of the PHP page written by the developer! This is why we must never overwrite a dynamic page with its static output.
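A toy pre-processor makes the difference between the developer's page and its output easier to see. This Python sketch is not how PHP works internally, and the {{ name }} placeholder syntax, TEMPLATE and preprocess are invented for illustration. The point is that the output holds only plain HTML, with no trace of the instructions that produced it.

```python
import re

# The developer's page: HTML mixed with placeholders (hypothetical syntax).
TEMPLATE = "<p>Hello, {{ name }}! You have {{ count }} new messages.</p>"


def preprocess(template, values):
    """Replace each {{ key }} placeholder with its value, leaving plain HTML."""

    def substitute(match):
        return str(values[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


if __name__ == "__main__":
    print(preprocess(TEMPLATE, {"name": "Ada", "count": 3}))
```

Saving the printed output over TEMPLATE would freeze the page at one name and one count, losing the dynamic source.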
Web Databases: Data is stored and accessed over the web by a Database Management System (DBMS) designed to facilitate access to the data. The abilities and limitations of the DBMS are a major concern to the developer, who must limit the number and duration of "hits" to the database in order to allow it to serve many users.
A DBMS allows a developer to create queries to access the data, using its own variant of a universal database language called SQL, which stands for Structured Query Language. Web database systems in use today include Oracle, SQL Server, Microsoft Access and MySQL.
The Server Side Environment: The developer will usually need to choose between programming environments that include compatible elements. It is not usually recommended to mix environments to a great degree, as there are many potential pitfalls. The most common web development environments are PHP, ASP, JSP, ColdFusion and the .NET environment. Below are examples of the environments a developer may choose:
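A query against a database can be sketched without installing a database server. The example below uses Python's built-in SQLite module as a stand-in for MySQL; the customers table, its columns and the function names are hypothetical. The SQL itself would look much the same on the other systems named above.

```python
import sqlite3


def demo_connection():
    """Create a small in-memory database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("Ada", "Seattle"), ("Grace", "Tacoma"), ("Linus", "Seattle")],
    )
    return conn


def find_customers(conn, city):
    """Return customer names in a city using one parameterized query."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(find_customers(demo_connection(), "Seattle"))
```

The ? placeholder keeps user input separate from the SQL text, and fetching everything in one query keeps the number of database "hits" down.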
| Operating System | Server Software | Pre-processor | Scripting Language | DBMS | Ext. |
|---|---|---|---|---|---|
| Red Hat Linux 9.0 | Apache 2.0 | PHP pre-processor | PHP | MySQL | .php |
| Windows 2000 | IIS 5.0 | ASP 3.0 pre-processor | VBScript | MS Access | .asp |
| Windows 2003 | IIS 6.0 | .NET environment | C# | SQL Server | .aspx |
| Unix/Windows | J2EE/Tomcat | JSP/Servlets | Java | Oracle | .jsp |
For our purposes, we have chosen PHP with MySQL as our development environment. The advantages are that the pages will run on Linux/Unix servers which are very stable and secure, and run on Apache web server software which is fast and efficient. PHP and MySQL are both open source software, which promotes sharing code and a stable development environment.
Contrast our environment with the latest .NET environment, where development is very product dependent. As the direction of one corporation turns, so must all developers who embrace that environment. However, the advantages are ease of implementation and the development benefits of being backed by an industry giant.
Which Environment for my customer?: This is a good question to ask for every potential client. If I know a client is a Microsoft client and uses Microsoft technologies internally in their business or network, I would recommend using ASP.NET, with pages written in C#, connected to a SQL Server database. If that means the customer needs to go elsewhere for assistance, so be it. If the client is running on a UNIX server, chances are PHP/MySQL is a good bet. We can determine which environment a client is using by visiting Netcraft, clicking on "What's that site running?", and entering the customer's domain name.
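One clue that is often visible without any special tool is the Server header a web server sends with its responses, for example "Apache/2.0.52 (Red Hat)" or "Microsoft-IIS/6.0". The Python sketch below turns such a header into a rough guess; the function name guess_environment and its return labels are made up. Treat the result as a hint only, since servers can hide or change this header and Apache also runs on Windows.

```python
def guess_environment(server_header):
    """Suggest a likely environment from a Server response header value."""
    value = server_header.lower()
    if "microsoft-iis" in value:
        return "IIS (Windows)"
    if "apache" in value:
        return "Apache (often Unix or Linux)"
    return "unknown"


if __name__ == "__main__":
    print(guess_environment("Microsoft-IIS/6.0"))
    print(guess_environment("Apache/2.0.52 (Red Hat)"))
```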
What About A Content Management System?: There have been many systems designed in the technologies above that allow users to maintain their websites without needing to resort to learning programming. These are called Content Management Systems (CMS), and they vary wildly in quality and features. On the PHP/MySQL side the strongest bets are WordPress, Joomla & Drupal, in that order (from simple to complex & capable).
Frameworks: If we decide our client's needs are specific, and decide not to use a CMS, we will either be building the site ourselves (a custom site) or could speed up our development process by using a web application framework.
Using a framework can speed up our development process and make it easier for other developers to understand the code. Frameworks frequently employ the MVC (model-view-controller) architecture, which separates the design from web plumbing.
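The separation MVC aims for can be sketched with three small functions. This is a bare-bones Python illustration, not the structure of any particular framework; the product data, function names and the /products path are invented.

```python
from html import escape


# Model: owns the data (a hard-coded list stands in for a database).
def get_products():
    return [{"name": "Widget", "price": 9.5}, {"name": "Gadget", "price": 20.0}]


# View: turns data into HTML and knows nothing about where it came from.
def render_products(products):
    items = "".join(
        f"<li>{escape(p['name'])}: ${p['price']:.2f}</li>" for p in products
    )
    return f"<ul>{items}</ul>"


# Controller: receives the request, asks the model, hands the result to the view.
def handle_request(path):
    if path == "/products":
        return 200, render_products(get_products())
    return 404, "<p>Not found</p>"


if __name__ == "__main__":
    print(handle_request("/products"))
```

Because the view only receives data, a designer can change the HTML without touching how the data is fetched, and the model can move to a real database without touching the HTML.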
There are PHP frameworks such as CakePHP & CodeIgniter. In this class, however, we'll use a minimum of third party applications so we can focus on a fundamental understanding of the underlying architecture. We can build many of the capabilities of frameworks on our own.
What do I do for my customer?: Picking a development environment & a framework (or not) is a huge decision to make for a customer. Interview the customer carefully, and consider all alternatives to help them make the best choice. Never do resume-driven design. That means, do what's best for the customer, not what makes our resume 'sparkle' with cool new technologies.
Here's a handout that may help you make good decisions for your 'web' customers: Server Development Decisions