Web and CGI Programming

The World Wide Web (WWW), or the Web, is a combination of resources and users on the Internet that uses the Hypertext Transfer Protocol (HTTP) (RFC 2616 1999) for information exchange. Since its debut in the early 1990s, coupled with the ever-expanding capability of the Internet, the Web has become an indispensable part of the daily lives of people all over the world. It is therefore important for Computer Science students to have some understanding of this technology. In this section, we shall cover the basics of HTTP and Web programming. Web programming in general includes the writing, markup and coding involved in Web development, which covers Web content, Web client and server scripting and network security. In a narrower sense, Web programming refers to creating and maintaining Web pages. The most common languages used in Web programming are HTML, XHTML, JavaScript, Perl 5 and PHP.

1. HTTP Programming Model

HTTP is a client-server protocol for applications on the Internet. It runs on top of TCP since it requires reliable transfer of files. Figure 13.10 shows the HTTP programming model.

Fig. 13.10 HTTP programming model

In the HTTP programming model, an HTTP server runs on a Web server host. It waits for requests from an HTTP client, which is usually a Web browser. At the HTTP client side, the user enters a URL (Uniform Resource Locator) of the form

http://hostname[/filename]

to send a request to an HTTP server, requesting a file. In the URL, http identifies the HTTP protocol, hostname is the host name of the HTTP server and filename is the requested file. If filename is not specified, the default file is index.html. The client first connects to the server to send the request. Upon receiving a request, the server sends the requested file back to the client. Requested files are usually Web page files written in HTML for the browser to interpret and display, but they may also be files in other formats, such as video, audio or even binary files.
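The request and reply described above can be tried on a single machine. The following is a minimal Python sketch, not part of the original text: it starts a throwaway HTTP server on the loopback address, serving a temporary directory that contains only index.html, and then fetches a URL with no filename over a plain TCP socket. The names QuietHandler, fetch and demo are made up for this illustration.

```python
import functools
import http.server
import socket
import tempfile
import threading
from pathlib import Path
from urllib.parse import urlsplit


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging in the demo


def fetch(url):
    """Send one HTTP/1.0 GET over a new TCP connection; return the raw reply."""
    parts = urlsplit(url)
    path = parts.path or "/"          # no filename in the URL: ask for the default
    request = f"GET {path} HTTP/1.0\r\nHost: {parts.hostname}\r\n\r\n"
    with socket.create_connection((parts.hostname, parts.port or 80)) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:              # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    with tempfile.TemporaryDirectory() as docroot:
        Path(docroot, "index.html").write_text("<html><body>hello</body></html>")
        handler = functools.partial(QuietHandler, directory=docroot)
        server = http.server.HTTPServer(("127.0.0.1", 0), handler)  # port 0: any free port
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            port = server.server_address[1]
            return fetch(f"http://127.0.0.1:{port}")
        finally:
            server.shutdown()
            server.server_close()


reply = demo()
status_line = reply.split(b"\r\n", 1)[0]
body = reply.split(b"\r\n\r\n", 1)[1]
print(status_line.decode())
print(body.decode())
```

The reply consists of a status line, header lines, an empty line and then the contents of index.html, even though the URL named no file.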

In HTTP, a client may issue URLs to send requests to different HTTP servers. It is neither necessary nor desirable for a client to maintain a permanent connection with a specific server. In the basic model, a client connects to a server only to send a request, the server sends its reply over the same connection, and the connection is then closed. Each request-reply exchange therefore uses a separate connection, although HTTP/1.1 allows a connection to be kept open and reused for several exchanges. Either way, HTTP is a stateless protocol, since no information is maintained across successive requests or replies. Opening a new connection for every exchange naturally causes a lot of overhead and inefficiency. To remedy the lack of state information, an HTTP server and client may use cookies, which are small pieces of data embedded in requests and replies, to provide and maintain some state information between them.
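As a rough illustration of how cookies carry state across otherwise independent requests, here is a small Python sketch using the standard http.cookies module. The cookie name visits and the function serve are hypothetical. The "server" keeps no memory between calls; the count survives only because the "client" sends back the cookie it received.

```python
from http.cookies import SimpleCookie


def serve(cookie_header):
    """Handle one stateless request; return (visit count, Set-Cookie header line)."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)            # parse the client's Cookie header value
    visits = int(cookies["visits"].value) + 1 if "visits" in cookies else 1
    reply = SimpleCookie()
    reply["visits"] = str(visits)
    return visits, reply.output()          # e.g. "Set-Cookie: visits=1"


cookie_header = ""                         # first request carries no cookie
history = []
for _ in range(3):
    visits, set_cookie = serve(cookie_header)
    history.append(visits)
    # The client stores the cookie and returns it with the next request.
    cookie_header = set_cookie.split(": ", 1)[1]

print(history, set_cookie)
```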

2. Web Pages

Web pages are files written in the HTML markup language. A Web page file specifies the layout of a Web page by a series of HTML elements for a Web browser to interpret and display. Popular Web browsers include Internet Explorer, Firefox, Google Chrome, etc. Creating a Web page amounts to creating a text file using HTML elements as building blocks. It is more clerical work than programming. For this reason, we shall not discuss how to create Web pages. Instead, we shall only use an example HTML file to illustrate the essence of Web pages. The following shows a simple Web page file in HTML.

   1. <html>
   2. <body>
   3. <h1>H1 heading: A Simple Web Page</h1>
   4. <P>This is a paragraph of text</P>
   5. <!-- this is a comment line -->
   6. <P><img src="firefox.jpg" width=16></P>
   7. <a href="http://www.eecs.wsu.edu/~cs360">link to cs360 web page</a>
      <P>
   8. <font color="red">red</font>
   9. <font color="blue">blue</font>
  10. <font color="green">green</font>
      </P>
      <!-- a table -->
  11. <table>
  12.    <tr>
  13.       <th>name</th>
  14.       <th>ID</th>
  15.    </tr>
  16.    <tr>
  17.       <th>kwang</th>
  18.       <th>12345</th>
  19.    </tr>
  20. </table>
      <!-- a FORM -->
  21. <FORM>
  22. Enter command: <INPUT NAME="command"><P>
  23. Submit command: <INPUT TYPE="submit" VALUE="Click to Submit">
  24. </FORM>
  25. </body>
  26. </html>

Explanations of HTML File Contents

An HTML file comprises HTML elements. Most HTML elements are specified by a matched pair of open and close tags.

<tag>contents</tag>

In fact, an HTML file itself may be regarded as an HTML element specified by a matched pair of <html> tags.

<html>HTML file</html>

Lines 1 to 26 specify an HTML file. An HTML file includes a body specified by a matched pair of <body> tags

<body>body of HTML file</body>

Lines 2 to 25 specify the body of the HTML file.

An HTML file may use the tags <H1> to <H6> to display headings of different font sizes.

Line 3 specifies an <H1> heading.

Each matched pair of <P> tags specifies a paragraph, which is displayed on a new line.

Line 4 specifies a paragraph of text.

Line 5 specifies a comment line, which will be ignored by the browser.

Line 6 specifies an image file, which will be displayed 16 pixels wide, as set by the width attribute.

Line 7 specifies a link element

<a HREF="link_URL">link</a>

in which the HREF attribute specifies a link_URL and the element's content is a text string describing the link. The browser usually displays link text in blue. If the user clicks on a link, the browser directs the request to the Web server identified by the link_URL. This is perhaps the most powerful feature of Web pages. It allows the user to navigate anywhere on the Web by following the links.

Lines 8 to 10 use <font> elements to display texts in different colors. The <font> element can also specify text in different font sizes and styles.

Lines 11 to 20 specify a table with <tr> as rows, and <th> as columns in each row.

Lines 21 to 24 specify a form for collecting user inputs and submitting them to a Web server for processing. We shall explain and demonstrate HTML forms in more detail in the next section on CGI programming.

Figure 13.11 shows the Web page of the above HTML file.
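The idea of matched open and close tags can be checked mechanically. The sketch below is an illustration in Python using the standard html.parser module, not a full HTML validator. The class TagChecker, the function check and the short VOID list of elements written without a close tag (such as img and input) are assumptions of this example.

```python
from html.parser import HTMLParser

# Elements written without a close tag (a partial list, assumed for the demo).
VOID = {"img", "input", "br", "hr", "meta", "link"}


class TagChecker(HTMLParser):
    """Track open tags on a stack and record close tags that do not match."""

    def __init__(self):
        super().__init__()
        self.stack = []        # tags opened but not yet closed
        self.unmatched = []    # close tags that did not match the innermost open tag

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.unmatched.append(tag)


def check(html_text):
    """Return (tags left open, unmatched close tags); tag names are lowercased."""
    checker = TagChecker()
    checker.feed(html_text)
    checker.close()
    return checker.stack, checker.unmatched


good = "<html><body><h1>T</h1><P>text</P><img src='a.jpg' width=16></body></html>"
bad = "<P><font color='red'>red</P>"
print(check(good))
print(check(bad))
```

Browsers are far more forgiving than this check: they silently repair many unbalanced pages, which is one reason malformed HTML often goes unnoticed.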

3. Hosting Web Pages

Now that we have an HTML file, it must be placed on a Web server. When a Web client requests the HTML file by a URL, the Web server must be able to locate the file and send it back to the client for display. There are several ways to host Web pages.

(1). Sign up with a commercial Web hosting service provider for a monthly fee. For most casual users, this may not be an option at all.

(2). User account on an institutional or departmental server. If the reader has a user account on a server machine running Linux, it’s very easy to create a private website in the user’s home directory by the following steps:

. log in to the user account on the server machine.

. in the user’s home directory, create a public_html directory with permissions 0755.

. in the public_html directory, create an index.html file and other HTML files.

As an example, from a Web browser on the Internet, entering the URL http://cs360.eecs.wsu.edu/~kcw will access the author’s website on the server machine cs360.eecs.wsu.edu.
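The directory setup in the steps above can also be scripted. The following Python sketch is only an illustration: it works in a temporary directory standing in for a real home directory, and the function name create_site, the page contents and the 0644 file mode are assumptions not taken from the text.

```python
import stat
import tempfile
from pathlib import Path


def create_site(home):
    """Create home/public_html (mode 0755) containing a starter index.html."""
    site = Path(home) / "public_html"
    site.mkdir(exist_ok=True)
    site.chmod(0o755)          # set explicitly, since mkdir's mode is masked by umask
    index = site / "index.html"
    index.write_text("<html><body><h1>My home page</h1></body></html>\n")
    index.chmod(0o644)         # assumption: page files readable by everyone
    return site


with tempfile.TemporaryDirectory() as home:    # stand-in for a real home directory
    site = create_site(home)
    site_mode = stat.S_IMODE(site.stat().st_mode)
    files = sorted(p.name for p in site.iterdir())

print(oct(site_mode), files)
```

Mode 0755 lets the Web server process, which usually runs under a different user ID, enter the directory and read the files in it.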

(3). Standalone PC or laptop: The steps described here are for standalone PCs or laptops running standard Linux, but they should be applicable to other Unix platforms as well. For some reason, Ubuntu Linux chooses to do things differently, deviating from the standard setups of Linux. Ubuntu users may consult the HTTPD-Apache2 Web Server page of the Official Ubuntu Documentation for details.

4. Configure HTTPD for Web Pages

(3).1. Download and install the Apache Web server. Most Linux distributions, e.g. Slackware Linux 14.2, come with the Apache Web server installed, which is known as HTTPD.

(3).2. Enter ps -x | grep httpd to see whether httpd is running. If not, enter

sudo chmod +x /etc/rc.d/rc.httpd

to make the rc.httpd file executable. This starts httpd at the next boot. Alternatively, httpd can be started manually by entering

sudo /usr/sbin/httpd -k start

(3).3. Configure httpd.conf file: Operations of the HTTPD server are governed by the httpd.conf file in the /etc/httpd/ directory. To allow individual user websites, edit the httpd.conf file as follows.

. Uncomment these lines if they are commented out

LoadModule dir_module MODULE_PATH

Include /etc/httpd/extra/httpd-userdir.conf

. In the first Directory block

<Directory />

# deny requests for all files in /

Require all denied

</Directory>

change the line Require all denied to Require all granted.

. All user home directories are in the /home directory. Change the line

DocumentRoot /srv/httpd/htdocs to DocumentRoot /home

. The default directory for all HTML files is htdocs. Change the line

<Directory /srv/httpd/htdocs> to <Directory /home>

After editing the httpd.conf file, restart the httpd server or enter the commands

ps -x | grep httpd # to see httpd PID

sudo kill -s 1 httpdPID

The kill command sends signal number 1 (SIGHUP) to httpd, causing it to read the updated httpd.conf file without restarting the httpd server.
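The reload-on-signal convention is easy to reproduce in a small program. The Python sketch below is a toy, not Apache's implementation: it installs a handler for SIGHUP that re-reads a one-line configuration file, rewrites the file, and then signals itself in the same way kill -s 1 would. The names load_config, on_hangup and demo, and the "Name value" file format, are assumptions of this example.

```python
import os
import signal
import tempfile
import time

config = {}
reloads = 0


def load_config(path):
    """Parse 'Name value' lines into the global config dictionary."""
    config.clear()
    with open(path) as conf:
        for line in conf:
            line = line.strip()
            if line and not line.startswith("#"):
                name, value = line.split(None, 1)
                config[name] = value


def demo():
    with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as conf:
        conf.write("DocumentRoot /srv/httpd/htdocs\n")
        path = conf.name

    def on_hangup(signum, frame):
        global reloads
        load_config(path)                      # re-read the file, keep running
        reloads += 1

    load_config(path)                          # initial load at startup
    previous = signal.signal(signal.SIGHUP, on_hangup)
    try:
        with open(path, "w") as conf:          # the administrator edits the file
            conf.write("DocumentRoot /home\n")
        os.kill(os.getpid(), signal.SIGHUP)    # same effect as: kill -s 1 PID
        deadline = time.monotonic() + 2.0
        while reloads == 0 and time.monotonic() < deadline:
            time.sleep(0.01)                   # wait for the handler to run
    finally:
        signal.signal(signal.SIGHUP, previous)
        os.unlink(path)
    return dict(config)


result = demo()
print(result, reloads)
```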

(3).4. Create a user account by adduser user_name. Log in to the user account by

ssh user_name@localhost

Create public_html directory and HTML files as before.

Then open a Web browser and enter http://localhost/~user_name to access the user’s Web pages.

5. Dynamic Web Pages

Web pages written in standard HTML are all static. When fetched from a server and displayed by a browser, the Web page contents do not change. To display a Web page with different contents, a different Web page file must be fetched from the server. Dynamic Web pages are those whose contents can vary. There are two kinds of dynamic Web pages, known as client-side and server-side dynamic Web pages, respectively. Client-side dynamic Web page files contain code written in JavaScript, which is executed by a JavaScript interpreter on the client machine. The code can respond to user inputs, timer events, etc. to modify the Web page locally without any interaction with the server. Server-side dynamic Web pages are truly dynamic in the sense that they are generated dynamically in accordance with user inputs in the URL request. The heart of server-side dynamic Web pages lies in the server’s ability to execute either PHP code inside HTML files or CGI programs to generate HTML files from user inputs.

6. PHP

PHP (Hypertext Preprocessor) (PHP 2017) is a script language for creating server-side dynamic Web pages. PHP files are identified by the .php suffix. They are essentially HTML files containing PHP code for the Web server to execute. When a Web client requests a PHP file, the Web server processes the PHP statements first to generate an HTML file, which is sent to the requesting client. All Linux systems running the Apache HTTPD server support PHP, but it may have to be enabled. To enable PHP, only a few modifications to the httpd.conf file are needed, which are shown below.

After enabling PHP in httpd.conf, restart the httpd server, which will load the PHP module into the httpd server's process image (not into the Linux kernel). When a Web client requests a .php file, the httpd server will fork a child process to execute the PHP statements in the .php file. Since the child process has the PHP module loaded in its image, it can execute the PHP code fast and efficiently. Alternatively, the httpd server can also be configured to execute PHP as CGI, which is slower since it must use fork-exec to invoke the PHP interpreter. For better efficiency, we assume that .php files are handled by the PHP module. In the following, we shall show basic PHP programming by examples.
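The efficiency argument can be pictured with ordinary process calls. The sketch below is a loose Python analogy and not Apache's actual code: the child created by os.fork() already holds the interpreter in its process image, much as an httpd child holds the PHP module, whereas the CGI route would add an exec of a separate interpreter. The function name handle_in_child and the reply text are made up for illustration.

```python
import os


def handle_in_child(request):
    """Fork a child to produce the reply; return (child pid, reply, exit status)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:                               # child: interpreter already loaded
        os.close(read_fd)
        reply = f"pid {os.getpid()} handled {request}"
        os.write(write_fd, reply.encode())
        os._exit(0)                            # leave without running parent cleanup
    os.close(write_fd)                         # parent: read what the child wrote
    with os.fdopen(read_fd, "rb") as pipe:
        reply = pipe.read().decode()
    _, status = os.waitpid(pid, 0)             # reap the child
    return pid, reply, os.WEXITSTATUS(status)


child_pid, reply, exit_code = handle_in_child("p1.php")
print(reply, exit_code)
```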

(1). PHP statements in HTML files

In a .php file, PHP statements are included inside a pair of PHP tags

<?php

// PHP statements

?>

The following shows a simple PHP file, p1.php.

<html>
<body>
<?php
echo "hello world<br>";      // hello world<br>
print "see you later<br>";   // see you later<br>
?>
</body>
</html>

Similar to C programs, each PHP statement must end with a semicolon. It may include comment blocks in matched pairs of /* and */, or use //, # for single comment lines. For outputs, PHP may use either echo or print. In an echo or print statement, multiple items must be separated by the dot (string concatenation) operator, not by white spaces, as in

echo "hello world<br>" . "see you later<br>";

When a Web client requests the p1.php file, the httpd server’s PHP preprocessor will execute the PHP statements first to generate HTML lines (shown at the right hand side of PHP lines). It then sends the resulting HTML file to the client.

(2). PHP Variables:

In PHP, variables begin with the $ sign, followed by the variable name. PHP variable values may be strings, integers or floating-point numbers. Unlike C, PHP is a loosely typed language. Users do not need to define variables with types. Like C, PHP allows typecasts to change variable types. For the most part, PHP can also convert variables to different types automatically.

<?php
$PID = getmypid();       // return an integer
echo "pid = $PID <br>";  // pid = php Process PID
$STR = "hello world!";   // a string
$A = 123; $B = "456";    // integer 123, string "456"
$C = $A + $B;            // type conversion by PHP
echo "$STR Sum=$C<br>";  // hello world! Sum=579<br>
?>

Like variables in C or sh scripts, PHP variables may be local, global or static.

(3). PHP Operators

In PHP, variables and values may be operated on by the following operators.

Arithmetic operators

Assignment operators

Comparison operators

Increment/Decrement operators

Logical operators

String operators

Array operators

Most PHP operators are similar to those in C. We only show some special string and array operators in PHP.

(3).1. String operations: Most string operations, e.g. strlen(), strcmp(), etc. are the same as in C. PHP also supports many other string operations, often in a slightly different syntax. For example, instead of strcat(), PHP uses the dot operator for string concatenation, as in "string1" . "string2"

(3).2. PHP Arrays: PHP arrays are defined by the array() keyword. PHP supports both indexed arrays and multi-dimensional arrays. Indexed arrays can be stepped through by an array index, as in

<?php
$name = array("name0", "name1", "name2", "name3");
$value = array(1,2,3,4);   // array of values
$n = count($name);         // number of array elements
for ($i=0; $i<$n; $i++){   // print arrays by index
    echo $name[$i]; echo " = ";
    echo $value[$i];
}
?>

In addition, PHP arrays can be operated on as sets by operators, such as union (+) and comparisons, or as lists, which can be sorted in different orders, etc.

Associative Arrays: Associative arrays consist of name-value pairs.

$A = array("name0"=>1, "name1"=>2, "name2"=>3, "name3"=>4);

An associative array allows accessing an element value by name, rather than by index, as in

echo "value of name1 = " . $A['name1'];

(4). PHP Conditional Statements: PHP supports condition testing by if, if-else, if-elseif-else and switch-case statements, which work the same way as in C, with slight differences in syntax.

<?php
if (123 < 456){          // test a condition
    echo "true<br>";     // in matched pair of { }
} else {
    echo "not true<br>"; // in matched pair of { }
}
?>

(5). PHP Loop Statements: PHP supports while, do-while, for loop statements, which are the same as in C. The foreach statement may be used to step through an array without an explicit index variable.

<?php
$A = array(1,2,3,4);
for ($i=0; $i<4; $i++){     // use an index variable
    echo "A[$i] = $A[$i]<br>";
}
foreach ($A as $value){     // step through array elements
    echo "$value<br>";
}
?>

(6). PHP Functions: In PHP, functions are defined using the function keyword. Their formats and usage are similar to functions in C.

<?php
function nameValue($name, $value) {
    echo $name . " has value " . $value . "<br>";
}
nameValue("abc", 123); // call function with 2 parameters
nameValue("xyz", 456);
?>

(7). PHP Date and Time Functions: PHP has many built-in functions, such as date() and time().

<?php
echo date("y-m-d");  // date in year-month-day format
echo date("h:i:sa"); // time in hh:mm:ss format, followed by am or pm
?>

(8). File Operations in PHP: One of the great strengths of PHP is its integrated support for file operations. File operations in PHP include the functionality of both system calls, e.g. mkdir(), link(), unlink(), stat(), etc. and standard library I/O functions of C, e.g. fopen(), fread(), fwrite() and fclose(), etc. The syntax of these functions may differ from that of the I/O library functions in C. Most functions do not need a specific buffer for data since they either take string parameters or return strings directly. As usual, for write operations, the Apache process must have write permissions to the user directory. The reader may consult PHP file operation manuals for details. The following PHP code segments show how to display the contents of a file and copy files by fopen(), fread() and fwrite().

(9). Forms in PHP: In PHP, forms and form submission are identical to those in HTML. Form processing is by a PHP file containing PHP code, which is executed by the PHP preprocessor. The PHP code can get inputs from the submitted form and handle them in the usual way. We illustrate form processing in PHP by an example.

(9).1. A form.php file: This .php file displays a form, collects user inputs and submits the form with METHOD="post" and ACTION="action.php" to the httpd server. Figure 13.12 shows the Web page of the form.php file. When the user clicks on Submit, it sends the form inputs to the HTTPD server for processing.

<!-- form.php file -->
<html><body>
<H1>Submit a Form</H1>
<form METHOD="post" ACTION="action.php">
command:  <input type="text" name="command"><br>
filename: <input type="text" name="filename"><br>
parameter:<input type="text" name="parameter"><br>
<input type="submit">
</form>
</body></html>

(9).2. An action.php file: The action.php file contains PHP code to process user inputs. Form inputs are extracted from the global $_POST associative array by keywords. For simplicity, we only echo the user submitted input name-value pairs. Figure 13.13 shows the returned Web page of action.php. As the figure shows, it was executed by an Apache process with PID=30256 at the server side.

<!-- action.php file -->
<html><body>
<?php
echo "process PID = " . getmypid() . "<br>";
echo "user_name = " . get_current_user() . "<br>";
$command = $_POST["command"];
$filename = $_POST["filename"];
$parameter = $_POST["parameter"];
echo "you submitted the following name-value pairs<br>";
echo "command = " . $command . "<br>";
echo "filename = " . $filename . "<br>";
echo "parameter = " . $parameter . "<br>";
?>
</body></html>
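The name-value pairs that action.php reads from $_POST travel in the body of the POST request. A form submitted without an explicit encoding type uses the URL-encoded format, which the following Python sketch reproduces with the standard urllib.parse module. The sample input values are made up, and the sketch only illustrates the wire format, not PHP's own parsing code.

```python
from urllib.parse import parse_qs, urlencode

# What the user typed into the three text fields (hypothetical values).
form_inputs = {"command": "ls", "filename": "/tmp/my file", "parameter": "-l"}

# Browser side: encode the fields as name=value pairs joined by "&".
body = urlencode(form_inputs)

# Server side: decode the body back into name-value pairs.
received = {name: values[0] for name, values in parse_qs(body).items()}

print(body)
print(received)
```

Characters such as "/" and the space are escaped in transit and restored on the server, so the program sees the inputs exactly as the user typed them.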

Summary on PHP

PHP is a versatile script language for developing applications on the Internet. From a technical point of view, PHP may have nothing new, but it represents the evolution of several decades of efforts in Web programming. PHP is not a single language but an integration of many other languages. It includes features of many earlier script languages, such as sh and Perl. It includes most of the standard features and functions of C, and it also provides the file operations of the standard I/O library of C. In practice, PHP is often used as the front-end of a Web site, which interacts with a database engine at the back-end for storing and retrieving data online through dynamic Web pages. Interfacing PHP with MySQL databases is covered in Chap. 14.

7. CGI Programming

CGI stands for Common Gateway Interface (RFC 3875 2004). It is a protocol which allows a Web server to execute programs to generate Web pages dynamically in accordance with user inputs. With CGI, a Web server does not have to maintain millions of static Web page files to satisfy client requests. Instead, it satisfies client requests by generating Web pages dynamically. Figure 13.14 shows the CGI programming model.

In the CGI programming model, a client sends a request, which is typically an HTML form containing both inputs and the name of a CGI program for the server to execute. Upon receiving the request, the httpd server forks a child process to execute the CGI program. The CGI program may use user inputs to query a database system, such as MySQL, to generate an HTML file based on user inputs. When the child process finishes, the httpd server sends the resulting HTML file back to the client. CGI programs can be written in any programming language, such as C, sh scripts and Perl.
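The model above can be imitated in a few lines. In the Python sketch below, which is an illustration rather than how httpd is implemented, a parent process plays the server: it passes the user input to a child program through the QUERY_STRING environment variable defined by the CGI specification, runs the child, and collects the page from the child's standard output. The CGI program itself, the function run_cgi and the input name are assumptions of this example.

```python
import os
import subprocess
import sys

# A tiny CGI program: reads its input from the environment, writes a header
# line, an empty line, and then the generated HTML to standard output.
CGI_PROGRAM = r'''
import html
import os
from urllib.parse import parse_qs

inputs = parse_qs(os.environ.get("QUERY_STRING", ""))
name = html.escape(inputs.get("name", ["world"])[0])
print("Content-Type: text/html")
print()
print(f"<html><body><h1>hello {name}</h1></body></html>")
'''


def run_cgi(query_string):
    """Play the server: run the CGI program in a child and split its output."""
    env = dict(os.environ, REQUEST_METHOD="GET", QUERY_STRING=query_string)
    result = subprocess.run(
        [sys.executable, "-c", CGI_PROGRAM],   # child process runs the program
        env=env, capture_output=True, text=True, check=True,
    )
    headers, _, page = result.stdout.partition("\n\n")
    return headers, page.strip()


headers, page = run_cgi("name=kwang")
print(headers)
print(page)
```

Because the interface is only environment variables, standard input and standard output, the child could equally be a C program or an sh script.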

8. Configure HTTPD for CGI

In HTTPD, the default directory of CGI programs is /srv/httpd/cgi-bin. This allows the network administrator to control and monitor which users are allowed to execute CGI programs. In many institutions, user-level CGI programs are usually disabled for security reasons. In order to allow user-level CGI programming, the httpd server must be configured to enable user-level CGI. Edit the /etc/httpd/httpd.conf file and change the CGI directory settings to

<Directory "/home/*/public_html/cgi-bin">

Options +ExecCGI

AddHandler cgi-script .cgi .sh .bin .pl

Order allow,deny

Allow from all

</Directory>

The modified CGI Directory block sets the CGI directory to public_html/cgi-bin/ in the user home directory. The cgi-script setting specifies files with suffixes .cgi, .sh, .bin and .pl (for Perl scripts) as executable CGI programs.

Source: Wang K.C. (2018), Systems Programming in Unix/Linux, Springer; 1st ed. 2018 edition.
