Basic Page Setup
Adding Files to Your Web Site
Learning HTML
Converting Documents to HTML
Converting HTML to Documents
Disclaimer
Traffic Logs for Pages
Server Side Includes
CGI Scripts
File Permissions
Mail
UMN Search Appliance
Basic Page Setup
New accounts are created with a basic index page and a link to a web directory where you can quickly put PDF, TeX and other files.
Type "homepage" at any shell prompt to install the latest stock homepage.
This will create the proper links, and a basic home page (index.shtml).
Below are the directories (pub, docs), symbolic link (html) and file (index.shtml) created by the homepage script.
Web addresses are listed relative to the web server http://www.math.umn.edu/.
The tilde (~) character is shorthand for an account's home directory.
File Location              Web Addresses
~/pub                      N/A, reserved for non-HTTP shares
~/pub/html                 /~johndoe or /~johndoe/
~/pub/html/index.shtml     Same as ~/pub/html, and also /~johndoe/index.shtml
~/pub/html/docs            /~johndoe/docs, /~johndoe/docs/ or "docs" from index.shtml
The docs directory is a place to copy documents you want to share with the internet.
[Screenshots: a basic index page; the docs directory with one file added]
The index.shtml can be edited with a normal text editor (emacs, pico, vi, etc.) though proficiency will take a while.
You can also run make_basic_index to see the basic page without overwriting an existing index page.
make_basic_index > ~/pub/html/basic_index.shtml
Adding Files to Your Web Site
The quickest way to add content to your website is to put PDF versions of papers into a directory with no index page.
By default, web servers usually generate a file listing for any directory that has no index.html or index.shtml file.
The docs directory has no index page, so the web server lists every file in it.
The stock index.shtml links to docs; without that link, search engines would not know to index the files inside.
From a unix shell (on Linux or Mac OS X) copy the files with the cp command.
cp preprint.pdf ~/pub/html/docs/preprint.pdf
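When several papers need to be published at once, the single cp above can be wrapped in a loop. The following is a minimal sketch, not part of the stock tools: the function name publish_pdfs and the ~/papers source directory are made up for illustration. It copies every PDF from one directory into the docs directory and makes each copy world-readable.

```shell
# publish_pdfs SRC_DIR DOCS_DIR  (hypothetical helper, not a stock command)
# Copy every *.pdf in SRC_DIR into DOCS_DIR and set mode 0644 on each copy.
publish_pdfs() {
    src_dir=$1
    docs_dir=$2
    for f in "$src_dir"/*.pdf; do
        # If the glob matched nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        cp "$f" "$docs_dir"/ && chmod 0644 "$docs_dir/${f##*/}"
    done
}

# Example (paths are assumptions):
#   publish_pdfs ~/papers ~/pub/html/docs
```

Files with other extensions are left alone, so TeX sources stay private unless they are copied explicitly.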
From a unix desktop (on Linux or Mac OS X) open file manager windows (Finder on Mac OS X) for the source and destination, then use the mouse to drag and drop files into ~/pub/html.
From any other machine use software that supports ssh and scp to copy files to one of the lab unix machines in the School of Mathematics.
Graphical file transfer programs like fugu, gftp, and winscp will make a connection. You then need to change into the ~/pub/html directory and drag and drop files from your local machine to the remote server.
Terminal based programs can run a command like the following:
scp preprint.pdf johndoe@example.math.umn.edu:pub/html/docs/preprint.pdf
Learning HTML
The first html to understand is how to make links with the <a> tag.
<a href="http://www.example.com">Example</a>
HTML's grammar is a set of nested opening and closing tags.
Links to other web pages are made with anchor tags.
The destination anchor http://www.example.com is a URI (Uniform Resource Identifier).
The source anchor Example is the link title, though any html (like images) could be inside the source anchor.
View the source of a web page: the opening lines should contain an <html> tag and the last few lines should contain a matching </html> tag to close the pair. Some tags, like <br /> and <img />, are empty elements that have no separate closing tag.
Other important tags are:
- a (the anchor tag for linking to other pages)
- p (for paragraph breaks)
- title (for the page's window title)
- h1 (for making top level headers)
- img (for including images)
- table (for tabular data)
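The tags listed above can be seen together in one small page. The sketch below is only an illustration: make_page is a made-up helper, and the output file name in the example is an assumption. It prints a minimal page that uses html, head, title, body, h1, p and a, with the opening and closing tags nested as described.

```shell
# make_page TITLE  (hypothetical helper)
# Print a minimal HTML page to standard output.
# TITLE is inserted as-is, so it should not contain <, > or &.
make_page() {
    title=$1
    cat <<EOF
<html>
<head>
<title>$title</title>
</head>
<body>
<h1>$title</h1>
<p>Preprints are in the <a href="docs/">docs</a> directory.</p>
</body>
</html>
EOF
}

# Example (file name is an assumption):
#   make_page "John Doe" > ~/pub/html/test_page.html
```

Writing to a separate file name keeps an existing index.shtml untouched while you experiment.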
HTML 4.01 defines 91 tags.
Cell phone web browsers support a subset of the full HTML tag set called XHTML Basic.
XHTML Basic has 52 tags divided into 11 modules.
- Structure Module*: body, head, html, title
- Text Module*: abbr, acronym, address, blockquote, br, cite, code, dfn, div, em, h1, h2, h3, h4, h5, h6, kbd, p, pre, q, samp, span, strong, var
- Hypertext Module*: a
- List Module*: dd, dl, dt, li, ol, ul
- Basic Forms Module: form, input, label, option, select, textarea
- Basic Tables Module: caption, table, td, th, tr
- Image Module: img
- Object Module: object, param
- Metainformation Module: meta
- Link Module: link
- Base Module: base
Tutorials on HTML from the W3C.
HTML Tutorial from the textbook Software Engineering for Internet Applications
The University of Minnesota has a Web Depot and a Creative Standards Guide for University Web Templates.
The University of Minnesota includes Dreamweaver MX in the Faculty Toolkit software bundle for faculty and staff of the U of M.
Dreamweaver is available natively for Macintosh and Windows, and the Windows version is reported to run with Wine on unix.
Converting Documents to HTML
Some document formats can be converted to HTML using software like latex2html, texi2html, makeinfo, docbook, openoffice, MS-Office, and gThumb.
If HTML conversion is not possible, try converting to pdf. Try using pdflatex from the teTeX software package.
Converting HTML to Documents
The htmldoc program is a simple desktop program for assembling multiple HTML pages into a single document with a cover and table of contents.
It is used to make handouts for the August orientations for new grad students, visitors and faculty.
Disclaimer
The University of Minnesota Academic/Administrative Policy 2.9.1 requires the following statement on personal and student organization pages:
"The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been reviewed or approved by the University of Minnesota."
One way to list files without the disclaimer is with the following command...
cd ~johndoe/pub/html
find . -name \*htm\* -exec grep -L "The views and opinions expressed in this " \{} \;
If the disclaimer is in a server-side included footer, then the find command above should search for the footer string instead...
cd ~johndoe/pub/html
find . -name \*htm\* -exec grep -L '<!--#include virtual="/~johndoe/_footer.shtml" -->' \{} \;
To list the files that do contain the disclaimer, use grep -l in place of grep -L.
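The two checks above can be combined so that a page passes if it carries either the disclaimer text or the footer include. This is a sketch under assumptions: missing_disclaimer is a made-up function name, and the footer is assumed to be named _footer.shtml as in the example above. It uses grep -L with two patterns, which lists only the files matching neither one.

```shell
# missing_disclaimer DIR  (hypothetical helper)
# List *htm* files under DIR that contain neither the disclaimer text
# nor an include of the (assumed) _footer.shtml file.
missing_disclaimer() {
    find "$1" -type f -name '*htm*' -exec grep -L \
        -e 'The views and opinions expressed in this ' \
        -e '_footer.shtml' {} +
}

# Example:
#   missing_disclaimer ~/pub/html
```

Ending -exec with + passes many file names to a single grep, which is faster than starting one grep per file.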
Traffic Logs for Pages
To get a log of your web pages' traffic, create a directory /www/home/<group>/<username>/logs.
After midnight, the previous day's logs are placed in that directory.
The log files are compressed with gzip, so you can view them with zcat or uncompress them with gunzip.
Log files older than 1 week are removed from web directories. To keep old logs, archive them in your home directory.
Error logs are also put in the directory /www/home/<group>/<username>/logs.
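Once logs start arriving, a short pipeline can summarize which pages are requested most. The sketch below rests on assumptions not stated on this page: that the access logs are in the usual Apache common or combined log format (request path in the seventh whitespace-separated field) and that their names end in .gz. The function name top_pages is made up.

```shell
# top_pages LOG_DIR  (hypothetical helper)
# Print the ten most requested paths with their hit counts.
# Assumes Apache common/combined log format, where field 7 is the path.
top_pages() {
    gzip -dc "$1"/*.gz |
        awk '{ print $7 }' |
        sort | uniq -c | sort -rn | head -n 10
}

# Example:
#   top_pages /www/home/<group>/<username>/logs
```

Because error logs are stored in the same directory, you may need to narrow the *.gz glob to the access log file names used on the server.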
Server Side Includes
Server Side Includes (SSI) are a special form of HTML comment tags that are substituted when served.
The user permission bit must be set to execute (x), or the file must be named with a .shtml extension.
See Apache SSI Documentation for more details.
Including Files
An SSI tag like <!--#include virtual="/~johndoe/_footer.shtml" --> can be put on 10 pages to include the same footer.
Any changes to _footer.shtml would be reflected on the 10 pages including it.
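To try the shared footer idea, the following sketch writes a footer file and one page that includes it. The function name make_footer_demo and the file name page.shtml are made up for illustration; the footer holds the disclaimer text quoted in the Disclaimer section.

```shell
# make_footer_demo DIR USER  (hypothetical helper)
# Write DIR/_footer.shtml and DIR/page.shtml, where the page pulls in
# the footer with an SSI include.
make_footer_demo() {
    dir=$1
    user=$2
    cat > "$dir/_footer.shtml" <<'EOF'
<p>The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been reviewed or approved by the University of Minnesota.</p>
EOF
    cat > "$dir/page.shtml" <<EOF
<html>
<head><title>Demo</title></head>
<body>
<p>Page content.</p>
<!--#include virtual="/~$user/_footer.shtml" -->
</body>
</html>
EOF
}

# Example:
#   make_footer_demo ~/pub/html johndoe
```

The include only takes effect when the page is requested through the web server; opening page.shtml directly from disk shows the comment unchanged.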
Time
<!--#echo var="DATE_LOCAL" -->
Sample output: May 17, 2008
SSI exec
SSI exec is not enabled for homepages. The 'virtual' function is preferred.
Counter
You can create a counter on a page using SSI. Each page that contains the counter include is tracked separately. The following include adds a counter to a page:
<!--#include virtual="/cgi-bin/counter.cgi"-->
A new counter renders like this: 00001 hits since September 23 2005
Here is an actual counter in use: 48723 hits since November 07 2006
CGI Scripts
CGI is enabled for all accounts in your web home directory's cgi-bin directory.
CGI scripts should be in this directory.
Scripts must have the execute bit set, and they are run as your userid.
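A CGI script only has to print a header block, a blank line, and then the body. The following is a minimal sketch written as a shell script; the name hello_cgi is made up, and SCRIPT_NAME is the standard CGI environment variable holding the script's URL path, which may be unset when the script is run from a shell.

```shell
#!/bin/sh
# hello_cgi: emit a minimal CGI response (header, blank line, body).
hello_cgi() {
    printf 'Content-Type: text/plain\n\n'
    printf 'Hello from %s\n' "${SCRIPT_NAME:-an unknown script}"
}

hello_cgi
```

Saved in the cgi-bin directory with the execute bit set (chmod 0755), a script like this can be run from a shell first to confirm the header line appears before any other output.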
File Permissions
All web-accessible files should have permissions 0644 (owner-write) or 0444 (read-only), unless they are executable scripts, which should be set to 0755 (owner-write) or 0555 (read-only).
Web-accessible directories should be set to 0755 (owner-write) or 0555 (read-only).
Other write permissions are removed every hour.
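The recommended owner-write modes can be applied to a whole tree with find and chmod. This is a sketch, not a stock tool: fix_web_perms is a made-up name, and it treats any file whose owner execute bit is already set as a script. That means a page using the execute bit to enable SSI would also end up 0755.

```shell
# fix_web_perms DIR  (hypothetical helper)
# Apply the owner-write modes recommended above to everything under DIR.
fix_web_perms() {
    dir=$1
    # Directories: 0755
    find "$dir" -type d -exec chmod 0755 {} +
    # Files with the owner execute bit set (treated as scripts): 0755
    find "$dir" -type f -perm -0100 -exec chmod 0755 {} +
    # All remaining regular files: 0644
    find "$dir" -type f ! -perm -0100 -exec chmod 0644 {} +
}

# Example:
#   fix_web_perms ~/pub/html
```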
Mail
The webserver will allow you to send email from your web applications.
It will only send mail to addresses within the university, so the domain must contain 'umn.edu' as a suffix.
Outgoing connections are denied, so you must connect to the localhost SMTP or use the sendmail provided in /usr/sbin/sendmail.
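A script can check the recipient before handing a message to sendmail. The sketch below is illustrative only: build_umn_message is a made-up name and the example address is an assumption. It refuses recipients outside umn.edu and otherwise prints a message with To and Subject headers, suitable for piping to sendmail -t, which reads the recipients from the headers.

```shell
# build_umn_message TO SUBJECT BODY  (hypothetical helper)
# Print a mail message on standard output, or fail if TO is not an
# address at umn.edu or one of its subdomains.
build_umn_message() {
    to=$1
    subject=$2
    body=$3
    case $to in
        *@umn.edu|*@*.umn.edu) ;;
        *) echo "refusing non-umn.edu recipient: $to" >&2
           return 1 ;;
    esac
    printf 'To: %s\nSubject: %s\n\n%s\n' "$to" "$subject" "$body"
}

# Example (address is an assumption):
#   build_umn_message johndoe@math.umn.edu "Form submitted" "Hello" |
#       /usr/sbin/sendmail -t
```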
UMN Search Appliance
The University has a search indexing appliance from Google that traverses web servers at the U to create the search index used by umn.edu.
The indexing accounts for 1 to 10 percent of the hits on the web server, and often is just a check that finds the requested URL hasn't been updated.
Keywords can also be set up in the search appliance, so a search for something like "dorms" could bring up a link to the Housing & Residential website.