Best Practices for the Web

Using Statistics

Every time someone visits your page, your web server can collect data about that visit. This data can help you maintain and improve your pages, understand how visitors navigate your site, and identify errors. Some data is always logged; other data can be gathered only if the visitor's machine (the client) makes it available.

What Statistical Analysis can do for you

 
  • Find out how many people visit your site.
  • View your Top Requests to understand what content is most popular.
  • View Errors to find bad links, File Not Found requests, and more.
  • View Top Referrers to understand the key drivers of traffic to your site.
  • View Top Paths Through Site to identify and streamline click-through patterns.
  • Learn about your audience, including browser, operating system, and geographic location.
  • Track Sessions to see how long a user spends at your site.

Commonly used terms

 
  • Hit - Each file requested by a visitor, including HTML files, graphics, etc. Hits are NOT an accurate measure of how many times a page was viewed, because a single page can generate many hits.
  • Page View - A hit to any file or group of files classified as a page.
  • Referrer - URL of a web page that refers a visitor to your site.
  • Spider - An automated program that searches the internet.
  • Visit - Begins when a user views the first page from your server and ends when the user leaves your server or is inactive beyond a specified time limit. The default idle time in WebTrends is 30 minutes, but this can be changed.
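
To make the difference between hits, page views, and visits concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the request data is hypothetical, a "page" is assumed to be anything ending in .html, .htm, or /, and visitors are assumed to be identified by IP address alone. Real analysis tools let you configure these rules and may count differently.

```python
from datetime import datetime, timedelta

# Assumption: these endings count as "pages"; real tools make this configurable.
PAGE_ENDINGS = (".html", ".htm", "/")
# The 30-minute idle limit mentioned in the Visit definition.
IDLE_LIMIT = timedelta(minutes=30)

# Hypothetical requests: (client IP, time of request, requested file)
requests = [
    ("10.0.0.1", datetime(2002, 1, 14, 13, 0), "/index.html"),
    ("10.0.0.1", datetime(2002, 1, 14, 13, 0), "/logo.gif"),
    ("10.0.0.1", datetime(2002, 1, 14, 13, 5), "/about.html"),
    ("10.0.0.1", datetime(2002, 1, 14, 14, 0), "/index.html"),
    ("10.0.0.1", datetime(2002, 1, 14, 14, 0), "/logo.gif"),
]


def summarize(requests):
    """Count hits, page views, and visits for a list of requests."""
    hits = len(requests)  # every requested file is a hit
    page_views = sum(1 for _, _, path in requests if path.endswith(PAGE_ENDINGS))

    visits = 0
    last_seen = {}  # most recent request time per client IP
    for ip, when, _ in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(ip)
        # A new visit starts on the first request or after a long idle gap.
        if previous is None or when - previous > IDLE_LIMIT:
            visits += 1
        last_seen[ip] = when
    return {"hits": hits, "page_views": page_views, "visits": visits}


summary = summarize(requests)
print(summary)
```

In this sample, five files were requested (five hits), three of them were pages, and the 55-minute gap before the last two requests splits the activity into two visits.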

Available Tools

 

WebTrends statistical software licenses are administered through the following University Departments. Requirements to participate vary. Please contact the appropriate office for more information.

  • Main Campus - Jointly administered by NetCom and the U Webmaster. WebTrends profiles are available for registered institutional websites not served by other departmental licenses. Contact webmaster@utah.edu
  • Health Sciences - Serving the Health Sciences Community.
    Request form: http://uuhsc.utah.edu/wrc/stats/webtrends_request.cfm
  • Marriott Library - Serving Library departments and units.
  • Student Affairs - Serving sites hosted by Student Affairs.
  • Scientific Computing & Imaging - Internal to SCI.
 

Numerous statistical tracking applications and packages are available commercially. Pricing ranges from inexpensive basic desktop versions to high-end enterprise solutions. Licenses are usually granted per web server or host machine.

How Statistics Work

 

All this information is collected in web server logs. Every page or image request results in a log entry. It is important to understand the terms defined under Commonly used terms, in particular the critical distinction between hits and page views. The logs are human-readable, but your web server administrator usually has software tools that analyze the data and prepare reports that summarize it in more meaningful ways. The campus has licenses for WebTrends analysis software; other software tools are available as well.
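
As a rough idea of what such a summary report involves, the following Python sketch tallies top requests and errors. It is not how WebTrends works internally; the records are hypothetical and assumed to be already parsed into (requested file, status code) pairs, and the function name build_report is made up for this example.

```python
from collections import Counter

# Hypothetical, already-parsed log records: (requested file, status code)
records = [
    ("/index.html", 200),
    ("/logo.gif", 200),
    ("/index.html", 200),
    ("/old-page.html", 404),
    ("/index.html", 200),
    ("/old-page.html", 404),
]


def build_report(records, top_n=3):
    """Summarize raw records into a few totals a person can read at a glance."""
    requests = Counter(path for path, _ in records)
    # Status codes of 400 and above indicate client or server errors.
    errors = Counter(path for path, status in records if status >= 400)
    return {
        "total_hits": len(records),
        "top_requests": requests.most_common(top_n),
        "errors": errors.most_common(top_n),
    }


report = build_report(records)
for name, value in report.items():
    print(name, value)
```

Even on six records the summary is easier to read than the raw entries; on a log with thousands of entries the difference is far larger.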

 

Guaranteed data that can be logged:

  • IP address of client
  • Name of the requested file
  • Date and time file was requested
 

Optional data (if provided by client):

  • Client software
  • Client operating system
  • Referring URL
  • Cookie data (session tracking)
 

Sample log entry:

12.254.127.85 - - [14/Jan/2002:13:19:52 -0700]
" GET / HTTP/1.0" 200 27592
" http://search.yahoo.com/search?p=Utah+University"
" Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)"

 

Even this simple entry contains a lot of information. If your site log has thousands of entries, you need some way to analyze the data other than reading each entry in turn.

 
  • 12.254.127.85 - - [14/Jan/2002:13:19:52 -0700]
    The IP address of the requesting client machine and the date and time your server finished processing its request, including the server's offset from UTC (-0700). The two hyphens are typically placeholders for the client's identity and user name, which were not available here. This data helps tell you where your visitors come from (e.g. 90% from .edu domains, 43% from Latvia), and helps you track usage by time of day, which may help you plan any needed server downtime.

 
  • " GET / HTTP/1.0" 200 27592
    The client requested the resource "/" (the site's home page) using the GET method over HTTP version 1.0. The status code your server sent back to the client ("200") indicates a successful response; alternatives you might see include a redirection (codes beginning with 3), an error caused by the client (codes beginning with 4, for example 404 File Not Found), or an error in the server (codes beginning with 5). The last number (27592) is the size in bytes of the object or file returned to the client.

 
  • " http://search.yahoo.com/search?p=Utah+University"
    This is the referrer: the page the client reports it came from. In this case, the visitor followed a link on Yahoo after searching for "Utah University".

 
  • " Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)"
    This is the information that the client's browser sends about itself. This data can tell you which browsers most of your visitors use so you can be sure your pages are viewable by them, and also can help you see trends towards adoption of new browser versions so you can start to design for them.
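
The breakdown above can be sketched in Python. This is a minimal illustration, not a full log parser: it assumes the entry is written on a single line (as it would be in a log file) in the layout shown in the sample, and the names LOG_PATTERN, STATUS_CLASSES, and parse_entry are made up for this example. The search parameter name "p" is taken from the sample Yahoo URL.

```python
import re
from datetime import datetime
from urllib.parse import parse_qs, urlsplit

# Assumption: one entry per line, laid out like the sample entry above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<identity>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"\s*(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"\s*(?P<referrer>[^"]*)" "\s*(?P<agent>[^"]*)"'
)

# First digit of the status code, as described in the breakdown above.
STATUS_CLASSES = {
    "2": "success",
    "3": "redirection",
    "4": "client error",
    "5": "server error",
}


def parse_entry(line):
    """Split one log line into named fields; return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["status_class"] = STATUS_CLASSES.get(str(entry["status"])[0], "other")
    return entry


sample = (
    '12.254.127.85 - - [14/Jan/2002:13:19:52 -0700] '
    '"GET / HTTP/1.0" 200 27592 '
    '"http://search.yahoo.com/search?p=Utah+University" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)"'
)

entry = parse_entry(sample)
# The search words are carried in the "p" parameter of the referring URL.
search_terms = parse_qs(urlsplit(entry["referrer"]).query).get("p", [])

print(entry["ip"], entry["time"].hour, entry["status_class"], entry["size"])
print(search_terms)
```

Once each entry is reduced to named fields like these, counting visitors by hour, listing referrers, or grouping by browser becomes a matter of tallying values rather than reading entries one by one.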