Support for VMS. Patches provided by Robin Garner.
A bug fix: the domain charts used to be based on all usages,
regardless of settings for sites and documents to be ignored.
Now they are based only on accesses included in the total
access count. (Ignored sites and documents are not included.
Hidden documents are included.)
Support for the PLEXUS server log format. Provided by Glenn Heinle.
I have not tested this code myself; you can contact him
at [email protected] if you have questions about
the PLEXUS support.
Bug fix: you can now ignore/hide/etc. accesses of your root document
by blocking accesses to "/".
Compatibility enhancement: since the CERN server uses
"Welcome.html" and "welcome.html" in the way the NCSA server
uses "index.html", all three are now aliased to "/" when used
in the root directory.
Uppercase site names are now converted to lowercase before being
analyzed.
Identity check information (person@) is now stripped off of
site names before they are analyzed.
A bug fix for the double-reports that have appeared recently
in 3.0.2. Hopefully. Hard to tell since I've seen them at my
site but attempts to deliberately produce them fail. Let me
know what happens with 3.1 if you've seen them with 3.0.2.
Compatibility fix: strdup() has been banished, replaced by
a simple mystrdup() function. This will make those whose C
libraries lack strdup() happy.
I have received patches for VMS and am working on built-in
MSDOS support, among other things, for the next release.
My apologies to those who hoped to see them in 3.1.
A small fix: graphs weren't coming out right on the first pass,
but were OK if wusage was run a second time. Fixed -- graphs now
come out right the first time.
Just a small fix to version 3.0: graphs were displaying an extra
week with zero accesses. Fixed up.
Version 3.0 has been extensively rewritten. It is not so much
enhanced as made far more stable and very easy to install:
Support for the new common log file format.
Those of you who have just upgraded to NCSA httpd 1.2 will note that wusage
has stopped working. This is because the log file format has changed.
The first line of your wusage.conf file now most likely reads
either NCSA or CERN; change it to COMMON, if you have the
latest version of either server. Note that data in the old format
can't be read once you do this, and will be ignored. If you've
been running wusage all along, this is not a problem, since
you already have reports for all previous weeks; you may need
to fudge your results a bit for the one week during which
you switched servers.
Please see configuring wusage for your server.
Incredibly dumb mistake on my part while creating 2.3 which led to
problems even worse than those in 2.2 is now fixed. It's amazing how much
trouble one line of code can cause. THIS VERSION WORKS,
at least on the systems available to me for testing.
One-line but extremely important bug fix in wusage.c! I deleted
a critical line between version 2.1 and 2.2. Mea culpa.
The bullet is no longer inside the anchor in the list of weeks,
owing to a problem with at least one client that won't accept
this (although I believe it's valid HTML).
Bug fixes in several places. Sites with less than one week of
data should work fine now (of course the graph is rather dull
with no data points; wait a week or two), and meaningful error
messages should appear for missing directories.
A reference to an out-of-date version of pbmplus found in the
only US link in my short collection of sites was removed and
replaced with a reference to a site that has the real thing.
The out-of-date version was inadequate to support wusage.
Compatibility fixes for yet more compilers.
As always, if you have problems, contact me and I'll do my
best to get you (and others with similar setups) up and running.
Upgrade Notes: ppmfig has not changed since
version 2.0, but both wusage.c and usagegraph.c have changed
in this version. So if you already have a working ppmfig
you can forgo rebuilding it. If you're having problems,
though, be sure to try rebuilding ppmfig against an
up-to-date pbmplus version.
Version 2.1 now includes support for the CERN httpd as well
as the NCSA httpd. You should insert the line NCSA_HTTPD
or CERN_HTTPD at the beginning of the wusage.conf file, as
appropriate. (For backwards compatibility, wusage will
assume NCSA if this line is absent.)
Version 2.1 corrects a bug in the use of wild cards:
wild cards at the beginning of an entry in one of the
exclusion lists now work properly (so entries such as
"*.gif" are now correctly processed).
Version 2.1 now ignores white space at the end of
entries in the exclusion lists; not strictly a bug
in 2.0, but it saves a lot of grief.
First, and most important: version 2.0 is now compatible with
all, or nearly all, versions of Unix! The previous version relied on
certain time-handling routines that did not exist in non-Sun
versions of Unix. These have been replaced.
Second, Version 2.0 supports the exclusion of
unwanted accesses such as gif files,
personal files, and other materials that distort the
statistical picture of the server, in the opinion
of the operator. This mechanism is entirely under the
control of the operator-- no code changes are needed.
Third, a major bug resulting in incorrect top-ten
lists when wusage attempted to take care of several
unprocessed weeks in one pass was fixed.
wusage 3.2 is copyright 1993, 1994, Quest Protein Database Center,
Cold Spring Harbor Labs. Permission granted to copy and distribute
this work provided that this notice remains intact. Modified
versions should be cleared through Quest first; if this is not
done, any modified version of the program must be clearly labeled
as such.
The Quest Protein Database Center is funded under Grant P41-RR02188 by
the National Institutes of Health.
Written by
Thomas Boutell, 11/93 - 5/94.
The GIF code is based on that found in the pbmplus utilities,
which in turn is based on GIFENCOD by David Rowley. See the notice below:
wusage maintains usage statistics for a WWW server. Specifically,
it updates the following information, week by week:
To use wusage, you will need the following:
That's it! Previous versions required the presence
of the pbmplus utilities and of a Unix shell. These requirements
have been lifted. Version 3.2 should be a very easy (even trivial)
port to MSDOS, including the GIF support routines. If you do this,
please contact me so I can combine your code into the
official package and make your binary available!
wusage is intended for use with the NCSA or CERN httpd servers, or
with any server which produces the new "common logfile format".
If you use a different server with a different access log file format,
it will be necessary to patch the wusage.c source code
appropriately, which should not be overly difficult.
I will be glad to assist as best I can. Note that the author
of your server should be using the new common log file format,
so if they are not doing so I suggest you point this out to them.
You can fetch wusage as a compressed tar file
here. Or you can FTP it directly from isis.cshl.org, in the
subdirectory pub/wusage.
In order to build wusage, first untar the wusage.tar file with the
following command:
cd to this directory and examine the Makefile, which you may need
to change slightly. Specifically, if your C compiler is not
named or aliased to "cc" (this is quite uncommon), change the
CC=cc line to name your compiler instead, for example CC=acc.
If you are using the SGI C compiler, you will need to add "-cckr" to
the CFLAGS line.
Now, to build the package, just type "make all". If all goes well,
the program "wusage" will be compiled and linked without incident.
You have now built wusage. All that remains is to configure it
for use with your server.
There are several parameters which must be set in order for
wusage to properly interact with your server. These are set
in the file wusage.conf. A sample wusage.conf file is included
in the tarfile, and you can use this file as a starting
point. You will definitely need to edit this file to configure
wusage properly for your server unless it is identical
to ours.
Here is the sample wusage.conf file. Note that lines beginning
with "#" are comments and are ignored. Note also that blank lines
are NOT considered comments and should be avoided.
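Because the settings are positional and blank lines are not treated
as comments, it can help to look over a configuration file before
running wusage. The following is a minimal sketch of such a check,
not part of wusage itself. The function name preflight is
hypothetical, and it models only the two rules stated above: lines
beginning with "#" are comments, and blank lines should be avoided.

```python
def preflight(text):
    """Return (settings, blank_line_numbers) for wusage.conf-style text.

    Assumption: only the documented rules are modeled. Lines starting
    with '#' are comments; blank lines are reported, not interpreted.
    """
    settings = []      # non-comment lines, in order of appearance
    blank_lines = []   # 1-based line numbers of blank lines
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip() == "":
            blank_lines.append(number)
        elif line.startswith("#"):
            continue
        else:
            settings.append(line)
    return settings, blank_lines
```

The first list shows the settings in the order wusage would
presumably meet them, so the positional lines are easy to compare
against the descriptions that follow. Any line numbers in the second
list point at blank lines worth removing.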
Important change in version 3.2: the
"home page" line has been removed. Delete it from your
wusage.conf and add the prefix and suffix lines (see below).
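The first non-comment line should read COMMON, NCSA_HTTPD,
CERN_HTTPD, or PLEXUS_HTTPD, as appropriate to your server's log
file format. Uppercase is required.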
Note to those upgrading: once you switch to the COMMON log file
setting, wusage can't read any data in the old format that may
be lying around, but it can skip over it tactfully. The upshot
of this is that if you've been running wusage all along, you'll
simply be able to start using it again and will only need to
adjust the results for the one week during which you made the
changeover to a common-logfile-format server version.
For those using wusage for the first time, this is a thornier
problem, but it can be handled with some ingenuity (by
switching the setting of the first line after running wusage
on the pre-common format part of your log, then deleting the
older content). I encourage server authors (and anyone else for
that matter!) to write a conversion filter to translate old-style
log file formats to the new style. It shouldn't be very
difficult. At worst, you'll have statistics only from the
point at which you switched to a common-logfile-format server.
The second non-comment line should contain the name of your server
as you would like it to be referred to in the usage page.
The third line should contain the full filesystem path (NOT URL)
of a file you would like to have copied in at the beginning of each
page generated by wusage, or the word none (in lowercase letters).
The fourth line is just like the third, but specifies a
suffix file to be appended at the end of each page.
Sample prefix and suffix files are provided. Note the
link to the wusage documentation in the suffix file.
You are not required to keep this link, but
we will greatly appreciate it if you do so. (Of course,
if your site is strictly internal and behind a firewall,
you should remove the link, since it won't work
for your users.)
The fifth line contains the directory in your file system in
which html pages generated by wusage should reside. This will
usually be a subdirectory of your server root directory
called "usage". (In our case, DOCUMENT_ROOT is /home/www/web,
so the fifth line is /home/www/web/usage.)
IMPORTANT: this directory should not be shared with other
information! Please give usage a subdirectory to itself,
since it creates and deletes files fairly freely and assumes
its directory is a safe place in which to do so.
The sixth line is the "base URL" for html pages generated by
wusage. This is similar to the fifth line, but is the location
in web space, not in filesystem space. Thus, if DOCUMENT_ROOT
is /home/www/web and you set the fifth line to
/home/www/web/usage, the sixth line should be set to
just /usage.
The seventh line is the location of the NCSA server access_log
file, which wusage needs to be able to read in order to compute
statistics. This file is located in .../ncsa/logs; ... is the
location at which you installed the server. In our case
it is installed beneath /home/www.
The eighth line is the default domain, which should be the domain
in which your own server is located. For instance, if your server's
name is siva.cshl.org, this line should read org (just the topmost
level).
The lines above are followed by four lists of items,
enclosed by { and } characters. By default, these
lists are empty. The absence of the lists is tolerated for
backwards compatibility.
The first is a list of items which should be "hidden".
This means that they will still register in the total number
of accesses, but they will never be in the top ten for any week.
The second is a list of items which should be "ignored".
These items never appear in the total number of accesses OR
in the top ten-- they are completely ignored.
The third is a list of sites to be ignored. This is
useful if many of the accesses to your server are made
by you personally and you are more interested in
counting accesses made by other sites.
For instance, if you want to keep .gif files (frequently
inline) out of the top ten, completely ignore files
coming from users' personal directories, and ignore
accesses from your own site "here.com", the three
lists would look as follows. (Note that asterisks are
acceptable as wild cards, just as they are in the
file system; question marks are also acceptable to substitute
for any single character.)
This mechanism makes it much easier to arrive at a meaningful top-ten list.
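The effect of the three lists can be sketched as follows. This is an
illustration only, not the code in wusage.c. The function name
classify is hypothetical, Python's fnmatchcase stands in for wusage's
own wild card matching (it also understands [...] character sets,
which wusage may not), and the order in which the lists are checked
is an assumption.

```python
from fnmatch import fnmatchcase

def classify(path, site, hidden, ignored_items, ignored_sites):
    """Return 'ignored', 'hidden', or 'counted' for one access.

    'ignored' accesses are left out of the totals and the top ten;
    'hidden' accesses are counted in the total but never listed.
    """
    # Assumption: ignoring takes priority over hiding.
    if any(fnmatchcase(site, pattern) for pattern in ignored_sites):
        return "ignored"
    if any(fnmatchcase(path, pattern) for pattern in ignored_items):
        return "ignored"
    if any(fnmatchcase(path, pattern) for pattern in hidden):
        return "hidden"
    return "counted"
```

With hidden set to ["*.gif"], ignored_items set to ["/~*"], and
ignored_sites set to ["*.ourschool.edu"], an access to
/images/logo.gif from an outside site is classified as hidden, and
any access from lab.ourschool.edu is ignored.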
Pie charts showing the usage of your server by domain, telling
you where in the world people are connecting to your server from,
are now available. These pie charts appear on each week's page.
To make them more useful, it is possible to combine countries
into continent domains.
The last section of the wusage.conf file is now made up of
continent aliases. Alternatively, you can turn off domain charts
altogether by uncommenting the "none" line just before the
continent aliases.
The continent aliases provided work well, but if you would like
to alter them (to add new countries or break up continents,
for instance, if your server is located in Europe), here are the rules:
The entire set of aliases is enclosed in a { ... } pair.
Each alias is enclosed in a { ... } pair (see the example set
in wusage.conf). The first domain in each alias is the name that
the rest will be aliased to. This adds them together to make the
result show up better in the pie chart and the list of the
top ten domains. The first domain can itself be a real
domain (such as the little-used "us" domain, to which you could
additionally alias gov, edu, org, mil and com, though this is not
always correct), or it can be a made-up domain such as "Asia".
Domain names are generally kept short so they will fit into
the pie charts well.
See the provided wusage.conf file for examples.
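To make the nesting rules concrete, here is a rough sketch of how a
set of aliases in this layout could be read into a lookup table. It
is not the parser used by wusage. The function name parse_aliases is
hypothetical, and the sketch assumes one brace or one domain per
line.

```python
def parse_aliases(text):
    """Return {domain: alias_name} from a brace-delimited alias set."""
    mapping = {}
    depth = 0        # 1 inside the outer braces, 2 inside one alias
    current = None   # names read for the current alias, alias name first
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # tolerated here; the real file should avoid blanks
        if line == "{":
            depth += 1
            if depth == 2:
                current = []
        elif line == "}":
            if depth == 2:
                if current:
                    # The first name is what every member is charted as.
                    for domain in current:
                        mapping[domain] = current[0]
                current = None
            depth -= 1
        elif depth == 2:
            current.append(line)
    return mapping
```

The alias name maps to itself, which covers the case where the first
domain is a real domain such as "us".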
The pie chart only shows domains which take up a sufficient
percentage to be legible in the chart, but the top ten list always
shows the top ten domains (if there have been accesses from ten
or more domains).
The "?" domain is assigned to accesses from sites whose names
are unknown. The default domain (line eight of wusage.conf) is
assigned to sites which have no periods in their names
(i.e., they are assumed to be local sites in your own domain).
The "other" category in the pie chart is assigned to all accesses
from domains too small to show up in the chart.
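The rules above can be summarized in a short sketch. It is only an
illustration under stated assumptions, not the logic in wusage.c. The
names domain_of and chart_slices are hypothetical, a purely numeric
site name is assumed to be an unresolved IP address, and the
min_share cutoff is an invented parameter, since the actual
legibility threshold is not documented here.

```python
def domain_of(site, default_domain, aliases):
    """Return the domain under which one site is charted."""
    # Assumption: digits and periods only means the name is unknown.
    if site.replace(".", "").isdigit():
        return "?"
    if "." not in site:
        domain = default_domain           # local site, no periods
    else:
        domain = site.rsplit(".", 1)[1].lower()
    return aliases.get(domain, domain)    # apply any continent alias

def chart_slices(counts, min_share):
    """Fold domains below min_share of the total into 'other'."""
    total = sum(counts.values())
    slices = {}
    for domain, accesses in counts.items():
        if total and accesses / total >= min_share:
            slices[domain] = accesses
        else:
            slices["other"] = slices.get("other", 0) + accesses
    return slices
```

Here aliases is a table mapping each member domain to its alias
name, for example {"jp": "Asia"}.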
Again, if the pie charts don't work well for you (because all
of your accesses are from one domain, or because your nameserver
is broken and you only have IP addresses in your access logs),
you can shut them off by uncommenting "none" in your wusage.conf file.
There are three common ways to run wusage: as an automatic weekly
job, by hand, and through a cgi script (which allows you to
have a "button" on one of your web pages to update the
information).
An automatic weekly job is the best approach, since this is
the frequency with which wusage generates reports. If you are
using a Unix system, it is easy to do this using the
program "cron".
wusage needs to be run on a weekly basis in order to keep
useful statistics. Specifically, it should be run as soon
after midnight on Sunday as possible. For the purposes of
creating an html report, wusage should always be run
with this ONE option (this is a change from versions
before 3.0): -c (location of wusage.conf file), which specifies
the location of the configuration file.
You can simply run wusage by hand with the -c option
(example: wusage -c wusage.conf). You will need to do so
once a week.
To run it from a cgi script, create a cgi script
which executes the above command and echoes back a
reasonable page to the user indicating success.
(Since reports are weekly no matter how often the
program is run, it is recommended that
such a button be placed on a private page, since it
has no dramatic effect and need not be run incessantly
by users.)
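A cgi script of this kind might look roughly like the sketch below.
It is only an outline. The paths are examples to be replaced with
your own, the function names are hypothetical, and it assumes that
wusage exits with status 0 on success, which this documentation does
not state.

```python
import subprocess

# Assumption: example locations; substitute your own paths.
WUSAGE = "/home/www/wusage"
CONF = "/home/www/wusage.conf"

def run_wusage():
    """Run wusage once and return its exit status (1 if it cannot start)."""
    try:
        result = subprocess.run(
            [WUSAGE, "-c", CONF],
            stdout=subprocess.DEVNULL,   # keep program output out of the reply
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return 1
    return result.returncode

def response_for(returncode):
    """Build the complete CGI reply: header, blank line, then HTML."""
    status = "updated" if returncode == 0 else "could not be updated"
    body = "<HTML><BODY><P>Usage report %s.</P></BODY></HTML>" % status
    return "Content-Type: text/html\r\n\r\n" + body

def main():
    print(response_for(run_wusage()), end="")
```

To use this as a script, call main() at the end of the file and
install it wherever your server expects cgi programs.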
In order to install wusage as a regularly-scheduled
automatically-run program, you need to add it to your
crontab file and submit it to the program "crontab".
Our crontab file looks like this:
Of course, if you run the www server as root, you no doubt already
have a crontab file for root, to which you will want to add this
line, following this with a reinstall using crontab. (We
created a separate www account to facilitate this sort of thing;
I recommend this strategy to other server administrators.)
Everything else is taken care of; all that remains is to run
wusage for the first time (to make sure the various html
and .gif files actually exist) and to link the usage report
to your home page.
Run wusage by hand using the following command:
Now, if all has gone well, edit your home page to include
a link to the usage report. Here is the relevant excerpt
from our home page:
Note that in addition to a normal text link, a small usage
graph is provided as an icon. This graph is genuine-- it is
updated at the same time as the larger graph on the main
usage page!
Your access_log file will grow tremendously over time, particularly
if your server is heavily used. It is desirable to purge this
file periodically, and this can be done provided you follow
these directions.
Take note of the most recent week for which wusage has generated
a complete report. Determine the date on which this week
ended (the usage report displays the date the week began).
Now edit your access_log file and find the first entry
that falls AFTER the completion of that week.
It is safe to delete all entries BEFORE that line in
the access_log file.
Important note: if you do purge your access_log file,
then be sure to back up the directory in which wusage
keeps its html pages. This directory contains important
summary information for previous weeks which wusage must
have in order to graph information regarding past weeks
no longer in the access_log file. Of course, you should
also compress and back up your old access_log data.
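The purge procedure can be sketched in a few lines. This is a
hedged illustration rather than a supported tool. The function name
entries_to_keep is hypothetical, and it assumes entries carry a
common-logfile-format timestamp in square brackets. Logs in older
formats would need a different pattern. Time zone offsets are
ignored, so week_end should be given in the same local time the
server logs in.

```python
import re
from datetime import datetime

# Assumption: timestamps look like [08/May/1994:00:00:12 -0400].
STAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2})")

def entries_to_keep(lines, week_end):
    """Return the lines from the first entry dated after week_end onward."""
    for index, line in enumerate(lines):
        match = STAMP.search(line)
        if match is None:
            continue  # unrecognized line; keep looking
        stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S")
        if stamp > week_end:
            return list(lines[index:])
    # No later entry found: keep everything rather than guess.
    return list(lines)
```

Pass in the lines of access_log and the last moment of the most
recent fully reported week. Write the result to a new file and
replace access_log only after the backups described above are in
place.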
What's New in version 3.2!
Support for prefix and suffix files of your own design. These
let you completely control what you add to your usage pages,
and eliminate the old "Up to Home Page" link. You will need
to edit your wusage.conf (required) and can prepare prefix and suffix
files of your own (optional)!
What's New in version 3.1!
DOMAIN CHARTS! wusage can now create pie charts showing access
by domain, and lets you alias country domains together to make
statistically significant continent domains.
What's New in version 3.0.2!
What's New in version 3.0.1!
What's New in version 3.0!
What's New in version 2.5!
What's New in version 2.4!
What's New in version 2.3!
What's New in version 2.2!
What's New in version 2.1!
NCSA_HTTPD
CERN_HTTPD
What's New in version 2.0!
Credits and license terms
/*
** Based on GIFENCOD by David Rowley
What is wusage?
What else do I need to use wusage?
What if I don't use a standard server?
How do I get wusage?
How do I build wusage?
uncompress wusage3.2.tar.Z
tar -xf wusage3.2.tar
This will create the directory "wusage3.2" beneath the current
directory.
CC=cc
to read
CC=acc
Or to another appropriate compiler.
Configuring wusage for your server
#Type of server log: COMMON (all new servers), NCSA_HTTPD, CERN_HTTPD,
#or PLEXUS_HTTPD.
#The latter three are for older versions of those servers; newer versions
# should use the COMMON log file format (but CHECK YOUR DOCUMENTATION).
COMMON
#Name of your server as it should be presented
Quest
#File to use as a prefix; MUST BE A COMPLETE FILE SYSTEM PATH. REALLY.
#NOT A URL.
/home/www/prefix
#File to use as a suffix; MUST BE A COMPLETE FILE SYSTEM PATH. REALLY.
#NOT A URL.
/home/www/suffix
#Directory where html pages generated by usage program should be located
/home/www/web/usage
#URL to which locations of html pages should be appended for usage reports
#(the same as the first line, but in web space, not filesystem space)
/usage
#Path of ncsa httpd log file
/home/www/ncsa/logs/access_log
#Your top-level domain name (org, edu, com... just the topmost level)
org
#Hidden items
{
}
#Ignored items
{
}
#Ignored sites
{
}
#Domain aliases or "none"
#none
{
{
aliasname
domain
domain
domain
}
... More aliases, if any ...
}
COMMON
NCSA_HTTPD
CERN_HTTPD
or
PLEXUS_HTTPD
as appropriate to your server's log file format. Note that the
latest versions of CERN and NCSA servers produce the COMMON log file format,
and setting this line to a different value won't work for those
versions! UPPERCASE REQUIRED.
none
(in lowercase letters). You can use this mechanism to add
a link up to your home page, or an illustration of your choice.
org
.
This is new in version 3.1 and later.
Excluding unwanted accesses
#Hidden items
{
*.gif
}
#Ignored items
{
/~*
}
#Ignored sites
{
www.ourcompany.com
*.ourschool.edu
}
Charting access by domain
Running wusage
-c (location of wusage.conf file)
which specifies the location of the configuration file.
wusage -c wusage.conf
). You will
need to do so once a week.
1 0 * * 0 /home/www/wusage -c /home/www/wusage.conf
... other jobs, if any ...
The crontab file is submitted to the Unix system with the following
command, assuming it is called "crontab.txt":
crontab crontab.txt
Hooking up wusage
wusage -c /home/www/wusage.conf
(Substitute the directory where wusage.conf resides
on your system for /home/www in the above.)
<P>Usage of the Quest WWW server is kept track of through
<A HREF="/usage/index.html">
<IMG ALIGN=TOP SRC="/usage/usage.graph.small.gif"></A>
<A HREF="/usage/index.html">usage statistics</A>.</P>
In addition to obvious name changes, you may need to change the
directory linked to if you did not use /usage in your configuration
file.
Purging access_log (how and why)
Please tell us you're running wusage!
When you contact us and let us know you are running wusage,
you help us justify the time spent in maintaining and improving
it. So please let us know. If you can provide a URL for your
usage page, that's great, but if it's an internal server
(not open to the public), a simple note is just as welcome.
If you have problems
If you have any difficulties with wusage, feel free to contact
the author,
Thomas Boutell.