|
The following are definitions of the information that
the Webalizer statistics package reports.
Logs/WebSite
Statistics: The yearly (index) report shows
statistics for a 12-month
period, with links to each month. The monthly report has detailed
statistics for that month, with additional links to any URLs and
referrers found. The various totals shown are explained below.

Hits: Any request made to the server that is logged
is considered a 'hit'. The requests can be for anything:
HTML pages, graphic images, audio files, CGI scripts,
and so on. Each valid line in the server log is counted
as a hit. This number represents the total number of
requests that were made to the server during the specified
report period.
Files:
Some requests made to the server require that the server
then send something back to the requesting client, such
as an HTML page or graphic image. When this happens,
it is considered a 'file' and the files total is incremented.
Not all hits send data; examples are 404 (Not Found) responses
and requests for pages that are already in the browser's
cache.
Tip: By looking at the difference between
hits and files, you can get a rough indication of repeat
visitors: the greater the difference between the
two, the more people are requesting pages they already
have cached (that is, have already viewed).
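
As a rough illustration of how the two totals relate, the
Python sketch below counts hits and files from lines in
Common Log Format. It assumes that a 'file' is a request
answered with status 200. The pattern CLF and the function
name count_hits_and_files are illustrative only and are not
taken from Webalizer itself.

```python
import re

# Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')

def count_hits_and_files(lines):
    """Return (hits, files) for an iterable of log lines."""
    hits = files = 0
    for line in lines:
        match = CLF.match(line)
        if match is None:
            continue  # only valid log lines count as hits
        hits += 1
        # Assumption: a response with status 200 sends a file back.
        if match.group(4) == "200":
            files += 1
    return hits, files

sample = [
    '10.0.0.1 - - [01/Jan/2001:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 2048',
    '10.0.0.1 - - [01/Jan/2001:10:00:01 +0000] "GET /logo.gif HTTP/1.0" 304 -',
    '10.0.0.2 - - [01/Jan/2001:10:05:00 +0000] "GET /missing.html HTTP/1.0" 404 210',
    'not a log line',
]
print(count_hits_and_files(sample))
```

Here the cached image (304) and the missing page (404) are
hits but not files, which is the gap the tip above refers to.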
Sites:
The number of unique IP addresses/hostnames that made
requests to the server. Care should be taken when reading
anything more into this metric. Many users
can appear to come from a single site, and a single user can
also appear to come from many IP addresses, so it should
be used simply as a rough gauge of the number of
visitors to your server.
Visits:
These occur when some remote site makes a request for
a page on your server for the first time. As long as
the same site keeps making requests within a given timeout
period, they will all be considered part of the same
visit. If the site makes a request to your server and
the length of time since the last request is greater
than the specified timeout period (default is 30 minutes),
a new visit is started and counted, and the sequence
repeats. Since only pages will trigger a visit, remote
sites that link to graphic and other non-page URLs
will not be counted in the visit totals, reducing the
number of false visits.
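
The timeout rule described above can be sketched in Python
as follows. This is a minimal illustration, not Webalizer's
code: it assumes the input has already been reduced to page
requests in time order, and the names count_visits and
VISIT_TIMEOUT are hypothetical.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # default timeout named in the text

def count_visits(page_requests, timeout=VISIT_TIMEOUT):
    """Count visits from (site, datetime) pairs sorted by time."""
    last_seen = {}  # site -> time of its most recent page request
    visits = 0
    for site, when in page_requests:
        previous = last_seen.get(site)
        # A first request, or a gap longer than the timeout, starts a visit.
        if previous is None or when - previous > timeout:
            visits += 1
        last_seen[site] = when
    return visits

requests = [
    ("10.0.0.1", datetime(2001, 1, 1, 10, 0)),
    ("10.0.0.2", datetime(2001, 1, 1, 10, 5)),
    ("10.0.0.1", datetime(2001, 1, 1, 10, 10)),  # same visit (10 minutes)
    ("10.0.0.1", datetime(2001, 1, 1, 10, 45)),  # new visit (35 minutes)
]
print(count_visits(requests))
```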
Pages:
Pages are, well, pages! Generally, any HTML document,
or anything that generates an HTML document, would be
considered a page. This number represents the number of
'pages' requested only, and does not include the other
'stuff' that goes into a page, such as graphic images and
audio clips. Some people call this metric page
views or page impressions. By default, a page is any URL that
has an extension of .htm, .html or .cgi.
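
The default extension test can be illustrated with a short
Python sketch. The function name is_page is hypothetical,
and the sketch only checks the three default extensions
listed above; it does not try to reproduce how Webalizer
handles other cases such as directory requests.

```python
import posixpath
from urllib.parse import urlsplit

PAGE_EXTENSIONS = {".htm", ".html", ".cgi"}  # defaults named in the text

def is_page(url, extensions=PAGE_EXTENSIONS):
    """Return True if the URL path ends in one of the page extensions."""
    path = urlsplit(url).path  # drops any ?query or #fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension in extensions

for url in ("/index.html", "/images/logo.gif", "/cgi-bin/search.cgi?q=x"):
    print(url, is_page(url))
```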
KBytes:
The KBytes (kilobytes) value shows the amount of data,
in KB, that was sent out by the server during the specified
reporting period. This value is generated directly from
the log file, so it is up to the web server to produce
accurate numbers in the logs (some web servers report
the number of bytes unreliably).
In general, this should be a fairly accurate representation
of the amount of outgoing traffic the server had, regardless
of the web server's reporting quirks.
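
To show how such a total follows from the log alone, here is
a small Python sketch. It assumes Common Log Format, where
the last field of each line is the number of bytes sent (or
"-" when nothing was sent), and it assumes 1 KB = 1024 bytes.
The function name total_kbytes is illustrative.

```python
def total_kbytes(lines):
    """Sum the trailing byte-count field of each log line, in KB."""
    total_bytes = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # too short to be a log line
        size = fields[-1]
        if size.isdigit():  # "-" means no data was sent
            total_bytes += int(size)
    return total_bytes / 1024  # assumption: 1 KB = 1024 bytes

log = [
    '10.0.0.1 - - [01/Jan/2001:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 2048',
    '10.0.0.1 - - [01/Jan/2001:10:00:01 +0000] "GET /logo.gif HTTP/1.0" 304 -',
    '10.0.0.2 - - [01/Jan/2001:10:05:00 +0000] "GET /about.html HTTP/1.0" 200 1024',
]
print(total_kbytes(log))
```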
A
Site
is a remote machine that makes requests to your server,
and is based on the remote machine's IP address/hostname.
URL: Uniform
Resource Locator. All requests made to a web server
need to request something. A URL is that something,
and represents an object somewhere on your server that
is accessible to the remote user, or that results in an error
(e.g. 404 - Not Found). URLs can be of any type (HTML,
audio, graphics, etc.).
Referrers:
The URLs that led a user to your site or caused the
browser to request something from your server. The vast
majority of requests are made from your own URLs, since
most HTML pages contain links to other objects such
as graphics files. If one of your HTML pages contains
links to 10 graphic images, then each request for the
HTML page will produce 10 more hits with the referrer
specified as the URL of your own HTML page.
Search Strings
are obtained by examining the referrer string and
looking for known patterns from various search engines.
The search engines and the patterns to look for can
be specified by the user within a configuration file.
The default will catch most of the major ones. Search
strings are only available if the referrer information is
contained in the server logs.
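
The idea of matching a referrer against known search engine
patterns can be sketched in Python as below. The
SEARCH_ENGINES table and the function name search_string are
hypothetical stand-ins for whatever the configuration file
defines; they are not Webalizer's actual defaults.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical table: hostname fragment -> query parameter with the terms.
SEARCH_ENGINES = {
    "google.": "q",
    "yahoo.": "p",
}

def search_string(referrer, engines=SEARCH_ENGINES):
    """Return the search terms found in a referrer URL, or None."""
    parts = urlsplit(referrer)
    host = parts.netloc.lower()
    for fragment, parameter in engines.items():
        if fragment in host:
            values = parse_qs(parts.query).get(parameter)
            if values:
                return values[0].lower()
    return None

print(search_string("http://www.google.com/search?q=web+log+analysis&hl=en"))
print(search_string("http://www.example.com/links.html"))
```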
User Agents
are a fancy name for browsers. Internet Explorer, Netscape,
Opera, etc. are all User Agents, and each reports itself
in a unique way to your server. Keep in mind, however,
that many browsers allow the user to change the reported
name, so you might see some obviously fake names in the
listing. User agents are only available if that information
is contained in the server logs.
Top Entry/Exit
Pages: The Top Entry and Exit Pages give
rough estimates of which URLs are used to enter
your site, and which pages are viewed last. Because
of limitations in the HTTP protocol, log rotations,
and so on, these numbers should be considered a good "rough
guess" of the actual values. They will, however, give
a good indication of the overall trend in where users
come into, and exit, your site.
Countries:
These are determined based on the top-level domain of
the requesting site. This is somewhat questionable, however,
as domains are no longer enforced as strongly
as they were in the past. A .COM domain may reside in
the US, or somewhere else. An .IL domain may actually
be in Israel, but it may also be located in the
US or elsewhere. The most common domains seen are .COM
(US Commercial), .NET (Network), .ORG (Non-profit Organization)
and .EDU (Educational). A large percentage may also
be shown as Unresolved/Unknown, as a fairly large share
of dialup and other customer access points do not resolve
to a name and are left as an IP address.
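
A minimal Python sketch of this kind of lookup is shown
below. It only extracts the top-level domain and treats bare
IP addresses as unresolved; mapping a domain to a country
name is left out. The function name top_level_domain and the
"unresolved" label are illustrative choices.

```python
import ipaddress

def top_level_domain(host):
    """Return the lowercase top-level domain of a hostname,
    or "unresolved" for IP addresses and dotless names."""
    try:
        ipaddress.ip_address(host)
        return "unresolved"  # the address never resolved to a name
    except ValueError:
        pass  # not an IP address, so treat it as a hostname
    name = host.rstrip(".").lower()
    if "." not in name:
        return "unresolved"
    return name.rsplit(".", 1)[1]

for host in ("www.example.com", "dialup-12.example.IL", "192.0.2.17"):
    print(host, top_level_domain(host))
```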
Response Codes:
Defined as part of the HTTP/1.1 protocol (RFC 2068;
see Section 10). These codes are generated by the web
server and indicate the completion status of each request
made to it.
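
As an illustration of how a per-code total might be built,
the Python sketch below tallies a list of status codes and
attaches the standard reason phrase from Python's http
module. The function name response_code_totals is
hypothetical, and the codes are assumed to have been
extracted from the log already.

```python
from collections import Counter
from http import HTTPStatus

def response_code_totals(codes):
    """Return sorted (code, reason phrase, count) tuples."""
    totals = Counter(codes)
    report = []
    for code, count in sorted(totals.items()):
        try:
            phrase = HTTPStatus(code).phrase
        except ValueError:
            phrase = "Unknown"  # code not known to the http module
        report.append((code, phrase, count))
    return report

for row in response_code_totals([200, 404, 200, 304]):
    print(row)
```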
Search Keywords is
a list of keywords that were used in search engines to
find your website.
|