Written by Fernando Maciá
Not all traffic reports for a website are produced in the same way. Server activity analysis lets us measure a number of important aspects of how our Internet presence performs, while real-time statistics give us more accurate figures for the actual number of unique visitors and page views. The first group, reports based on the analysis of server activity, includes those generated by applications such as Urchin or WebTrends, which can generally be consulted from the website's control panel. The second group comprises services that detect each visit from the web pages themselves, record valuable information about it, and then present that information graphically. It is important to understand from the outset that the two types of report are not mutually exclusive but complementary, and that it is worth learning to interpret each of them correctly in order to make the right marketing decisions.
Analysis of server activity
Your web server keeps a record of traffic and data requests in a log file. This file includes information about errors, processing time, bandwidth used, the visitor's IP address, where each visit came from (the referrer), and the operating system and browser the visitor used. Applications that analyze the server activity log are usually installed behind the server's firewall and process the log file periodically (weekly, monthly, and so on). They interpret the data and produce a snapshot in the form of a report made up of tables and/or graphs, which makes the information in the log understandable and usable.
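As a rough illustration of what such an application does with each log entry, here is a minimal Python sketch. It assumes the log uses the common Apache/NCSA "combined" format, which the article does not specify; the sample line and the function name parse_log_line are invented for the example. Note that this format carries the status code and bytes sent, but not processing time, which a server would have to be configured to add.

```python
import re

# Assumption: Apache/NCSA "combined" log format. Real servers may be
# configured with a different layout.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


# Hypothetical log line, made up for illustration.
SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2004:13:55:36 +0200] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://www.example.com/search?q=web+statistics" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"'
)

entry = parse_log_line(SAMPLE_LINE)
print(entry["ip"], entry["path"], entry["status"], entry["size"])
```

A report generator would run a parser like this over every line of the log and then aggregate the resulting fields into tables and graphs.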
Real-time statistics
Another way of analyzing the activity of our website is to update a database each time a visitor arrives. This method requires including a small snippet of JavaScript code on each page we want to track. The code is invisible to visitors. Each time an Internet user visits our website, the code places a cookie on their computer so that they can be tracked as a unique visitor. Once the code has been inserted into each page, information about your visitors is securely recorded in a database that can be queried immediately. The database is optimized to record how a given marketing campaign (e.g. a banner or a pay-per-click program) is performing, as well as figures such as e-commerce order amounts. Because this information is stored in real time, it is available immediately, without waiting for a periodic report.
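The collecting side of this mechanism can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "database" is a plain in-memory list, and the names record_page_view, count_unique_visitors, visitor_id and campaign are hypothetical, not taken from any particular statistics service. The visitor_id argument stands for the value of the cookie sent by the browser, if there is one.

```python
import uuid
from datetime import datetime, timezone


def record_page_view(page_views, page, visitor_id=None, campaign=None):
    """Store one page view and return the visitor ID to keep in the cookie.

    page_views stands in for the real-time database (here, just a list).
    visitor_id is the value read from the visitor's cookie, or None on a
    first visit, in which case a new ID is generated.
    """
    if visitor_id is None:
        visitor_id = uuid.uuid4().hex  # new visitor: issue a fresh cookie value
    page_views.append({
        "visitor_id": visitor_id,
        "page": page,
        "campaign": campaign,
        "time": datetime.now(timezone.utc),
    })
    return visitor_id


def count_unique_visitors(page_views):
    """Number of distinct cookie IDs seen so far."""
    return len({view["visitor_id"] for view in page_views})


database = []
# First visit: no cookie yet, so an ID is issued.
cookie = record_page_view(database, "/index.html", campaign="banner-x")
# Second page view in a later session: the browser sends the cookie back.
record_page_view(database, "/products.html", visitor_id=cookie)
print(len(database), count_unique_visitors(database))
```

Because every page view is written as it happens, both totals can be read at any moment rather than at the end of a reporting period.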
Why can server activity analysis and real-time statistics give different data?
Each method works from a different type of data, so it is essential to start by understanding what each one is telling us:
Counting method:
In server analysis, the hits registered by the web server are counted. The server registers one hit for each element the visitor requests: one hit for the page, plus one hit for each image contained in that page, plus one hit for each script. On websites built with frames, a hit is recorded for each of the pages contained in each frameset.
In real-time statistics, a hit always corresponds to a page view, regardless of the number of different elements that compose it (images, multimedia, etc.) or whether the website has a frame structure.
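The gap between the two counts is easy to reproduce. The following Python sketch takes the list of paths requested during a single visit and counts them both ways. The list of page extensions and the sample requests are assumptions made for the example; a real site would need its own rules for what counts as a page.

```python
import posixpath

# Assumption: these extensions (or no extension at all) identify a page.
PAGE_EXTENSIONS = {"", ".html", ".htm", ".php", ".asp"}


def is_page(path):
    """True if the requested path looks like a page rather than an asset."""
    clean_path = path.split("?", 1)[0]  # drop any query string
    extension = posixpath.splitext(clean_path)[1].lower()
    return extension in PAGE_EXTENSIONS


def count_hits_and_page_views(paths):
    """Return (hits, page_views) for a list of requested paths."""
    hits = len(paths)  # the server logs one hit per requested element
    page_views = sum(1 for path in paths if is_page(path))
    return hits, page_views


# Hypothetical requests generated by one visitor viewing two pages.
requests = [
    "/index.html",
    "/img/logo.gif",
    "/img/banner.jpg",
    "/js/menu.js",
    "/products.html?id=3",
    "/img/photo.jpg",
]

print(count_hits_and_page_views(requests))  # (6, 2)
```

Here the server log shows six hits for what a real-time service would report as two page views.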
Identification of unique visitors:
In server analysis, each different IP address is counted as a unique visitor. But since most Internet connections use dynamic IP addresses, there is no way to tell whether three visits on the same day came from three different people or from the same person. Similarly, visits made through a proxy-cache (as with all Telefónica ADSL connections) are counted as coming from a single IP address, that of the proxy, because the user's own IP address is hidden behind it. The unique visitor figures provided by server analysis are therefore very unreliable.
In real-time statistics, a cookie stored in each visitor's browser identifies them the next time they visit the website, regardless of the IP address they have in each session. Although it is still impossible to distinguish between two different people using the same computer, the data is much truer to reality.
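A small Python sketch makes the difference concrete. The visit records below are invented: one person whose dynamic IP address changes between sessions, and three people sharing the IP address of a proxy. The field names ip and visitor_id are assumptions for the example, with visitor_id standing for the cookie value.

```python
def unique_visitors_by_ip(visits):
    """Count unique visitors the way log analysis does: one per IP address."""
    return len({visit["ip"] for visit in visits})


def unique_visitors_by_cookie(visits):
    """Count unique visitors the way real-time statistics do: one per cookie."""
    return len({visit["visitor_id"] for visit in visits})


# Hypothetical visits from four different people on the same day.
visits = [
    {"ip": "198.51.100.10", "visitor_id": "a"},  # person A, morning session
    {"ip": "198.51.100.42", "visitor_id": "a"},  # person A again, new dynamic IP
    {"ip": "192.0.2.1", "visitor_id": "b"},      # person B behind a proxy
    {"ip": "192.0.2.1", "visitor_id": "c"},      # person C behind the same proxy
    {"ip": "192.0.2.1", "visitor_id": "d"},      # person D behind the same proxy
]

print(unique_visitors_by_ip(visits))      # 3
print(unique_visitors_by_cookie(visits))  # 4
```

The IP-based count is wrong in both directions at once: it splits person A into two visitors and merges the three proxy users into one.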
Method of report generation:
In server analysis, reports are generated periodically (once a month, for example), so between reports there is no way of knowing what is happening on the website.
In real-time statistics, data is immediately and continuously available from any browser anywhere in the world, and historical information is presented in a way that is much more useful for drawing conclusions about trends.
Information about referrals:
In server analysis, the log records the search terms used to reach our website, as well as the search engines visitors came from.
In real-time statistics, we obtain valuable information about how the traffic generated by a given search term has evolved over time, so that we can assess whether it is worth optimizing certain pages to rank better for that term. We also get direct links to the pages that acted as referrers, which lets us check the position in which a page of our website appeared for a given search term.
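Both approaches recover search terms from the referrer URL. The Python sketch below shows one way this could be done, assuming the search engine passes the term in a query parameter named q, p or query; that list is an assumption for illustration, and the referrer URLs are invented.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse

# Assumption: search engines put the search term in one of these parameters.
SEARCH_PARAMETERS = ("q", "p", "query")


def extract_search_term(referrer):
    """Return (search_engine_host, term) or None if no search term is found."""
    parsed = urlparse(referrer)
    if not parsed.hostname:
        return None  # empty or "-" referrer: a direct visit
    query = parse_qs(parsed.query)
    for name in SEARCH_PARAMETERS:
        if name in query:
            return parsed.hostname, query[name][0]
    return None


# Hypothetical referrers taken from a log or a tracking database.
referrers = [
    "http://www.google.com/search?q=web+statistics&hl=en",
    "http://search.yahoo.com/search?p=log+analysis",
    "http://www.google.com/search?q=web+statistics",
    "http://www.example.com/links.html",
    "-",
]

found = [extract_search_term(r) for r in referrers]
term_counts = Counter(term for result in found if result for _, term in [result])
print(term_counts.most_common())
```

Counting the extracted terms per day or per month gives the historical view of each term that the real-time reports present.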
Spiders and robots:
In server analysis, the spiders and robots that visit the website to index it are recorded as hits.
In real-time statistics, spider and robot activity is not recorded unless the robot actually renders the HTML page and runs its tracking code, which most robots do not do.
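Since robots appear in the server log, log analysis can separate them from human visitors by looking at the user-agent field. The following Python sketch uses a deliberately crude heuristic: it treats any user agent containing one of a few substrings as a robot. The marker list is an assumption for illustration only; real tools rely on much longer, maintained lists.

```python
# Assumption: a very short, illustrative list of substrings found in
# robot user agents. It is far from exhaustive.
ROBOT_MARKERS = ("bot", "spider", "crawler", "slurp")


def is_robot(user_agent):
    """Heuristic check: does the user-agent string look like a robot?"""
    lowered = user_agent.lower()
    return any(marker in lowered for marker in ROBOT_MARKERS)


def split_robot_traffic(user_agents):
    """Return (robot_hits, other_hits) for a list of user-agent strings."""
    robot_hits = sum(1 for agent in user_agents if is_robot(agent))
    return robot_hits, len(user_agents) - robot_hits


# Sample user agents as they might appear in a log.
user_agents = [
    "Googlebot/2.1 (+http://www.google.com/bot.html)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
    "msnbot/1.0 (+http://search.msn.com/msnbot.htm)",
]

print(split_robot_traffic(user_agents))  # (3, 1)
```

A robot that sends a browser-like user agent would slip through this check, so figures obtained this way are best treated as a lower bound on robot activity.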
Proxy-cache servers:
In server analysis, page views served from a browser cache or from the cache of a company's proxy server are not counted, because they produce no request to the web server.
In real-time statistics, all page views are counted, because the tracking code runs each time the page is displayed, wherever it was served from.
Non-HTML files:
In server analysis, non-HTML files such as graphics, images, Flash or downloads may be counted as page views.
In real-time statistics, non-HTML files are not counted unless they have been specifically tagged for tracking, so no double counting occurs.
Page errors:
In server analysis, the errors the server generates are detected and logged, which lets us find and correct the defective pages.
In real-time statistics, errors are not logged, and a page is counted as viewed only if the JavaScript code was able to execute.
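To show how a log-based report can point to defective pages, here is a minimal Python sketch that tallies failed requests by path and HTTP status code. It assumes the log has already been parsed into (path, status) pairs and treats any status of 400 or above as an error; the sample entries are invented.

```python
from collections import Counter


def error_pages(entries):
    """Count failed requests per (path, status) pair.

    entries is an iterable of (path, status) tuples taken from the log.
    Assumption: HTTP status codes of 400 and above are errors.
    """
    errors = Counter()
    for path, status in entries:
        if status >= 400:
            errors[(path, status)] += 1
    return errors


# Hypothetical parsed log entries.
entries = [
    ("/index.html", 200),
    ("/old-page.html", 404),
    ("/old-page.html", 404),
    ("/order.php", 500),
    ("/img/logo.gif", 200),
]

for (path, status), count in error_pages(entries).most_common():
    print(status, path, count)
```

Sorting by frequency puts the pages that fail most often at the top of the list, which is usually where a correction pays off first.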
Conclusion
The information that the two types of report provide about e-commerce performance is complementary. Some important aspects (errors, average time per session, origin of visits, spider and robot activity) are recorded more effectively by server analysis. Real-time statistics, on the other hand, tell us today how many visits were generated by the banner we placed yesterday on search engine X. They also present the main trends in our website's activity in a graphical form that is easier to interpret and use, and they give us a much truer picture of the number of unique visitors and of our website's capacity to generate repeat visitors. In future installments we will discuss the magic numbers: the ones we should focus on when judging the performance of our website and identifying the aspects we can change to improve it.