[ Home | What We Do | Our Clients | Press & Events | Library | Contact Us ]

Fastwater Rapids vol. 1.8, 19Oct98
There is a tendency to think of website data as if it were just one thing, as when someone asks, "What are we doing with the stuff we collect from the website?" Thinking of web data in this way can make it harder to use. There are distinctly different kinds of data that you can collect from your website; making effective use of the data starts with understanding the differences.
In part one of this article on using visitor and customer data we suggested that you could usefully divide your web site data into four different categories:
The web greatly reduces the cost of collecting these kinds of data. These advantages apply not only to consumer businesses, but also to businesses selling to other businesses. Whether you are selling research chemicals, office supplies, electronics components, or some other product to a diverse, heterogeneous set of businesses, your web site can potentially give you a great deal of information about your customers' needs and buying patterns. This week we will take a look at a couple of the things you can do with this kind of information.
This magazine system is a very simple instance of the model illustrated in the picture above. The reasons that such a simple system can work are:
Imagine that we knew the library classification numbers for the last three books that you have read. We could create a three dimensional graph, with library classification numbers on each axis, and fix a point in the graph space that identified your recent reading choices. It would be a pretty safe bet that other people whose points were close to yours shared similar reading interests. If we knew the titles of the books that they read, it would probably be worthwhile to recommend those titles to you. We would have a more powerful, flexible way of conducting the classify/connect process in the center of the diagram than is provided by a simple rule such as "recommend Road and Track if they buy Car and Driver."
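The idea of finding readers whose points lie close to yours can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reader records, classification numbers, and titles below are invented, the names `distance` and `recommend` are hypothetical, and plain straight-line distance stands in for whatever measure a commercial product actually uses.

```python
import math

def distance(a, b):
    """Straight-line (Euclidean) distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recommend(target_point, target_titles, others, k=2):
    """Suggest titles read by the k readers nearest to target_point.

    Titles the target reader already has are skipped, and each
    suggestion appears only once, nearest reader first.
    """
    ranked = sorted(others, key=lambda r: distance(target_point, r["point"]))
    suggestions = []
    for reader in ranked[:k]:
        for title in reader["titles"]:
            if title not in target_titles and title not in suggestions:
                suggestions.append(title)
    return suggestions

# Hypothetical data: each point holds the classification numbers
# of a reader's last three books.
readers = [
    {"point": (510.0, 520.0, 530.0), "titles": ["Title A", "Title B"]},
    {"point": (512.0, 519.0, 531.0), "titles": ["Title B", "Title C"]},
    {"point": (810.0, 820.0, 910.0), "titles": ["Title D"]},
]
you = (511.0, 521.0, 529.0)

print(recommend(you, {"Title A"}, readers))  # ['Title B', 'Title C']
```

The distant third reader contributes nothing, which is the point: only nearby readers are treated as likely to share your interests. Nothing in the functions depends on there being exactly three axes, so the same sketch works for points with more dimensions.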
Net Perceptions and LikeMinds sell software that does this sort of thing in a much more sophisticated way than our simple catalog number system, dealing with different kinds of information all at once, and with variables that don't scale as cleanly as our library catalog numbers. They also work in spaces that have many more than just three dimensions. But the general idea is the same: by collecting information about you, they can identify a group of other buyers that probably share at least some of your interests and buying preferences. If the online vendor knows the sort of things that others in the group bought, there is a good chance that you will be interested in those same things. This is one way of knowing that it might be useful to recommend a wristwatch to a camera buyer.
Recommendation engines are attractive for website use for a number of reasons that go beyond their ability to deal with disparate kinds of input data (e.g., purchases, stated preferences, site visit patterns, search terms, non web data) and different kinds of product recommendations. For one thing, they can operate unobtrusively. The user does not necessarily have to fill out a profile or provide other information about likes and preferences; the engine can just go to work on data collected as the user moves through the site and looks at things. Another attractive feature is that they can begin to make recommendations or target advertising right away, during a prospect's initial visit to a site. Clearly, more history (more precise location on the graph) results in better recommendations and predictions, but it is possible to do useful things with even a little bit of information. Finally, recommendation engines can work in real time, which is why they are capable of performing the "classify/connect" processing in the center of the diagram pictured above.
If you have ever had a statistics course, then you certainly have heard someone explain that correlation doesn't "mean" anything -- and certainly does not imply cause. The classic example, as I remember it, is that if you surveyed hundreds of communities of all sizes you would probably find that the size of the churches in a town correlates with the number of bars. And the meaning is ...?
The same problems of meaning apply to the operations of recommendation engines. The way that the customers and visitors cluster does not necessarily connect with anything that you can use to better understand your business, make better decisions, or make better plans. The ability to make good recommendations, as valuable as it is, is just statistics. It doesn't automatically mean anything, or have any use beyond the recommendation to the individual customer. Having gigabytes of information about your customers and not being able to use it to make business decisions is frustrating. As one vendor of personalization tools put it: "Our clients ask us, 'You serve my customers very well, but how do you serve me?'" Serving the people running the business as well as the customer implies getting the broader understanding identified at the bottom of the picture below, and requires that you add something more to the work of the recommendation engine.
There are two basic ways to approach this. In practice, the two ways are often used together. The first approach starts with some a priori notions of what the meaningful divisions might be. You might decide that the gender of the buyer matters, for example, or the age, or the income range. Or you might try to tie the buying to a more sophisticated, psychographic classification scheme. For a good example of how such a system works, you can visit the SRI site and look at the VALS segmentation system, which classifies buyers into categories such as "actualizers," "fulfilleds," "makers," "strivers," and so on. The SRI site even gives you a test you can take to see how you would be classified. These predetermined segments have characteristics developed out of extensive consumer research -- for example, "Fulfilleds" are likely to have a swimming pool in their backyard and to own spreadsheet software. The great advantage of working with such pre-defined categories is that you can connect your product with very different things, such as the reading habits or entertainment preferences of the group. In short, because the segmentation is general, you can fit your product into a bigger picture. There are a number of companies (e.g., net.Genesis, Andromedia, and NetPerceptions) that are beginning to offer the capability to connect and compare the visitor data from your site with data from larger populations, such as the broad base of data that Engage has assembled.
The second approach to analysis looks just at the visitors to your web site. It can grow from the data collected to support the recommendations and one-to-one personalization. Let's go back to our example of the three dimensional grid built from the library classifications of recently read books. If you collected and graphed these data for a great many customers, you would have a three dimensional space in which there floated "clouds" of clustered points. The clouds would not all be spherical, but would fall into cigar shapes and other patterns representing the different distributions of reading preferences. To turn the clouds into something with meaning, you could begin to move and rotate the axes of the graph through the space so that they did a better job of running lengthwise through some of the bigger clouds. This would allow you to begin to name the axes, assigning meaning to the dimensions. For example, you might find that the position of an axis expressed the propensity for reading fiction rather than non-fiction, or the tendency to read history books. You could even begin to change the angles between the axes so that they were no longer perpendicular (orthogonal) to each other, in order to better fit the clouds. In doing that, you would be saying that the dimensions were not completely independent of each other -- a situation likely to be true, in fact.
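Rotating an axis so that it runs lengthwise through a cloud can be sketched for the simplest case: one cigar-shaped cloud in two dimensions rather than three. The sketch below is an assumption-laden illustration, not a description of any vendor's method. The data points and the names `principal_axis_angle`, `rotate`, and `spread` are invented, and the angle comes from the standard formula for the direction of greatest variance in a two-dimensional scatter.

```python
import math

def principal_axis_angle(points):
    """Angle (radians) of the axis along which a 2-D cloud is most spread out."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return 0.5 * math.atan2(2 * sxy, sxx - syy)

def rotate(points, angle):
    """Re-express each point in axes rotated counterclockwise by angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c + y * s, -x * s + y * c) for x, y in points]

def spread(values):
    """Variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# Hypothetical cigar-shaped cloud lying roughly along the diagonal.
cloud = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]

angle = principal_axis_angle(cloud)
rotated = rotate(cloud, angle)

along = spread([u for u, _ in rotated])   # spread along the new first axis
across = spread([v for _, v in rotated])  # spread across it

print(round(math.degrees(angle), 1), along > across)
```

After the rotation nearly all of the variation lies along the first axis, so that axis is the one worth trying to name. Letting the axes leave the perpendicular, as described above, is a further step that this sketch does not attempt.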
What we are describing here is an intuitive look at what a market segmentation product such as Personify does. NetPerceptions, one of the leading vendors of recommendation engines, is also moving in the direction of such offerings. The important points for business people are that:
First, we noted that your website provides you with access to information about visitors, prospects, and customers in more detail, at less cost, than is typically possible in other business environments. One important use of this information is the ability to target ads or product suggestions much more effectively; it is possible to provide visitors with information and offers that they really want to see. Products such as Andromedia's LikeMinds and NetPerceptions, which provide collaborative filtering, are key components for enabling such one-to-one delivery.
But collaborative filtering provides information to your visitors and customers, not to you. To actually use visitor information to make better decisions about advertising, about offers, and to enable cross selling, you need to identify and understand the different groups of customers that you are serving. In other words, you need to develop an effective segmentation for your market. There are two ways to do this; the methods can be used in combination with one another. One method depends on connecting your customers and prospects to larger demographic and psychographic databases. Site measurement companies like Andromedia, net.Genesis, and NetPerceptions are working with data aggregators such as Engage to begin providing these capabilities.
The second method depends on doing a segmentation analysis that works just on the data collected on your own web site. Because this approach focuses specifically on the dimensions in your market, without necessarily referring to larger consumer populations, it can be useful for business to business applications as well as consumer applications. Personify is currently the principal source of this kind of marketing analysis capability for the web; NetPerceptions is intending to offer such products in the future.
Our focus this week has been on the products in the center of the diagram, the ones that classify visitors and make connections. As the diagram suggests, implementing one-to-one delivery involves more than collaborative filtering -- you also need a way to deliver the advertisement, make the cross-sell offer, or suggest the additional product. In a coming issue we will take a look at the techniques and products used on the delivery end of the process.