
Price watcher agent for e-commerce

2001, Intelligent agent technology: research …

We report an autonomous agent that retrieves competitors' product prices over the World Wide Web for the purpose of price comparison at an e-commerce retail shop. This price watcher model differs from the conventional price comparison services currently available on the Internet in that it collects competitors' price information without the competitors' participation or attention. It scans price information over the Internet on a regular basis, builds up a knowledge base at the user's site and provides a price comparison facility for shoppers to use. It is an information retrieval utility that could be used as part of a business intelligence infrastructure. This paper summarizes the application background as well as the technical details of the prototype's design.

Proceedings of the Second Asia-Pacific Conference on Intelligent Agent Technology (IAT-2001), pages 294-299, Maebashi Terrsa, Maebashi City, Japan, October 2001

SIMON FONG
E-Netique Pte Ltd, Singapore

AIXIN SUN
School of Computer Engineering, Nanyang Technological University, Singapore

KIN KEONG WONG
School of Computer Engineering, Nanyang Technological University, Singapore

1 Introduction

The Watcher Agent proposed in this paper is an autonomous software program that “spies” on competitors’ prices over the web. The prices collected from the competitors are stored in a local database. They can be used for price comparison at the front-end of an e-commerce online shop as well as for market research at the back-end. This technology offers a useful new feature for online shops: showing shoppers the competitors’ prices helps increase their confidence in buying the products, and hence helps improve sales. The agent can be configured such that only the prices higher than (or equal to) ours are displayed.
A snapshot of a shopping site with the price watcher is shown in Figure 1. One of the barriers for e-commerce retailers to overcome is that most consumers are not convinced that the price of a product offered at a given site is the best, and it is always easy for them to surf away to other shopping sites looking for a better offer [1]. How to encourage the consumer to commit to a purchase on the spot at the current site is thus an issue to be addressed.

PriceWatcher: submitted to World Scientific on June 21, 2001

Figure 1. Snapshot of the application of Price Watcher

There are several price comparison services available on the web [2,3,4]. The differences between our price watcher agent and most web-based price comparison software and portals are as follows:

1. Designed for use by individual online shops. Price watcher is a price-monitoring tool used by individual online shops, while the usual web-based price comparison services are made publicly available for web surfers to compare prices.
2. Neither broker nor public database is used. For most price comparison services, there exists a mediator, usually the web server or service provider, and a centralized database is used to maintain the price information available to the users. In our watcher agent strategy, a private and confidential database that holds the competitors’ price information is located at the local site.
3. No participation of retailing shops is required. Some price comparison services work by letting the participating stores submit their latest prices to the mediator. Our approach is different because there is no need to get the competitors involved.
4. Forms part of the Competitor Intelligence strategy. The price watcher is to be implemented as a part of the competitor intelligence strategy that includes information retrieval, filtering, analysis, and presentation.

In this paper, Section 2 covers the overall working process of the price watcher.
The product name matching and price extraction algorithms are described in detail in Sections 3.1 and 3.2 respectively. The technical limitations of the price watcher are given in Section 3.3, and finally we conclude our work in Section 4.

Figure 2. The architecture of Watcher Agent
[Figure not reproduced. Layers shown: Information Retrieval Layer, Compilation Layer, Storage Layer, Presentation Layer. Components shown: WWW Search Engine, URL Retrieval Engine, Market Explorer, Market Monitor, Price Watcher, Market Watcher, DB_MS, DB_MI, DB_MP, Marketing Information System, Price Comparer.]

2 Price Watcher Working Process

The price watcher working process consists of five steps:

1. The set of competitors’ URLs, configuration parameters (e.g. retrieval scheduling) and product names are obtained from the database.
2. The HTML pages are downloaded using the web retrieval engine.
3. A dollar sign detector is used as a filter. Only pages containing dollar signs like $ and S$ are to be processed further.
4. The product names are searched for within each page. The price for any possible match is extracted and stored in the local database.
5. The competitors’ prices (and our own price) are then queried and shown in tabular form.

3 Technical Details

To monitor a web site, the contents of the web site should be downloaded according to some schedule setting [5]. In the price watcher, only the HTML text is downloaded. Finding the level of similarity between our product names and the names provided on the web, and extracting the corresponding prices, are the two main challenges facing us. The architecture of the Watcher Agent is shown in Figure 2. The agent is composed of two major parts: the price watcher and the market watcher.
The market watcher helps the administrator of the online shop get the latest information about competitors’ web sites. The market watcher part is not covered in this paper.

3.1 Product Name Matching

A product name can usually be divided into three parts: brand, model number and description. For example, brand: Canon; model number: BJC-4200SP; description: Color Bubble Jet Printer. The model number is believed to be unique for a specific product. The brand may appear slightly differently on different Web sites, for example Hewlett Packard and HP (for short). This problem can be solved by letting users enter more than one equivalent brand name. The description may differ considerably from one Web site to another. However, this part is not so critical for product name matching, although it is useful in determining where the model number or brand can be found.

In product name matching, we allow users to allocate a weight to each part, for example 50%, 30% and 20% for model number, brand and description respectively. Model number and brand require exact matching regardless of character case: an exact match gives a similarity level of 1, otherwise the similarity level is 0. An approximate word matching algorithm [6] is applied to compute the similarity level of the description part. The final similarity of each part is given by the product of its similarity level and weight. The overall similarity level for the whole product name is obtained by summing the final similarity levels of these three parts. This final value is subsequently compared with a threshold value to decide whether a match has actually been detected.

3.2 Price Extraction

The main operation of the price watcher is to extract the prices from HTML documents.
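Extraction only starts once a product name has been matched, so it may help to make the scoring of Section 3.1 concrete first. The following is a minimal sketch, not the prototype's code: the weights follow the 50/30/20 example from the text, while the threshold value, the dictionary layout, the function names and the use of Python's `difflib.SequenceMatcher` as a stand-in for the approximate word matching algorithm [6] are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Example weights from the text; the threshold is a hypothetical value.
WEIGHTS = {"model": 0.5, "brand": 0.3, "description": 0.2}
THRESHOLD = 0.8


def exact(candidate, accepted):
    """Case-insensitive exact match against any accepted spelling: 1.0 or 0.0."""
    wanted = {a.strip().lower() for a in accepted}
    return 1.0 if candidate.strip().lower() in wanted else 0.0


def approximate(a, b):
    """Stand-in for approximate word matching [6]; returns a ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity(found, ours):
    """Weighted sum of the per-part similarity levels."""
    parts = {
        "model": exact(found["model"], [ours["model"]]),
        "brand": exact(found["brand"], ours["brands"]),  # brand equivalents
        "description": approximate(found["description"], ours["description"]),
    }
    return sum(WEIGHTS[name] * level for name, level in parts.items())


def is_match(found, ours, threshold=THRESHOLD):
    """Compare the overall similarity level with the threshold."""
    return similarity(found, ours) >= threshold


ours = {
    "model": "DeskJet 840C",
    "brands": ["Hewlett Packard", "HP"],
    "description": "Color Inkjet Printer",
}
found = {"model": "deskjet 840c", "brand": "hp",
         "description": "Color Inkjet Printer"}
print(is_match(found, ours))
```

With these weights, a candidate whose model number differs can score at most 0.5, so any threshold above 0.5 makes the model number mandatory for a match.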
HTML documents are semi-structured in nature [7], hence extracting information from HTML documents is significantly different from extracting information from tables in a database. The price extraction algorithm is developed based on the KPS Mining Algorithm [8]. Once a product name is matched and located in an HTML document, the following rules are applied to extract the price.

• For a product name appearing in a title (i.e. <title>, <h1> to <h6>), the price of the product is most likely to be located in the string after the product name.
• For a product name appearing in an item list, the price is most likely to be located in the same item, or in a following item up to the end of the list.
• For a product name appearing in a cell of a table, the price is most likely to be located in the same cell, or the same row in a column-wise table, or the same column in a row-wise table.
• For a product name appearing in a textual line, the price is most likely to be located in the same paragraph, or in a following paragraph up to the end of the page.
• If more than one price is found, the price is assumed to be the first one appearing after the product name.

For each HTML page retrieved by the system, a Semi-Structured Data Tree [7] is constructed. If a model number can be located in the tree, the brand and the description are searched for within the same data node. If neither can be located in the current data node, a super data string is formed from all the data nodes that are children of the parent of the current data node. The similarity level between the obtained product name and the defined product name is then computed. The price of this product is first searched for within the current data node, and then up to three levels if no price information can be found.

3.3 Price Watcher Limitations

One technical limitation is that the price watcher cannot distinguish between Singapore dollars and American dollars.
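The ambiguity shows up even in a very simple price pattern. The sketch below is a hypothetical illustration rather than the prototype's detector: it works on plain text instead of the Semi-Structured Data Tree, applies the Section 3.2 rule of taking the first price after a matched product name, and uses an assumed regular expression `PRICE_RE` and an assumed helper `first_price_after`.

```python
import re

# Assumed pattern: an optional "S" prefix, a dollar sign, then digits with
# optional thousands separators and an optional decimal part.
PRICE_RE = re.compile(r"(S?\$)\s*(\d+(?:,\d{3})*(?:\.\d+)?)")


def first_price_after(text, product_name):
    """Return (symbol, amount) for the first price following product_name.

    Returns None when the name or a subsequent price cannot be found.
    """
    name = re.search(re.escape(product_name), text, flags=re.IGNORECASE)
    if name is None:
        return None
    price = PRICE_RE.search(text, name.end())
    if price is None:
        return None
    symbol, digits = price.groups()
    return symbol, float(digits.replace(",", ""))


line_a = "Canon BJC-4200SP Color Bubble Jet Printer S$ 1,299.00 (was S$1,499.00)"
line_b = "Canon BJC-4200SP Color Bubble Jet Printer $1,299.00"
print(first_price_after(line_a, "bjc-4200sp"))
print(first_price_after(line_b, "bjc-4200sp"))
```

Both lines yield the same amount, and the captured symbol alone does not say which currency the shop intended.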
The reason is that “S$” and “$” are used interchangeably in Singapore. In the current prototype implementation, the price watcher can only deal with textual data. Another problem is that the detected product name may not be the one to be monitored even though a high similarity level is calculated. For example, “Cartridge for HP DeskJet 840C Printer” will easily be detected as “HP DeskJet 840C Printer”. A more sophisticated algorithm is needed to resolve this problem.

4 Conclusion and Future Work

In this paper, we have reported an autonomous software program called price watcher that collects competitors’ product prices on the web. The collected price information can contribute to managers’ business decision making, and it can be used to enhance shoppers’ confidence via price comparison. The application of price watcher technology is believed to be relatively new and could change the way that retail shops market their goods online. The first online shop to apply this technology would benefit most, because it would place the business one step ahead of its competitors. It is envisaged that the system can be expanded to include scanning and analysis of competitors’ other information, such as news, new products and promotions. Work can also be extended to study how this agent can be integrated into the full infrastructure of business intelligence [5].

References

1. L. Gerald and L. Spiller, Electronic shopping: The effect of customer interfaces on traffic and sales. Communications of the ACM, 41(7), pages 81-87, 1998.
2. B. Krulwich, The BargainFinder agent: Comparison price shopping on the Internet. In Agents, Bots, and other Internet Beasties, SAMS.NET publishing, pages 257-263, 1996.
3. R. B. Doorenbos, O. Etzioni and D. S. Weld, A Scalable Comparison-Shopping Agent for the World-Wide Web.
In Proceedings of the First International Conference on Autonomous Agents, pages 39-48, 1997.
4. Pricewatch for Computer Products, http://www.pricewatch.com.
5. Q. Chen, P. Chundi, U. Bayal and M. Hsu, Dynamic Software Agents for Business Intelligence Applications. ACM Autonomous Agents’98, pages 453-455, 1998.
6. J. C. French, A. L. Powell and E. Schulman, Applications of Approximate Word Matching in Information Retrieval. In Proceedings of the Sixth International Conference on Knowledge and Information Management, pages 9-15, 1997.
7. S. J. Lim and Y. K. Ng, An automated approach for retrieving hierarchical data from HTML tables. In Proceedings of the Eighth International Conference on Information and Knowledge Management, pages 466-474, 1999.
8. T. Guan and K. F. Wong, KPS: a Web Information Mining Algorithm. Computer Networks 31(11-16): 1495-1507, 1999.