Research and Realization of a Spider Model Facing URL

CLC Number: TP393.08

    Abstract:

    The key issue in mining data on the Web is how to design an intelligent and effective spider. This paper analyzes in detail the workflow and key technologies of a URL-oriented spider. It proposes managing the URL list with several queues and, in order to download HTML files at high speed, sorting the URLs by document relevance. Moreover, it introduces the idea of an iterative threshold into the computation of document relevance, which avoids arbitrary manual adjustment of the threshold.
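
    The abstract gives no implementation details, so the following is only a minimal sketch of how the three ideas might fit together, not the authors' method. It assumes an isodata-style iterative threshold (repeatedly moving the threshold to the midpoint of the means of the two groups it separates) and a hypothetical `UrlFrontier` class with a relevance-ordered waiting queue, a seen set, and a rejected queue. The relevance scores are assumed to be computed elsewhere.

```python
import heapq
from collections import deque


def iterative_threshold(scores, tol=1e-6, max_iter=100):
    """Isodata-style threshold (an assumption, not taken from the paper).

    Start at the mean, split the scores at the threshold, then move the
    threshold to the midpoint of the two group means until it settles.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    t = sum(scores) / len(scores)
    for _ in range(max_iter):
        low = [s for s in scores if s <= t]
        high = [s for s in scores if s > t]
        if not low or not high:
            # Degenerate case (e.g. all scores equal): accept everything.
            return min(scores)
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


class UrlFrontier:
    """Hypothetical multi-queue URL manager (illustrative names only)."""

    def __init__(self):
        self._waiting = []        # heap of (-score, order, url): best first
        self._seen = set()        # every URL ever queued or rejected
        self.rejected = deque()   # URLs that fell below the threshold
        self._order = 0           # tie-breaker that keeps insertion order

    def add_batch(self, scored_urls):
        """Queue newly discovered (url, relevance) pairs from one page."""
        fresh = {}
        for url, score in scored_urls:
            if url not in self._seen:
                fresh[url] = score
        if not fresh:
            return
        threshold = iterative_threshold(list(fresh.values()))
        for url, score in fresh.items():
            self._seen.add(url)
            if score >= threshold:
                heapq.heappush(self._waiting, (-score, self._order, url))
                self._order += 1
            else:
                self.rejected.append(url)

    def next_url(self):
        """Return the most relevant waiting URL, or None if none remain."""
        if not self._waiting:
            return None
        return heapq.heappop(self._waiting)[2]
```

    In this sketch the threshold is recomputed for each batch from the scores themselves, which is one way to avoid hand-tuning a fixed cutoff. A real spider would also need fetching, parsing, and politeness controls, which are omitted here.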

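Citation:

Zhang Guoping; Wan Zhongbao; Liu Gaoyuan. Design and Implementation of an Information Publishing System Based on a Lightweight J2EE Framework[J]. Journal of East China Jiaotong University, 2007, 24(1): 71-75.
Research and Realization of a Spider Model Facing URL[J]. Journal of East China Jiaotong University, 2007, 24(1): 71-75.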

History
  • Received: October 11, 2006