Site Level Noise Removal for Search Engines

  • André Luiz da Costa Carvalho, Federal University of Amazonas, Brazil
  • Paul-Alexandru Chirita, L3S and University of Hannover, Germany
  • Edleno Silva de Moura, Universidade Federal do Amazonas, Brazil
  • Pável Calado, IST/INESC-ID, Portugal
  • Wolfgang Nejdl, L3S and University of Hannover, Germany

Track: Search

Slot: 11:00-12:30, Wednesday 24th May

The booming search engine industry has led many online organizations to try to artificially increase their ranking in order to attract more visitors to their web sites. At the same time, the growth of the web has inherently produced several navigational hyperlink structures that have a negative impact on the importance measures employed by current search engines. In this paper we propose and evaluate algorithms for identifying all these noisy links in the web graph, whether they are spam or arise from simple relationships between real-world entities represented by sites, replication of content, etc. Unlike prior work, we target noisy link structures that reside at the site level rather than the page level. We thus investigate and remove site-level mutual reinforcement relationships, abnormal support coming from one site towards another, and complex link alliances between web sites. Our experiments with the link database of the TodoBR search engine show a very strong increase in the quality of the output rankings after applying our techniques.
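
As a rough illustration of what a site-level mutual reinforcement check might look like, the sketch below aggregates page-level links into inter-site link counts and flags pairs of sites that link to each other heavily in both directions. It is not the algorithm from the paper: the function names, the use of the host name as the "site", and the `threshold` parameter are all assumptions made for this example.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Map a page URL to its site (assumption: the lowercased host name)."""
    return urlsplit(url).netloc.lower()


def site_link_counts(page_links):
    """Aggregate (source_url, target_url) page links into inter-site link counts."""
    counts = Counter()
    for src, dst in page_links:
        s, t = site_of(src), site_of(dst)
        if s != t:  # intra-site navigation links are ignored
            counts[(s, t)] += 1
    return counts


def mutual_reinforcement_pairs(counts, threshold=2):
    """Return site pairs with at least `threshold` links in BOTH directions.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    pairs = set()
    for (s, t), n in counts.items():
        # s < t reports each unordered pair only once
        if s < t and n >= threshold and counts.get((t, s), 0) >= threshold:
            pairs.add((s, t))
    return pairs
```

For example, given a list of page-level links, `mutual_reinforcement_pairs(site_link_counts(links))` returns the site pairs whose links could be discounted or dropped before computing a link-based importance measure. How such links are actually treated is a design decision that this sketch leaves open.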
