umen
February 21st, 2008, 08:16 AM
Hello all
How is the data collected on sites like digg.com or Slashdot.com and so on?
Is it collected manually, by web spiders, or by something else?
goatslayer
February 21st, 2008, 08:52 AM
I think you'll find something like Google News uses a spider, a very complicated one.
Digg - I thought the whole point is that users suggest where to point a spider. For the summary, the spider takes the first few lines of the website or whatever, and then the main Digg page points you to the entire article.
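To give a rough idea of the "take the first few lines for the summary" part, here is a minimal sketch in Python. I'm not claiming this is how Digg does it; the names TextExtractor and summarize and the 200-character limit are made up for illustration. It strips the tags from a page that has already been downloaded and keeps the first chunk of visible text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside script/style tags

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            text = data.strip()
            if text:
                self.chunks.append(text)


def summarize(html, max_chars=200):
    """Return roughly the first max_chars characters of visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join the text chunks and collapse runs of whitespace.
    text = " ".join(" ".join(parser.chunks).split())
    if len(text) <= max_chars:
        return text
    # Cut at the last space so the summary does not end mid-word.
    return text[:max_chars].rsplit(" ", 1)[0] + "..."
```

A real spider would also have to fetch the page, respect robots.txt, and find the actual article body rather than navigation text, which is where most of the complexity goes.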
And then there is RSS, which basically (very basically) works like this: you set up a number of links (by subscription, for example), which are queried, and your copy of the content is updated as the actual content is updated, whether that happens in a client app or in an app sitting on top of a web server as part of a web page.
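As a sketch of that RSS idea, assuming a plain RSS 2.0 feed: the Python below parses a feed and reports only the items it has not seen before, which is more or less what gets repeated each time the links are queried. SAMPLE_FEED, parse_items and new_items are names I made up, and the feed is a hard-coded string rather than something fetched from a real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed contents; a real aggregator would download this
# from each subscribed URL on a schedule.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
      <description>Summary of the first story.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
      <description>Summary of the second story.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return a list of dicts with the title, link and summary of each item."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items


def new_items(feed_xml, seen_links):
    """Return items whose link is not in seen_links, then remember them."""
    fresh = [it for it in parse_items(feed_xml) if it["link"] not in seen_links]
    seen_links.update(it["link"] for it in fresh)
    return fresh


seen = set()
for entry in new_items(SAMPLE_FEED, seen):
    print(entry["title"], "->", entry["link"])
```

Calling new_items again with the same feed and the same seen set returns an empty list, so only content added since the last query shows up. In practice you would fetch each feed with something like urllib.request.urlopen and keep the seen links in a database instead of in memory.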
Spiders are very important to most news aggregators though.