Fair enough. Some people don’t want to read the whole mind-numbingly long post while their eyes glaze over. For those people, my short summary would be two-fold. First, I believe the crawl/index team certainly has enough machines to do its job, and we definitely aren’t dropping documents because we’re “out of space.” The second point is that we continue to listen to webmaster feedback to improve our search. We’ve addressed the issues that we’ve seen, but we continue to read through the feedback to look for other ways that we could improve.
People have been asking for more details on “pages dropping from the index,” so I thought I’d write down a brain dump of everything I know about it, to have it all in one place. Bear in mind that this is my best recollection, so I’m not claiming that it’s perfect.
Bigdaddy: Done by March
- In December, the crawl/index team was ready to debut Bigdaddy, a software upgrade to our crawling and parts of our indexing.
- In early January,...