At a recent Twitter conference, we overheard attendees bragging about their datasets. “I have 800 million rows in mine,” one developer said.
Well, have you ever tried to search a dataset that large, or larger?
At Rackspace, a few developers eat datasets that large for lunch. Stu Hood, tech lead at Rackspace’s Mail and Apps group, is one of them. In this video interview with Robert Scoble, he shares his insights into the tools he uses to search huge datasets, including Hadoop and Lucene. Watch as he explains some of the issues with building search engines that work across huge datasets.