Free Search

Scale-up versus Scale-out

July 30, 2007

I just ran across a paper from IBM comparing scaling-up (using bigger boxes) to scaling-out (using more boxes). They use Nutch search as their workload, and conclude “… that scale-out solutions have an indisputable performance and price/performance advantage over scale-up for search workloads.” Not exactly a big surprise, but it’s good to have objective data. They also conclude that “Scale-out systems are still in a significant disadvantage with respect to scale-up when it comes to systems management.” Hmm. With frameworks like Hadoop, folks shouldn’t be bothered as much by the more frequent host failures that a scale-out system is prone to.

siren song

December 18, 2006

Nutch developer Sami Siren seems to be diving into Hadoop, with his second post, this time examining the underutilized record facility. I’m hoping that, once we get a particular bug fixed, we’ll start using records for lots of Hadoop’s internals. Some fun cases will be replacing things like the source for IntWritable with something as simple as:

class IntWritable { int value; }
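For contrast, here is a rough sketch of the kind of hand-written serialization code that a one-line record definition like that could stand in for. This is not Hadoop's actual IntWritable source: the class name IntWritableSketch and the roundTrip helper are made up for illustration, and it uses only java.io so it runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a hand-written writable; not Hadoop's real class.
class IntWritableSketch {
    private int value;

    IntWritableSketch() {}

    IntWritableSketch(int value) {
        this.value = value;
    }

    int get() {
        return value;
    }

    void set(int value) {
        this.value = value;
    }

    // Serialize the single field as a 4-byte big-endian int.
    void write(DataOutput out) throws IOException {
        out.writeInt(value);
    }

    // Overwrite this object's state with an int read from the stream.
    void readFields(DataInput in) throws IOException {
        value = in.readInt();
    }

    // Illustration-only helper: write to a byte buffer, then read back into a new object.
    static IntWritableSketch roundTrip(IntWritableSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            IntWritableSketch copy = new IntWritableSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Every field added to a class like this means touching both write and readFields by hand, which is the boilerplate a generated record would take care of.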

Hadoop’s made the news!

November 22, 2006

I just spotted a complimentary article about Hadoop, Lucene & Nutch.

objectivity, again

July 3, 2006

Battelle’s blog has elicited a good discussion of search engine objectivity. I discussed this issue a while ago. One comment led to a good article (pdf) on the topic.

travel plans

April 24, 2006

Next Thursday, I’ll be in San Francisco for the Nutch Meeting.

I’ll be in Helsinki for most of July, hosted by Wray Buntine, attending the International Workshop on Intelligent Information Access there July 6-8, among other things.

I’ll probably also attend the Open Source Information Retrieval workshop at SIGIR in August.
