Finding Needles in a 10 TB Haystack, 140M Times/Day

Search is one of the most ubiquitous and important applications used on the internet, but it is also one of the hardest applications to do well. Google is a search engine company that began as a research project at Stanford University, and has evolved into the world's largest and most trafficked search engine in just under three years. Three main characteristics have driven this growth: search quality, index size, and speed. Addressing these issues has required tackling problems in a range of computer science disciplines, including algorithm and data structure design, networking, operating systems, distributed and fault-tolerant computing, information retrieval, and user interface design. In this talk, I'll focus on Google's unique hardware platform of 10,000 commodity PCs running Linux, and some of the challenges and benefits presented by this platform. I'll also describe some of the interesting problems that arise in crawling and indexing more than a billion web pages, and performing 140 million queries per day on this index. Finally, I'll describe some of the challenges facing search engines in the future.
Date and Time
Thursday, November 15, 2001, 4:00pm - 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Speaker
Rob Shillner, Google, Inc.
Host
David Dobkin

Contributions to and/or sponsorship of any event does not constitute departmental or institutional endorsement of the specific program, speakers or views presented.
