Real-time Search With Lucene

Real-time search is a fuzzy concept, but it basically means shrinking the time between a modification to an index and the moment users can see it, either to a near-negligible quantity or to a delay small enough to be acceptable for a given real-time application. Not all applications need real-time search, but one type of application that does is very popular these days: social networking sites. The average social networking site would like user changes to be searchable almost immediately. With Lucene, this type of rapid-update application has so far required you to jump through quite a few hoops and accept more than a few compromises.

The future looks a bit more rosy though.

The longer-term hope for real-time search in Lucene has been to create an IndexReader that can read the unflushed state that IndexWriter holds in RAM. That is easier said than done, though. What is actually materializing at this time is a slightly different approach: as soon as Lucene 2.9, you will be able to ask for an IndexReader from a live IndexWriter. One of the people working on this (Lucene guru Mike McCandless) calls this near real-time search. Briefly, it works like this (note: I am not working on this issue and do not know it in depth; I am just following along):

When you ask the IndexWriter for an IndexReader, the IndexWriter is flushed (documents accumulated in RAM are written to disk) but not committed (a commit would fsync the files, write a new segments file, etc.). The returned IndexReader searches over the previously committed segments as well as the new, flushed-but-uncommitted segment. Because flushing will likely be processor bound rather than IO bound, it should be a step that can be attacked with more processor power if it is found to be too slow. Also, deletes are carried in RAM rather than flushed to disk, which may help in eking out a bit more speed.
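
The flush-versus-commit distinction above can be sketched with a small toy model. This is only an illustration of the visibility rules as described here, not Lucene's implementation or API: `ToyWriter`, `ToyReader`, `getReader()` and `openCommittedReader()` are hypothetical names, segments are plain lists of document ids, and re-adding a deleted document is not handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: every class and method here is hypothetical, not Lucene's API.
class NrtSketch {

    /** Point-in-time view over a fixed set of segments plus the deletes known at open time. */
    static final class ToyReader {
        private final List<List<String>> segments;
        private final Set<String> deleted;

        ToyReader(List<List<String>> segments, Set<String> deleted) {
            // Copy so later writer activity cannot change what this reader sees.
            this.segments = new ArrayList<>(segments);
            this.deleted = new HashSet<>(deleted);
        }

        boolean contains(String doc) {
            if (deleted.contains(doc)) {
                return false;
            }
            for (List<String> segment : segments) {
                if (segment.contains(doc)) {
                    return true;
                }
            }
            return false;
        }
    }

    static final class ToyWriter {
        private final List<List<String>> committed = new ArrayList<>();
        private final List<List<String>> flushed = new ArrayList<>();
        private final Set<String> committedDeletes = new HashSet<>();
        private final Set<String> ramDeletes = new HashSet<>();
        private List<String> ramBuffer = new ArrayList<>();

        void add(String doc) { ramBuffer.add(doc); }

        // Deletes stay in RAM until the next commit.
        void delete(String doc) { ramDeletes.add(doc); }

        /** Cheap step: turn buffered docs into a new segment without making it durable. */
        private void flush() {
            if (!ramBuffer.isEmpty()) {
                flushed.add(Collections.unmodifiableList(ramBuffer));
                ramBuffer = new ArrayList<>();
            }
        }

        /** Expensive step: in a real index, fsync and the new segments file would happen here. */
        void commit() {
            flush();
            committed.addAll(flushed);
            flushed.clear();
            committedDeletes.addAll(ramDeletes);
            ramDeletes.clear();
        }

        /** Near real-time reader: flushes, then sees committed + flushed segments and RAM deletes. */
        ToyReader getReader() {
            flush();
            List<List<String>> visible = new ArrayList<>(committed);
            visible.addAll(flushed);
            Set<String> deletes = new HashSet<>(committedDeletes);
            deletes.addAll(ramDeletes);
            return new ToyReader(visible, deletes);
        }

        /** What a reader opened on the last commit point would see. */
        ToyReader openCommittedReader() {
            return new ToyReader(committed, committedDeletes);
        }
    }

    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter();
        writer.add("doc1");
        writer.commit();          // doc1 is durable
        writer.add("doc2");       // buffered in RAM
        writer.delete("doc1");    // delete held in RAM

        ToyReader nrt = writer.getReader();
        ToyReader onDisk = writer.openCommittedReader();
        System.out.println("NRT reader sees doc2: " + nrt.contains("doc2"));
        System.out.println("NRT reader sees doc1: " + nrt.contains("doc1"));
        System.out.println("Committed reader sees doc2: " + onDisk.contains("doc2"));
        System.out.println("Committed reader sees doc1: " + onDisk.contains("doc1"));
    }
}
```

In this model the near real-time reader reflects the uncommitted add and delete, while a reader opened on the last commit point does not, which is the trade the article describes: visibility without paying for a commit.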


The result is that you can add and remove documents from a Lucene index in near real time by continuously asking the IndexWriter for a new reader every second or every couple of seconds. I haven't seen a non-synthetic test yet, but it looks like it has been tested at around 50 document updates per second without heavy slowdown (e.g., the results are visible every second). The patch takes advantage of LUCENE-1483, which keys FieldCaches and Filters at the individual segment level rather than at the index level. This allows you to reload caches per segment rather than per index, which is essential for real-time search with filter/cache use.
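
To see why per-segment caching matters when a reader is reopened every second, here is a rough sketch comparing the two keying strategies. It is a toy model under stated assumptions, not the LUCENE-1483 code: `PerSegmentCache`, `PerIndexCache` and `warm()` are hypothetical names, segment names such as `_0` are placeholders, and the "load" is a stand-in for an expensive cache build.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: names are hypothetical and this is not Lucene's FieldCache API.
class SegmentCacheSketch {

    /** Cache keyed by segment: unchanged segments keep their entries across reopens. */
    static final class PerSegmentCache {
        private final Map<String, String> entries = new HashMap<>();

        /** Ensures every segment has an entry; returns how many had to be loaded. */
        int warm(List<String> segmentNames) {
            int loaded = 0;
            for (String name : segmentNames) {
                if (!entries.containsKey(name)) {
                    entries.put(name, "cache for " + name); // stands in for an expensive load
                    loaded++;
                }
            }
            return loaded;
        }
    }

    /** Cache keyed by the whole index: any change to the segment list reloads everything. */
    static final class PerIndexCache {
        private List<String> cachedFor = null;

        /** Returns how many segments had to be (re)loaded for this view of the index. */
        int warm(List<String> segmentNames) {
            if (segmentNames.equals(cachedFor)) {
                return 0;
            }
            cachedFor = new ArrayList<>(segmentNames);
            return segmentNames.size();
        }
    }

    public static void main(String[] args) {
        List<String> firstOpen = Arrays.asList("_0", "_1");
        List<String> afterReopen = Arrays.asList("_0", "_1", "_2"); // one new flushed segment

        PerSegmentCache perSegment = new PerSegmentCache();
        PerIndexCache perIndex = new PerIndexCache();

        System.out.println("per-segment, first open: " + perSegment.warm(firstOpen));
        System.out.println("per-index,   first open: " + perIndex.warm(firstOpen));
        System.out.println("per-segment, reopen:     " + perSegment.warm(afterReopen));
        System.out.println("per-index,   reopen:     " + perIndex.warm(afterReopen));
    }
}
```

With index-level keys, every reopen pays for all segments again; with segment-level keys, only the newly flushed segment is loaded, so the cost of a reopen tracks the size of the change rather than the size of the index.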

To learn more about Apache Lucene and enterprise search, check out the Lucid Imagination website: www.lucidimagination.com

by: Lucid Imagination