Geekzone: technology news, blogs, forums
  Using Google Desktop Search as a network search server

Posted on 14-JAN-2005 23:12 by M Freitas. | Filed under: Blog.





We have a network drive at work where the whole team stores information relevant to our projects. It's packed with more than 250,000 files, most of them Microsoft Word and Microsoft Excel documents. I've been thinking about how to index this monster, and I think I've found a cheap option.

It should be something easy for the end user, since not everyone around there is a geek. But almost everyone knows how to use Google to search for information. So what better than using the Google interface?

At first I thought of the new Google Mini, a small search appliance recently released. It's not that expensive (US$5,000 at launch), but it is limited to 50,000 documents. The next step up is the Google Search Appliance. It can index up to 15 million documents, but it is far more expensive.

So what can we do? I looked for a way to change some of the Google Desktop Search settings to allow indexing of network drives. According to the FAQ, the tool will not index a network drive. But with some registry changes we can have the Google Desktop Search engine scan mapped network drives. For example, locate the following registry key:

[HKEY_CURRENT_USER\Software\Google\Google Desktop\HistoricalCapture\Crawler]
CRAWL_DIRS=
CRAWL_FILE=DONE

By entering !C:<TAB>!M: (where C: and M: are the drives to index) into CRAWL_DIRS and removing DONE from CRAWL_FILE, we can instruct the engine to index remote drives as well. Note that <TAB> is the TAB character. The best way to enter it is with Notepad: type !C:, press the TAB key, and type !M:. Then press CTRL-A (Select All) and CTRL-C (Copy), and paste the text into the registry value.
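As a rough sketch, the same change could be scripted rather than typed by hand. The Python below assumes that each drive is prefixed with "!" and that entries are separated by a single TAB character, as described above. It also assumes both values are plain string values (REG_SZ), which the post does not confirm. The function names are made up for this example.

```python
import sys

# Key path as quoted in the post, relative to HKEY_CURRENT_USER.
CRAWLER_KEY = r"Software\Google\Google Desktop\HistoricalCapture\Crawler"


def build_crawl_dirs(drives):
    """Prefix each drive with '!' and join the entries with TAB characters."""
    return "\t".join("!" + drive for drive in drives)


def write_crawler_settings(drives):
    """Set CRAWL_DIRS and clear CRAWL_FILE (Windows only)."""
    if sys.platform != "win32":
        raise OSError("the registry is only available on Windows")
    import winreg  # standard library module, present on Windows only

    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, CRAWLER_KEY, 0,
                        winreg.KEY_SET_VALUE) as key:
        # Assumption: both values are stored as REG_SZ strings.
        winreg.SetValueEx(key, "CRAWL_DIRS", 0, winreg.REG_SZ,
                          build_crawl_dirs(drives))
        # "Removing DONE" is interpreted here as writing an empty string.
        winreg.SetValueEx(key, "CRAWL_FILE", 0, winreg.REG_SZ, "")
```

Calling write_crawler_settings(["C:", "M:"]) as the user who runs Google Desktop Search would write the two values. Export the key first so the change can be undone.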

But how can I make this Google Desktop Search engine available to a team? What if I install it on a computer that is always on and can be used as a "Search Server"?

Google Desktop Search can be installed on any PC, but the built-in web server will only accept connections from localhost. Even this can be changed, though. I've found that DNKA will act as a proxy and allow external connections to the server! The program is very flexible, offering user control (anonymous or logged-in use), IP allow/deny lists, and logging. What's more, it lets the user define a list of drives to index, including mapped network drives. This is so much easier than manually changing the registry.
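To illustrate the general idea of such a proxy (this is not how DNKA itself is implemented, which the post does not describe), here is a minimal Python sketch. It accepts requests from allowed client addresses, relays them to the local-only web server, and returns the response. The upstream port and the allow list are assumptions chosen for the example, and any security token the search pages may require is not handled.

```python
import http.server
import urllib.error
import urllib.request

# Assumption: address of the local-only search web server (port is a placeholder).
UPSTREAM = "http://127.0.0.1:4664"
# Hypothetical allow list of client IP addresses.
ALLOWED_CLIENTS = {"127.0.0.1"}

# Opener that ignores system proxy settings, since the upstream is local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def make_proxy_handler(upstream, allowed_clients):
    """Return a handler class that relays GET requests to `upstream`."""

    class ProxyHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            # Refuse clients that are not on the allow list.
            if self.client_address[0] not in allowed_clients:
                self.send_error(403, "Client not allowed")
                return
            try:
                with _opener.open(upstream + self.path, timeout=10) as resp:
                    body = resp.read()
                    status = resp.status
                    ctype = resp.headers.get("Content-Type", "text/html")
            except urllib.error.HTTPError as err:
                self.send_error(err.code)
                return
            except urllib.error.URLError:
                self.send_error(502, "Upstream not reachable")
                return
            # Pass the upstream response back to the remote browser.
            self.send_response(status)
            self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # logging is omitted in this sketch

    return ProxyHandler


def serve(port=8080):
    """Listen on all interfaces and relay requests until interrupted."""
    handler = make_proxy_handler(UPSTREAM, ALLOWED_CLIENTS)
    http.server.ThreadingHTTPServer(("0.0.0.0", port), handler).serve_forever()
```

Running serve() on the "Search Server" would expose the search pages on port 8080 to the listed clients. Keeping the allow list tight matters, because an open proxy of this kind exposes the indexed file contents to anyone who can reach it.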

There are a couple of disadvantages, though. First, the server will only run in the context of the user who installed it, so this user must always be logged on to the server. The alternative is to create a scheduled task that runs on startup, without the need for a user to log in. Simply create a batch file, google.bat for example, with the following lines:

"C:\Program Files\Google\Google Desktop Search\GoogleDesktop.exe" /startup
"C:\Program Files\DNKA\ServerOptions.exe" /restart

Create a new scheduled task that runs when your computer starts and runs as the user with privileges to run the server programs (this is the username used to install the programs). Now, even if the computer is restarted, there's no need for someone to come around and log in!

Second, Google Desktop Search will not update the network drive index automatically. However, DNKA offers a "touch" function to help with this, and it also lets you change the server port number.

And if the whole team uses the same drive letter as the "Search Server", result links will open the correct document every time.

One of the advantages of this approach, compared with giving each team member her own Google Desktop Search, is the reduced impact on network traffic. Instead of multiple users trying to index a huge mapped drive, we have only one machine doing the job. There you go. A cheap search server...

PS. DNKA is free for personal use, with a cheap licence for commercial use.








Comments

Kingtone
Comment posted by Kingtone on 15-JAN-2005 17:07
Visit my GDS Tips page http://users.tns.net/~skingery/firefox/GDS_Tips.html
freitasm
Comment posted by freitasm on 15-JAN-2005 17:11
Kingtone, very cool page of tips you have there.
johnmeyer
Comment posted by johnmeyer on 13-DEC-2005 13:25
How can a Windows network administrator block this Google network indexing? It is causing server-busy lockups and heavy network traffic, especially during business hours.

Thanks for any ideas.
freitasm
Comment posted by freitasm on 13-DEC-2005 14:31
Perhaps by deploying a domain policy that changes the registry keys, leaving only local drives behind?



nram
Comment posted by nram on 4-FEB-2006 18:59
Using Google Desktop on a network drive is exciting stuff. Our situation is as follows: we have approx. 200,000 MS Office & Open Office documents related to our projects on our web server. We would like to provide a search facility over these documents to our office employees. Can Google Desktop be used in this scenario? If so, any ideas on how we can implement it?
freitasm
Comment posted by freitasm on 4-FEB-2006 19:09
No, I doubt it. Remember that this will create an index on each computer on your network - if you have 200,000 documents you will have this index replicated on every computer running Google Desktop.

I recommend you look at Google Mini instead. It's a dedicated search appliance, and the index will be in one single location. The Google Mini is limited to 100,000 documents though.

The next step up is the Google Search appliance, which can index up to 500,000 documents.

The Google Mini is a dedicated server and is limited to 100,000 documents. You probably can't have a desktop software-only version handling 200,000 documents without causing serious problems in the network...

nram
Comment posted by nram on 4-FEB-2006 19:12
We have a huge number of MS Office & Open Office documents [200,000 and still growing] related to our projects.
We would like to give a search facility to our users. These files are on a web server.
Is it possible to access Google Desktop through an application and display the result set in the application itself?
Assumption : Google Desktop is running on the web server PC and is indexing files.

Thanks for any ideas.
freitasm
Comment posted by freitasm on 4-FEB-2006 19:12
Actually, I see there's a Google Mini update, with options for 100,000, 200,000 and 300,000 documents. More information is on the Google Mini Product Page.

freitasm
Comment posted by freitasm on 4-FEB-2006 19:28
"Is it possible to access the Google Desktop thru an application and display the result set in the application itself. Assumption : Google Desktop is running on the web server PC and is indexing files."

Yes, I've done that, but the impact on the server's performance will be huge. You will have to tweak the built-in Google Desktop web server and open some ports on any firewall on this server, and it will very probably degrade performance.


nram
Comment posted by nram on 4-FEB-2006 19:46
freitasm, Thanks for your immediate response.
We use the web server as centralized storage for our files. It's not used as a standard web server.
Is there any way we can handle the performance degradation you expect if we have an application access Google Desktop? Could we put better servers in place to do this, maybe with more CPU power or memory?

Thanks for any ideas
nram
Comment posted by nram on 4-FEB-2006 19:51
freitasm, I did come across an article which says we could have a proxy server residing on the same PC, and this would allow users to access the Google Desktop results from any other PC.
Do you think this is a good idea to implement?

According to this article, the danger is in allowing open access to this proxy server.
The way it works is that the proxy server accepts search parameters from a browser and passes them to the local Google Desktop web server. When the Google Desktop web server returns the result set, the proxy server returns it to the browser.

Looks like an interesting idea. I am not sure about the security holes in this, though the article says that it could end up exposing all your files to the outside world.
freitasm
Comment posted by freitasm on 4-FEB-2006 19:54
As I said before, looking at the volume of documents, I don't think Google Desktop Search (which is a personal tool) will give you good results in the long term.

Better look at dedicated search appliances - like the links I posted before.

johnmeyer
Comment posted by johnmeyer on 5-FEB-2006 05:02
Can you tell me what ports google uses for the search? Thanks.
freitasm
Comment posted by freitasm on 5-FEB-2006 17:31
I can't remember now, since I don't have that server around anymore. But if you want to put this question to a wider audience, I suggest you post it in the Geekzone Forums.

rayza
Comment posted by rayza on 10-FEB-2006 15:38
Yeah, I have managed to get GDS running on a remote machine, indexing files on another, and it's accessible over the network. The only two things I need now are a way for GDS to read the comments/keywords associated with a file, and a way to edit the search page so I can limit the searches from that machine so that you don't accidentally search the net. A way to filter file types would be cool too.
BastiaanBos
Comment posted by BastiaanBos on 5-FEB-2007 21:38
Hello,

I'm trying to find out whether it's possible to let Google Desktop make an index of a server connected through the internet.

I want to install Google Desktop on a client desktop PC and make an index of all the files available on the web server, connected through the internet.

[files on server] ---------- { internet } ---------- [client with Google Desktop]

thanks,

If it's possible, how? And if not, why?


kind regards, Bastiaan Bos
freitasm
Comment posted by freitasm on 5-FEB-2007 22:42
Not possible if it's not a mapped drive on your system. If you are the administrator, consider installing Hamachi and mapping the drives.

graff117
Comment posted by graff117 on 26-MAY-2007 01:55
Actually it is possible.

You will need a file server with Google Desktop, a permanent IP address and an Apache server on it.

Now you can connect from any computer to your file server and send requests to Google Desktop.
The file server will send you back the results. You do not need Google Desktop on your other computers.

I built such an application for one of my clients and it works just perfectly.