phpMySearch Site Search Engine

Instruction Manual

1. Introduction
2. Installation instructions
3. Administration Interface
4. Statistics Tool
5. Searching
6. Layout
7. Important information

1) Introduction

phpMySearch provides a search engine for your own web site. It is not meant to replace a powerful internet-wide search system like Google, Lycos or AltaVista.

phpMySearch provides the same full boolean search queries found in the 'big' search engines, e.g.: download +software or download OR software.

phpMySearch is very easy to install and this guide will take you through all the steps you need to set it up for your web site.

You do not need programming or web server knowledge for the installation. The complete installation takes about 5 minutes, and all you need is a browser and an FTP program. The configuration can be changed at any time through a browser-based control panel.

The visual design of the search results is easily customised to match the rest of your site by means of templates. Modifying the templates with an HTML editor to create unique layouts is a simple process.

Documents in HTML, PDF, DOC or TXT format can be searched. Outdated documents are detected automatically and deleted from the database, so only the most current information is displayed to your visitors.

phpMySearch runs reliably on Linux, UNIX and Windows operating systems, and supports all the usual protocols: HTTP, HTTPS, FTP, FTPS. phpMySearch can be deployed on either internet or intranet-based web sites.

There is an online forum available at our web site where you can exchange ideas and support with other users of phpMySearch.

phpMySearch may be used privately as well as commercially, free of charge and without restrictions as long as the copyright notice on phpMySearch at the end of the search output is preserved. Support licenses are available to help you with larger and more complex projects.

2) Installation instructions

System Requirements:

Installation:

  1. Download phpMySearch from http://phpmysearch.web4.hm to your system. For *nix systems, download the phpMySearch.tar.gz file; for Windows systems, download the phpMySearch.exe file.
  2. Windows: Double-click phpMySearch.exe and follow the instructions to extract the contents.
    Unix: Extract phpMySearch in a Unix shell with: tar -xzf phpMySearch.tar.gz
  3. Copy all files from the newly created phpMySearch folder to a folder on your web space that is accessible from the web (e.g. http://www.example.com/phpMySearch/). Then open install.php in a browser (e.g. http://www.example.com/phpMySearch/install.php).
  4. Set up phpMySearch by opening admin.php in a browser window.

Support: you can get commercial support licenses at http://phpMySearch.web4.hm.

3) Administration Interface

To access the administration interface, open admin.php in your browser.

The default login and password are:

  • login: admin
  • password: admin

It is strongly recommended you change these to maintain security.

Table 1-1: Fields and buttons on the Administration page:

Field or button

Description

Search start URLs:

A list of URLs from which the crawler starts gathering information. To add a new URL to the list, type it in the field below and press the ADD button. To remove URLs, tick the check boxes next to the URLs you want to delete and press the REMOVE button. For example:
http://www.example.com
http://www.example.com/folder
http://www.example.com/folder/file.html

To add URLs that require HTTP or FTP user identification, include the user name and password in the URL: http://Username:[…]/file.html ftp://Username:[…]/pub/file.html
User identification systems based on HTML forms with GET or POST submission are not supported.

Number of attempts to retrieve the page:

If a document or server is not available, phpMySearch will try to connect to it this many times.

Not indexed URLs (Black list)

A list of URLs that the crawler will ignore. To add a new URL to the list, type it in the field below and press the ADD button. To remove URLs, tick the check boxes next to the URLs you want to delete and press the REMOVE button.

Document extensions to index

A list of document extensions that the spider should try to index. To add a new document extension to the list, type it in the field below and press the ADD button. To remove extensions, tick the check boxes next to the extensions you want to delete and press the REMOVE button.

Search depth

Search depth tells the spider how many levels of links it should follow from the start pages while crawling:
0 - don't follow any links
1 - follow links only from the first page
2 - follow links from the first page and from the pages it links to (two levels)
3 - follow links from the first page down to three levels.

Re parse all

If this check box is checked, the spider clears the database and re-crawls everything; otherwise it parses only pages that have been updated. By default it is unchecked.

Automatic spider start

Check this box if you do not have access to the crontab tool on *nix systems or the Task Scheduler on Windows systems. If it is checked, each time a visitor uses the search script, the script checks whether it is time to start the crawler. If it is time to start, or the crawler was not started at the specified time, the script starts the spider. It is recommended that you use system scheduling utilities (cron) to start the spider instead.

Start time

Time to start the spider

Start spider each (days)

Interval, in days, between spider runs

Force crawling

Click the Start Spider button to start the spider immediately.

Number of links per page

Specify the number of links to display on a single page.

Max pages block

Specify how many pages should be visible in the pages menu.

Proxy settings active

Check this box if you want the spider to connect through a proxy. Please note that this function is in beta and intended for testing. If you have any problems with proxy support, we will be glad to receive your feedback.

Proxy host

Enter the proxy host here (e.g. an IP address such as 192.168.0.1, or a host name). If you must use a special port, you can enter host:port.

Proxy user and pass

Fill this in if the proxy requires a user name and password. Use the following format: User:Password

Search Engine log file name

Enter the path to the file where the search engine logs its work.

Spider Engine log file name

Enter the path to the file where the spider logs its work.

Admin Tool log file name

Enter the path to the file where all changes made in the admin tool are logged.

Templates path

Enter the path to the template files.

PHP Full Path

Enter the path to PHP here. You can enter just PHP; sometimes you must enter the full path to PHP, e.g. /usr/bin/local/PHP or c:\PHP\php.exe. On Windows platforms you can use either backslashes or forward slashes.

Converter URLs:

Defines the URLs used to convert the file types PDF, DOC and XLS into HTML.

To transform PDF files into HTML you can use this URL: http://access.adobe.com/perl/convertPDF.pl. The document to convert is passed in the url parameter, so the format phpMySearch uses is: http://access.adobe.com/perl/convertPDF.pl?url=http://www.yourdomain.com/yourfile.pdf.

For DOC or XLS files you can try to use the converters supplied with phpMySearch. These converters are in your phpMySearch folder, e.g.: ./converter/XLSConverter.php. These PHP scripts all use external converter tools.

To install these converters you need full ROOT or administrator rights on your server.

For the converters we use:

For installation help or questions about these tools, consult their own home pages, forums and mailing lists. The phpMySearch Group cannot provide support for installing or troubleshooting these tools.

Admin Login

Login name of administrator

Admin Password

Administrator's password

Confirm Password

Confirmation of the administrator's password

Submit

Clicking the Submit button saves all changes.

Logout

Clicking the Logout button logs you out of the admin tool.
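
The Search depth setting in Table 1-1 can be pictured as a breadth-first walk that stops following links after a fixed number of levels. The following is a minimal sketch in Python, not the actual phpMySearch spider: the function crawl, the links map and the sample site are hypothetical, and an in-memory dictionary stands in for fetching and parsing real pages.

```python
from collections import deque


def crawl(start_url, links, max_depth):
    """Return the URLs reached from start_url, following links for at most max_depth levels.

    `links` is a hypothetical map of page URL -> list of linked URLs; it replaces
    real HTTP requests and HTML parsing in this sketch.
    """
    seen = {start_url}
    order = [start_url]
    queue = deque([(start_url, 0)])  # (url, number of link levels followed so far)
    while queue:
        url, level = queue.popleft()
        if level >= max_depth:
            continue  # depth limit reached: keep the page, do not follow its links
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, level + 1))
    return order


# Hypothetical site structure used only for illustration.
site = {
    "/": ["/a", "/b"],
    "/a": ["/a/1"],
    "/a/1": ["/a/1/x"],
}

print(crawl("/", site, 0))  # ['/']
print(crawl("/", site, 1))  # ['/', '/a', '/b']
print(crawl("/", site, 2))  # ['/', '/a', '/b', '/a/1']
```

With depth 0 only the start URL is indexed; each additional level lets the walk follow links one step further from the start page.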

 

4) Statistics Tool

This tool stores search terms in a database and lets you display statistics on the search.php page. First, activate the tool in the admin tool, next to the proxy settings. Then decide how many statistics entries you would like shown on search.php. With the clear button next to "Clear Stats DB:" you can clear the stats table (for example, if you want to show the most frequent search terms for this week or this month). When you have finished the settings, press the Submit button.

[Image: admin]

If you now open search.php, the first term you enter is stored in the database. When you enter a second search term, the statistics from the database are displayed according to your settings.
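
The behaviour described above (store every search term, show the most frequent ones, clear the table on request) can be sketched as follows. This is an illustration only and says nothing about how phpMySearch stores its statistics: the class SearchStats and its methods are hypothetical names, an in-memory counter replaces the database table, and folding terms to lower case is an assumption.

```python
from collections import Counter


class SearchStats:
    """Hypothetical in-memory stand-in for the statistics table."""

    def __init__(self):
        self._counts = Counter()

    def record(self, term):
        # Assumption: terms are trimmed and compared case-insensitively.
        self._counts[term.strip().lower()] += 1

    def top(self, n):
        """Return the n most frequent terms as (term, count) pairs."""
        return self._counts.most_common(n)

    def clear(self):
        """Counterpart of the "Clear Stats DB" button: forget all stored terms."""
        self._counts.clear()


stats = SearchStats()
for term in ["Download", "download", "software", "download", "manual", "software"]:
    stats.record(term)

print(stats.top(2))  # [('download', 3), ('software', 2)]
```

Clearing the counter at the start of each week or month gives the "most frequent terms for this period" view mentioned above.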

[Image: search]
 

5) Searching

Searching for a single word is simple: open search.php in your browser, type the word you would like to search for in the text field and press Submit.

You can also use Boolean logic to narrow your search. See the table below for the allowed operators.

Table 2-1: Search Boolean logic.

Operator

Description

AND +

Finds documents containing all of the specified words or phrases. Peanut AND butter finds documents with both the word peanut and the word butter.

OR &

Finds documents containing at least one of the specified words or phrases. Peanut OR butter finds documents containing either peanut or butter. The found documents could contain both items, but not necessarily.

AND NOT + -

Excludes documents containing the specified word or phrase. Peanut AND NOT butter finds documents that contain peanut but do not contain butter. NOT must be used with another operator, like AND. The search engine does not accept 'peanut NOT butter'; instead, specify peanut AND NOT butter.

OR NOT & -

Finds documents that contain one of the specified words or phrases, or that do not contain the other word. Peanut OR NOT butter finds documents that contain peanut or do not contain butter.

" "

Quotation marks are used to denote exact phrases. For example, a search on "New York Times" will match only documents containing the words as an exact phrase. It will not find pages with the words used in a different order, such as "New times in York!"

{ }

Braces are used to denote folders. For example, a search on {CPAN/objects} will match only documents stored in www.servername.com/currentlocation/CPAN/objects

You can also navigate through the site folder structure. In the drop-down box before the Submit button you will see a list of the sub-folders of the current folder. Selecting a folder name restricts the search to this folder and its sub-folders. '..' takes you one level up.
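
The meaning of the operators in Table 2-1 can be illustrated with plain set operations. The sketch below shows the logic only and is not how phpMySearch evaluates queries; the sample documents, the helper containing and all variable names are hypothetical.

```python
# Hypothetical mini index: document ID -> text.
docs = {
    "d1": "peanut butter sandwich",
    "d2": "peanut brittle",
    "d3": "butter cookies",
    "d4": "plain toast",
}


def containing(term):
    """Return the IDs of documents that contain the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term.lower() in text.lower().split()}


all_ids = set(docs)
peanut = containing("peanut")
butter = containing("butter")

and_result = peanut & butter                 # peanut AND butter
or_result = peanut | butter                  # peanut OR butter
and_not_result = peanut - butter             # peanut AND NOT butter
or_not_result = peanut | (all_ids - butter)  # peanut OR NOT butter

print(sorted(and_result))      # ['d1']
print(sorted(or_result))       # ['d1', 'd2', 'd3']
print(sorted(and_not_result))  # ['d2']
print(sorted(or_not_result))   # ['d1', 'd2', 'd4']
```

Note how OR NOT returns every document without butter as well as every document with peanut, which is why it usually produces the largest result set.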

6) Layout

You can create your own design by editing the templates with a WYSIWYG editor [e.g. Dreamweaver]. You will find the templates in the folder "templates" in your phpMySearch directory or, if you have changed the setting in the admin panel, in the "Templates path" directory. All templates with the "adm" prefix are for the admin tool. All others [e.g. main.tpl, body.tpl, body_docfrom.tpl, body_ok.tpl, refs.tpl] are the search engine templates. Here is an overview of how the individual templates fit together:

[Image: overview of the templates]

Here is a short list of the variables you can use. Remember that all searches are case sensitive and that all variables in the templates are enclosed in braces [e.g. {VARIABLE}].

Template variable  Description

{QUERY} Search string
{PAGES} Search result
{ERROR} Error message
{OUT_CURR_PATH} Search path
{URL} Result URL
{pageDate} Result Date
{expiresDate} Expires Date
{title} Website Title
{description} META tag: Description
{keywords} META tag: keywords
{author} META tag: Author
{replyTo} META tag: replyTo
{publisher} META tag: Publisher
{copyright} META tag: copyright
{contentLanguage} META tag: language
{pageTopic} META tag: page Topic
{pageType} META tag: page type
{abstract} META tag: abstract
{classification} META tag: classification
{body_1} text from the website (first 255 chars)
{body_2} text from the website (all others)
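
To make the role of the template variables concrete, the following minimal sketch shows one way a {VARIABLE} placeholder scheme can be filled in. It is an assumption-laden illustration, not the phpMySearch template engine: the function render, the sample template line and the choice to leave unknown placeholders untouched are all hypothetical.

```python
import re


def render(template, values):
    """Replace each {NAME} placeholder with values[NAME]; names are case sensitive.

    Placeholders without a value are left unchanged (an assumption of this sketch).
    """

    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return re.sub(r"\{(\w+)\}", substitute, template)


# Hypothetical one-line result template using variables from the list above.
line = '<a href="{URL}">{title}</a> ({pageDate})'
result = render(line, {"URL": "http://www.example.com/", "title": "Home"})

print(result)  # <a href="http://www.example.com/">Home</a> ({pageDate})
```

Because the lookup is case sensitive, a value supplied for title would not fill a placeholder written as {TITLE}.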

When you change the drop down menus to check boxes, radio buttons or hidden fields, ensure the variable names are the same as in the default templates.

search.php needs some variables to work [e.g. page, currPath]. You can also set these with hidden fields. The form method is always GET. If you have problems, have a look at the default templates.
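
Because the form is submitted with GET, the variables end up in the query string of the search.php URL. The sketch below shows what such a URL could look like for the two variables named above. The helper search_url, the example host and the example values are hypothetical, and the value formats phpMySearch actually expects are not specified in this manual; the default templates remain the reference.

```python
from urllib.parse import urlencode


def search_url(base, page, curr_path):
    """Build a GET URL carrying the page and currPath variables."""
    query = urlencode({"page": page, "currPath": curr_path})
    return f"{base}?{query}"


url = search_url("http://www.example.com/phpMySearch/search.php", 2, "/docs")
print(url)  # http://www.example.com/phpMySearch/search.php?page=2&currPath=%2Fdocs
```

A hidden form field has the same effect as appending its name and value to the URL in this way.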

If you send us your template or a link to your search page, we will consider including it in the official templates list in the next version of phpMySearch!

7) Important information

Please note that if phpMySearch visits web sites and stores their data in your database, you need the web site owner's permission to do so.

If the phpMySearch spider visits web sites that are not hosted on the same web server, a lot of data may be transferred between the two servers. This could become costly if you exceed your hosting limits.

To receive updates regarding phpMySearch you can subscribe to the newsletter service at http://phpMySearch.web4.hm. Also, all updates will be made available on this home page.

 


For more information on the phpMySearch Group and the phpMySearch project, please see http://phpMySearch.web4.hm.

phpMySearch is a project from:

Webagentur web4.hm
Pyrmonter Strasse 42
D-31789
Hameln
Germany.
Tel: +49-5151-609970-0
http://www.web4.hm