
Sphinx runs OK from the Linux console but not from the PHP API

My Sphinx runs OK from the Linux console:

This program (CLI search) is for testing and debugging purposes only;
it is NOT intended for production use.
[root@coinsaver sphinx]# search -i product -q iphone
Sphinx 2.1.8-id64-release (rel21-r4675)
Copyright (c) 2001-2014, Andrew Aksyonoff
Copyright (c) 2008-2014, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file '/etc/sphinx/sphinx.conf'...
index 'product': query 'iphone ': returned 88 matches of 88 total in 0.014 sec

displaying matches:
1. document=205267, weight=2773
2. document=470963, weight=2696
3. document=432191, weight=1696
4. document=125460, weight=1642
5. document=186938, weight=1642
6. document=199461, weight=1642
7. document=222081, weight=1642
8. document=249572, weight=1642
9. document=310231, weight=1642
10. document=395051, weight=1642
11. document=395052, weight=1642
12. document=430649, weight=1642
13. document=438066, weight=1642
14. document=468067, weight=1642
15. document=470947, weight=1642
16. document=470961, weight=1642
17. document=471161, weight=1642
18. document=482581, weight=1642
19. document=484640, weight=1642
20. document=490590, weight=1642

words:
1. 'iphon': 88 documents, 97 hits

But after rebooting, it does not work from the PHP API:

$sphinx = new SphinxClient();
// options....
$result = $sphinx->Query("$string*", '*');

It returns false. It worked before I rebooted my server; it seems like Nginx and Sphinx don't get along anymore, and I don't even know where to start looking.

Result of searchd --status:

searchd status
--------------
uptime: 361
connections: 1
maxed_out: 0
command_search: 0
command_excerpt: 0
command_update: 0
command_keywords: 0
command_persist: 0
command_status: 1
command_flushattrs: 0
agent_connect: 0
agent_retry: 0
queries: 0
dist_queries: 0
query_wall: 0.000
query_cpu: OFF
dist_wall: 0.000
dist_local: 0.000
dist_wait: 0.000
query_reads: OFF
query_readkb: OFF
query_readtime: OFF
avg_query_wall: 0.000
avg_query_cpu: OFF
avg_dist_wall: 0.000
avg_dist_local: 0.000
avg_dist_wait: 0.000
avg_query_reads: OFF
avg_query_readkb: OFF
avg_query_readtime: OFF

Also, there have been no entries in the Sphinx query log since the reboot.
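One possible place to start looking, offered only as a minimal sketch: an empty query log suggests the queries never reach searchd, so it may help to check whether the API port accepts TCP connections at all from the web server's side. The host and port used here (127.0.0.1 and 9312) are assumptions; use whatever `listen` is set to in your sphinx.conf. `can_connect` is a hypothetical helper, not part of any Sphinx API.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the port cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("127.0.0.1", 9312)` as the same user that runs PHP shows whether the port is reachable from there. If it is, the message returned by the PHP client's `GetLastError()` after a failed `Query()` is usually the next thing worth reading.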


How to take advantage of Sphinx multiple indexes to improve performance

I am reading a book on Sphinx and it mentions that in order to take advantage of multiple cores and the Sphinx technology itself, I'll inevitably have to split a big index into smaller ones and query them in a multi-index query. However the book doesn't go into any further details.

What are the general strategies for this? Do you simply split it in a UNION-like way, e.g.

index1: SELECT ... FROM table LIMIT 0, 1000
index2: SELECT ... FROM table LIMIT 1000, 1000
...

And then you rebuild these pieces from time to time? When a search is made, will different cores process these indexes in parallel? Or is it something different, like separating existing items into a bigger index and newly added items into a smaller index? Or separating text fields into one index and attributes into another?

Accepted answer:

Great question.

Sphinx uses one CPU core per search of a single local index, and one CPU core to build each index.

If you have two indexes, you can run two indexers at the same time and use two CPU cores. Be aware that indexing is an I/O-intensive task, so don't run too many indexers at once.

Once you have two (or more) indexes, you can search them at the same time, either by listing all of them in the search query or by using a distributed index like this:

index index_main
{
        type            = distributed
        local           = index1
        local           = index2
}

where index1 and index2 are separate indexes. In this case you can search against index_main, and Sphinx will return aggregated results from both indexes.

As for splitting the data, you can use any technique you like: splitting records by range, by hash, or by attribute value, in any combination.

My favorite one is to use modulo to determine index number like this:

For the first index:

sql_query       = SELECT id, title, description FROM <my_table> WHERE (id % 2) = 0

For the second:

sql_query       = SELECT id, title, description FROM <my_table> WHERE (id % 2) = 1

This method has some drawbacks, but in general it's a good start if you don't have lots of data.
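The modulo split above generalizes to any number of shards. The following is a minimal sketch that generates one sql_query per shard; `shard_queries` and `shard_of` are hypothetical helper names, and `my_table` stands in for the `<my_table>` placeholder used above.

```python
def shard_queries(table, columns, shard_count, id_column="id"):
    """Build one sql_query per shard, assigning rows by id modulo shard_count."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    column_list = ", ".join(columns)
    # Remainders run from 0 to shard_count - 1. Every remainder needs its own
    # shard, otherwise rows with the missing remainder are never indexed.
    return [
        "SELECT {cols} FROM {table} WHERE ({id} % {n}) = {r}".format(
            cols=column_list, table=table, id=id_column, n=shard_count, r=remainder
        )
        for remainder in range(shard_count)
    ]


def shard_of(doc_id, shard_count):
    """Return the shard number that a given document ID falls into."""
    return doc_id % shard_count


queries = shard_queries("my_table", ["id", "title", "description"], 2)
for query in queries:
    print(query)
```

Each generated query goes into the sql_query of its own source, and the resulting indexes are listed as `local` entries of the distributed index. As far as I know, searchd only searches those local indexes in parallel when `dist_threads` in the searchd section is set above its default, so that setting is worth checking too.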

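Good answer, +1. – Yavar Apr 4 '12 at 5:58

Thanks! Hope this helps. – vfedorkov Apr 4 '12 at 8:54

@vfedorkov I used the same approach as you, with multiple indexes and the modulo operator (%). But could you tell us what you mean by the drawbacks of the % operator? We have more than 50 million keywords, and I used this operator when indexing, and it seems fine, although indexing takes about 2~3 hours. We have about 8 indexes because our server has 8 cores, so I use id % 8 = 1, id % 8 = 2, and so on. – Ronald Borla Jul 25 at 12:01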


Sphinx delta indexing — still necessary to rebuild the main index?

I've been reading up on the Sphinx search engine and the Thinking Sphinx gem. In the TS docs it says...

Sphinx has one major limitation when compared to a lot of other search services: you cannot update the fields [of] a single document in an index, but have to re-process all the data for that index.

If I understand correctly, that means when a user adds or edits something, the change is not reflected in the index. So if they add a record it won't come up in searches until the entire index is rebuilt. Or if they delete a record, it will come up in searches, and then cause some kind of error or frustrating behavior.

Moreover, while rebuilding the index Sphinx is shut down. So, your app's search functionality goes off line regularly (once an hour, once every few hours), and anyone who tries to do a search then will get an error or a "try later" message.

OK, clearly none of that is acceptable in a real-world app. So you pretty much have to use delta indexing.

But apparently you still need to regularly shut down your search engine and do a full indexing...

Turning on delta indexing does not remove the need for regularly running a full re-index, as otherwise the delta index itself will grow to become just as large as the core indexes, and this removes the advantage of keeping it separate. It also slows down your requests to your server that make changes to the model records.

I don't really understand what the docs are saying here. Maybe someone can help me out. I thought the whole point of delta indexing was that you don't need to regularly rebuild the index. It's updated instantly whenever the data changes.

Because rebuilding the index every hour or every anything would be totally messed up, right?


Sphinx returning bad search results

I am using Sphinx with the Thinking Sphinx plugin. I have indexed a model called Venue with the following code (and the rake thinking_sphinx:index command)

define_index do
    indexes :name
    indexes city
    indexes zip
end

I obtain the results in my controller with this code:

@venues = Venue.search params[:search]

and I render them as JSON. The problem I have is that when I hit this URL:

http://localhost:3000/venue/list?search=Baltimo

I get nothing. But when I hit this URL:

http://localhost:3000/venue/list?search=Baltimor

I get all Venues located in the city of Baltimore. For some reason that one character makes a difference. Theoretically, I should be getting all Venues in Baltimore if I just search with one character - 'b'

Does anyone know what is going on here?

Thank you

Answer:

Unless you have enable_star set to 1 and min_prefix_len or min_infix_len set to 1 or more, you won't get B to match Baltimore (and even then, I think you need to search for B* to get the match).

What's happening here is that by default, Thinking Sphinx tells Sphinx to use an English stemmer, which allows for similar words (by characters, not by meaning) to be considered matches, so it puts Baltimor and Baltimore in the same basket.

If you want to get any part of any word matched, then you need to put something like the following in config/sphinx.yml:

development:
  enable_star: 1
  min_infix_len: 1
test:
  enable_star: 1
  min_infix_len: 1
production:
  enable_star: 1
  min_infix_len: 1

Then stop Sphinx, re-index, and restart Sphinx. Once you've done that, then searches for B* should return Baltimore.

Hope this helps.
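To see why these settings matter, and why a minimum length of 1 is costly, here is a rough Python model of what prefix and infix indexing add to the index for one keyword. It is only an illustration under the assumption that every prefix or infix of at least the minimum length is indexed alongside the word; `prefixes` and `infixes` are hypothetical helpers, not Sphinx functions.

```python
def prefixes(word, min_len):
    """All prefixes of word with at least min_len characters."""
    return [word[:end] for end in range(min_len, len(word) + 1)]


def infixes(word, min_len):
    """All distinct substrings of word with at least min_len characters."""
    found = set()
    for start in range(len(word)):
        for end in range(start + min_len, len(word) + 1):
            found.add(word[start:end])
    return found


word = "baltimore"
print(len(prefixes(word, 1)))  # 9 prefixes: "b", "ba", ..., "baltimore"
print(len(infixes(word, 1)))   # 45 distinct substrings
print(len(infixes(word, 3)))   # 28 distinct substrings
```

In this model, a search for B* or Baltimo* can match because "b" and "baltimo" are among the indexed prefixes. The count grows quickly as the minimum length drops, which is the trade-off to weigh before setting it to 1.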

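Sweet, I had just found that in the Sphinx docs... but couldn't find where it says to put it in the yml. Thanks! – Tony Apr 16 '09 at 0:21

Is there a way to make it so they don't have to type the * at the end? Like searching for wood would pull up wood and woodworking without having to type wood* – Mike Jul 8 '09 at 16:02

Great answer. For anyone who stumbles upon this, details can be found at freelancing-god.github.com/ts/en/advanced_config.html. One thing to note is that setting min_infix_len to 1 may degrade performance. – domonopoly Dec 12 at 3:07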


Is there a way to query Sphinx for records that have a particular field that is not empty?

I'm using the SPH_MATCH_EXTENDED2 match mode with Sphinx 0.9.9 and I want to write a search query that finds all records that have anything in a particular field. I have tried the following with no success:

@MyField *
@MyField !""

I figure that I can add a field to my index that specifically checks for this and query against that, but I'd prefer to have more flexibility than that--it would be really nice to be able to do this through the query syntax.

Any thoughts?


Sphinx search engine, a couple of quick questions

So, I'm just starting to read up on this; I've never implemented a search in PHP before. I have a couple of questions:

  • By the sounds of things, Sphinx needs a 'daemon', a program running in the background, to operate?
  • Say I built an index of a MySQL table, then a user uploads another record. For the search to show this record, would I have to build the index over and over, every time a user updates / creates a record?

Thanks.

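Answer:

To answer your first question: yes, Sphinx comes with a daemon that runs in the background and performs searches when instructed.

Answer:

You can use a delta indexing scheme; see here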


Sphinx on CentOS 7 can't start because searchd.pid is absent

I installed sphinx-2.2.11 on my CentOS 7:

yum install -y postgresql-libs unixODBC
wget http://sphinxsearch.com/files/sphinx-2.2.11-1.rhel7.x86_64.rpm
yum install sphinx-2.2.11-1.rhel7.x86_64.rpm

The installation went without any errors. I then created the Sphinx config and installed the PHP extension (also without errors).

I restarted Apache (httpd) and tried to start the Sphinx service:

systemctl start searchd

I got this message:

Job for searchd.service failed because a configured resource limit was exceeded. See "systemctl status searchd.service" and "journalctl -xe" for details.

After running systemctl status searchd.service:

May 02 20:28:57 kvmde43-10657.fornex.org systemd[1]: Failed to read PID from file /var/run/sphinx/searchd.pid: Invalid argument
May 02 20:28:57 kvmde43-10657.fornex.org systemd[1]: Failed to start SphinxSearch Search Engine.

In fact, there is no searchd.pid anywhere on the system, even though the installation went fine. How should I fix it?

Thanks in advance

Answer:

I have just resolved this issue.

I took a look at the Sphinx log, /var/log/sphinx/searchd.log, and noticed "Permission denied" errors for some data files under the /var/log/sphinx/data/ folder.

I ran chown sphinx:sphinx on the /var/log/sphinx/data/ folder and it started working like a charm.

Thanks


How can Sphinx do its sorting so fast?

Let's say I search for "baby". Sphinx will grab all the documents that have "baby" in them, and then sort them using my own algorithm (EXTENDED mode).

The question is, how can it sort so fast? How does it grab millions of records and then sort them within milliseconds?

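Answer:

Oh, you are asking about the magic. Sphinx (and Lucene, and lots of other search engines) uses an inverted index.

Basically, every document is chopped into tokens; the search index consists of mappings from tokens to documents, called posting lists. Processing a query consists of looking up the posting lists of the query terms and finding the matching documents. To speed this up, each token's posting list is stored as a list of integers (document IDs). This can be made more efficient by compressing the index.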
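As a toy illustration of the idea (not Sphinx's actual data structures), an inverted index with sorted posting lists can be sketched in a few lines of Python. The whitespace tokenizer, the sample documents, and all function names here are assumptions made for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each token to a sorted list of the document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            postings[token].add(doc_id)
    return {token: sorted(ids) for token, ids in postings.items()}


def intersect(left, right):
    """Intersect two sorted posting lists in a single linear pass."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            result.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            i += 1
        else:
            j += 1
    return result


def search(index, query):
    """Return the IDs of documents that contain every query token."""
    lists = [index.get(token, []) for token in query.lower().split()]
    if not lists:
        return []
    lists.sort(key=len)  # start from the rarest term to keep the work small
    result = lists[0]
    for other in lists[1:]:
        result = intersect(result, other)
    return result


docs = {1: "baby stroller", 2: "baby monitor with camera", 3: "camera tripod"}
index = build_index(docs)
print(search(index, "baby"))         # [1, 2]
print(search(index, "baby camera"))  # [2]
```

The point of the structure is that a query only touches the posting lists of its own terms, so the documents that do not contain "baby" are never read, and any ranking step only has to score the matches.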

Very interesting read (so far...) thanks for the link :) – Matthieu M. Nov 8 '10 at 11:20


Does Sphinx auto-update its index when you add data to your SQL database?

I am curious as to whether or not Sphinx will auto update its index when you add new SQL data or whether you have to tell it specifically to reindex your db.

If it doesn't, does anyone have an example of how to automate this process when the database data changes?

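Answer:

As the Sphinx documentation says in its section on real-time indexes:

Real-time indexes (or RT indexes for short) are a new backend that lets you insert, update, or delete documents (rows) on the fly.

So, to update the index on the fly, you only need to run a query: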

{INSERT | REPLACE} INTO index [(column, ...)]
VALUES (value, ...)
[, (...)]
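As a minimal sketch of how an application might assemble such a statement, the Python helper below builds a multi-row REPLACE with `%s` placeholders, the parameter style used by common Python MySQL drivers. `build_replace` and the index name `rt_pages` are hypothetical, and the sketch assumes searchd is configured with a MySQL-protocol listener that the driver connects to.

```python
def build_replace(index, columns, rows):
    """Return (sql, params) for a multi-row REPLACE INTO statement."""
    if not rows:
        raise ValueError("at least one row is required")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("each row needs one value per column")
    # One "(%s, %s, ...)" group per row; values are passed separately so the
    # driver can quote them.
    row_template = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "REPLACE INTO {} ({}) VALUES {}".format(
        index, ", ".join(columns), ", ".join([row_template] * len(rows))
    )
    params = [value for row in rows for value in row]
    return sql, params


sql, params = build_replace(
    "rt_pages", ["id", "title"], [(1, "first page"), (2, "second page")]
)
print(sql)
print(params)
```

The returned pair would then be passed to the driver's `cursor.execute(sql, params)`. Index and column names are interpolated directly, so they should come from your own code rather than from user input.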

So where do you run this SQL like statement? I am reading through the documentation but all their examples show it being queried within mysql as it is. – lockdown Sep 29 '11 at 19:02

You could issue that via your favorite MySQL client – tmg_tt Sep 30 '11 at 8:00

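Answer:

The answer is no; you need to tell Sphinx to reindex your database.

There are some steps and requirements you need to know:

  1. Main and delta indexes are required.
  2. On the first run, you need to index the main index.
  3. After the first run, you can index the delta with rotation (to make sure the service keeps running and the data on the web stays available in the meantime).
  4. Before going any further, you need to create a table to mark the "last indexed row". The last indexed row ID can be used for the next delta indexing run and for merging the delta into main.
  5. You need to merge the delta index into the main index, as described in the Sphinx docs: http://sphinxsearch.com/docs/current.html#index-merging
  6. Restart the Sphinx service.

    Tip: create your own program to run the indexing, in C# or another language. You can also try the Windows Task Scheduler.

Here is my conf: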

source Main
{
type            = mysql

sql_host        = localhost
sql_user        = root
sql_pass        = password
sql_db          = table1
sql_port        = 3306  # optional, default is 3306
sql_query_pre = REPLACE INTO table1.sph_counter SELECT 1, MAX(PageID) FROM table1.pages;
sql_query       = \
    SELECT  pd.`PageID`, pd.Status from table1.pages pd \
    WHERE pd.PageID>=$start AND pd.PageID<=$end \
    GROUP BY pd.`PageID`

sql_attr_uint       = Status

sql_query_info      = SELECT * FROM table1.`pages` pd WHERE pd.`PageID`=$id
sql_query_range     = SELECT MIN(PageID),MAX(PageID) \
              FROM table1.`pages`
sql_range_step      = 1000000
}


source Delta : Main
{
sql_query_pre = SET NAMES utf8

sql_query = \
    SELECT  PageID, Status from pages \
    WHERE PageID>=$start AND PageID<=$end

sql_attr_uint       = Status

sql_query_info      = SELECT * FROM table1.`pages` pd WHERE pd.`PageID`=$id
sql_query_range     = SELECT (SELECT MaxDoc FROM table1.sph_counter WHERE ID = 1) MinDoc,MAX(PageID) FROM table1.`pages`;
sql_range_step      = 1000000
}


index Main
{
source          = Main
path            = C:/sphinx/data/Main
docinfo         = extern
charset_type        = utf-8
}


index Delta : Main
{
    source = Delta
path = C:/sphinx/data/Delta
charset_type = utf-8
}
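The main+delta scheme configured above can be modeled with a small Python sketch. It is a toy in-memory simulation, not how Sphinx stores anything: `MainDeltaIndex` is a hypothetical class, the `rows` dictionary stands in for the SQL table, and `counter` plays the role of `sph_counter.MaxDoc`. The sketch uses a strict "greater than" for the delta, whereas the range query in the config also includes the boundary row.

```python
class MainDeltaIndex:
    """Toy model of a main index plus a delta index keyed by integer row ID."""

    def __init__(self):
        self.main = {}
        self.delta = {}
        self.counter = 0  # highest row ID covered by the main index

    def rebuild_main(self, rows):
        # Full reindex: remember the highest ID, then index every row.
        self.counter = max(rows, default=0)
        self.main = dict(rows)
        self.delta = {}

    def rebuild_delta(self, rows):
        # Cheap reindex: only rows added since the last main rebuild or merge.
        self.delta = {i: text for i, text in rows.items() if i > self.counter}

    def merge(self):
        # Fold the delta into main and advance the counter.
        self.main.update(self.delta)
        self.counter = max(self.main, default=0)
        self.delta = {}

    def search(self, word):
        # Query both indexes, as a search over "Main, Delta" would.
        hits = set()
        for index in (self.main, self.delta):
            for doc_id, text in index.items():
                if word in text.split():
                    hits.add(doc_id)
        return sorted(hits)


rows = {1: "red apple", 2: "green pear"}
idx = MainDeltaIndex()
idx.rebuild_main(rows)

rows[3] = "red plum"              # a new row arrives after the main build
before = idx.search("red")        # the new row is not searchable yet
idx.rebuild_delta(rows)
after = idx.search("red")         # now found through the delta
idx.merge()
print(before, after, idx.counter)
```

One limitation this model makes visible: because the delta is chosen by row ID alone, it picks up new rows but not edits or deletions of rows already in the main index, which is one reason a periodic full rebuild is still useful.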

You do not need to restart searchd if you pass the --rotate param. – Christian Apr 1 '13 at 23:30

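Answer:

Expanding on Anne's answer: if you're using SQL indexes, they won't update automatically. You can manage the process of reindexing after every change, but that can be expensive. One way around this is to have a core index with everything, and then a delta index with the same structure that indexes only the changes (this can be done with a boolean or timestamp column).

That way, you can reindex the delta index (which is smaller, and thus faster) on a super-regular basis, and then process both core and delta less regularly (though it's still best to do it at least daily).

But otherwise, the new RT indexes are worth looking at. You still need to update the contents yourself, and they're not tied to the database, so it's a different mindset. Also: RT indexes don't have all the features that SQL indexes do, so you'll need to decide which matters more.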


Complex Query with Sphinx

I am using Sphinx Search. It's working fine for me except for one problem: I need to return only the entries where a specific field doesn't contain a word.

Something that would look like this in MySQL:

SELECT * FROM table
   WHERE yescolumn = 'query' 
   AND othercolumn not like '%keyword%'

Answer:

You can use Sphinx's extended query syntax to pick the fields you want to search. Try running a query through Sphinx like this:

@yescolumn query @othercolumn -keyword

So in a PHP page, given a Sphinx client object named $sphinx, you might have:

$sphinx->SetMatchMode(SPH_MATCH_EXTENDED2);
$results = $sphinx->Query('@yescolumn query @othercolumn -keyword');

More information here: http://www.sphinxsearch.com/docs/current.html#searching
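If the field names and words come from application code, the query string can be assembled programmatically. The following Python sketch only builds the string shown above; `field_query` is a hypothetical helper, it does no escaping of special query characters, and it assumes a query needs at least one positive term because a query made only of exclusions is normally rejected.

```python
def field_query(include, exclude):
    """Build an extended-syntax query from {field: word} mappings."""
    if not include:
        # Assumption: a query consisting only of negated terms is not accepted.
        raise ValueError("at least one positive term is required")
    parts = ["@{} {}".format(field, word) for field, word in include.items()]
    parts += ["@{} -{}".format(field, word) for field, word in exclude.items()]
    return " ".join(parts)


query = field_query({"yescolumn": "query"}, {"othercolumn": "keyword"})
print(query)  # @yescolumn query @othercolumn -keyword
```

The resulting string is what gets passed to the client's Query call once the match mode is set to the extended syntax.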