Sphinx Search Beginner's Guide
上QQ阅读APP看书,第一时间看更新

Why use Sphinx for full-text searching?

If you're looking for a good Database Management System (DBMS), there are plenty of options available with support for full-text indexing and searches, such as MySQL, PostgreSQL, and SQL Server. There are also external full-text search engines, such as Lucene and Solr. Let's see the advantages of using Sphinx over the DBMS's full-text searching capabilities and other external search engines:

  • It has a higher indexing speed. It is 50 to 100 times faster than MySQL FULLTEXT and 4 to 10 times faster than other external search engines.
  • It also has higher searching speed since it depends heavily on the mode, Boolean vs. phrase, and additional processing. It is up to 500 times faster than MySQL FULLTEXT in cases involving a large result set with GROUP BY. It is more than two times faster in searching than other external search engines available.
  • As mentioned earlier, relevancy is among the key features one expects when using a search engine, and Sphinx performs very well in this area. It has phrase-based ranking in addition to classic statistical BM25 ranking.
  • Last but not the least, Sphinx has better scalability. It can be scaled vertically (utilizing many CPUs, many HDDs) or horizontally (utilizing many servers), and this comes out of the box with Sphinx. One of the biggest known Sphinx cluster has over 3 billion records with more than 2 terabytes of size.

In one of his presentations, Andrew Aksyonoff (creator of Sphinx) presented the following benchmarking results. Approximately 3.5 Million records with around 5 GB of text were used for the purpose.

Apart from a basic search, there are many features that make Sphinx a better solution for searching. These features include multivalve attributes, tokenizing settings, wordforms, HTML processing, geosearching, ranking, and many others. We will be taking a more elaborate look at some of these features in later chapters.