From Java to Python: Migrating Search Functionality at billiger.de

Patrick Schemitz

Patrick is a Senior Scientist at solute GmbH. An avid Pythonista since 2003, he is mainly responsible for the billiger.de search functionality, which he (co-)wrote using first Lucene, later Solr, and now SolrCloud. He also wrote the SVM-based offer categorization at billiger.de and has a keen interest in machine learning. Patrick holds a Ph.D. in particle physics from Karlsruhe university.

Abstract

Tags: solrcloud solr search python

billiger.de is a German price comparison site. Search is handled by a heavily customized Solr setup. When switching to SolrCloud earlier this year, instead of porting our custom SolrComponents to SolrCloud, we ended up re-implementing them in a Python service layer. Here we show how, and why.

Description

The search on our price comparison site billiger.de is implemented using Solr and half a dozen custom SolrComponents. When switching from Solr to SolrCloud earlier this year, we faced going over all our custom components to make them cluster-ready. What we ended up doing instead was re-implementing the custom functionality in a Python service layer that in turn uses stock SolrCloud. This talk describes our journey, shows some code, and advocates hiding implementation details like Solr vs. SolrCloud behind a service layer. Ported functionality includes boosting more successful documents, identifying brands and categories in queries, "minimum match" search, and facet ranking and facet alternatives.
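To make the service-layer idea concrete, here is a minimal sketch in Python. It is not the billiger.de code. The brand list, the field names `brand` and `category`, the 75% minimum-match value, and all function and class names are assumptions made for illustration. Only the request parameters (`q`, `defType`, `mm`, `fq`, `facet`, `facet.field`, `wt`) are standard Solr parameters. The backend is a plain callable, so callers of the service never learn whether a single Solr node or a SolrCloud cluster answers the request.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical brand list; a real system would load this from its catalog.
KNOWN_BRANDS = {"samsung", "bosch", "lego"}

Params = List[Tuple[str, str]]
Backend = Callable[[Params], Dict]


def split_brands(query: str) -> Tuple[str, List[str]]:
    """Separate known brand tokens from the remaining query terms."""
    brands: List[str] = []
    rest: List[str] = []
    for token in query.split():
        if token.lower() in KNOWN_BRANDS:
            brands.append(token.lower())
        else:
            rest.append(token)
    return " ".join(rest), brands


def build_params(query: str, min_match: str = "75%") -> Params:
    """Translate a user query into standard Solr request parameters."""
    terms, brands = split_brands(query)
    params: Params = [
        ("q", terms or "*:*"),        # match everything if only brands remain
        ("defType", "edismax"),       # query parser that honours "mm"
        ("mm", min_match),            # "minimum match" threshold (assumed value)
        ("facet", "true"),
        ("facet.field", "category"),  # assumed field name
        ("wt", "json"),
    ]
    for brand in brands:
        # Recognized brands become filter queries on an assumed "brand" field.
        params.append(("fq", f'brand:"{brand}"'))
    return params


class SearchService:
    """Facade that hides which search backend actually runs the query."""

    def __init__(self, backend: Backend) -> None:
        self._backend = backend

    def search(self, query: str) -> Dict:
        return self._backend(build_params(query))


def fake_backend(params: Params) -> Dict:
    """Stand-in for an HTTP call to Solr or SolrCloud; echoes the request."""
    return {"responseHeader": {"params": dict(params)},
            "response": {"docs": []}}


if __name__ == "__main__":
    service = SearchService(fake_backend)
    print(service.search("Samsung Fernseher 55 Zoll")["responseHeader"]["params"])
```

Swapping `fake_backend` for a function that sends the same parameters to a real `/select` handler would be the only change needed to move between backends, which is the decoupling the talk argues for.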