Reversim Summit 2019

Full session (30 minutes)

Engineering

Algorithms

Optimization

This is the easy-to-follow story of the algorithms and optimizations used to write the fuzzysearch Python library.

It starts with the classic dynamic-programming algorithm for calculating the Levenshtein distance between two strings. From there I'll describe, step by step, the development of the algorithms used to implement efficient fuzzy searching. Optimization is at the heart of this, since the goal is to find fuzzy matches efficiently.
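For readers who don't know the starting point, here is a minimal sketch of the textbook dynamic-programming computation of Levenshtein distance. It is an illustration only, not code from the fuzzysearch library. The function name `levenshtein` and the two-row layout are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein (edit) distance between a and b.

    Illustrative sketch: keeps only two rows of the DP table at a time.
    """
    # prev[j] = distance between the prefix of a processed so far and b[:j].
    # Before any character of a is processed, that is j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Distance between a[:i] and the empty string is i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca from a
                curr[j - 1] + 1,           # insert cb into a
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

This runs in time proportional to the product of the two string lengths. Fuzzy searching builds on the same recurrence but looks for approximate occurrences of a pattern anywhere inside a longer text.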

I will describe a variety of optimization techniques, from low-level bit fiddling in C extensions, written both by hand and with Cython, up to analyzing the problem domain and reducing the search space. There's a surprise twist at the end which had a significant impact on the project and is a great take-away, but I won't spoil it here!

Fuzzy Text Search: A Story of Algorithm Optimization

Tal Einat