How Fuzzy Text Search Works

Big Python 16,468 4 years ago

In this tutorial, we explore how to build fuzzy autocomplete in Python: first using libraries like `fuzzywuzzy`, then from scratch using Levenshtein distance and the Wagner-Fischer algorithm. Finally, we look at how to get C/C++-level performance in Python using vectorization and NumPy. For production use, definitely prefer one of the libraries mentioned at the beginning :)

If you're interested in text processing, dynamic programming, and making Python programs run fast, I hope you'll find the video useful. Let me know in the comments what tutorial you'd like to see next!

https://tkarabela.github.io/bigpython
https://twitter.com/BigPythonDev

🔴 SOURCE CODE 👇
https://github.com/tkarabela/bigpython/tree/master/003--fuzzy-text-search

◼️ TIMESTAMPS 🕑
00:00 Intro
00:41 Example: Fuzzy autocomplete with "rapidfuzz"
01:27 Libraries for fuzzy matching in Python
03:01 String similarity measures
04:07 Hamming distance explained
05:17 Hamming distance, implementation
05:47 Levenshtein distance explained
07:32 Levenshtein distance, implementation
08:26 Wagner-Fischer algorithm explained
13:40 Wagner-Fischer performance in Python
14:47 How to optimize it? Cython/Numba vs. vectorization
15:13 Vectorization of Wagner-Fischer algorithm
17:14 Example: Fuzzy autocomplete with vectorized Wagner-Fischer
17:53 Comparison with C implementation
18:20 Outro

◼️ REFERENCES 📚
https://en.wikipedia.org/wiki/Edit_distance
https://en.wikipedia.org/wiki/Wagner%E2%80%93Fischer_algorithm
https://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/
https://maxbachmann.github.io/RapidFuzz/

◼️ TOPICS 🎓
#ProgrammingTutorial #TextSearch #BigPython #Python

◼️ CREDITS 🙏
Icons made by Freepik from www.flaticon.com
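
◼️ EXAMPLE SKETCH 🧪
To give a feel for the from-scratch part, here is a minimal pure-Python sketch of Hamming distance and of Levenshtein distance computed with the Wagner-Fischer algorithm. It is not the code from the video or the linked repository: the function names and the `autocomplete` helper are hypothetical, and the sketch leaves out the NumPy vectorization entirely.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance via the Wagner-Fischer dynamic programme.

    Only two rows of the (len(a) + 1) x (len(b) + 1) table are kept.
    """
    prev = list(range(len(b) + 1))  # cost of turning "" into each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # cost of turning a[:i] into ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if the chars match
            ))
        prev = curr
    return prev[-1]


def autocomplete(query, words, limit=3):
    """Hypothetical helper: return the `limit` words closest to `query`."""
    q = query.lower()
    return sorted(words, key=lambda w: levenshtein(q, w.lower()))[:limit]


if __name__ == "__main__":
    print(hamming("karolin", "kathrin"))
    print(levenshtein("kitten", "sitting"))
    print(autocomplete("pyhton", ["java", "python", "haskell"], limit=1))
```

The nested loop is what makes the plain-Python version slow on large word lists. As the description says, for production use a library such as RapidFuzz is the better choice.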
