Sefaria is constantly working to maximize connections between texts. While some connections have been placed manually, we try to automate the process as much as possible for efficiency’s sake. In doing this, we have had the privilege of working in partnership with computer scientists Moshe Koppel and Avi Shmidman at Dicta, a research institute that explores computational methods for analyzing Hebrew texts. We have a good thing going: we provide the texts, they share the results of their research with us, and Sefaria users benefit as we incorporate the findings into our web interface and textual connections.
One of the results of this partnership is the Dibur Hamatchil matching script.
Traditionally, a commentary on a text is divided up into sections corresponding to a base text – the text on which it comments. Each section begins with a short quote from the base text, called the dibur hamatchil, which is usually no more than a few words and sometimes just one word.
When the base text and the commentary are as long as the Tanakh, the Talmud, or the Shulchan Arukh, or when people are working from different manuscripts, so that the quoted words may not appear in the base text exactly as written, it becomes much harder to accurately match each dibur hamatchil to the base text. That’s where Dicta comes in. Using machine learning, Dicta has come up with an algorithm whose accuracy far surpasses anything we’ve come up with ourselves.
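To give a feel for the problem, here is a minimal sketch of a naive approach to matching a dibur hamatchil against a base text. This is not Dicta's algorithm, which we do not reproduce here. It is a simple fuzzy comparison built on Python's standard difflib module, and the function name best_match and the English sample text are ours, chosen purely for illustration.

```python
from difflib import SequenceMatcher


def best_match(dibur_hamatchil, base_words):
    """Return (start_index, score) for the window of base_words that is
    most similar to the quote. Hypothetical helper, for illustration only."""
    quote_words = dibur_hamatchil.split()
    if not quote_words:
        return -1, 0.0

    n = len(quote_words)
    quote = " ".join(quote_words)
    best_index, best_score = -1, 0.0

    # Slide a window of the same word length across the base text and
    # keep the window whose characters are closest to the quote.
    for i in range(len(base_words) - n + 1):
        window = " ".join(base_words[i:i + n])
        score = SequenceMatcher(None, quote, window).ratio()
        if score > best_score:
            best_index, best_score = i, score

    return best_index, best_score


base = "in the beginning God created the heaven and the earth".split()

# The commentary quotes "the heavens", a slight variant of the base text.
index, score = best_match("the heavens", base)
print(index, base[index:index + 2])  # index 5 is where "the heaven" starts
```

A sketch like this copes with small spelling differences, but it compares every window of a long text for every quote and has no way to choose between two equally good candidates, which is part of why a more sophisticated algorithm is needed.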
How do you benefit? The script will help us build the connections between commentaries and texts on the website more quickly and more accurately so that when you click on a text, you will see a full complement of commentary and connections.
Right now on Sefaria, you can see the fruits of this labor in the Mishnah Berurah, a commentary on Orach Chayim, the first section of the Shulchan Arukh. We’ve used the new algorithm to link the two texts. Going forward, we will be using this technology to improve the linking for many more base texts and their commentaries.
If you have questions about this work, please feel free to email us at email@example.com.