Staff Spotlight: Noah Santacruz

Software Engineer Noah Santacruz recently completed his master’s degree in Natural Language Processing at Cooper Union. His thesis? “Part of Speech Handling for Aramaic in Talmud.” Or, of course, “PSHAT.”

Generally, when someone reads a sentence, they work out each word’s part of speech not from the lexicon alone, but from contextual cues. For example, when reading the sentence, “I read a book yesterday,” an English speaker knows that the word “read” is in the past tense, because the action happened yesterday, even though the word is spelled the same in the present tense. The reader also understands that the word “book” is a noun, because the sentence would make no sense if it were a verb, as in, “to book a flight.”
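To make the “read” and “book” example concrete, here is a minimal Python sketch of context-based tagging. It is only an illustration: the tiny lexicon, the tag names, and the two hand-written rules are assumptions made up for this post, and they are not how the thesis (or any real tagger) works.

```python
# Toy lexicon (an assumption for this example): each word lists the
# parts of speech it could take, with the default choice first.
LEXICON = {
    "i": ["PRON"],
    "read": ["VERB_PRESENT", "VERB_PAST"],
    "a": ["DET"],
    "book": ["NOUN", "VERB_PRESENT"],
    "yesterday": ["ADV_PAST"],
}


def tag(sentence):
    """Return (word, tag) pairs, using context to settle ambiguous words."""
    words = sentence.lower().rstrip(".").split()
    # Does the sentence contain a past-time cue such as "yesterday"?
    has_past_cue = any("ADV_PAST" in LEXICON.get(w, []) for w in words)
    tagged = []
    for word in words:
        candidates = LEXICON.get(word, ["UNKNOWN"])
        choice = candidates[0]
        previous_tag = tagged[-1][1] if tagged else None
        if previous_tag == "DET" and "NOUN" in candidates:
            # Cue 1: a word right after "a" is read as a noun.
            choice = "NOUN"
        elif has_past_cue and "VERB_PAST" in candidates:
            # Cue 2: a past-time adverb pushes an ambiguous verb to past tense.
            choice = "VERB_PAST"
        tagged.append((word, choice))
    return tagged


print(tag("I read a book yesterday."))
print(tag("I read a book."))
```

With “yesterday” in the sentence, “read” comes out as past tense; without it, the sketch falls back to the present tense. In both cases “book” is tagged as a noun because it follows “a.”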

Computers use context cues to parse texts, too. However, when it comes to Talmud, computers (and often people as well) run into trouble parsing sentences. For one thing, the Talmud is written in a mix of Hebrew and Aramaic. For another, it doesn’t use a standard sentence structure words flow together in the two languages see what we did there?

Noah’s algorithm uses neural networks to divide the Talmud into Hebrew and Aramaic sections. It then analyzes the Aramaic sections and tags each word with its part of speech, using the isolated passages and an available lexicon (his research focused on Aramaic, which is often the less understood of the two languages). His research gives scholars a better understanding of how the Talmud was written and edited: it separates out the layers of Aramaic and Hebrew, recognizes different writing and speaking styles, and identifies repetitions of certain keywords. It also makes it easier for educators and students to break up passages of the Talmud by questions asked, answers offered, and rejections of those answers, all automatically.
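The two stages described above (first split the text by language, then tag only the Aramaic parts) can be sketched in a few lines of Python. This is a rough outline under stated assumptions, not Noah’s implementation: the function names are hypothetical, a simple word-list lookup stands in for the neural networks, and the tokens (“he1”, “ar1”, and so on) are placeholders rather than real Hebrew or Aramaic words.

```python
from itertools import groupby


def split_by_language(words, classify):
    """Stage 1: group consecutive words that share a language label."""
    labeled = [(classify(word), word) for word in words]
    return [
        (language, [word for _, word in group])
        for language, group in groupby(labeled, key=lambda pair: pair[0])
    ]


def tag_aramaic(sections, lexicon):
    """Stage 2: look up a part of speech for each word in Aramaic sections."""
    tagged = []
    for language, words in sections:
        if language == "aramaic":
            tagged.extend((word, lexicon.get(word, "UNKNOWN")) for word in words)
    return tagged


# Placeholder data (assumptions): a stand-in classifier and a toy lexicon.
ARAMAIC_WORDS = {"ar1", "ar2", "ar3"}
TOY_LEXICON = {"ar1": "VERB", "ar2": "NOUN"}


def toy_classify(word):
    """Stand-in for a trained model: label a word by simple set membership."""
    return "aramaic" if word in ARAMAIC_WORDS else "hebrew"


words = ["he1", "he2", "ar1", "ar2", "he3", "ar3"]
sections = split_by_language(words, toy_classify)
tagged = tag_aramaic(sections, TOY_LEXICON)
print(sections)
print(tagged)
```

Passing the classifier in as a function keeps the two stages separate, so a trained model could replace the toy lookup without touching the tagging step.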

Noah thanks the Comprehensive Aramaic Lexicon (CAL) for providing the dataset and Dicta for advising him throughout.

2 Comments

  1. P S GREEN says

    I have read this, and previous, posts with great interest. Would it be possible to provide examples of the sort of textual analysis you refer to?

    • wolfamelia says

      We’re glad you’ve found this interesting! We link to the thesis itself in the post and recommend you check it out.
