Those who have tried to use the CAL database to work with the Aramaic of the Babylonian Talmud (Jewish Babylonian Aramaic, JBA) will undoubtedly have run into many problems not generally encountered when dealing with, say, targumic texts. In light of the interest that several new projects dealing with talmudic material have shown in our work, we are currently reviewing the database toward the twin goals of accuracy and consistency, but we feel that a detailed explanation of the problematic nature of the material is warranted for our users.
The JBA material was prepared by Michael Sokoloff as the basis of his magisterial dictionary (DJBA). How was it done? First, during a year-long stay at the CAL, he prepared an outline lexicon of entries, based largely on Jastrow. Then he had the chosen texts entered and processed via our algorithms. We then received the tagged data from him. Finally, the outline lexicon, the tagged files, and the printed dictionary were all incorporated into the CAL database.
All good, then? Not at all. Herewith the problems:
a) The CAL still contains here and there lemmata found only in Jastrow but either eliminated or spelled differently in DJBA.
b) DJBA contains entries from the Talmud not included in the textual database! As was the case with the Talmud Yerushalmi, Sokoloff omitted Hebrew material from the textbase. Where Hebrew appears within extended Aramaic contexts it is marked as Hebrew and not otherwise tagged in the text. But where an isolated Aramaic word occurs in a Hebrew context it is generally not found in the database but is included in DJBA. Similarly, many DJBA (and hence CAL) entries come from variant texts that are not those included in the textbase.
c) The headwords of DJBA are quite properly in Babylonian form: first and foremost, of course, nouns appear in the emphatic form, but the spellings also use extensive matres lectionis. The headwords of the CAL are in standard Aramaic, i.e. in the absolute and without extensive matres. Homograph numbers may also differ between the CAL entry and the corresponding entry in DJBA. This means that we have to maintain extensive data tables to supply the proper correspondences. As of this writing, the correspondences for roughly 500 lemmata have still not been collated and verified.
d) Lastly and most importantly, Sokoloff simply did not do his otherwise valuable work with the needs of the CAL in mind. Simple typographical errors in the tagging were never corrected. Even where the tagging of a specific lemma is correct and consistent, it does not necessarily match the original outline lexicon form that served as the basis of the CAL entry, nor does it necessarily match the form chosen as the headword in DJBA. Nor are all the examples of a single lemma tagged consistently across the database.
We hope to have all of these issues (except for (b) of course) corrected within a few months, but as always any assistance or corrections will be most welcome.
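For readers who want a concrete picture of the bookkeeping behind points (c) and (d), here is a minimal Python sketch. It is not the CAL's actual code or data format: the names `Headword`, `CORRESPONDENCES`, `unverified` and `unknown_tags`, as well as the placeholder headwords, are all invented for illustration. It assumes only that each headword can be identified by a spelling plus a homograph number, and that each tagged word in the textbase carries such an identifier.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Headword:
    """A headword identified by its spelling and homograph number."""
    form: str
    homograph: int


# Hypothetical correspondence table: CAL headword -> DJBA headword.
# None marks a lemma whose correspondence has not yet been verified.
# The forms are placeholders, not real lexicon entries.
CORRESPONDENCES: Dict[Headword, Optional[Headword]] = {
    Headword("abs_form_a", 1): Headword("emph_form_a", 2),  # homograph numbers differ
    Headword("abs_form_b", 1): Headword("emph_form_b", 1),
    Headword("abs_form_c", 1): None,                        # still unverified
}


def unverified(table: Dict[Headword, Optional[Headword]]) -> List[Headword]:
    """List CAL headwords that still lack a verified DJBA counterpart."""
    missing = [cal for cal, djba in table.items() if djba is None]
    return sorted(missing, key=lambda h: (h.form, h.homograph))


def unknown_tags(tags: List[Headword],
                 table: Dict[Headword, Optional[Headword]]) -> List[Headword]:
    """List tags from the textbase that match no CAL headword at all,
    for example because of a typo or a wrong homograph number."""
    return [tag for tag in tags if tag not in table]


# Invented sample of lemma tags as they might appear in tagged text.
sample_tags = [
    Headword("abs_form_a", 1),  # matches a CAL entry
    Headword("abs_form_a", 2),  # wrong homograph number
    Headword("abs_fomr_b", 1),  # typographical error in the tag
]

print(unverified(CORRESPONDENCES))
print(unknown_tags(sample_tags, CORRESPONDENCES))
```

The first call reports the lemmata still awaiting collation; the second flags the two tags in the sample that cannot be resolved to any CAL entry. A real check would of course also have to handle the cases described under (a) and (b), which no lookup table of this kind can repair on its own.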