The thesis concentrates on computational methods pertaining to ancient ostraca - ink on clay inscriptions, written in Hebrew. These texts originate from the biblical kingdoms of Israel and Judah, and are dated to the late First Temple period (8th – early 6th centuries BCE). The ostraca are almost the sole remaining epigraphic evidence from the First Temple period and are therefore important for archaeological, historical, linguistic, and religious studies of this era. This “noisy” material offers fertile ground for the development of various “robust” image analysis, image processing, computer vision and machine learning methods, dealing with the challenging domain of ancient document analysis. The common procedures of modern epigraphers involve manual and labor-intensive steps, which risk unintentionally mixing documentation with interpretation. Therefore, the main goal of this study is to establish a computerized paleographic framework for handling First Temple period epigraphic material. The major research questions addressed in this thesis are: quality evaluation of manual facsimiles; quality evaluation of ostraca images; automatic binarization of the documents and its subsequent refinement; quality evaluation of binarizations on global and local levels; identification of different writers between inscriptions (two distinct methods are proposed); image segmentation (with improvements over the classical Chan-Vese algorithm); and letters’ shape prior estimation. The developed methods were tested on real-world archaeological and modern data, and their results were found to be favorable.
New Studies in the Archaeology of Jerusalem and its Region: Collected Papers, Vol. XVI, 2023
In December 2018, the looting of an Iron II rock-cut burial cave in the village of Abu Ghosh was spotted, subsequently leading to a salvage excavation carried out by the Antiquities Theft Prevention Unit and the Jerusalem Regional Office of the Israel Antiquities Authority. Among the finds from the extensively disturbed tomb were the remains of at least ten individuals, two of whom were analyzed by ancient DNA methods.
Contrast is not uniquely defined in the literature. There is a need for a contrast measure that scales linearly and monotonically with the optical scattering depth of a translucent scattering layer that covers an object. Here, we address this issue by proposing an image contrast metric, which we call the Haziness contrast metric. In essence, the Haziness contrast compares normalized histograms of multiple blocks of the image, a pair at a time. Subsequently, we test several prominent contrast metrics from the literature, as well as the new one, by using milk as a scattering medium in front of an object to simulate a decline in image contrast. Compared with the other metrics, the Haziness contrast metric is monotonic and close to linear for increasing density of the scattering material. The Haziness contrast has a wider dynamic range, and it correctly predicts the order of scattering depth for all the channels in the RGB image. Use of the metric to assess the performance of dehazing algorithms is also suggested.
Ancient texts are unique evidence providing a glimpse into the thoughts, day-to-day life, and culture of people of long-gone eras. Paleography, the study of writing, aims at documenting the inscriptions, transliterating the texts, reconstructing their historical context, and studying the evolution of writing itself. The digital revolution gave rise to computational paleography, introducing new tools of data acquisition, image processing, and machine learning. Herein, we will provide an introduction to the emerging field of computational paleography through the lens of ancient Hebrew inscriptions, dating from the Iron Age through the Middle Ages. The years that have passed since their composition have greatly affected their state of preservation, causing blurs, stains, and erosion; moreover, some documents tend to fade in the years after their discovery. Therefore, it is of paramount importance to promptly document ancient inscriptions using the most suitable imaging techniques, such as visible, infra-red, or multispectral imaging. Image analysis and processing techniques, such as binarization, letter segmentation, and letters' prior estimation, are valuable in their own right or may serve as a stage for subsequent tasks. We will also discuss automatic handwriting analysis and writers' identification, which could shed light on the historical background of the inscriptions.
The Materiality of Greek and Roman Curse Tablets: Technological Advances, 2022
Many issues faced by paleographers and philologists in their study of the materiality of the objects at hand might pose obstacles that can literally make or break our ability to interact with a given text. The essays in this book show how new technologies are significantly helping in the tasks of deciphering, understanding, and restoring ancient texts written on different materials. Philological editions of ancient texts, and articles in which ancient artifacts are studied, sometimes require facsimiles of the discussed finds: tablets, gemstones, and papyri. The facsimiles are especially important for certain objects when a normal photograph cannot fully capture or elucidate the writing (e.g., texts written on metal lamellae). In these and other cases, as we explore below, the production of facsimiles provides a great tool in the advancement of interacting with and understanding texts. In this chapter we examine some possible methods of producing facsimiles of ancient objects, specifically those that have been studied within the projects led by Christopher A. Faraone and Sofía Torallas Tovar at the University of Chicago. These projects focus on Greco-Egyptian magical formularies and curse tablets written in Greek and Latin. Here we make an initial assessment of the material particularities of individual fragments and then describe different methods that can be used to produce black-and-white facsimiles of these artifacts. Finally, we explore the possibility of using automatic binarization algorithms and analyze the results obtained across different materials.
Arad is a well-preserved desert fort on the southern frontier of the biblical kingdom of Judah. Excavation of the site yielded over 100 Hebrew ostraca (ink inscriptions on potsherds) dated to ca. 600 BCE, the eve of Nebuchadnezzar's destruction of Jerusalem. Due to the site's isolation, its small size, and the short time span in which the texts were written, the Arad corpus holds important keys to understanding the dissemination of literacy in Judah. Here we present the handwriting analysis of 18 Arad inscriptions, including more than 150 pair-wise assessments of writer's identity. The examination was performed by two new algorithmic handwriting analysis methods and independently by a professional forensic document examiner. To the best of our knowledge, no such large-scale pair-wise assessment of ancient documents by a forensic expert has previously been published. Comparison of forensic examination with algorithmic analysis is also unique. Our study demonstrates substantial agreement between the results of these independent methods of investigation. Remarkably, the forensic examination reveals a high probability of at least 12 writers within the analyzed corpus. This is a major increment over the previously published algorithmic estimations, which revealed 4-7 writers for the same assemblage. The high literacy rate detected within the small Arad stronghold, estimated (using broadly accepted paleo-demographic coefficients) to have accommodated 20-30 soldiers, demonstrates widespread literacy in the late 7th century BCE Judahite military and administration apparatuses, with the ability to compose biblical texts during this period a possible by-product.
Our research team enjoyed the privilege of collaborating with Benjamin Sass over a period of several years. We are happy to dedicate this article to him and wish to express our gratitude for what has been both a prodigious and enjoyable experience. The purpose of our joint endeavor has been the introduction of modern techniques from computer science and physics to the realm of Iron Age epigraphy. One of the most important issues addressed during our cooperation was the topic of facsimile creation. Facsimile creation is a necessary preliminary step in the process of deciphering and analyzing ancient inscriptions. Several manual facsimile construction techniques are currently in use: drawing upon collation of the artifact; outlining on transparent paper overlaid on a photograph of the inscription; and computer-aided depiction via software such as Adobe Photoshop, Adobe Illustrator, Gimp or Inkscape (see Summary section below for software web links). Despite their importance for the field of epigraphy, little attention has thus far been devoted to the methodology of facsimile creation (though see the recent comprehensive treatment by Parker and Rollston 2016). Recent decades have seen rapid development and consolidation of various computerized image processing algorithms. Among the most basic and popular tasks in this field is the creation of a black-and-white version of a given image, denoted as image binarization (see Fig. 1a-b). Often, such a binarized image is used as a first step for further image processing tasks, such as Optical Character Recognition (OCR), text digitization and text analysis. An algorithmic creation of binarizations can therefore be seen as another method of facsimile creation.
Furthermore, a relatively new sub-domain of image processing, Historical Imaging and Processing (HIP), specializes in handling antique documents of different types, periods and origins. Accordingly, binarization algorithms stemming from HIP are even more suitable for archaeological purposes.
We deal with the general issue of handling statistical data in archaeology for the purpose of deducing sound, justified conclusions. The employment of various quantitative and statistical methods in archaeological practice has existed from its beginning as a systematic discipline in the 19th century (Drower 1995). Since this early period, the focus of archaeological research has developed and shifted several times. The last phase in this process, especially common in recent decades, is the proliferation of collaboration with various branches of the exact and natural sciences. Many new avenues of inquiry have been inaugurated, and a wealth of information has become available to archaeologists. In our view, the plethora of newly obtained data requires a careful reexamination of existing statistical approaches and a restatement of the desired focus of some archaeological investigations. We are delighted to dedicate this article to Israel Finkelstein, our teacher, adviser, colleague, an...
Proceedings of the 4th International Workshop on Historical Document Imaging and Processing, 2017
The problem of finding a prototype for typewritten or handwritten characters belongs to a family of "shape prior" estimation problems. In epigraphic research, such priors are derived manually, and constitute the building blocks of "paleographic tables". Suggestions for automatic solutions to the estimation problem are rare in both the Computer Vision and the OCR/Handwriting Text Recognition communities. We review some of the existing approaches, and propose a new robust scheme, suitable for the challenges of degraded historical documents. This fast and easy-to-implement method is employed for ancient Hebrew inscriptions dated to the First Temple period.
Confusion_Matrices: joined results of steps A and B. Inspected_Docs: the inscriptions of the Inspected Corpus, along with all the relevant data (e.g., letter quantities). Obs_N_Seps: the observed number of separations in the Inspected Corpus. MC_ITER: number of Monte Carlo (MC) simulations, in this case 100,000.
The textual evidence from ancient Judah is mainly limited to ostraca, ink-on-clay inscriptions. Their facsimiles (binary depictions) are indispensable for further analysis. Previous attempts at mechanizing the creation of facsimiles have been problematic. Here, we present a proof of concept of objective binary image acquisition, via Raman mapping. Our method is based on a new peak detection transform, handling the challenging fluorescence of the clay, and circumventing preparatory ink composition analysis. A sequence of binary mappings (signifying the peaks) is created for each wavelength; their legibility reflects the prominence of Raman lines. Applied to a biblical-period ostracon, the method exhibits high statistical significance.
Three Hebrew ostraca, found near Khirbet Zanu’ (Ḥorvat Zanoaḥ) and published by Milevski and Naveh in 2005, were re-imaged using a high-end multispectral imaging technique. The re-imaging yielded dozens of changed or added characters and resulted in renewed, longer and improved readings, which are published here. In addition, we interpret the texts of the ostraca and place them in the context of the economy and administration of Judah in the seventh century BCE.
The authors present a new method of writer identification, employing the full power of multiple experiments, which yields a statistically significant result. Each individual binarized and segmented character is represented as a histogram of 512 binary pixel patterns (3 × 3 black-and-white patches). In the process of comparing two given inscriptions under a "single author" assumption, the algorithm performs a Kolmogorov-Smirnov test for each letter and each patch. The resulting p-values are combined using Fisher's method, producing a single p-value. Experiments on both Modern and Ancient Hebrew data sets demonstrate the excellent performance and robustness of this approach.
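Two ingredients named in the abstract above can be illustrated with a minimal sketch: counting the 512 possible 3 × 3 binary patterns in a character image, and combining independent p-values with Fisher's method. This is not the authors' implementation; the function names `patch_histogram` and `fisher_combine`, the list-of-rows image representation, and the bit ordering of the patch code are all assumptions made for illustration, and the Kolmogorov-Smirnov step is omitted.

```python
import math


def patch_histogram(img):
    """Count occurrences of each of the 512 possible 3x3 binary patterns.

    `img` is assumed to be a list of equal-length rows holding 0/1 values.
    Each 3x3 window is encoded as a 9-bit integer (row-major order, an
    arbitrary choice made for this sketch).
    """
    hist = [0] * 512
    height, width = len(img), len(img[0])
    for r in range(height - 2):
        for c in range(width - 2):
            code = 0
            for dr in range(3):
                for dc in range(3):
                    code = (code << 1) | img[r + dr][c + dc]
            hist[code] += 1
    return hist


def fisher_combine(p_values):
    """Combine independent p-values (each > 0) with Fisher's method.

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom. For even degrees of freedom
    the survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!, so no external library is needed.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

With a single p-value the combination returns that value unchanged, which is a quick sanity check on the closed form. In practice a library routine such as SciPy's `scipy.stats.combine_pvalues` could replace the hand-written sum.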
2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2016
This article discusses the quality assessment of binary images. The customary ground-truth-based methodology used in the literature is shown to be problematic due to its subjective nature. Several previously suggested alternatives are surveyed and are also found to be inadequate in certain scenarios. A new approach, quantifying the adherence of a binarization to its document image, is proposed and tested using six different measures of accuracy. The measures are evaluated experimentally on datasets from the DIBCO and H-DIBCO competitions, with respect to different kinds of binarization degradations.
This paper suggests a new quality measure of an image, pertaining to its contrast. Several contrast measures exist in the current research. However, due to the abundance of image processing software solutions, the perceived (or measured) image contrast can be misleading, as the contrast may be significantly enhanced by applying grayscale transformations. Therefore, the real challenge, which was not dealt with in the previous literature, is measuring the contrast of an image taking into account all possible grayscale transformations, leading to the best "potential" contrast. Hence, we suggest an alternative "Potential Contrast" measure, based on sampled populations of foreground and background pixels (e.g., scribbles or saliency-based criteria). An exact and efficient implementation of this measure is found analytically. The new methodology is tested and is shown to be invariant to invertible grayscale transformations.
Chan-Vese is an important and well-established segmentation method. However, it tends to be challenging to implement, owing to issues such as initialization and the choice of values for several free parameters. The paper presents a detailed analysis of the Chan-Vese framework. It establishes a relation between the Otsu binarization method and the fidelity terms of the Chan-Vese energy functional, allowing for intelligent initialization of the scheme. An alternative, fast, and parameter-free morphological segmentation technique is also suggested. Our experiments indicate the soundness of the proposed algorithm.
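One way to read the relation mentioned above is sketched below, under stated assumptions: with equal weights on the two fidelity terms and the contour-length term ignored, the Chan-Vese fidelity energy of a partition produced by a gray-level threshold equals the summed within-class squared deviation, which is the quantity Otsu's criterion minimizes. The code is a brute-force illustration, not the paper's derivation or algorithm; `fidelity_energy` and `otsu_like_threshold` are hypothetical names, and pixels are assumed to be a flat list of gray levels.

```python
def fidelity_energy(pixels, t):
    """Chan-Vese fidelity terms for the partition {p <= t} versus {p > t}.

    Assumes equal weights on both regions and no length/area penalty.
    Each region contributes the sum of squared deviations from its own mean.
    """
    low = [p for p in pixels if p <= t]
    high = [p for p in pixels if p > t]
    energy = 0.0
    for region in (low, high):
        if region:
            mean = sum(region) / len(region)
            energy += sum((p - mean) ** 2 for p in region)
    return energy


def otsu_like_threshold(pixels):
    """Exhaustively pick the threshold that minimizes the fidelity energy.

    Minimizing within-class squared deviation is equivalent to Otsu's
    criterion of maximizing between-class variance.
    """
    levels = sorted(set(pixels))
    if len(levels) < 2:
        return levels[0]  # constant image: no meaningful split exists
    candidates = levels[:-1]  # keep both classes non-empty
    return min(candidates, key=lambda t: fidelity_energy(pixels, t))
```

The resulting two-class partition could then serve as a starting contour for an iterative scheme; how the paper actually uses the relation for initialization is not specified in the abstract.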
Proceedings of the National Academy of Sciences of the United States of America, Jan 11, 2016
The relationship between the expansion of literacy in Judah and composition of biblical texts has attracted scholarly attention for over a century. Information on this issue can be deduced from Hebrew inscriptions from the final phase of the First Temple period. We report our investigation of 16 inscriptions from the Judahite desert fortress of Arad, dated ca. 600 BCE, the eve of Nebuchadnezzar's destruction of Jerusalem. The inquiry is based on new methods for image processing and document analysis, as well as machine learning algorithms. These techniques enable identification of the minimal number of authors in a given group of inscriptions. Our algorithmic analysis, complemented by the textual information, reveals a minimum of six authors within the examined inscriptions. The results indicate that in this remote fort literacy had spread throughout the military hierarchy, down to the quartermaster and probably even below that rank. This implies that an educational infrastructure ...
The aim of this master's thesis is to propose a plan for implementing the concept of social responsibility in the strategy of the manufacturing division of M&V, s. r. o. The theoretical part describes the concept of social responsibility and its influence on company image, as a starting point for the practical and project parts. The aim of the practical part is to map, on the basis of qualitative research, the approaches of managers of engineering companies to social responsibility, to identify the areas these managers consider important, and to gain inspiration and practical insights for the subsequent project part. The project part proposes the first, planning phase of implementing the concept of social responsibility in the manufacturing division, with regard to satisfying the interests of key stakeholders and building the corporate image. The proposal includes a communication strategy for the most important stakeholder target groups.
The thesis concentrates on computational methods pertaining to ancient ostraca - ink on clay insc... more The thesis concentrates on computational methods pertaining to ancient ostraca - ink on clay inscriptions, written in Hebrew. These texts originate from the biblical kingdoms of Israel and Judah, and dated to the late First Temple period (8th – early 6th centuries BCE). The ostraca are almost the sole remaining epigraphic evidence from the First Temple period and are therefore important for archaeological, historical, linguistic, and religious studies of this era. This “noisy” material offers a fertile ground for the development of various “robust” image analysis, image processing, computer vision and machine learning methods, dealing with the challenging domain of ancient documents’ analysis. The common procedures of modern epigraphers involve manual and labor-intensive steps, facing the risk of unintentionally mixing documentation with interpretation. Therefore, the main goal of this study is establishing a computerized paleographic framework for handling First Temple period epigraphic material. The major research questions, addressed in this thesis are: quality evaluation of manual facsimiles; quality evaluation of ostraca images; automatic binarization of the documents and its subsequent refinement; quality evaluation of binarizations on global and local levels; identification of different writers between inscriptions (two distinct methods are proposed); image segmentation (with improvements over the classical Chan-Vese algorithm); and letters’ shape prior estimation. The developed methods were tested on real-world archaeological and modern data and their results are found to be favorable.
New Studies in the Archaeology of Jerusalem and its Region: Collected Papers, Vol. XVI, 2023
In December 2018, the looting of an Iron II rock-cut burial cave in the village of Abu Ghosh was ... more In December 2018, the looting of an Iron II rock-cut burial cave in the village of Abu Ghosh was spotted, subsequently leading to a salvage excavation carried out by the Antiquities Theft Prevention Unit and the Jerusalem Regional Office of the Israel Antiquities Authority. Among the finds from the extensively disturbed tomb were the remains of at least ten individuals, two of which were analyzed by ancient DNA methods.
Contrast is not uniquely defined in the literature. There is a need for a contrast measure that s... more Contrast is not uniquely defined in the literature. There is a need for a contrast measure that scales linearly and monotonically with the optical scattering depth of a translucent scattering layer that covers an object. Here, we address this issue by proposing an image contrast metric, which we call the Haziness contrast metric. In its essence, the Haziness contrast compares normalized histograms of multiple blocks of the image, a pair at a time. Subsequently, we test several prominent contrast metrics in the literature, as well as the new one, by using milk as a scattering medium in front of an object to simulate a decline in image contrast. Compared to other contrast metrics, the Haziness contrast metric is monotonic and close to linear for increasing density of the scattering material, compared with other metrics in the literature. The Haziness contrast has a wider dynamic range, and it correctly predicts the order of scattering depth for all the channels in the RGB image. Utilization of the metric to evaluate the performance assessment of dehazing algorithms is also suggested.
Ancient texts are unique evidence providing a glimpse into the thoughts, day-today life, and cult... more Ancient texts are unique evidence providing a glimpse into the thoughts, day-today life, and culture of people of long-gone eras. Paleography, the study of writing, aims at documenting the inscriptions, transliterating the texts, reconstructing their historical context, and studying the evolution of writing itself. The digital revolution gave rise to computational paleography, introducing new tools of data acquisition, image processing, and machine learning. Herein, we will provide an introduction to the emerging field of computational paleography through the lens of ancient Hebrew inscriptions, dating from the Iron Age through the Middle Ages. The years that passed since their composition had a great effect on their preservation level, including blurs, stains, and erosions; moreover, some documents tend to fade in the years after their discovery. Therefore, it is of paramount importance to promptly document ancient inscriptions using the most suitable imaging techniques, such as visible, infra-red, or multispectral imaging. Image analysis and processing techniques, such as binarizations, letter segmentation, and letters' prior estimation are valuable in their own right or may serve as a stage for subsequent tasks. We will also discuss automatic handwriting analysis and writers' identification, which could shed light on the historical background of the inscriptions.
The Materiality of Greek and Roman Curse Tablets: Technological Advances, 2022
Many issues faced by paleographers and philologists in their study of the materiality of the obje... more Many issues faced by paleographers and philologists in their study of the materiality of the objects at hand might provide obstacles that can literally make or break our ability to interact with a given text. The essays in this book show how new technologies are significantly helping in the tasks of deciphering, understanding, and restoring ancient texts written on different materials. Philological editions of ancient texts, and articles in which ancient artifacts are studied, sometimes require facsimiles of the discussed finds: tablets, gemstones, and papyri. The facsimiles are especially important for certain objects when a normal photograph cannot fully capture or elucidate the writing (e.g., texts written on metal lamellae). In these and other cases, as we explore below, the production of facsimiles provides a great tool in the advancement of interacting with and understanding texts. In this chapter we examine some possible methods of producing facsimiles of ancient objects, specifically those that have been studied within the projects led by Christopher A. Faraone and Sofía Torallas Tovar at the University of Chicago. These projects focus on Greco-Egyptian magical formularies and curse tablets written in Greek and Latin. Here we make an initial assessment of the material particularities of individual fragments and then describe different methods that can be used to produce black-and-white facsimiles of these artifacts. Finally, we explore the possibility of using automatic binarization algorithms and analyze the results obtained across different materials.
Arad is a well preserved desert fort on the southern frontier of the biblical kingdom of Judah. E... more Arad is a well preserved desert fort on the southern frontier of the biblical kingdom of Judah. Excavation of the site yielded over 100 Hebrew ostraca (ink inscriptions on potsherds) dated to ca. 600 BCE, the eve of Nebuchadnezzar's destruction of Jerusalem. Due to the site's isolation, small size and texts that were written in a short time span, the Arad corpus holds important keys to understanding dissemination of literacy in Judah. Here we present the handwriting analysis of 18 Arad inscriptions, including more than 150 pair-wise assessments of writer's identity. The examination was performed by two new algorithmic handwriting analysis methods and independently by a professional forensic document examiner. To the best of our knowledge, no such large-scale pair-wise assessments of ancient documents by a forensic expert has previously been published. Comparison of forensic examination with algorithmic analysis is also unique. Our study demonstrates substantial agreement between the results of these independent methods of investigation. Remarkably, the forensic examination reveals a high probability of at least 12 writers within the analyzed corpus. This is a major increment over the previously published algorithmic estimations, which revealed 4-7 writers for the same assemblage. The high literacy rate detected within the small Arad stronghold, estimated (using broadly-accepted paleo-demographic coefficients) to have accommodated 20-30 soldiers, demonstrates widespread literacy in the late 7 th century BCE Judahite military and administration apparatuses, with the ability to compose biblical texts during this period a possible by-product.
Our research team enjoyed the privilege of collaborating with Benjamin Sass over a period of seve... more Our research team enjoyed the privilege of collaborating with Benjamin Sass over a period of several years. We are happy to dedicate this article to him and wish to express our gratitude for what has been both a prodigious and enjoyable experience. The purpose of our joint endeavor has been the introduction of modern techniques from computer science and physics to the realm of Iron Age epigraphy. One of the most important issues addressed during our cooperation was the topic of facsimile creation. Facsimile creation is a necessary preliminary step in the process of deciphering and analyzing ancient inscriptions. Several manual facsimile construction techniques are currently in use: drawing upon collation of the artifact; outlining on transparent paper overlaid on a photograph of the inscription; and computer-aided depiction via software such as Adobe Photoshop, Adobe Illustrator, Gimp or Inkscape (see Summary section below for software web links). Despite their importance for the field of epigraphy, little attention has thus far been devoted to the methodology of facsimile creation (though the recent comprehensive treatment by Parker and Rollston 2016). Recent decades have seen rapid development and consolidation of various computerized image processing algorithms. Among the most basic and popular tasks in this field is the creation of a blackand-white version of a given image, denoted as image binarization (see Fig.1a-b). Often, such a binarized image is used as a first step for further image processing missions, such as Optical Character Recognition (OCR), texts digitization and text analysis tasks. An algorithmic creation of binarizations can therefore be seen as another method of facsimile creation. 
Furthermore, a relatively new sub-domain of image processing, Historical Imaging and Processing (HIP), specializes in handling antique documents of different types, periods and origins. Accordingly, binarization algorithms stemming from HIP are even more suitable for archaeological purposes.
We deal with the general issue of handling statistical data in archaeology for the purpose of ded... more We deal with the general issue of handling statistical data in archaeology for the purpose of deducing sound, justified conclusions. The employment of various quantitative and statistical methods in archaeological practice has existed from its beginning as a systematic discipline in the 19th century (Drower 1995). Since this early period, the focus of archaeological research has developed and shifted several times. The last phase in this process, especially common in recent decades, is the proliferation of collaboration with various branches of the exact and natural sciences. Many new avenues of inquiry have been inaugurated, and a wealth of information has become available to archaeologists. In our view, the plethora of newly obtained data requires a careful reexamination of existing statistical approaches and a restatement of the desired focus of some archaeological investigations. We are delighted to dedicate this article to Israel Finkelstein, our teacher, adviser, colleague, an...
Proceedings of the 4th International Workshop on Historical Document Imaging and Processing, 2017
The problem of finding a prototype for typewritten or handwritten characters belongs to a family ... more The problem of finding a prototype for typewritten or handwritten characters belongs to a family of "shape prior" estimation problems. In epigraphic research, such priors are derived manually, and constitute the building blocks of "paleographic tables". Suggestions for automatic solutions to the estimation problem are rare in both the Computer Vision and the OCR/Handwriting Text Recognition communities. We review some of the existing approaches, and propose a new robust scheme, suitable for the challenges of degraded historical documents. This fast and easy to implement method is employed for ancient Hebrew inscriptions dated to the First Temple period.
Confusion_Matrices: joined results of steps A and B. Inspected_Docs: the inscriptions of the Inspected Corpus, along with all the relevant data (e.g., letter quantities). Obs_N_Seps: the observed number of separations in the Inspected Corpus. MC_ITER: the number of Monte Carlo (MC) simulations, in this case 100,000.
The textual evidence from ancient Judah is mainly limited to ostraca, ink-on-clay inscriptions. Their facsimiles (binary depictions) are indispensable for further analysis. Previous attempts at mechanizing the creation of facsimiles have been problematic. Here, we present a proof of concept of objective binary image acquisition, via Raman mapping. Our method is based on a new peak detection transform, handling the challenging fluorescence of the clay, and circumventing preparatory ink composition analysis. A sequence of binary mappings (signifying the peaks) is created for each wavelength; their legibility reflects the prominence of Raman lines. Applied to a biblical-period ostracon, the method exhibits high statistical significance.
Three Hebrew ostraca, found near Khirbet Zanu’ (Ḥorvat Zanoaḥ) and published by Milevski and Naveh in 2005, were re-imaged using a high-end multispectral imaging technique. The re-imaging yielded dozens of changed or added characters and resulted in renewed, larger and improved readings, hereby published. In addition, we interpret the texts of the ostraca and place them in the context of the economy and administration of Judah in the seventh century BCE.
The authors present a new method of writer identification, employing the full power of multiple experiments, which yields a statistically significant result. Each individual binarized and segmented character is represented as a histogram of 512 binary pixel patterns (3 × 3 black-and-white patches). In the process of comparing two given inscriptions under a "single author" assumption, the algorithm performs a Kolmogorov-Smirnov test for each letter and each patch. The resulting p-values are combined using Fisher's method, producing a single p-value. Experiments on both Modern and Ancient Hebrew data sets demonstrate the excellent performance and robustness of this approach.
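Two building blocks of the pipeline described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `patch_histogram` and `fisher_combine` are hypothetical, the image is assumed to be a list of rows of 0/1 values, and the Kolmogorov-Smirnov step that would produce the individual p-values is omitted. Fisher's method uses the exact closed form of the chi-square survival function for an even number of degrees of freedom.

```python
import math


def patch_histogram(img):
    """Count each of the 512 possible 3x3 binary patterns in a 0/1 image.

    `img` is a list of equal-length rows holding 0 or 1 (assumed layout).
    """
    hist = [0] * 512
    rows, cols = len(img), len(img[0])
    for r in range(rows - 2):
        for c in range(cols - 2):
            code = 0
            # Pack the 9 pixels of the patch into a 9-bit pattern index.
            for dr in range(3):
                for dc in range(3):
                    code = (code << 1) | img[r + dr][c + dc]
            hist[code] += 1
    return hist


def fisher_combine(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    X = -2 * sum(ln p) follows a chi-square law with 2k degrees of freedom
    under the null hypothesis; for even degrees of freedom the survival
    function is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # equals X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

In a full pipeline, one p-value per letter and per patch would be fed to `fisher_combine`; the combination assumes the individual tests are independent, which is an idealization here.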
2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2016
This article discusses the quality assessment of binary images. The customary ground-truth-based methodology used in the literature is shown to be problematic due to its subjective nature. Several previously suggested alternatives are surveyed and are also found to be inadequate in certain scenarios. A new approach, quantifying the adherence of a binarization to its document image, is proposed and tested using six different measures of accuracy. The measures are evaluated experimentally based on datasets from DIBCO and H-DIBCO competitions, with respect to different kinds of binarization degradations.
This paper suggests a new quality measure of an image, pertaining to its contrast. Several contrast measures exist in the current research. However, due to the abundance of Image Processing software solutions, the perceived (or measured) image contrast can be misleading, as the contrast may be significantly enhanced by applying grayscale transformations. Therefore, the real challenge, which was not dealt with in the previous literature, is measuring the contrast of an image taking into account all possible grayscale transformations, leading to the best "potential" contrast. Hence, we suggest an alternative "Potential Contrast" measure, based on sampled populations of foreground and background pixels (e.g. scribbles or saliency-based criteria). An exact and efficient implementation of this measure is found analytically. The new methodology is tested and is shown to be invariant to invertible grayscale transformations.
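One way to see why such a measure admits an exact solution is sketched below. The sketch assumes that contrast is defined as the difference between the mean foreground and mean background gray levels after a transformation f; the paper's exact definition may differ. Under that assumption the objective is linear in each value f(g), so the optimum sends a gray level to the maximum when it is relatively more frequent in the foreground sample and to zero otherwise. The name `potential_contrast` and its signature are hypothetical.

```python
def potential_contrast(fg, bg, levels=256):
    """Best achievable mean(f(fg)) - mean(f(bg)) over all grayscale maps f.

    `fg` and `bg` are non-empty lists of integer gray levels in
    [0, levels - 1], sampled from foreground and background pixels.
    """
    h_fg = [0] * levels
    h_bg = [0] * levels
    for g in fg:
        h_fg[g] += 1
    for g in bg:
        h_bg[g] += 1
    # mean(f(fg)) - mean(f(bg)) = sum_g f(g) * (p_fg(g) - p_bg(g)),
    # which is maximized by f(g) = levels - 1 where the bracket is positive
    # and f(g) = 0 elsewhere.
    gain = 0.0
    for a, b in zip(h_fg, h_bg):
        gain += max(0.0, a / len(fg) - b / len(bg))
    return (levels - 1) * gain
```

Because the result depends only on how the two samples are distributed over gray levels, relabeling the levels with any invertible map leaves it unchanged, which matches the invariance property stated in the abstract.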
Chan-Vese is an important and well-established segmentation method. However, it tends to be challenging to implement, including issues such as initialization problems and establishing the values of several free parameters. The paper presents a detailed analysis of the Chan-Vese framework. It establishes a relation between the Otsu binarization method and the fidelity terms of the Chan-Vese energy functional, allowing for intelligent initialization of the scheme. An alternative, fast, and parameter-free morphological segmentation technique is also suggested. Our experiments indicate the soundness of the proposed algorithm.
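The abstract does not spell out the relation, but one standard way to see a connection is that Otsu's threshold minimizes the within-class sum of squared deviations from the two class means, which has the same form as the two Chan-Vese fidelity terms with equal weights, restricted to partitions induced by a threshold. The sketch below computes that quantity directly; `otsu_threshold` is a hypothetical helper written for illustration, not the paper's code.

```python
def otsu_threshold(pixels, levels=256):
    """Return (t, cost): the threshold minimizing the two-class squared error.

    Class 0 holds gray levels <= t and class 1 the rest. `cost` is
    sum (g - c0)^2 over class 0 plus sum (g - c1)^2 over class 1, where
    c0 and c1 are the class means. Returns (None, inf) if no split exists.
    """
    hist = [0] * levels
    for g in pixels:
        hist[g] += 1
    total_n = len(pixels)
    total_s = sum(g * n for g, n in enumerate(hist))
    total_q = sum(g * g * n for g, n in enumerate(hist))

    best_t, best_cost = None, float("inf")
    n0 = s0 = q0 = 0  # running count, sum, and sum of squares of class 0
    for t in range(levels - 1):
        n0 += hist[t]
        s0 += t * hist[t]
        q0 += t * t * hist[t]
        n1 = total_n - n0
        if n0 == 0 or n1 == 0:
            continue  # one class is empty, not a valid split
        s1, q1 = total_s - s0, total_q - q0
        # sum (g - mean)^2 = sum g^2 - (sum g)^2 / n for each class
        cost = (q0 - s0 * s0 / n0) + (q1 - s1 * s1 / n1)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost
```

A threshold found this way yields two region means that could serve as starting values for an iterative scheme, which is one plausible reading of the "intelligent initialization" mentioned above.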
Proceedings of the National Academy of Sciences of the United States of America, Jan 11, 2016
The relationship between the expansion of literacy in Judah and composition of biblical texts has attracted scholarly attention for over a century. Information on this issue can be deduced from Hebrew inscriptions from the final phase of the First Temple period. We report our investigation of 16 inscriptions from the Judahite desert fortress of Arad, dated ca. 600 BCE, the eve of Nebuchadnezzar's destruction of Jerusalem. The inquiry is based on new methods for image processing and document analysis, as well as machine learning algorithms. These techniques enable identification of the minimal number of authors in a given group of inscriptions. Our algorithmic analysis, complemented by the textual information, reveals a minimum of six authors within the examined inscriptions. The results indicate that in this remote fort literacy had spread throughout the military hierarchy, down to the quartermaster and probably even below that rank. This implies that an educational infrastructure that could support the composition of literary texts in Judah already existed before the destruction of the First Temple. A similar level of literacy in this area is attested again only 400 years later, ca. 200 BCE.
The aim of this diploma thesis is to propose a plan for implementing the concept of corporate social responsibility in the strategy of the manufacturing division of M&V, s. r. o. The theoretical part describes the concept of social responsibility and its influence on company image as a starting point for the practical and project parts. The aim of the practical part is to map, on the basis of qualitative research, the approaches of managers of engineering companies to social responsibility, to identify the areas these managers consider important, and to gain inspiration and practical insights for the subsequent project part. The project part proposes the first, planning phase of implementing the concept of social responsibility in the manufacturing division, with regard to satisfying the interests of key stakeholders and building the corporate image. The proposal includes a communication strategy for the most important stakeholder target groups.
This article surveys ongoing research of the Legibility Enhancement of Ostraca (LEO) team of Tel Aviv University in the field of computerized paleography of Hebrew Iron Age ink-written ostraca. We perform paleographic tasks using tools from the fields of image processing and machine learning. Several new techniques serving this aim, as well as an adaptation of existing ones, are described herein. This includes testing a range of signal-acquisition methodologies, out of which multispectral imaging and Raman spectroscopy have matured into imaging systems. In addition, we deal with semi- or fully automated facsimile construction and refinement, facsimile and character evaluation, as well as the reconstruction of broken character strokes. We conclude with future research directions, addressing some of the long-standing epigraphic questions, such as the number of scribes in specific corpora or detection of chronological concurrences and inconsistencies.
A recent study shows that multispectral (MS) imaging can improve the legibility of ostraca. Several examples of Iron Age ostraca unearthed over twenty years ago in Israel (Ḥorvat ʿUza and Ḥorvat Radum) are presented, showing how new images taken with an MS system improve the reading of inscriptions that have significantly faded over time. The article provides instructions for constructing a simple and low-cost MS imaging system that yields comparable results to commercial systems.
Proceedings of the 2013 ACM symposium on Document engineering - DocEng '13, 2013
Document binary images, created by different algorithms, are commonly evaluated based on a pre-existing ground truth. Previous research found several pitfalls in this methodology and suggested various approaches addressing the issue. This article proposes an alternative binarization quality evaluation solution for binarized glyphs, circumventing the ground truth. Our method relies on intrinsic properties of binarized glyphs. The features used for quality assessment are stroke width consistency, presence of small connected components (stains), edge noise, and the average edge curvature. Linear and tree-based combinations of these features are also considered. The new methodology is tested and shown to be nearly as sound as human experts' judgments.
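Of the features listed above, the presence of small connected components is the simplest to illustrate. The sketch below counts foreground components at or below a size limit in a binarized glyph. It is an assumption-laden illustration: the 4-connectivity, the `max_size` parameter, and the function names are hypothetical choices, and the abstract does not state how the authors define or score a stain.

```python
from collections import deque


def component_sizes(img):
    """Sizes of 4-connected foreground (value 1) components in a 0/1 image.

    `img` is a list of equal-length rows (assumed layout).
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and img[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def stain_count(img, max_size=3):
    """Number of components with at most `max_size` pixels (hypothetical cutoff)."""
    return sum(1 for s in component_sizes(img) if s <= max_size)
```

A count like this could be one input to the linear or tree-based feature combinations mentioned in the abstract, alongside the other three features.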
We examine how multispectral imaging can be used to document and improve reading of ancient inscriptions. The research focuses on ostraca, texts written in ink on ceramic potsherds. Three corpora of Hebrew ostraca dating to the Iron Age II were imaged in visible and near infrared light using a state-of-the-art commercial spectral imager. To assess the quality of images, we used a new quality evaluation measure which takes into account various contrast and brightness transformations. We show that there exists a wavelength range where the readability of ostraca is enhanced. Moreover, we show that it is sufficient to use certain bandpass filters to achieve the most favorable image. Our study paves the way towards a low cost multispectral method of imaging ostraca inscriptions.
In this paper we claim that during the First Temple period, no organized or fixed system of liquid volume measurements existed in Judah. The biblical bath, which has been understood to be the basic measurement of the system, was not a measurement at all but a well-known vessel: the Judahite storage jar, also known as the lmlk jar. The nēḇel and the kaḏ were two other vessels that had other uses. The lōḡ, hîn, and 'iśśārôn, which are usually termed “measurements” and considered part of the system of liquid volume measurements, were actually vessels that were part of the official Temple cult during the Second Temple period and were never part of the First Temple economy and administration.
Uploads
Thesis by Arie Shaus
Papers by Arie Shaus
Philological editions of ancient texts, and articles in which ancient artifacts are studied, sometimes require facsimiles of the discussed finds: tablets, gemstones, and papyri. The facsimiles are especially important for certain objects when a normal photograph cannot fully capture or elucidate the writing (e.g., texts written on metal lamellae). In these and other cases, as we explore below, the production of facsimiles provides a great tool in the advancement of interacting with and understanding texts.
In this chapter we examine some possible methods of producing facsimiles of ancient objects, specifically those that have been studied within the projects led by Christopher A. Faraone and Sofía Torallas Tovar at the University of Chicago. These projects focus on Greco-Egyptian magical formularies and curse tablets written in Greek and Latin. Here we make an initial assessment of the material particularities of individual fragments and then describe different methods that can be used to produce black-and-white facsimiles of these artifacts. Finally, we explore the possibility of using automatic binarization algorithms and analyze the results obtained across different materials.