Second International Conference on Natural Sciences and Technology in Manuscript Analysis
Centre for the Study of Manuscript Cultures (CSMC), Hamburg
29 February – 2 March 2016
Book of Abstracts
Conference Programme
CENTRE FOR THE STUDY OF MANUSCRIPT CULTURES
Preface
Dear colleagues and friends,
Welcome to the Centre for the Study of Manuscript Cultures (CSMC) in Hamburg and to the Second
International Conference on Natural Sciences and Technology in Manuscript Analysis!
Since 2011 the Centre has been engaged in fundamental research investigating a broad range of Asian,
African, and European manuscript cultures, represented by material artefacts, from both a historical
and a comparative perspective. The variety of research fields and academic disciplines, as well as the
large number of cultures under investigation, has allowed us to overcome simplistic attitudes, such as
treating historically contingent European developments as generally valid, or naive
dichotomies ("East-West") long considered self-evident not only in Europe, but also in Asia and Africa.
Within the CSMC as well as the SFB, the natural sciences and informatics play an active role in shaping
interdisciplinary research by utilizing methods and devices rooted, for example, in physico-chemical
measurement technologies, thus going beyond basic support for scholars from the humanities.
Our long-term goals include the establishment of an interdisciplinary research field of
generalized manuscript studies and the development of sustainable and functional tools.
This second conference dedicated to natural sciences and technology in manuscript analysis
continues our effort to bring together scholars and scientists from the humanities,
informatics, chemistry, physics, and biology. We hope that this conference will again provide a forum
for discussing various aspects of manuscript analysis and for presenting new results, technologies, and
approaches. Contributions reflect original research stressing the impact of the natural sciences,
technology, and informatics in the following areas:
● Material analysis of writing material
● Recovering lost writing
● Image analysis of visual manuscript features
● Cutting edge techniques
In addition to the regular presentations, we invite you to participate in the Round Table discussion
that will address various issues of interdisciplinary research.
We wish all of you a great conference and a pleasant stay in Hamburg. Yours,
Christian Brockmann
Michael Friedrich
Oliver Hahn
Volker Märgner
Ira Rabin
H. Siegfried Stiehl
Conference Chairs and Program Committee
Invited Speakers
Marina Bicchieri (ICRCPAL, Istituto Centrale per il Restauro e la Conservazione del Patrimonio
Archivistico e Librario, Rome, Italy)
Leif Glaser (DESY ‐ Deutsches Elektronen‐Synchrotron)
Vito Mocella (CNR‐IMM‐Istituto per la Microelettronica e Microsistemi‐Unità di Napoli, Italy)
Peter A. Stokes (King's College London, UK)
Conference Chair
Michael Friedrich (Director of CSMC, University of Hamburg, Germany)
Oliver Hahn (CSMC, University of Hamburg, BAM, Berlin, Germany)
Programme Committee
Christian Brockmann (CSMC, University of Hamburg, Germany)
Oliver Hahn (CSMC, University of Hamburg, BAM, Berlin, Germany)
Volker Märgner (CSMC, University of Hamburg, Germany)
Ira Rabin (CSMC, University of Hamburg, BAM, Berlin, Germany)
H. Siegfried Stiehl (CSMC, University of Hamburg, Germany)
International Advisory Board
Roger Easton (Rochester Institute of Technology, N.Y., USA)
Gregory Heyworth (University of Mississippi, USA)
Judith Schlanger (EPHE‐Sorbonne, Paris, France)
Friederike Seyfried (Egyptian Museum and Papyrus Collection, Berlin, Germany)
Daniel Stoekl Ben Ezra (EPHE‐Sorbonne, Paris, France)
Local Organising Committee (CSMC)
Karsten Helmholz
Christina Kaminski
Daniela Niggemeier
Irina Wandrey
CONFERENCE PROGRAMME
Monday, 29 February 2016
1:00 pm Registration
2:00 pm Welcome
SESSION I: MATERIAL ANALYSIS
Session Chair: H. Siegfried Stiehl
2:15 pm M. Bicchieri (invited)
Hard Science and History
3:00 pm M. M. Khorandi, M. Gulmini, M. Aceto, A. Agostino, and H. Sayyadshahri
The non‐invasive Approaches to identify the Dyes and Pigments of the
Haft Awrang‐i Jāmi (A Persian Manuscript from 1553 AD in MAO)
3:25 pm P. Çakar
Elemental Analysis of Mesnavi from 14th Century
3:50 pm Coffee Break
4:20 pm M. Mayer
ATWISE 5242 – A recently developed Device for imaging Watermarks in Medieval
Manuscripts
4:45 pm M. Geissbühler
Advanced Codicological Studies of Codex germanicus 6.
5:10 pm M. Delhey
Material Analysis of Buddhist Sanskrit Manuscripts preserved in Nepal
5:35 pm D. Nosnitzin und A. Brita
A Field Experience in Ink Studies: Manuscripts from Northern Ethiopia (East Tigray)
6:30 pm Reception
Tuesday, 1 March 2016
SESSION II: RECOVERING LOST WRITING
Session Chair: Ira Rabin
9:00 am V. Mocella (invited), E. Brun, C. Ferrero, and D. Delattre
The Quest of lost Ancient Literature: X‐ray Phase Contrast Tomography reveals the
Secrets of Herculaneum Papyri
9:45 am K. T. Knox
Image Processing Software for the Recovery of Erased or Damaged Text
10:10 am C. T. C. Arsene, P. E. Pormann, W. I. Sellers, and S. Bhayro
Computational Techniques in Multispectral Image Processing:
Application to the Syriac Galen Palimpsest
10:35 am Coffee Break
11:05 am F. Albertin, E. Peccenini, M. Bettuzzi, R. Brancaccio, M. P. Morigi, A. Patera, I. Jerjen,
S. Hartmann, and R. Kaufmann
X‐Ray Reading of Large‐size Unopened Ancient Manuscripts
11:30 am V. Lorusso and B. Pouvkova
Recovering lost Commentaries on Aristotle’s Treatise On the Heavens in Venice
Manuscript Marcianus Gr. 210
11:55 am M. Schreiner, H. Miklas, C. Rapp, R. Sablatnig, W. Vetter, B. Frühmann, F. Hollaus
The Centre of Image and Material Analysis in Cultural Heritage (CIMA) in Vienna,
Austria
12:20 pm T. Łojewski and D. Chlebda
Application of Hyperspectral Imaging for Quantitative Assessment of Conservation
Treatments for Documents
12:45 pm Lunch
SESSION III: IMAGE ANALYSIS
Session Chair: Christian Brockmann
2:30 pm P. A. Stokes (invited)
Computation and Palaeography: Where are we Now?
3:15 pm E. Arabnejad, H. Ziaei Nafchi, E. Treharne, C. Allen, and M. Cheriet
Visual Saliency for Visual Feature Analysis of Historical Manuscripts
3:40 pm Y. Elfakir, G. Khaissidi, M. Mrabti, M. A. El Yaccoubi, Z. Lakhliai, and D. Chenouni
Bag‐of‐descriptors of SIFT for Segmentation‐Free Word Spotting in Handwritten
Arabic Documents
4:05 pm Coffee Break
4:35 pm R. Cohen, K. Kedem, and J. El‐Sana
Transcript Alignment for Historical Manuscripts
5:00 pm S. Sudholt, L. Rothacker, and G. A. Fink
Simple and Effective Segmentation‐Free Word Spotting in Historic Documents
5:25 pm D. Stutzmann, T. Bluche, Y. Leydier, F. Cloppet, V. Eglin, C. Kermorvant, and
N. Vincent
Text‐Image Alignment and Automated Letter‐form Classification:
Reading vs. Looking at
5:50 pm T. Konidaris, A. L. Kesidis, and B. Gatos
A Segmentation‐Free Word Spotting Method
8:00 pm Dinner
Wednesday, 2 March 2016
SESSION IV: CUTTING EDGE TECHNIQUES
Session Chair: Volker Märgner
9:00 am L. Glaser (invited), D. Deckers, and C. Brockmann
10 Years of Iron Gall Ink X‐Ray Fluorescence Element Mapping
9:45 am F. Kergourlay, C. Andraud, A. Michelin, A. Histace, B. Lavédrine, I. Aristide‐Hastir,
and R. Lheureux
REX Project: Extraction and Processing of underlying Texts ‐ Study of a
Marie‐Antoinette secret Correspondence
10:10 am A. Garz, M. Seuret, A. Fischer, and R. Ingold
GraphManuscribble: Interact Intuitively with Digital Facsimiles
10:35 am Coffee Break
11:05 am R. Hedjam, M. Kalacska, S. S. A. Al‐ma’adeed, and M. Cheriet
Visual Literary Topology
11:30 am A. Santoro, A. Marcelli, and F. Carillo
An Interactive System to help Transcription of Historical Handwritten Documents
11:55 am Round Table
Posters
N. Akcebe
Imaging Watermarks of 15th Century Islamic Manuscript Kashf Al‐Bayan’an Sifat Al‐Hayawan
M. Bronzato, A. Zoleo, L. Nodari, C. Federici, and M. Zanetti
The Ignatius of Loyola’s Exercitia Spiritualia Autograph: Analyses before and during Conservation
Treatments
C. Colini, I. Rabin, O. Hahn
Can non‐destructive Techniques and Portable Instruments be used to analyse Ink and Paper
Degradation?
R. Farrahi Moghaddam, M. Cheriet, and S. A. Al‐Ma’adeed
Age and Fiber Structure Study Using 3D, Mesoscale Modeling and Simulation of Ink Seepage in Paper
Porous Media
B. Frühmann, F. Cappa, W. Vetter, and M. Schreiner
A combination of three complementary non‐destructive Methods applied to Historical Manuscripts
R. Hedjam, M. Kalacska, S. A. Al‐Ma’adeed, and M. Cheriet
Old Manuscript Analysis: beyond the Visible
T. Jocham, M. Marx
Scientific Analysis of Early Qur'anic Manuscripts
Y. Keheyan and G. Eliazyan
Spectroscopic Studies of Armenian Manuscripts: Paper, Inks, Pigments
M. W. A. Kesiman, J.‐C. Burie, and J.‐M. Ogier
A First Step to Balinese Script OCR: An Initial Study on Isolated Character Recognition of Balinese
Script on Palm Leaf Manuscripts
A. Kocaman
Fibre Analysis of Pattani Manuscripts
A. Rogulska and B. Łydżba‐Kopczyńska
Application of forensic Multispectral Scanner to non‐invasive Analysis of Iron Gall Inks:
a Comparison with XRF and micro‐Raman Spectroscopic Techniques
S. Snoussi
Perceptual Model with global‐local Vision Primitives for Arabic Script Recognition
D. Stoekl Ben Ezra
Why should Philologists learn Computer Vision?
Marina Toumpouri
The "decorative style" group reconsidered: A contribution to the study of twelfth and thirteenth
century production of Greek illuminated manuscripts in the Eastern Mediterranean
A. Ul‐Hasan, S. S. Bukhari, and A. Dengel
Meaningless Text OCR Model for Medieval Scripts
Abstracts
Hard science and history
Marina Bicchieri
Istituto Centrale Restauro e Conservazione del Patrimonio Archivistico e Librario (Icrcpal), Italy
Books, archival documents, and graphic works of art are among the most invaluable patrimony
in human history. Each single document is an open window on our history, and its preservation is
paramount. Often the value of books is evaluated merely on the basis of their content, whether textual
or graphical, while the history carried by the physical support is neglected: the paper used, the kind
of ink chosen, their provenance, what they are made of, the fabrication procedures. All of this
information, stored between the pages and somehow hidden from the eye, tells us of the long journey of
the paper used and of the technological and scientific discoveries made at the time the book was written
or drawn; it tells of the genius of those who invented an ink or a specific paper treatment, and it carries
with it the evolution of aesthetics, morals, and the customs of the time.
In short, they are carriers of our story, of human history. This entire incommensurable
heritage is unfortunately destined to a slow death. Supports, media, and bindings are subject to
ageing and lose their mechanical characteristics; inks can fade or induce acidity in the support,
damaging it until its complete destruction. Natural ageing is a spontaneous and
irreversible process, quite slow by itself in the absence of external interferences, such as,
for example, storage in unsuitable places, where other degradation processes (physical, biological, or
chemical) can take place. The function of scientists in the field of conservation of cultural heritage is
manifold. On the one hand, by investigating the structure of materials, they can understand the
nature and causes of degradation and find solutions to prevent further decay. On the other,
they can solve problems related to the manufacturing of the object or to its past
life, thus helping scholars in their historical studies. Moreover, each discovery also permits
exploring issues in the history of science. In this paper two case studies will be presented to underline
the synergy, positive or negative, between different areas of expertise.
Leonardo da Vinci self‐portrait
In 2012 the very famous self-portrait of Leonardo da Vinci was subjected to a completely non-
destructive diagnostic campaign at Icrcpal (Istituto Centrale Restauro e Conservazione del Patrimonio
Archivistico e Librario). The purpose of the analyses was to assess the conservation status of the
drawing, which presented an apparent fading of the graphic medium and diffuse foxing. To this end,
surveys were carried out in the chemistry laboratory using molecular (Raman and infrared)
and elemental (X-ray fluorescence) spectroscopies. In parallel, Atomic Force Microscopy (AFM) was
applied to obtain a topographic description of the paper in damaged and less damaged areas; the
topography of the paper is, in fact, related to its state of preservation. The other laboratories of the
Institute performed FORS and multispectral reflectance measurements and microbiological studies (Misiti
2014).
Only the comparative analysis of experimental results obtained with different techniques and
methods can provide the scientific information needed to correctly characterize the work, to predict how
time will alter its chemical-physical characteristics, and to estimate the expected lifetime of the work of
art. All the techniques showed a very dramatic oxidation of the paper, caused by chemical,
physical, and biological attack. Moreover, AFM topographies demonstrated a severe decrease in the
thickness of the paper in the foxed areas, ranging on average from 20% over whole foxing spots to
60% in some parts of each spot, where the spectroscopic measurements had shown the presence of triple
carbon-carbon bonds (Fig. 1). A restoration project, including chemical treatment for the stabilization
of the paper, was proposed, but the art historians rejected it for purely "philological" reasons,
in this way condemning the drawing to destruction. In this case the positive synergy between the
different branches of science did not correspond to a positive synergy with the world of the
humanities, thus leading to negative results.
Fig. 1. Leonardo self‐portrait. Left: AFM topography of the paper. Right: AFM topography of the paper in a
foxing spot.
The purple Codex Rossanensis
The Codex Rossanensis is a 6th-century illuminated manuscript written on purple parchment,
conserved at the Museo Diocesano in Rossano Calabro (Cosenza, Italy). In 1917-19 the codex was
subjected to a restoration treatment carried out by Nestore Leoni, a famous miniaturist active from
the end of the 19th century to the mid 20th century. Leoni's intervention irreversibly modified the
appearance of the illuminated sheets, and he never recorded which materials he used for the restoration. In
June 2012 the Codex arrived at Icrcpal for a complete characterization of the pigments, the support,
the materials used by Nestore Leoni, and the state of conservation, and for the conservative
restoration. A scientific commission was established, including palaeographers, biblical scholars, and
historians specialized in the study of illumination. The positive synergy between all the different
approaches allowed for the complete characterization of the manuscript: its dating (mostly on the
basis of the biblical text) and the attribution of the discovered pictorial palette to a specific
geographical area. Moreover, the chemistry laboratory was able to discover, replicate, and characterize a
peculiar lake, the elderberry lake. To the author's knowledge, this was the first time that
experimental evidence has been presented for the use of that lake in such an ancient document
(Bicchieri 2014).
Conclusions
The few examples presented in this paper allow some conclusions to be drawn. From the methodological
point of view, it is necessary to approach all problems related to cultural heritage artefacts with a
multidisciplinary strategy that includes several complementary techniques and competences. It
should also be emphasised that no single non-destructive technique can claim to be decisive
on its own, and that cooperation between all the "souls" involved in conservation, from
scientists to humanists, can add new pieces to the knowledge of our history.
References
M. Bicchieri, Environ. Sci. Pollut. R., 21(24), 14146, 2014.
M. C. Misiti, I disegni di Leonardo, diagnostica, conservazione, tutela, Sillabe, Livorno, 2014.
The non‐invasive approaches to identify the dyes and pigments of the Haft Awrang‐i Jāmi
(A Persian manuscript from 1553 AD in MAO*)
Mojtaba Mahmoudi Khorandi1, Monica Gulmini1, Maurizio Aceto2, Angelo Agostino1
and Hamed Sayyadshahri3
1 Department of Chemistry, University of Turin, Italy
2 Department of Science and Technological Innovation, University of Eastern Piemonte, Italy
3Department of Physics and Earth sciences, University of Ferrara, Italy
The influence of the Iranian empires led to the production of manuscripts with common linguistic and
textual features even beyond the borders of their political influence. Hence, the term "Persian
manuscripts" covers old books written in the Persian language dealing with various subjects, from a
wide geographical area and different epochs. After the Mongol conquest and from the fourteenth
century onwards, the support of artisans by rulers and patrons led to great achievements in all the
luxury arts. As a result, a renaissance of painting began, reaching its apex in the seventeenth
century. Although painting partially lost its social function, the great development of book
illustration generated great advances in artistic expression. The strong link with Persian
poetry led to the creation of many elaborate manuscripts that can be found in museums and
collections. One of these magnificent artworks is the Haft Awrang-i Jāmi belonging to the Museum of
Oriental Art of Turin (MAO). This Persian manuscript, the subject of this paper, is a copy of
the poems of Jāmi (1414-1492 AD) created in 961 AH (1553 AD) at Shiraz, Iran; its calligrapher
(diagonal Nasta'liq) is Ali Al-Khatib. Black and red inks were employed to write the text, while nine
illuminated headpieces, one finely gilded and painted double-page frontispiece, and nine miniatures are
the other decorative parts of this book. A wide range of colors, including blue, red, yellow, black, green,
violet, turquoise, pink, gray, brown, white, and orange, embellishes the painted pages of the book.
Non-invasive analytical techniques, namely Fiber Optics Reflectance Spectroscopy (FORS), Fiber
Optics Molecular Fluorimetry (FOMF), and portable X-Ray Fluorescence (p-XRF), were used to identify
the colorants of this manuscript.
It is generally possible to recognize inorganic pigments by combining reflectance
spectroscopy and X-ray fluorescence. Organic dyes are barely mentioned in the paper of
Purinton and Watters, although evidence of their wide use emerged from volumes kept in
European libraries that were investigated in situ by UV-visible diffuse reflectance
spectrophotometry, FOMF, and p-XRF. Therefore, the FORS technique was selected as the main
method to analyze both pigments and dyes; to confirm the results, p-XRF or FOMF analysis was
applied to the areas diagnosed by FORS, for pigments and dyes respectively. In order to
build a spectral database of dyes possibly employed in Persian manuscripts that had
not been studied before, a set of mock-up samples was prepared using the natural dyestuffs indicated in
the comprehensive book of Nadjib Mayil Harawi, which collects articles dealing with penmanship,
ink making, paper, gilding, and bookbinding. The paper employed as a support was obtained from
hemp by mimicking historical procedures, whereas Althaea officinalis, Anemone coronaria, Lawsonia
inermis, Berberis vulgaris, Rheum undulatum, Curcuma longa, and Crocus sativus were considered as
sources of dyes. The plants were treated according to ancient recipes to extract the dyes, which were
then used to dye or paint the paper substrate. FORS and spectrofluorimetry equipped with fibre optics
were then employed to record the spectral features of the mock-up samples, and the information
obtained on the mock-ups was used to interpret the reflectance and fluorescence spectra recorded on
the dyes of this manuscript.
The results revealed that dyes were employed both to dye the paper support and to impart delicate
hues to particular details in the miniatures. According to the results, the particles sprayed on the
papers (observed with a digital microscope) are gold, and the dyeing agent of the paper is a mix of
Crocus sativus (saffron) and Curcuma longa. Furthermore, the analysis showed that cochineal, and a
mix of cochineal and indigo, were used for the violet color of the faces and some of the dresses in the
miniatures of this manuscript. Likewise, it was demonstrated that, to give special properties to the
paint (e.g. brilliance, antiseptic action, or glazing), saffron and indigo were combined with verdigris
to create different hues of green in the miniature paintings, while the green colorant in the other
illustrations of the book is malachite. In addition, the analysis revealed that the blue of the
headpieces is ultramarine, while indigo and ultramarine are the blue colorants in the miniatures; the
red and orange of the headpieces are red ochre alone, whereas red lead, cinnabar, and red ochre were
used in the miniatures. White lead, carbon, and orpiment were employed to create the white, black,
and yellow colors of the book, respectively. Finally, the gray color consists of a mix of carbon with
white lead or silver; the pink color is a mix of cinnabar and red lead; and a combination of carbon
and red ochre, or carbon and an unknown colorant, was detected for the brown color of this manuscript.
* Museum of Oriental Art of Turin, Italy
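The database-matching step described above can be sketched in a few lines: a measured reflectance spectrum is compared against reference spectra of mock-ups dyed with known colorants, and the best-correlating reference is reported. This is only an illustrative sketch, not the authors' software; the wavelength grid, the toy spectra, and the noise model are all invented for the example.

```python
import numpy as np

wavelengths = np.arange(400, 801, 10)  # 400-800 nm in 10 nm steps (illustrative)

def toy_spectrum(center, width):
    """Toy reflectance curve with a single absorption band at `center` nm."""
    return 1.0 - 0.8 * np.exp(-((wavelengths - center) ** 2) / (2 * width ** 2))

# Hypothetical mock-up database: dye names from the abstract, spectra invented.
reference_db = {
    "Crocus sativus (saffron)": toy_spectrum(440, 40),
    "Curcuma longa": toy_spectrum(425, 25),
    "indigo": toy_spectrum(660, 50),
    "cochineal": toy_spectrum(530, 35),
}

def identify_dye(measured):
    """Return the reference dye whose spectrum correlates best with `measured`."""
    scores = {name: np.corrcoef(measured, ref)[0, 1]
              for name, ref in reference_db.items()}
    return max(scores, key=scores.get)

# A noisy "measurement" of an indigo-dyed area should match the indigo entry.
rng = np.random.default_rng(0)
measured = reference_db["indigo"] + rng.normal(0.0, 0.02, wavelengths.size)
print(identify_dye(measured))  # indigo
```

In practice FORS matching also weighs band positions and inflection points, but correlation against a curated mock-up database captures the core idea of the workflow described in the abstract.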
References
M. Aceto, A. Agostino, G. Fenoglio, A. Idone, M. Gulmini, M. Picollo, P. Ricciardi, J. K. Delaney, 2014, Characterisation of colourants on illuminated manuscripts by portable fibre optic UV-visible-NIR reflectance spectrophotometry, Analytical Methods, DOI: 10.1039/c3ay41904e
M. Aceto, A. Agostino, G. Fenoglio, M. Gulmini, V. Bianco, E. Pellizzi, 2012, Non invasive analysis of miniature paintings: proposal for an analytical protocol, Spectrochimica Acta Part A, 91, 34‐41
M. Bacci, 2000. UV‐Vis‐NIR, FT‐IR and FORS Spectroscopies. In: E. CILIBERTO, GL Spoto, a cura di, 2000. Modern Analytical Methods in Art and Archaeology. 1° ed. (s.l.): John Wiley & Sons, Inc, 321‐361.
M. Barkeshli, 2009. Historical and scientific analysis of Iranian illuminated manuscripts and miniature painting. Golestan‐e Honar. Quarterly on the History of Iranian Art and Architecture, 5 (2(16)).
D. Cardon, 2007. Natural Dyes. Archetype publications, London.
A. Idone, 2014, Analytical techniques for the investigation of natural dyestuffs, PhD thesis, Università degli Studi del Piemonte Orientale “Amedeo Avogadro” XXVI course
Nadjib Mayil Harawi, 1993, Art of bibliopegy in Islamic civilization, Printing and publishing department of Astan Quds Razavi, Mashhad, Iran.
Qāżi Aḥmad b. Šaraf‐al‐Din Ḥosayn Monši Qomi Ebrāhimi, 18th, GOLESTĀN‐E HONAR (هنر گلستان),Entesarat‐e Bonyad‐e Farhang‐e Iran 1973.
R. Pakbaz, 2006, Persian Painting (naghashi iran az dirbaz ta emrooz, نقاشی ايرانی از ديرباز تا .Zarrin va simin (ISBN 964‐92113‐3‐0) ,( امروز
N. Purinton and M. Watters, JAIC 30 (1991) 125‐144
Elemental Analysis of Mesnavi from 14th Century
Pınar Çakar
Department of Manuscript Conservation and Archive, Manuscripts Institution of Turkey, Istanbul
The Masnavi is the masterpiece of the philosopher Jalāl ad-Dīn Muhammad Rūmī (known as Rumi, d. 1273)
(Shakibaej and Golaiji 2012). The manuscript analysed here was copied by Ahmed bin Muhammed el
Mevlevi in 1386. It is named 'Mesnevi-i Şerif' and belongs to the collection of the Hacı Selim Ağa
Library (collection number: 554) in Istanbul. It was written in naskh style and has 185 sheets (Fig.
1a). The paper shows microorganism damage, and pigments are flaking in some of the illuminated areas.
Self-adhesive tapes are also present on the illuminated pages and, owing to the degradation of
their adhesive, the tapes damage the pages (Fig. 1b). Old repairs are present on the bookbinding. The
aim of the work is to determine the pigments and inks and to reveal the chemical composition of the
paper used in the manuscript via elemental analyses. Through these analyses, the palette of pigments
used was determined.
Fig. 1a: Illuminated pages of the manuscript, Fig. 1b: Illuminated page with self‐adhesive tape
Elemental analyses were carried out non-destructively with an ARTAX µXRF instrument. The instrument
can detect elements from Na to U, but light elements are hard to detect without a helium atmosphere
(Çakar 2011). The analyses were performed with the voltage set to 50 kV and the current to 600 µA.
Rich colors of the illuminations were analysed and interpreted; further study will be carried out for
the colors not yet detected. An example of an analysed area and its spectrum are presented in Fig. 2 and
Fig. 3, respectively.
Fig. 2: Image of the gold area of the illumination
Fig. 3: µXRF spectrum of the gold area of the illumination
The data obtained were used to understand the art history of the manuscript and to choose the
appropriate conservation methods and materials.
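The interpretive step, going from the elements seen in a µXRF spectrum to candidate pigments, can be illustrated with a small lookup sketch. The element-pigment associations below are standard heritage-science knowledge; the mapping and the detected-element sets are simplified examples, not results from this study.

```python
# Map key element combinations to the pigments they commonly indicate.
# Simplified illustration: real interpretation also weighs peak intensities,
# the paper's own elemental background, and overlapping paint layers.
PIGMENT_CANDIDATES = {
    frozenset({"Au"}): ["gold leaf / shell gold"],
    frozenset({"Hg", "S"}): ["cinnabar / vermilion"],
    frozenset({"Pb"}): ["lead white", "red lead"],
    frozenset({"Cu"}): ["verdigris", "malachite", "azurite"],
    frozenset({"As", "S"}): ["orpiment"],
    frozenset({"Fe"}): ["ochre", "iron gall ink"],
}

def candidate_pigments(detected):
    """List pigments whose key elements are all present in `detected`."""
    hits = []
    for elements, pigments in PIGMENT_CANDIDATES.items():
        if elements <= detected:  # subset test: all key elements were seen
            hits.extend(pigments)
    return hits

print(candidate_pigments({"Au", "Fe"}))
# ['gold leaf / shell gold', 'ochre', 'iron gall ink']
```

A lookup like this only narrows the field; distinguishing, say, red lead from lead white still requires the reflectance data, which is why the abstract pairs µXRF with visual interpretation of the spectra.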
References
P. Çakar, Tezhipli Elyazması Eserlerde Bakır ve Diğer Elementlerin Pigmentler Üzerine Etkisinin İncelenmesi,
Yıldız Technical University Department of Chemical Engineering, Published Master Thesis, İstanbul, (2011).
Z. Shakibaej and Y. Golaiji, The Effect of Mavlana’s Masnavi Manavi Narrative on the Extent of Adolescent’s
Philosophizing Questioning Skills, 4th World Conference on Educational Sciences (WCES‐2012) 2‐5 February
2012 Barcelona,Spain, Procedia‐Social and Behavioral Sciences, 46(2012), 2882‐2885.
ATWISE 5242 – A recently developed Device for imaging Watermarks in Medieval Manuscripts
Manfred Mayer
University Library Graz, Austria
Various methods have been developed for the documentation of historical watermarks. Drawings by
hand over a light source and rubbings are among the earliest and simplest, but they are relatively
inaccurate. In any case, the method chosen must affect neither the paper nor the watermark.
Taking this into account, the International Association of Paper Historians (IPH) published a standard
for watermark documentation methods (version 2.0, 1997). Dylux, beta-radiography, the X-ray
method, transmitted-light photography (VIS and IR), and others will be briefly discussed and
compared to each other. The challenge of watermark documentation begins exactly when the
watermark is superimposed by written or printed text or by graphics, so that normal
transmitted-light methods cannot be applied.
In 2011 the project CHARTA was launched at the University Library of Graz; its aim is the
complete documentation of all papers used in the medieval manuscripts of the collection.
Understandably, watermarks are very often located in the center of the sheet and are overlaid by
writing on both sides, and attempts to make a drawing by hand under transmitted light very often
fail. Another "classic case" is the position of the watermark in the book fold, where it is also
particularly tricky to gain access to the watermark. Having this in mind, and excluding beta-
radiography and the thermographic method for lack of budget, we faced a serious problem. So we
decided to develop a special new device that meets our requirements: "fast, easy, good results,
limited costs". The point is that infrared photography offers a chance to largely eliminate
text written in iron gall ink (Figs. 1 and 2).
Luckily, iron gall ink was widely used in the Middle Ages, so we estimate that about 70 to 80% of the
western medieval manuscripts stored in our library can be examined with this equipment
(Fig. 3). For oriental manuscripts a much lower percentage of manuscripts written in iron gall ink may
be expected, but their number is nevertheless high enough to play a certain role in the CHARTA
project.
Of course the device fulfils all conservation requirements: for example, the binding is supported
by a book cradle and there is no significant exposure to mechanical stress. We named this machine
"ATWISE 5242", which stands for "Austrian Watermark Imaging System"; 5242 indicates that the
dimensions of the page can be up to 52 x 42 cm.
The paper describes the special characteristics and challenges in the development of this device. At
the end of the presentation a short video of its practical application will be shown.
Fig. 1: MS307 fol. 96, image of a page under transmitted visible light (detail). The watermark is
superimposed by the text on both sides.
Fig. 2: MS307 fol. 96, the same detail as in Figure 1, but captured with ATWISE 5242 under infrared
illumination. The watermark is clearly visible.
Fig. 3: The Equipment: ATWISE 5242
Advanced codicological studies of Codex germanicus 6
Mirjam Geissbühler
University of Bern, Switzerland
The 15th-century manuscript Codex germanicus 6 has proved to be an extremely rewarding object for a
codicological investigation assisted by ink classification using micro X-ray fluorescence
analysis. A preliminary codicological study revealed that the order of the twelve texts that
constitute the codex, though all written by a single scribe, cannot correspond to the chronology of
writing. The scribe used iron-gall inks for the main texts and red inks for the rubrics and certain
passages at the beginning and end of the texts.
Following the codicological studies, we conducted three measurement campaigns to establish the
fingerprints of the inks in the main texts and in the transition passages between consecutive texts,
in order to clarify whether they were written with the same ink. In addition, we checked the inks of
the later marginal notes, the pagination, and the corrections to clarify their connection with the main inks.
The study of the red inks proved very fruitful: they range from pure cinnabar to mixtures of
cinnabar and red lead, and in one case of ochre. Such variety points to a rather complicated
history of the manuscript's production. Similarly, the fingerprints of the iron-gall inks helped to sort the
texts according to the order of writing.
With the help of this study we could reconstruct the history of the production of Codex germanicus 6,
which could not have been done using conventional codicology alone.
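The ink-fingerprint comparison described above can be sketched as follows: the trace-metal signals of an iron-gall ink are normalised to the iron signal, so that inks from the same batch match regardless of how thickly the ink was applied. This is a minimal illustration with invented count values and an assumed matching tolerance, not the actual measurement protocol.

```python
def fingerprint(xrf_counts):
    """Normalise trace-metal XRF counts to the Fe signal of an iron-gall ink."""
    fe = xrf_counts["Fe"]
    return {el: counts / fe for el, counts in xrf_counts.items() if el != "Fe"}

def same_ink(fp_a, fp_b, tolerance=0.2):
    """Two fingerprints match if every element ratio agrees within `tolerance`
    (relative). The 20% tolerance is an assumption for this sketch."""
    return all(
        abs(fp_a[el] - fp_b[el]) <= tolerance * max(fp_a[el], fp_b[el], 1e-9)
        for el in fp_a
    )

# Invented example measurements (raw counts), not data from Codex germanicus 6:
text_1 = fingerprint({"Fe": 1000.0, "Cu": 52.0, "Zn": 18.0})
text_2 = fingerprint({"Fe": 1400.0, "Cu": 70.0, "Zn": 26.0})  # thicker ink layer
text_3 = fingerprint({"Fe": 900.0, "Cu": 10.0, "Zn": 40.0})   # different recipe

print(same_ink(text_1, text_2))  # True  (same batch despite different Fe totals)
print(same_ink(text_1, text_3))  # False (different trace-metal ratios)
```

The point of the normalisation is that absolute counts vary with ink thickness and measurement geometry, while the metal-to-iron ratios stay characteristic of the ink batch.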
Material Analysis of Buddhist Sanskrit Manuscripts preserved in Nepal
Martin Delhey
Centre for the Study of Manuscript Cultures, Hamburg, Germany
Ongoing research at the Centre for the Study of Manuscript Cultures (CSMC) at the University of
Hamburg is devoted to the library or manuscript collection(s) of Vikramaśīla, which was one of the most
important and famous Buddhist monasteries of medieval India. In accordance with the general
approach emphasized at the CSMC, we try to gain insight into various aspects of the physical
organization of knowledge at this library, including the production and later fate of its manuscripts,
rather than being interested in these manuscripts only as carriers of texts in certain states of their
transmission.
Vikramaśīla was founded by the first rulers of the East Indian Pāla dynasty in the early 9th century and
was deserted and destroyed around 1200 CE. It can be considered fairly certain that the ruins
excavated in the east of present-day Bihar near the south bank of the Ganges are the remains of
this famous monastic establishment. There can be no doubt that most of the manuscripts produced
there are irretrievably lost; moreover, none of those that have survived are extant in situ. However, a
significant number of manuscripts produced in this or other similar Buddhist monasteries of medieval
East India have been discovered in modern times in Nepal and Tibet. Since only a small
minority of these important materials bear an explicit mark or note regarding their exact place of
origin, it is very hard to determine which of them come from Vikramaśīla.
In short, the corpus of palm-leaf manuscripts that we are examining in our present project can be
divided into three groups: some manuscripts contain colophons that explicitly mention Vikramaśīla
as the place of production (group I); c. 15 manuscripts have their layout and script in common
with one of the items belonging to group I (group II); a smaller group of manuscripts that differ in
some respects from those of group II in layout and script and are considered to be a Nepalese
imitation of the Vikramaśīla standard (group III).
This contribution presents the findings resulting from the material analysis of those manuscripts of
the aforementioned groups that are preserved in Nepal, viz. in the National Archives, Kathmandu
(NAK), and in the Kaiser Library (KL), which is likewise situated in Kathmandu. The analysis was
undertaken in March 2013. The colleagues of the NAK allocated a room in their precincts to us,
where we could set up our mobile laboratory, and gave us access to the required manuscripts from
their holdings. The officials of the KL, in turn, allowed us to take some of their valuable and ancient
manuscripts to the NAK. In this way, we were enabled to conduct multi‐instrumental studies on
writing materials of great antiquity and interest.
The main findings relate to arsenic in the palm leaves and to mercury-enriched carbon ink in the
primary texts of all the proper Vikramaśīla manuscripts (group I) and of those associated with the
monastery by codicological investigation (group II). Interestingly, the only manuscript from
group III that we were able to examine has also been written on arsenic-treated palm leaves, with
inks that display slight mercury enrichment.
Our results bear on the question of how the historical connection between the group II and
group III manuscripts of our corpus and the group I manuscripts should be conceived. The
hypothesis that there is an intimate relationship between these groups was originally formed on
evidence of a different nature, especially, but not exclusively, the striking similarities regarding the
dimensions and the standardized layout of the pages. By material analysis we have discovered
further similarities and thus corroborated the hypothesis of a common or very similar origin. We
have also seen that one of these newly discovered similarities (i.e. the use of mercury) sets our
original manuscripts apart from some recognizably later additions made on them.
A Field Experience in Ink Studies: Manuscripts from Northern Ethiopia (East Tigray)
Denis Nosnitsin1 and Antonella Brita2
1Hiob Ludolf Centre for Ethiopian Studies, Hamburg, Germany 2Centre for the Study of Manuscript Cultures, Hamburg, Germany
Within the framework of the project Ethio-Spare (supported by the European Research Council and
carried out in 2009-2015), the research team from Hamburg had a rare opportunity to access a
number of traditional ecclesiastic libraries and to digitize and study their numerous parchment
manuscripts.
The last stage of the project also included attempts at material studies. It was decided to focus on
the ink as the most important material component of the manuscript; the intention was also to try
various methods of ink analysis that could be conducted in situ, under the field conditions of
northern Ethiopia, in search of the most effective and feasible ones, usable both for the description
and study of a single manuscript and for other conservation and study tasks. Empirical observation
of a significant number of manuscripts led to the conclusion that the typology of the inks used by
Ethiopian scribes may be more diversified than was commonly assumed before, with methods of ink
preparation not completely identical across periods and regions.
The research team, advised by a specialist in manuscript material studies, began to apply a
three-colour USB microscope (Dino-Lite) for quick reflectography of the inks in the field. On one
occasion it was possible to organize a more extensive field study and include XRF spectroscopy
carried out by an invited specialist. The speakers will present some of the results and challenges
encountered in the course of the work.
10 years of Iron Gall Ink X‐Ray Fluorescence Element Mapping
Leif Glaser1, Daniel Deckers2 and Christian Brockmann3
1Deutsches Elektronen‐Synchrotron DESY, Hamburg, Germany 2Universität Hamburg, Institut für Griechische und Lateinische Philologie, Germany
In medieval times it was common practice to reuse old books by erasing the writing and preparing
the parchment to be written upon again. In these cases the previously written text was often erased
chemically by means of bleach or other reagents, removing only the organic component of the
ink but leaving the metallic part in place.
The metallic fingerprint of the erased inks can nowadays be visualized and sometimes used to
correlate texts by one author across different periods.
Among the accepted modern techniques for re-accessing this writing non-destructively, besides
several often very successful methods of photographic imaging, the use of X-rays, in particular
X-ray fluorescence (XRF), allows element-specific probing of the writing, even where it is covered,
hidden, or chemically erased. By scanning the writing with a small X-ray spot (a), one can measure
the elemental distribution on the parchment; after deconvolution of the different writings (c),
making use of their different metallic fingerprints (b), the hidden or erased text or texts become
accessible while the parchment is left unharmed (Young et al. 2005).
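The deconvolution step (c) can be illustrated as a simple linear unmixing: if each pixel's XRF counts are assumed to be a weighted sum of the two inks' metallic fingerprints, the per-pixel weights can be recovered by least squares. A minimal sketch, with invented fingerprint values for illustration:

```python
# Each ink has a characteristic "metallic fingerprint": relative XRF counts
# for a few elements (here Fe, Cu, Zn -- all values invented for illustration).
ink_a = [1.0, 0.10, 0.02]   # fingerprint of the later, visible writing
ink_b = [0.6, 0.45, 0.20]   # fingerprint of the erased, earlier writing

def unmix(pixel, fa, fb):
    """Least-squares weights (wa, wb) such that pixel ~ wa*fa + wb*fb,
    solved via the 2x2 normal equations."""
    aa = sum(x * x for x in fa)
    bb = sum(x * x for x in fb)
    ab = sum(x * y for x, y in zip(fa, fb))
    pa = sum(p * x for p, x in zip(pixel, fa))
    pb = sum(p * x for p, x in zip(pixel, fb))
    det = aa * bb - ab * ab
    wa = (pa * bb - pb * ab) / det
    wb = (pb * aa - pa * ab) / det
    return wa, wb

# A measured pixel where both writings overlap (0.5 parts ink A, 2 parts ink B):
pixel = [0.5 * f + 2.0 * g for f, g in zip(ink_a, ink_b)]
wa, wb = unmix(pixel, ink_a, ink_b)
```

Mapping the recovered weight of the erased ink across all scanned pixels yields an image of the hidden text alone.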
The talk will give an overview of what has been done with XRF element mapping since the first
successful text recovery on the Archimedes Palimpsest in 2006 (Bergmann 2007). Improvements on
the setup and detector side were achieved using storage-ring-based light sources, and
post-measurement data processing (Bergmann and Knox 2009) has optimized the readability of the
results. Additionally, steps have been taken towards transportable alternatives (Glaser and
Deckers 2014), with a few important developments still needed, the goal being eventually to move
the measuring equipment wherever it is needed and thus avoid transporting any historic material.
References
U. Bergmann, Archimedes brought to light, Physics World, November 2007, Institute of Physics Publishing, Bristol and Philadelphia, ISSN 0953-8585.
U. Bergmann and K. Knox, Pseudo-color enhanced X-ray fluorescence imaging of the Archimedes Palimpsest, in: Document Recognition and Retrieval XVI, ed. Kathrin Berkner and Laurence Likforman-Sulem, Proc. of SPIE-IS&T Electronic Imaging, SPIE Vol. 7247, 724702-1-13 (2009).
L. Glaser and D. Deckers, Basics of fast scanning XRF element mapping for iron gall ink palimpsests, Manuscript Cultures No. 7 (2014), ISSN 1867-9617, pp. 104-112.
G. Young et al., Effect of High Flux X-radiation on Parchment, Canadian Conservation Institute Report No. Protus 92195, <http://www.archimedespalimpsest.org/pdf/archimedes_f.pdf>.
Image Processing Software for the Recovery of Erased or Damaged Text
Keith T. Knox
Imaging Consultant, Hawaii, USA
An image processing software package will be described that recovers erased or damaged text
from multispectral images of ancient documents written on parchment or paper. The software is
written in the Java programming language to make it portable to many different computer platforms.
The goal of the project is to make this package of image processing routines available for use
anywhere in the world by researchers, students, and scholars. The architecture of the software
has been designed to make it modular, easily expanded, and easy‐to‐use with an intuitive graphical
user interface. Examples of recovered text from manuscripts from the library of St. Catherine’s
Monastery in the Sinai in Egypt will demonstrate the capabilities of the software. See
http://sinaipalimpsests.org.
This software is an adaptation of a UNIX‐based package of image processing routines, written in C by
the author between 2000 and 2013, to process the multispectral images of the Archimedes
Palimpsest project. See http://www.archimedespalimpsest.org. The UNIX operating system has the
advantage that image scanlines can be passed between the modules over UNIX pipes. As a result, a
new algorithm can be incorporated by writing a new module and including it in the UNIX command
line. This is easy for a software researcher to do, but is beyond the capabilities of a non‐technical
user. In Figure 1, an example is shown of a parchment with erased text. The processing of the
multispectral imagery was done using the UNIX software package. This capability will be available in
the new Java‐language package, but will be easier to use and will be more widely available.
Holy Monastery of St. Catherine at Mount Sinai
Fig. 1: On the left is a natural light image of a parchment page in which the erased text is slightly visible. On the
right, ultraviolet illumination has enhanced the erased writing. In the pseudocolor image, the erased text is
rendered in color, giving it increased contrast.
The move to the Java programming language was made for two reasons. First, Java is a portable
language that is available on almost all computers and operating systems. Second, Java comes with
tools that make it easy to create graphical user interfaces. These two features make it possible to
create an image processing package that can be used by a large number of people with varying
degrees of technical expertise.
Although Java does not implement UNIX pipes, a modular structure was created to enable each
module to be run as an independent software “thread” with an interface that enables modules to
retrieve and send processed scanlines. As a result, a new image processing capability can be easily
incorporated into the package.
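The thread-and-scanline design described here can be sketched in miniature: each module runs as an independent worker, pulling scanlines from an input queue and pushing processed scanlines downstream. The sketch below is in Python rather than Java, with an invented "flatten" module standing in for a real processing routine:

```python
import threading
import queue

SENTINEL = None  # marks the end of the image stream

def flatten(line):
    """Toy 'module': normalize a scanline by its maximum value."""
    m = max(line) or 1
    return [v / m for v in line]

def run_module(func, inbox, outbox):
    """Run one processing module as an independent thread: pull scanlines
    from its input queue, process them, and push them downstream."""
    while True:
        line = inbox.get()
        if line is SENTINEL:
            outbox.put(SENTINEL)   # propagate end-of-stream
            break
        outbox.put(func(line))

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=run_module, args=(flatten, inbox, outbox))
t.start()

image = [[10, 20, 40], [5, 5, 10]]      # two scanlines
for line in image:
    inbox.put(line)
inbox.put(SENTINEL)
t.join()

result = []
while True:
    line = outbox.get()
    if line is SENTINEL:
        break
    result.append(line)
```

Because each module only touches the queue interface, chaining further modules (or inserting a new algorithm) means starting another worker whose input queue is the previous worker's output queue.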
The Java software package is still under development, but a preliminary user interface is shown in
Figure 2. A list of available routines is automatically created as the package starts up and is displayed
along the top. To use a module, the user simply drags it into the main body of the window. As
multiple modules are added to the processing task, links are automatically connected between
modules. In the example shown below, an image, taken in red light, is flattened. A second image,
taken in ultraviolet light, is combined with the first image in the “pseudocolor” module. The colors
are enhanced and written to a TIFF file. The task, as shown, is run in batch mode and can be applied
to any number of image files.
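The pseudocolor combination in this example can be sketched per pixel: the red-light image (in which the erased ink is nearly invisible) feeds one channel, and the ultraviolet image (in which it is dark) feeds the other two, so erased strokes come out tinted while parchment and overtext stay near-neutral. The sample values below are invented for illustration:

```python
def pseudocolor(red_img, uv_img):
    """Combine two greyscale spectral images into one RGB pseudocolor image:
    R from the red-light image, G and B from the ultraviolet image.
    Erased ink (dark only under UV) then appears as a red tint."""
    return [
        [(r, u, u) for r, u in zip(red_row, uv_row)]
        for red_row, uv_row in zip(red_img, uv_img)
    ]

red_img = [[230, 235], [228, 90]]   # erased ink nearly invisible in red light
uv_img  = [[225, 120], [222, 85]]   # erased ink dark under UV

rgb = pseudocolor(red_img, uv_img)
# rgb[0][0]: parchment (all channels bright, near-white)
# rgb[0][1]: erased text (bright red channel, dark green/blue: a red tint)
# rgb[1][1]: overtext (dark in both images, stays near-black)
```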
Fig. 2: The preliminary graphical user interface of the software package is shown. In this example, erased text is
enhanced by combining two spectral separations in pseudocolor.
There are commercial image processing systems available to process multispectral imagery; for
example, see ENVI at https://www.exelisvis.com/docs/linearspectralunmixing.html. While these
commercial packages contain many image processing features, they are typically expensive and can
be out of reach for many potential users.
The Java package described in this talk will be distributed free of charge. Currently, only the author
is developing the software. Early in 2016, the package will be sufficiently developed to
allow other developers to join the effort. The author's goal is to find a few software developers
interested in participating in the continued development of the package. Also, if sufficient
interest exists, the author would like to work with a few individuals to bring the capabilities of
the package to scholars.
Computational Techniques in Multispectral Image Processing:
Application to the Syriac Galen Palimpsest
Corneliu T.C. Arsene1, Peter E. Pormann1, William I. Sellers1, and Siam Bhayro2
1School of Arts, Languages and Cultures, University of Manchester, United Kingdom 2Department of Theology and Religion, University of Exeter, United Kingdom
Multispectral/hyperspectral image analysis has experienced much development in the last decade
(Kwon et al. 2013; Wang and Chunhui 2015; Shanmugam and Srinivasa Perumal 2014; Chang 2013;
Zhang and Du 2012). The application of these methods to palimpsests (Bhayro 2013; Pormann 2015;
Hollaus et al. 2012) has produced significant results, enabling researchers to recover texts that would
be otherwise lost under the visible overtext, by improving the contrast between the undertext and
the overtext. In this paper we explore an extended number of multispectral/hyperspectral image
analysis methods, consisting of supervised and unsupervised dimensionality reduction techniques
(van der Maaten and Hinton 2008), on a part of the Syriac Galen Palimpsest dataset
(http://www.digitalgalen.net). Of this extended set of methods, eight methods gave good results:
three were supervised methods – Generalized Discriminant Analysis (GDA), Linear Discriminant
Analysis (LDA), and Neighborhood Component Analysis (NCA); and the other five methods were
unsupervised methods – Gaussian Process Latent Variable Model (GPLVM), Isomap, Landmark
Isomap, Principal Component Analysis (PCA), and Probabilistic Principal Component Analysis (PPCA).
The relative success of these methods was determined visually, using color pictures, on the basis of
whether the undertext was distinguishable from the overtext, resulting in the following ranking of
the methods: LDA, NCA, GDA, Isomap, Landmark Isomap, PPCA, PCA, and GPLVM. These results were
compared with those obtained using the Canonical Variates Analysis (CVA) method [6,7] on the same
dataset, which showed remarkable accuracy (LDA is a particular case of CVA in which the objects are
classified into two classes). A comparison was also made with a double thresholding and processing
technique, developed as part of this project, which consists of the following: the darker overtext is
carefully identified by the human operator and colored in white (threshold 1), and then the
remaining undertext, which is black but not as black as the overtext was, is made even darker
(threshold 2). This last technique showed some initial encouraging results, but its success depends on
the human operator selecting suitable cutoff values. Figure 1 shows the results and a comparison of
the different computational techniques applied to page 102v-107r_B of the Syriac Galen Palimpsest
data (http://www.digitalgalen.net), for the image obtained under ultraviolet (365 nm)
illumination with a green color filter (designated CFUG).
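The double-thresholding scheme described above can be sketched per pixel (the threshold values below are invented; in practice the operator chooses them by inspection):

```python
def double_threshold(img, t1, t2, gamma=2.0):
    """Sketch of the two-step scheme: pixels darker than t1 (the overtext)
    are pushed to white; remaining pixels darker than t2 (the undertext)
    are made darker still by a simple gamma-like stretch.
    Greyscale convention: 0 = black, 255 = white."""
    out = []
    for row in img:
        new_row = []
        for v in row:
            if v < t1:                       # very dark: the overtext
                new_row.append(255)          # erase it to white
            elif v < t2:                     # moderately dark: the undertext
                new_row.append(int(255 * (v / 255) ** gamma))  # darken it
            else:
                new_row.append(v)            # background left untouched
        out.append(new_row)
    return out

img = [[20, 100, 220]]          # overtext, undertext, background pixels
result = double_threshold(img, t1=50, t2=150)
```

As the abstract notes, the quality of the result hinges entirely on how well the two cutoff values separate the overtext from the undertext.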
Ultimately the choice of technique is based on the preferences of the person trying to read the
manuscript and the precise makeup of the original document but easy access to an appropriate
toolset is clearly highly desirable. Further work will consist of applying other dimensionality
reduction techniques that enable the recovery of the undertext in palimpsests, as well as applying
the above techniques to the rest of the Syriac Galen Palimpsest.
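The dimensionality reduction approach common to several of the methods above (e.g. PCA) treats each pixel as a vector of band intensities and projects it onto directions of maximal variance, which often concentrates the contrast separating undertext from overtext. A minimal sketch of PCA's first component, found by power iteration on the band covariance (the two-band pixel values are invented for illustration):

```python
def pca_first_component(pixels, iters=200):
    """Return the leading principal direction of a list of band-intensity
    vectors, plus the projection (score) of each centered pixel onto it."""
    n, d = len(pixels), len(pixels[0])
    mean = [sum(p[j] for p in pixels) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in pixels]
    # d x d covariance matrix of the bands
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    # power iteration converges to the dominant eigenvector of cov
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores

# Two invented spectral bands; the second carries most of the variation,
# so the first component aligns almost entirely with it.
pixels = [[1.0, 0.0], [1.1, 2.0], [0.9, 4.0], [1.0, 6.0]]
v, scores = pca_first_component(pixels)
```

Rendering the per-pixel scores as a greyscale image is what produces panels such as the PCA and Probabilistic PCA results in Fig. 1.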
a) Original picture b) Thresholding and processing c) CVA
d) GDA e) PCA f) Probabilistic PCA
Fig. 1: Comparison of different computational techniques applied to the Syriac Galen Palimpsest for multispec‐
tral image processing and enhancement.
Acknowledgment
The authors would like to thank the Arts and Humanities Research Council, United Kingdom, for
supporting this work (Research Grant AH/M005704/1 ‐ The Syriac Galen Palimpsest: Galen’s On
Simple Drugs and the Recovery of Lost Texts through Sophisticated Imaging Techniques).
References
S. Bhayro, P.E. Pormann, W.J. Sellers, Imaging the Syriac Galen Palimpsest: preliminary analysis and future prospects, Semitica et Classica, vol. 6 (2013), 297-300.
C. Chang, Hyperspectral data processing: algorithm design and analysis, Wiley (2013).
F. Hollaus, M. Gau, and R. Sablatnig, Multispectral Image Acquisition of Ancient Manuscripts, Progress in Cultural Heritage Preservation, Lecture Notes in Computer Science, EuroMed (2012), 30-39.
H. Kwon, X. Hu, J. Theiler, A. Zare, P. Gurram, Algorithms for Multispectral and Hyperspectral Image Analysis, Journal of Electrical and Computer Engineering (2013).
S. Shanmugam, P. Srinivasa Perumal, Spectral matching approaches in hyperspectral image processing, International Journal of Remote Sensing, vol. 35, 24 (2014).
P.E. Pormann, Interdisciplinary: Inside Manchester’s ‘arts lab’, Nature, 525 (2015).
L.J.P. van der Maaten and G.E. Hinton. Visualizing High‐Dimensional Data Using t‐SNE, Journal of Machine Learning Research, 9, (2008), 2579‐2605.
L. Wang, Z. Chunhui, Hyperspectral Image Processing, Springer (2015).
L. Zhang, B. Du, Recent advances in hyperspectral image processing, Geo‐spatial Information Science, vol. 15‐3 (2012), 143‐156.
X‐Ray Reading of Large‐size Unopened Ancient Manuscripts
F. Albertin1, E. Peccenini2,3,4, M. Bettuzzi2,3,4, R. Brancaccio2,3,4,
M. P. Morigi2,3,4, A. Patera5, I. Jerjen5, S. Hartmann6, and R. Kaufmann6
1Faculté des sciences de base, Ecole Polytechnique Fédérale de Lausanne (EPFL),
CH‐1015 Lausanne, Switzerland 2Centro Fermi, 00184 Roma, Italy
3Dipartimento di Fisica e Astronomia, Università di Bologna, 40127 Bologna, Italy 4INFN Sezione di Bologna, 40127 Bologna, Italy
5Swiss Light Source, Paul-Scherrer-Institute, Villigen, Switzerland 6Center for X-ray Analytics, Swiss Federal Laboratories for Materials Science and Technology, Dübendorf, Switzerland
In recent experiments (Albertin et al. 2015/1; Albertin et al. 2015/2; Albertin et al. 2015/3), we
successfully used X‐ray tomography to read texts inside ancient manuscripts. As an example, Fig. 1
shows a reconstructed portion of a 200‐page handwritten physics book from the 18th century. Our
tests did not use centralized synchrotron facilities: advanced microfocus X‐ray sources provided
sufficient contrast and resolution.
Fig. 1: X‐ray tomography reconstruction of handwriting from inside a 1790 physics book. The reconstructed
portion of the book exhibits readily recognizable characters and words. The tomography was based on raw
projection radiographs obtained with a laboratory‐based microfocus source (Albertin et al. 2015/3).
Also recently, we coupled this technique with photogrammetry that produced accurate 3‐
dimensional renderings of the objects. The combination provides correlated information on the
content and structure of the manuscripts.
The use of tomography to analyze ancient manuscripts (Albertin et al. 2015/1; Albertin et al. 2015/2;
Albertin et al. 2015/3) is the response to multiple challenges: (1) “reading” unopened volumes and
scrolls; (2) in general, avoiding as much as possible the manipulation of the specimens, to prevent
possible damage (we observed no radiation effects in our tests); (3) in the long term, the rapid and
non‐invasive digitization of large historical collections like the Archivio di Stato in Venice – the target
of our “Venice Time Machine” project (http://vtm.epfl.ch/). The foundation of the technique is the
widespread use throughout Europe of inks containing X-ray-absorbing heavy elements. Indeed, our
chemical analysis detected “iron gall” black inks in all the specimens we have examined so far, spanning six centuries.
The first stage of our program (Albertin et al. 2015/1) used X‐rays emitted by centralized synchrotron
sources. The beam quality was outstanding, but it forced us to move specimens outside their normal
environment, traveling over long distances. This obviously limited the potential applications.
This limitation was overcome thanks to an important recent success: the use of laboratory-based
X-ray instrumentation, suitable for the analysis of large-area manuscripts, without an unacceptable
loss of quality. Figure 1 is one of several recent results of this kind.
We are now dealing with a challenging obstacle on the path of large‐scale application: automatic
separation of individual pages ‐‐ from manuscripts that are typically warped and sometimes rolled.
We experimented with advanced algorithms, obtaining promising results. However, automatic
segmentation remains a formidable challenge: we will discuss the present problems and the possible
solutions.
Besides text recognition, X-ray techniques can also deliver a wealth of additional information on: (1)
the substrate's microstructure; (2) the writing process (e.g., paleographic features such as the
“ductus”); (3) chemical data on inks, both black and colored (which typically contain heavy elements);
(4) manuscript structural features such as seals and watermarks; (5) in general, the “hidden” structure
of the specimens. Potentially, the approach could also contribute to current studies of ink-substrate
interactions, in particular ink-induced damage.
Our main target, however, remains text recognition. The recent successes in applying the approach
to specimens with a large number of pages (see again Fig. 1) open up exciting possibilities in that
direction ‐‐ corroborating and complementing the important results recently obtained, for example,
by Mocella et al. (2015) on the Herculaneum papyri.
References
F. Albertin, A. Astolfo, M. Stampanoni, E. Peccenini, Y. Hwu, F. Kaplan and G. Margaritondo, J. Synchrotron Rad. 22, 446 (2015)
F. Albertin, A. Patera, I. Jerjen, S. Hartmann, E. Peccenini, F. Kaplan, M. Stampanoni, R. Kaufmann and G. Margaritondo, Microchemical J. (2015), In Press
F. Albertin, E. Peccenini, Y. Hwu, Tsung-Tse Lee, E. B. L. Ong, J. H. Je, F. Kaplan and G. Margaritondo, Proc. Intern. Conf. “Digital Heritage” (2015), p. 5
V. Mocella, E. Brun, C. Ferrero, and D. Delattre, Nature Commun. 6, 5895 (2015)
Recovering lost commentaries on Aristotle’s treatise On the Heavens in
Venice manuscript Marcianus Gr. 210
Vito Lorusso and Boriana Pouvkova
Centre for the Study of Manuscript Cultures, Hamburg, Germany
The manuscript Marcianus Gr. 210, written in the late twelfth or early thirteenth century on oriental
paper and kept at the Biblioteca Nazionale Marciana of Venice, consists of 207 leaves and contains
three of Aristotle’s works devoted to natural philosophy, namely On the Heavens on leaves 1r‐80v, On
Generation and Corruption on leaves 80v‐122v, and Meteorology on leaves 123r‐207r.
The text of Aristotle’s works is enriched with several commentaries written by the main scribe in the
margins of almost every page of the manuscript. Marcianus Gr. 210 has suffered badly from the
ravages of time. More specifically, the manuscript is faded and damaged by water, with the result
that nearly all the commentaries written in the margins are no longer visible to the
naked eye. In the course of a multispectral imaging campaign in October 2014, the Hamburg SFB
project Z01 acquired improved data from the manuscript. This talk will present some results of the
research carried out on these new data.
The Centre of Image and Material Analysis in Cultural Heritage (CIMA) in Vienna, Austria
Manfred Schreiner1, Heinz Miklas2, Claudia Rapp3, Robert Sablatnig4, Wilfried Vetter1,
Bernadette Frühmann1, and Fabian Hollaus4
1Institute of Science and Technology in Art, Academy of Fine Arts Vienna, Austria 2Institute of Slavic Studies, University of Vienna, Austria
3Institute of Byzantine and Modern Greek Studies, University of Vienna, Austria 4Computer Vision Lab, Vienna University of Technology, Austria
The inter-university Centre of Image and Material Analysis in Cultural Heritage (CIMA) was founded
in early 2014 within the framework of the HRSM project (HRSM: Hochschul-Raum-Struktur-Mittel /
structural fund for the Austrian higher education area), Higher Education Plan 2013 of the Austrian
Federal Ministry of Science and Research. The main aim of this centre is the “Analysis and
Conservation of Cultural Heritage – Modern Imaging and Material Analysis Methods for the
Visualization, Documentation and Classification of Historical Written Material (Manuscripts)”.
Specialized in research in the fields of imaging, image enhancement and analysis as well as the non‐
invasive chemical analysis of materials used for the production of historical objects, CIMA represents
a unique facility with an interdisciplinary approach to the investigation of cultural heritage. The
centre brings together the expertise of three disciplines from three universities: Philology (University
of Vienna), Computer Science (Vienna University of Technology) and Chemistry (Vienna Academy of
Fine Arts). The main idea behind the foundation of CIMA was to extend and strengthen co‐operations
by establishing a central laboratory that offers its services to universities, libraries, museums, private
collections etc.
One part of CIMA concerns MultiSpectral Imaging (MSI), which, in combination with digital image
processing, enables on the one hand the enhancement of the readability of palimpsests and damaged
manuscripts and, on the other, certain automated investigations of the codicology and palaeography
of manuscripts, such as layout, line structure, or identification of scribes. In the second part,
so-called non-destructive / non-invasive analytical techniques such as X-ray fluorescence (XRF),
UV-Vis, reflection infrared (FTIR) and Raman spectroscopy are applied to manuscripts in order to
determine the pigments and/or inks used for their illumination and text. This combination
facilitates the creation of new and improved data in the humanities. Until now, CIMA has applied its
methodology and technical expertise to badly preserved or rewritten manuscripts (palimpsests) from
the 8th to the 14th centuries (mainly in Slavic, Greek and Latin). The material investigations aim at
identifying the inks and pigments used, as distinct from the supporting material (presently the
focus is on parchment).
In the course of the project, a common database will be created containing the information
gained from the imaging, image enhancement, and chemical and philological investigations. CIMA's
final objective is to compare the data generated in the course of its research and to reveal
correlations across multiple modalities (writing material and its preparation, inks and
pigments, reflectivity, etc.) in order to advance the research agenda at the intersection of science
and the humanities.
Application of Hyperspectral Imaging for Quantitative Assessment of
Conservation Treatments for Documents
Tomasz Łojewski1 and Damian Chlebda2
1AGH University of Science and Technology, Faculty of Materials Science and Ceramics,
Krakow, Poland 2Jagiellonian University, Faculty of Chemistry, Krakow, Poland
Conservation procedures performed on documents (e.g. consolidation, deacidification, cleaning,
disinfection) often lead to various kinds of changes in their appearance.
Evaluation and documentation of the desired changes, as well as the unwanted ones, is based
primarily on optical methods: chiefly digital photography (or flatbed scanning) and/or
colorimetric measurements. Consumer digital cameras or scanners can produce images with very
high spatial resolution but do not provide sufficient spectral information to determine colorimetric
indices (e.g. CIE L*a*b*).
Colorimetric measurements are restricted to relatively large areas of homogeneous color on a
studied document, which in practice limits their use to the paper (or parchment) substrate and does
not allow recording and monitoring of possible alterations induced in the writing media.
Hyperspectral imaging allows these difficulties to be overcome, offering both the spatial and
spectral resolution needed for such a task.
In the presentation, a detailed workflow will be described for monitoring side effects of
conservation treatments on paper-based documents with the use of a scanning hyperspectral system
comprising a VisNIR camera (Headwall Photonics) and a broadband illumination source (xenon
lamp).
A set of colorimetric standards and model samples of modern inks on paper was used to test and
validate the procedure of datacube collection, normalization, registration, and recalculation from
reflectance spectra to CIE L*a*b* color values. The method was then applied to quantitatively
monitor color changes in archival documents subjected to two novel conservation treatments:
(1) cold plasma and (2) essential-oils disinfection. A comparison with data obtained with a
filter-based multishot imaging system (7 spectral lines in VIS) with a monochrome camera
(Point Grey/CMOSIS) will also be provided.
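The recalculation from reflectance spectra to CIE L*a*b* follows the standard colorimetric pipeline (reflectance → XYZ → L*a*b*). A minimal sketch: the three-wavelength illuminant-weighted colour-matching table below is invented for illustration (real CIE tables span 380-780 nm), while the XYZ → L*a*b* formulas are the standard ones:

```python
# Toy illuminant-weighted colour-matching samples at three wavelengths
# (invented values for illustration only).
CMF = {
    450: (15.0, 4.0, 70.0),
    550: (35.0, 80.0, 5.0),
    650: (45.0, 16.0, 0.1),
}

def reflectance_to_xyz(refl):
    """Integrate a reflectance spectrum (values 0..1 per wavelength)
    against the illuminant-weighted colour-matching functions."""
    X = sum(refl[w] * CMF[w][0] for w in CMF)
    Y = sum(refl[w] * CMF[w][1] for w in CMF)
    Z = sum(refl[w] * CMF[w][2] for w in CMF)
    return X, Y, Z

def xyz_to_lab(X, Y, Z, Xn, Yn, Zn):
    """Standard CIE XYZ -> L*a*b* conversion relative to a white point."""
    def f(t):
        d = 6 / 29
        return t ** (1 / 3) if t > d ** 3 else t / (3 * d ** 2) + 4 / 29
    fx, fy, fz = f(X / Xn), f(Y / Yn), f(Z / Zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

# White point: a perfect diffuse reflector under this illuminant.
white = {w: 1.0 for w in CMF}
Xn, Yn, Zn = reflectance_to_xyz(white)

# A neutral grey (50% reflectance everywhere) should give a* = b* = 0.
grey = {w: 0.5 for w in CMF}
L, a, b = xyz_to_lab(*reflectance_to_xyz(grey), Xn, Yn, Zn)
```

Applying this conversion pixel by pixel across the datacube yields per-pixel L*a*b* values, so colour changes caused by a treatment can be quantified (e.g. as ΔE) even on the fine strokes of the writing media.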
Computation and Palaeography: Where are we Now?
Peter A. Stokes
Department Digital Humanities, King’s College London, United Kingdom
The primary purpose of this lecture is to provide a survey of the field, focussing on developments
since the 2012 Dagstuhl Perspectives Workshop on ‘Computation and Palaeography: Potentials and
Limits’ (Hassner et al. 2013), in which a number of issues were discussed and identified as essential
to future development in the use of digital methods in the analysis of handwriting and other related
topics. Although this was by no means the first such conference, it was perhaps one of the more
significant in terms of bringing together expertise in palaeography, digital humanities and informatics
at an important time when many thousands if not millions of digital images were being produced in
large‐scale digitisation projects. However, three years is a long time in this field, and so it is worth
revisiting the discussions that were held there and asking where we are now, and where we might
want to go next.
The ‘manifesto’ from that workshop identified a number of areas for future development. Most of
these were not technological or algorithmic but related much more to aspects of communication and
collaboration. They can be broadly summarised into three overall headings:
1 Access to and sharing of data and images, including standards, metadata, harmonisation of
copyright and intellectual property.
2 Access to and sharing of results and methods, including tools, libraries and resources.
3 Increased communication and understanding particularly between disciplines, including addressing
problems of terminology, developing meaningful ontologies and ‘mid‐level features’,
avoiding ‘black boxes’, and addressing questions of context and meaning.
Since the publication of the ‘manifesto’, a Dagstuhl Seminar took place on ‘Digital Palaeography: New
Machines and Old Texts’ during which many of these questions were revisited (Hassner et al. 2014).
The conclusions then were that the problems of the ‘black box’ were at least much more widely
recognised than before, and that concerted effort was being made to address the problem. However,
it was also recognised more explicitly than before that the ‘black box’ applies not only to the
computer but also to the human specialist. This point had been made before (Davis 2007; Schomaker
2007; Stokes 2009), but raising it explicitly here changed the question somewhat from one of
obscurity to one of trust: not ‘how can we know what is happening in the box’ but ‘how can we (and
should we) trust others’ conclusions?’ Research brings with it a responsibility to be as transparent as
possible and also to challenge and question each other’s results, yet an interdisciplinary context
necessarily requires a wider range of expertise than any one individual can reasonably be expected
to understand. Indeed, it was also recognised that ‘digital palaeography’ is perhaps still not interdisciplinary enough: that it must involve more than just palaeography and image analysis, expanding to include other areas ranging from ‘GLAM’ institutions (galleries, libraries, archives and museums) through palaeography, codicology, history, art history and linguistics; infrastructure development and support; image analysis, knowledge representation, UI and UX; and so on. Further questions were raised about the possibility of creating toolboxes or suites of web services, and of reviewing and acknowledging the very different metrics for success in different disciplines.
Clearly there has been much progress in the last few years, both in methods and techniques in the
image analysis of visual manuscript features, and also in the more ‘social’ aspects of this area of
research. Old questions remain, however, and new ones are opened up by recent developments. The
promise of toolboxes, VREs, suites of web services and so on has been made for some time but
results have not yet become widespread: why this is so requires examination. The question of
measures for success seems fundamental here and easy to overlook: if different disciplines do indeed
have truly different goals, then how can we make these explicit in order to address them properly
and ensure that all parties can genuinely benefit from the work that is being done? This is perhaps
part of the reason why many palaeographers still consider image analysis to have no value for their
research: because the goal for them is not to analyse images or generate data but to understand
aspects of human history and culture, and how to get from the former to the latter is still not
sufficiently clear. If we can soon achieve very good results for key problems as some have suggested,
such as wordspotting, image segmentation, identification of allographs, and writer identification,
then what will the consequences be for our research (and what would ‘very good results’ mean to
different people in different fields)? What questions can be addressed by existing methods but have not so far been considered (for some examples see Hassner et al. 2014 and especially Stokes 2015)? These and other related questions will be raised and addressed, particularly through
examination of existing projects and perhaps less common approaches, with a view not towards
establishing the state of the art in terms of algorithms and computational methods, but rather to
broaden the discussion, widening the context in the hope of inspiring some new thoughts and
directions of research from which all parties might benefit.
References
T. Davis, The practice of handwriting identification, The Library 7th series, 8, 251–76 (2007). 10.1093/library/8.3.251
T. Hassner, M. Rehbein, P.A. Stokes, L. Wolf, Computation and palaeography: Potentials and limits, Dagstuhl Manifestos 2, 14–35 (2013). 10.4230/DagMan.2.1.14
T. Hassner, R. Sablatnig, D. Stutzmann, S. Tarte. Digital palaeography: New machines and old texts, Dagstuhl Reports 4, 127–8 (2014). 10.4230/DagRep.4.7.112
L. Schomaker, Advances in writer identification and verification, in: Proc. of 9th Int. Conf. on Document Analysis and Recognition (ICDAR), 2, 1268–73 (2007). 10.1109/ICDAR.2007.4377119
P.A. Stokes, Computer‐aided palaeography, present and future, in: M. Rehbein et al. (ed), Kodikologie und Paläographie im Digitalen Zeitalter — Codicology and Palaeography in the Digital Age, Books on Demand, Norderstedt, 2009, pp. 313–42. urn:nbn:de:hbz:38‐29782.
P.A. Stokes, Digital approaches to palaeography and book history: Some challenges, present and future, Front. Digit. Humanit. 2 (2015). 10.3389/fdigh.2015.00005
Visual Saliency for Visual Feature Analysis of Historical Manuscripts
Ehsan Arabnejad1, Hossein Ziaei Nafchi1, Elaine Treharne2, Celena Allen3, Benjamin L. Albritton4,
and Mohamed Cheriet1
1 Synchromedia Laboratory, École de Technologie Supérieure, Montreal, Canada, H3C 1K3 2 Department of English, Stanford University, CA, USA
3 Center of Spatial and Textual Analysis, (Cesta), Stanford, CA, USA 4 Stanford University Libraries, Stanford, CA, USA
Introduction
Visual feature extraction and analysis is a very important step towards the categorization and understanding of historical manuscripts. While the human visual system (HVS) can easily recognize and localize significant features in historical images, automatic detection of salient features is not an easy task. The salient features may be text or graphics with salient colors or salient shapes. Depending on the documents under study, analysis of such features may reveal important information about the organization and structure of a manuscript, such as the beginning of a new significant section, new or important text, or the beginning of a new chapter. Detection and analysis of these visual features also help us to investigate the relation and interaction between authors and writers. Current visual saliency detection algorithms are not designed to deal with documents, and degraded documents in particular. Degradations in historical documents often have irregular patterns that might mistakenly be considered salient features. While color‐saliency‐based methods cannot deal with gray‐scale images, shape‐based methods that do not use color information are unable to detect salient color regions. Moreover, there is no dataset of historical document images with associated ground truths that can be used to evaluate different saliency detection algorithms. Our project, in collaboration with Stanford University, is part of a Digging into Data project. In this project, 198 manuscripts from the 11th, 12th, and 13th centuries were selected as targets for feature extraction. Some of them belong to a single century, while others span two or three centuries. The authors used different colors, shapes, and decorations to organize the manuscripts and help readers understand them. Our aim was to extract the salient color regions (characters) in these document images. After feature extraction, the goal was to classify the features into four categories: i) Litterae Notabiliores, ii) Enlarged Capitals, iii) Rubrics, and iv) Intertextual Space. The extraction of colored characters is in fact a segmentation problem. Gaussian mixture models with expectation maximization and K‐Means are classical approaches to color image segmentation. Three main problems make the use of these approaches challenging: the images in this study are degraded, the number of classes varies from one image to another, and the background color might be very similar to the colored text. In the following, we explain the proposed method that we developed to overcome these problems.
Proposed Method for Feature Analysis
The proposed method uses a new color saliency technique, as well as the color saliency method of [1]
to segment document images. In [1], instead of computing the gradient of the images from a
luminance channel, the color images are boosted and a gradient map is computed from that boosted
image. The advantage of using this gradient map is that edge strengths at colored contours often
have higher magnitudes. This is in contrast to the traditional edge detection methods that work on a
luminance channel. Since the background often has less gradient information, it can be distinguished from text with the same color as the background. This approach, however, cannot always
distinguish between non‐salient text and the salient text (colored). Therefore, we propose a simple
and efficient color saliency method to classify image pixels into non‐salient and salient color pixels.
Our assumption is that the variation of the three color channels (RGB) is high for colored pixels. For each pixel, the standard deviation of the three channels is computed. To take different colors into account, this process is repeated three times: on the original image, on a variance‐normalized image, and on a variance‐normalized image with the red channel divided by 2. The three resulting saliency maps cover a wide range of colors, which is enough to classify salient and non‐salient regions. From each of these three maps, pixels with higher variation are selected and a binary map is generated by thresholding; the binary maps are then combined. Finally, the binary image obtained from the color gradient [1] and the binary image generated with the proposed method are combined to form the final segmented image.
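The per-pixel channel-variation idea can be sketched as follows. This is a minimal illustration in plain Python on nested lists of RGB tuples; the function names and the toy threshold are ours, as the abstract does not specify its thresholding parameters:

```python
import statistics

def saliency_map(image):
    """Per-pixel colour saliency: standard deviation of the three RGB
    channel values.  Strongly coloured pixels score high; near-grey
    (and typical background) pixels score close to zero."""
    return [[statistics.pstdev(px) for px in row] for row in image]

def binarize(sal, threshold):
    """Select pixels with high channel variation (toy threshold)."""
    return [[1 if v > threshold else 0 for v in row] for row in sal]

def combine(maps):
    """Pixel-wise OR of several binary maps, as when merging the three
    saliency variants (original, variance-normalised, red halved)."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[1 if any(m[r][c] for m in maps) else 0 for c in range(w)]
            for r in range(h)]

# A red pixel next to a grey one: only the red pixel is flagged salient.
img = [[(200, 30, 30), (120, 120, 120)]]
mask = binarize(saliency_map(img), 40)
```

In practice the threshold would be tuned per saliency variant, and the combined map would then be intersected or merged with the colour-gradient binary image of [1] as described above.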
The next step is to analyze the extracted features; the goal is to automatically assign a label to each feature. Litterae Notabiliores are made of components with different colors, while only one color is used to write capitals. Therefore, we simply use second‐order image statistics, e.g. the standard deviation of the three channels, to classify features into Litterae Notabiliores and capitals. In addition, entropy was used to measure the amount of “busyness” of each feature; both the standard deviation and the entropy of capitals should be small in comparison with Litterae Notabiliores. For the final decision, a support vector machine was trained and used. Rubrics are another interesting feature in this dataset. Like capitals, rubrics are made of just one color. To distinguish between capitals and rubrics, layout analysis was employed: if the detected color saliency matched specific constraints on text‐lines, columns, or the layout in general, it was labeled as a Rubric. The last feature to be detected is Intertextual Space. For this purpose, the structure and layout of the document are analyzed to find spaces that are expected to contain text according to the layout but are kept void.
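As a rough illustration of these two statistics, the sketch below computes a per-channel standard deviation (high for multicoloured Litterae Notabiliores components, near zero for single-colour capitals) and the grey-level entropy used as the "busyness" measure. The helper names and toy pixel lists are ours; the trained SVM that makes the final decision in the paper is omitted here:

```python
import math
import statistics
from collections import Counter

def feature_std(pixels):
    """Average over the R, G, B channels of the standard deviation of
    that channel across the feature's pixels.  A single-colour capital
    yields values near zero; a multicoloured feature scores high."""
    channels = zip(*pixels)  # R values, G values, B values
    return statistics.mean(statistics.pstdev(ch) for ch in channels)

def entropy(pixels):
    """Shannon entropy of the grey-level histogram of the feature."""
    greys = [round(sum(p) / 3) for p in pixels]
    n = len(greys)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(greys).values())

multicolour = [(200, 30, 30), (30, 200, 30), (30, 30, 200)]
capital = [(180, 40, 40)] * 3   # one colour repeated
```

Both values would then be fed, together with other statistics, into the SVM that separates Litterae Notabiliores from capitals.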
The performance of the proposed method for the classification of the four features, in terms of recall and precision, is listed in Table I. The time complexity of processing such a big dataset is an important concern. The proposed color saliency method is very simple and fast, while the color gradient algorithm of [1] has moderate complexity. We could therefore process all of the images in the dataset in a relatively short time.
Table I. The classification performance of the proposed method for the four features.

                        Recall   Precision
Litterae Notabiliores    0.61      0.95
Enlarged Capitals        0.50      0.83
Rubrics                  0.70      0.81
Intertextual Space       0.75      0.60
Acknowledgement
The authors would like to thank the DiDC (NSERC RGP DD‐13) project, and also NSERC of Canada, for their financial support.
Reference
J. van de Weijer, T. Gevers, and A. D. Bagdanov, Boosting color saliency in image feature detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (1) (2006).
Bag‐of‐descriptors of SIFT for Segmentation‐Free Word Spotting in Handwritten Arabic Documents
Y. Elfakir1, G. Khaissidi1, M. Mrabti1, M. A. El Yaccoubi2, Z. Lakhliai1, and D. Chenouni1
1LIPI / ENS, Fes, Morocco 2SAMOVAR, Télécom SudParis, CNRS, Université Paris‐Saclay, France
Old manuscripts are a part of the richest cultural heritage and legacy of civilizations. Repetitive
manual manipulation of fragile documents should be avoided as it could destroy them. Digitization, therefore, is a convenient solution for the preservation of these manuscripts. Many digitization projects treating Latin scripts have been developed, such as the manuscripts d’Oc and d’Oïl in the Vatican Library (IRHT 2011) and Better Access to Manuscripts and Browsing of Images (Calabretto et al. 1999). The design of recognition systems for degraded handwritten Arabic document images is expanding rapidly today and appears to be a necessity in order to exploit the wealth of information contained in ancient manuscripts.
This paper deals with the problem of query‐by‐example word spotting in handwritten Arabic documents. Performed by manual inspection, this operation requires a great deal of time and effort. Many existing word spotting architectures rely on text, word, or line segmentation steps (Rath and Manmatha 2003; Elfakir 2015) to facilitate the search. However, any
segmentation errors of the document affect the subsequent word representations and matching
steps. This explains why research on word spotting and retrieval is oriented towards segmentation‐
free methods. Gatos and Pratikakis (2009) present an approach applied to historical printed
documents. The proposed method is based on document image block descriptors that are used in a
template matching process. Rothacker et al. (2013) propose to combine the Bag‐of‐visual‐word
representation with Hidden Markov Models in a patch‐based segmentation‐free framework in
handwritten documents. Almazán et al. (2014) represent document images by a grid of HOG
descriptors and a sliding‐window approach is used to locate in the document the regions that are
most similar to the query.
We address the search problem by using a Bag of Visual Words (BoVW) powered by Scale‐invariant
feature transform (SIFT) descriptors. The BoVW method, based on a histogram of occurrence counts
of words, is a popular technique for image classification inspired by models used in natural language
processing. This representation does not take into account the spatial distribution of the visual words.
To solve this problem, we use the Spatial Pyramid Matching method proposed by Lazebnik et al.
(2006). Then, the Latent Semantic Analysis method introduced by Landauer et al. (1998) is applied to
represent the local region descriptors in order to resolve the ambiguity and redundancy of individual visual words in the document. Finally, to reduce the memory footprint of the local region descriptors and the computational cost of nearest‐neighbor search, we encode the SIFT descriptors using the Product Quantization (PQ) method (Jégou et al. 2011), which decomposes the space into a Cartesian product of low‐dimensional subspaces and quantizes each subspace separately.
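Product Quantization itself is easy to sketch. In the snippet below the per-subspace codebooks are hard-coded toy centroids (in practice they are learned with k-means on training descriptors, and the subvector dimension would divide the 128-D SIFT descriptor); the function names are ours:

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pq_encode(vector, codebooks):
    """Split the vector into len(codebooks) subvectors and store, for
    each subspace, the index of its nearest centroid."""
    d_sub = len(vector) // len(codebooks)
    code = []
    for j, centroids in enumerate(codebooks):
        sub = vector[j * d_sub:(j + 1) * d_sub]
        code.append(min(range(len(centroids)),
                        key=lambda k: dist2(sub, centroids[k])))
    return code

def pq_decode(code, codebooks):
    """Approximate reconstruction: concatenate the chosen centroids."""
    out = []
    for j, k in enumerate(code):
        out.extend(codebooks[j][k])
    return out

# Two 2-D subspaces with two centroids each: a 4-D vector is
# compressed to two small indices.
codebooks = [[(0, 0), (10, 10)], [(0, 0), (10, 10)]]
code = pq_encode((9, 11, 1, -1), codebooks)   # -> [1, 0]
```

Because each subspace is quantized independently, the codebook sizes stay small while the Cartesian product of centroids covers a very large set of reconstructions, which is what makes PQ memory-efficient for nearest-neighbor search.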
The proposed method was applied to handwritten Arabic document images from the Ibn Sina
dataset (Moghaddam et al. 2010) and other Arabic documents. The obtained results are satisfactory
in terms of recognition rate and execution time.
References
J. Almazán, A. Gordo, A. Fornés, and E. Valveny, Segmentation‐free word spotting with exemplar SVMs, Pattern Recognition 47 (12) (2014), pp. 3967–3978.
S. Calabretto, A. Bozzi, J.‐M. Pinon, Numérisation des manuscrits médiévaux: le projet européen BAMBI, in: Actes du colloque Vers une nouvelle érudition: numérisation et recherche en histoire du livre, Rencontres Jacques Cartier, Lyon, December 1999.
Y. Elfakir, G. Khaissidi, M. Mrabti, Z. Lakhliai, D. Chenouni, and M. Elyacoubi, Contribution à l’indexation des documents manuscrits arabes scannés, Mediterranean Telecommunication Journal 5 (2) (2015).
B. Gatos and I. Pratikakis, Segmentation‐free word spotting in historical printed documents, in: International Conference on Document Analysis and Recognition, Proceedings, (2009), pp. 271–275.
IRHT, coord. Maria Careri (Université de Chieti, membre associé à l’IRHT), Anne‐Françoise Leurquin and Marie‐Laure Savoye (http://jonas.irht.cnrs.fr, 2011–2021).
H. Jégou, M. Douze, and C. Schmid, Product quantization for nearest neighbor search, IEEE Trans. Pattern Anal. Mach. Intell. 33 (1) (2011), pp. 117–128.
S. Lazebnik, C. Schmid, and J. Ponce, Beyond bags of features: spatial pyramid matching for recognizing natural scene categories, in: International Conference on Computer Vision and Pattern Recognition, Proceedings of the IEEE Computer Society, (2006), pp. 2169–2178.
T. Landauer, P. Foltz, and D. Laham, Introduction to Latent Semantic Analysis, Discourse Processes 25 (1998), pp. 259–284.
R. F. Moghaddam, M. Cheriet, M. M. Adankon, K. Filonenko, and R. Wisnovsky, IBN SINA: A database for research on processing and understanding of Arabic manuscripts images, Proceedings of DAS’10, June 9–11, 2010, Boston, MA, USA.
T. M. Rath and R. Manmatha, Word image matching using dynamic time warping, in: International Conference on Computer Vision and Pattern Recognition, Proceedings, (2003), volume 2, pp. 521–527.
L. Rothacker, M. Rusiñol, and G. Fink, Bag‐of‐features HMMs for segmentation‐free word spotting in handwritten documents, in: 12th International Conference on Document Analysis and Recognition, Proceedings, (2013), pp. 1305–1309.
Transcript Alignment for Historical Manuscripts
Rafi Cohen, Klara Kedem, and Jihad El‐Sana
Department of Computer Science, Ben‐Gurion University of the Negev, Israel
The recent efforts invested in digitizing libraries have exposed historical datasets to scholars and the general public. However, the documents in these datasets are stored as images, not as text, which makes searching, indexing, and retrieval challenging tasks. Sometimes an ASCII
transcript is supplied together with the document’s image. A mapping (aligning) of each word in the
transcript to the corresponding word image in the document will simplify and accelerate accessing
and processing the manuscripts. In addition to allowing searching and indexing of the document
images, alignment provides an automatic way for ground truth generation, which, in turn, can be
used to evaluate various document retrieval and recognition algorithms.
Transcript alignment methods suggested in the literature can be roughly divided into two categories
depending on whether or not character/word recognition models are used. Recognition based
methods usually perform better, but require more preprocessing for training the character/word
recognizer (Yin 2013). Methods that do not use recognition models usually reduce the problem to a matching problem between features extracted from the line image and features generated from the ASCII transcript, using a matching technique such as Dynamic Time Warping (DTW) (Rabaev et al. 2015) or other heuristic matching methods, e.g., Hassner et al. (2013) and Stamatopoulos et al. (2014).
Our work is inspired by work done in the speech recognition community on speech‐to‐phoneme alignment (Keshet et al. 2007). It uses the Structured Support Vector Machine (S‐SVM) framework to learn a weight vector that separates correct alignment sequences from incorrect ones.
In the alignment problem, we are provided with a line image accompanied by a sequence of events (characters), and the goal is to align each event in the sequence with its corresponding position in the line, i.e., to find the start time of each event in the input line, where the space between words is also considered an event. More formally, we represent a line
image as a sequence of feature vectors $x = (x_1, x_2, \ldots, x_T)$, and the sequence of events is denoted by $e = (e_1, e_2, \ldots, e_K)$. Each input is thus a pair $(x, e)$, and the output is an alignment of $x$ with $e$, that is, a sequence of start‐times $y = (y_1, y_2, \ldots, y_K)$, where $y_k \in \{1, \ldots, T\}$ is the start‐time of the event $e_k$ in the line image. Our goal is to learn an alignment function, denoted $f$, which takes as input the pair $(x, e)$ and returns an event timing sequence $y$.
We use the Structured Support Vector Machine (S‐SVM) framework for predicting the correct
alignment. The S‐SVM is a machine learning algorithm that generalizes the SVM classifier. Whereas
the SVM classifier supports simple output, such as, binary classification, regression, etc. the S‐SVM
allows training of a classifier for predicting complex labels.
The first step towards a solution is to define a quantitative assessment of alignments. Let $(x, e, y)$ be a training example and let $f$ be an alignment function. We denote by $\gamma(y, f(x, e))$ the cost of predicting the timing sequence $f(x, e)$ where the true timing sequence is $y$. In this work we use the cost function defined in Eq. (1). In words, the cost is the average number of times the absolute difference between the predicted timing sequence and the true timing sequence is greater than $\varepsilon$:

$$\gamma(y, y') \;=\; \frac{1}{|y|}\,\bigl|\{\, i : |y_i - y'_i| > \varepsilon \,\}\bigr| \qquad (1)$$
We describe a large margin approach for learning $f$. Recall that a learning algorithm for alignment receives as input a training set $S = \{(x_1, e_1, y_1), \ldots, (x_m, e_m, y_m)\}$ and returns an alignment function $f$. To facilitate an efficient algorithm, we confine ourselves to a restricted class of alignment functions. Specifically, we assume a predefined set of base alignment feature functions $\{\phi_j\}_{j=1}^{n}$, where $\phi_j(x, e, y)$ returns the confidence of $\phi_j$ in the suggested timing sequence. We denote by $\phi(x, e, y)$ the vector in $\mathbb{R}^n$ whose $j$‐th element is $\phi_j(x, e, y)$. The alignment functions we use are of the form given in Eq. (2), where $w \in \mathbb{R}^n$ is a vector of importance weights that we need to learn. In words, $f$ returns a suggestion for a timing sequence by maximizing a weighted sum of the confidence scores returned by each base alignment function $\phi_j$. The actual computation of the $\arg\max$ operator is done using dynamic programming.

$$f(x, e) \;=\; \underset{y}{\arg\max}\; w \cdot \phi(x, e, y) \qquad (2)$$
We now describe a large margin approach for learning $w$ from the training set $S$. We try to rank the sequences according to their quality. Ideally, for each instance $(x_i, e_i)$ and for each possible suggested timing sequence $y'$, we would like constraint (3) to hold; that is, $w$ should rank the correct timing sequence $y_i$ above any other possible timing $y'$ by at least $\gamma(y_i, y')$:

$$w \cdot \phi(x_i, e_i, y_i) \;\geq\; w \cdot \phi(x_i, e_i, y') + \gamma(y_i, y') \qquad (3)$$

We follow the SVM approach and define the optimization problem (4), where each $\xi_i \geq 0$ is a slack variable that indicates the loss of the $i$‐th example:

$$\min_{w,\,\xi} \;\; \frac{1}{2}\|w\|^2 + C \sum_i \xi_i \quad \text{subject to} \quad \forall i, y' : \;\; w \cdot \phi(x_i, e_i, y_i) - w \cdot \phi(x_i, e_i, y') \;\geq\; \gamma(y_i, y') - \xi_i \qquad (4)$$
Equation (4) cannot be solved using standard solvers, since the number of constraints is exponential. We therefore solve it using an iterative algorithm similar to the Perceptron algorithm (Rosenblatt 1957).
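A Perceptron-style update of this kind can be sketched as follows. This is a simplification under stated assumptions: the real method finds the argmax over all timing sequences by dynamic programming, whereas this toy version enumerates an explicit candidate list, and `phi` here is a placeholder feature function, not one of the base functions described in the paper:

```python
def dot(w, phi_vec):
    """Inner product of the weight vector and a feature vector."""
    return sum(a * b for a, b in zip(w, phi_vec))

def cost(y_true, y_pred, eps=2):
    """Eq. (1): fraction of events whose predicted start time differs
    from the true start time by more than eps positions."""
    return sum(abs(t - p) > eps
               for t, p in zip(y_true, y_pred)) / len(y_true)

def perceptron_step(w, phi, example, candidates):
    """One pass of the Perceptron-like update: predict the highest-
    scoring timing sequence among `candidates`; on a costly mistake,
    move w towards the feature vector of the correct alignment."""
    x, e, y_true = example
    y_pred = max(candidates, key=lambda y: dot(w, phi(x, e, y)))
    if cost(y_true, y_pred) > 0:
        w = [wi + a - b
             for wi, a, b in zip(w, phi(x, e, y_true), phi(x, e, y_pred))]
    return w
```

Iterating this step over the training set drives the score of the correct timing sequence above the scores of competing sequences, approximating the margin constraints of problem (4).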
We define four base functions, our first base function is a character recognizer based on the HOG
descriptor combined with linear SVM. In particular, to train a classifier for a character, e, we extract
positive and negative examples for e from the training set. Our second and third base functions are
binary indicator functions, which aim at capturing transitions between events. The second function is based on the projection profile: we compute the strict local minima of the projection profile of the line image and, for each column, return a binary indicator of whether it lies in the vicinity of such a local minimum. The third base function is based on connected components in the binarized image. We
scan the line image from left to right, and whenever we encounter within a column a new connected
component, the column and its neighboring columns are marked as 1. Our last base function scores timing sequences based on character length: it examines the length of each character, as suggested by $y$, compared to the typical length of that character in the training set.
Our Structured Support Vector Machine (S‐SVM) method was tested on several datasets and provided encouraging results. On the Saint Gall dataset it outperformed the results in (Fischer et al. 2011); in particular, we obtained Accuracy = 97.61%, Precision = 97.61%, Recall = 99.71%. Fig. 1 illustrates four examples of the alignment. The left four lines (a) are taken from (Fischer et al. 2011), whereas the right four (b) are the result of our algorithm. We mark the boundaries of words by dark bars, and the beginnings of characters by light bars.
References
A. Fischer, V. Frinken, A. Fornés, and H. Bunke. Transcription alignment of Latin manuscripts using Hidden Markov Models. In the Workshop on Historical Document Imaging and Processing (HIP’11), pages 29–36. ACM, 2011.
T. Hassner, L. Wolf, and N. Dershowitz. OCR‐free transcript alignment. In the 12th International Conference on Document Analysis and Recognition (ICDAR’13), pages 1310–1314, 2013.
J. Keshet, S. Shalev‐Shwartz, Y. Singer, and D. Chazan. A large margin algorithm for speech‐to‐phoneme and music‐to‐score alignment. IEEE Transactions on Audio, Speech, and Language Processing, 15(8):2373–2382, 2007.
I. Rabaev, R. Cohen, J. El‐Sana, and K. Kedem. Aligning transcript of historical documents using dynamic programming. In Document Recognition and Retrieval XXII (DRR’15), IS&T/SPIE, pages 94020I1–94020I9.
F. Rosenblatt. The Perceptron: a perceiving and recognizing automaton. Report 85‐460‐1, Cornell Aeronautical Laboratory, 1957.
N. Stamatopoulos, B. Gatos, and G. Louloudis. A novel transcript mapping technique for handwritten document images. In the 14th International Conference on Frontiers of Handwriting Recognition (ICFHR’14), pages 41–46, 2014.
F. Yin, Q. Wang, and C. Liu. Transcript mapping for handwritten Chinese documents by integrating character recognition model and geometric context. Pattern Recognition, 46(10):2807–2818, 2013.
Simple and Effective Segmentation‐Free Word Spotting in Historic Documents
Sebastian Sudholt, Leonard Rothacker, and Gernot A. Fink
Department of Computer Science, TU Dortmund University, Dortmund, Germany
Word spotting is the task of searching words in document images without explicitly transcribing the
documents first. Instead, possible matches are ranked according to their relevance with respect to
the query (Bluche et al. 2016). Although a complete transcription would be preferable, as it allows manual and automatic processing of the document far beyond searching, transcriptions of historic document images are hard to obtain in practice. Automatic recognizers usually fail unless the
variability in the script’s visual appearance is low or huge amounts of annotated training material are
available. Especially for historic documents these prerequisites are hardly met (Rothacker et al. 2014).
Word spotting methods, on the other hand, are much more robust in this regard. The search is
directly modeled as a retrieval problem instead of implementing the search on top of a classification
result. Users, therefore, benefit even if there are errors in the recognition, as long as the relevant
results are in the top ranks of the retrieval list (Frinken et al. 2012).
An important characteristic of word spotting systems is the input modality of the query words, usually given as an exemplary image (query‐by‐example) or textually (query‐by‐string) (Lladós et al. 2012). In
query‐by‐example scenarios the query is an exemplary occurrence of the search term that has to be
selected in the document image by the user. While otherwise no annotated training material is
required and the complexity of such systems is relatively low, the drawbacks are limitations with
respect to the feasible variability in the script’s visual appearance and the user’s effort of locating the
query first. Query‐by‐string word spotting systems do not suffer from the aforementioned
disadvantages but require annotated training material and are a lot more complex in comparison (cf.
Frinken et al. 2012). For practical applications of word spotting, the retrieval database consists of
entire document images. In order to perform retrieval on word or line level, one approach is to
heuristically segment document images. These methods often require preprocessing, like
binarization, and assume a priori knowledge about the visual appearance of text. Due to degradations in historic documents originating from writing materials, storage, or age, such assumptions will not be valid in general and lead to errors (cf. Lladós et al. 2012). Subsequent
steps in the recognition pipeline are doomed to fail if they are relying on perfect segmentations and
are sensitive to segmentation errors. One way of approaching this problem is the development of
fully segmentation‐free methods.
One of the first word spotting methods for historic documents that has been evaluated without any
dependency on given line or word segmentations was presented in (Leydier et al. 2007). Query words
are retrieved by detecting and matching zones‐of‐interest in the document image. The authors also
emphasize that their method does not require any binarization. Subsequent approaches to segmentation‐free word spotting were mainly inspired by successful methods from Computer Vision. While in
content‐based image retrieval hardly any assumptions with respect to the visual appearance of
scenes and objects are possible, the same approaches can be applied to retrieving word images. In
Rusiñol et al. (2015) and Almazán et al. (2014) methods are presented that are built on Bag‐of‐
Features Spatial Pyramids (Rusiñol et al. 2015) and Histogram‐of‐oriented‐Gradients representations
(Almazán et al. 2014). Retrieval is performed in patch‐based frameworks where densely sampled
patches are encoded in lower dimensional vector spaces that allow for very fast performance. The
high accuracy of both methods shows the features’ robustness with respect to patches that do not
exactly match with occurrences of the query in the document image. In Rothacker et al. (2014) we
presented a hierarchical method using inverted file structures for rapidly detecting regions of interest
in a first stage. Afterwards, these regions are examined with Bag‐of‐Features HMMs in a patch‐based
framework for highly accurate retrieval. Patch‐based frameworks approach the segmentation
problem by simply considering all possible word positions. Unfortunately, this leads to huge search
spaces, the rapid exploration of which requires indexing strategies as in (Almazán et al. 2014;
Rothacker et al. 2014; Rusiñol et al. 2015). Furthermore, the patch size, as well as orientation, is
crucial for the retrieval performance. Setting the size of the patches to the same size as the query
leads to good results as long as the writing style is homogeneous (Almazán et al. 2014; Rothacker et
al. 2014). For increased flexibility patches at a few sizes have been extracted in Rusiñol et al. (2015).
However, all of these attempts are far from evaluating all possible patch sizes for a given query. In
our experiments we will show that this is insufficient for handling the larger writing‐style variability found in the Bentham word spotting data set (cf. Puigcerver et al. 2015). In order to address
these problems of spotting words on historic document images, we present a simple and effective
approach achieving state‐of‐the‐art results. Using basic off‐the‐shelf methodology, we won the
learning‐free (query‐by‐example) track of the ICDAR 2015 Keyword Spotting Competition (Puigcerver
et al. 2015). Feature representations for patch‐based retrieval frameworks have already proven to be
robust against segmentation errors as patches overlapping only partly with relevant words receive
reasonably high similarity scores. For that reason we propose to use simple techniques for word
segmentation (based on binarization and connected component analysis) and apply simple word
representations from patch‐based segmentation‐free word spotting, similar to Rusiñol et al. (2015).
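The connected-component step can be sketched as follows. This is a plain-Python BFS labelling on an already-binarized image; the binarization itself (e.g. Otsu thresholding) and the grouping of nearby components into word hypotheses are omitted, and the function name is ours:

```python
from collections import deque

def connected_components(binary):
    """Label 4-connected components of foreground (1) pixels with BFS
    and return one bounding box (top, left, bottom, right) per
    component -- a crude character/word segmentation."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue = deque([(r, c)])
                top, left, bottom, right = r, c, r, c
                while queue:
                    y, x = queue.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes
```

The resulting boxes would then be merged by horizontal proximity into word segments and ranked with the patch-based word representations, so that over-segmentation is tolerated as long as relevant segments remain in the candidate list.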
In the future we see two interesting lines of research. Given that it is sufficient to have the relevant
word segments within a list of possible word segments that will be ranked during retrieval, extracting
more variants in the segmentation process can increase accuracy. An alternative lies in the
possibilities of sequence models. Bag‐of‐Features HMMs have successfully been applied in fully
segmentation‐free word spotting and can be extended to decoding exact word positions. This is
different to the method presented in Frinken et al. (2012) because no perfect line segmentation is
required.
References
Jon Almazán, Albert Gordo, Alicia Fornés, and Ernest Valveny. Segmentation‐free word spotting with exemplar SVMs. Pattern Recognition 47(12): 3967–3978 (2014).
Volkmar Frinken, Andreas Fischer, R. Manmatha, and Horst Bunke. A novel word spotting method based on recurrent neural networks. IEEE Trans. Pattern Anal. Mach. Intell. 34(2): 211–224 (2012).
Yann Leydier, Frank Lebourgeois, and Hubert Emptoz. Text search for medieval manuscript images. Pattern Recognition 40(12): 3552–3567 (2007).
Josep Lladós, Marçal Rusiñol, Alicia Fornés, David Fernández Mota, and Anjan Dutta. On the influence of word representations for handwritten word spotting in historical documents. IJPRAI 26(5) (2012).
Leonard Rothacker, Marçal Rusiñol, Josep Lladós, and Gernot A. Fink. A two‐stage approach to segmentation‐free query‐by‐example word spotting. manuscript cultures 1(7): 47–57 (2014).
Marçal Rusiñol, David Aldavert, Ricardo Toledo, and Josep Lladós. Efficient segmentation‐free keyword spotting in historical document collections. Pattern Recognition 48(2): 545–555 (2015).
Joan Puigcerver, Alejandro H. Toselli, and Enrique Vidal. ICDAR 2015 competition on keyword spotting for handwritten documents. In Proc. Int. Conf. on Document Analysis and Recognition, Nancy, France, 2015.
Text‐Image Alignment and Automated Letter‐form Classification: Reading vs. Looking at
Dominique Stutzmann1, Théodore Bluche2, Yann Leydier3,4, Florence Cloppet4, Véronique Eglin3,
Christopher Kermorvant5, and Nicole Vincent4
1Institut de Recherche et d’Histoire des Textes (CNRS – UPR 841), France 2A2iA, Paris, France
3LIRIS Laboratoire d’Informatique en Image et Systèmes d'information, Lyon, France 4LIPADE Laboratoire d’informatique Paris Descartes Université Paris Descartes, France
5 Teklia, Paris, France
This paper presents and compares two automated letter‐form classification methods in order to
enhance the production and analysis of text‐image alignment. The methods were applied on a large
corpus ‘GRAAL’ (130 pages, 10’700 lines, 114’268 words, and more than 400’300 characters),
available online (http://catalog.bfm‐corpus.org/qgraal_cm). The first method is based on Deep
Neural Networks and Hidden Markov Models, which achieve state‐of‐the‐art text recognition
accuracy (Bluche et al. 2014; Bluche 2015), and is primarily used to align the text of a digital scholarly edition with the image of a medieval manuscript at page, line, word, and character level. The alignment results have already been published and outmatch any other attempt by other teams so far (Stutzmann et al. 2015). In this method, diverse letter-forms and other graphical phenomena in the sequence of letters are modelled, so that the computer
may apply a letter‐form classification during the process of aligning text and image and classify the
characters or the sequences of characters according to the classification without any information in
the ground‐truth. Four phenomena were modelled: ‘allographs’ (variant letter‐forms, e.g. d/D/ꝺ,
r/ꝛ/R, and s/ſ/S), ‘ligature’ (specific forms combining two subsequent letters, e.g. ff, ss, and st),
‘conjunction’ (connected characters or overlap between two letters, e.g. de, bo…) and ‘elision’
(suppression of the initial stroke of a letter after some specific letters, Bluche et al. 2016). Results are
published (http://oriflamms.a2ialab.com/Charsegm/Graal/collage.html?chars=LIG_st). As for the
alignment, the accuracy is extremely high (e.g. 100% for 5224 occurrences of the ligature ‘st’). Lower scores are a consequence of unequal distributions (no ‘vertical d’ in the corpus, so that the ‘uncial d’ was modelled in two classes; very few ‘round s’, so that the second class of ‘s’ gathers all ‘round s’ but also occurrences of ‘ſ’). Ligatures are prominent graphical features and are very adequately identified. The modelled ligature ‘ez’ is a case in point: the computer led palaeographers to revise the notion of ligature for this sequence of letters. All in all, the results obtained while aligning, that is
by knowing which letter has to be modelled within the sliding window, are very good, and allow for
in‐depth and exhaustive palaeographic analysis. The second applied method is a learning‐free
classification of the crops of aligned characters as obtained from the first method. This method has
been developed to enhance and further analyse the letter‐forms for which we had not modelled
graphic differences, esp. in order to obtain a neat cluster of precisely aligned letters and thus foster
palaeographical analysis. This method has been developed in two steps. In the first we considered
that neatly aligned and well extracted characters would build a homogeneous group and therefore
developed a process to separate correctly aligned characters (assuming they correspond to a dense
population in the representation space) from outlying badly aligned characters (assuming they
correspond to sparse and scattered elements). Rather than just comparing colour pixels, we compare gradients (i.e. mathematical derivatives) that make use of each pixel's neighbourhood to describe the local orientation of the curves and the contrast. Moreover, we did not limit the comparison to a single point of view but used multiple views in order to highlight the differences between characters. The
results did not correspond to the expectations and proved that artefacts on the cropped image
(overlapping ascenders or descenders, ink deficiencies, and ligatures and connected characters) yield
far more consequences than expected. This first result is of high importance for palaeographers,
because it confirms that reading is not only about recognizing a known letter‐form, it is also about
discarding all the non-meaningful parts in the image of the script. Building on this result, and in order to
allow for a closer analysis, we decided to increase the number of classes. For testing this new path,
we used a cascading classification process on a ground‐truth of 627 letter samples (letter ‘r’). At each
step, the most heterogeneous class is divided into two subclasses. The resulting hierarchical
clustering brings more information than a flat collection of unrelated classes. Having up to thirty
classes for each letter allows for an analysis that goes beyond the usual allographs used in Latin
palaeography and identification of classes. Though such fine classification was not expected, the
information has to be integrated into our future conclusions. In parallel to this work, the alignment and validation software developed since 2013 (Leydier et al. 2014) has been extended to let the user exploit the automated classifications and to tag and annotate whole classes at once. It can now be used as a tool for historical research in manuscript studies.
Fig. 1: Oriflamms software, with automated classification of letter 'm'
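The cascading classification described above (at each step, the most heterogeneous class is divided into two subclasses) can be sketched as a divisive clustering loop. This NumPy-only sketch is an assumption, not the authors' implementation; the total-variance criterion and the minimal `two_means` routine are illustrative choices:

```python
import numpy as np

def two_means(X, iters=20, seed=0):
    """Minimal 2-means split of a feature matrix (rows = letter crops)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), 2, replace=False)].copy()
    assign = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assign each sample to its nearest centre, then update centres
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for k in (0, 1):
            if np.any(assign == k):
                centres[k] = X[assign == k].mean(axis=0)
    return assign

def divisive_clustering(X, n_classes=4):
    """Repeatedly split the most heterogeneous (largest total variance)
    cluster into two subclasses, yielding a shallow hierarchy."""
    X = np.asarray(X, dtype=float)
    clusters = [np.arange(len(X))]
    while len(clusters) < n_classes:
        worst = max(
            range(len(clusters)),
            key=lambda i: X[clusters[i]].var(axis=0).sum()
            if len(clusters[i]) > 1 else -1.0,
        )
        idx = clusters.pop(worst)
        assign = two_means(X[idx])
        clusters += [idx[assign == 0], idx[assign == 1]]
    return clusters
```

Recording which cluster each subclass was split from yields the hierarchical structure mentioned in the text, rather than a flat collection of unrelated classes.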
Acknowledgement
The research is funded by the ORIFLAMMS research project (Ontology Research, Image Feature, Letterform Analysis on Multilingual Medieval Scripts), ANR-12-CORP-0010 (Agence Nationale de la Recherche / Cap Digital), 2013-2016, http://www.agence‐nationale‐recherche.fr/projet‐anr/?tx_lwmsuivibilan_pi2[CODE]=ANR‐12‐CORP‐0010
References
T. Bluche, H. Ney, and C. Kermorvant, A Comparison of Sequence‐Trained Deep Neural Networks and Recurrent Neural Networks Optical Modeling for Handwriting Recognition, in: L. Besacier, A.‐H. Dediu, and C. Martín‐Vide (ed.), Statistical Language and Speech Processing, Springer International Publishing, 2014, 199‐210.
T. Bluche, Deep Neural Networks for Large Vocabulary Handwritten Text Recognition, PhD thesis, Université Paris‐Sud, 2015.
D. Stutzmann, T. Bluche, A. Lavrentiev, Y. Leydier, and C. Kermorvant, From Text and Image to Historical Resource: Text-Image Alignment for Digital Humanists, in: Digital Humanities 2015, Sydney, 2015.
T. Bluche, D. Stutzmann, and C. Kermorvant, Automatic Handwritten Character Segmentation for Paleographical Character Shape Analysis, [submitted to] DAS2016 – Document Analysis Systems.
Y. Leydier, V. Eglin, S. Bres, and D. Stutzmann, Learning‐Free Text‐Image Alignment for Medieval Manuscripts, in: 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2014, pp. 363–368.
A Segmentation‐Free Word Spotting Method
Thomas Konidaris1, Anastasios L. Kesidis2, and Basilis Gatos3
1Centre for the Study of Manuscript Cultures, Hamburg, Germany 2Department of Surveying Engineering, Technological Educational Institution of Athens, Greece
3Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, Athens, Greece
We present a two-step segmentation-free word spotting method for historical printed documents
(Konidaris et al. 2015). The first step involves a minimum distance matching between a query
keyword image and a document page image using SIFT keypoints. In the second step of the method,
the matched keypoints on the document image serve as indicators for creating candidate image
areas. The query keyword image is matched against the candidate image areas in order to properly
estimate the bounding boxes of the detected word instances. The method is evaluated using two
datasets in different languages and is compared against state-of-the-art segmentation-free methods. The experimental results show that the proposed method significantly outperforms the competing approaches.
Introduction
Image analysis for historical manuscripts can be a challenging task. Complex layouts, degradations
and unknown fonts are some of the factors that play a crucial role in the exploitation of their invaluable content. One of the active research areas is the localization of textual information, namely word spotting, directly on the manuscript images without the need for any OCR procedure. Queries
are images that can either be interactively selected by the user, or selected from a list of predefined
images.
In the literature there are two main categories of word spotting methods. The first includes
segmentation based methods (Kim et al. 2005; Rath and Manmatha 2007) while the second concerns
segmentation-free methods (Leydier et al. 2007; Gatos and Pratikakis 2009). The latter assume that the processed documents have not undergone any kind of segmentation.
The proposed method falls into the segmentation‐free category. It consists of two distinct steps that
involve SIFT keypoint matching, the creation of candidate image areas and the final bounding box
localization. The bounding boxes are constructed using the RANSAC (Fischler and Bolles 1981)
algorithm and homographies.
Proposed Method
In the proposed method, we follow a segmentation‐free word spotting approach due to the poor
results that usually characterize the segmentation process of historical documents. The method builds on SIFT features (Lowe 2004), which have proved robust to low image quality and image degradation. Furthermore, SIFT features are scale and rotation invariant.
The first step of the method involves the matching of the query keyword keypoints with the
keypoints of an entire document page. Instead of following the traditional SIFT matching, we
calculate the K most similar keypoints for each keypoint of the query keyword. The reason is that traditional SIFT matching scatters the points: when there are multiple instances of the desired word on a document page, the algorithm fails to correctly localize the keypoints. From the produced point
correspondences we create candidate image areas. That way, we narrow the search space for the
rest of our method. The candidate image areas are created based on the relative position of the
query keyword keypoints and the scale information of the matched keypoints. An example is shown
in Figure 1.
Fig. 1: Candidate image areas that were created from the corresponding keypoints of the document page
image
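The K-most-similar matching can be sketched over precomputed descriptor arrays (the descriptors would come from a SIFT implementation; the descriptor arrays and the value of K here are illustrative assumptions):

```python
import numpy as np

def k_nearest_matches(query_desc, page_desc, k=5):
    """For every query-keypoint descriptor, return the indices of the
    K most similar page descriptors (plain Euclidean distance).
    Keeping K candidates instead of the single best match preserves
    repeated instances of the query word on the same page."""
    # pairwise squared distances, shape (n_query, n_page)
    d2 = ((query_desc[:, None, :] - page_desc[None, :, :]) ** 2).sum(axis=2)
    # argpartition gives the K smallest per row without a full sort
    return np.argpartition(d2, kth=k - 1, axis=1)[:, :k]
```

The candidate image areas are then built around these K correspondences using the relative keypoint positions and scales, as described above.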
The next step of the method is to perform an additional matching between the query keyword image
and the candidate image area. Again, the SIFT keypoints are used. This process will serve as the final
bounding box localization. The bounding boxes are constructed using the RANSAC algorithm and
homographies. We use the keypoint correspondences in order for RANSAC to create a model that
will produce a 3 x 3 homography matrix H. This matrix is used in order to create the final bounding
boxes on the document image. An example is shown in Figure 2. Overlapping bounding boxes as well
as irregularly shaped ones are pruned.
Fig. 2: The application of the homography matrix H that is calculated based on the point correspondences be‐
tween the query keyword image and the candidate image area
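The RANSAC/homography estimation can be sketched as follows. In practice a library routine (e.g. OpenCV's `findHomography`) would typically be used; this NumPy-only version, with an unnormalized DLT and illustrative iteration count and inlier threshold, only sketches the idea:

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: least-squares 3x3 H with dst ~ H @ src."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)          # null-space vector of A
    return H / H[2, 2]

def project(H, pts):
    """Apply homography H to an (n, 2) array of points."""
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return ph[:, :2] / ph[:, 2:3]

def ransac_homography(src, dst, iters=200, tol=3.0, seed=0):
    """Sample 4 correspondences, fit H, keep the hypothesis with the
    most inliers, then refit H on all inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

The resulting H maps the query keyword's corners onto the page, yielding the final bounding box, which is then pruned if it overlaps another box or is irregularly shaped.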
Experiments
The method has been tested on two different datasets, one based on a German book and the other on a Greek one. The bulk of the experiments concerned the German dataset, which consists of 100 pages and 100 keywords. The method showed better performance than other competing methods. Results are presented in Figure 3 and in Table 1 and are based on two
different evaluation parameters. TREC‐Eval was also used.
Table 1: Results (bold indicates best performance)

                 Proposed   Leydier et al. [4]   Gatos et al. [5]   SIFT
MAP              0.795      0.776                0.689              0.584
Geometric MAP    0.751      0.747                0.640              0.503
R Prec           0.771      0.774                0.675              0.604
Bpref            0.927      0.921                0.871              0.625
Reciprocal Rank  1.000      1.000                0.985              1.000

                 Proposed   Leydier et al. [4]   Gatos et al. [5]   SIFT
MAP              0.836      0.808                0.730              0.615
Geometric MAP    0.809      0.787                0.697              0.545
R Prec           0.796      0.792                0.700              0.626
Bpref            0.967      0.990                0.907              0.651
Reciprocal Rank  1.000      1.000                0.995              1.000
Fig. 3: P‐R Curves for the two evaluation parameters
References
M. A. Fischler and R. C. Bolles (1981) Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Communications of the ACM, 24(6):381–395.
B. Gatos and I. Pratikakis (2009) Segmentation-free Word Spotting in Historical Printed Documents. In: 10th International Conference on Document Analysis and Recognition (ICDAR'09), Barcelona, Spain, pp 271–275.
S. Kim, S. Park, C. Jeong, J. Kim, H. Park, and G. Lee (2005) Keyword Spotting on Korean Document Images by Matching the Keyword Image. In: Digital Libraries: Implementing Strategies and Sharing Experiences, vol 3815, pp 158–166.
T. Konidaris, A. L. Kesidis, and B. Gatos (2015) A Segmentation-free Word Spotting Method for Historical Printed Documents. Pattern Analysis and Applications (PAA).
Y. Leydier, F. LeBourgeois, and H. Emptoz (2007) Text Search for Medieval Manuscript Images. Pattern Recognition, 40:3552–3567.
D. G. Lowe (2004) Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60(2):91–110.
T. M. Rath and R. Manmatha (2007) Word Spotting for Historical Documents. International Journal of Document Analysis and Recognition (IJDAR), 9(2–4):139–152.
The Quest for Lost Ancient Literature: X-ray Phase Contrast Tomography Reveals
the Secrets of Herculaneum Papyri
Vito Mocella1, Emmanuel Brun2,3, Claudio Ferrero2, Daniel Delattre4
1CNR‐IMM‐Istituto per la Microelettronica e Microsistemi‐Unita` di Napoli, Italy 2ESRF—The European Synchrotron, Grenoble, France
3 Inserm, U836, Grenoble, France 4CNRS‐IRHT‐Institut de Recherche et d’Histoire des Textes, France
We present the first experimental demonstration of a non‐destructive technique that reveals the text
of a carbonized and thus extremely fragile Herculaneum papyrus.
Buried by the famous eruption of Vesuvius in 79 AD, the Herculaneum papyri represent a unique
treasure for humanity. Overcoming the difficulties faced by other techniques, we prove that X-ray phase contrast tomography can detect the text within the scrolls, thanks to the coherence and high-energy properties of a synchrotron source.
This new imaging technique represents a turning point for the study of literature and ancient
philosophy, disclosing texts that were believed to be completely lost.
REX Project: Extraction and processing of underlying texts
Study of a Marie‐Antoinette secret correspondence
Florian Kergourlay1, Christine Andraud1, Anne Michelin1, Aymeric Histace2, Bertrand Lavédrine1,
Isabelle Aristide‐Hastir3, and Rosine Lheureux3
1MNHN-CRCC, Paris, France 2ETIS, UMR CNRS, Cergy-Pontoise, France
3Archives Nationales, Pierrefitte‐sur‐Seine, France
Marie-Antoinette and Count Axel de Fersen maintained a secret correspondence between June 1791 and August 1792, while the royal family was confined in the Tuileries in Paris. Coded and partially crossed out, this correspondence has not yet revealed all of its secrets. Indeed, some words, lines or complete paragraphs have been cleverly censored by means of very tight curls and false strokes intended to mislead the reader.
This project is part of a field explored by several previous works (Easton and Noel 2010; Bergmann
and Knox 2009; Colombo et al. 2005; Larsen 2011), the specific issue here being the deliberate overlay of two contemporaneous inks. The main goal is thus twofold: (i) to develop an experimental methodology combining non-invasive and non-destructive analytical tools with image processing, allowing two very similar inks from the 18th–19th century to be distinguished, and ultimately (ii) to reveal the underlying text.
In this context, an experimental corpus composed of ten letters has been studied by the combination
of (i) micro X‐Ray Fluorescence (µXRF) equipped with a Mo‐Kα excitation source, (ii) Hyper‐Spectral
Imaging in Visible, Near and Short Wavelength InfraRed spectral ranges (HSI‐VNIR and HSI‐SWIR)
from 400 to 2500 nm, (iii) InfraRed flash Thermography (IRT) and stereomicroscopy in transmittance
mode, micro‐topography, 3D numerical microscopy and Reflectance Transformation Imaging (RTI). In
parallel image processing was used to enhance the readability of the obtained image. The
experimental methodology is detailed in Figure 1.
Fig. 1: Experimental methodology
60
µXRF results first confirm that the elemental composition of the under- and overlying inks is typical of metal-gall inks, most specifically iron-gall inks, principally composed of Fe, K, Mn, Ni and Zn. However, a crucial dissimilarity has been highlighted on one letter through the presence of Cu in the underlying ink, allowing elemental distribution maps to be produced that partly reveal the hidden text.
HSI-VNIR/SWIR analysis combined with multivariate statistical methods has proved effective in enhancing the ink superposition, with some similarities to the µXRF results. Several image processing methods were applied in parallel on a set of pictures, improving the differentiation between the paper and the under- and overlying inks (Figure 2). Flash thermography and the other methods used to bring out the ink topography have so far failed, but have shown great potential.
By combining these results, this study offers a new reading of the crossed-out paragraphs, revealing an entire underlying text, and provides a new way to apprehend multiple data collected with complementary analytical tools and image processing in order to reveal hidden information in ancient manuscripts.
Fig. 2: Image processing, µXRF and HSI analysis
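As an illustration of the multivariate-statistics step, a standard choice is to project the hyperspectral cube onto its first principal components, in which inks with distinct spectra often separate. This NumPy sketch assumes a PCA-style analysis and is not the project's actual pipeline:

```python
import numpy as np

def pca_bands(cube, n_components=3):
    """Project a hyperspectral cube of shape (H, W, bands) onto its
    first principal components; inks with different spectral
    signatures tend to end up in different component images."""
    h, w, b = cube.shape
    X = cube.reshape(-1, b).astype(float)
    X -= X.mean(axis=0)                       # centre each band
    # SVD of the centred pixel-by-band matrix gives the loadings in Vt
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ vt[:n_components].T
    return scores.reshape(h, w, n_components)
```

Viewing each component image separately (or as a false-colour composite) is one way to enhance the contrast between the under- and overlying inks.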
References
U. Bergmann and K.T. Knox, Pseudo‐color enhanced X‐ray fluorescence imaging of the Archimedes Palimpsest, in: SPIE Proceedings 7247 (2009).
G. Colombo, F. Mercuri, F. Scudieri, U. Zammit, M. Marinelli, and R. Volterri, Restaurator 26, 92‐104 (2005).
R.L. Easton, W. Noel, Infinite possibilities: Ten years of study of the Archimedes Palimpsest, in: Proceedings of the American Philosophical Society, 50‐76 (2010).
C.A. Larsen, Document Flash Thermography. All Graduate Theses and Dissertations, Paper 1018, Utah State University (2011).
GraphManuscribble: Interact Intuitively with Digital Facsimiles
Angelika Garz, Mathias Seuret, Andreas Fischer, and Rolf Ingold
DIVA research group, Department of Informatics, University of Fribourg, Fribourg, Switzerland
This abstract presents GraphManuscribble, a new user-centred and intuitive tool developed for directly interacting with a digital facsimile of a manuscript, particularly to segment, extract, or mark
its contents. It exploits the new human‐computer interaction patterns evolving around touch‐screen
devices that are operated with a stylus, such as Microsoft Surface, and builds upon document graphs
that capture the structure of a manuscript similar to human perception. Specifically, users of the tool
scribble directly onto a digital facsimile of a manuscript in order to select or annotate manuscript
parts. They are assisted by a semi‐automatic system that facilitates imprecise interactions since
accurate operations such as the region segmentation are done automatically. A user study including
participants without any prior experience of pen-based interaction has shown the promising potential of the proposed tool.
Give a person a page of a manuscript, and they will instantly and intuitively be able to recognise its fundamental structures (such as text lines), regions, and objects (e.g. embellishments) regardless of the language or script it is written in, whether that is an ancient European script, Arabic, Cyrillic, illegible cursive handwriting, or an exotic layout such as in Babylonian Aramaic magic bowls,* complicated curving lines (Asi et al. 2014), or more elusive layouts such as writing embedded in works of art such as paintings.† To us, a manuscript is a meaningful arrangement of regions; we can
agree on the regions and their semantic meaning on a high level, establish connections and relations
between parts, and understand boundaries and belonging, without needing to read or understand
the content.
Fig. 1: St. Gallen, Stiftsbibliothek, Cod.Sang.18‡, page 95.
Left: original image, centre: document graph, right: close‐up
* http://www.metmuseum.org/collection/the‐collection‐online/search/321885 † http://www.metmuseum.org/collection/the‐collection‐online/search/454611 ‡ Composite manuscript, astronomical clock of Pacificus of Verona, DOI:10.5076/e‐codices‐csg‐0018
62
Gestalt psychology, which emerged in the early 1920s, aimed at understanding human visual grouping, i.e. the way humans recognise a pattern as an object with boundaries and extent, and differentiate it from the background. As such, perceptual grouping tries to “solve the problem of ‘what goes with what’ and the differentiation of figure from ground” (Han 1996). Our perception is remarkably powerful. For us humans, “form emerges as result of the relationship [and complex interactions] between the parts” (Breidbach et al. 2006). To fix ideas, let us consider the example of a zebra herd: we are able to identify each zebra in a herd of similar-looking zebras as an individual and agree on its boundaries, despite its complex texture and the similar texture of its herd members.
This means, we are able to identify what defines a collection of patterns, and what distinguishes it
from others, and hence, group those with respect to some similarity criterion (Breidbach et al. 2006).
A computational method exposed to the same task, on the other hand, works on the array of pixels in which the document image is digitally represented, without any inherent understanding of the structure.
Methods have been developed to classify, group, and align pixels to model our intuitive
understanding of a document and to provide a means for automatic processing (Nagy 2000). Graphs
in the context of computer science (Conte et al. 2004) are a powerful means to model structural
relationships. They are capable of representing data in a fashion similar to our perception given an
appropriate definition of their topology and criteria for defining relations: graphs are a set of
points connected by edges (see Figure 1 where the graph is a minimum spanning tree based on
triangulated contour points). Consider the analogy of road networks: graphs capture entities and their relationships much as a road network connects key points in a city. In the network, some roads are more important than others; they are wider, have more lanes or a higher speed limit, and thus a higher traffic throughput. A graph can similarly encode relationships of different quality in its edges.
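The document graph described above (a minimum spanning tree based on triangulated contour points) can be sketched with SciPy; the input points and the Euclidean edge weights are illustrative assumptions:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial import Delaunay

def contour_mst(points):
    """Build a minimum spanning tree over 2-D points, restricted to
    the edges of their Delaunay triangulation."""
    tri = Delaunay(points)
    # collect every edge of every triangle, undirected and deduplicated
    edges = set()
    for a, b, c in tri.simplices:
        edges |= {tuple(sorted(e)) for e in ((a, b), (b, c), (a, c))}
    i, j = zip(*edges)
    w = np.linalg.norm(points[list(i)] - points[list(j)], axis=1)
    n = len(points)
    graph = csr_matrix((w, (i, j)), shape=(n, n))
    return minimum_spanning_tree(graph)  # sparse matrix with n-1 edges
```

In the real system the points would be contour points of the ink strokes, and edge attributes could encode the different relationship qualities discussed above.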
We propose a user‐centred system that is inspired by the humans’ perceptual capabilities and aims
at representing the document image in a way that resembles our understanding using graphs (Garz
et al. 2015). The goal of our first prototype tool is to assist a user in selecting and segmenting entities
in a manuscript in an intuitive manner guided by the humans’ connotation of structure and belonging
in a document. The user edits this representation with a stylus on a touch‐sensitive screen. The range
of application of such a tool extends from extracting parts of a manuscript, e.g. for selecting samples for word spotting (Riba et al. 2015) or illustration retrieval in order to find other appearances of the same word, or of illuminations, decorations, or drop caps (Nguyen et al. 2011), to a tool that empowers scholars to directly mark and annotate the digital facsimile of a manuscript in an intuitive and natural fashion.
We conducted a user study with the aim of assessing the efficiency of the tool for annotating
complex historical manuscript pages, the user fatigue, and the quality of the proposed scribbling
interaction pattern. It demonstrated that our tool and the interaction with a pen on a touch screen
that displays the facsimile require neither prior training, algorithmic knowledge, nor extensive computer skills.
Fig. 2: User interface of the GraphManuscribble tool with the selected areas (polygons colour-coded according to their class), and the user scribbles (interactions) on a second view with a binary version of the image, where the foreground is white and the graph superimposed as green lines.
References
A. Asi, R. Cohen, K. Kedem, J. El‐Sana, and I. Dinstein (2014). A Coarse‐to‐Fine Approach for Layout Analysis of Ancient Manuscripts. In Proc. Int. Conf. on Frontiers in Handwriting Recognition (pp. 140‐145). IEEE.
O. Breidbach, and J. Jost (2006). On the Gestalt Concept. In Theory in Biosciences (Vol. 125, No. 1, pp. 19‐36). Springer International Publishing.
D. Conte, P. Foggia, C. Sansone, and M. Vento (2004). Thirty Years of Graph Matching in Pattern Recognition. In Int. Journal of Pattern Recognition and Artificial Intelligence (Vol. 18, No. 3, pp. 265-298). World Scientific.
A. Garz, M. Seuret, F. Simistira, A. Fischer, and R. Ingold (2015). Creating Ground Truth for Historical Manuscripts with Document Graphs and Scribbling Interaction. Submitted for review to Int. Workshop on Document Analysis Systems.
S. Han, F. Xiao, and L. Chen (1996). Uniform connectedness and the classical gestalt principles of grouping. In Investigative Ophthalmology & Visual Science (Vol. 37, No. 3, pp. 1350-1350).
G. Nagy (2000). Twenty Years of Document Image Analysis in PAMI. IEEE Trans. on Pattern Analysis and Machine Intelligence (Vol. 22, No. 1, pp. 38–62). IEEE.
T. T. H. Nguyen, M. Coustaty, and J. Ogier (2011). Bags of strokes based approach for classification and indexing of drop caps. In International Conference on Document Analysis and Recognition (ICDAR), (pp. 349-353). IEEE.
P. Riba, J. Lladós, A. Fornés, and A. Dutta (2015). Large-Scale Graph Indexing Using Binary Embeddings of Node Contexts. In Graph-Based Representations in Pattern Recognition (pp. 208-217). Springer International Publishing.
Visual Literary Topology
Rachid Hedjam1, Margaret Kalacska1, Sumaya S. Ali Al‐ma’adeed2, and Mohamed Cheriet3
1Department of Geography, McGill University, Montreal, Qc H3A 2K6; Canada 2Department of Computer Science and Engineering, Qatar University, Doha, Qatar
3Department of Automated Manufacturing Engineering, ETS, University of Quebec Montreal, Canada
Introduction
This paper highlights some research directions of an ongoing project (DiDC: Digging into Data Challenge) that we are conducting in collaboration with the literary department of McGill University (Canada). The aim of this project is to combine visual image processing (VIP), pattern recognition (PR),
machine learning, network science and text analysis to study cultures of literary communication
across a broad spectrum of space and time: post‐classical Islamic philosophy, Chinese Women’s
Writing from the Ming‐Qing Dynasties, the Anglo‐Saxon Middle‐Ages, and the European
Enlightenment. How are these different periods and places characterized by networks of shared
ideas? How did such literary networks contribute to the distinct intellectual contributions of each
epoch?
Fig. 1: Footnote mark and footnoted word
More recently, modelling by networks has become of interest to literary scholars, who have explored
the nature of literary networks in a variety of contexts, including visualizations of eighteenth‐century
epistolary networks; literary geographies; and quantitative studies of social networks within different
genres. Traditionally, these networks are generated based on the similarities between available
textual information of their printed or handwritten manuscripts. Natural language processing (NLP) is
the basic tool used to find out the frequent items shared between the different manuscripts based
on which the similarity is computed (Piper and Algee-Hewitt 2014). Then, various network analysis approaches can be used to analyze these networks in order to answer existing questions in the human sciences. Unfortunately, millions of historical manuscripts are still in the form of document
images and the text they contain is not yet in a machine-readable form. Our new vision for meeting this challenge addresses manuscripts in their image form using VIP and PR. These provide the digital humanities expert with the possibility of generating network representations of the text corpora based on the extracted visual data, in the same way as one would using the semantic information of word frequencies. The nodes of the network will represent individual works or pages and the edges will
represent the similarity or dissimilarity between them based on their visual features.
Methodology
The core of VIP is based on extracting and transforming graphical marks on page images with a large set of transforms (known as “patches”). A patch can be seen as a sample taken from a character, for
example, and contains a rich amount of information that can be used to build the graphical relations
between pages. Although the approach is universal to all collections, separate descriptors and data
handling will be developed for each collection based on the nature of script and style of its
documents. In the context of the DiDC German dataset, the footnote is considered the most important visual feature that an image patch can cover. Each footnote is indicated by a footnote mark (FNM): a numeric, alphabetic or special character. Each FNM (see Figure 1) appears twice on a manuscript page: one at the top (noted TFNM) and one at the bottom (noted BFNM). The TFNM is placed just after the footnoted word, and the BFNM is placed just before the footnote itself. Therefore, a manuscript page virtually always contains an even number of footnote marks (i.e., TFNMs and BFNMs). The footnoted word is used for content analysis, looking for instance for the most frequently footnoted words, which will serve to build a common lexicon. From the literary point of view, two manuscripts can be connected or similar if they share one or more similar footnoted words.
Fig. 2: Similarity network between manuscripts
Footnoted-word detection depends on the accuracy of the FNM detection. What makes the automatic processing difficult, if not impossible in many cases, is dealing with degraded manuscript images scanned at very low resolution and in pure black and white, in addition to all the damage caused by geometrical deformation due to poor handling during scanning. Consequently, footnote detection by itself is far from accurate, and some pre-processing, such as noise and shadow removal, text stroke smoothing and skew correction, must be applied first. We have proposed two methods to detect the FNM. The first method is based on the MACH filter (Kumar et al. 2005), and the second method is based on an AdaBoost classification rule (Freund and Schapire 1997).
The MACH filter is based on the notion of correlation patterns. Given a class of FNM training patches, a MACH filter combines them into a single composite template by optimizing four performance metrics: the average correlation height, the average correlation energy, the average similarity measure, and the output noise variance. The template is then correlated with the test image in the frequency domain via a fast Fourier transform, yielding a correlation surface whose highest peak corresponds to the most likely location of the target FNM in the manuscript image. As for the Adaboost classifier, we used HAAR features together with the Adaboost algorithm. HAAR features are very simple and can be computed efficiently using the integral image; thanks to this simplicity and efficiency, they help reduce the computational time when processing such a huge collection of document images. Template matching is usually used for object detection in binary images, but it is computationally expensive. Here, we instead used a boosting-based algorithm, rarely applied to binary images, to detect the FNMs. The detector runs in real time and provides high performance. Once the FNMs are detected, the footnoted words are located and matched to each other across the manuscript collection in order to build the similarity graph shown in Figure 2. Finally, various network analysis approaches (Fortunato 2010) can be used to analyze these networks.
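The frequency-domain correlation step used for FNM localization can be illustrated with a plain matched filter; the actual MACH template is the result of the four-metric optimization described above, so the all-ones template below is only a stand-in:

```python
import numpy as np

def correlate_fft(page, template):
    """Correlate a template with a page image in the frequency domain:
    the highest peak of the correlation surface marks the most likely
    location of the target FNM."""
    h, w = page.shape
    T = np.fft.fft2(template, s=(h, w))              # zero-pad template to page size
    P = np.fft.fft2(page)
    surface = np.real(np.fft.ifft2(P * np.conj(T)))  # circular cross-correlation
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    return tuple(int(i) for i in peak)

page = np.zeros((64, 64))
page[20:24, 30:34] = 1.0                             # synthetic 4x4 "marker"
print(correlate_fft(page, np.ones((4, 4))))          # → (20, 30)
```

The FFT route makes the cost of sliding the template over the page independent of the template's position count, which is what makes exhaustive correlation practical on large page images.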
References
A. Piper and M. Algee‐Hewitt. The Werther Effect I: Goethe Topologically. In: Distant Readings: Topologies of German Culture in the Long Nineteenth Century. Ed. Matt Erlin and Lynn Tatlock (Rochester: Camden House, 2014), 155‐184.
S. Fortunato. Community detection in graphs. Physics Reports 486.3 (2010): 75‐174.
B. V. K. Kumar, A. M. Vijaya, and R. D. Juday. Correlation Pattern Recognition. 1st ed. Cambridge: Cambridge University Press, 2005. Cambridge Books Online. Web. 18 November 2015. http://dx.doi.org/10.1017/CBO9780511541087
Y. Freund and R. E. Schapire (1997). A decision‐theoretic generalization of on‐line learning and an application to boosting. Journal of computer and system sciences, 55(1), 119‐139
An Interactive System to help Transcription of Historical Handwritten Documents
Adolfo Santoro, Angelo Marcelli, and Francesco Carillo
Department of Electrical Engineering and Applied Mathematics, University of Salerno, Fisciano, Italy
The ongoing process of knowledge digitization identifies digital libraries as one of the most important
channels for creating and sharing knowledge all over the world: the concept of digital library evolved
from a simple digital collection to a space where services and people support the whole life‐cycle of
the creation, use and preservation of data, information and knowledge. The advancements in
imaging and storing technologies have favored the digitization of historical books and manuscripts
for better preservation and easy access but, in case of cursive handwritten documents, making the
text available for searching and browsing is a very challenging task. Currently, a huge amount of historical handwritten document images has been scanned and stored in image formats, but their content is not yet easily and quickly accessible, as shown by the Bentham project, a crowdsourced handwritten document transcription effort, or the tranScriptorium project, whose aim is to develop innovative and efficient solutions for indexing and transcribing the content of handwritten documents belonging to a huge digital archive. On the other hand, easy access is ensured when the content of the document can be handled with information retrieval technology, so that the system can
process user’s queries asking for documents with specific content. For that to happen, document
images must be processed and their content extracted in a way that can be easily manipulated; in
case of cursive handwritten historical documents, Optical Character Recognition is not a viable
solution, and therefore different approaches, such as word spotting or holistic handwritten text recognition, have been proposed in the literature, but the problem is still far from solved. The main reason is the large variability exhibited by those documents, usually produced by different writers in different ages and thus involving very different writing styles. This requires both very powerful recognition engines and the effort of handwriting experts, also known as “scriptores”, to interpret the content and eventually correct errors at the recognition level.
In this short abstract we propose a novel approach for helping “scriptores” to transcribe the content
of handwritten documents in huge digital historical archives, based on a keyword retrieval method following the query‐by‐string paradigm. Our approach aims to address two important issues: the small size of the training set, meaning that little effort is required from the “scriptores” to get the system ready for use, and the possibility of improving the performance of the system over time by taking advantage of the interaction with the scriptor. The proposed method is composed of four steps, described below:
1. Build a Reference Set: the input of the first step is a set of handwritten digital documents for which the transcription of the words is available. Those documents are automatically processed in order to create a Reference Set, composed of word images each associated with a transcript.
2. Build a Knowledge Base: the input of this step is the whole collection of scanned handwritten documents on which the retrieval step will be performed. All documents are processed by the ink matching algorithm, which associates with each unknown ink a possible interpretation obtained from the word images belonging to the Reference Set, thereby creating a Knowledge Base composed of the set of all possible interpretations for each ink in the data set.
3. Use and interact with the system: the retrieval step is performed by searching the keywords typed in by the human user in the Knowledge Base. The output is a set of candidate images for the keyword, which the human user (i.e., the scriptor) can confirm or reject as instances of that keyword.
4. Add knowledge: the last step takes advantage of the interaction with the user by adding evidence and knowledge to the system: the confirmed candidates are moved from the untranscribed collection to the Reference Set, which is thus updated and used in further searches.
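The four steps above form a retrieve-confirm-learn loop that can be sketched as follows; the `match` and `confirm` callables stand in for the authors' ink-matching algorithm and the scriptor's judgment, and all names are illustrative:

```python
def transcription_session(reference, unknown, query, match, confirm):
    """One retrieve-confirm-learn iteration over the four steps:
    reference : dict word_image -> transcript (the Reference Set)
    unknown   : set of untranscribed word images
    query     : keyword typed in by the scriptor
    match     : match(word_image, reference) -> interpretation
                (stand-in for the ink-matching step)
    confirm   : confirm(word_image) -> bool (the scriptor's decision)"""
    # steps 2-3: interpret each unknown ink and retrieve the keyword
    candidates = [img for img in sorted(unknown)
                  if match(img, reference) == query]
    # step 4: confirmed candidates move into the Reference Set
    for img in candidates:
        if confirm(img):
            unknown.discard(img)
            reference[img] = query
    return candidates

# toy run with a trivial stand-in matcher
ref, unk = {}, {"cat:1", "cat:2", "dog:1"}
label = lambda img, r: img.split(":")[0]
found = transcription_session(ref, unk, "cat", label, lambda img: True)
print(found, sorted(unk))  # → ['cat:1', 'cat:2'] ['dog:1']
```

Each confirmed keyword instance enlarges the Reference Set, which is how the system's precision can grow across iterations.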
We will also report the results of experiments on a historical handwritten dataset, which show that with a Reference Set composed of only 5 pages the system is able to respond to 50 keywords with a Recall of up to 80% and a Precision of 15%. The low Precision value arises from the choice of engaging the human user more at the beginning, enabling the system to learn from the interaction and to use that evidence in further searches: in fact, over the last iterations the Recall value remains stable while the Precision value doubles at each iteration. Moreover, in order to consider a realistic usage scenario, we evaluated the time gained by using our system for the aided transcription of 1000 words extracted from a digital archive of handwritten documents. This evaluation was performed by choosing a "dictionary" of the words most frequent in the manually transcribed pages, excluding stop words, which contribute little to understanding the content of a page. When the Reference Set covers 80% of all the characters and 45% of all the bigrams present in the collection, the average gain in the time needed to transcribe a collection is about 54%.
References
M. T. Rath and R. Manmatha, "Word Spotting for historical documents", International Journal of Document Analysis and Recognition, pp. 139‐152 (2007)
C. De Stefano, G. Guadagno, and A. Marcelli, "A saliency based segmentation method for on‐line cursive handwriting", IJPRAI, 18(6), 1139‐1156 (2004)
Y. Liang, R.M. Guest, and M.C. Fairhurst, "Implementing Word Retrieval in Handwritten Documents using a Small Dataset", in: Proc. 3rd International Conference on Frontiers in Handwriting Recognition (2012)
D. Aldavert, M. Rusinol, R. Toledo, and J. Llados, "Integrating Visual and Textual Cues for Query‐by‐String Word Spotting", in: Proc. 12th International Conference on Document Analysis and Recognition, ICDAR (2013)
A. Fischer, A. Keller, V. Frinken, and H. Bunke, "Lexicon‐free handwritten word spotting using character HMMs", Pattern Recognition Letters, pp. 934‐942 (2012)
L. Rothacker, M. Rusinol, and G.A. Fink, "Bag‐of‐Features HMMs for Segmentation‐Free Word Spotting in Handwritten Documents", in: Proc. 12th International Conference on Document Analysis and Recognition (2013)
A. Fischer, V. Frinken, H. Bunke, and Y.C. Suen, "Improving HMM‐based Keyword Spotting with Character Language Models", in: Proc. 12th International Conference on Document Analysis and Recognition (2013)
V. Papavassiliou, T. Stafylakis, V. Katsouros, and G. Carayannis, "Handwritten Document Image Segmentation into text lines and words", Pattern Recognition 43, pp. 369‐377 (2010)
L.P. Cordella, C. De Stefano, A. Marcelli, and A. Santoro, "Writing order recovery from off‐line Handwriting by Graph Traversal", in: Proc. International Conference on Pattern Recognition, ICPR, pp. 1896‐1899 (2010)
C. De Stefano, A. Marcelli, and A. Santoro, "On‐line cursive recognition by ink matching", in: Proc. International Graphonomics Society, IGS, pp. 23‐37 (2007)
V. Frinken, A. Fischer, R. Manmatha, and H. Bunke, "A Novel Word Spotting Method Based on Recurrent Neural Networks", IEEE Transactions on PAMI, Vol. 34, No. 2, February (2012)
Bentham project: http://blogs.ucl.ac.uk/transcribe‐bentham
tranScriptorium project: http://transcriptorium.eu
Poster
Imaging Watermarks of 15th Century Islamic Manuscript Kashf Al‐Bayan’an Sifat Al‐Hayawan
Nurgül Akcebe
Department of Manuscript Conservation and Archive (Kitap Şifahanesi) /
Manuscripts Institution of Turkey, Istanbul, Turkey
Kashf al‐Bayan 'an Sifat al‐Hayawan, an encyclopedia of 62 volumes comprising nearly 16 thousand leaves, was written by Abu al‐Fath Muhammad b. Shaykh Bedreddin, also known as Ibn Atiyah. By his own account, the author began writing the manuscript in 1487. It covers many topics, not only zoology but also botany, medicine, literature and philosophy. He also indicated that he had drawn on about 3000 references to produce this valuable work (Aslan 2015). The manuscript is part of a collection held in the Millet Manuscript Library in Istanbul, and no other copy of it exists.
The volumes of the manuscript have been documented and restored by the Department of Manuscript Conservation and Archive (Kitap Şifahanesi), one of the departments of the Manuscripts Institution of Turkey, in an ongoing restoration and preservation project. The watermarks in these volumes were investigated in the course of their documentation and restoration. Up to now, three different types of watermarks have been detected on the inner pages of the covers. Two of them are similar to watermarks seen in 15th‐century European papers*†. It can be presumed that these papers were pasted to the inner covers when the manuscript was written, in the late 1400s. The third watermark identified is of an Eastern type and can be dated between the 17th and 19th centuries (Ünver 1962; Ersoy 1963). It can be concluded that the volumes containing the Eastern‐style watermark were restored, perhaps hundreds of years after the manuscript was written: the text block of these volumes is hand‐made Eastern paper, although the inner cover pages are Western papers.
Fig. 1: Watermark examples of encyclopedia and their matches with European watermark archives (a),(b).
Eastern style watermark from the same encyclopedia (c)
* Watermarks have been retrieved from: http://www.wasserzeichen‐
online.de/wzis/struktur.php?klassi=001002001001002001001&anzeigeIDMotif=2968
† Watermarks have been retrieved from: http://www.ksbm.oeaw.ac.at/_scripts/php/BR.php
To image the inner cover pages, a Nikon D600 camera was used at different angles, and the images were then enhanced with Photoshop CS5. A thin light sheet could not be used because the pages are stuck to the covers and cannot be separated. These watermarks were then matched with similar watermarks detected in 15th‐century European manuscripts and letters (Figure 1).
References
A. Aslan, Manuscripts Institution of Turkey (2015).
O. Ersoy, XVIII ve XIX. Yüzyıllarda Türkiye’de Kağıt, Ankara Üniversitesi Dil ve Tarih‐ Coğrafya Fakültesi Yayınları, Ankara, 1963, pp 85‐87
S. Ünver, Belleten, 26,739. (1962).
The Ignatius of Loyola’s Exercitia Spiritualia autograph:
analyses before and during conservation treatments
Maddalena Bronzato1, Alfonso Zoleo2, Luca Nodari3, Carlo Federici4, and Melania Zanetti4
1Federchimica, Milano, Italy 2Department of Chemical Sciences, University of Padova, Padova, Italy
3 IENI‐CNR and INSTM, UdR of Padova, Padova, Italy 4 Department of Humanistic Studies, University of Venice Ca’ Foscari, Venezia, Italy
A large number of ancient paper manuscripts are endangered by the corrosive effect of iron gall inks (Hey 1979). It is well known that the Fe(III) and Fe(II) species occurring in these inks are powerful catalysts of paper degradation reactions (Kolar and Strlic 2006). As a consequence, iron gall inks are a main concern for paper conservators; iron mobility and migration from the written text to its surroundings, or iron penetration into the leaves, is an unwanted but frequent occurrence, because iron migration from the inked areas has been related to degradation of the paper leaves (Rouchon et al. 2009). The water solubility of Fe(II) ions calls for a cautious approach to aqueous treatments, or even humidification, which can induce halo formation. Mixtures of water and alcohol are often suggested to limit the risk of ion migration; however, both aqueous and hydroalcoholic treatments have recently been questioned, proving unreliable at limiting iron migration (Rouchon et al. 2009). At present, there is no consensus on the treatments to apply, and more data on iron migration in iron gall ink on paper under different conservation treatments are required, particularly with respect to discoloration and degradation.
This work concerns examination and conservation treatments of the oldest evidence of Ignatius of
Loyola’s Exercitia Spiritualia. The paper manuscript includes many autograph annotations by Ignatius
de Loyola, the founder of the Catholic Society of Jesus. In it, the severe degradation induced by iron
gall inks had resulted in discoloration and burn‐through. In the first half of the 20th century, each leaf was lined with silk on both recto and verso in order to prevent fragmentation of the inked areas of the paper, a treatment that also induced paper yellowing, adhesive stains and other undesirable effects.
A new intervention was therefore required to reduce the risks related to the previous intervention
and the impact due to the degradation processes.
The manuscript was investigated by means of non‐destructive and non‐invasive spectroscopic techniques, in order to gather information to guide the choice of a precise and suitable intervention procedure.
Infrared spectroscopies ATR‐IR and DRIFT (Derrick et al. 1999), both completely non‐invasive, were applied in order to obtain indications about the current state of each leaf, about the sizing materials commonly used to improve the resistance of a sheet of paper to water sorption, and about paper fillers, substances added to the cellulose matrix in order to improve the optical and surface features of paper. FORS (Fiber Optics Reflectance Spectroscopy) was used to investigate discolorations on the leaves in order to understand their chemical nature, to determine the colorimetric coordinates of the irradiated spot, and to detect the extent of Fe(II) migration, which the literature shows to be normally associated with brown halos (Picollo et al. 2002).
MicroRaman spectroscopy (Bicchieri et al. 2006) made it possible to collect useful information about molecular structure, in particular that of the inks used in the written areas. XRF (X‐Ray Fluorescence) spectroscopy (Hahn et al. 2005) was used extensively to detect chemical elements such as Ca and K, and in particular to investigate the presence of iron and copper species, known to be efficient catalysts of paper degradation reactions.
The analysis revealed the use of different iron gall inks in the various manuscript leaves and the use of gelatin as sizing material. The leaves in the worst state of conservation were brittle and presented brown halos around the written areas; the XRF analysis confirmed the migration of iron ions from the text to the surrounding areas. The most degraded leaves of the book were characterized by a generally higher amount of iron than the leaves in a good state of conservation.
References
M. Bicchieri, A. Sodo, G. Piantanida, C. Coluzza, J. Raman Spectrosc., 37, 1186 (2006).
M. Derrick, D. Stulik, J. M. Landry, Scientific Tools for Conservation: Infrared Spectroscopy in Conservation Science (The Getty Conservation Institute, 1999).
O. Hahn, B. Kanngießer, W. Malzer, Studies in Conservation, 50, 23 (2005)
M. Hey, The Paper Conservator 4, 66 (1979).
J. Kolar, M. Strlic, Iron Gall Inks: On Manufacture, Characterisation, Degradation and Stabilization (National University Library, Ljublana, Slovenia, 2006).
M. Picollo, M. Bacci, A. Casini, F. Lotti, S. Porcinai, B. Radicati, and L. Stefani, Fiber Optics Reflectance Spectroscopy: A Non‐destructive Technique for the Analysis of Works of Art, in: S. Martellucci, A.N. Chester, A.G. Mignani (Eds.), Optical Sensors and Microsystems: New Concepts, Materials, Technologies, Kluwer Academic Publishers, NY, 2002, pp. 259‐265.
V. Rouchon, B. Durocher, E. Pellizzi, J. Stordiau‐Pallot, Studies in conservation 54, 236 (2009).
Can non‐destructive techniques and portable instruments be used
to analyse ink and paper degradation?
Claudia Colini1, Ira Rabin1,2, and Oliver Hahn1,2
1Centre for the Study of Manuscript Cultures, Hamburg, Germany 2BAM Federal Institute for Materials Analysis and Testing, Berlin, Germany
The object of this poster is part of a larger project aimed at creating a database of known raw materials in the black inks and paper coatings of Arabic manuscripts, all identified by means of non‐destructive techniques and portable instruments.
In order to verify the validity of our database when confronted with ancient manuscripts, we decided
to artificially age our samples and observe the effects of their degradation.
The specific experiment shown in this panel is the pre‐test we decided to run before the start of the actual ageing process, with a smaller selection of samples (a single iron gall ink recipe and one coating material), which will allow us to observe several relevant details.
First of all, we will verify whether, for these recipes, a difference in the spectra is observable and, if so, whether it is possible to deduce the degradation mechanism using non‐destructive and portable technologies alone.
In particular, the following techniques will be applied:
● FTIR‐ATR
● Colorimetry
● XRF
● pH measurement
Moreover we will evaluate the impact of ink on the degradation of cellulose, comparing samples with
and without writing.
With FTIR‐ATR we can identify several organic groups present in papers and inks, characteristic of cellulose, gums, proteins, alcohols and perfumes, but also the SO₄²⁻ ions peculiar to vitriol (Senvaitiene et al. 2005). Moreover, it is possible to observe the behaviour of various carbonyl groups, such as carboxylic, ketonic and aldehydic groups, which are products of the partial oxidation and hydrolysis of cellulose (Calvini and Gorassini 2002; Margutti et al. 2001; Lojewska et al. 2006; Lojewska et al. 2007; Ali 2001; Urescu et al. 2009). The picture is, however, complicated by the presence of the hydrogen‐bond network, which shifts the peaks to higher frequencies, and by free water, which can cover some of our areas of interest.
Concerning colorimetry, we are particularly interested in the L* and b* values for the paper: lightness can be correlated with the degree of polymerisation (Vives et al. 2001), and increasing yellowness can be correlated with oxidative degradation forming conjugated ketonic groups at positions 2 and 3 of the glucopyranose ring in cellulose (Lojewska et al. 2007). Colorimetry will be applied to the inks as well, in order to evaluate the discolouration of the media (Csefalvayová et al. 2007).
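A minimal sketch of how the two colorimetric indicators could be tracked over the ageing campaign; the readings below are hypothetical, not measured values:

```python
def ageing_trend(series):
    """Summarize the two indicators discussed above over an ageing run:
    a drop in L* (lightness, linked to depolymerisation) and a rise in
    b* (yellowness, linked to oxidative formation of ketonic groups).
    series: (L*, a*, b*) tuples in sampling order."""
    L0, _, b0 = series[0]
    Ln, _, bn = series[-1]
    return {"delta_L": round(Ln - L0, 2), "delta_b": round(bn - b0, 2)}

# hypothetical readings at days 0, 14 and 49
print(ageing_trend([(92.0, 0.1, 4.0), (90.5, 0.2, 6.5), (88.0, 0.3, 9.2)]))
# → {'delta_L': -4.0, 'delta_b': 5.2}
```

A negative delta_L together with a positive delta_b would be consistent with the depolymerisation and yellowing pathways cited above.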
XRF will be used to follow the migration of iron ions from the ink to the paper as well as to observe
the formation of crystals of calcium sulphate in the halo surrounding the writing. pH measurements
will be taken to record the acidity of both paper and ink. Concerning non‐inked paper, pH can be related to the formation of carboxylic groups, already observed with FTIR‐ATR; however, it seems that some other group is involved in the increase of acidity during ageing (Lojewska et al. 2007). Regarding inks, the increase in acidity can be linked to iron oxidation (Rouchon et al. 2011).
Given the extreme variety of ageing protocols found in the literature, we decided to age our samples in a humid chamber at 80 °C and 65% RH for 49 days, using two different configurations: one set of samples was hung, while the other was placed inside stacks of paper sheets, covered top and bottom with polyacrylate boards in order to simulate the structure of a book.
Samples will be collected after 0 (the reference), 7, 14, 28 and finally 49 days, giving 5 samples per configuration.
This design will give us the opportunity to verify whether different experimental conditions can change the course of degradation.
References
M. Ali, “Spectroscopic studies of the ageing of cellulosic paper”, Polymer 42 (2001), pp. 2893‐2900;
P. Calvini, A. Gorassini, “FTIR – deconvolution Spectra of Paper Documents”, Restaurator 23 (2002), pp. 48‐66;
L. Csefalvayová et al., “The influence of Iron gall ink in Paper Ageing”, Restaurator 28 (2007), pp. 129‐139.
J.M. Gibert Vives et al., “A Method for the non‐destructive analysis of Paper based on Reflectance and Viscosi‐ty”, Restaurator 22 (2001), pp. 187‐207.
J. Lojewska et al., “FTIR in situ transmission studies on the kinetics of paper degradation via hydrolytic and oxidative reaction paths”, Applied Physics A 83 (2006), pp. 597‐603
J. Lojewska et al., “Carbonyl groups development on degraded cellulose. Correlation between spectroscopic and chemical results” Applied Physics A 89 (2007), pp. 883‐887
S. Margutti et al., "Hydrolytic and Oxidative Degradation of Paper", Restaurator 22 (2001), pp. 67‐83
V. Rouchon et al., “Room‐Temperature Study of Iron Gall Ink Impregnated Paper Degradation under Various Oxygen and Humidity Conditions: Time‐Dependent Monitoring by Viscosity and X‐ray Absorption Near‐Edge Spectrometry Measurements”, Analytical Chemistry 83, 7 (2011), pp 2589–2597.
J. Senvaitiene et al., "Spectroscopic evaluation and characterisation of different historical writing inks", Vibrational Spectroscopy 37 (2005), pp. 61‐67.
M. Urescu et al., “Iron gall inks influence on papers' thermal degradation; FTIR spectroscopy applications”, European Journal of Science and Theology vol.5, n.3 (2009), pp. 71‐84.
Age and Fiber Structure Study Using 3D, Mesoscale Modeling and Simulation of
Ink Seepage in Paper Porous Media
Reza Farrahi Moghaddam1, Mohamed Cheriet1, and Sumaya Ali Al‐Ma’adeed2
1Synchromedia Lab, ETS, UduQ, Montreal, QC, Canada H3C 1K3 2Department of Computer Science and Engineering, Qatar University, Doha, Qatar
Background
There is a high level of interaction between the ink molecules, their carriers, and the paper medium within the body of the paper. These interactions largely determine the behavior of the final ink‐paper product over its life cycle, including its short‐ and long‐term reactions to physical stimuli such as heat, humidity and light exposure. In particular, if the paper is thin, which will increasingly be the case given the global move toward sustainability and lower resource consumption, the ink can propagate and reach the other (verso) side of the paper. This phenomenon, usually referred to as the bleed‐through effect, is very common in ancient manuscripts, which suffer from long‐term exposure to heat and humidity. The simplest models used for studying ink seepage are the 'diffusion' models (Farrahi 2009). These models work at a macro scale but try to produce an output that approximates the associated two‐phase fluid‐dynamics problem. They usually operate on a discretized spatial domain, where the evolution of the state of each point is determined by 'governing' equations acting within a highly localized region around the point of interest (PoI), called its neighborhood. We introduced a generalized form of the diffusion models in which the diffusion terms are not limited to the neighboring points in the 'same' space (Farrahi 2009). In other words, various 'source' spaces can contribute to the same target space: the states of their points that are 'neighbors' of a PoI can be linked to the future state of that PoI in the target space. By properly modifying the diffusion coefficient, we were able to introduce the first nonlinear, patch‐based restoration method, based on what we called 'reverse' diffusion (Farrahi 2009).
Fig 1: An X‐ray image of paper’s fiber structure (left), a digital fiber structure, which resembles a real structure
in simulations (right)
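A minimal sketch of the plain macro-scale diffusion model mentioned above, using an explicit finite-difference update over each point's 4-neighborhood; coefficients and grid size are illustrative:

```python
import numpy as np

def diffuse(ink, d=0.2, steps=10):
    """One run of the plain 'diffusion' model: each grid point's state
    evolves from its 4-neighborhood (explicit finite differences with
    periodic boundaries). The generalized model in (Farrahi 2009) adds
    further 'source' spaces to this neighborhood."""
    ink = ink.astype(float).copy()
    for _ in range(steps):
        lap = (np.roll(ink, 1, 0) + np.roll(ink, -1, 0) +
               np.roll(ink, 1, 1) + np.roll(ink, -1, 1) - 4.0 * ink)
        ink += d * lap                 # d <= 0.25 keeps the update stable
    return ink

spot = np.zeros((32, 32))
spot[16, 16] = 1.0                     # a single drop of ink
out = diffuse(spot)
print(round(float(out.sum()), 6))      # → 1.0 (total ink is conserved)
```

Note that this simple scheme conserves the total amount of ink, which is precisely the 'finite volume' property that the mesoscale model discussed below enforces as an explicit constraint.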
Previous Work on 3D Modeling and Simulation of Ink Seepage
To have a low‐level simulation of the ink seepage in paper at the mesoscales, actual displacement of
the ink material within the body of paper should be modeled (see Figure 1). Although there are
various models and techniques developed for 3D simulation and analysis of the liquid propagation
within porous media, almost all these models assume an ‘infinite’ volume of liquid (ink) entering the
porous media. This assumption simplifies the calculation to a great extent. However, it is not
applicable to our case of ink seepage because by definition the volume of ink used to write or paint
on a paper is much less than the volume of the paper. In order to address this 'challenge of finite volume', we introduced a new mesoscale model, at the level of fiber‐structure discretization, that considers the actual amount of ink used as a constraint in its formulation (Farrahi 2013). To address the computational challenge associated with directly and iteratively solving the new model numerically, we also proposed a modified form of the genetic algorithm, specialized for our case, in order to speed up convergence to the final solution. Examples of the simulation results can be seen in Figure 2.
Fig 2: The digital field that represents the fiber material (left), the same field with addition of the ink
distribution (ink is in red, right)
Proposed Paper Aging Study based on 3D Fiber Structure
Although the model introduced in (Farrahi 2013) is capable of simulating 3D mesoscale seepage of ink within the porous medium of fibers, it is recommended to perform various 3D simulations over a comprehensive set of parameter values for a small (microscale) surface of paper, and then to integrate and model the results of the 3D simulations in the form of nonlinear diffusion coefficients that can be used in macroscale diffusion simulations. Furthermore, the 3D mesoscale model of (Farrahi 2013) could be extended and generalized by introducing 'sub‐fiber' potential terms, which could represent the actual carriers of the ink molecules along or 'across' the fiber pieces or strands, into the calculations. In addition, complementary optical models could start from the 3D profile of an ink‐seepage instance within the paper, along with the paper's fiber structure or its characteristics, and then calculate the estimated 'reflective' or 'transmissive' optical observation. These methods could then be used to perform reverse engineering, estimating the 3D ink profiles from the observations, which in turn can be used to generate the restored or 'original' version of a degraded manuscript. In this work, we focus on a time‐dependent, invasive approach to relating ink propagation on the surface of, or within, the paper to the paper's fiber structure. It is well known that the fiber structure of paper can serve as a reliable source for estimating the age and era of the paper. Our approach is based on a model that relates the time‐dependent radial and axial behavior of ink propagation on the surface of the paper to the fiber structure. It therefore requires high‐definition, high‐dynamic‐range, and high‐frame‐rate video observation of the ink‐paper interaction. An extended version of the model is considered that also works with the temporal distribution of ink within the paper, in addition to the 'surficial', time‐dependent observation. The relations between the temporal behaviors and the fiber structure are then confirmed using 3D simulation of ink seepage as well as X‐ray observation of the structure.
Acknowledgements
This publication was made possible by Grants RGPDD/451272‐13 and RGPIN/138344‐14 from the
NSERC of Canada, Grant 412‐2010‐1007 from the SSHRC of Canada, and NPRP Grant #NPRP 7‐442‐1‐
082 from the Qatar National Research Fund (a member of Qatar Foundation). The statements are
solely the responsibility of the authors.
References
R. Farrahi Moghaddam and M. Cheriet, Low quality document image modeling and enhancement, IJDAR, 11 (4), 183‐201, (2009). DOI: http://dx.doi.org/10.1007/s10032‐008‐0076‐2
R. Farrahi Moghaddam, F. Farrahi Moghaddam, and M. Cheriet, Computer Simulation of 3‐D Finite‐Volume Liquid Transport in Fibrous Materials: a Physical Model for Ink Seepage into Paper, arXiv preprint arXiv:1307.2789, pp 26, (2013). Arxiv: http://arxiv.org/abs/1307.2789
A combination of three complementary non‐destructive Methods applied to Historical Manuscripts
Bernadette Frühmann, Federica Cappa, Wilfried Vetter, and Manfred Schreiner
Institute of Science and Technology in Art, Academy of Fine Arts, Vienna, Austria
Within the framework of the HRSM project*, which aims at the investigation of cultural heritage, the Centre of Image and Material Analysis in Cultural Heritage (CIMA)† was established at the beginning of 2014. Within this project, several historical manuscripts of the Austrian National Library (ÖNB) were examined. The selection comprises badly preserved or rewritten manuscripts (palimpsests) on the one hand, and manuscripts with a remarkable make‐up on the other, dating from the 8th to the 14th century. The material investigations aim at determining the inks and pigments used for writing and illuminating. Besides multispectral imaging‡, different non‐destructive and non‐invasive material investigations are required.
As the manuscripts mentioned are very sensitive, owing to their age as well as to their intense use, it was essential that data on the writing inks and parchments be collected with non‐destructive techniques only. This means that methods capable of measuring in‐situ are needed. XRF analysis and Raman and FTIR spectroscopy meet these demands and, under specific conditions, can also be applied as air‐path systems.
For the elemental identification of the inks and pigments, a portable XRF device from XGLab§, type ELIO, was used. Designed especially for use in the field of art, it is equipped with a 4 W Rh X‐ray tube with a maximum voltage of 50 kV and is mounted, together with a small x‐y stage, on a tripod. In addition, two pointing lasers for alignment and an integrated camera for positioning are fitted. Its ultra‐fast silicon drift detector (active area 25 mm², energy resolution < 140 eV) provides practical advantages in spectral quality and in the detection of light elements with Z even below 20, such as Mg, Al, Si and P.
For the compound-specific identification of pigments, e.g. of the same color, Raman spectroscopy was applied**. The measurements could be carried out in situ with the Pro-Raman-L-Dual-G of Enwave Optronics, USA, a fully integrated and portable instrument. The excitation source used for this investigation was a 785 nm diode laser (approx. 350 mW) with a narrow line width of 2.0 cm⁻¹. The instrument is based on a two-dimensional CCD array detector that is temperature-regulated at −60 °C. The integrated microscope is equipped with a 1.3 Mpixel CMOS camera with in-line LED illumination.
* HochSchulraum‐StrukturMittel (Structural Fund for Austrian Higher Education) of the Austrian Federal
Ministry of Science and Research, 2013.
† CIMA is an interuniversity research institution with an interdisciplinary approach to the investigation of
cultural heritage.
‡ Imaging in spectral ranges from UV to IR, CVL – Computer Vision Lab, Vienna University of Technology
§ XGLab S.R.L. X and Gamma Ray Electronics – Spinoff del Politecnico di Milano, www.xglab.it
** A.S. Lee, V. Otieno‐Alego, D.C. Creagh, J. Of Raman Spectroscopy 39, 1079‐1084 (2008)
To obtain further information about the compounds, or even about organic mixtures, FTIR spectroscopy was a useful tool. For these measurements a portable Bruker ALPHA†† FTIR spectrometer with a measuring spot diameter of approximately 5 mm was employed. Total reflection spectra (specular and diffuse reflection) were collected in situ in the range 4000–450 cm⁻¹ at a resolution of 4 cm⁻¹ over 32 scans. The background was acquired using a gold mirror as reference. The total reflection spectra were transformed into absorption index spectra by applying the Kramers-Kronig algorithm included in the software package OPUS (version 6.5), which is used for controlling the ALPHA instrument as well as for data acquisition and evaluation. After the transformation a baseline correction was applied to the absorption index spectra.
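The Kramers-Kronig step can be illustrated numerically. The sketch below is a simplified approximation, not the OPUS implementation: the phase of the complex reflectivity is estimated with a discrete Hilbert transform of ln √R (a shortcut that ignores the finite measurement range), and the refractive and absorption indices then follow from the normal-incidence Fresnel relations.

```python
import numpy as np

def kk_absorption_index(R):
    """Kramers-Kronig transform of a normal-incidence reflectance
    spectrum R (0 < R < 1) to refractive index n and absorption index k.

    The phase of the complex reflectivity r = sqrt(R)*exp(i*phi) is
    approximated by an FFT-based discrete Hilbert transform of ln|r|."""
    ln_amp = 0.5 * np.log(R)                 # ln|r| = ln(sqrt(R))
    N = ln_amp.size
    X = np.fft.fft(ln_amp)
    h = np.zeros(N)                          # Hilbert-transform filter
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    phi = -np.imag(np.fft.ifft(X * h))       # KK-conjugate phase
    sqrtR = np.sqrt(R)
    denom = 1.0 + R - 2.0 * sqrtR * np.cos(phi)
    n = (1.0 - R) / denom                    # refractive index
    k = -2.0 * sqrtR * np.sin(phi) / denom   # absorption index
    return n, k

# Sanity check: a flat, featureless 4 % reflectance has zero phase, so k
# vanishes and n reduces to the Fresnel value (1 + sqrt(R))/(1 - sqrt(R)).
R = np.full(512, 0.04)
n, k = kk_absorption_index(R)
```

A production implementation must additionally correct for the truncation of the spectrum at the band edges, which the discrete shortcut above does not handle.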
In this presentation, preliminary results of the measurements on historical manuscripts will be shown, especially manuscripts from the Greek and Slavic regions of the 12th to 14th century. On the one hand, the black inks are classified and compared within the manuscripts; on the other, the pigments used for the illuminations and initials are identified. As some of the manuscripts contain folia with palimpsests underneath the text, identifying the writing media of these older texts was an additional challenge.
With the results of these three complementary methods it was possible to identify many of the materials used and to compare different pigments applied in similar initials or miniatures.
†† Bruker Optics, Ettlingen, Germany, http://www.brukeroptics.com/alphaaccessories.html?&L=0&print=1%25252525253F (accessed 27/10/2009)
Old Manuscript Analysis: beyond the Visible
Rachid Hedjam1, Margaret Kalacska1, Sumaya S. Ali Al‐Ma’adeed2, and Mohamed Cheriet3
1Department of Geography, McGill University, Montreal, Canada
2Department of Computer Science and Engineering, Qatar University, Doha, Qatar
3Department of Automated Manufacturing Engineering, ETS, University of Quebec, Montreal, Canada
Introduction
Multispectral (MS) imaging is used to record spectral images in both the visible and the invisible light range, i.e. from ultraviolet (UV) to infrared (IR). Thanks to the use of UV and IR light, MS imaging can extract information that the human eye cannot capture with its receptors for red, green and blue. Light that is visible to the human eye has wavelengths in the range of about 380 nm to 740 nm (Hedjam 2013). A spectral image is reproduced as a grey-scale image or an RGB color image. Visible light is situated between UV light, with short wavelengths in the 10 nm to 400 nm range, and near-IR light, with long wavelengths in the 700 nm to 1 mm range. IR spectral images can be combined into a grey-scale image, and three of them can be used to create false-color RGB images. The principle underlying MS imaging systems is the concept of the spectral signature. The idea is that every material emits, transmits, or absorbs electromagnetic radiation according to its inherent physical structure and chemical composition, and according to the wavelength of the radiation. The ratio of reflected to emitted radiation from the surface of an object varies with the wavelength and the angle of incidence of the radiation. The combination of emitted, reflected, and absorbed electromagnetic radiation across a range of wavelengths produces what we call a spectral signature, which is unique to that material. It is therefore possible to differentiate between objects based on differences in their spectral signatures (Klein et al. 2008). There are a number of applications for MS imaging, e.g., IR reflectography, UV reflectography, and UV fluorescence.
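As a concrete illustration of the false-color composites mentioned above, the sketch below maps three spectral bands onto the R, G and B channels of one image; the synthetic band arrays stand in for images recorded at three IR wavelengths and are not data from the manuscripts discussed here.

```python
import numpy as np

def false_color_rgb(band_r, band_g, band_b):
    """Map three grey-scale spectral bands (e.g. three IR bands) onto the
    R, G and B channels of a false-color composite.  Each band is
    stretched independently to the full 0-255 range."""
    def stretch(band):
        band = band.astype(np.float64)
        lo, hi = band.min(), band.max()
        if hi == lo:                       # flat band: render as mid-grey
            return np.full(band.shape, 128, dtype=np.uint8)
        return np.round(255.0 * (band - lo) / (hi - lo)).astype(np.uint8)
    return np.dstack([stretch(band_r), stretch(band_g), stretch(band_b)])

# Synthetic 4x4 "bands" standing in for three registered spectral images.
bands = [np.random.default_rng(i).random((4, 4)) for i in range(3)]
composite = false_color_rgb(*bands)        # H x W x 3, uint8
```

Because each channel is stretched independently, materials that differ only in one band become strongly colored in the composite, which is what makes the technique useful for visual inspection.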
Fig. 1: Using IR imaging for paper substrate examination. (left) RGB image; (right) IR band at 1100 nm.
IR imaging records portions of absorbed and reflected IR light, which passes through the document layers and interacts with the underlying portions of the document. It can provide a document historian with very important information about the types of ink used and the constituents of the document, all of which help to assess its condition. It can also be used to examine the sheet (leaf) substrate and reveal its physical details, as shown in Figure 1 (right): narrow lines, very close to each other, run through the leaf from top to bottom. We believe those lines may be traces left by the rollers of the production machine as they pressed the fibre substrate during papermaking. Such characteristics of the paper are an important source of information for librarians and scholars, helping them to determine the origin of a manuscript and its date of fabrication. IR imaging is also a useful technique for ink examination. This problem is well known in the area of questioned document examination: forensic scientists still face many forged documents made with the intent to deceive the human eye, for example to earn profits by changing the amount of money reported on cheques. To discriminate between different suspect inks, the examiner compares the spectral signatures recorded from ink samples with a spectrometer or a multispectral sensor. The spectral signature recorded at each pixel represents the percentage of light absorbed or reflected by that pixel. Pixels from the same ink will mostly generate similar spectral signatures, while pixels from different inks will generate different ones. For instance, an ink made from iron gall transmits IR light and thus does not show up in bands recorded at IR wavelengths; an example is the numeral text "485", which is visible in the visible bands and invisible in the IR bands, as illustrated in Figure 2. An ink made from carbon, by contrast, absorbs IR light and thus appears even in bands recorded at IR wavelengths; an example is the Arabic text shown in Figure 2. The reflection, absorption and transmission of light are related mainly to the chemical composition of the material of which the ink is made. This is key information for establishing the period in which such an ink was used.
Fig. 2: Ink discrimination. (a) color image; (b) IR band at 1100 nm.
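The ink behaviour described above can be turned into a toy per-pixel rule. Everything below, from the darkness threshold to the labels, is an illustrative assumption rather than a forensic procedure: a pixel dark in both the visible and the IR band is labelled as carbon ink (which absorbs IR), while a pixel dark only in the visible band is labelled as iron-gall ink (which transmits IR).

```python
import numpy as np

def classify_ink(vis, ir, dark_thresh=0.5):
    """Toy ink discrimination from two co-registered bands.

    vis, ir : arrays of normalized reflectance in the visible and IR band.
    A pixel dark in both bands -> 'carbon'; dark only in the visible
    band -> 'iron-gall'; bright in the visible band -> 'background'."""
    vis_dark = vis < dark_thresh
    ir_dark = ir < dark_thresh
    labels = np.full(vis.shape, "background", dtype=object)
    labels[vis_dark & ir_dark] = "carbon"
    labels[vis_dark & ~ir_dark] = "iron-gall"
    return labels

vis = np.array([0.1, 0.1, 0.9])   # two ink strokes and one paper pixel
ir = np.array([0.1, 0.9, 0.9])    # only the first stroke stays dark in IR
labels = classify_ink(vis, ir)
```

A real examination would of course use full spectral signatures across many bands rather than a single threshold on two bands.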
Another MS technique is UV imaging, which records portions of absorbed and reflected UV light. UV light is an effective tool for detecting newly retouched areas and later restorations that are not visible to the human eye. The experimental setup involves illuminating the document under study with UV lamps (usually referred to as black light) and installing a UV pass filter in front of the acquisition camera to exclude the reflected visible light and allow only the reflected UV light to pass through. The result is a grey-scale (monochromatic) image of the UV light reflected from the document. UV is also a very useful tool for investigating ancient manuscripts, for example for revealing traces of text that may have been added by an archivist and later erased, as shown in Figure 3. In the color image in Figure 3 (left) it is not possible to see this text at all. By examining the band recorded at a UV wavelength (400 nm), we can see a trace of text written at the upper left of the leaf (Figure 3 (middle)). After some enhancement of this image, the text becomes more visible (Figure 3 (right)), even though deciphering it remains difficult. More advanced methods based on MS analysis have also been proposed (Hedjam and Cheriet 2011), mainly for document image segmentation and text extraction.
Fig. 3: Using UV imaging to detect traces of writing
The idea is to use the spectral signatures of pixels as feature vectors and to apply existing classification techniques to separate text from background, as shown in Figure 4. In practice, a representative spectral signature (e.g., the mean spectral signature) is defined for each homogeneous region of a pattern, such as an area of a particular color. The homogeneity hypothesis assumes that patterns belonging to the same class share the same spectral characteristics; this hypothesis is then used to distinguish between objects belonging to different classes. A simple way to classify the patterns is to map their associated spectral signatures, or to compare each spectral signature to a reference signature according to a specific criterion.
Fig. 4: Text extraction
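A minimal sketch of this pixel-wise comparison, assuming Euclidean distance as the criterion and hand-made reference signatures (the classifiers in Hedjam and Cheriet 2011 are more elaborate):

```python
import numpy as np

def extract_text(cube, text_sig, bg_sig):
    """Label each pixel of a multispectral cube (H x W x bands) as text or
    background by comparing its spectral signature to two mean reference
    signatures, using Euclidean distance as the criterion.  In practice
    the references would be measured from small homogeneous regions."""
    d_text = np.linalg.norm(cube - text_sig, axis=-1)
    d_bg = np.linalg.norm(cube - bg_sig, axis=-1)
    return d_text < d_bg                       # True where the pixel is text

# Toy 2x2 cube with 3 bands: dark-ink spectra vs bright-paper spectra.
text_sig = np.array([0.1, 0.2, 0.1])
bg_sig = np.array([0.8, 0.9, 0.8])
cube = np.array([[[0.1, 0.2, 0.1], [0.8, 0.9, 0.8]],
                 [[0.15, 0.25, 0.1], [0.7, 0.95, 0.9]]])
mask = extract_text(cube, text_sig, bg_sig)
```

The nearest-reference rule generalizes directly to more than two classes (several inks plus background) by taking the class with the minimum distance.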
Acknowledgments
This publication was made possible by NPRP grant #NPRP 7‐442‐1‐082 from the Qatar National
Research Fund (a member of Qatar Foundation). The statements are solely the responsibility of the
authors.
References
R. Hedjam, Visual image processing in various representation spaces for documentary heritage preservation,
Ph.D. dissertation, ETS, University of Quebec, Montreal, Quebec, Canada, April 30, 2013.
M. E. Klein, B. J. Aalderink, R. Padoan, G. de Bruin, and T. A. Steemers, Quantitative hyperspectral reflectance
imaging, Sensors, vol. 9, no. 8, 2008.
R. Hedjam and M. Cheriet, Combining statistical and geometrical classifiers for text extraction in multispectral
document images, in Proc. of the 2011 HIP, ser. HIP ’11, 2011, pp. 98–105.
Scientific Analysis of Early Qur'anic Manuscripts
Tobias J. Jocham and Michael Josef Marx
Corpus Coranicum, Berlin‐Brandenburgische Akademie der Wissenschaften, Potsdam, Germany
In the context of the French‐German cooperation project Coranica an interdisciplinary group took up
the task of including modern scientific analysis methods into the respective manuscript studies. Thus,
the Coranica project includes a module named computatio radiocarbonica where palaeographical
analysis and dating of the oldest manuscripts of the Qurʾān is supplemented by scientific methods
such as radiocarbon dating or ink analysis.
The 14C ages of manuscripts dated by colophon will be set in relation to the measured values of undated pieces. To this end, a first group of samples is taken from Arabic papyri of the period 642–750 AD, which also allows the accuracy of the 14C method to be tested for this particular writing material and period (Bronk Ramsey & Shortland 2013; Dee et al. 2012). Some dated early Arabic papyri form an important basis for the palaeographical typology of the early qurʾānic manuscripts (Grohmann 1958). Since all the early manuscripts of the Qurʾān were written on parchment, dated parchments and palimpsests carrying non-qurʾānic texts from the same region and period, i.e. Syriac and Georgian manuscripts, have been consulted for comparison. This selection of documents covers an extended period (450 to 950 AD), with the additional benefit of discerning any systematic errors.
With this research, the actual precision and significance of 14C datings can be determined for early
manuscripts of the Qurʾān. The selection forms a representative sample from the known manuscripts
in ḥiǧāzī ductus, which are considered the oldest written textual witnesses of the Qur'an ‐ their
temporal proximity to the proclamation of Muḥammad is still discussed today. Until now, scientific
analysis was performed on these materials in only a few cases and thus had a limited scope of impact.
The structure of this new experimental setup, and the amount of sampled materials, will result in a
database which will provide an accurate basis for further investigations.
This method is not without conservation concerns, as it destroys a small part of the object for analysis. However, thanks to improved techniques for the extraction of carbon, only a small amount (about 20 mg) of the original material needs to be removed. For parchment this corresponds to an area of about 1 cm², depending on the material's thickness; with some even smaller samples a species determination was also possible.
Through a systematic and comparative analysis of early qurʾānic manuscripts by means of scientific
methods like radiocarbon dating ‐ regardless of palaeographic classification ‐ new findings may be
discovered about the history of the oldest witnesses of the qurʾānic text.
Some first results have already been published (Marx and Jocham 2015), and the reaction in the media* as well as in scientific circles has demonstrated the interest in, and the need for, such interdisciplinary research efforts.
* Further press coverage in BBC Persian, Deutsche Welle and the Tagesschau (the main news programme of German public broadcasting).
References
Bronk Ramsey & Shortland, Radiocarbon and the Chronologies of Ancient Egypt, 2013
M.W. Dee, J.M. Rowland, T.F.G. Higham, et al., Synchronising radiocarbon dating and the Egyptian historical chronology by improved sample selection, in Antiquity 86:333, 2012.
A. Grohmann, The Problem of Dating Early Qurʾans, in Der Islam 33:8, 1958.
M. J. Marx and T. J. Jocham, Zu den Datierungen von Koranhandschriften durch die 14C-Methode, in Frankfurter Zeitschrift für Islamisch-theologische Studien (2015).
* Cf. the published results of the Leiden University Library and the press release of the University Library.
Spectroscopic Studies of Armenian Manuscripts: Paper, Inks, Pigments
Yeghis Keheyan1 and Gayane Eliazyan2
1ISMN, CNR, c/o Dept. of Chemistry, University of Rome “La Sapienza”, Rome, Italy
2Restoration Dept. of Matenadaran Museum of Yerevan, Armenia
Since 1998, the Italian group (CNR) and the restoration department of the Matenadaran (Armenia) have collaborated in identifying the chemical composition and the degradation of ancient manuscripts from the tenth to the seventeenth century.
Several papers have been published and presentations given throughout the world (Eliazyan et al. 1998; Keheyan et al. 2001; Keheyan et al. 2012; Baraldi et al. 2013; Baraldi et al. 2014; Keheyan et al. 2014; Keheyan and Baraldi 2015; Keheyan et al. 2015).
In this presentation the results obtained with different spectroscopic techniques, such as SEM-EDX, Raman, XRF and FTIR, will be given. These include studies of individual pieces of paper, inks and pigments, as well as studies of whole manuscripts in all their parts, including cover, binding, etc.
This contribution presents results from the technical study of a rare XIV century Armenian
illuminated manuscript in the collection of the Matenadaran. Characterization of the manuscript
components has been undertaken to create a preservation plan for the manuscript.
Continuing our analysis of the painting materials and techniques of Armenian illuminated manuscripts, we report on a XIV century manuscript, no. 4915 (Gospel), from Aghtamar Island, whose colorful images were under restoration. The Aghtamar Gospel is a single bound manuscript (26.5–27 × 18.5–19 cm) of 288 leaves written in bolorgir, a medieval Armenian cursive script. While no binding is extant,
sewing holes and trimmed edges show that it was bound at least twice previously. Notable are the
full‐page miniatures of the evangelists Matthew, Mark, Luke and John, each of whom faces the
opening text of his Gospel. Yellow, blue, green, magenta and red are lavishly employed in the
miniatures in a range of shades. White, black, grey and brown are used in discrete areas. The facing
pages of the miniatures contain brightly decorated headpieces that signal the opening of the Gospel
texts, plus stylized initials, zoomorphic writing and linear arabesque marginalia. Most of the text is
written with opaque black ink, with occasional rubrication and headings in orange‐red or dark
magenta. A faint grey color and strong indentations in the paper substrate show the ruling lines. The
paper of the text block is surface sized, probably with starch, and has been burnished to give it a
smooth, glazed appearance similar to parchment. The cover is of wood and brown leather. The microsamples were analyzed with different techniques, showing that traditional pigments were used alongside products and mixtures typical of Armenian illumination, such as vergaut, a mixture of indigo and orpiment suitable for foliage. The wide gamut of materials employed is of high significance. Gilding was applied on Armenian bole together with a proper binder. Among the most frequently found pigments are carbon, white lead, gypsum, calcite, orpiment, lazurite, indigo, cinnabar, goethite, litharge, massicot, azurite and minium. One green is antlerite, a basic copper sulfate characteristic of the regions around Armenia.
References
P. Baraldi, Y. Keheyan, P. Zannini, Un altro paese, un’altra tavolozza: pigmenti e coloranti nella miniatura di codici armeni da Matenadaran, XIV Congresso nazionale di chimica dell’ambiente e dei beni culturali “La chimica nella società sostenibile”, Rimini, 3–5 giugno 2013.
P. Baraldi, Y. Keheyan, G. Eliazian, A. Mkrtichian, S. Nunziante, C. Baraldi, Armenian illuminated manuscripts, a colourful testimony of religious art examined by molecular spectroscopy techniques, VIth European Symposium on Religious Art, Restoration & Conservation, ESRARC2014, Florence, Italy, 9–11 June 2014.
G. Eliazyan, G. Alaverdyan, Y. Keheyan, Use of polymers for strengthening dilapidated museum materials, Workshop “Metodi Chimici, Fisici e Biologici per la salvaguardia dei Beni Culturali”, 18 dicembre, San Michele, Roma, 1998.
Y. Keheyan, G. Eliazyan, G. Alaverdyan, The characterization of medieval colours and papers by laser desorption FT-ICR mass spectrometry, European Materials Research Society Spring Meeting, E-MRS 2001, Strasbourg (France, 5–8), 2001.
Y. Keheyan, P. Baraldi, G. Eliazian, A study on the polychromy and technique on some Armenian illuminated manuscripts by Raman microscopy, 2nd Int. Scientific Seminar “The faces of memory: The newest technologies of preservation and restoration of manuscript and printed heritage”, October 7–11, 2012, Yerevan, Armenia.
Y. Keheyan, P. Baraldi, P. Zannini, G. Eliazian, C. Baraldi, M.C. Gamberini, S. Nunziante Cesaro, A study of some illuminated Armenian manuscripts, ICOM-CC Triennial Congress, Melbourne, Australia, September 15–19, 2014.
Y. Keheyan, P. Baraldi, The use of non-invasive micro-Raman, FT-IR and SEM-EDX analyses to study Armenian illuminated manuscripts, Technart15, Catania, April 27–30, 2015.
Y. Keheyan, P. Baraldi, A. Agostino, G. Fenoglio, M. Aceto, Spectroscopic study of an Armenian manuscript from the Biblioteca Universitaria di Bologna, ESRARC15, VIIth European Symposium on Religious Art, Restoration & Conservation, Trnava, Slovakia, 4–6 June 2015.
A First Step to Balinese Script OCR: An Initial Study on Isolated Character Recognition of Balinese
Script on Palm Leaf Manuscripts
Made Windu Antara Kesiman, Jean‐Christophe Burie, and Jean‐Marc Ogier
Laboratoire Informatique Image Interaction (L3i), University of La Rochelle, La Rochelle, France
Bali has a rich tradition of literature that dates back several hundred years. One of the most valuable cultural relics found in Bali is its collection of palm leaf manuscripts (Fig. 1). The island’s literary works were mostly recorded on dried and treated palm leaves, called lontar. Lontars store various forms of knowledge and historical records of Balinese social life long ago. They contain ancient literary texts composed in the old Javanese language of Kawi and in Sanskrit. Their content ranges from ordinary texts to Bali’s most sacred writings. Many of these epics are based on the famous Indian epics Ramayana and Mahabharata. They include texts on religion, holy formulae, rituals, family genealogies, law codes, treatises on medicine (usadha), arts and architecture, calendars, prose, poems and even magic. Many lontars contain information on important matters such as medicines and village regulations that are used as daily guidance.
Fig. 1: Balinese palm leaf manuscripts
A lontar is written on a dried palm leaf with a small knife; the leaf is then scrubbed and colored with natural dyes. Lontars are inscribed with a special tool called a pengerupak. It is made of iron, with its tip sharpened into a triangular shape so that it can make both thick and thin incisions. The writing is incised on one (or both) sides of the leaf, and the script is then blackened with soot. The leaves are held and linked together by a string that passes through the central holes and is knotted at the outer ends. Unfortunately, the physical condition of the natural palm leaf material cannot last long. Many of the lontars discovered in museum and private family collections are in a state of disrepair due to age and to inadequate storage conditions. Palm leaf manuscripts are usually of poor quality, since the documents have degraded over time under these conditions. Equipment for protecting palm leaves from rapid deterioration is still relatively scarce, and therefore the processes of digitizing and indexing lontars are important (Kesiman et al. 2015/1; Kesiman et al. 2015/2). In the last five years, ancient palm leaf manuscripts have received great attention from researchers in the field of document image analysis; the collections of palm leaf manuscripts in Southeast Asia have attracted their attention, for example in work on manuscripts from Thailand (Chamchong et al. 2014; Fung and Chamchong 2010). The majority of Balinese have never read any lontar, because of language obstacles as well as a tradition that regards doing so as sacrilege. The main objective of this project is to bring added value to digitized palm leaf manuscripts by developing tools to analyze, index and access the content of lontars quickly and efficiently. This research tries to make lontars more accessible, readable and understandable to a wider audience and to scholars and students in Bali, in Indonesia and all over the world. Lontars offer a new challenge in OCR development due to the physical characteristics and condition of the manuscripts. The manuscript images contain discolored parts and artefacts due to aging, low intensity variations or poor contrast, random noise, and fading. Several deformations in the character shapes are visible, due to the merging and fracturing of strokes, the use of non-standard character forms, and varying spaces between letters and between lines (Fig. 2).
Fig. 2: Several deformations in lontar (Kesiman et al. 2015/2)
With the aim of developing an OCR system for Balinese script on palm leaf manuscript images, this paper presents our initial study on isolated character recognition of Balinese script on palm leaf manuscripts. We investigated the performance of two image features: a gradient feature (Khayyat et al. 2013) and a more complex feature, Bag of Features with Dense SIFT (BoF DenseSIFT) (Rusinol et al. 2011), using two classifier models, Support Vector Machine (SVM) and Hidden Markov Model (HMM). We performed six experimental schemes on our first isolated Balinese character dataset, collected from our first Balinese palm leaf manuscript collection. It consists of the 37 characters of the Balinese script alphabet, with a total of 10,159 isolated character images for training and 740 isolated character images for testing. For the two-class SVM classifier and the HMM classifier, we built 37 SVM models and 37 HMM models, one for each character. Multi-class SVM classification is then performed by applying a one-vs-all scheme over all the two-class classifiers. We also present and analyse the quantitative and visual correlation between characters of the Balinese script based on the recognition performance of our six experimental schemes. This analysis will serve in our future work as the first step toward OCR of Balinese script on palm leaf manuscripts, either by using a more appropriate image feature or by proposing a new multi-class classification scheme to achieve a better recognition rate.
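The one-vs-all combination over per-character binary classifiers can be sketched as follows. The per-class "models" below are simple centroid scorers standing in for trained SVM or HMM models, and the three-character alphabet and 2-D feature vectors are invented for illustration; only the combination scheme itself mirrors the description above.

```python
import numpy as np

class BinaryCentroidScorer:
    """Stand-in for one of the 37 per-character binary classifiers.
    A real system would train an SVM per character; here each 'model'
    scores a feature vector by its negative distance to the class
    centroid, which is enough to demonstrate the one-vs-all scheme."""
    def __init__(self, class_vectors):
        self.centroid = np.mean(class_vectors, axis=0)

    def score(self, x):
        return -np.linalg.norm(x - self.centroid)

def one_vs_all_predict(models, x):
    """One-vs-all: every binary model scores the sample, and the label
    of the highest-scoring model wins."""
    scores = {label: m.score(x) for label, m in models.items()}
    return max(scores, key=scores.get)

# Toy 'alphabet' of three character classes in a 2-D feature space.
train = {"ka": np.array([[0.0, 0.0], [0.2, 0.1]]),
         "ga": np.array([[5.0, 5.0], [5.1, 4.9]]),
         "nga": np.array([[0.0, 5.0], [0.1, 5.2]])}
models = {label: BinaryCentroidScorer(v) for label, v in train.items()}
pred = one_vs_all_predict(models, np.array([4.9, 5.1]))   # a sample near "ga"
```

With real SVMs, the scores would be signed decision-function values rather than centroid distances, but the argmax combination is the same.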
References
R. Chamchong, C.C. Fung, K.W. Wong, Comparing Binarisation Techniques for the Processing of Ancient Manuscripts, in: R. Nakatsu, N. Tosa, F. Naghdy, K.W. Wong, P. Codognet (Eds.), Cultural Computing, Springer Berlin Heidelberg, Berlin, Heidelberg, 2010, pp. 55–64. http://link.springer.com/10.1007/978-3-642-15214-6_6 (2014).
C.C. Fung, R. Chamchong, A Review of Evaluation of Optimal Binarization Technique for Character Segmentation in Historical Manuscripts, in: IEEE, 2010, pp. 236–240. doi:10.1109/WKDD.2010.110.
M.W.A. Kesiman, S. Prum, I.M.G. Sunarya, J.-C. Burie, J.-M. Ogier, An Analysis of Ground Truth Binarized Image Variability of Palm Leaf Manuscripts, in: 5th Int. Conf. Image Process. Theory Tools Appl. (IPTA 2015), Orleans, France, 2015/1, pp. 229–233.
M.W.A. Kesiman, S. Prum, J.-C. Burie, J.-M. Ogier, An Initial Study on the Construction of Ground Truth Binarized Images of Ancient Palm Leaf Manuscripts, 13th Int. Conf. Doc. Anal. Recognit. (ICDAR), Nancy, France, 2015/2.
M. Khayyat, L. Lam, C.Y. Suen, Verification of Hierarchical Classifier Results for Handwritten Arabic Word Spotting, in: IEEE, 2013, pp. 572–576. doi:10.1109/ICDAR.2013.119.
M. Rusinol, D. Aldavert, R. Toledo, J. Llados, Browsing Heterogeneous Document Collections by a Segmentation-Free Word Spotting Method, in: IEEE, 2011, pp. 63–67. doi:10.1109/ICDAR.2011.22.
Fibre Analysis of Pattani Manuscripts
Ayşegül Kocaman
Manuscript Institution of Turkey, The Department of Manuscript Conservation and Archive, Istanbul, Turkey
Fiber analysis of historical manuscripts is an important step in determining the nature of the paper material and in deciding on the conservation treatment. For this purpose, three manuscripts from Pattani, Thailand, were analyzed. Test samples were collected from the three manuscripts, named N1, N2 and N5, the numbers being the manuscripts’ collection numbers given at Pattani. Microscope slides were prepared in two groups. In the first group the fibers were stained with methylene blue; these samples are called NM1, NM2 and NM5. The unstained fibers formed the second group. The identification of the fibers was carried out with an Olympus BH2 polarized microscope at 4× and 40× magnification. Stereomicroscopy was also used for visual inspection.
The microscopic examination and the optical appearance of the fibers were compared with the fiber atlas. We found mainly two kinds of fibers, one approximately 5–10 mm in length and the other 3.5–7.5 mm. The first fibers are unicellular, tough and flexible, and the nodes are usually not obvious; the cells appear oval in shape. The second type of fiber is shorter than the first and probably consists of rice fibers. Our investigation revealed that the paper used in the manuscripts is Daulang, the special paper of that region.
Pattani 1   Pattani 2   Pattani 5
Fig. 1: Stereomicroscopic image of Pattani 1
Fig. 2: Microscope slides of the samples
Fig. 3: Polarized Microscopic images of NM1, NM2 and NM5
References
M. Barkeshli, Historical and Scientific Analysis of Iranian Illuminated Manuscripts and Miniature Painting, The Book and Paper Group Annual, Vol. 22 (2003).
C.T. Beng, Marilyn Zurmuehlen Working Papers in Art Education, Vol. 3, Issue 1, Article 11.
Application of forensic multispectral scanner to non‐invasive analysis of iron‐gall inks:
a comparison with XRF and micro‐Raman spectroscopic techniques
Anna Rogulska1 and Barbara Łydżba‐Kopczyńska2
1 Faculty of Chemistry, Jagiellonian University, Kraków, Poland
2 Faculty of Chemistry, University of Wroclaw, Wroclaw, Poland
The aim of this study was to establish the efficiency of a forensic multispectral scanner, constructed
for modern inks forensic analysis, in distinguishing between diverse iron‐gall inks in historic
documents. For that purpose, the results from multispectral scanning were compared with those
obtained with XRF and micro‐Raman spectroscopic examination, traditionally applied in historic inks
analysis. By multispectral scanning we expected to obtain data which allow for differentiation of iron‐
gall inks in cases where neither visual analysis nor traditional spectroscopic techniques are sufficient.
The objects investigated in the present study were a collection of XVII- and XVIII-century Polish administrative documents from the Ossolinski National Library in Wroclaw, Poland. Each manuscript, coming from a different part of the country and containing various lists, transactions and signatures, provided good study material with iron-gall inks of diverse origin.
The text areas under consideration were first examined with a portable X-ray fluorescence spectrometer (Tracer, Bruker) and a micro-Raman spectrometer equipped with a 514 nm laser line (Horiba Jobin Yvon T6400) to confirm the presence of iron-gall ink and to analyze the differences between particular writing fragments. All measurements were performed in situ, without taking samples from the objects. For the multispectral scanning, a 2D spectral scanner constructed for forensic investigations of documents (Łydżba-Kopczyńska et al. 2012) was employed. The CADE (Computer Aided Document Examination) system integrates different optical acquisition methods: microscopy, topography, spectroscopy and scattering. The main element of the system is a special spectral head with a camera allowing the parallel acquisition of spectral (VIS/NIR) data. The second, equally essential component of the system is the dedicated software, which contains several algorithms for the spectral analysis of questioned documents as well as special visualization tools. In our study we focused mainly on reflectance spectra and the false-color imaging technique.
As a result of our investigation, we confirmed that a multispectral scanner designed to detect subtle spectral changes in modern inks, e.g. in the dye used, is also applicable to the identification of historic iron-gall inks. This technique can be an important alternative to the traditionally used non-invasive techniques, such as micro-Raman and XRF spectroscopy, which in some cases cannot demonstrate the difference between writing materials of the same type but of diverse origin.
Acknowledgments
The authors thank the Ossolinski National Library in Wrocław for access to the manuscripts and the library's conservators for their assistance with this investigation.
References
B. I. Łydżba-Kopczyńska, M. Mrzygłód, J. Reiner, G. Rusek, Application of Polymorphic Scanner 2D in Non-invasive Investigations of Writing Materials, in: 10th Biennial International Conference of the Infrared and Raman Users Group, 2012, Book of Abstracts, 52-53.
Perceptual Model with global‐local Vision Primitives for Arabic Script Recognition
Samia Snoussi
Faculty of Computing and Information Technology, Jeddah University, Saudi Arabia
Our research deals with handwritten Arabic script. We propose a recognition system based on interactive-activation and verification perceptual models.
The proposed system is based on a Transparent Neural Network (TNN). The TNN proceeds by a global vision of structural descriptors during the propagation step and a local vision by normalized Fourier Descriptors (FD) during the back-propagation step. The idea of the TNN comes from human reading behaviour. The ultimate goal of the proposed handwritten Arabic word recognition system is to imitate the human ability to read at a much faster rate. Psychological studies have shown that humans generally read a word globally: the reader does not need to recognize all the letters, as the structural shape of key letters on its own can be sufficient. If the word cannot be made out this way, the reader examines it locally. Our recognition system is based on this idea of combining local and global features of Arabic words. Indeed, psychologists have carried out experiments to study, on the one hand, how humans recognize shapes and, on the other, which primitives stimulate human vision.
These psycho-cognitive experiments aimed to observe the behaviour of humans while reading, and yielded the following observations:
● Importance of lexical context: global vision can help us to deduce local information in cases presenting distortions. From the principal primitives of the word we can recognize characters that are distorted or lack primitives.
● Flagrant primitives: global vision can be sufficient to recognize a shape.
● Fine analysis: in order to recognize distorted shapes, it is not necessary to learn all possible distortions; training on a typical prototype can be sufficient.
Based on these principles, psychologists have proposed perceptual models of human reading. Some of these models are based on specific neural networks and have been implemented by handwriting-recognition researchers.
The proposed system, called TNN-FD, is based on a Transparent Neural Network (TNN) processing the handwritten word globally by the use of flagrant structural features. Local vision is ensured by normalized Fourier Descriptors (FD). The behavior of the TNN-FD can be explained step by step; thus, it falls into the category of “transparent” systems.
The first particularity of the TNN is that it needs only a simple feature extraction step and a normalization post-processing step used to reduce the variability of handwriting. Indeed, feature extraction is one of the most difficult and important problems of Arabic script recognition due to the variability of the cursive script, so the selected set of features should be easily extracted and should efficiently discriminate between patterns of different classes while remaining invariant for patterns within the same class. The second particularity of the proposed recognition system is its ability to simulate human reading.
The proposed perceptual model for the recognition of handwritten Arabic script proceeds with a global vision of structural characteristics that are easy to detect through a TNN recognition system. In case of difficulties, a local vision by finer descriptors (FD) improves the decision rate of the TNN-FD. Normalized FDs have been chosen because of their invariance to rotation, position and size. The particularity of the TNN-FD is that each step of recognition can be explained: feature extraction, letter detection, word estimation and script analysis. This transparency gives us the possibility of recognizing script without a hard training step. The recognition rate can reach 90%. The proposed TNN is neither time- nor memory-consuming; for this reason we chose to extend it to a large vocabulary.
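The invariance properties claimed for the normalized FDs can be illustrated with a short sketch (an illustrative reading of the normalization, not the authors' implementation; the contour representation and the exact normalization choices are assumptions): dropping the DC term gives position invariance, dividing by the first harmonic's magnitude gives size invariance, and keeping magnitudes only gives invariance to rotation and starting point.

```python
import numpy as np

def normalized_fourier_descriptors(contour, n_desc=8):
    """Fourier descriptors of a closed contour sampled as complex
    points x + iy, normalized so that they are invariant to position
    (DC term discarded), size (divided by the first harmonic's
    magnitude) and rotation (magnitudes only)."""
    z = np.asarray(contour, dtype=complex)
    F = np.fft.fft(z)
    mags = np.abs(F[1:n_desc + 1])   # skip DC term
    return mags / mags[0]

# a square-ish contour, then the same shape rotated, shifted and scaled
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
square = np.sign(np.cos(t)) + 1j * np.sign(np.sin(t))
moved = 3.0 * square * np.exp(1j * 0.7) + (5 + 2j)

d1 = normalized_fourier_descriptors(square)
d2 = normalized_fourier_descriptors(moved)
# the descriptors agree despite rotation, translation and scaling
```

The second contour is the first multiplied by a complex scale/rotation factor and shifted; only the discarded DC term changes, so `d1` and `d2` coincide.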
As a first perspective, we are working on the improvement of structural feature extraction. The second perspective is the extension of the system to sentences by combining the TNN with a syntactic and semantic analyser as a post-processing step.
Why should philologists learn computer vision?
Daniel Stoekl Ben Ezra
Ecole Pratique des Hautes Etudes, Paris, France
In recent years, algorithms developed by computer vision specialists have become powerful enough to attack some problems in manuscript studies. Fruitful collaborations with experts from HTR and NLP, palaeography and forensics are driving the field forward with promising achievements for all participants. However, manuscript studies is a combination of highly complex fields with complex data that vary greatly in quality and quantity. It is hard to imagine that a “one does it all” infrastructure will be invented that can cater to Assyriologists, Egyptologists or other epigraphists, to specialists for Buddhist texts on palm leaves and bamboo, for Coptic, Hieratic or Aramaic ostraca, and for Latin, Greek, Chinese, Arabic and Hebrew manuscripts with secondary interventions, not to mention palimpsests. Furthermore, once a specific problem in computer vision is considered solved from an algorithmic perspective, the field will move on to other theoretical questions that permit its researchers to publish innovative articles in their specialist peer-reviewed journals and conferences.
Philologists should therefore prepare to apply computer vision algorithms to their specific problems. In fact, this situation is not so different from computer vision in the hard sciences: biologists, astronomers and physicists have all learned to code. In philology, this necessity may be even greater, since images of manuscripts differ from medical or satellite imagery in having less health, industrial, or strategic interest; it is therefore to be expected that less money will be available from outside academic and cultural institutions. While perhaps not producing cutting-edge informatics, coding as a manuscript specialist may have the advantage of arriving, sometimes, at better practical results due to a much shorter evaluation distance.
This contribution will present the results of code written by a philologist for automatic layout analysis (column and line segmentation) and handwriting analysis (consonant/vocalization segmentation, transcription alignment and glyph clustering) of Hebrew manuscripts from the Dead Sea Scrolls and the Middle Ages. The algorithm is based on binarization, vertical and horizontal projection profiles, morphological transformations, expectations deduced from previous knowledge of the manuscript and handwriting, synthetic manuscript creation and letter spotting via HOG feature analysis.
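One of the ingredients named above, the horizontal projection profile, can be sketched in a few lines (a toy illustration, not the author's code; the binarized-page representation and function name are assumptions): rows containing ink are grouped into runs, and each run is taken as one text line.

```python
import numpy as np

def segment_lines(binary):
    """Toy line segmentation by horizontal projection profile.
    `binary` is a 2-D boolean page image (True = ink); rows with
    ink are grouped into runs, each run being one text line."""
    profile = binary.sum(axis=1)          # ink pixels per row
    ink_rows = profile > 0
    lines, start = [], None
    for y, has_ink in enumerate(ink_rows):
        if has_ink and start is None:
            start = y                     # a line begins
        elif not has_ink and start is not None:
            lines.append((start, y))      # rows [start, y) are one line
            start = None
    if start is not None:                 # line touching the bottom edge
        lines.append((start, len(ink_rows)))
    return lines

# two "lines" of ink separated by blank rows
page = np.zeros((10, 20), dtype=bool)
page[1:3, 2:18] = True
page[6:9, 2:18] = True
print(segment_lines(page))  # [(1, 3), (6, 9)]
```

Real manuscripts need more care (skewed lines, touching ascenders/descenders), which is where the morphological transformations and prior expectations mentioned above come in.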
The “decorative style” group reconsidered: A contribution to the study of twelfth and thirteenth
century production of Greek illuminated manuscripts in the Eastern Mediterranean
Marina Toumpouri
Science and Technology in Archaeology Research Center (STARC), The Cyprus Institute, Nicosia, Cyprus
The publication of Annemarie Weyl Carr's monograph in 1987 resulted in the attribution of the largest homogeneous group of Byzantine illuminated manuscripts known to have survived to the Cypro-Palestinian region (Weyl Carr 1987). The group, which comprises 110 members and continues to expand, was given the name “decorative style” based on the stylistic features of the manuscripts' illumination (decoration and miniatures). The group, dated from ca. 1150 to 1250 (Weyl Carr 1987), is in fact almost all that is known of Byzantine illuminated manuscript production in the twelfth century and the only group of deluxe Greek manuscripts from the first half of the thirteenth century. Despite the fact that the stylistic, iconographic and codicological study of the manuscripts made possible their classification into three groups and, within them, eight subgroups, Carr stressed that the manuscripts must not have been produced in “stable workshops” but resulted from shifting patterns of relationships between scribes, illuminators and models. She furthermore made clear that several issues, such as the position of such an important group within the manuscript and artistic production of the Eastern Mediterranean during the Crusader period, require further examination (Weyl Carr 1987).
More recent studies have challenged the locality of production proposed by Carr, suggesting Constantinople as more plausible (Maxwell 2014). The time span of the production has also been challenged since the discovery of newly dated manuscripts from the end of the thirteenth century that have been assigned membership in the group (Džurova 2002). Studies which focused on the textual evidence (New Testament), aiming to shed light on the place of production of the “decorative style” or attempting to verify Carr's manuscript groupings, showed that despite codicological and stylistic ties between the manuscripts these connections could not be confirmed; instead, they revealed that subgroup boundaries are in several cases blurred (Maxwell 2014; Langford 2009).
The above bespeaks not only the complexity of the interrelationships of the manuscripts of the group but also the limitations of traditional methodologies. Information regarding types of materials and procurement and manufacturing processes at a microscopic (inks, pigments, parchment etc.) and macroscopic (bindings, page layout etc.) level, as well as automated document analyses, have a major contribution to make, since they could generate new objective and unbiased knowledge that previously remained ignored or inaccessible. The “decorative style” group and the questions it raises advocate in fact for the application of a holistic approach including technological, visual, physicochemical, biomolecular and historical evidence, which would build upon and at the same time reassess Carr's art historical analysis and classification.
The poster will present the project launched in 2014 as part of the research agenda of the mobile laboratory STAR Lab (ΝΕΑ ΥΠΟΔΟΜΥ/ΣΤΡΑΤΗ/0308/30, co-financed by the European Regional Development Fund and the Republic of Cyprus through the Research Promotion Foundation). The project strives to create a corpus of analytical information with the aim of contributing to the study of the “decorative style” group by uncovering critical information as to the manuscripts' place and context of production, as well as their interrelationships. The great potential of such a holistic
approach in the field of manuscript studies, given its contribution to our knowledge of the distribution and use of manuscripts, as well as of artistic and cultural practices within the Eastern Mediterranean, will be highlighted by presenting the results obtained during the first phase of the project. This phase concentrated mainly on the analytical characterization of manuscripts belonging to the “decorative style” group found in Cyprus, through non-invasive techniques (digital microscopy, FORS, XRF, FTIR and multispectral imaging).
References
A. Weyl Carr, Byzantine Illumination 1150‐1250. The Study of a Provincial Tradition. University of Chicago Press, Chicago, 1987.
K. Maxwell, The Afterlife of Texts: Decorative Style Manuscripts and New Testament Textual Criticism, in: L. Jones (Ed.), Byzantine Images and their Afterlives. Essays in Honor of Annemarie Weyl Carr, Ashgate, Surrey, 2014, pp. 11‐39.
A. Džurova, Byzantinische Miniaturen. Schätze der Buchmalerei vom 4. bis zum 19. Jahrhundert, Schnell und Steiner, Regensburg, 2002.
W. Langford, From Text to Art and Back Again: Verifying A. Weyl Carr's Manuscript Groupings through Textual Analysis, Unpublished Ph.D. Dissertation, Faculty of the New Orleans Baptist Theological Seminary, 2009.
Meaningless Text OCR Model for Medieval Scripts
Adnan Ul‐Hasan1, Syed Saqib Bukhari2, and Andreas Dengel1,2
1University of Kaiserslautern, Germany 2German Research Center for Artificial Intelligence, Kaiserslautern, Germany
Abstract: The availability of a large amount of ground-truth data for training an Optical Character Recognition (OCR) engine is extremely critical. Training data is usually produced by manually transcribing thousands of document images. In order to augment the limited training data, synthetic training data is also used, produced by rendering text into images in suitable fonts and styles. The most important ingredient of synthetic training data is the corresponding real-world text. If real-world text data is unavailable, as may be the case for historical manuscripts, generating synthetic training data is not possible. In this paper, this problem is addressed for the case of historical manuscripts whose vocabulary and sentence structure is neither available in text form nor similar to any existing (contemporary) script. For such a case, we introduce a novel meaningless-text OCR model, in which meaningless words of variable sizes are generated by permuting characters. Meaningless text lines are subsequently produced by randomly choosing from these meaningless words. Testing the meaningless text-line recognizer on real text lines shows good performance.
The rest of the paper answers the following questions in sequence: which types of historical documents are we dealing with here? Why is a text-line-based recognizer preferable to a character-based recognizer? What is the traditional way of training text-line-based recognizers? What novel technique do we present to overcome the limitations of the traditional training procedure? And what initial results have we achieved?
Which types of historical documents are we dealing with here?
The “Narrenschiff” is a medieval German novel of the 15th century. Its first edition was printed in German in 1494 in Basel and gained a lot of popularity at the time. Afterwards, many copies spread all over Europe, with numerous translations and variations. Almost every edition can be characterized by its use of historical fonts and vocabulary. We are digitizing these documents under the German government-funded project Kallimachos. Some sample document images from this novel are shown in Figure 1.
Why is a text-line-based recognizer preferable to a character-based recognizer?
Text-line-based recognizers, in contrast to character-based recognizers, produce better recognition accuracies without any language modeling or other post-processing, because such line-based recognizers learn characters together with their context. Albeit simple to use, LSTM-based recognizers have shown excellent OCR results for many scripts (Breuel et al. 2013; Ul-Hasan et al. 2013; Ul-Hasan and Breuel 2013; Karayil et al. 2015; Simistira et al. 2015).
What is the traditional way of training text-line-based recognizers?
For the development of a text-line-based OCR model, ground-truth data, that is, a set of text-line images and the corresponding text lines, plays an important role. Traditionally, the following paradigms are used for training an LSTM-based OCR model for a particular script: (i) from the scanned document images, extract the text lines along with their ground-truth information and use them for training an LSTM network; (ii) if the text lines obtained from this process are not sufficient, generate synthetic text lines using available text in that language. However, in the case of a historical script, neither of these traditional paradigms is applicable: first, because a lot of time is required for manual transcription, and second, because of the non-availability of digitally created text.
What novel technique do we present to overcome the limitations of the traditional training procedure?
For the training of LSTM networks for non-existing scripts, we have generated meaningless text lines. The process of generating such meaningless data for training a line-based recognizer is as follows. First, we randomly generate a word corpus consisting of all possible permutations of all characters for several word lengths. We refer to this set of words as a bag of meaningless words. For the Latin “Narrenschiff” novel, there are around 84 characters (lowercase, capital, punctuation marks, digits). The permutations of all characters with different word lengths (say, 1 to 8) produce a huge number of meaningless words, and given limited memory and processing-time resources it is difficult to include all of them. As a proof of concept, in this paper we have limited ourselves to short words (i.e. permutations of lowercase letters only, with word lengths of 3 and 4). We then generated the text-line training data using this bag of meaningless words, and rendered the text lines to build a training database. Some sample rendered text-line images from our meaningless training database are shown in Figure 2. Finally, we trained an LSTM recognizer using the meaningless training data.
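The word-generation step described above can be sketched as follows (a minimal illustration, not the paper's implementation; it reads "permutations" as orderings of distinct characters, which the abstract leaves ambiguous, and uses a five-letter toy alphabet instead of the full character set):

```python
import itertools
import random

def bag_of_meaningless_words(alphabet, lengths):
    """All orderings of distinct characters from `alphabet` for each
    requested word length: one reading of the abstract's
    'bag of meaningless words'."""
    return ["".join(p)
            for n in lengths
            for p in itertools.permutations(alphabet, n)]

def meaningless_text_line(bag, n_words, rng):
    """A training text line: `n_words` words drawn at random."""
    return " ".join(rng.choice(bag) for _ in range(n_words))

rng = random.Random(0)
bag = bag_of_meaningless_words("abcde", lengths=(3, 4))
# 5P3 + 5P4 = 60 + 120 = 180 meaningless words
line = meaningless_text_line(bag, n_words=5, rng=rng)
```

Each generated line would then be rendered into an image in the historical font, giving (image, text) pairs for LSTM training without any real transcription.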
What initial results have we achieved?
To evaluate the LSTM model trained on meaningless data, we generated 300 text-line images from real transcriptions consisting of meaningful/real words of lengths 3 and 4. For this purpose, we first prepared a word list from all words of lengths 3 and 4 and then combined them into text lines of up to 10 words per line. The results are shown in Table 1 and Figure 3.
Conclusion:
In this paper, we have first introduced a novel process for automatically generating synthetic training data for non-existing historical scripts, i.e. meaningless text. We have then shown the performance of the meaningless-trained model on real-world test samples. Even though we trained the meaningless model on only a limited range of word lengths, the initial results on meaningful/real text data are promising.
References
T. M. Breuel, A. Ul-Hasan, M. Al Azawi, F. Shafait. High Performance OCR for Printed English and Fraktur using LSTM Networks. In ICDAR, Washington D.C., USA, August 2013.
T. Karayil, A. Ul-Hasan, and T. M. Breuel. A Segmentation-Free Approach for Printed Devanagari Script Recognition. In ICDAR, Nancy, France, 2015.
F. Simistira, A. Ul-Hasan, V. Papavassiliou, B. Gatos, V. Katsouros, and M. Liwicki. Recognition of Historical Greek Polytonic Scripts Using LSTM Networks. In ICDAR, Nancy, France, 2015.
A. Ul-Hasan, S. B. Ahmed, S. F. Rashid, F. Shafait, and T. M. Breuel. Offline Printed Urdu Nastaleeq Script Recognition with Bidirectional LSTM Networks. In ICDAR, pages 1061–1065, Washington D.C., USA, 2013.
A. Ul-Hasan, T. M. Breuel. Can we Build Language Independent OCR using LSTM Networks? In International Workshop on Multilingual OCR, 2013.
Contacts
Maurizio Aceto Dipartimento di Scienze e Innovazione Tecnologica (DISIT) Università degli Studi del Piemonte Orientale, Viale Teresa Michel, 11 15121 Alessandria, Italy eMail: maurizio.aceto@mfn.unipmn.it. Angelo Agostino Department of Chemistry University of Turin N: 7, ST: P. Giuria, CAP 10125, Turin, Italy eMail: angelo.agostino@unito.it Nurgül Akcebe The Department of Manuscript Conservation and Archive (Kitap Şifahanesi) Manuscripts Institution of Turkey Kanuni Medresesi Sok. No: 1 Fatih 34080 İstanbul, Turkey eMail: nurgulakcebe@gmail.com Fauzia Albertin Faculté des sciences de base Ecole Polytechnique Fédérale de Lausanne (EPFL) CH‐1015 Lausanne, Switzerland eMail: fauzia.albertin@epfl.ch Celena Allen Synchromedia Laboratory École de Technologie Supérieure Montreal, Canada, H3C 1K3 eMail: celena.allen@gmail.com Sumaya S. Ali Al‐ma’adeed Department of Computer Science and Engineering Qatar University, P.O. Box 2713, Doha, Doha, Qatar eMail: s_alali@qu.edu.qa Christine Andraud MNHN‐CRCC 36 rue Geoffroy St Hilaire, 75005 Paris, France eMail: christine.andraud@mnhn.fr Ehsan Arabnejad Synchromedia Laboratory École de Technologie Supérieure Montreal, Canada, H3C 1K3 eMail: earabnejad@synchromedia.ca
Isabelle Aristide‐Hastir Archives Nationales 59, rue Guynemer, 93383 Pierrefitte‐sur‐Seine, France Corneliu T.C. Arsene School of Arts, Languages and Cultures University of Manchester United Kingdom eMail: corneliu.arsene@manchester.ac.uk Matteo Bettuzzi Centro Fermi, 00184 Roma, Italy Dipartimento di Fisica e Astronomia, Università di Bologna, 40127 Bologna, Italy INFN Sezione di Bologna, 40127 Bologna, Italy eMail: matteo.bettuzzi@unibo.it Siam Bhayro Department of Theology and Religion University of Exeter United Kingdom eMail: s.bhayro@exeter.ac.uk Marina Bicchieri Head of Chemistry Dpt. Istituto Centrale Restauro e Conservazione del Patrimonio Archivistico e Librario (Icrcpal) Via Milano 76, 00184 Roma, Italy eMail: marina.bicchieri@beniculturali.it Théodore Bluche A2iA 39 rue de la Bienfaisance, 75008 Paris, France eMail: tb@a2ia.com Rosa Brancaccio Centro Fermi, 00184 Roma, Italy Dipartimento di Fisica e Astronomia, Università di Bologna, 40127 Bologna, Italy INFN Sezione di Bologna, 40127 Bologna, Italy eMail: rosa.brancaccio@unibo.it Antonella Brita Universität Hamburg Centre for the Study of Manuscript Cultures Warburgstraße 26, 20354 Hamburg, Germany eMail: antonella.brita@uni‐hamburg.de Christian Brockmann Universität Hamburg Institut für Griechische und Lateinische Philologie Von‐Melle‐Park 6, D‐20146 Hamburg, Germany Centre for the Study of Manuscript Cultures Warburgstraße 26, D‐20354 Hamburg, Germany eMail: christian.brockmann@uni‐hamburg.de
Maddalena Bronzato Federchimica via Giovanni da Procida 11, 20149, Milano, Italy eMail: maddalena.bronzato@gmail.com Emmanuel Brun ESRF—The European Synchrotron Radiation Facility 71 Avenue des Martyrs, 38000 Grenoble, France Inserm, U836, Grenoble, F‐38043, France eMail: emmanuel.brun@esrf.fr Syed Saqib Bukhari German Research Center for Artificial Intelligence Kaiserslautern eMail: saqib.bukhari@dfki.de Jean‐Christophe Burie Laboratoire Informatique Image Interaction (L3i) University of La Rochelle, Avenue Michel Crépeau, 17042 La Rochelle Cedex 1, France eMail: jcburie@univ‐lr.fr Pınar Çakar Manuscripts Institution of Turkey Department of Manuscript Conservation and Archive Kanuni Medresesi Sok. No: 1 Suleymaniye Fatih, 34116 Istanbul, Turkey eMail: pinarcakar00@yahoo.com Frederica Cappa
Institute of Science and Technology in Art
Academy of Fine Arts, 1010 Vienna, Austria
eMail: f.cappa@akbild.ac.at
Francesco Carillo Department of Electrical Engineering and Applied Mathematics@University of Salerno 132 Fisciano (Salerno), Italy eMail: f.carillo1@hotmail.it D. Chenouni LIPI/ ENS, Fes, Morocco eMail: d_chenouni@yahoo.fr Mohamed Cheriet Department of Automated Manufacturing Engineering, ETS University of Quebec 1100, Notre‐Dame Street‐ west, Montreal, Quebec H3C 1K3; Canada eMail: mohamed.cheriet@etsmtl.ca
Damian Chlebda Jagiellonian University Faculty of Chemistry Ingardena 3, 30‐060 Krakow, Poland eMail: damian.chlebda@uj.edu.pl Florence Cloppet LIPADE Laboratoire d’informatique Paris Descartes (Université Paris Descartes) France eMail: florence.cloppet@mi.parisdescartes.fr Rafi Cohen Department of Computer Science Ben‐Gurion University of the Negev, Israel eMail: rafico@cs.bgu.ac.il Claudia Colini Universität Hamburg Centre for the Study of Manuscript Cultures Warburgstraße 26, 20354 Hamburg, Germany eMail: claudia.sirim@gmail.com Daniel Deckers Universität Hamburg Institut für Griechische und Lateinische Philologie Von‐Melle‐Park 6, D‐20146 Hamburg, Germany eMail: daniel.deckers@uni‐hamburg.de Daniel Delattre CNRS‐IRHT‐Institut de Recherche et d’Histoire des Textes 40 Avenue d' Iéna, 75116 Paris, France eMail: daniel.delattre@irht.cnrs.fr Martin Delhey Universität Hamburg Centre for the Study of Manuscript Cultures Warburgstraße 26, 20354 Hamburg, Germany eMail: martin.delhey@uni‐hamburg.de Andreas Dengel University of Kaiserslautern German Research Center for Artificial Intelligence eMail: andreas.dengel@dfki.de Véronique Eglin LIRIS Laboratoire d’Informatique en Image et Systèmes d'information (INSA de Lyon – UMR 5205) 20 av. Albert Einstein, 69621 Lyon, France eMail: veronique.eglin@insa‐lyon.fr Y. Elfakir LIPI/ ENS Fes, Morocco eMail: elfakir.youssef11@gmail.com
Gayane Eliazyan Restoration Dept. of Matenadaran Museum of Yerevan, Armenia, 0009 Yerevan, Mashtotsi Ave., 53, Armenia eMail: elgayane@yahoo.com Jihad El‐Sana Department of Computer Science Ben‐Gurion University of the Negev, Israel eMail: el‐sana@cs.bgu.ac.il Reza Farrahi Moghaddam Synchromedia Lab, ETS UduQ, Montreal, QC, Canada H3C 1K3 eMail: imriss@yahoo.com Carlo Federici Department of Humanistic Studies University of Venice Ca’ Foscari Dorsoduro 3484/D, 30123 Venezia, Italy eMail: cfederici@unive.it Claudio Ferrero ESRF—The European Synchrotron Radiation Facility 71 Avenue des Martyrs, 38000 Grenoble, France eMail: ferrero@esrf.eu Andreas Fischer DIVA research group, Department of Informatics University of Fribourg 1700 Fribourg, Switzerland. eMail: andreas.fischer@unifr.ch Gernot A. Fink Department of Computer Science TU Dortmund University Otto‐Hahn‐Str. 8, D‐44221 Dortmund, Germany eMail: gernot.fink@udo.edu Michael Friedrich Universität Hamburg Asien‐Afrika‐Institut Edmund‐Siemers‐Allee 1, Flügel Ost, D‐20146 Hamburg, Germany Centre for the Study of Manuscript Cultures Warburgstraße 26, D‐20354 Hamburg, Germany eMail: michael.friedrich@uni‐hamburg.de Bernadette Frühmann Institute of Science and Technology in Art Academy of Fine Arts Vienna Schillerplatz 3, 1010 Vienna, Austria eMail: b.fruehmann@akbild.ac.at
Basilis Gatos Computational Intelligence Laboratory Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, Athens, Greece eMail: bgat@iit.demokritos.gr Angelika Garz University of Fribourg DIVA Group (Document, Image and Voice Analysis) Department of Informatics Boulevard de Pérolles 90, CH‐1700 Fribourg, Switzerland eMail: angelika.garz@unifr.ch Mirjam Geissbühler University of Bern Institut für Germanistik Länggasstr. 49, CH‐3000 Bern, Switzerland eMail: mirjam.geissbuehler@germ.unibe.ch Leif Glaser Deutsches Elektronen‐Synchrotron Notkestr. 85, D‐22607 Hamburg, Germany eMail: Leif.Glaser@desy.de Monica Gulmini Department of Chemistry University of Turin N: 7, ST: P. Giuria, CAP: 10125 Turin, Italy eMail: monica.gulmini@unito.it Oliver Hahn BAM Federal Institute for Materials Research and Testing, Berlin, Germany Division 4.5 Unter den Eichen 44‐46, D‐12203 Berlin, Germany Universität Hamburg Centre for the Study of Manuscript Cultures Warburgstraße 26, D‐20354 Hamburg, Germany eMail: oliver.hahn@bam.de Stefan Hartmann Center for X‐ray Analytics Swiss Federal Laboratories for Materials Science and Technology Dubendorf, Switzerland Rachid Hedjam Department of Geography, McGill University 805 Sherbrooke Street West, Montreal, Qc H3A 2K6; Canada eMail: rachid.hedjam@mcgill.ca
Aymeric Histace ETIS, UMR CNRS 8051 6, avenue du Ponceau, 95014 Cergy‐Pontoise, France eMail: aymeric.histace@u‐cergy.fr Fabian Hollaus Computer Vision Lab Vienna University of Technology Favoritenstrasse 9‐11, 1040 Vienna, Austria eMail: holl@caa.tuwien.ac.at Rolf Ingold DIVA research group, Department of Informatics University of Fribourg 1700 Fribourg, Switzerland eMail: rolf.ingold@unifr.ch Iwan Jerjen Swiss Light Source Paul‐Scherrer‐Institute Villigen, Switzerland eMail: iwan.jerjen@psi.ch Tobias J. Jocham Corpus Coranicum Berlin‐Brandenburgische Akademie der Wissenschaften Potsdam, Germany College de France eMail: jocham@bbaw.de Margaret Kalacska Department of Geography McGill University 805 Sherbrooke Street West, Montreal, Qc H3A 2K6; Canada eMail: margaret.kalacska@mcgill.ca Rolf Kaufmann Center for X‐ray Analytics Swiss Federal Laboratories for Materials Science and Technology Dubendorf, Switzerland eMail: rolf.kaufmann@empa.ch Klara Kedem Department of Computer Science Ben‐Gurion University of the Negev Israel eMail: klara@cs.bgu.ac.il
Yeghis Keheyan ISMN, CNR Dept. of Chemistry University of Rome “La Sapienza” p.le Aldo Moro 5, Rome 00185, Italy eMail: yeghis.keheyan@uniroma1.it Florian Kergourlay MNHN‐CRCC 36 rue Geoffroy St Hilaire, 75005 Paris, France eMail: florian.kergourlay@gmail.com Christopher Kermorvant TEKLIA 164 avenue de Suffren, 75015 Paris, France eMail: kermorvant@teklia.com Ghizlane Khaissidi LIPI/ ENS Fes, Morocco eMail: ghizlane.derkaoui1@hotmail.com Anastasios L. Kesidis Department of Surveying Engineering, Technological Educational Institution of Athens Greece eMail: akesidis@iit.demokritos.gr Made Windu Antara Kesiman Laboratoire Informatique Image Interaction (L3i) University of La Rochelle Avenue Michel Crépeau, 17042, La Rochelle Cedex 1, France eMail: made_windu_antara.kesiman@univ‐lr.fr Mojtaba Mahmoudi Khorandi Department of Chemistry University of Turin (Italy) N: 7, ST: P. Giuria, CAP: 10125, Turin, Italy eMail: mojt.mahmoudi@gmail.com Ayşegül Kocaman The Department of Manuscript Conservation and Archive (Kitap Şifahanesi) Manuscripts Institution of Turkey Kanuni Medresesi Sok. No: 1 Fatih 34080 İstanbul, Turkey eMail: aysegul_80@yahoo.com Thomas Konidaris Universität Hamburg Centre for the Study of Manuscript Cultures Warburgstraße 26, D‐20354 Hamburg, Germany eMail: thomas.konidaris@uni‐hamburg.de
Keith T. Knox Imaging Consultant 2739 Puu Hoolai Street, Kihei, Hawaii 96753, USA eMail: knox@cis.rit.edu Z. Lakhliai LIPI/ ENS Fes, Morocco Bertrand Lavédrine MNHN‐CRCC 36 rue Geoffroy St Hilaire 75005 Paris, France eMail: lavedrin@mnhn.fr Yann Leydier LIRIS Laboratoire d’Informatique en Image et Systèmes d'information (INSA de Lyon – UMR 5205) 20 av. Albert Einstein, 69621 Lyon, France LIPADE Laboratoire d’informatique Paris Descartes (Université Paris Descartes) France eMail: yann@leydier.info Rosine Lheureux Archives Nationales 59, rue Guynemer 93383 Pierrefitte‐sur‐Seine, France eMail: rosine.lheureux@culture.gouv.fr Tomasz Łojewski AGH University of Science and Technology Faculty of Materials Science and Ceramics Mickiewicza 30, 30‐059 Krakow, Poland eMail: lojewski@agh.edu.pl Vito Lorusso Universität Hamburg Centre for the Study of Manuscript Cultures Warburgstraße 26, 20354 Hamburg, Germany eMail: vito.lorusso@uni‐hamburg.de Barbara Łydżba‐Kopczyńska Faculty of Chemistry University of Wroclaw F. Joliot‐Curie 14, 50‐383 Wroclaw, Poland eMail: barbara.lydzba@chem.uni.wroc.pl Angelo Marcelli Department of Electrical Engineering and Applied Mathematics@University of Salerno 132 Fisciano (Salerno), Italy eMail: amarcelli@unisa.it
Michael Josef Marx Corpus Coranicum Berlin‐Brandenburgische Akademie der Wissenschaften Potsdam, Germany eMail: marx@bbaw.de Manfred Mayer University Library Graz, Special Collections Universitätsplatz 3a, 8010 Graz, Österreich eMail: manfred.mayer@uni‐graz.at Anne Michelin Muséum National d’Histoire Naturelle‐CRCC 36 rue Geoffroy St Hilaire, 75005 Paris, France eMail: anne.michelin@gmail.com Heinz Miklas Institute of Slavic Studie University of Vienna Spitalgasse 2 Hof 3, 1090 Vienna, Austria eMail: heinz.miklas@univie.ac.at Vito Mocella The Institute for Microelectronics and Microsystems (IMM) of CNR via Pietro Castellino, 111 , 80131 Napoli, Italy eMail: vito.mocella@na.imm.cnr.it Maria Pia Morigi Centro Fermi, 00184 Roma, Italy Dipartimento di Fisica e Astronomia Università di Bologna, 40127 Bologna, Italy INFN Sezione di Bologna, 40127 Bologna, Italy eMail: mariapia.morigi@unibo.it Mostafa Mrabti LIPI/ ENS Fes, Morocco eMail: mostafa.mrabti@yahoo.fr Hossein Ziaei Nafchi Synchromedia Laboratory École de Technologie Supérieure Montreal, Canada, H3C 1K3 eMail : hossein.zi@synchromedia.ca Luca Nodari IENI‐CNR and INSTM UdR of Padova Corso Stati Uniti 4, 35127, Padova, Italy eMail: nodari@ieni.cnr.it.
Denis Nosnitsin
Universität Hamburg
Hiob Ludolf Center for Ethiopian Studies
Alsterterrasse 1, D‑20354 Hamburg, Germany
eMail: denis.nosnitsin@uni‑hamburg.de

Jean‑Marc Ogier
Laboratoire Informatique Image Interaction (L3i), University of La Rochelle
Avenue Michel Crépeau, 17042 La Rochelle Cedex 1, France
eMail: jean‑marc.ogier@univ‑lr.fr

Alessandra Patera
Swiss Light Source, Paul Scherrer Institute
Villigen, Switzerland
eMail: alessandra.patera@psi.ch

Eva Peccenini
Centro Fermi, 00184 Roma, Italy
Dipartimento di Fisica e Astronomia, Università di Bologna, 40127 Bologna, Italy
INFN Sezione di Bologna, 40127 Bologna, Italy
eMail: eva.peccenini@unibo.it

Peter E. Pormann
School of Arts, Languages and Cultures, University of Manchester
United Kingdom
eMail: peter.pormann@manchester.ac.uk

Boryana Pouvkova
Universität Hamburg
Centre for the Study of Manuscript Cultures
Warburgstraße 26, D‑20354 Hamburg, Germany
eMail: boryana.pouvkova@uni‑hamburg.de

Ira Rabin
BAM Federal Institute for Materials Research and Testing, Division 4.5
Unter den Eichen 44‑46, D‑12203 Berlin, Germany
Universität Hamburg, Centre for the Study of Manuscript Cultures
Warburgstraße 26, D‑20354 Hamburg, Germany
eMail: ira.rabin@bam.de

Claudia Rapp
Institute of Byzantine and Modern Greek Studies, University of Vienna
Postgasse 7, 1010 Vienna, Austria
eMail: c.rapp@univie.ac.at
Anna Rogulska
Faculty of Chemistry, Jagiellonian University
Ingardena 3, 30‑060 Kraków, Poland
eMail: rogulska@chemia.uj.edu.pl

Leonard Rothacker
Department of Computer Science, TU Dortmund University
Otto‑Hahn‑Str. 8, D‑44221 Dortmund, Germany
eMail: leonard.rothacker@udo.edu

Robert Sablatnig
Computer Vision Lab, Vienna University of Technology
Favoritenstrasse 9‑11, 1040 Vienna, Austria
eMail: sab@caa.tuwien.ac.at

Adolfo Santoro
Department of Electrical Engineering and Applied Mathematics, University of Salerno
Via Giovanni Paolo II 132, Fisciano (Salerno), Italy
eMail: adsantoro@unisa.it

Hamed Sayyadshahri
Department of Physics and Earth Sciences, University of Ferrara
Via Saragat 1, 44122 Ferrara, Italy
eMail: hamedsayyad.sh@gmail.com

Manfred Schreiner
Institute of Science and Technology in Art, Academy of Fine Arts Vienna
Schillerplatz 3, 1010 Vienna, Austria
eMail: m.schreiner@akbild.ac.at

William I. Sellers
School of Arts, Languages and Cultures, University of Manchester
United Kingdom
eMail: william.sellers@manchester.ac.uk

Mathias Seuret
DIVA Research Group, Department of Informatics, University of Fribourg
Boulevard de Pérolles 90, 1700 Fribourg, Switzerland
eMail: mathias.seuret@unifr.ch

Samia Snoussi
Faculty of Computing and Information Technology, Jeddah University, Saudi Arabia
eMail: samia.maddouri@enit.rnu.tn
Daniel Stökl Ben Ezra
Director of Studies, EPHE‑Sorbonne, Historical and Philological Sciences
Digital Humanities Officer at the EPHE
eMail: stoekl@mmsh.univ‑aix.fr

Peter A. Stokes
Department of Digital Humanities, King's College London
26‑29 Drury Lane, London, United Kingdom
eMail: peter.stokes@kcl.ac.uk

Dominique Stutzmann
CNRS – IRHT, Institut de Recherche et d'Histoire des Textes
40 Avenue d'Iéna, 75116 Paris, France
eMail: dominique.stutzmann@irht.cnrs.fr

Sebastian Sudholt
Department of Computer Science, TU Dortmund University
Otto‑Hahn‑Str. 16, 44221 Dortmund, Germany
eMail: sebastian.sudholt@udo.edu

Marina Toumpouri
Science and Technology in Archaeology Research Center (STARC), The Cyprus Institute
20 Konstantinou Kavafi Street, 2121 Aglantzia, Nicosia, Cyprus
eMail: m.toumpouri@cyi.ac.cy

Elaine Treharne
Synchromedia Laboratory, École de Technologie Supérieure
Montreal, Canada, H3C 1K3
eMail: treharne@stanford.edu

Adnan Ul‑Hasan
University of Kaiserslautern, Germany
eMail: adnan@cs.uni‑kl.de

Wilfried Vetter
Institute of Science and Technology in Art, Academy of Fine Arts Vienna
Schillerplatz 3, 1010 Vienna, Austria
eMail: w.vetter@akbild.ac.at

Nicole Vincent
LIPADE – Laboratoire d'Informatique Paris Descartes (Université Paris Descartes), France
eMail: nicole.vincent@mi.parisdescartes.fr
M. A. El Yacoubi
SAMOVAR, Télécom SudParis, CNRS, Université Paris‑Saclay, France
eMail: mounim.el_yacoubi@telecom‑sudparis.eu

Melania Zanetti
Department of Humanistic Studies, Ca' Foscari University of Venice
Dorsoduro 3484/D, 30123 Venezia, Italy
eMail: melania.zanetti@unive.it

Alfonso Zoleo
Department of Chemical Sciences, University of Padova
Via Marzolo 1, 35131 Padova, Italy
eMail: alfonso.zoleo@unipd.it
Index of Authors
A
Aceto, M. ∙ 13
Agostino, A. ∙ 13
Akcebe, N. ∙ 73
Albertin, F. ∙ 31
Albritton, B. L. ∙ 41
Allen, C. ∙ 41
Al‐Ma’adeed, S. ∙ 65, 79, 85
Andraud, C. ∙ 59
Arabnejad, E. ∙ 41
Aristide‐Hastir, I. ∙ 59
Arsene, C. ∙ 29
B
Bettuzzi, M. ∙ 31
Bhayro, S. ∙ 29
Bicchieri, M. ∙ 11
Bluche, T. ∙ 51
Brancaccio, R. ∙ 31
Brita, A. ∙ 23
Brockmann, C. ∙ 25
Bronzato, M. ∙ 75
Brun, E. ∙ 57
Bukhari, S. S. ∙ 105
Burie, J. C. ∙ 93
C
Çakar, P. ∙ 15
Cappa, F. ∙ 83
Carillo, F. ∙ 69
Chenouni, D. ∙ 43
Cheriet, M. ∙ 41, 65, 79, 85
Chlebda, D. ∙ 37
Cloppet, F. ∙ 51
Cohen, R. ∙ 45
Colini, C. ∙ 77
D
Deckers, D. ∙ 25
Delattre, D. ∙ 57
Delhey, M. ∙ 21
Dengel, A. ∙ 105
E
Eglin, V. ∙ 51
El Yacoubi, M. A. ∙ 43
Elfakir, Y. ∙ 43
Eliazyan, G. ∙ 91
El‑Sana, J. ∙ 45
F
Federici, C. ∙ 75
Ferrero, C. ∙ 57
Fink, G. A. ∙ 49
Fischer, A. ∙ 61
Frühmann, B. ∙ 35, 83
G
Garz, A. ∙ 61
Gatos, B. ∙ 53
Geissbühler, M. ∙ 19
Glaser, L. ∙ 25
Gulmini, M. ∙ 13
H
Hahn, O. ∙ 77
Hartmann, S. ∙ 31
Hedjam, R. ∙ 65, 85
Histace, A. ∙ 59
Hollaus, F. ∙ 35
I
Ingold, R. ∙ 61
J
Jerjen, I. ∙ 31
Jocham, T. J. ∙ 89
K
Kalacska, M. ∙ 65, 85
Kaufmann, R. ∙ 31
Keheyan, Y. ∙ 91
Kergourlay, F. ∙ 59
Kermorvant, C. ∙ 51
Kesidis, A. L. ∙ 53
Kesiman, M. ∙ 93
Khaissidi, G. ∙ 43
Khorandi, M. M. ∙ 13
Knox, K. T. ∙ 27
Kocaman, A. ∙ 95
Konidaris, T. ∙ 53
L
Lakhliai, Z. ∙ 43
Lavédrine, B. ∙ 59
Leydier, Y. ∙ 51
Lheureux, R. ∙ 59
Łojewski, T. ∙ 37
Lorusso, V. ∙ 33
Łydżba‐Kopczyńska, B. ∙ 97
M
Marcelli, A. ∙ 69
Marx, M. J. ∙ 89
Mayer, M. ∙ 17
Michelin, A. ∙ 59
Miklas, H. ∙ 35
Mocella, V. ∙ 57
Moghaddam, R. F. ∙ 79
Morigi, A. P. ∙ 31
Mrabti, M. ∙ 43
N
Nafchi, H. Z. ∙ 41
Nodari, L. ∙ 75
Nosnitsin, D. ∙ 23
O
Ogier, J.‐M. ∙ 93
P
Patera, A. ∙ 31
Peccenini, E. ∙ 31
Pormann, P. E. ∙ 29
Pouvkova, B. ∙ 33
R
Rabin, I. ∙ 77
Rapp, C. ∙ 35
Rogulska, A. ∙ 97
Rothacker, L. ∙ 49
S
Sablatnig, R. ∙ 35
Santoro, A. ∙ 69
Sayyadshahri, H. ∙ 13
Schreiner, M. ∙ 35, 83
Sellers, W. I. ∙ 29
Seuret, M. ∙ 61
Snoussi, S. ∙ 99
Stökl Ben Ezra, D. ∙ 101
Stokes, P. A. ∙ 39
Stutzmann, D. ∙ 51
Sudholt, S. ∙ 49
T
Toumpouri, M. ∙ 103
Treharne, E. ∙ 41
U
Ul‐Hasan, A. ∙ 105
V
Vetter, W. ∙ 35, 83
Vincent, N. ∙ 51
Z
Zanetti, M. ∙ 75
Zoleo, A. ∙ 75
Centre for the Study of Manuscript Cultures (CSMC)
Warburgstraße 26
20354 Hamburg, Germany
Tel.: +49-(0)40-42838-7127
Fax: +49-(0)40-42838-4899
manuscript-cultures@uni-hamburg.de
www.manuscript-cultures.uni-hamburg.de