The Socio-Technological Integrator And Innovator

Digging Deeper into Data Processing with Emphasis on Compositional and Microstructure Data: Machine Learning in Support of Archaeological Analysis

Liza Charalambous [email protected]

[email protected]

Overview

1. Introduction: Archaeological Process; Data in many forms and types
2. Part I: Compositional Data; Pre-Processing Practices; Case Study & Experimental Results
3. Part II: Microstructure Data; Microstructure Analysis; Pattern Recognition for the Characterization of Microstructure Data
4. Data Analysis Remarks; Data Idiosyncrasies

Profile and Background

Real-time Monitoring

Communication systems

Security and Error Protection systems

Research interests and Background

Digital Signal Processing

Artificial Intelligence

Machine Learning

Audio Coding

PhD student in Computer Engineering at University of Cyprus

In cooperation with the KIOS Research Center for Intelligent Systems and Networks

NARNIA ITN ESR08 (starting date 01/11/2011)

Educational Background

BSc in IT and Multimedia Communications (2007-2010), Lancaster University, UK

MSc in Digital Signal Processing and Intelligent Systems (2010-2011), Lancaster University, UK

Archaeological Data

Then:
SKETCHES
STRATIGRAPHY LOGS
PETROGRAPHIC ANALYSIS

Now:
RELATIONAL DATABASES
DIGITAL REPRESENTATIONS
3D RECONSTRUCTIONS
ELEMENTAL CONCENTRATIONS
SPECTRA

ARCHAEOLOGICAL PROCESS: Steps

1. Form Archaeological Question: analyze the objectives. What needs to be proved?
2. Gather Samples/Artifacts: determine and gather the artifacts of interest (based on the previously formed question)
3. Technologies: list the available technologies for deployment and analyze their effectiveness
4. Available Methods: list the available analysis methods compatible with the selected technology
5. Data Analysis: apply clustering/classification algorithms so as to increase data manageability
6. Interpretation of Results

ARCHAEOLOGICAL PROCESS: Available Methods

“Too much and overly complicated data”

Data analysis in archaeology is sometimes believed to take the form of:
Simple projection of data (one feature against another)
Employment of very simple clustering or other dimensionality reduction methods

Much attention is given to:
Sampling
Data preprocessing

This reflects the belief that good data will speak for themselves.
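As an illustration of the "very simple clustering" mentioned on this slide, the following is a minimal k-means sketch in plain Python. The data points, the starting centroids and the function name `kmeans` are hypothetical and chosen only for illustration; they are not taken from the presentation.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then update."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; an empty cluster keeps its centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[k]
            for k, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical two-feature measurements (e.g. two log-concentrations per sample).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (3.0, 3.0), (3.2, 2.8), (2.8, 3.2)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (4.0, 4.0)])
```

Fixed starting centroids keep the run reproducible; with random initialization the result can change between runs, which is one reason successive runs may be compared.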

ARCHAEOLOGICAL PROCESS: Technologies

Analysis comes in different forms and shapes. The result is usually in the form of:
Peak elemental measurements → as a result of spectrum analysis
Pictures or other schematic representations → commonly based on the sample’s microstructure

Each technology is dictated by its own characteristics; integration of multiple technologies may not always be beneficial.

Part I: Compositional Data

Cu Mn Mg Ca Ti

K Fe S Cr Al

Compositional data are defined as vectors of proportions with:
strictly positive components
a constant sum (a restriction not always maintained)

Chemical analysis is not really involved in measuring, but in enumerating, or counting, the number of atoms of each type in a sample

The results are usually given in relative numbers (usually in % or ppm).
a) elemental concentrations are frequencies of nominal or categorical classes (atoms) of a classificatory concept (matter)
b) chemistry is usually interested not in frequencies, but in relative frequencies.
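The step from frequencies to relative frequencies with a constant sum can be sketched as follows. This is a minimal illustration in plain Python; the function name `closure` and the atom counts are assumptions, not values from the presentation.

```python
def closure(parts, total=100.0):
    """Rescale strictly positive parts so that they sum to `total`."""
    if any(p <= 0 for p in parts):
        raise ValueError("compositional parts must be strictly positive")
    s = sum(parts)
    return [total * p / s for p in parts]

# Hypothetical atom counts for three elements in one sample.
counts = [120.0, 60.0, 20.0]
percent = closure(counts)  # relative frequencies in %, summing to 100
```

Passing `total=1e6` would express the same composition in ppm.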

Part I: Pre-Processing Practices

General Belief:
The more precise and accurate the bulk chemical determinations, the better the chance of making more plausible and refined estimations.

Reproducibility and comparability of results are commonly assured by adopting one of the following practices:
a) Transformation of the relative concentrations into base-10 logarithms
b) Sub-compositional data: the dataset of interest contains only proportions of the components constituting a sample
c) Calculation of averages
d) Elimination of chemical elements dominated by noisy readings or incomplete measurements
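Practices a), c) and d) above can be sketched in a few lines of plain Python. The helper names, the element symbols and the readings are hypothetical; `None` is assumed here to mark a missing reading, which is only one possible convention.

```python
import math

def log10_transform(sample):
    """Practice a): base-10 logarithm of each concentration (values must be > 0)."""
    return {el: math.log10(value) for el, value in sample.items()}

def average_replicates(replicates):
    """Practice c): element-wise mean over repeated readings of one artifact."""
    n = len(replicates)
    return {el: sum(r[el] for r in replicates) / n for el in replicates[0]}

def drop_incomplete(samples):
    """Practice d): keep only elements that have a reading in every sample."""
    complete = [el for el in samples[0]
                if all(s.get(el) is not None for s in samples)]
    return [{el: s[el] for el in complete} for s in samples]

# Hypothetical replicate readings in ppm; None marks a missing reading.
readings = [
    {"Fe": 1000.0, "Cr": 100.0, "Sb": None},
    {"Fe": 1200.0, "Cr": 80.0, "Sb": 1.0},
]
cleaned = drop_incomplete(readings)          # "Sb" is removed
logged = [log10_transform(s) for s in cleaned]
mean_sample = average_replicates(cleaned)
```

Incomplete elements are dropped before the logarithm is taken, since a missing value cannot be transformed.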

Part I: Ceramics Case Study & Experimental Results

Study the impact of pre-processing on datasets obtained from ceramics with the use of NAA (neutron activation analysis)

Investigations on the effect of the following parameters:
Raw vs. Log: transforming the raw data into the equivalent base-10 logarithm increased data separation (especially for the heterogeneous ceramics)
Sub-compositional data (with the addition of an extra column): did not significantly influence the outcome of the analysis; a practice currently deployed in the archaeology domain
Calculation of averages: reduced the variance of clusters between successive runs; particularly useful for the analysis of homogeneous material
Standardized and Normalized Data: no significant impact on the commonly used analysis methods
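For reference, standardization and normalization of a single element column might look like the sketch below. It assumes z-scores and min-max scaling, which are common readings of these terms; the slide does not state which definitions were used in the case study, and the column values are hypothetical.

```python
import statistics

def standardize(column):
    """Z-score: subtract the mean, divide by the (population) standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)  # zero for a constant column, which would fail
    return [(x - mean) / sd for x in column]

def normalize(column):
    """Min-max scaling of the column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical concentrations of one element across three samples.
column = [2.0, 4.0, 6.0]
z_scores = standardize(column)
scaled = normalize(column)
```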

Part II: Microstructure Data

Involves the study of silicate- and carbonate-based artifacts, which may be relatively unmodified from their original geological parent raw materials

Microstructure analysis is critical in extracting manufacturing knowledge

Can achieve resolution better than 1 nm

Can provide high-quality imaging together with quantitative elemental analysis, using an energy dispersive spectrometer

Part II: Microstructure Data Analysis

Classification by taking into consideration how ceramics are processed
Related to the impact on material durability

The nature of the ceramic microstructure, as a function of temperature, can be related to the composition of the clay source exploited

Issues that an archaeological scientist may need to address through SEM:
Characterization of origin material
Reconstruction of the technology involved in manufacture
Influence of the place of manufacture or source of raw materials
Changes that have occurred in the object during burial or storage

Part II: PR for the Characterization of Microstructure Data

Knowing the various nuances of materials and processing systems can be overwhelming and confusing

Microstructure Data:
Properties of crystals: average size; orientation/alignment; coarseness and depth of primitive elements
Vitrification stage: identification of crystals and degree of fusion
Porosity: spread of pores; shape/size

Evaluation of the sophistication of the firing process:
Estimation of annealing temperature
Degree of vitrification
Porosity / outer-connection of particles
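The porosity measures listed above (spread of pores, size) could be estimated from a grey-level micrograph roughly as follows. This is a sketch under stated assumptions: pores are taken to be the dark pixels below a hypothetical threshold of 50, pores are grouped by 4-connectivity, and the 4x4 image is made up for illustration.

```python
def pore_sizes(mask):
    """Pixel counts of 4-connected pore regions in a binary mask (row-scan order)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                stack, size = [(r, c)], 0
                seen[r][c] = True
                while stack:  # flood fill over one pore
                    y, x = stack.pop()
                    size += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                sizes.append(size)
    return sizes

# Hypothetical grey-level image; dark pixels are assumed to be pores.
image = [
    [200, 30, 30, 210],
    [190, 20, 220, 205],
    [215, 225, 230, 10],
    [40, 210, 200, 15],
]
THRESHOLD = 50  # assumed cut-off between pore and solid material
mask = [[pixel < THRESHOLD for pixel in row] for row in image]
sizes = pore_sizes(mask)
porosity = sum(sizes) / (len(image) * len(image[0]))  # pore area fraction
```

The area fraction is a unitless measure, while the individual pore sizes are in pixels and would need the image scale to be converted to physical units.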

Part II: PR for the Characterization of Microstructural Data

PIXEL POINT & GROUP PROCESSING

Edge-related operations:
Enhancement
Segmentation
Detection

Texture Analysis:
Co-occurrence matrix: captures numerical features which can be used to represent, compare, and classify textures
Auto- and cross-correlation: can be used to detect repetitive patterns of textures
Estimation of patch similarity: gives the ability to compare image regions
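A grey-level co-occurrence matrix, as named in the texture analysis list, can be built in plain Python as sketched below. The 4x4 image with four grey levels is a toy input, the horizontal offset is an arbitrary choice, and contrast is shown as just one example of a numerical feature derived from the matrix.

```python
def cooccurrence(image, levels, dx=1, dy=0):
    """Count how often grey level i has grey level j at offset (dx, dy)."""
    glcm = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                glcm[image[r][c]][image[r2][c2]] += 1
    return glcm

def contrast(glcm):
    """Mean squared grey-level difference over all counted pixel pairs."""
    total = sum(sum(row) for row in glcm)
    weighted = sum(count * (i - j) ** 2
                   for i, row in enumerate(glcm)
                   for j, count in enumerate(row))
    return weighted / total

# Toy image with grey levels 0..3 (hypothetical data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
glcm = cooccurrence(image, levels=4)  # right-hand neighbour pairs
texture_contrast = contrast(glcm)
```

A smooth texture concentrates counts on the diagonal of the matrix and gives a low contrast; a coarse or noisy one spreads them out.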

Promotion of unit-invariant measures:
Perforation
Shape Factors

SHAPE FACTORS
Aspect Ratio: a function of the largest and the smallest diameters perpendicular to each other
Circularity: a function of the perimeter and the area
Elongation: ratio of minor axis width to major axis length
Compactness: a measure of object roundness (area-to-perimeter ratio)
Waviness shape factor of the perimeter: often related to fracture toughness of metals and ceramics
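Three of the shape factors above can be written as simple ratios, as in the sketch below. The slide does not give formulas, so the definitions here are assumptions: circularity is taken as 4πA/P², a commonly used form, and aspect ratio as largest over smallest diameter. Compactness and waviness are left out because their definitions vary between sources.

```python
import math

def aspect_ratio(d_max, d_min):
    """Largest diameter divided by the smallest perpendicular diameter."""
    return d_max / d_min

def circularity(area, perimeter):
    """4*pi*A / P**2: equals 1 for a circle, smaller for less round shapes."""
    return 4.0 * math.pi * area / perimeter ** 2

def elongation(minor_axis, major_axis):
    """Minor axis width divided by major axis length."""
    return minor_axis / major_axis

# Reference shapes: a unit circle and a square with side 2.
circle_value = circularity(math.pi, 2.0 * math.pi)
square_value = circularity(4.0, 8.0)
grain_ratio = aspect_ratio(6.0, 2.0)  # hypothetical grain diameters
grain_elongation = elongation(2.0, 6.0)
```

All three are ratios of lengths or areas, so they do not depend on the measurement unit or on the image magnification.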

Data Analysis Remarks

Not all features should be treated equally
Artifacts are characterized by primary, secondary and supplementary elements

All artifacts, regardless of physical characteristics, are treated the same
Size, shape, texture, contamination, aperture upon exposure

Preprocessing steps and methodology
Preprocessing of the data usually complies with certain fixed procedures
The effectiveness of an analysis method may be greatly influenced by data preparation routines
Important to maintain consistency

Problems of Archaeological Data

“THE VALUE OF DATA IS GIVEN BY THE ABILITY TO EXTRACT INFORMATION.”

Scarce and incomplete data

High amounts of uncertainty and subjectivity

Characterised by high degrees of redundancy

Complex interactions between variables

Analysis of findings by different analysts using different technologies may turn out to be inaccurate and imprecise

DATA IDIOSYNCRASIES: Ceramics

Barely affected by alteration or deterioration during burial, they generally represent the original trading goods, as far as their material properties and composition are concerned.
Very helpful in providing classification among ceramic assemblages
Often give information about their provenance or origin of production

Ceramics of the same production series may reveal a characteristic elemental composition, usually distinct from that of ceramics from other production places or series, due to:
The geochemical diversity of raw material sources
The variation in the pottery manufacturing process

DATA IDIOSYNCRASIES: Metals

Prone to corrosion
Different raw materials corrode at different rates and to different degrees
Corrosion is not uniform
Assuming that the sample is representative is not always safe

Sampling requires cleaning the outer surface
Usually involves removing the outer coat
Issues with licensing

Due to the material’s flexibility, most metal objects are not flat

Alloys are challenging

DATA IDIOSYNCRASIES: Glass

Notoriously homogeneous
Very rarely found in large quantities

Highly fragile
Their usually thin structure makes artifact analysis a challenge
Artifacts in whole form are very rare

Contamination over time
Their analysis usually requires the use of acidic substances for cleaning the extra coating
Sometimes alters some of their characteristics
May run against legislative restrictions

“I am enough of an artist to draw freely upon my imagination.

Imagination is more important than knowledge.

Knowledge is limited. Imagination encircles the world.”

Albert Einstein

Thank you for your attention!

Comments and Questions are Welcome!