Slándáil Project Magazine 2015


Slándáil Magazine 2015


Security System for language and image analysis


www.slandail.eu

[email protected]

/slandail

@slandailfp7

Slándáil is a three-year project, funded by the European Union Seventh Framework Programme under grant agreement No. 607691 (“Slandail”)

Slándáil is a European project that is investigating the use of digital media in times of natural disaster and is equipping disaster management personnel with software services for harnessing social media for better disaster response.

We are in four countries, namely: Ireland, Northern Ireland (UK), Germany, and Italy. The aim is for the project to help increase the security of citizens and groups living in areas affected by natural disasters by improving the effectiveness and response time of disaster management teams.

Empowering Emergency Services through Social Media


Contents

Section One
A Message from the Coordinator: Prof. Khurshid Ahmad introduces Slándáil.

Section Two
Slándáil at a Glance: Our key aims and objectives.

Section Three
Meet the Team: The researchers behind the project.

Section Four
Research and Innovation: Understanding disaster zones, Protecting citizens, Technological development.

Section Five
Read all about it: An update on our communication activities and related events.


Section one | A Message from the Coordinator

“The goal of Project Slándáil is to ethically use social media information in times of natural crisis and natural disasters to better inform emergency services of the worst affected areas.” Prof. Khurshid Ahmad

The word ‘disruptive’ can describe both the fields of natural disaster management and social media. Natural disasters are disruptive, bringing with them damage, evacuations and unpredictable circumstances. Social media has changed even in the months that have passed since this project began. The nature of social networks and how people use the Internet is still evolving.

The project months have passed quickly, and the findings and developments of the project are broad. We have produced four reports from disaster management partners showing their use of social media, data and communications. We have constructed an ethical framework for social media in disaster management. We have assessed the tense relationship between saving lives and preserving privacy. Advanced information extraction has helped us to develop the use of text for disasters in a way that was not previously possible in our partners’ emergency management software.

Key topics have been identified as the project has progressed. Analysis of disaster images is our next challenge. Trust between emergency managers and the public has proven implications for evacuation warnings. Non-verbal communication can be assessed and analysed digitally. The legal and ethical implications of using social media are still being debated, and our computer science teams are developing software that responds to this delicate landscape. While our project aims to build a working emergency management system, these topics are leading to research that will extend beyond the end of the project.

With less than two years remaining there is still a lot of ground to cover. Current work involves testing our social media management software with the eventual end users. These tests are being run and will be recorded in late September.

I hope that this magazine is useful as an introduction to the work that we have done, and the work that we are going to do.


Section two | Slándáil at a Glance

The Slándáil approach to harnessing social media in emergency management (#smem) is to ethically use social media information about natural crises and natural disasters to better inform emergency management. Natural disasters do not respect borders or distinguish between citizens: in our first year we have collaborated across countries, languages and cultures. We work with a range of professional bodies and emergency response professionals in order to develop a technological solution (from advanced data analytics to multi-lingual and multi-cultural services) and new strategies (from communication protocols to ethical and legal guidelines) to aid in disaster management. This means that emergency responders can better plan for, respond to, and recover from natural disasters. Our mission is to improve citizens’ lives by developing software solutions for better disaster response.

Slándáil focuses on the management of disaster emergencies and specifically on how this management can be facilitated and expedited through the use of social media.

As a project, we are focused on the problem of flooding. We have looked at case studies of historical flooding disasters in Belfast, Dublin, London, Venice, and Saxony. Despite this specific focus, our technological solutions and research approach mean that the Slándáil system can subsequently be modified to deal with a range of other natural disasters (e.g. fires, earthquakes, storms).

Tackling real world problems from the ground up

The Slándáil team includes ‘end user’ partners, meaning emergency managers and response professionals. Our end users include An Garda Síochána (IRE), the Police Service of Northern Ireland (UK) and the Regional Command Saxony (DE). We are also collaborating with an external team at Protezione Civile de Veneto (IT). The active involvement of end users ensures that they are shaping the outputs of the project. With examples drawn from real operating environments, end users are giving the researchers and developers in the project a clear picture of how emergency managers truly operate, which gives us the ability to develop software solutions that are both applied and ratified for the real world.

Project Profile | Emergency response systems using social media


Emergencies as an EU-wide problem

Natural disasters do not respect borders or distinguish between citizens. The Slándáil system extracts information from the digital world (a global space where each of us engages in declarative living and often shares our experiences) during times of natural disasters. This means that data sources can range from individual citizens sharing pictures and text on Twitter to news agencies posting bulletin updates as the disaster unfolds. The Slándáil system will ultimately harness texts, images, and videos that are shared over the web and will deal with issues of language, cultural context, and their interrelationships. Every moment counts, since intelligence can reduce the possibility of harm, improve the response to the crisis, and establish what resources will be needed for the response. Thus, the support provided by the system will be on-the-fly as the disaster is happening, as well as offering insight after the disaster has happened (for reflection and subsequent regulation of plans for future floods).

Harnessing disaster zone information that is shared over social media

The Slándáil project will deliver a new form of Emergency Management System (EMS) that can aggregate information collected from digital media and deliver relevant messages/support to disaster management personnel. To date, EMS have been used in isolation from systems used in receiving and sending information through social media. Current use of social media in disaster management places the burden of search, interpretation, and communication (to and from the public and the disaster managers) on the officials. This means that timely and effective integration of (possibly lifesaving) information supplied over digital media is too difficult. There is a huge amount of extra work involved in making the best use of information for early warning, rescue and rehabilitation. Slándáil’s approach not only reduces the burden of interpretation, but will also engender better collaboration across borders and civil departments, and provide helpful guidance on how to better communicate with citizens.


The Key Problems to be Solved

Harnessing disaster zone information that is shared over social media

Emergency Management Systems (EMS) cannot yet properly harness disaster zone information that is shared over social media.

Emergencies as an EU-wide problem

A key concern is that information needs vary for different audiences, from first responders to defence authorities to citizens, across multiple languages, cultures, and modalities (text, images, video).

Societal issues, ethics and the law

It is essential that the rights of civilians and disaster operatives are protected.

Getting the technology right

Data processing, analysis, and reporting are tough challenges to solve when dealing with the overwhelmingly large social media milieu.

Our Key Aims and Objectives


Reflection on disaster events will better enable emergency managers and first responders to recover from and plan for the next flood.

Societal issues, ethics and the law

During a natural disaster there is a large volume of information shared on social media sites like Facebook and Twitter. Some of this information contains private data that could be used to identify individuals, although it is difficult to process all of this data. It is essential that the rights of civilians and disaster operatives are protected: both (i) the right to life and a safe living space before, during and after a natural disaster, and (ii) the right to privacy and protection of information in the publicly open forum of digital media.

A key concern is that information needs vary for different audiences, from first responders to defence authorities to citizens, across multiple languages, cultures, and modalities (text, images, video). The Slándáil system will work across three core languages (English, German and Italian) – this has meant that we have needed to harness and develop methods for aggregation of information from different modalities and languages.

The support provided by the system also happens at multiple stages. On-demand data will be provided in close to real-time during a disaster. However, communication through social media also continues after the emergency. Social media is used by individuals and communities to inform friends and family of their status, for storytelling, and often results in wide scale republishing of data.

The Slándáil system will work with three languages - English, German and Italian.

Getting the technology right

A major contribution of the Slándáil project will ultimately be our new software system - an advanced Emergency Management System (EMS). Data processing, analysis, and reporting are tough challenges to solve when dealing with the overwhelmingly large social media milieu. Navigating the collections of data is difficult due to a lack of clear indicators with which to monitor and constructively respond to information, not to mention issues of subterfuge and false data arising from rumours spreading on a vast scale.

It is clear that the development of our Slándáil system is complex, requiring collaboration between technological experts from across Europe. This includes experts in linguistics and text analytics, speech and communications, image and video analysis, and media annotation, as well as software development organizations capable of developing intelligent systems.


Section three | Meet the Team

European Presence

Trinity College Dublin: Prof. Khurshid Ahmad, https://www.tcd.ie/

Institut für Angewandte Informatik (INFAI) e.V.: Prof. Gerhard Heyer, http://infai.org/

University of Ulster: Prof. Bryan W. Scotney, http://www.compeng.ulster.ac.uk/csri.php

Università di Padova (UNIPD): Prof. Maria Teresa Musacchio, http://www.unipd.it/

CID GmbH: Alexander Lörch, http://cid.com/

Stillwater Communications Ltd.: Cilian Fennell, http://stillwater.ie/

Police Service of Northern Ireland (PSNI): Una Williamson, http://www.psni.police.uk/

DataPiano S.r.l.: Francesco Russo, http://www.datapiano.it/

Military disaster prevention in Saxony: Stephan Seeger, http://www.bundeswehr.de/

Pintail Ltd.: Ciaran Clissmann, http://www.pintail.eu/

An Garda Síochána: Eamon O’Loughlin, http://garda.ie/

11 partners* in 4 EU countries tackling 3 languages.


* Previously partnered with CIES http://cies.ie/


Section four | Research and Innovation

Year 1 Accomplishments

Slándáil is developing a new type of Emergency Management System (EMS). It will not only aggregate information collected from social media, but will also deliver the information in a form that is accessible through language analysis, annotation, and visualization. It will provide disaster managers with support on how to deal with the information by providing guidance on how to communicate with citizens. All this will work across three languages (English, German and Italian).

The result will ultimately be an increase in the efficiency and effectiveness of disaster management personnel and their ability to harness digital media, saving people and property in times of natural disaster and planning better for future events.

Our research and development has three main objectives that are the core progress markers of the project:

Protocols to protect citizens

We are developing new rules and guidelines to ensure that rights of the citizens are sufficiently protected and to better manage the confidentiality of the collected data and processed information relating to individual citizens.

We have:
• Designed an ethical framework
• Combined legal and ethical understanding of data collection
• Created an intrusion index

Disaster zone information

To date, we have collected and reviewed case studies and historical data about how disaster zone information is used and shared by experts tasked to improve security of citizens and property.

We have:
• Collected and shared this information
• Captured the technology requirements of end users
• Demonstrated EMS to end users

Technological development

The team are also building and testing a prototype Emergency Management System (EMS) for collecting, processing, aggregating and disseminating information for disaster emergency management.

We have:
• Developed an information extraction framework
• Published a disaster terminology wiki
• Developed Prototype 0


Disaster Zone Information – Collected and shared

As the traditional collaborative model for emergency management is ‘face-to-face’, it cannot yet fully utilize the social aspect of disaster information exchange and the mechanism with which social media enables citizens and emergency responders to share and collaborate. We have looked at how emergency responders currently use social media during times of natural disaster, the current technologies they have available to them, their emergency management procedures, and the level of social media usage among end users. With examples drawn from a real operating environment, end users have given us a clear picture of how they operate currently.

What is emergency management?

Emergency management is the process of preparing for, mitigating, responding to, and recovering from an emergency. The traditional collaborative model for emergency management is ‘face-to-face’. In contrast, social media is already transforming how people share and collaborate during times of natural disasters. Social media is more efficient and flexible than a wide range of traditional communication approaches.

Effective emergency action can avoid the escalation of an event into a disaster. This involves plans and institutional arrangements to engage and guide the efforts of government, non-government, voluntary and private agencies in comprehensive and coordinated ways to respond to the entire spectrum of emergency needs.


Disaster Zone Case Studies

Through collaboration with our end users, we have created an archive of emergency management case studies and have studied disaster events using social media data. Each case focuses on a disaster or emergency from the perspective of the end user.

A PSNI report on the threat of major coastal flooding - January 2014

The PSNI case study covers the serious threat of coastal flooding during the winter of 2013/14, which had the potential to seriously impact on a built-up area of Belfast city. Overall, the case studies revealed that there was no consistent use of social media in disaster management by the end user agencies, but it was recognised that social media could play a part in disaster management, in resource deployment, communication, and decision making.

An Garda Síochána’s case study on flooding on the M11 - March 2013

The Garda Síochána case study details the incident of widespread flooding on the M11, which is a major arterial route between Dublin city and the East Coast of Ireland. There were two serious and separate incidents of flooding at the same location, which seriously affected traffic during the morning rush hour and again in the evening. The evening event effectively resulted in a complete blockage of all routes exiting Dublin City to the East and consequently to the South.

The District Liaison Command Leipzig’s account of the Saxony Floods - June 2013

The Liaison Command Leipzig example covers the serious flooding in Saxony in June 2013. Following a series of thunderstorms, a string of rivers and streams in Saxony broke their banks.


Historical Data Analysis

Social media corpus analysis

A number of social media corpora have been processed using the TCD CiCui system, including Hurricane Sandy and the 2008 earthquake in England. The impact of Hurricane Sandy was examined by assessing the surrounding social media messages from Facebook and Twitter during its impending landfall and dissipation. Terms were tagged according to their part of speech and a set of 27,000 nouns was extracted, of which some 1,200 terms were hand labelled and used to create an Intrusion Index. Terms were classed as referencing an institution, an event, a person, or a place, among others.

Another corpus was collected of testimonials, crowdsourced from the public by the European-Mediterranean Seismological Center (EMSC), about their experiences of a 4.8 magnitude earthquake that occurred in England in 2008. The eyewitness reports described the effect and intensity of the earthquake and the witnesses’ experience of it. The varying impact correlated with their exposure and aligned with the intuitive understanding of how people are affected during disaster and crisis events.

INFAI also examined how social media were used during the 2013 flood in Central Europe and what differences in use appeared among different kinds of social media. We found that Twitter played its most important part in the exchange of current and factual information on the state of the event, while Facebook was predominantly used for emotional support and the organization of volunteer help. In a corpus-based comparative study we showed how the different communicative modes prevalent in the registers of German Facebook, Twitter and digital news are clearly reflected by the characteristic content, conceptualization, and language of the respective register. The methods used include differential analysis, sentiment analysis, topic modeling, latent semantic analysis, and distance matrix comparison.
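As a rough illustration of what a differential analysis between two registers can look like, the sketch below compares smoothed relative term frequencies in two tiny token lists. It is a minimal example under stated assumptions, not the method used by INFAI or the CiCui system: the function name `log_ratio_scores`, the add-one smoothing, and the toy messages are all invented for illustration.

```python
import math
from collections import Counter


def log_ratio_scores(target_tokens, reference_tokens):
    """Return a smoothed log2 ratio of relative frequencies for every term.

    Positive scores mark terms that are relatively more frequent in the
    target corpus; negative scores mark terms typical of the reference.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    vocabulary = set(target) | set(reference)
    # Add-one smoothing (an assumption of this sketch) so that terms
    # missing from one corpus do not cause a division by zero.
    target_total = sum(target.values()) + len(vocabulary)
    reference_total = sum(reference.values()) + len(vocabulary)
    return {
        term: math.log2(
            ((target[term] + 1) / target_total)
            / ((reference[term] + 1) / reference_total)
        )
        for term in vocabulary
    }


# Invented toy messages, for illustration only (not project data).
twitter_tokens = "water level rising road closed bridge closed water level".split()
facebook_tokens = "thank you volunteers need help sandbags thank you neighbours".split()

scores = log_ratio_scores(twitter_tokens, facebook_tokens)
ranked = sorted(scores, key=scores.get, reverse=True)
print("Most Twitter-like terms:", ranked[:3])
print("Most Facebook-like terms:", ranked[-3:])
```

On real corpora one would normally tokenise and normalise the text first and use a significance measure rather than a raw ratio; the sketch only shows the basic idea of contrasting registers term by term.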


It is vital that we engender increased trust in disaster communications. The need for trustworthy and concise information in a timely manner is one of the key requirements that emerged from working with the end users - particularly as the “noise” in social media can prevent important messages from filtering through. As quick decision-making is required during a natural disaster, this noise needs to be filtered as much as possible.

However, end users (emergency managers, including first responders and strategic disaster planners) do not use social media in a consistent manner. We have found that better communications policies are needed. Also, end users are currently using a wide range of technology and software to manage emergencies within their jurisdiction. Social media is also used in varying degrees to aid in their planning and coordination of actions.


The Needs of End Users

Integrity, reliability, and provenance of information provided via social media will impact positively on decision-making and resource deployment to save lives and protect property. The IT system must also be reliable, so that it does not fail at crucial moments. These requirements include a threshold of trustworthiness for information received by Slándáil end users, and an integrated system that encourages Slándáil end users to engage in social media management through central controllers.

Our work with our end users has drawn attention to issues with managing and maintaining large, multilingual groups of volunteers. Communicating concise and timely messages by emergency management personnel in times of natural disaster is of utmost importance, particularly in order to effectively manage large groups of people in procedures such as evacuations.
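The idea of a trustworthiness threshold can be sketched in a few lines. This is a hypothetical illustration only: the magazine does not say how trust is scored, so the `Message` type, the `trust` field (assumed to be a score between 0 and 1), the default threshold, and the sample messages are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Message:
    text: str
    source: str
    trust: float  # hypothetical trust score in [0, 1]; how it is computed is not specified


def filter_trusted(messages, threshold=0.6):
    """Keep messages at or above the threshold, most trusted first."""
    kept = [m for m in messages if m.trust >= threshold]
    return sorted(kept, key=lambda m: m.trust, reverse=True)


# Invented example messages, for illustration only.
messages = [
    Message("River has burst its banks at the bridge", "verified agency account", 0.9),
    Message("Heard the whole town is under water", "anonymous account", 0.2),
    Message("Road closed near the station", "local resident", 0.6),
]

for message in filter_trusted(messages):
    print(f"{message.trust:.1f}  {message.source}: {message.text}")
```

A fixed cut-off is the simplest possible design; an operational system would more plausibly let controllers adjust the threshold as an incident develops.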


Protecting Citizens

Societal impact, EU policy, legal and regulatory frameworks

The considerable level of communication and interaction now available thanks to social media means that first responders and rescuers could better focus their valuable (and much needed) resources on the most vulnerable first. It is essential to collect as much information as possible to minimise the loss of life and property. This involves targeting (identifiable) individuals and places.

The subsequent, post-disaster usage of such information may not be allowed under a range of international, federal, and national legislation. There is also potential for misuse of a complex data aggregation system such as Slándáil. To this end, the team have been researching the potential legal implications, and the ethical and societal issues, surrounding data collection and aggregation.

When developing software and services for emergency management, consideration must be given to issues of societal impact and ethics. These issues can be described under four main categories: desirability, acceptability, ethics, and data management.

Societal impact assessment and ethical framework

When considering issues of societal impact and ethics, there are four areas of concern: Desirability, Acceptability, Ethics, and Data Management. We share a set of questions that can be used by service developers when considering these issues. We also share comprehensive guidance on ethical issues.


Desirability concerns thinking about the need for the system and whether the proposed solution represents an efficient means of meeting that need.

• What is the problem to which this particular policy, project, or technology responds?

• What is the impact of the use of social media on your staffing levels elsewhere and on your budget?

• What assessment has been made of the digital divide, such as differences in digital literacy, and how do the results shape your planning?

• Are all sources of information aimed at the public being utilized, to maximize reach across demographics?

• What measures have been taken to train personnel on social media use as well as on societal impact?

• How are you maintaining situational awareness away from social media, whilst using social media for disaster response?

• What accountability mechanisms are in place both within your organization and for the general public?

• What type of data analytics are carried out as part of your assessment procedures, and is the amount of data used proportionate to the task?


Acceptability builds on desirability to incorporate concerns of trust, accountability and public support and consent. Trust between the end users and the public at large is dependent on transparency in the treatment of data within the Slándáil platform.

• Through what channels has the public been consulted about your social media plan?

• What methods are you using to identify and keep track of key stakeholders?

• To what extent have civil society organizations been included throughout your policy or design process and how have their inputs been integrated?

• What is the surveillance potential—both positive and negative—of your social media strategy?

• What steps have you taken to ensure the consent of the public, and how have you notified them of potential risks to their data?

• What steps are being taken to ensure the quality of information provided to citizens?

• How involved is the public in your response, and how have you accounted for the drawbacks of crowdsourcing disaster response?

• How have you communicated with the public, and ensured that your voice is recognizable and authoritative in order to minimize confusion?

• What measures do you have in place to prevent defamation or to halt the spread of false information?

Questions to Ask when developing your Technology


Protecting Citizens by taking an Ethical Approach

Ethics refers to the application of values and moral standards; these will inform the use of the Slándáil software services. An ethical approach takes into account trust and other factors, and considers issues such as privacy, human rights, power imbalances, and proportionality.

An appropriate ethical framework is a key part of any project that intends to use a large volume of public data. In Slándáil this will help to ensure that we are appropriately managing and regulating data. The ethical and factual provenance of data (ensuring that information comes from accurate data and has been ethically sourced) is of utmost importance for the vulnerable and for the rescuers. This data, if collected and mined surreptitiously, can undermine trust and negatively impact future communications with stakeholders, not to mention that such activities are often prohibited under a range of national, federal, and international laws.

Our ethical framework provides guidance at each stage of the emergency. While the framework is currently a work in progress, we share with you the core tenets of our framework.

Privacy and Informed Consent

The implications for the Slándáil system with regards to (i) rights and expectations of privacy and (ii) informed consent will be ethically identified and analysed.

Legal Fragmentation and Technological Progress

The ethical consequences of the fact that relevant legal contexts are fragmented geographically and evolving at a pace slower than that of technological innovation will be addressed.

De-humanization of subjects

There is a risk that researchers and end users dealing with datasets feel a “conceptual distance” from the human subjects who provide the data.

Discrimination and Social Exclusion (Digital Divide)

Consideration will be given to the lack of access to technology for certain members of communities, particularly those already economically disadvantaged or vulnerable to natural disasters.

Abuses of power and human rights violations

The risk of misuse of potentially identifying social media data will be addressed in the development of the ethical framework and protocols with the aim of minimising the possibility of subsequent abuse.

Anonymity vs. Identifiability

Ethical measures required to minimise the risk of individuals being identified will be considered in the development of the ethical framework.


Questions to Ask when developing your Technology

Ethics concerns thinking about values and moral standards.

• How does using openly accessible social media content during emergencies impact on freedom of expression for different social groups?

• What steps are being taken to anticipate and prevent stigmatization of social groups, either through your social media plan or through the actions of social media users?

• What steps have been taken to stop the technology from succumbing to dual use?

• What measures are in place to protect key freedoms such as freedom of speech, of assembly and of movement?

• What steps have been taken to ensure the inclusion of people with disabilities as part of your social media use?

• How does your use of social media reinforce, and not restrict, political rights?

• How are you identifying victims and survivors and protecting the dignity and reputation of each, both online and offline?

• Do you have measures in place to prevent backdoor access and to ensure the on-going legality of your project?


• Does this project, policy or technology conform to European and national regulations on privacy and data protection?

• Have you consulted and/or informed your national data protection authority (DPA) of your plans?

• What procedures are in place to verify the on-going legality of your project, policy or technology?

• Does the information you have retrieved include personal data? Is the information you publish considered “public”?

• How is geolocation being used for disaster response, and is user identification being associated with geographical information?

• What measures (e.g. Privacy by Design (PBD) or the encouragement of Privacy Enhancing Technologies (PET)) are in place to ensure that user data is anonymised?

Data Management concerns the processing, transportation, and storage of data. During a natural disaster there is a large volume of information shared on social media sites like Facebook and Twitter. Some of this information contains private data that could be used to identify individuals. Privacy and personal data protection is both a legal and societal question. When considering data management, engineers and other data managers are compelled to follow existing law. Data management should also encompass principles of minimization and anonymization of data collected, as well as techniques such as privacy by design and tools such as privacy-enhancing technologies.

The Slándáil platform will facilitate end user compliance with legal data protection and data retention requirements at each stage of the emergency management lifecycle. Data minimisation and anonymisation are also being considered, and measures are being taken to pursue them as far as possible (e.g. the Intrusion Index) both in the system design and in user practice. We also apply the principle of privacy by design: standard data security and accessibility controls are being adopted and implemented. It is important that the data management strategies taken are transparent, as this will engender both trust and accountability.
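To make the principles of data minimisation and privacy by default more concrete, here is a minimal sketch of how a single social media record might be reduced before storage. It is an assumption-laden illustration, not the Slándáil implementation: the field names, the `ALLOWED_FIELDS` whitelist, the keyed-hash pseudonym, and the coordinate rounding are all hypothetical choices. Note that a keyed hash is pseudonymisation rather than full anonymisation, since anyone holding the key can re-link records.

```python
import hashlib
import hmac

# Hypothetical whitelist: only these fields are copied through unchanged.
ALLOWED_FIELDS = {"text", "timestamp"}


def pseudonymise(user_id: str, secret_key: bytes) -> str:
    """Replace a user identifier with a keyed hash (HMAC-SHA256, truncated)."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def minimise(record: dict, secret_key: bytes, decimals: int = 2) -> dict:
    """Keep whitelisted fields, pseudonymise the user, and coarsen the location."""
    cleaned = {key: value for key, value in record.items() if key in ALLOWED_FIELDS}
    cleaned["user"] = pseudonymise(record["user_id"], secret_key)
    if "lat" in record and "lon" in record:
        # Rounding to two decimals keeps roughly kilometre-level precision.
        cleaned["lat"] = round(record["lat"], decimals)
        cleaned["lon"] = round(record["lon"], decimals)
    return cleaned


# Invented example record, for illustration only.
record = {
    "user_id": "user-123",
    "display_name": "Example Person",
    "text": "Road flooded",
    "timestamp": "2015-01-01T08:00:00Z",
    "lat": 53.349805,
    "lon": -6.26031,
}
cleaned = minimise(record, secret_key=b"demo-key-not-for-production")
print(cleaned)
```

Whitelisting fields, rather than blacklisting known sensitive ones, means that any new field added by a data source is dropped by default, which is one way of favouring privacy in the default settings.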

• Are you favouring privacy protection as part of your default settings?

• What steps have been taken to minimize the amount of personal data required?

• What access do third parties have to user data, and what protocol is in place to approve or deny these requests and notify users?

• Do you have a plan to prevent hacking or unauthorized access, including from within?

• How are you accounting for the potential for mis-identification?

• Do you have a procedure in place for redress or deletion of data?

• How is respect for data input and output being integrated into the professional routines of your personnel?

Section four | Research and Innovation | Protecting Citizens

Questions to Ask when developing your Technology


The Slándáil Intrusion Index

The Intrusion Index is being designed for the Slándáil system to better protect the privacy of individuals who may be named on public social media sites. When implemented, it will highlight named entities that appear in text so that the system can later privatise this data, for example by automatically deleting all place and person names. Deleting only the sensitive data means that the rest of the text, such as the content of tweets, can still be used to train the system for future natural disasters.

Early tests on the Intrusion Index have shown that fewer than one in five words in public media sources is a person’s name. Many of these names have been found to belong to public officials, such as heads of state or emergency managers.

However, named entities can include Twitter handles and place-names, and any data that can identify an individual person should be removed from the Slándáil system unless it can be used to help protect them from danger.
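As a rough illustration of the idea, the sketch below replaces Twitter-style handles and listed names with category placeholders while leaving the rest of the message intact. It is not the Intrusion Index itself: the name list, the placeholder labels, and the function name `redact` are all assumptions made for this example, and a real system would need a proper named-entity recogniser rather than a fixed list.

```python
import re

# Hypothetical name list, invented for illustration. A real system would
# take its names from a named-entity recogniser or a lexicon.
KNOWN_NAMES = {"Jane Doe": "PERSON", "Riverton": "PLACE"}

# Twitter-style handles: "@" followed by letters, digits or underscores.
HANDLE_PATTERN = re.compile(r"@\w+")


def redact(text, names=KNOWN_NAMES):
    """Replace handles and listed names with category placeholders."""
    redacted = HANDLE_PATTERN.sub("[HANDLE]", text)
    # Longest names first, so a short name never splits a longer one.
    for name in sorted(names, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        redacted = pattern.sub("[" + names[name] + "]", redacted)
    return redacted


message = "@floodwatch Jane Doe is stranded near Riverton bridge, water rising"
print(redact(message))
```

The disaster-related wording ("stranded", "bridge", "water rising") survives redaction, which is the part of the message that remains useful for training.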

Tests have been conducted on sample sets of social media messages collected from sites such as Twitter and Facebook to determine potential entity occurrences. Processing the messages according to the Intrusion Index method identified names of institutions, events, people, and places, as well as Facebook and Twitter names. In one particular sample of 27,000 tagged words, some 8.4% were institutions, 7.9% events, 7.5% person names, 4% places, and 57.1% Twitter and Facebook handles.
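To give a feel for the scale of these figures, the short calculation below converts the reported percentages into approximate word counts. It assumes the sample was exactly 27,000 words and that the percentages are exact, neither of which the figures above guarantee, so the counts are indicative only. The listed categories add up to 84.9% of the sample; the remainder is not itemised here.

```python
SAMPLE_SIZE = 27_000  # assumed exact for this illustration

# Reported shares in tenths of a percent, to keep the arithmetic in integers.
SHARES_PER_MILLE = {
    "institutions": 84,
    "events": 79,
    "person names": 75,
    "places": 40,
    "Twitter and Facebook handles": 571,
}


def approximate_counts(sample_size, shares):
    """Convert per-mille shares into approximate word counts."""
    return {label: sample_size * share // 1000 for label, share in shares.items()}


counts = approximate_counts(SAMPLE_SIZE, SHARES_PER_MILLE)
listed = sum(counts.values())

for label, count in counts.items():
    print(f"{label}: about {count}")
print(f"listed categories: about {listed} of {SAMPLE_SIZE}")
print(f"not itemised: about {SAMPLE_SIZE - listed}")
```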

Research on the Intrusion Index is ongoing, and as the Slándáil prototypes are developed over the coming months the index will be tested more frequently.


Section four | Research and Innovation | Technological Developments

The Technology Behind The Slándáil Solution

Effective and timely social media usage requires an ability to capture and aggregate mass communications, to analyse this collected data, and to use it in an emergency management system. Once social media messages can be captured, aggregated and analysed, the rescuers can use this information to make decisions and communicate these decisions to the public. They can also reflect upon the disaster and better plan for the future.

The Slándáil Technology Vision

The major contribution of the Slándáil project will be in the organisation of texts and images so that we can do complex on-the-fly analytics (when responding to emergencies) as well as reporting on historic events (for reflection on the disaster and to improve our plans for the future). The thrust of technological development in Slándáil is in creating a learning system that can associate the visual information with textual information and provide intelligent support.

• Text analytics in Slándáil is based on a disaster lexicon – a dictionary that contains disaster related terms – and a conceptual ontology. An ontology is a technology structure (similar to a database and software used to access the database) that defines vocabulary through keywords and key relationships between concepts and terms.

• Image analytics in Slándáil is based on examining key visual primitives (the advanced features beyond shapes, textures and colours) and their interrelationships. Algorithms based on characteristics of the human visual system are being developed for fast image processing and automated scene and event recognition.
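The pairing of a lexicon with an ontology described in the first bullet above can be pictured with a small sketch. The terms, concept names and relationships below are invented for illustration and are not taken from the Slándáil lexicon; the sketch simply shows a lexicon mapping words to concepts, and an ontology relating each concept to a broader one.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon: surface term -> concept.
LEXICON = {
    "flood": "Flood",
    "flooding": "Flood",
    "flash flood": "FlashFlood",
    "evacuation": "Evacuation",
    "sandbags": "FloodDefence",
}

# Hypothetical ontology fragment: concept -> broader concept.
BROADER = {
    "FlashFlood": "Flood",
    "Flood": "NaturalHazard",
    "FloodDefence": "ProtectiveMeasure",
    "Evacuation": "ProtectiveMeasure",
}


def find_concepts(text):
    """Return {concept: [matched terms]} for lexicon terms found in text."""
    found = defaultdict(list)
    lowered = text.lower()
    for term, concept in LEXICON.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found[concept].append(term)
    return dict(found)


def broader_concepts(concept):
    """Walk up the ontology from a concept to its most general ancestor."""
    chain = []
    while concept in BROADER:
        concept = BROADER[concept]
        chain.append(concept)
    return chain


print(find_concepts("Flash flood warning: evacuation under way"))
print(broader_concepts("FlashFlood"))
```

Because the relationships live in the ontology rather than in the word list, a message that only mentions a "flash flood" can still be retrieved by a search for natural hazards in general.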


Analytics for Text and Images

Text and image analytic systems previously developed by the academic partners, with either EU or national funding, are currently being used to analyse documents and images related to disasters. Text analytic information extraction systems include CiCui and Rocksteady at TCD and the Leipzig Corpus Miner (LCM) at INFAI, which will be incorporated into the Topic Analyst system developed by CID.

Text Clustering

An important aspect of continuous, broad-coverage text and image capture systems for disaster management is that they allow for the identification of named entities: people, places, and objects. This identification is essential during an emergency. In normal circumstances, however, it could facilitate surreptitious collection of information about the named entities, potentially conflicting with the norms of a civil society (in particular raising privacy concerns). The Slándáil team is working on creating a system that monitors the collection of such information while ensuring that the system log is considered both safe and acceptable.

CiCui and Rocksteady

CiCui and Rocksteady were developed under Enterprise Ireland and Trinity College Dublin grants running between 2009 and 2014. News articles are monitored and captured by the CiCui system and the numerical data are presented by the Rocksteady system. These systems were originally developed to demonstrate the relevance of automatic sentiment analysis and automatic ontology production.

CiCui and Rocksteady have been modified for analysis of social media within Slándáil. Part of our Slándáil disaster lexicon and ontology has been created by using the CiCui system. This has been collected by scraping recent disaster news feeds (texts written in English) automatically. The data that is collected is then visualised using Rocksteady.

Information Extraction


We have used a German corpus of news feeds that has been automatically collected over the last 20 years by the University of Leipzig. Currently, a list of about 2,500 RSS feeds is used for automatically collecting German news from public online sources, as well as a number of documents from German newspapers. Focusing on the time of the flood event in 2013, we have been able to automatically extract characteristic flood terms. These terms will be used for filtering and collecting relevant documents from other sources, and can be used for the creation or further elaboration of a German disaster (flood) lexicon.

First experiments have been carried out on a flood sub-corpus for the disasters in 2002 and 2006. As a form of unsupervised clustering, we applied topic model analysis. In a first step, we extracted the flood topic from the general corpus of the relevant time period. In a second step, we inferred a number of sub-topics that occur inside this flood topic. Additionally, we analysed and visualised the development of all topics over time, and thereby gained a view of the internal temporal structure and development of the event. The sub-topic extraction was also applied to German social media flood data from Facebook and Twitter. From the perspective of an end user, this allows flood-related documents to be ordered and filtered automatically with respect to their content.
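One simple way to picture the extraction of characteristic terms is to compare how often a word appears in texts from the event period with how often it appears in a general reference collection. The sketch below does this with a smoothed frequency ratio. The scoring measure, the function names and the tiny example texts are all assumptions made for illustration; the measure actually used for the Leipzig corpus is not described here, and the topic model step is not shown.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, including German umlauts."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def characteristic_terms(event_docs, reference_docs, top_n=3, min_count=2):
    """Rank terms by how much more frequent they are in the event texts."""
    event = Counter(t for doc in event_docs for t in tokenize(doc))
    reference = Counter(t for doc in reference_docs for t in tokenize(doc))
    event_total = sum(event.values())
    reference_total = sum(reference.values())
    vocabulary = len(set(event) | set(reference))

    scores = {}
    for term, count in event.items():
        if count < min_count:
            continue  # ignore words seen too rarely to be reliable
        # Add-one smoothing avoids division by zero for unseen terms.
        p_event = (count + 1) / (event_total + vocabulary)
        p_reference = (reference[term] + 1) / (reference_total + vocabulary)
        scores[term] = p_event / p_reference
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Invented example texts, not taken from the corpus.
event_docs = [
    "Hochwasser an der Elbe, Deich gebrochen",
    "Hochwasser: Deich hält, Pegel steigt",
    "Pegel der Elbe steigt weiter",
]
reference_docs = [
    "Die Elbe ist ein Fluss",
    "Der Pegel der Börse steigt",
    "Wetter heute sonnig an der Küste",
]

print(characteristic_terms(event_docs, reference_docs, top_n=2))
```

Words such as "der" are frequent in both collections and so score low, while words that are concentrated in the event texts rise to the top.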


The Slándáil Text Analytic Module (STAM)

STAM combines the functionality of each of the analytics systems. Its main benefit is a more integrated text analysis that targets flood terms and words in social media text. The STAM module will be used in the development of prototypes for testing over the next eighteen months.

The text analytic systems are being used to graph frequency of terms and words that appear in text in traditional news media and social media. The chart above is taken from the Slándáil disaster newsletter, which is updated weekly with results of natural disaster terms from around the world.
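Counting terms per week, the kind of figure that such a chart plots, can be sketched in a few lines. The example below is only an illustration: the function name, the list of terms and the dated texts are invented, and the newsletter's own processing is not described here.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_term_counts(dated_texts, terms):
    """Count occurrences of each term per ISO week: {(year, week): {term: n}}."""
    counts = defaultdict(Counter)
    for day, text in dated_texts:
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # ISO year and ISO week number
        words = text.lower().split()
        for term in terms:
            counts[week][term] += words.count(term)
    return {week: dict(counter) for week, counter in counts.items()}


# Invented example texts.
dated_texts = [
    (date(2015, 6, 1), "flood warning as flood waters rise"),
    (date(2015, 6, 3), "storm and flood damage"),
    (date(2015, 6, 9), "storm clears"),
]

for week, term_counts in sorted(weekly_term_counts(dated_texts, ["flood", "storm"]).items()):
    print(week, term_counts)
```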

Leipzig Corpus Miner

German disaster texts are also under examination thanks to the “Leipzig Corpus Miner (LCM)”, a text mining software system (developed as part of the ePol project) for qualitative data analysis that facilitates statistical research for academics from the social sciences and humanities. The software has been further developed and modified to allow automatic analysis of social media data. It now offers methods for a wide range of tasks in Information Retrieval, Lexicometrics, Topic Model Analysis, and Classification.


Languages and Terminology

Building an information extraction system relies heavily on a set of keywords related to the domain in which the system is expected to be used (i.e. disaster management and flooding). These keywords must be conceptually organised to facilitate better access and interpretation. The conceptual organisation, in turn, requires the creation of an ontology: a representation of the domain's concepts that defines their names, definitions, metadata/properties and any interrelationships. The ontology is largely language independent and relates key concepts in the domain.

We have created a dedicated Terminology Wiki (http://slandailterminology.pbworks.com/, not yet publicly available) displaying all the terms relating to the concept fields of emergency management, natural hazards and people in emergencies, which were extracted from the corpora of texts collected in the previous stage of the project.

This Terminology Wiki comprises elaborated terms in English, German, and Italian. The lexicon will serve as the multilingual knowledge base for the project ontology. The lexicon and the resulting ontology, in turn, will be used for analysing formal and social media content, thus allowing detection of the presence of named entities, beginning/end of events, and the well-being status of people or places.

The terms used in disaster management come from engineering, administration, the various sciences, and from the knowledge of policing, fire, and rescue services. These will form the backbone of the Slándáil prototypes as they will serve as a dictionary to comprehend and organise digital data.

Another important goal of our lexicon was to try to harmonise currently existing terminologies by looking at various institutions dealing with emergency management, including the European Union. Therefore, terms were also sometimes compared and collected from the International Red Cross, UNISDR, or EIONET to ensure a more comprehensive approach to terminology extraction and management. You can read more about the Terminology Wiki here http://slandail.eu/disaster-lexicon-now-available-on-the-project-terminology-wiki/
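The idea of a multilingual lexicon feeding a largely language-independent ontology can be pictured as one shared concept with terms attached in each language. The sketch below is an assumption-laden illustration: the concept names, the handful of terms and the function names are invented for this example and are not taken from the Terminology Wiki.

```python
# Hypothetical entries: one language-independent concept, with terms listed
# per language (en = English, de = German, it = Italian).
CONCEPT_TERMS = {
    "Flood": {
        "en": ["flood"],
        "de": ["Hochwasser", "Überschwemmung"],
        "it": ["alluvione", "inondazione"],
    },
    "Evacuation": {
        "en": ["evacuation"],
        "de": ["Evakuierung"],
        "it": ["evacuazione"],
    },
}


def build_index(concept_terms):
    """Build a reverse index: term -> (concept, language)."""
    index = {}
    for concept, by_language in concept_terms.items():
        for language, terms in by_language.items():
            for term in terms:
                index[term.casefold()] = (concept, language)
    return index


INDEX = build_index(CONCEPT_TERMS)


def lookup(term):
    """Return (concept, language) for a known term, or None if unknown."""
    return INDEX.get(term.casefold())


print(lookup("Hochwasser"))
print(lookup("alluvione"))
```

With this layout, a German message and an Italian message about flooding both resolve to the same concept, so later analysis does not need to know which language the original text was written in.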


Two systems form Prototype 0. Topic Analyst, created by German technology company CID, is a text analysis system that performs language processing and provides semantic search over online sources. SIGE, created by Italian company DataPiano, is an alarm and emergency system that helps emergency managers send out messages and target key areas during a natural disaster. The two systems were loosely coupled and demonstrated to end users with dummy emergency data to show how they may operate together.

Topic Analyst provides an analysis tool for filtering, searching, investigating and data mining huge numbers of documents collected and pre-processed by the CORPUS backend as well as by academic partners. Features of the Topic Analyst system include language processing, entity recognition (e.g. for companies, persons, organisations and locations), and lemmatized keyword extraction. During a natural disaster, this system can be used to pull in news and public social media posts. When an occurrence of key terms such as a location or a particular type of emergency is highlighted within the text, Topic Analyst automatically sends a warning to a monitoring station where an emergency manager can view its readings. If the emergency manager decides that the data warrants an alarm, they can activate SIGE, which allows fast contact with various emergency management resources, and shows maps and information on the region that may be affected. The next step in the project will be to take the academic partner research on flood terms and use these as a filtration system to improve the outputs from Topic Analyst.
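The flow described above (key terms trigger a warning, and a person then decides whether to raise an alarm) can be sketched as follows. This is not the Topic Analyst or SIGE interface: the term lists, the `TermWarning` class and both function names are hypothetical, chosen only to show that detection is automatic while the decision stays with the emergency manager.

```python
from dataclasses import dataclass

# Hypothetical watch lists, invented for illustration.
HAZARD_TERMS = {"flood", "flooding", "landslide"}
WATCHED_LOCATIONS = {"riverton", "eastbank"}


@dataclass
class TermWarning:
    text: str
    hazards: list
    locations: list


def check_message(text):
    """Return a warning if the text names both a hazard and a watched location."""
    words = {word.strip(".,!?:;").lower() for word in text.split()}
    hazards = sorted(words & HAZARD_TERMS)
    locations = sorted(words & WATCHED_LOCATIONS)
    if hazards and locations:
        return TermWarning(text, hazards, locations)
    return None


def manager_decision(warning, confirm):
    """The emergency manager decides; nothing is activated automatically."""
    return "activate alarm system" if confirm(warning) else "keep monitoring"


warning = check_message("Flooding reported in Riverton, roads closed")
if warning is not None:
    # Stand-in for the manager's judgement: confirm every warning.
    print(warning.hazards, warning.locations)
    print(manager_decision(warning, confirm=lambda w: True))
```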

Prototype 0

We are building a number of prototype systems to tackle the issues of using social media for better disaster response. These prototypes include combining online material retrieval with emergency communications, text analytics, image analytics, the aggregation of text and image, and the capability to deal with multiple languages.


Why not check out http://www.slandail.eu to keep up to date with project news? Recent interesting developments include the following:

• Feature articles are now shared alongside project news on our Slándáil website (http://slandail.eu/features/). Features comprise human-interest stories and record interesting research and other events of interest to a wide audience.

• The Slándáil Disaster Newsletter contains aggregated data, graphs, and links to archived news, as well as a Slándáil Newsletter Analytics (SNA) tool which enables the user to dynamically generate graphs for investigating disaster news. The SNA combines textual analysis, statistical analysis, data management, and web graphing features. Check it out here http://anglo.scss.tcd.ie/slandail/newsletter/

• And of course our Digital Magazine - Make sure that you are on our mailing list – you can join the mailing list by using the signup form on the project website http://slandail.eu/contact-us/mailing-list/

Keep up to date with Slándáil news

Upcoming Events

There are several related events coming up over the next couple of months on social media in emergency management. In particular, watch out for:

• IDEAL 2015 Conference, 14-16 October 2015 (Wroclaw, Poland), http://ideal2015.pwr.edu.pl

• KommunikationsFluten, International Conference on communication during the flooding of 2002 and 2013 in Saxony, 06 November 2015 (Leipzig), http://konferenz.eijc.eu

Section five | Read all about it

Slándáil out and about