Modelling in Knowledge Management - Issues of Metadata

Theory and practical case study

Don Schauder - Professor of Information Management, SIMS, Monash
Cherryl Schauder - Metadata Project Officer, RMIT
October 2003


References

Kennedy and Schauder 1998, Records Management
– Business classification scheme, pp 65-68
– Chapters 6 and 7
Controlled vocabularies. Middleton, Mike. QUT site - http://www2.fit.qut.edu.au/InfoSys/middle/cont_voc.html

Relevant websites
– Yahoo
– Magellan
– www.dmoz.oz
– EdNA browse map http://www.edna.edu.au/EdNA
– ASCED bass.adm.monash.edu.au/policy/asced/asced2000.htm
– ANZIC www.cqu.edu.au/research/research_services/arc/anzic_codes.htm
– ‘View source’ - metatags in HTML headers of many websites, eg Museum of Victoria

Classification as Knowledge Modelling

The way we organise our kitchens, our bills, our wallets or handbags … our kinds of work, our political and justice systems … etc
Part of our everyday lives and thought processes
Part of our culture, our ways of interpreting what happens in our lives and around us
Mostly explicit knowledge (ie ‘documents’), but also tacit knowledge
Key to classification: Similarity/dissimilarity

Classification as Knowledge Modelling

Grouping things together into
– classes or categories (facets)
– and sub-classes or sub-categories (sub-facets)
Hierarchies or family trees, systems and sub-systems

In document organisation

Classificatory models (classification schemes) are used for
– DISPLAY - IN HIERARCHIES, BROAD TO NARROW - INDICATES RELATIONSHIPS
– RETRIEVAL - IF CODED, THE CODES ENABLE A FILING ORDER

A class (facet) is a group of things which share a common feature

Disciplines - engineering, science, history, art ...
Types of political system - capitalism, communism, socialism ...
Age groups - infant, child, teenager, adult ...
Towns - Melbourne, Sydney, Perth ...
Cricket grounds - MCG, SCG, GABBA ...
Grades - High distinction, distinction, credit …
Degrees of approval - acceptable, unacceptable

Ordering concepts in a classified arrangement - logical/systematic

The order of facets/classes in a hierarchy may go from least to most significant facet - or the opposite; or follow some logical order recognisable by the target audience

The order of concepts within each class/sub-class should be some logical order that has meaning in the context - most to least common, young to old, established scientific order, small to big

Alphabetical is a last resort
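
The ordering advice above can be sketched in Python. This is only an illustration: `AGE_ORDER` and `logical_sort` are hypothetical names, and the declared sequence is taken from the "Age groups" example rather than from any real scheme.

```python
# Hypothetical logical order (young to old) for one class; the list is an
# assumption based on the "Age groups" example.
AGE_ORDER = ["infant", "child", "teenager", "adult"]

def logical_sort(terms, order):
    """Sort terms by their position in a declared logical order.

    Terms missing from the declared order are placed last, in alphabetical
    order, so alphabetical is only used as the fallback.
    """
    rank = {term: position for position, term in enumerate(order)}
    return sorted(terms, key=lambda term: (rank.get(term, len(order)), term))

terms = ["teenager", "adult", "infant", "child"]
logical = logical_sort(terms, AGE_ORDER)
alphabetical = sorted(terms)
print(logical)
print(alphabetical)
```

The alphabetical list puts "adult" before "child", which loses the young-to-old meaning the audience would expect.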

Mutual exclusivity

When we are organising information we need - as far as possible - to be wary of overlapping categories, which can cause confusion (unless we are purposefully using overlapping categories, eg for publicity reasons in a sales campaign)

JEANS
– BLACK
– BLUE
– GREY
– STRETCH?

Be careful to avoid overlap

AN ORGANISATION’S PERMANENT WEB NAVIGATION BUTTONS
– COURSES
– ABOUT US
– NEWS
– EVENTS
– ACADEMIC PROGRAMS
– CAREERS
– EMPLOYMENT AND JOBS
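
One way to test a set of categories for overlap is to record which items would be filed under each one and look for items that appear twice. The Python sketch below assumes a hypothetical `NAVIGATION` mapping; the button names come from the slide, while the page names and the `find_overlaps` helper are invented for illustration.

```python
from itertools import combinations

# Hypothetical assignment of pages to navigation buttons. Button names are
# from the slide; the page names are assumptions.
NAVIGATION = {
    "COURSES": {"course list", "enrolment"},
    "ACADEMIC PROGRAMS": {"course list", "degree structures"},
    "CAREERS": {"job vacancies", "career advice"},
    "EMPLOYMENT AND JOBS": {"job vacancies"},
    "NEWS": {"media releases"},
}

def find_overlaps(categories):
    """Return {(category_a, category_b): shared_items} for overlapping pairs."""
    overlaps = {}
    for a, b in combinations(sorted(categories), 2):
        shared = categories[a] & categories[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps

overlaps = find_overlaps(NAVIGATION)
for (a, b), shared in overlaps.items():
    print(f"{a} overlaps {b}: {sorted(shared)}")
```

An empty result would suggest the categories are mutually exclusive for the items checked; it says nothing about items not yet assigned.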

Hierarchies - Groupings, sub-groupings and sub-sub-groupings

BUILDINGS
– RELIGIOUS
  » CHURCHES
  » MOSQUES
– EDUCATIONAL
  » LEVELS
      PRE-SCHOOL
      PRIMARY
      SECONDARY
      TERTIARY (TAFE, UNIVERSITY)
      CONTINUING
– RESIDENTIAL
  » HOUSES
  » UNITS
  » APARTMENTS
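
A hierarchy of groupings such as the buildings example can be held as nested dictionaries and displayed broad to narrow. The sketch below is one possible representation: the exact nesting is an assumption reconstructed from the slide's diagram, and `outline` and `path_to` are hypothetical helper names.

```python
# Nested dicts model groupings and sub-groupings; an empty dict marks a leaf.
# The nesting (including the LEVELS node) is an assumed reading of the diagram.
BUILDINGS = {
    "BUILDINGS": {
        "RELIGIOUS": {"CHURCHES": {}, "MOSQUES": {}},
        "EDUCATIONAL": {
            "LEVELS": {
                "PRE-SCHOOL": {},
                "PRIMARY": {},
                "SECONDARY": {},
                "TERTIARY": {"TAFE": {}, "UNIVERSITY": {}},
                "CONTINUING": {},
            }
        },
        "RESIDENTIAL": {"HOUSES": {}, "UNITS": {}, "APARTMENTS": {}},
    }
}

def outline(tree, depth=0):
    """Return the hierarchy as indented lines, broad to narrow."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * depth + name)
        lines.extend(outline(children, depth + 1))
    return lines

def path_to(tree, target, trail=()):
    """Return the chain of groupings leading to target, or None if absent."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = path_to(children, target, here)
        if found:
            return found
    return None

print("\n".join(outline(BUILDINGS)))
print(" > ".join(path_to(BUILDINGS, "TAFE")))
```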

Hierarchies

Convenient way to think about organisations, products, people, systems of belief
BUT SOME HIERARCHICAL GROUPINGS ARE MORE PERMANENT THAN OTHERS
THIS IS IMPORTANT WHEN
– ARRANGING INFORMATION ON A WEB PAGE
– FILING THINGS ON A SHELF

Hierarchies

Longer life, eg

Natural materials
– Stone
  » Marble
  » Slate
– Timber
  » Hardwoods
  » Softwoods
      Pine

Shorter life, eg

Organisation X
– Dept of A
  » Section
      Division
– Dept of B
  » Section
      Division
– Dept of C
  » Section
      Division

Hierarchical relationships

‘True’ generic relationships
– Timber is always a natural material
– Rabbits are always mammals
– Carrots are always a vegetable

Contextual generic relationships
– Psychology is sometimes a science
– Rabbits are sometimes pets
– Goodwill is sometimes a tradeable asset

Classification schemes or models

Can cover all of knowledge or only a specialised field - the more fields and sub-fields, the greater the complexity of the scheme - CUSTOMISED OR WIDE AUDIENCE

Classification schemes reflect the viewpoint of their compilers and their end-users – they are overlaid with the values and ways of thinking of the people who created them – they cannot be ‘objective’

No classification approach will suit all the users

Classification displays broad to narrow

Classification schemes are often linear, one-dimensional arrangements, whereas the broad knowledge/content in documents is often multi-dimensional

Can display a classification scheme as a map/diagram with arrows showing multiple relationships, or as a tree structure

Classified groupings

Displays how documents have been grouped, e.g. subject directories on the Web - Yahoo and Magellan

BUT still need an index/search engine for searching across the categories

Filing systems in electronic directories, libraries and record keeping contexts

A document’s content is often multi-dimensional
– Agricultural surveys of land use
– Outdoor cookery for children
– Valuation of land in South Yarra in 1999

– A classification allows one position for simple and complex topics

– An index/catalogue/search engine is needed to search across the grouped categories

Creating a Classification scheme for organising documents

A classification scheme is a hierarchy or a set of hierarchies of concepts, usually with some kind of code attached to each concept. The code often establishes an order for filing the documents – usually numbers, letters or a combination of both, eg A2, A46, B13, B14, B335. The numbers or letters can file decimally or as integers/whole numbers.
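
The difference between filing codes decimally and as whole numbers can be shown with a short Python sketch. It uses the example codes above plus one extra code, A5, which is an assumption added so that the two filing orders differ; the function names are hypothetical.

```python
import re
from decimal import Decimal

def split_code(code):
    """Split a code such as 'A46' into its letter part and its digit part."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", code)
    if match is None:
        raise ValueError(f"unexpected code format: {code}")
    return match.group(1), match.group(2)

def integer_key(code):
    """File the digits as a whole number: A5 (5) files before A46 (46)."""
    letters, digits = split_code(code)
    return letters, int(digits)

def decimal_key(code):
    """File the digits as a decimal fraction: A46 (.46) files before A5 (.5)."""
    letters, digits = split_code(code)
    return letters, Decimal("0." + digits)

# A5 is a hypothetical extra code; the others are from the slide.
codes = ["B335", "A46", "B14", "A5", "A2", "B13"]
integer_order = sorted(codes, key=integer_key)
decimal_order = sorted(codes, key=decimal_key)
print(integer_order)
print(decimal_order)
```

Decimal filing lets a new code be placed between two existing ones (for example A45 between A4 and A5) without renumbering, which is one reason it suits schemes that must keep growing.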

Dewey Decimal Classification

000 General (Knowledge, the Book, Bibliography, Computer Science, Librarianship, Journalism, etc)
100 Philosophy
200 Religion
300 Social sciences
400 Language and linguistics
500 Natural sciences and mathematics
600 Technology
700 The Arts
800 Literature and rhetoric
900 Geography, History, Biography, Genealogy

Dewey’s Relative Index

Alphabetical display of topics showing the number for each within different contexts

LAND USE 346.068
– AGRICULTURAL SURVEYS 333.73
– COMMUNITY SOCIOLOGY 307.33
– ECONOMICS 333.73
– LAW 346.045
– PUBLIC ADMINISTRATION 350.82326

Dewey – Hierarchical numbering code

627 Hydraulic engineering
627.1 Inland waterways
627.12 Rivers and streams
627.122 Sediment and silt
627.123 Water diversion
627.124 Estuaries and river mouths
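
In the numbers above, each extra digit after the decimal point narrows the topic, so the broader classes of a number can be recovered by trimming digits. The Python sketch below assumes that simple rule holds for the numbers shown; `DEWEY`, `broader_numbers` and `breadcrumb` are hypothetical names, and the sketch does not model hierarchy within the three-digit main class.

```python
# The six numbers and captions from the slide.
DEWEY = {
    "627": "Hydraulic engineering",
    "627.1": "Inland waterways",
    "627.12": "Rivers and streams",
    "627.122": "Sediment and silt",
    "627.123": "Water diversion",
    "627.124": "Estuaries and river mouths",
}

def broader_numbers(number):
    """List the chain from the three-digit class down to the number itself."""
    main, _, fraction = number.partition(".")
    chain = [main]
    for length in range(1, len(fraction) + 1):
        chain.append(f"{main}.{fraction[:length]}")
    return chain

def breadcrumb(number, schedule):
    """Join the captions of every level that the schedule defines."""
    return " > ".join(schedule[n] for n in broader_numbers(number) if n in schedule)

print(broader_numbers("627.122"))
print(breadcrumb("627.122", DEWEY))
```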

Classification schemes/models cont.

Classification schemes need to be dynamic – they need to be able to grow and change with changing needs and times

The coding system used needs to be able to expand in the logical position where new topics have arisen - be sure that your numbering system is infinitely expansible

Classification schemes for organising documents are often organised by
– Subject content of the documents
– The functions of an organisation
– The structure of an organisation

An organisation’s website can allow a mix of all three approaches to give clients/customers/users the best retrieval possibilities

Main basis for arrangements

Subject - eg library filing arrangements for publications

Functions and structure - eg record keeping, websites

Often best bet is a bit of all three approaches - but remember to consider likely lifespan of the approach

Curtin University’s Keyword Thesaurus

– First two elements are function/activity based terms from a list; elements which follow are subject based - from list or free text - note: BROAD TO NARROW

– PERSONNEL - RECRUITMENT - APPLICATIONS - RECORDS MANAGER

– i.e. (FUNCTION - ACTIVITY - SUBJECT FROM LIST - FREE TEXT)
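
The function - activity - subject - free text pattern can be sketched as a small title builder that checks the first three elements against controlled lists. This is an illustration only: the lists below are hypothetical (only the terms in the example above come from the slide) and `build_title` is an invented name, not part of the Curtin thesaurus.

```python
# Hypothetical controlled lists. PERSONNEL, RECRUITMENT and APPLICATIONS are
# from the slide's example; the other terms are assumptions.
FUNCTIONS = {"PERSONNEL": {"RECRUITMENT", "TRAINING"}}
SUBJECTS = {"APPLICATIONS", "INTERVIEWS"}

def build_title(function, activity, subject, free_text=""):
    """Assemble a broad-to-narrow title, validating the controlled elements."""
    if function not in FUNCTIONS:
        raise ValueError(f"unknown function: {function}")
    if activity not in FUNCTIONS[function]:
        raise ValueError(f"{activity} is not an activity of {function}")
    if subject not in SUBJECTS:
        raise ValueError(f"unknown subject: {subject}")
    parts = [function, activity, subject]
    if free_text:
        parts.append(free_text.upper())  # free text is not validated
    return " - ".join(parts)

title = build_title("PERSONNEL", "RECRUITMENT", "APPLICATIONS", "Records Manager")
print(title)
```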

Role of RMIT metadata project officer

‘Website Refurbishment Prototype project’
Uses SIM (Structured Information Manager), created by RMIT MDS
First sections to move into the new system:
– TCE
– part of HRS
– current top end of the website (homepage, programs, courses, admissions, About RMIT, News etc)

Role of metadata project officer (cont.)

Role: To assist with the selection and refinement of standards for cataloguing web pages as part of the Prototype project.

Tasks involve advising on aspects of the metadata, compiling vocabularies or drop-down menus, testing functionality, participating in meetings and training, involvement in data quality monitoring

Metadata templates for a range of doc types - MAMS, Generic, News Services etc, Learning objects

The Concept behind the New RMIT Web Strategy

‘Benchmarking study - current website ranked in bottom 25% of Australian university websites with respect to veracity, findability and usability of core corporate content’

‘The project should place us in top 25% in approx. 2 years time’

- Rhys Williams

RMIT Project aims to be a whole range of things:
– a unified but flexible publishing system with standardised branding and design elements
– system for delivering key corporate content dynamically
– a document production and control system
– a storage and archiving service
– a facilitator of learning and knowledge sharing
– a communications facilitator
– a career planning and employment service
– a repository of multimedia objects
– a repository of reusable learning objects

Concept (cont.)

Rhys Williams:

‘RMIT seeks to provide a robust Web system set up and ready for all departments, faculties, units. Individuals put content into the system needing only Word 2000 skills. Rather than the current system of “if you want a website you provide your own boxes, wires, disaster recovery, webmaster with HTML skills, etc.”’

Implementing the RMIT Web Project

Information audits
Focus groups
Content map of information categories in each department and where each should be positioned on the new website
An information architecture
– document types
– metadata sets
– searching pathways

Implementing the Web Project (cont.)

Rhys Williams:

‘The new information architecture has been a difficult thing to achieve. It needs to provide for multiple ways to get to authorised content for very different user groups (or marketing cohorts); as opposed to the current method of ‘hanging our content on to the organisational chart’ … Over 50% of RMIT students do not know which faculty or department they are in.’

Implementing the Web Project (cont.)

Examples of document types include:
– MAMS objects (‘learning objects’)
– Generic
– News Services and info
– Course info
– Program info
– Staff profiles
– Gallery for student work
– FAQs
– Area home pages

Implementing the Web Project (cont.)

METADATA - NEWS

Identifier ……
Visibility ……
Editorial group ……
Creator, Name ……
Date, Creation ……
Keywords ……

Title - Insert text here
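
The fields of the news template can be pictured as a simple record. The Python sketch below is an assumption about how such a record might be modelled, not a description of the SIM system: the class name, field types and the `missing_fields` check are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class NewsMetadata:
    """One record per news document; field names follow the template above."""
    identifier: str
    visibility: str
    editorial_group: str
    creator_name: str
    date_created: date
    title: str
    keywords: List[str] = field(default_factory=list)

    def missing_fields(self):
        """Return the names of fields the uploader has left empty."""
        text_fields = ("identifier", "visibility", "editorial_group",
                       "creator_name", "title")
        missing = [name for name in text_fields if not getattr(self, name).strip()]
        if not self.keywords:
            missing.append("keywords")
        return missing

# All values below are invented for illustration.
record = NewsMetadata(
    identifier="news-0001",
    visibility="public",
    editorial_group="News Services",
    creator_name="",
    date_created=date(2003, 10, 1),
    title="Example news item",
)
print(record.missing_fields())
```

A check of this kind is one way a system relying on author-entered metadata could flag incomplete records before publication.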

Implementing the Web Project (cont.)

Uploading content to the Web involves:

1. Copy starter doc. (template)

2. Prepare the text of the doc. within the template using Word 2000 and special styles

3. Enter the metadata

4. Save doc. as a Web page

5. Upload doc. to Web system via IE or Netscape

6. Make final changes to metadata

Risky decisions

1. Uploaders of content enter the metadata themselves

‘In resource poor, but information rich enterprises such as universities, the online publishing system is moving toward the role of content creator, author, cataloguer, designer, publisher, and marketer being aggregated.’ - Rhys Williams

2. Controlled vocabulary will not be used in the subject/keywords field

Critical role of doc types and metadata

Automatic assignment of some metadata elements

Placement of docs in logical positions within the website

Automatic table of contents display

Global searching across the site by doc type or area

Searching by doc type within a faculty/dept.

Filtered searching by audience type

Searching by topic words in titles, keywords, abstracts and full text

Control of docs via owning group over who may create, edit, authorise
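
The searching functions listed above all depend on metadata being present on each document. A minimal Python sketch of filtered searching follows; the records, field names and the `search` function are hypothetical and stand in for whatever the real Web system does.

```python
# Hypothetical records; field names mirror the search options listed above.
DOCS = [
    {"title": "Open day announced", "doc_type": "News", "area": "Marketing",
     "audience": "prospective students", "keywords": ["open day", "events"]},
    {"title": "Bachelor of Information Management", "doc_type": "Program info",
     "area": "Business", "audience": "prospective students",
     "keywords": ["metadata", "records"]},
    {"title": "Staff leave FAQ", "doc_type": "FAQs", "area": "HRS",
     "audience": "staff", "keywords": ["leave", "payroll"]},
]

def search(docs, doc_type=None, area=None, audience=None, topic=None):
    """Return titles of docs that match every filter that was supplied."""
    matches = []
    for doc in docs:
        if doc_type is not None and doc["doc_type"] != doc_type:
            continue
        if area is not None and doc["area"] != area:
            continue
        if audience is not None and doc["audience"] != audience:
            continue
        if topic is not None:
            # Topic words are matched against the title and the keywords.
            searchable = " ".join([doc["title"], *doc["keywords"]]).lower()
            if topic.lower() not in searchable:
                continue
        matches.append(doc["title"])
    return matches

print(search(DOCS, audience="prospective students"))
print(search(DOCS, doc_type="FAQs", area="HRS"))
print(search(DOCS, topic="metadata"))
```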

Critical role of doc types and metadata - functions to be developed later

Forms processing

Visibility security tied to SAP/AMS/DLS derived groups

Further alignment with AMS, DLS, SAP and other portals

Integration with single sign-on authentication

Support for site personalisation and user profiling

Integration with Customer Relations Mgt.

Customisable home page/pushed content

Mailing lists and mailouts

Site statistics and tracking

International metadata standards

Grand vision

Create common element sets that will
– achieve excellent precision and recall
– facilitate document management functions

Aim to be
– simple to apply
– interoperable across geographic boundaries and computer systems
– flexible in applying to different formats and functions
– easy and quick to assign
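
Precision and recall are the standard retrieval measures: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The Python sketch below works through one small case; the document identifiers are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical search: four documents returned, three documents truly relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc5"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.5) and two of the three relevant documents were found (recall about 0.67).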

International metadata standards

Challenging issues

Can one set of elements suit all local needs?

Can more than one standard be applied to a document?

Some elements have a slightly different meaning in a particular context

Underdeveloped guidelines

Which controlled vocabulary?

Who is going to apply the metadata?

Possible sabotage or manipulation of metadata for commercial reasons

International metadata standards

On the positive side

Such standards are working well within controlled networks and consortia - eg Picture Australia, AVEL, EdNA

Technical solutions are being found to counter sabotaged metadata

Clever search engines

Speedy ways to enter metadata

Ingenious ways of creating subsets of documents

International standards bodies are talking to each other

Developing an RMIT Standard for the Web Project

Three relevant standards
– Dublin Core Metadata Initiative
– EdNA Education Network Australia standard
– IMS Instructional Management System

International metadata standards (cont.)

Dublin Core Metadata Initiative (DCMI) (1995-) http://purl.oclc.org/dc

In 1995 1st Workshop in Dublin, Ohio
In 1999 DC Version 1.1 Elements
In 2000 a set of DC qualifiers was recommended
In 1999/2000 XML, HTML and RDF mark-up guidelines

International metadata standards (cont.)

DUBLIN CORE (15 ELEMENTS)

Content (Title, Subject, Description, Language, Source, Relation, Coverage)
Intellectual property (Creator, Contributor, Publisher, Rights)
Instance (Date, Type, Format, Identifier)
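
Dublin Core elements are often embedded in the header of a Web page as `meta` tags whose names carry a `DC.` prefix, which is what ‘View source’ reveals on many sites. The Python sketch below generates such tags from a record. The grouping of the 15 elements follows the slide; the function name, the sample values and the exact capitalisation of the tag names are assumptions.

```python
from html import escape

# The 15 elements, grouped as on the slide.
DC_ELEMENTS = {
    "Content": ["Title", "Subject", "Description", "Language",
                "Source", "Relation", "Coverage"],
    "Intellectual property": ["Creator", "Contributor", "Publisher", "Rights"],
    "Instance": ["Date", "Type", "Format", "Identifier"],
}

def dc_meta_tags(record):
    """Render a dict of element -> value as HTML meta tags.

    Raises ValueError for a name that is not one of the 15 elements.
    """
    known = {name for group in DC_ELEMENTS.values() for name in group}
    tags = []
    for element, value in record.items():
        if element not in known:
            raise ValueError(f"not a Dublin Core element: {element}")
        content = escape(value, quote=True)  # protect quotes and angle brackets
        tags.append(f'<meta name="DC.{element}" content="{content}">')
    return tags

# Sample values are invented for illustration.
tags = dc_meta_tags({
    "Title": 'Open day "2003"',
    "Creator": "News Services",
    "Date": "2003-10-01",
})
print("\n".join(tags))
```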

International metadata standards (cont.)

EdNA Education Network Australia (1997-)

http://www.edna.edu.au/metadata/

Based on Dublin Core but with approx. 9 extra elements, eg Audience, Review, Version

For the Australian education sector - schools, VET, ACE, higher ed.

International metadata standards (cont.)

IMS (Instructional Management System) 1997-

http://www.imsproject.org

IMS Global Learning Consortium, Inc.
Educom, NIST, vendors, ARIADNE, IEEE
IMS Meta-Data Best Practice and Implementation Guide Version 1.1 2000
IEEE Learning Object Metadata base document
IMS Learning Resource XML Binding Specification Version 1.1 2000

International metadata standards (cont.)

IMS elements grouped into 8 categories

General (identifier, title, catalog entry, language, keywords)

Lifecycle (version, status, contribute)

Technical (format, size, location, requirements, type, minimum version, max. version, installation, platform …)

Educational (interactivity type, learning resource type, semantic density, end user role, learning context, typical age range, difficulty …)

Rights (cost, copyright, description)

Relation (kind, resource)

Annotation (person, date, description)

Classification (purpose, taxon path, description, keywords)

To conclude: metadata offers challenges and opportunities

Daring and innovative RMIT Web Project
– overarching vision to provide a complex multi-functional information and communication environment
– use of metadata to achieve vital functionality
– reliance on author-provided metadata
– pragmatic approach to controlled vocabulary

To conclude

Rhys Williams:

‘A key challenge for the new Web system is not the technology itself, rather the strength of the business processes associated with the creation of core corporate content, including metadata compliance. The Web system will dynamically display a wealth of content easily and quickly. It, however, cannot control for out of date, incomplete, incorrect, unauthorised content. And we have an awful lot of this sort of content now.’