Modelling in Knowledge Management - Issues of Metadata
Theory and practical case study
Don Schauder - Professor of Information Management, SIMS, Monash
Cherryl Schauder - Metadata Project Officer, RMIT
October 2003
References
Kennedy and Schauder 1998, Records Management. Business classification scheme, pp 65-68. Chapters 6 and 7
Controlled vocabularies. Middleton, Mike. QUT site - http://www2.fit.qut.edu.au/InfoSys/middle/cont_voc.html
Relevant websites
– Yahoo
– Magellan
– www.dmoz.oz
– EdNA browse map http://www.edna.edu.au/EdNA
– ASCED bass.adm.monash.edu.au/policy/asced/asced2000.htm
– ANZIC www.cqu.edu.au/research/research_services/arc/anzic_codes.htm
– ‘View source’ - metatags in HTML headers of many websites, eg Museum of Victoria
Classification as Knowledge Modelling
The way we organise our kitchens, our bills, our wallets or handbags … our kinds of work, our political and justice systems … etc
Part of our everyday lives and thought processes
Part of our culture, our ways of interpreting what happens in our lives and around us
Mostly explicit knowledge (ie ‘documents’), but also tacit knowledge
Key to classification: similarity/dissimilarity
Classification as Knowledge Modelling
Grouping things together into
– classes or categories (facets)
– and sub-classes or sub-categories (sub-facets)
Hierarchies or family trees, systems and sub-systems
In document organisation
Classificatory models (classification schemes) are used for
– DISPLAY - IN HIERARCHIES, BROAD TO NARROW - INDICATES RELATIONSHIPS
– RETRIEVAL - IF CODED, THE CODES ENABLE A FILING ORDER
A class (facet) is a group of things which share a common feature
– Disciplines - engineering, science, history, art ...
– Types of political system - capitalism, communism, socialism ...
– Age groups - infant, child, teenager, adult ...
– Towns - Melbourne, Sydney, Perth ...
– Cricket grounds - MCG, SCG, GABBA ...
– Grades - High distinction, distinction, credit …
– Degrees of approval - acceptable, unacceptable
Ordering concepts in a classified arrangement - logical/systematic
The order of facets/classes in a hierarchy may go from least to most significant facet - or the opposite; or follow some logical order recognisable by the target audience
The order of concepts within each class/sub-class should be some logical order that has meaning in the context - most to least common, young to old, established scientific order, small to big
Alphabetical is a last resort
Mutual exclusivity
When we are organising information we need - as far as possible - to be wary of overlapping categories which can cause confusion (unless we are purposefully using overlapping categories, eg for publicity reasons in a sales campaign)
JEANS
– BLACK
– BLUE
– GREY
– STRETCH?
Be careful to avoid overlap
AN ORGANISATION’S PERMANENT WEB NAVIGATION BUTTONS
– COURSES
– ABOUT US
– NEWS
– EVENTS
– ACADEMIC PROGRAMS
– CAREERS
– EMPLOYMENT AND JOBS
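The overlap problem in the JEANS example can be checked mechanically once items have been assigned to categories. The following is a minimal sketch, not part of the original slides: the product names are hypothetical, and the function simply reports every pair of categories that share at least one item.

```python
from itertools import combinations


def overlapping_pairs(categories):
    """Return {(cat_a, cat_b): shared_items} for each pair of categories
    that has at least one item in common."""
    overlaps = {}
    for a, b in combinations(sorted(categories), 2):
        shared = set(categories[a]) & set(categories[b])
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


# Hypothetical stock list: colour and fabric are different facets,
# so a pair of black stretch jeans fits two categories at once.
jeans = {
    "BLACK": {"black straight-leg", "black stretch"},
    "BLUE": {"blue bootcut", "blue stretch"},
    "GREY": {"grey slim"},
    "STRETCH": {"black stretch", "blue stretch"},
}

overlaps = overlapping_pairs(jeans)
for pair, items in overlaps.items():
    print(pair, sorted(items))
```

An empty result would indicate that the categories are mutually exclusive for the items tested; it says nothing about items not yet classified.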
Hierarchies - Groupings, sub-groupings and sub-sub-groupings
BUILDINGS
– RELIGIOUS
  » CHURCHES
  » MOSQUES
– EDUCATIONAL (LEVELS)
  » PRE-SCHOOL
  » PRIMARY
  » SECONDARY
  » TERTIARY (TAFE, UNIVERSITY)
  » CONTINUING
– RESIDENTIAL
  » HOUSES
  » UNITS
  » APARTMENTS
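A hierarchy like the BUILDINGS chart can be held as a nested structure and displayed broad to narrow. This is a rough sketch only: the placement of TAFE and UNIVERSITY under TERTIARY is an assumption read from the chart layout, and the function names are illustrative.

```python
# Each key is a class; its value holds the sub-classes (empty dict = leaf).
HIERARCHY = {
    "BUILDINGS": {
        "RELIGIOUS": {"CHURCHES": {}, "MOSQUES": {}},
        "EDUCATIONAL": {
            "PRE-SCHOOL": {},
            "PRIMARY": {},
            "SECONDARY": {},
            "TERTIARY": {"TAFE": {}, "UNIVERSITY": {}},  # assumed placement
            "CONTINUING": {},
        },
        "RESIDENTIAL": {"HOUSES": {}, "UNITS": {}, "APARTMENTS": {}},
    }
}


def walk(tree, depth=0):
    """Yield (depth, name) pairs, broad classes before their narrower ones."""
    for name, children in tree.items():
        yield depth, name
        yield from walk(children, depth + 1)


def broader_terms(tree, target, trail=()):
    """Return the chain of broader classes above `target`, or None if absent."""
    for name, children in tree.items():
        if name == target:
            return list(trail)
        found = broader_terms(children, target, trail + (name,))
        if found is not None:
            return found
    return None


for depth, name in walk(HIERARCHY):
    print("  " * depth + name)
```

Calling `broader_terms(HIERARCHY, "MOSQUES")` walks back up the tree, which is the relationship the display is meant to indicate.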
Hierarchies
Convenient way to think about organisations, products, people, systems of belief
BUT SOME HIERARCHICAL GROUPINGS ARE MORE PERMANENT THAN OTHERS
THIS IS IMPORTANT WHEN
– ARRANGING INFORMATION ON A WEB PAGE
– FILING THINGS ON A SHELF
Hierarchies
Longer life, eg
Natural materials
– Stone
  » Marble
  » Slate
– Timber
  » Hardwoods
  » Softwoods
    Pine
Shorter life, eg
Organisation X
– Dept of A
  » Section
    Division
– Dept of B
  » Section
    Division
– Dept of C
  » Section
    Division
Hierarchical relationships
‘True’ generic relationships
– Timber is always a natural material
– Rabbits are always mammals
– Carrots are always a vegetable
Contextual generic relationships
– Psychology is sometimes a Science
– Rabbits are sometimes Pets
– Goodwill is sometimes a tradeable asset
Classification schemes or models
Can cover all of knowledge or only a specialised field – the more fields and sub-fields, the greater the complexity of the scheme - CUSTOMISED OR WIDE AUDIENCE
Classification schemes reflect the viewpoint of their compilers and their end-users – they are overlaid with the values and ways of thinking of the people who created them – they cannot be ‘objective’
No classification approach will suit all the users
Classification displays broad to narrow
Classification schemes are often linear, one-dimensional arrangements, whereas the broad knowledge/content in documents is often multi-dimensional
Can display a classification scheme as a map/diagram with arrows showing multiple relationships, or as a tree structure
Classified groupings
Displays how documents have been grouped, eg subject directories on the Web - Yahoo and Magellan
BUT still need an index/search engine for searching across the categories
Filing systems in electronic directories, libraries and record keeping contexts
A document’s content is often multi-dimensional
– Agricultural surveys of land use
– Outdoor cookery for children
– Valuation of land in South Yarra in 1999
A classification allows only one position for each topic, whether simple or complex
An index/catalogue/search engine is needed to search across the grouped categories
Creating a Classification scheme for organising documents
A classification scheme is a hierarchy or a set of hierarchies of concepts, usually with some kind of code attached to each concept.
The code often establishes an order for filing the documents – usually numbers, letters or a combination of both, eg A2, A46, B13, B14, B335.
The numbers or letters can file decimally or as integers/whole numbers.
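The difference between decimal and integer filing only shows up for certain codes, so a small sketch may help. It assumes codes of the form "letters then digits" as in the examples above; the code B4 is added purely for illustration because the slide's own examples happen to file the same way under both rules.

```python
import re


def split_code(code):
    """Split a code such as 'B335' into its letter prefix and digit string."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", code)
    if match is None:
        raise ValueError(f"unexpected code format: {code!r}")
    return match.group(1), match.group(2)


def integer_key(code):
    """File the numeric part as a whole number: 4 < 13 < 335."""
    letters, digits = split_code(code)
    return letters, int(digits)


def decimal_key(code):
    """File the numeric part as a decimal fraction: .13 < .335 < .4.
    Comparing the digit strings character by character gives this order."""
    letters, digits = split_code(code)
    return letters, digits


codes = ["B335", "A46", "B4", "B13", "A2", "B14"]  # B4 is a made-up addition
integer_order = sorted(codes, key=integer_key)
decimal_order = sorted(codes, key=decimal_key)
print("integer:", integer_order)
print("decimal:", decimal_order)
```

Decimal filing lets a new code such as B335 be slotted between B33 and B34 without renumbering, which is relevant to how expansible a scheme is.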
Dewey Decimal Classification
000 General (Knowledge, the Book, Bibliography, Computer Science, Librarianship, Journalism, etc)
100 Philosophy
200 Religion
300 Social sciences
400 Language and linguistics
500 Natural sciences and mathematics
600 Technology
700 The Arts
800 Literature and rhetoric
900 Geography, History, Biography, Genealogy
Dewey’s Relative Index
Alphabetical display of topics showing the number for each within different contexts
LAND USE 346.068
– AGRICULTURAL SURVEYS 333.73
– COMMUNITY SOCIOLOGY 307.33
– ECONOMICS 333.73
– LAW 346.045
– PUBLIC ADMINISTRATION 350.82326
Dewey – Hierarchical numbering code
627 Hydraulic engineering
627.1 Inland waterways
627.12 Rivers and streams
627.122 Sediment and silt
627.123 Water diversion
627.124 Estuaries and river mouths
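Because each extra digit narrows the topic, the broader classes of a Dewey number can be read off the number itself. The sketch below uses only the six entries listed above and goes no higher than the three-digit class; the function names are illustrative.

```python
DEWEY = {
    "627": "Hydraulic engineering",
    "627.1": "Inland waterways",
    "627.12": "Rivers and streams",
    "627.122": "Sediment and silt",
    "627.123": "Water diversion",
    "627.124": "Estuaries and river mouths",
}


def broader_codes(code):
    """Return the chain of codes from the three-digit class down to `code`,
    adding one decimal digit at each step."""
    base, _, decimals = code.partition(".")
    chain = [base]
    for i in range(1, len(decimals) + 1):
        chain.append(f"{base}.{decimals[:i]}")
    return chain


def path(code):
    """Pair each code in the chain with its caption, skipping unlisted codes."""
    return [(c, DEWEY[c]) for c in broader_codes(code) if c in DEWEY]


estuary_path = path("627.124")
for number, caption in estuary_path:
    print(number, caption)
```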
Classification schemes/models cont.
Classification schemes need to be dynamic – they need to be able to grow and change with changing needs and times
The coding system used needs to be able to expand in the logical position where new topics have arisen - be sure that your numbering system is infinitely expansible
Classification schemes for organising documents are often organised by
– Subject content of the documents
– The functions of an organisation
– The structure of an organisation
An organisation’s website can allow a mix of all three approaches to give clients/customers/users the best retrieval possibilities
Main basis for arrangements
Subject - eg library filing arrangements for publications
Functions and structure - eg record keeping, websites
Often best bet is a bit of all three approaches - but remember to consider likely lifespan of the approach
Curtin University’s Keyword Thesaurus
– First two elements are function/activity based terms from a list; elements which follow are subject based - from list or free text - note: BROAD TO NARROW
– PERSONNEL - RECRUITMENT - APPLICATIONS - RECORDS MANAGER
– i.e. (FUNCTION - ACTIVITY - SUBJECT FROM LIST - FREE TEXT)
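The FUNCTION - ACTIVITY - SUBJECT - FREE TEXT pattern can be sketched as a small title builder. The term lists below are hypothetical stand-ins (only PERSONNEL, RECRUITMENT and APPLICATIONS come from the slide); the real Curtin thesaurus is not reproduced here.

```python
# Hypothetical controlled lists: activities are only valid under their function.
ACTIVITIES_BY_FUNCTION = {
    "PERSONNEL": {"RECRUITMENT", "TRAINING"},
    "FINANCE": {"BUDGETING"},
}
SUBJECT_TERMS = {"APPLICATIONS", "POLICY"}


def build_title(function, activity, subject, free_text=""):
    """Join the elements broad to narrow, validating the controlled ones."""
    function, activity, subject = function.upper(), activity.upper(), subject.upper()
    if function not in ACTIVITIES_BY_FUNCTION:
        raise ValueError(f"unknown function term: {function}")
    if activity not in ACTIVITIES_BY_FUNCTION[function]:
        raise ValueError(f"{activity} is not an activity under {function}")
    if subject not in SUBJECT_TERMS:
        raise ValueError(f"unknown subject term: {subject}")
    parts = [function, activity, subject]
    if free_text.strip():
        parts.append(free_text.strip().upper())  # free text is not validated
    return " - ".join(parts)


title = build_title("Personnel", "Recruitment", "Applications", "records manager")
print(title)
```

Validating the first elements against a list while leaving the last one free is what keeps titles consistent at the broad end and flexible at the narrow end.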
Role of RMIT metadata project officer
‘Website Refurbishment Prototype project’
Uses SIM (Structured Information Manager) created by RMIT MDS
First sections to move into the new system:
– TCE
– part of HRS
– current top end of the website (homepage, programs, courses, admissions, About RMIT, News etc)
Role of metadata project officer (cont.)
Role: To assist with the selection and refinement of standards for cataloguing web pages as part of the Prototype project.
Tasks involve advising on aspects of the metadata, compiling vocabularies or drop-down menus, testing functionality, participating in meetings and training, involvement in data quality monitoring
Metadata templates for a range of doc types - MAMS, Generic, News Services etc, Learning objects
The Concept behind the New RMIT Web Strategy
‘Benchmarking study - current website ranked in bottom 25% of Australian university websites with respect to veracity, findability and usability of core corporate content’
‘The project should place us in top 25% in approx. 2 years time’
- Rhys Williams
RMIT Project aims to be a whole range of things:
– a unified but flexible publishing system with standardised branding and design elements
– system for delivering key corporate content dynamically
– a document production and control system
– a storage and archiving service
– a facilitator of learning and knowledge sharing
– a communications facilitator
– a career planning and employment service
– a repository of multimedia objects
– a repository of reusable learning objects
Concept (cont.)
Rhys Williams:
‘RMIT seeks to provide a robust Web system set up and ready for all departments, faculties, units. Individuals put content into the system needing only Word 2000 skills. Rather than the current system of ‘if you want a website you provide your own boxes, wires, disaster recovery, webmaster with htmls skills, etc.’
Implementing the RMIT Web Project
Information audits
Focus groups
Content map of information categories in each department and where each should be positioned on the new website
An information architecture
– document types
– metadata sets
– searching pathways
Implementing the Web Project (cont.)
Rhys Williams:
‘The new information architecture has been a difficult thing to achieve. It needs to provide for multiple ways to get to authorised content for very different user groups (or marketing cohorts); as opposed to the current method of ‘hanging our content on to the organisational chart’ … Over 50% of RMIT students do not know which faculty or department they are in.’
Implementing the Web Project (cont.)
Examples of document types include:
– MAMS objects (‘learning objects’)
– Generic
– News
– Services and info
– Course info
– Program info
– Staff profiles
– Gallery for student work
– FAQs
– Area home pages
Implementing the Web Project (cont.)
METADATA - NEWS
Identifier ……
Visibility ……
Editorial group ……
Creator, Name ……
Date, Creation ……
Keywords ……
Title - Insert text here
Implementing the Web Project (cont.)
Uploading content to the Web involves:
1. Copy starter doc. (template)
2. Prepare the text of the doc. within the template using Word 2000 and special styles
3. Enter the metadata
4. Save doc. as a Web page
5. Upload doc. to Web system via IE or Netscape
6. Make final changes to metadata
Risky decisions
1. Uploaders of content enter the metadata themselves
‘In resource poor, but information rich enterprises such as universities, the online publishing system is moving toward the role of content creator, author, cataloguer, designer, publisher, and marketer being aggregated.’
- Rhys Williams
2. Controlled vocabulary will not be used in the subject/keywords field
Critical role of doc types and metadata
Automatic assignment of some metadata elements
Placement of docs in logical positions within the website
Automatic table of contents display
Global searching across the site by doc type or area
Searching by doc type within a faculty/dept.
Filtered searching by audience type
Searching by topic words in titles, keywords, abstracts and full text
Control of docs via owning group over who may create, edit, authorise
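The search functions listed above all depend on each document carrying a doc type, an area, an audience and keywords. The sketch below illustrates that dependence only: the field names and sample records are hypothetical and do not describe how the SIM system actually stores or queries metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    title: str
    doc_type: str
    area: str
    audiences: set = field(default_factory=set)
    keywords: set = field(default_factory=set)


def search(docs, doc_type=None, area=None, audience=None, topic=None):
    """Return docs matching every filter that was supplied (None = no filter)."""
    results = []
    for doc in docs:
        if doc_type is not None and doc.doc_type != doc_type:
            continue
        if area is not None and doc.area != area:
            continue
        if audience is not None and audience not in doc.audiences:
            continue
        if topic is not None:
            # Topic words are looked up in the title and keywords only.
            words = set(doc.title.lower().split()) | {k.lower() for k in doc.keywords}
            if topic.lower() not in words:
                continue
        results.append(doc)
    return results


# Hypothetical sample records
docs = [
    Doc("Enrolment dates announced", "News", "Admissions", {"students"}, {"enrolment"}),
    Doc("Metadata for learning objects", "FAQs", "Library", {"staff"}, {"metadata", "cataloguing"}),
    Doc("Graduate exhibition opens", "News", "Art", {"students", "public"}, {"gallery"}),
]

news_for_public = search(docs, doc_type="News", audience="public")
print([d.title for d in news_for_public])
```

If uploaders leave a field empty or fill it inconsistently, the corresponding filter silently misses the document, which is why metadata quality matters to the functions on this slide.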
Critical role of doc types and metadata - functions to be developed later
Forms processing
Visibility security tied to SAP/AMS/DLS derived groups
Further alignment with AMS, DLS, SAP and other portals
Integration with single sign-on authentication
Support for site personalisation and user profiling
Integration with Customer Relations Mgt.
Customisable home page/pushed content
Mailing lists and mailouts
Site statistics and tracking
International metadata standards
Grand vision
Create common element sets that will
– achieve excellent precision and recall
– facilitate document management functions
Aim to be
– simple to apply
– interoperable across geographic boundaries and computer systems
– flexible in applying to different formats and functions
– easy and quick to assign
International metadata standards
Challenging issues
Can one set of elements suit all local needs?
Can more than one standard be applied to a document?
Some elements have a slightly different meaning in a particular context
Underdeveloped guidelines
Which controlled vocabulary?
Who is going to apply the metadata?
Possible sabotage or manipulation of metadata for commercial reasons
International metadata standards
On the positive side
Such standards are working well within controlled networks and consortia - eg Picture Australia, AVEL, EdNA
Technical solutions are being found to counter sabotaged metadata
Clever search engines
Speedy ways to enter metadata
Ingenious ways of creating subsets of documents
International standards bodies are talking to each other
Developing an RMIT Standard for the Web Project
Three relevant standards
– Dublin Core Metadata Initiative
– EdNA Education Network Australia standard
– IMS Instructional Management System
International metadata standards (cont.)
Dublin Core Metadata Initiative (DCMI) (1995-) http://purl.oclc.org/dc
In 1995 1st Workshop in Dublin, Ohio
In 1999 DC Version 1.1 Elements
In 2000 a set of DC qualifiers was recommended
In 1999/2000 XML, HTML and RDF mark-up guidelines
International metadata standards (cont.)
DUBLIN CORE (15 ELEMENTS)
Content (Title, Subject, Description, Language, Source, Relation, Coverage)
Intellectual property (Creator, Contributor, Publisher, Rights)
Instance (Date, Type, Format, Identifier)
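Dublin Core elements are commonly embedded in a page header as meta tags, which is what ‘View source’ reveals on many sites. The sketch below groups the 15 elements as on this slide and renders a record as tags using the widely seen `DC.` name prefix; the sample record is invented, and the exact casing and attributes a given site uses may differ.

```python
from html import escape

# The 15 elements, grouped as on the slide.
DC_ELEMENTS = {
    "Content": ["Title", "Subject", "Description", "Language",
                "Source", "Relation", "Coverage"],
    "Intellectual property": ["Creator", "Contributor", "Publisher", "Rights"],
    "Instance": ["Date", "Type", "Format", "Identifier"],
}
ALL_ELEMENTS = {name for group in DC_ELEMENTS.values() for name in group}


def to_meta_tags(record):
    """Render {element: value} as HTML meta tags, rejecting unknown elements."""
    tags = []
    for element, value in record.items():
        if element not in ALL_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        content = escape(value, quote=True)  # protect &, <, > and quotes
        tags.append(f'<meta name="DC.{element.lower()}" content="{content}">')
    return tags


# Invented sample record, for illustration only
record = {
    "Title": "Museum news & events",
    "Creator": "Web team",
    "Date": "2003-10-01",
    "Language": "en",
}
tags = to_meta_tags(record)
print("\n".join(tags))
```

Extension elements such as those EdNA adds would need their own prefix and list; this sketch rejects anything outside the 15.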
International metadata standards (cont.)
EdNA Education Network Australia (1997-)
http://www.edna.edu.au/metadata/
Based on Dublin Core but with approx. 9 extra elements, eg Audience, Review, Version
For the Australian education sector - schools, VET, ACE, higher ed.
International metadata standards (cont.)
IMS (Instructional Management System) 1997-
http://www.imsproject.org
IMS Global Learning Consortium, Inc.
Educom, NIST, vendors, ARIADNE, IEEE
IMS Meta-Data Best Practice and Implementation Guide Version 1.1 2000
IEEE Learning Object Metadata base document
IMS Learning Resource XML Binding Specification Version 1.1 2000
International metadata standards (cont.)
IMS elements grouped into 8 categories
Lifecycle (version, status, contribute)
Technical (format, size, location, requirements, type, minimum version, max. version, installation, platform …)
General (identifier, title, catalog entry, language, keywords)
Rights (cost, copyright, description)
Educational (interactivity type, learning resource type, semantic density, end user role, learning context, typical age range, difficulty …)
Relation (kind, resource)
Annotation (person, date, description)
Classification (purpose, taxon path, description, keywords)
To conclude: metadata offers challenges and opportunities
Daring and innovative RMIT Web Project
– overarching vision to provide a complex multi-functional information and communication environment
– use of metadata to achieve vital functionality
– reliance on author-provided metadata
– pragmatic approach to controlled vocabulary
To conclude
Rhys Williams:
‘A key challenge for the new Web system is not the technology itself, rather the strength of the business processes associated with the creation of core corporate content, including metadata compliance. The Web system will dynamically display a wealth of content easily and quickly. It, however, cannot control for out of date, incomplete, incorrect, unauthorised content. And we have an awful lot of this sort of content now.’