Searching & Utilizing of Information 陈贵梧 Chen Gui-wu September, 2006.

Outline

Ⅰ About the Course

Ⅱ Rules of Information

Ⅲ Basic Concepts

Ⅳ Information Sources

Ⅴ Query Languages

Ⅵ Questions & Answers

Questions/comments are always welcome!

Ⅰ About the Course

Goals:

1. Introduce students to the basic academic information sources.

2. Acquaint students with the strategies and techniques of information searching.

3. Develop students' information competency, including information identification, acquisition, evaluation, and utilization.

About the Course

Competencies:

Identify, understand, and effectively use a variety of information sources in print and electronic formats.

Become familiar with the most frequently used search strategies.

Acquire, organize, and evaluate academic information at a basic level.

Apply information searching techniques in practice.

About the Course

Activities: Lectures & Presentations, Practices, Workgroup, Homework, Examination

Ⅱ Rules of Information

Rule1: Go Where It Is

Rule2: The Answer You Get Depends on the Questions You Ask

Rule3: The Answer Should Match the Information Need

Rule4: Question Your Answers — Information May Be True But Still Wrong

Rule5: Research Is a Multi-Stage Process

Ⅲ Basic Concepts

1. Information Chain

2. Database

3. Electronic Journal

4. Fields & Records

5. Search Engine Vs. Web Directory

6. Impact Factor

1. Information Chain

A rough way of characterizing information materials is to classify them as primary, secondary, or tertiary.

Primary sources: These are original materials which have not been filtered through interpretation, consideration, or, often, even evaluation by a second party.

e.g. a journal article, monograph, report, patent, dissertation, or reprint of an article.

Information Chain: Secondary Sources

A secondary source is information about primary, or original, information which usually has been modified, selected, or rearranged for a specific purpose or audience.

e.g. indexes, abstracts

The neat distinction between primary and secondary sources is not always apparent.

Information Chain: Tertiary Sources

These consist of information which is a distillation and collection of primary and secondary sources.

Twice removed from the original, they include almost all the source types of reference works.

e.g. encyclopedias, reviews, bibliographical sources, fact books, and almanacs.

Information Chain

The definitions of primary, secondary, and tertiary sources are useful only in that they indicate:

relative currency (primary sources tend to be more current than secondary sources), and

relative accuracy of materials (primary sources will generally be more accurate than secondary sources, if only because they represent unfiltered, original ideas; conversely, a secondary source may correct errors in the primary source).

2. Database

A set of related files that is created and managed by a database management system (DBMS).

Today, DBMSs can manage any form of data, including text, images, sound, and video.

Databases can be categorized into index databases, abstracts databases, and full-text databases; or classified as general databases and subject-specific databases.

Database Types

Index Database

An index database lists the author, title, date, volume, and source for an article.

Abstracts Database

An abstract database gives all of that information, as well as an abstract, which is a short summary of the article.

Full Text Database

Increasingly, databases are including the full-text of articles.

The whole article can be printed, e-mailed, or saved to disk for later use.

However, not everything is available in full-text.

Database Types

Some are general, such as EBSCO, SDOS, Lexis-Nexis, or UMI ProQuest,

while others are subject-specific, such as BA, CA, Ovid, MEDLINE, or PubMed.

3. Electronic Journal

Any journal available over the Internet can be called an "electronic journal" or "e-journal".

In many cases e-journals are counterparts to familiar print publications, although an increasing number of titles exist only in electronic format.

Frequently e-journals appear on the screen exactly as they do in print with similar page design and typeface.

Electronic Journal

Often e-journal issues may be available before the print counterpart is on the library shelf.

Some e-journals even provide advance copies of articles accepted for publication but not yet scheduled for a print issue.

In most cases, the electronic equivalent of a print journal only exists for the most recent volumes; older issues still need to be read in print.

E-journals Vs. Databases

Full-text e-journals may be viewed as individual issues that correspond to their print counterparts.

Typically, you will use these on the publishers' site where you may browse the table of contents of an issue, scan the abstracts of articles, and view the full-text of articles.

Full-text article databases are collections of articles, not complete issues of journals.

Often these collections bring together articles on a particular subject, such as medicine or biology studies; they are usually searched by subject.

4. Fields & Records

A field is a unit of data. Examples of fields are title (TI), keyword, author (AU), source (SO), address (AD), and abstract (AB).

A collection of fields makes up a record.

In databases, searchable fields are sometimes called search options.
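
As a minimal sketch of the idea, a record can be modeled as a mapping from field tags to values, and a field-restricted search then looks only inside the chosen field. The two records and the `search_field` helper below are hypothetical and are not taken from any particular database.

```python
# Hypothetical records: each record is a collection of fields keyed by tag.
records = [
    {"TI": "Transgenic mice in cancer research", "AU": "Smith J",
     "SO": "Example Journal", "AB": "A short summary about mice."},
    {"TI": "Liver function in rats", "AU": "Mice A",
     "SO": "Example Journal", "AB": "A short summary about rats."},
]


def search_field(records, tag, term):
    """Return the records whose field `tag` contains `term` (case-insensitive)."""
    term = term.lower()
    return [r for r in records if term in r.get(tag, "").lower()]


# The same term gives different results depending on the field searched.
title_hits = search_field(records, "TI", "mice")
author_hits = search_field(records, "AU", "mice")
print([r["TI"] for r in title_hits])
print([r["TI"] for r in author_hits])
```

Choosing the field (the search option) is what separates an article about mice from an article written by an author named Mice.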

[Figure: Sample Record, with its fields labeled]

5. Search Engine Vs. Web Directory

A search engine is a program designed to help find files stored on a computer system, such as the World Wide Web. The best-known search engine is Google.

The search engine allows one to ask for content meeting specific criteria (typically, content containing a given word or phrase) and retrieves a list of files that match those criteria.

A search engine often uses a previously built and regularly updated index to look for files after the user has entered search criteria.
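
The role of the index can be sketched as follows. This is a simplified illustration, not how any real search engine is implemented; the page names, their text, and the `build_index` and `lookup` helpers are hypothetical.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index


def lookup(index, word):
    """Answer a one-word query from the index without rescanning the pages."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical pages; the index is built once, before any query arrives.
pages = {
    "a.html": "information retrieval basics",
    "b.html": "searching academic information",
    "c.html": "journal impact factor",
}
index = build_index(pages)

print(lookup(index, "information"))
print(lookup(index, "google"))
```

Because the index is prepared in advance, a query only needs a dictionary lookup; the price is that the index must be rebuilt or updated when pages change.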

Search Engine Vs. Web Directory

A web directory is a directory on the World Wide Web that specializes in linking to other web sites and categorizing those links.

A web directory:
- has a pre-defined list of websites
- is compiled by human editors
- is categorized according to subject/topic
- is selective

Search Engine Vs. Web Directory

Web directories do not rely on software programs to collect their listings. They often allow site owners to submit their web sites for inclusion. Editors review qualified web sites and organize them by subject into categories.

The most popular directories are Yahoo and the Open Directory Project (http://dmoz.org/).

Search Engine Vs. Web Directory

Consider using the Directory instead of search engines whenever you want to:
- Familiarize yourself with a topic.
- Get suggestions for ways to narrow your search.
- Find ideas for query terms.
- Figure out the scope of a given category, e.g., the number of newspapers in California.
- View only pages that have been evaluated by a human editor.

6. Impact Factor

The Journal Impact Factor comes from Journal Citation Reports (JCR), a product of Thomson ISI (Institute for Scientific Information).

It is a measure of the frequency with which the "average article" in a journal has been cited in a particular year.

It is calculated by dividing the number of current citations to articles published in the two previous years by the total number of articles published in the two previous years.

It will help you evaluate a journal’s relative importance, especially when you compare it to others in the same field.
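
The calculation described above can be written out as a short worked example. The function name and the journal figures are hypothetical and serve only to illustrate the arithmetic.

```python
def impact_factor(citations_this_year, articles_prev_two_years):
    """Citations received this year to articles from the two previous years,
    divided by the number of articles published in those two years."""
    if articles_prev_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_this_year / articles_prev_two_years


# Hypothetical journal: in 2005 its 2003-2004 articles were cited 600 times,
# and it published 240 articles in 2003-2004.
print(impact_factor(600, 240))  # 2.5
```

An impact factor of 2.5 here means that the "average article" from the two previous years was cited two and a half times in the current year.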

Ⅳ Information Sources

General Databases: EBSCO, SDOS, John Wiley online Journal, ProQuest-ARL

Specific Databases: PubMed, MEDLINE, OVID, MD Consult, Cell Press

Index Databases: BA, CA, SciFinder Scholar, EI, Web of Science (SCI, BP, ISTP)

Internet Resources: BioMed Central, HighWire Press

Search Engines: Google

Web Directories: Yahoo, Open Directory Project

Ⅴ Query Languages

1. Keyword-based Querying

2. Boolean Operators

3. Proximity Operators

4. Other Operators

5. Truncation

6. Parentheses

1. Keyword-based Querying

Single-word queries: what is a word?

A word is a sequence of letters delimited by separators. Some characters may be non-splitting, such as the hyphen in "on-line"; the database decides.

Text documents are assumed to be essentially long sequences of words.

The result of a word query is the set of documents containing at least one of the words of the query.

Word queries are intuitive, easy to express, and fast to rank. Words can be highlighted in the output, so the exact positions where a word appears in the text may be required.
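
A minimal sketch of a word query, under the assumption that letters and digits form words and that a hyphen between them is non-splitting. The documents and the `tokenize` and `word_query` helpers are hypothetical; a real database decides its own word boundaries.

```python
import re

# Assumption: a hyphen between letters does not split a word ("on-line").
WORD = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def tokenize(text):
    """Split text into lowercase words, treating everything else as a separator."""
    return WORD.findall(text.lower())


def word_query(docs, query):
    """Return {doc_id: {word: [positions]}} for documents that contain
    at least one of the query words."""
    wanted = set(tokenize(query))
    hits = {}
    for doc_id, text in docs.items():
        positions = {}
        for pos, word in enumerate(tokenize(text)):
            if word in wanted:
                positions.setdefault(word, []).append(pos)
        if positions:
            hits[doc_id] = positions
    return hits


docs = {
    1: "On-line databases index journal articles.",
    2: "The library shelf holds print journals.",
    3: "Search the on-line catalogue for a journal.",
}
hits = word_query(docs, "on-line journal")
print(hits)
```

The recorded positions are what would allow a system to highlight the matched words in its output. Note that document 2 is not retrieved, because "journals" is a different word from "journal" unless truncation is used.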

Keyword-based Querying

Context Queries: Ensure that the words are related

Phrase: a sequence of words; normally, the exact phrase must be matched.

“enhance retrieval”

Some systems allow separators and stopwords inside the phrase, so that it also matches: “enhance the retrieval”

Proximity: a sequence of single words or phrases is given, together with a maximum allowed distance between them.

“enhance the quality of information retrieval”

The distance may be measured in words or letters, and the terms may be required to appear in the same order as in the query or in any order.
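
Phrase matching that tolerates stopwords can be sketched as below. The stopword list and the `content_words` and `phrase_match` helpers are assumptions made for illustration; each retrieval system defines its own stopwords.

```python
# Assumption: a tiny stopword list, chosen only for this example.
STOPWORDS = {"the", "of", "a", "an"}


def content_words(text):
    """Lowercase the text, strip punctuation, and drop stopwords."""
    words = [w.strip('.,;:!?"') for w in text.lower().split()]
    return [w for w in words if w and w not in STOPWORDS]


def phrase_match(phrase, text):
    """True if the phrase's content words appear consecutively, in order."""
    target = content_words(phrase)
    words = content_words(text)
    n = len(target)
    return n > 0 and any(
        words[i:i + n] == target for i in range(len(words) - n + 1)
    )


print(phrase_match("enhance retrieval", "Indexes enhance the retrieval of articles."))
print(phrase_match("enhance retrieval", "enhance the quality of information retrieval"))
```

The second text fails as a phrase match because "quality" and "information" are not stopwords; it is the kind of text that a proximity query with a larger allowed distance is meant to catch.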

2. Boolean Operators : AND

AND is used to find documents which contain both of the search terms linked by the operator and to eliminate documents which contain only one or neither of the search terms.

e1 AND e2

-- select all documents which satisfy both e1 and e2

e.g., to find documents on transgenic mice:

transgenic AND mice

Boolean Operators : OR

OR is used to find documents which contain either one or both of the search terms:

e1 OR e2

-- select all documents which satisfy e1 or e2

e.g., to find all documents referring to the kidney, the liver or both organs:

kidney OR liver

Boolean Operators : NOT

NOT is used to exclude documents from a retrieved set.

e1 NOT e2

-- select all documents which satisfy e1 but not e2

e.g., to find documents on rodents which do not deal with rats:

rodents NOT rats
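
The three Boolean operators correspond to set operations on the sets of documents that contain each term: AND is intersection, OR is union, and NOT is difference. The sketch below uses a hypothetical five-document collection and a hypothetical `containing` helper.

```python
# Hypothetical document collection.
docs = {
    1: "transgenic mice models",
    2: "kidney disease in mice",
    3: "liver and kidney function",
    4: "rodents such as rats",
    5: "rodents such as transgenic mice",
}


def containing(term):
    """Return the set of ids of documents that contain the word `term`."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}


and_result = containing("transgenic") & containing("mice")  # transgenic AND mice
or_result = containing("kidney") | containing("liver")      # kidney OR liver
not_result = containing("rodents") - containing("rats")     # rodents NOT rats

print(and_result)
print(or_result)
print(not_result)
```

AND narrows a search, OR broadens it, and NOT removes documents from a set that has already been retrieved.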

3. Proximity Operators

Proximity operators allow you to locate one word within a certain distance of another. The symbols generally used in this type of search are w and n.

The w represents the word "within" and the n represents the word "near." This type of search is not available in all databases.

This can be useful to narrow down a search when searching for a sequence of words, if the exact sequence is not known, or if no other means is available to indicate that the key words should be treated as a phrase.

Proximity Operators

Near Operator (Nx) -- finds words within x number of words from each other, regardless of the order in which they occur.

e.g.: television n2 violence would find "television violence" or "violence on television," but not "television may be the culprit in recent high school violence."

Within Operator (Wx) -- finds words within x number of words from each other, in the order they are entered in the search.

e.g.: Franklin w2 Roosevelt would find "Franklin Roosevelt" or "Franklin Delano Roosevelt" or "Franklin D. Roosevelt", but would not find "Roosevelt Franklin".
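
The two operators can be sketched as follows, assuming that "within x words" means the positions of the two words differ by at most x. That reading is consistent with the examples above, but databases count distance in slightly different ways; the `near` and `within` helpers are hypothetical.

```python
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def positions(term, words):
    """Return every position at which `term` occurs in the word list."""
    term = term.lower()
    return [i for i, w in enumerate(words) if w == term]


def near(a, b, x, text):
    """Nx: a and b occur at most x positions apart, in either order."""
    words = tokenize(text)
    return any(0 < abs(i - j) <= x
               for i in positions(a, words) for j in positions(b, words))


def within(a, b, x, text):
    """Wx: a occurs first, and b follows at most x positions later."""
    words = tokenize(text)
    return any(0 < j - i <= x
               for i in positions(a, words) for j in positions(b, words))


print(near("television", "violence", 2, "violence on television"))
print(within("Franklin", "Roosevelt", 2, "Franklin D. Roosevelt"))
print(within("Franklin", "Roosevelt", 2, "Roosevelt Franklin"))
```

The only difference between the two functions is that `near` compares the absolute distance, while `within` requires the second word to come after the first.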

4. Other operators

A number of other operators are commonly permitted by retrieval systems and can be used to refine searches carried out in the simple search mode.

The most useful of these are:

+ - ~ “”

(to be discussed in Google Searching)

5. Truncation

Certain symbols, often * or ?, may be used in some search systems as wildcards to signify one or more characters.

Their use is most frequently permitted only for truncation at the end of a search term.

e.g. sul*ur will retrieve both sulfur and sulphur, while sulph* will retrieve sulphuric, sulphurous, sulphate, sulphite, etc., but not sulfuric, sulfurous, sulfate, sulfite, etc.

Truncation can result in too many irrelevant retrievals.

e.g., the truncated term diet* will retrieve documents containing the words diet, dietary, dietetic, dietician, but also any references to diethyl compounds.
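
One way to picture truncation is to translate the wildcard into a regular expression and expand it against the words in the database. The sketch assumes that * stands for any run of letters, including none (so that diet* also retrieves diet); the vocabulary and the helper names are hypothetical.

```python
import re


def wildcard_to_regex(term):
    """Compile a search term in which * stands for any run of letters."""
    pattern = re.escape(term.lower()).replace(r"\*", "[a-z]*")
    return re.compile(pattern)


def expand(term, vocabulary):
    """Return the vocabulary words that the truncated term retrieves."""
    regex = wildcard_to_regex(term)
    return [w for w in vocabulary if regex.fullmatch(w.lower())]


# Hypothetical vocabulary of indexed words.
vocabulary = ["sulfur", "sulphur", "sulphate", "sulfate",
              "diet", "dietary", "diethyl"]

print(expand("sul*ur", vocabulary))
print(expand("sulph*", vocabulary))
print(expand("diet*", vocabulary))
```

The last query shows the problem of irrelevant retrievals: the pattern has no way of knowing that diethyl is unrelated to nutrition.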

6. Parentheses

The operators within a pair of parentheses are treated as a single unit which is processed first.

e.g. to find documents which mention cell culture or tissue culture:

cultur* AND (cell* OR tissue)

Note the use of truncation to cover variants such as cultured, culturing, cultures and cells.
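
The effect of the parentheses can be checked on a small example. The sketch below evaluates the query with and without the grouping; the documents and the `has` helper are hypothetical, and * is assumed to stand for any run of letters.

```python
import re


def has(term, text):
    """True if the text contains a word matching the (possibly truncated) term."""
    regex = re.escape(term).replace(r"\*", "[a-z]*")
    return re.search(rf"\b{regex}\b", text.lower()) is not None


# Hypothetical document titles.
docs = {
    1: "Cell culture methods",
    2: "Tissue cultures in vitro",
    3: "Tissue engineering scaffolds",
    4: "Bacterial culture media",
}

# cultur* AND (cell* OR tissue): the OR inside the parentheses is processed first.
grouped = {d for d, t in docs.items()
           if has("cultur*", t) and (has("cell*", t) or has("tissue", t))}

# (cultur* AND cell*) OR tissue: what results if the AND is processed first.
ungrouped = {d for d, t in docs.items()
             if (has("cultur*", t) and has("cell*", t)) or has("tissue", t)}

print(grouped)
print(ungrouped)
```

Without the parentheses, document 3 is retrieved even though it does not mention culture at all, which is why the grouping matters.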

Ⅵ Questions?

Email: [email protected]

(MSN) [email protected]

Tel: 85220285

Thank you!