
Enhancing Biomedical Text Rankers by Term Proximity Information

Rey-Long Liu (劉瑞瓏), Department of Medical Informatics, Tzu Chi University

2012/06/13

Outline

• Background
– Text ranking
– Biomedical information needs

• An approach to enhancing text rankers in the biomedical domain

• Evaluation

• Conclusion

Research Background

Text Ranking
• Goal
– Given a query q and a set T of texts retrieved for q, rank the texts in T by their degree of relevance to q
• Motivation
– Reducing information overload, since T is often very large even when a smart search engine is used
– Text ranking is a key issue in information retrieval, and often a “secret” component of search engines

An Example Ranker

Biomedical Information Need

• Biomedical research requires relevant evidence from the huge and ever-growing biomedical literature

• Retrieving that evidence requires a system that
– Accepts a natural language query for a biomedical information need, and
– Ranks relevant texts higher for access or processing

An Example

• Query: urinary tract infection, criteria for treatment and admission (from OHSUMED)
– A disease as the target concept (i.e., urinary tract infection)
– Two concepts about the scenario of the information need (i.e., treatment and admission)
• Neither specific to nor related to any particular disease

Contextual Completeness

• Biomedical queries need to be well-formed, and so call for a retrieval system that considers the contextual completeness of each query concept t in a text d
– Contextual completeness of t in d is the extent to which the query concepts other than t appear in areas of d near t
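To make the definition concrete, here is a minimal Python sketch. It is an illustration only, not the measure used by PRE: it assumes single-token concepts, a fixed window of tokens as the "nearby area", and a simple fraction as the "extent". The function name and the `window` parameter are hypothetical.

```python
def contextual_completeness(t, doc_tokens, query_concepts, window=5):
    """Largest fraction of the other query concepts found within
    `window` tokens of any occurrence of t (illustrative assumption)."""
    others = set(query_concepts) - {t}
    if not others:
        return 1.0  # assumption: a lone concept is trivially complete
    best = 0.0
    for k, tok in enumerate(doc_tokens):
        if tok != t:
            continue
        # Tokens in the window around this occurrence of t.
        nearby = set(doc_tokens[max(0, k - window):k + window + 1])
        best = max(best, len(others & nearby) / len(others))
    return best  # stays 0.0 if t never appears in the text
```

For the OHSUMED query above, a text in which "treatment" and "admission" both occur close to "infection" would score higher for the concept "infection" than a text in which only one of them does.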

An Example

• In children with an acute febrile illness, what is the efficacy of single medication therapy with acetaminophen or ibuprofen in reducing fever?

[From Lin & Demner-Fushman, 2006]

Answer

Strength

An Approach to Improving Rankers for Biomedical Info Needs

• An approach, PRE (Proximity-based Ranker Enhancer), that
– Measures contextual completeness of query concepts appearing in a nearby area in the text
– Serves as a supplement to improve existing rankers

Contrast with Related Work
• Biomedical text ranking
– Using synonyms and considering diversity of passages, without considering term proximity
• Text ranking
– Individual text scoring techniques (e.g., BM25) and learning-to-rank techniques (e.g., Ranking SVM), without considering term proximity
• Improving ranking by term proximity
– Term proximity is employed, but contextual completeness is not considered

System Overview

[Figure: system overview diagram. Labels: Training, Testing, Training Data, Text Ranker Development, Underlying Ranker, PRE, TF (Term Frequency) Assessment, Query (q), Text (d), TF in d, Text Ranking, Ranked Texts]

TF Assessment

• Three types of term proximity
– Overall proximity (QTermTF)
– Individual proximity (IndiP)
– Collective proximity (CollP)

• A term t may get a large TF increment in d if
– Many query terms appear frequently in d,
– Query terms are individually near t at some places, and
– Query terms collectively appear at a place near t

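• $\mathrm{RTF}(t,d,q) = \mathrm{TF}(t,d) + \mathrm{TFincrement}(t,d,q)$
• $\mathrm{TFincrement}(t,d,q) = \mathrm{QTermTF}(d,q) \times \mathrm{IndiP}(t,d,q) \times \mathrm{CollP}(t,d,q)$
• $\mathrm{QTermTF}(d,q)$ = total TF of query terms in d
• $\mathrm{IndiP}(t,d,q) = \sum_{m \in M - \{t\}} \mathrm{SigmoidWeight}(\mathrm{Mindist}(t,m)) \,/\, \mathrm{MaxIndiP}$
• $\mathrm{Mindist}(x,y)$ = shortest distance between x and y in d
• $\mathrm{SigmoidWeight}(dt) = 1 \,/\, \left(1 + e^{-((|q|-1)-dt)}\right)$
• $\mathrm{CollP}(t,d,q) = \max_{k \in K} \left\{ \sum_{m \in M - \{t\}} \mathrm{SigmoidWeight}(\mathrm{dist}(t,k,m)) \right\} / \, \mathrm{MaxCollP}$, where K is the set of positions at which t appears in d
• $\mathrm{dist}(t,k,m)$ = distance between t (at position k) and m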
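The formulas above can be sketched in Python as follows. This is a hedged illustration, not the author's implementation, and it rests on several assumptions the slide does not state: M is taken to be the set of query terms that appear in d; distances are differences between token positions; dist(t,k,m) uses the occurrence of m nearest to position k; |q| counts distinct query terms; and MaxIndiP and MaxCollP are both set to the largest attainable sum, (|q|-1) × SigmoidWeight(1). The function names are hypothetical.

```python
import math
from collections import defaultdict


def sigmoid_weight(dist, query_len):
    """SigmoidWeight(dt) = 1 / (1 + e^(-((|q|-1) - dt))), computed stably."""
    x = (query_len - 1) - dist
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so this cannot overflow
    return z / (1.0 + z)


def rtf(t, doc_tokens, query_terms):
    """Revised TF of query term t in a tokenized text d."""
    q = set(query_terms)
    n = len(q)  # assumption: |q| counts distinct query terms
    positions = defaultdict(list)
    for i, tok in enumerate(doc_tokens):
        if tok in q:
            positions[tok].append(i)
    if t not in positions:
        return 0.0  # t is absent from d (or is not a query term)
    tf = len(positions[t])
    others = [m for m in positions if m != t]  # M - {t}
    if not others:
        return float(tf)  # no other query term in d, so no increment
    qterm_tf = sum(len(p) for p in positions.values())  # QTermTF(d, q)
    # Assumption: both normalizers equal the largest attainable sum,
    # reached when every other query term is at distance 1 from t.
    norm = (n - 1) * sigmoid_weight(1, n)
    # IndiP: each other term contributes via its closest approach to t.
    indip = sum(
        sigmoid_weight(
            min(abs(k - j) for k in positions[t] for j in positions[m]), n
        )
        for m in others
    ) / norm
    # CollP: the single position of t with the best combined neighborhood.
    collp = max(
        sum(
            sigmoid_weight(min(abs(k - j) for j in positions[m]), n)
            for m in others
        )
        for k in positions[t]
    ) / norm
    return tf + qterm_tf * indip * collp
```

Under these assumptions, a text in which the query terms sit next to each other yields a noticeably larger RTF for each term than a text in which the same terms are scattered far apart, where RTF stays close to the plain TF.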

Empirical Evaluation

Experimental Data
• OHSUMED
– A popular database of biomedical queries and references
– 106 queries
– 348,566 references
– 16,140 query-reference pairs
• Definitely relevant
• Possibly relevant
• Not relevant

• TREC Genomics 2006
– 28 queries (topics) and 27,999 query-passage pairs
• Definitely relevant, possibly relevant, and not relevant
– 13,993 query-reference pairs

• TREC Genomics 2007
– 36 queries and 35,996 query-passage pairs
• Relevant and not relevant
– 22,913 query-reference pairs

Underlying Rankers

Baseline Ranker Enhancers
• Three state-of-the-art techniques that enhance text rankers by term proximity
– The t-function: t() [Tao & Zhai, 2007]
– The p-function: p() [Cummins & O’Riordan, 2009]
– The proximity language model: PLM [Zhao & Yun,

Evaluation Criteria
• Evaluating how well relevant references are ranked higher for users to access
– Mean average precision (MAP)
– Normalized discounted cumulative gain at X (NDCG@X)
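Both criteria can be computed from per-query relevance labels listed in ranked order. The sketch below is a common formulation, not necessarily the exact variant used in the evaluation. It assumes every relevant text for a query appears in its ranked list, an exponential gain of 2^rel - 1, a log2 discount, and a label mapping of 2 / 1 / 0 for definitely / possibly / not relevant.

```python
import math


def average_precision(ranked_rel):
    """AP for one query; ranked_rel holds 0/1 labels in ranked order."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_rel, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant text
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """MAP: mean of AP over the ranked lists of all queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def dcg_at(gains, x):
    """Discounted cumulative gain over the top x graded labels."""
    return sum(
        (2 ** g - 1) / math.log2(i + 1)
        for i, g in enumerate(gains[:x], start=1)
    )


def ndcg_at(gains, x):
    """NDCG@X: DCG divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg_at(sorted(gains, reverse=True), x)
    return dcg_at(gains, x) / ideal if ideal > 0 else 0.0
```

A ranking that already lists graded labels in descending order gets NDCG@X = 1, and any ranking that places a less relevant text above a more relevant one within the top X gets a lower value.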

Results

Conclusion

• Contextual completeness of query concepts in the texts is essential in ranking biomedical texts

• To measure contextual completeness, it is helpful to integrate three types of term proximity
– Overall proximity
– Individual proximity
– Collective proximity

• Existing rankers may be comprehensively enhanced

Thank You!
