Pgx user meeting_20170602
Uploaded by: nao-oec
![Page 1: Pgx user meeting_20170602](https://reader038.fdocument.pub/reader038/viewer/2022100803/5aac739b7f8b9a435e8b4b63/html5/thumbnails/1.jpg)
PageRank of public sequence data using PGX
オーイシ (Oishi)
![Page 2: Pgx user meeting_20170602](https://reader038.fdocument.pub/reader038/viewer/2022100803/5aac739b7f8b9a435e8b4b63/html5/thumbnails/2.jpg)
BioProject, SRA, and related PubMed entries
The links between BioProject/SRA records and the PubMed entries cited in them are used as edges, and the result is loaded into PGX as graph data.
![Page 3: Pgx user meeting_20170602](https://reader038.fdocument.pub/reader038/viewer/2022100803/5aac739b7f8b9a435e8b4b63/html5/thumbnails/3.jpg)
Goal: extract noteworthy sequence data from the sequence databases (or rather, test whether this can be done).
![Page 4: Pgx user meeting_20170602](https://reader038.fdocument.pub/reader038/viewer/2022100803/5aac739b7f8b9a435e8b4b63/html5/thumbnails/4.jpg)
Load graph data
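Accession ID --> PubMed ID
A simple tab-delimited text file of this form was loaded as the edge list. Ideally, properties would be added to the nodes so that vertexFilter and edgeFilter could be used to restrict the graph, for example to a range of registration dates.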
DRP000001 20398357
DRP000002 20398357
DRP000003 20400770
DRP000004 20400770
……
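The edge-list format above can be illustrated outside PGX with a minimal Python sketch. It is not the PGX loader: the function names `parse_edge_list` and `build_adjacency` are hypothetical, and the input is assumed to be one `accession<TAB>PubMed ID` pair per line, using the four rows shown on the slide.

```python
def parse_edge_list(text):
    """Parse 'source<TAB>target' lines into (source, target) pairs."""
    edges = []
    for line in text.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        edges.append((parts[0], parts[1]))
    return edges


def build_adjacency(edges):
    """Build a directed adjacency map; every node gets an entry, even sinks."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)
        adj.setdefault(dst, [])
    return adj


# The four sample rows from the slide (accession -> PubMed ID).
SAMPLE = (
    "DRP000001\t20398357\n"
    "DRP000002\t20398357\n"
    "DRP000003\t20400770\n"
    "DRP000004\t20400770\n"
)

edges = parse_edge_list(SAMPLE)
graph = build_adjacency(edges)
for node, targets in graph.items():
    print(node, "->", targets)
```

In this representation the PubMed IDs have empty target lists, because every edge points from an accession to a paper.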
![Page 5: Pgx user meeting_20170602](https://reader038.fdocument.pub/reader038/viewer/2022100803/5aac739b7f8b9a435e8b4b63/html5/thumbnails/5.jpg)
Pagerank
pgx> G = session.readGraphWithProperties('./edge_list.json')
// pagerank arguments: error tolerance 0.0001, damping factor 0.85, at most 100 iterations
pgx> analyst.pagerank(G, 0.0001, 0.85, 100)
pgx> G.queryPgql("SELECT n.id(), n.pagerank WHERE (n) ORDER BY n.pagerank DESC").print(5)
------------------------------------
| n.id()   | n.pagerank            |
====================================
| 23851394 | 2.55259517565346E-4   |
| 24158624 | 1.4196374535289173E-4 |
| 25840857 | 1.3755439638029894E-4 |
| 23383127 | 1.3020548142597758E-4 |
| 9023104  | 1.1958426451250536E-4 |
------------------------------------
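Only PubMed IDs appear in the ranking. Every edge points from an accession to a PubMed ID, so rank flows only toward the PubMed nodes. What I want is a ranking of the sequence data, so this is not quite what I was after.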
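Why the directed graph ranks only PubMed IDs can be seen in a small Python sketch. This is a textbook power-iteration PageRank, not PGX's implementation: the default arguments mirror the tolerance, damping factor, and iteration limit used above, and spreading the rank of nodes without outgoing edges evenly over all nodes is an assumption that PGX may handle differently.

```python
def pagerank(adj, tol=0.0001, damping=0.85, max_iter=100):
    """Power-iteration PageRank over a directed adjacency map."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no outgoing edges is shared by everyone.
        dangling = sum(rank[v] for v in nodes if not adj[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if adj[v]:
                share = damping * rank[v] / len(adj[v])
                for w in adj[v]:
                    new[w] += share
        diff = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if diff < tol:
            break
    return rank


# Directed edges accession -> PubMed ID, taken from the sample rows.
adj = {
    "DRP000001": ["20398357"],
    "DRP000002": ["20398357"],
    "DRP000003": ["20400770"],
    "DRP000004": ["20400770"],
    "20398357": [],
    "20400770": [],
}

rank = pagerank(adj)
top = sorted(rank, key=rank.get, reverse=True)[:2]
print(top)
```

The accession nodes have no incoming edges, so they keep only the baseline score and the two PubMed IDs occupy the top of the list.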
![Page 6: Pgx user meeting_20170602](https://reader038.fdocument.pub/reader038/viewer/2022100803/5aac739b7f8b9a435e8b4b63/html5/thumbnails/6.jpg)
Undirect & re-pagerank
pgx> G = session.readGraphWithProperties('./edge_list.json')
// convert the directed graph into an undirected one before ranking
pgx> G = G.undirect()
pgx> analyst.pagerank(G, 0.0001, 0.85, 100)
pgx> G.queryPgql("SELECT n.id(), n.pagerank WHERE (n) ORDER BY n.pagerank DESC").print(5)
---------------------------------------
| n.id()      | n.pagerank            |
=======================================
| PRJNA33175  | 0.022764400568100977  |
| PRJNA168    | 0.0045585428625617855 |
| PRJNA178030 | 0.00448498012968195   |
| PRJNA313047 | 0.00392096260883999   |
| PRJNA177353 | 0.0019398995283837262 |
---------------------------------------
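After calling undirect() to make the graph undirected and recomputing, the nodes on both ends of each loaded edge were ranked (the top PageRank scores are concentrated on the sequence-data side, so no PubMed entries appear in this listing).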
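The effect of undirecting can be reproduced on a toy graph in Python. This is only a sketch under stated assumptions: `undirect` here is a hypothetical helper that adds a reverse edge for every edge (it is not PGX's `undirect()`), the PageRank is the generic power-iteration form, and the IDs `PRJ_A`, `PRJ_B`, and `PMID_1` to `PMID_3` are made up for illustration.

```python
def undirect(adj):
    """Return a copy in which every edge u -> v also has v -> u."""
    out = {v: set(nbrs) for v, nbrs in adj.items()}
    for u, nbrs in adj.items():
        for v in nbrs:
            out.setdefault(v, set()).add(u)
    return {v: sorted(nbrs) for v, nbrs in out.items()}


def pagerank(adj, tol=0.0001, damping=0.85, max_iter=100):
    """Generic power-iteration PageRank (dangling rank spread evenly)."""
    n = len(adj)
    rank = dict.fromkeys(adj, 1.0 / n)
    for _ in range(max_iter):
        dangling = sum(rank[v] for v in adj if not adj[v])
        new = dict.fromkeys(adj, (1.0 - damping) / n + damping * dangling / n)
        for v, nbrs in adj.items():
            for w in nbrs:
                new[w] += damping * rank[v] / len(nbrs)
        converged = sum(abs(new[v] - rank[v]) for v in adj) < tol
        rank = new
        if converged:
            break
    return rank


# Hypothetical toy data: PRJ_A is cited by three papers, PRJ_B by one.
directed = {
    "PRJ_A": ["PMID_1", "PMID_2", "PMID_3"],
    "PRJ_B": ["PMID_3"],
    "PMID_1": [],
    "PMID_2": [],
    "PMID_3": [],
}
undirected = undirect(directed)

rank_directed = pagerank(directed)
rank_undirected = pagerank(undirected)
top_directed = max(rank_directed, key=rank_directed.get)
top_undirected = max(rank_undirected, key=rank_undirected.get)
print("directed top:", top_directed)
print("undirected top:", top_undirected)
```

With reverse edges in place, rank can flow back from papers to the records they cite, so a record linked to many papers can rise above any single paper.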
![Page 7: Pgx user meeting_20170602](https://reader038.fdocument.pub/reader038/viewer/2022100803/5aac739b7f8b9a435e8b4b63/html5/thumbnails/7.jpg)
Top3 Nodes
Homo sapiens (human)
RefSeq annotation of the human reference genome assembly
https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA168
Homo sapiens (human)
RefSeq annotation of the human haploid hydatidiform mole cell line genome assembly
https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA168
Bacterial 16S rRNA
Bacterial 16S Ribosomal RNA RefSeq Targeted Loci Project
https://www.ncbi.nlm.nih.gov/bioproject/33175
![Page 8: Pgx user meeting_20170602](https://reader038.fdocument.pub/reader038/viewer/2022100803/5aac739b7f8b9a435e8b4b63/html5/thumbnails/8.jpg)
Top3 Nodes (PubMed entry)
Nature. 2013 Jul 25;499(7459):431-7. doi: 10.1038/nature12352. Epub 2013 Jul 14.
Insights into the phylogeny and coding potential of microbial dark matter.
https://www.ncbi.nlm.nih.gov/pubmed/?term=23851394
Nature. 2013 Jul 25;499(7459):431-7. doi: 10.1038/nature12352. Epub 2013 Jul 14.
Insights into the phylogeny and coding potential of microbial dark matter.
https://www.ncbi.nlm.nih.gov/pubmed/?term=23851394
Genome Res. 2015 May;25(5):762-74. doi: 10.1101/gr.185538.114. Epub 2015 Apr 3.
The 100-genomes strains, an S. cerevisiae resource that illuminates its natural phenotypic and genotypic variation and emergence as an opportunistic pathogen.
https://www.ncbi.nlm.nih.gov/pubmed/?term=25840857
Note: for reference, these are the papers at the top of the PageRank ranking in which only PubMed entries appeared.