Development and Deployment of the New Research Infrastructure for Open Science in Japan
Kazu YAMAJI Professor and Director of Research Center for Open Science and Data Platform
National Institute of Informatics, Japan
TACC Workshop 26th September 2019 – 13:30
NII is the Japanese NREN (National Research and Education Network)
[Figure: SINET network map with international links to the US, Asia, and Europe. Legend: domestic line (100 Gbps or more); international line (100 Gbps); international line (10 Gbps).]
Organization type                  Number of organizations (SINET coverage)
National Universities              86 (100%)
Municipal Universities             71 (78%)
Private Universities               348 (55%)
Junior Colleges                    62 (18%)
Colleges of Technology             55 (97%)
Inter-Univ. Research Institutes    16 (100%)
Labs and Others                    179
Total                              817
(As of March 2015)
[Figure labels: SINET nodes, including Sapporo, Tokyo, Osaka, and Fukuoka.]
• SINET is a Japanese academic backbone network for more than 800 universities and research institutions, and for about 3 million users.
• SINET covers 100% of national, 78% of municipal, and 55% of private universities.
21st Century Academic Information Infrastructure for Advancing Open Science
Collaboration and Promotion in Research and Education
Components: SINET5, GakuNin Federation, GakuNin-Cloud Direct Connection VPN, Flow Analysis
• Network: Nationwide 100-Gbps backbone network and scalable network expansion; high-speed direct international lines to the USA, Europe, and Asia; introduction of new technologies such as SDN in response to user needs
• Federation: Collaborative enhancement of authentication between universities
• Cloud: Dramatic cost reduction and enhancement of the research and education environment through tailored cloud services
• Security: Network flow analysis and dynamic control; raising the security level for SINET users
• Resource: Promotion of academic information circulation and open access; collaborative promotion of institutional repository expansion
Recent Trends in E-Infrastructure
European Open Science Cloud (EOSC)
• Past: Each institution or project had its own e-infrastructure
• Future: Integrate existing e-infrastructures and make them available across the EU
  • Integrate services at different layers, from network to domain DB
  • Make them visible by developing a portal (EOSC-hub) and discovery (OpenAIRE)
  • Consider supporting long-tail domains such as the humanities and social sciences
  • Consider collaborating with small and medium enterprises
[Figure: EOSC service architecture. Labels shown:]
• Federation Services: AAI, Accounting, Monitoring,
• Basic Infrastructure: Compute and Storage
• Open Collaboration
• Platforms: Application Repository, Configuration Management, Marketplace
• Common services
• Community Support services
• Added Value Services: Compute, Data, Software Management and Preservation
• Thematic Services:
  1. CLARIN (language resources)
  2. DODAS-CMS (high energy physics)
  3. ESAS-ENES (climate analytics)
  4. GEOSS (earth observation)
  5. OpenCoastS (coastal circulation forecast)
  6. WeNMR (structural biology)
  7. EP pillar (earth observation)
  8. DARIAH (digital humanities)
  9. LifeWatch (biodiversity)
Common PF for RDM and Sharing
• Layers shown: Discovery, Portal, Domain Services, Authn/Authz, Compute and Storage, Network
Australia
• Australian Research Data Commons (ARDC)
• ARDC first appears in the 2016 roadmap
• Service architecture is almost the same as that of EOSC
Canada
• Collaborative work between the NREN, HPC, and library communities
Africa
• Collaborative work between the NREN and library communities
What Can Be Seen from Recent Movements
• De facto service layer stack
  • Network
  • Identity and Access Federation
  • Virtual Organization Platform
  • Cloud Computing / HPC
  • Common Services and Tools
  • Domain-Centric Services
  • Common Discovery Service
• Integration for more useful infrastructure
  • Service integration: between services and domains
  • Organizational cooperation: business plan and budget
• Integrating existing e-infrastructures improves UX and cost-effectiveness, but is hard for a single institution to develop
• Cooperation needs to be facilitated at the national and regional level
Open Science E-Infra in Japan: NII Research Data Cloud
Each Domain using NII RDC
NII Research Data Cloud
[Figure: three platforms connecting the Data Depositor and the Data User. Legend: user flow, data flow.]
• Research Data Management System (RDM Platform)
  ● High-speed access using SINET5
  ● Data sharing function using virtual NW and ID federation
  ● Effective data storage switcher
  • Labels: Research Data Mng User Interface, Access Control, Metadata Mng, Institutional Research Data Mng, Hot Storage, Cold Storage, Exp Data
• Publication Platform (Research Data Repository)
  ● Data-oriented self-archiving function
  ● Versioning and auto-packaging function
  ● User-dependent personal data pseudonym function
  • Labels: DOI, Subject Repository, Journal Article, Supplemental Data, Storage Area for Long-term Preservation
• Discovery Platform (Discovery Service)
  ● Linking function between article and data
  ● Researcher and research project identification and management function
  ● Data exchange with international discovery services
  • Labels: Metadata Management, Metadata Aggregation, International Metadata Aggregator, Article
• Data states: Private, Shared, Public
• Flow steps: Exp/Store, Archive, Search/Find, Re-use
Discovery Service CiNii
[Figure: Articles and Searches, 2005 to 2016. Series: Fulltext (internal), Meta Search, Search (thousand), Articles (thousand).]
• 2007.4 Indexed by
• 2009.4 Drastic UI Renewal
• 2016 monthly average: full text DL 4.52M, detail view 10.9M, search 4.93M
CiNii Knowledge Base
CONTENT TYPE     COUNT         RESOURCE
Article          37,376,419    CiNii Articles, J-STAGE, Repository…
Book             11,801,960    NACSIS-CAT
Dissertation     634,467       NDL Online, Repository…
Project          848,051       KAKEN
Research Data    12,568,782    DataCite, Japan Link Center
Researcher       2,874,337     KAKEN, NACSIS-CAT
Total            66,104,016
Links            10,813,948    CiNii Articles, KAKEN
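As a quick consistency check on the table, the six content-type counts can be summed and compared with the Total row. The sketch below assumes the Links row is counted separately from the Total; the variable names are illustrative.

```python
# Counts copied from the CiNii Knowledge Base table (Links row left out).
counts = {
    "Article": 37_376_419,
    "Book": 11_801_960,
    "Dissertation": 634_467,
    "Project": 848_051,
    "Research Data": 12_568_782,
    "Researcher": 2_874_337,
}

total = sum(counts.values())

# Share of each content type in the knowledge base, in percent.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(f"{total:,}")
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The sum matches the stated Total of 66,104,016, which supports reading Links as a separate count of relations rather than as a content type.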
• Biggest knowledge graph in Japan
• CiNii KB could be used for institutional research
Entities and their identifiers:
• Conventional research output (book, paper, dissertation, journal article): DOI, Handle, URI, ISBN, ISSN...
• Research project: Project ID
• Funding agency: Crossref Funder, GRID, ISNI...
• Research institution: Institutional ID, GRID, ISNI...
• Researcher: ID, ORCID...
• Research data: DOI, URI...
• Research activity
[Figure label: Coverage of Current CiNii]
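The idea of a knowledge graph that links articles, data, projects, and researchers through their identifiers can be sketched as a small data structure. This is a minimal illustration only, not the CiNii implementation: the class names, relation names, and every identifier value below are made-up placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    kind: str    # "article", "dataset", "project", "researcher", ...
    scheme: str  # identifier scheme, e.g. "DOI", "ORCID", "ProjectID"
    value: str   # identifier value (placeholders only in this sketch)


class KnowledgeGraph:
    """Undirected graph whose nodes are identified research entities."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, a, b, relation):
        # Store the edge in both directions so either end can be a start point.
        self._edges[a].add((relation, b))
        self._edges[b].add((relation, a))

    def neighbors(self, node, kind=None):
        # Entities directly linked to `node`, optionally filtered by kind.
        found = {
            n for _, n in self._edges.get(node, set())
            if kind is None or n.kind == kind
        }
        return sorted(found, key=lambda n: (n.kind, n.value))


kg = KnowledgeGraph()
article = Node("article", "DOI", "10.9999/example.article")
dataset = Node("dataset", "DOI", "10.9999/example.data")
project = Node("project", "ProjectID", "PRJ-0001")
researcher = Node("researcher", "ORCID", "0000-0000-0000-0000")

kg.link(article, dataset, "has_supplement")
kg.link(article, project, "output_of")
kg.link(researcher, project, "member_of")
kg.link(researcher, article, "author_of")

print([n.value for n in kg.neighbors(project)])
```

With links of this kind, a question typical of institutional research, such as which outputs and people belong to a project, becomes a neighbor lookup.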
JAIRO Cloud
• Background
  • Limited resources and limited technical knowledge hamper the implementation of IRs, especially in small universities.
  • JAIRO Cloud has provided a shared instance of the IR system on a virtual server hosted by NII since April 2012.
• Service Architecture
Number of Institutional Repositories in Japan
[Figure: stacked bar chart of the number of IRs per year, reaching 829 IRs. Series: by JAIRO Cloud (pilot operation); by JAIRO Cloud (production operation); by university on-premise system.]
• Current system: WEKO2
  • Journal article repository
  • Functions have been added incrementally
• New system: WEKO3
  • Based on Invenio3, which was originally designed as a data repository
  • Integrates WEKO2 functions into Invenio3
WEKO3 Data Repository
• Strengthen conventional functions
• Effective development and operation
• Realize a new publication platform based on the sophisticated Invenio3 architecture (Invenio3 = our RDM Platform in architecture)
• Domain use cases through extensibility
Research Data: ✖ (WEKO2), 〇 (WEKO3)
International Collaboration with CERN
• Article Repository (current role): preparing for migration from the current system to the new WEKO3
• Data Repository (new role)
Cloud-Type Data Repository
• Multi-tenancy: enables operating different data repositories within a single institutional repository service
• Workflow: enables defining the different workflow sets required by different operational models
• Goals: sustainability and flexibility
• Data repositories are required at different levels: Laboratory < Project < Institution
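The combination of multi-tenancy and per-tenant workflows can be pictured with a small sketch: one service object hosts several tenants, and each tenant carries its own ordered list of workflow steps. All names here (Tenant, RepositoryService, the step names) are hypothetical and are not taken from WEKO3 or Invenio.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tenant:
    name: str            # a laboratory, project, or institution
    workflow: List[str]  # ordered step names (illustrative only)


class RepositoryService:
    """One shared service hosting several tenant repositories."""

    def __init__(self):
        self._tenants = {}
        self._next_step = {}  # (tenant name, item id) -> index of next step

    def add_tenant(self, tenant):
        self._tenants[tenant.name] = tenant

    def submit(self, tenant_name, item_id):
        self._next_step[(tenant_name, item_id)] = 0

    def advance(self, tenant_name, item_id):
        # Perform the next step of this tenant's own workflow and return its name.
        steps = self._tenants[tenant_name].workflow
        key = (tenant_name, item_id)
        index = self._next_step[key]
        if index >= len(steps):
            raise ValueError("workflow already complete")
        self._next_step[key] = index + 1
        return steps[index]

    def is_complete(self, tenant_name, item_id):
        steps = self._tenants[tenant_name].workflow
        return self._next_step[(tenant_name, item_id)] == len(steps)


service = RepositoryService()
service.add_tenant(Tenant("lab-a", ["deposit", "publish"]))
service.add_tenant(
    Tenant("institution-x", ["deposit", "curator_review", "approve", "publish"])
)

# The same item goes through two steps in each tenant.
for tenant_name in ("lab-a", "institution-x"):
    service.submit(tenant_name, "dataset-1")
    service.advance(tenant_name, "dataset-1")
    service.advance(tenant_name, "dataset-1")
```

After two steps the laboratory item is finished while the institutional item is still waiting for approval, which is the kind of difference between operational models the slide refers to.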
International Collaboration with WACREN
International Development of a New Invenio Flavor
New Service
• Manage research data by research project
• Share research data among collaborators, with authentication by ID federation
• Connect cloud storage through various plugins
RDM Platform
• NII: frontend service
• University: backend storage
  • NII storage: default (minimum?) storage provided by NII
  • Public cloud (provider DC)
  • Private cloud (on-premise)
• Selectable plugins can be customized depending on the university's environment and policy
• Extension of the Open Science Framework developed by COS, USA
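The split between a shared frontend and institution-selected backend storage can be sketched as a plugin registry with a per-institution allow list. This is an assumption-laden illustration, not the GakuNin RDM or Open Science Framework code: the class names, method names, and plugin names are invented, and the storage backends are in-memory stand-ins.

```python
from abc import ABC, abstractmethod


class StoragePlugin(ABC):
    """Interface every backend storage plugin is assumed to offer."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def put(self, path, data): ...

    @abstractmethod
    def get(self, path): ...


class InMemoryStorage(StoragePlugin):
    # Stand-in for a real backend such as an on-premise or public cloud store.
    def __init__(self, name):
        super().__init__(name)
        self._blobs = {}

    def put(self, path, data):
        self._blobs[path] = data

    def get(self, path):
        return self._blobs[path]


class Institution:
    def __init__(self, name, allowed_plugins):
        self.name = name
        self.allowed_plugins = set(allowed_plugins)


class RdmFrontend:
    def __init__(self, default_storage):
        self._default = default_storage.name
        self._plugins = {default_storage.name: default_storage}

    def register(self, plugin):
        self._plugins[plugin.name] = plugin

    def storage_for(self, institution, requested=None):
        # The default storage is always available; others need institutional approval.
        name = requested or self._default
        if name != self._default and name not in institution.allowed_plugins:
            raise PermissionError(f"{name} is not enabled for {institution.name}")
        return self._plugins[name]


frontend = RdmFrontend(InMemoryStorage("nii_default"))
frontend.register(InMemoryStorage("s3_compatible"))
frontend.register(InMemoryStorage("owncloud"))

university = Institution("Example University", allowed_plugins=["s3_compatible"])
store = frontend.storage_for(university, "s3_compatible")
store.put("project-1/data.csv", b"a,b\n1,2\n")
```

Keeping the policy check in the frontend means a university can change its allowed backends without touching project data or the plugins themselves.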
New Functions Developed in FY2018 and FY2019
• New Plugin
  • New external storage: ownCloud, S3-compatible storage, OpenStack Swift
  • Integration with Publication Platform
  • Integration with Data Analysis Tool: JupyterHub
  • Plugin SDK
• Research Data Management
  • Research Footprint Management
  • Metadata Management
  • Workflow Management
• Institutional Management
  • Plugin Selection
  • Statistics
  • Institutional Template
Publication from Repository
Publication Platform
• DOI registration
• Embargo control
• Metadata validation, etc.
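Two of the publication checks named on this slide, embargo control and metadata validation, can be illustrated with a short sketch. The required-field list and the record layout are assumptions made for the example and are not the platform's actual schema; DOI registration is left out because it depends on an external registration service.

```python
from datetime import date

# Assumed minimal set of required fields; the real schema is not given here.
REQUIRED_FIELDS = ("title", "creators", "publication_year")


def validate_metadata(record):
    """Return the names of required fields that are missing or empty."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def is_under_embargo(record, today):
    """A record is embargoed until the day stored in 'embargo_until'."""
    end = record.get("embargo_until")
    return end is not None and today < end


def can_publish(record, today):
    # Publish only when the metadata is complete and no embargo applies.
    return not validate_metadata(record) and not is_under_embargo(record, today)


record = {
    "title": "Example dataset",
    "creators": ["Example Researcher"],
    "publication_year": 2020,
    "embargo_until": date(2020, 4, 1),
}

print(can_publish(record, date(2020, 3, 31)))
print(can_publish(record, date(2020, 4, 1)))
```

In this sketch the embargo date is the first day on which the record may become public; a platform could equally treat it as the last closed day, so the comparison is a design choice.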
Integration with Data Analysis Tool
• GakuNin RDM add-on for the data analysis tool JupyterHub
• Easy data transfer between GakuNin RDM and JupyterHub
• GakuNin ID Federation allows users single sign-on between the systems
[Figure: GakuNin RDM (storage, repository) connected to JupyterHub (programming, execution)]
(Implemented in December 2018)
Research Footprint Management
[Figure: project logs and institutional logs receive a time stamp from a commercial Time Stamping Authority. Example time stamp: 2007.11.8 10:05:32. The admin view is shown for National Institute of Informatics [Test].]
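One common way to make logs suitable for time stamping is to chain a digest over the entries and submit only the latest digest to the Time Stamping Authority. The sketch below shows that general technique under stated assumptions; it is not confirmed to be how GakuNin RDM records research footprints, the log entry fields are invented, and no request to a real authority is made.

```python
import hashlib
import json

GENESIS = "0" * 64  # starting value for the chain (arbitrary convention)


def entry_digest(entry, previous_digest):
    # Serialize deterministically so the same entry always hashes the same way.
    payload = json.dumps(entry, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(previous_digest.encode("ascii") + payload).hexdigest()


def chain(entries):
    """Return one digest per entry; each digest covers all earlier entries."""
    digests = []
    previous = GENESIS
    for entry in entries:
        previous = entry_digest(entry, previous)
        digests.append(previous)
    return digests


def verify(entries, digests):
    return chain(entries) == digests


project_log = [
    {"user": "researcher-1", "action": "upload", "file": "raw/run1.csv"},
    {"user": "researcher-2", "action": "edit", "file": "analysis.ipynb"},
]
digests = chain(project_log)

# Only this last digest would need to be sent for time stamping.
latest = digests[-1]

tampered = [dict(project_log[0], file="raw/run2.csv"), project_log[1]]
print(verify(project_log, digests), verify(tampered, digests))
```

Because each digest depends on every earlier entry, a time stamp on the latest digest also fixes the whole history before it.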
Institutional Management Function
• Select institutionally available storage
• Select authorized external services
• Download institutional logs
Experimental Plan with Universities and Research Institutions
• α Testing #1: March 2017
  Objective: Obtain feedback from IT centers in large-scale institutions
  Participants: Hokkaido University, Tohoku University, Kyoto University, Osaka University, Kyushu University, Nagoya Institute of Technology, National Institute for Environmental Studies
• α Testing #2: October 2017
  Objective: Obtain feedback from laboratory use cases
  Participants: The University of Tokyo, Nagoya University, Tsukuba University, Keio University, Aizu University, Fukushima Medical University, RIKEN, JAXA
• β Testing #1: June 2018
  Objective: Middle-scale experiment adopting the new functions developed in 2017
• Long Run Study: from April 2019
  Objective: Obtain feedback from institutional and domain-specific use cases
Internal and External Collaboration
• NII
  • Research Center for Open Science and Data Platform
  • R&D Center for Academic Networks
  • Academic Authentication Systems Office
  • Center for Cloud R&D
• Topics of internal collaboration
  • Secure NW, service deployment
  • Storage procurement, data analysis infra.
  • ID federation, VO platform
• External
  • AXIES Research Data Management WG
  • JPCOAR Research Data TF
  • University IT Center (system requirement, operation policy)
  • University Library (RDM training)
  • International
Future Work
• FY2019, through Q2
  • RDM: collaboration functionality, operational system
  • Repository: migration tools
  • Discovery: KB refinement, several different algorithms
• FY2019, Q3 onward
• FY2020: production-level operation
Items shown on the roadmap:
• Develop an effective operational system
• Extend data sources from domain DBs
• Case study according to the research data life cycle
• Migration test in JAIRO Cloud
• Feasibility study; obtain case studies
Relationship between Research Data Infrastructure and Research Workflow
Research workflow:
• Project start (application): member management, initial setting
• Experiment: data acquisition
• Data management, data analysis
• Paper writing: deposit with supplemental data
• Aggregation
Platforms shown: RDM Platform, Publication Platform (institutional or domain repository), Discovery Platform