Analyzing Career Paths with College Miner - Matthew Harris @ GraphConnect NY 2013

Post on 15-Jan-2015

description

College Miner currently uses Neo4j to store and analyze resumes. With Neo we are able to look at career paths, timelines, and relationships of a particular resume.

Transcript of Analyzing Career Paths with College Miner - Matthew Harris @ GraphConnect NY 2013

Career Path Analysis with Neo4j

Presentation by:

Matthew S. Harris
Co-Founder, Technology & Infrastructure
matthew.harris@collegeminer.com
matthew.harris@patheer.com
Twitter @harrisminer

30 Beach St #2
Quincy, MA 02170
www.collegeminer.com
www.patheer.com

About Me

Matthew Harris

• 10+ years database architecture and application development

• Masters Degree in Business Intelligence and Data Mining

• Experience in other startups

• Research at Boston University

Co-Founder, Technology & Infrastructure

Past Experience

Introduction

• Founded in 2011
• Located in Boston, MA
• Original Premise - Do students get jobs related to their major?
• What can I do with my major?

Build data analytics tools focused on analyzing career outcomes and paths.

New Tool

• Private Beta Launched in May 2013

• Public Beta Launch on November 8, 2013 in Boston, MA

• www.patheer.com

Live your passion, discover your path!

Holistic career analysis, planning, and recommendation tool.
1. Don’t get weeded out
2. Avoid painful job searches
3. Discover and plan the path to your dream career

Goals & Focus

Patheer Hierarchy

User Data

Resume

User Activity

Market Data

Job Postings

Resumes

Extraction & Parsing

20gb/day -city

Data Processing

Data Relay

Data Stores: MongoDB, Neo4j, Precog, MS SQL

Analysis Engine

Application

User Capabilities

Resume Analysis
• Understand how parsers work
• Analyze how complete your resume is according to the parser

Job Matches
• Get jobs that match your background
• Analyze why you don’t qualify for a particular job

Research
• Jobs
• Companies
• Schools
• Cities

Career Path Analysis
• View and analyze your career path
• Analyze what others did to reach your career goal
• Get recommendations on how to reach your career goal

Problem & Solution

• Not transactional
• Somewhat relational
• Unstructured/Semi-structured data
• Direct and indirect connections
• Real-time and batch
• Flexible/Partial schema

How to store and analyze this data?

• 3 instances
• Relationships
• Paths
• Weighted Paths

• Neo4jClient (C# Library)
• Shout out to Tatham Oddie!

• Customized data processing
• Mostly depth-first analysis

Relationships

Career Path Analysis

1. View and analyze your career path
2. Analyze what others did to reach your career goal
3. Get recommendations on how to reach your career goal

User Career Path

// Legacy Cypher (Neo4j 1.x): "!" makes the comparison false when UserID is missing
START person1=node(*)
MATCH m = person1-[p:PATH]->x
WHERE p.UserID! = {userid}
RETURN p ORDER BY p.Date ASC;

What does my career path look like?

User 1 → Bachelors Degree → Database Analyst → Database Admin
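The query above keeps only one user's PATH relationships and orders them by date. The same idea can be sketched in memory as follows. This is an illustration only: the `path_edges` records, their dates, and the `career_path` helper are hypothetical stand-ins, with field names mirroring the `UserID` and `Date` properties in the query.

```python
from datetime import date

# Hypothetical stand-in for PATH relationships; dates are invented for illustration.
path_edges = [
    {"user_id": 1, "date": date(2008, 9, 1), "source": "Database Analyst", "target": "Database Admin"},
    {"user_id": 2, "date": date(2005, 1, 1), "source": "User 2", "target": "Bachelors Degree"},
    {"user_id": 1, "date": date(2002, 6, 1), "source": "User 1", "target": "Bachelors Degree"},
    {"user_id": 1, "date": date(2006, 6, 1), "source": "Bachelors Degree", "target": "Database Analyst"},
    {"source": "Unowned", "target": "Edge"},  # no user_id: never matches, like "UserID!"
]


def career_path(edges, user_id):
    """Return the node names of one user's path, ordered by edge date."""
    # Keep only edges owned by this user; edges without a user_id are skipped.
    own = [e for e in edges if e.get("user_id") == user_id]
    if not own:
        return []
    own.sort(key=lambda e: e["date"])
    # First node is the source of the earliest edge, then each target in order.
    return [own[0]["source"]] + [e["target"] for e in own]


print(" -> ".join(career_path(path_edges, 1)))
```

Filtering before sorting means edges that lack a date or owner never reach the sort step.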

User Career Path

Add your career goal

User 1 → Bachelors Degree → Database Analyst → Database Admin → ? → Database Architect

Career Path Analysis

1. View and analyze your career path
2. Analyze what others did to reach your career goal
3. Get recommendations on how to reach your career goal

Career Pathing with Neo4j

[Diagram: separate career paths for User X, User Y, and User Z. Nodes shown: Bachelors Degree, Masters Degree, Data Analyst, Database Analyst, Database Developer, Database Admin, Database Architect.]

Career Pathing with Neo4j

[Diagram: User X, User Y, and User Z connected to a shared set of nodes: Bachelors Degree, Masters Degree, Data Analyst, Database Analyst, Database Developer, Database Admin, Database Architect.]

Career Pathing with Neo4j

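[Diagram: a single Users node connected to the shared nodes (Bachelors Degree, Masters Degree, Data Analyst, Database Analyst, Database Developer, Database Admin, Database Architect), with each edge labeled with a numeric weight (mostly 1, one edge 2).]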
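The progression across these three slides, from per-user paths to one shared graph with weighted edges, can be sketched as a simple count of how many users traverse each step. The three paths below are hypothetical (the individual paths on the slides are not recoverable from this transcript), and `merge_paths` is an illustrative helper, not the speaker's implementation.

```python
from collections import Counter

# Hypothetical paths built from node names that appear on the slides.
user_paths = {
    "User X": ["Bachelors Degree", "Data Analyst", "Database Admin", "Database Architect"],
    "User Y": ["Bachelors Degree", "Database Analyst", "Database Admin", "Database Architect"],
    "User Z": ["Bachelors Degree", "Masters Degree", "Database Architect"],
}


def merge_paths(paths):
    """Collapse per-user paths into shared edges weighted by traversal count."""
    weights = Counter()
    for steps in paths.values():
        # Each consecutive pair of nodes is one traversed edge.
        for src, dst in zip(steps, steps[1:]):
            weights[(src, dst)] += 1
    return weights


for (src, dst), count in merge_paths(user_paths).items():
    print(f"{src} -> {dst}: {count}")
```

With this sample data, the only edge traversed by more than one user is Database Admin to Database Architect.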

Analysis

• Not an easy task!
• Variable path lengths with unique traversals
• Can’t simply do allPaths or x-[p:PATH*]->y

• Unique identifiers
• Where x.pathnumber + 1 = y.pathnumber

• A* and Dijkstra
• Only least cost/cheapest path
• Need most cost (most traversed)

• Customized Solution/Query
• Batch process nightly for all end nodes
• Calculate sum of path weights (still testing optimal solution)
• Store top 3 results in Precog (backend)
• Application queries Precog

What are the top 3 traversals for each job group?
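The batch step described on this slide (for each end node, sum the path weights and keep the top 3) might look roughly like the sketch below. It is an assumption-laden illustration: the edge weights are invented traversal counts, `top_paths` is a hypothetical helper, and scoring by the plain sum of weights is only one option, which the slide itself marks as still being tested.

```python
import heapq

# Invented traversal counts on shared edges, keyed by (source, target).
weighted_edges = {
    ("Bachelors Degree", "Data Analyst"): 1,
    ("Data Analyst", "Database Admin"): 1,
    ("Bachelors Degree", "Database Analyst"): 1,
    ("Database Analyst", "Database Admin"): 1,
    ("Database Admin", "Database Architect"): 2,
    ("Bachelors Degree", "Masters Degree"): 1,
    ("Masters Degree", "Database Architect"): 1,
}


def top_paths(edges, end, k=3):
    """Depth-first walk backwards from `end`; return the k heaviest paths."""
    incoming = {}
    for (src, dst), weight in edges.items():
        incoming.setdefault(dst, []).append((src, weight))
    results = []

    def walk(node, path, total):
        # Skip predecessors already on the path so cycles cannot loop forever.
        preds = [(s, w) for s, w in incoming.get(node, []) if s not in path]
        if not preds:
            if len(path) > 1:
                results.append((total, path))
            return
        for s, w in preds:
            walk(s, [s] + path, total + w)

    walk(end, [end], 0)
    return heapq.nlargest(k, results)


for total, path in top_paths(weighted_edges, "Database Architect"):
    print(total, " -> ".join(path))
```

Unlike a shortest-path search, this enumerates every path into the end node and ranks by the largest total. A plain sum favors longer paths, so a real scoring function may need to normalize by path length.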

Career Path Analysis

1. View and analyze your career path
2. Analyze what others did to reach your career goal
3. Get recommendations on how to reach your career goal

User Career Path Recommendations

Recommendations based off of:
• Current Position -[p:path*1..?]-> Career Goal
• User background (from resume)
• Real-time market data
• User relationships and connections

[Diagram: User 1 → Bachelors Degree → Database Analyst → Database Admin → Database Architect, with recommendations numbered 1, 2, and 3.]
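The first input to the recommendations, a variable-length path from the current position to the career goal, can be sketched as a bounded search. The slide leaves the upper bound open ("1..?"), so `max_hops` here is an assumed parameter, and the edges, weights, and `paths_to_goal` helper are hypothetical. The other inputs (resume background, market data, connections) are not modeled.

```python
# Invented traversal counts on shared edges, keyed by (source, target).
edges = {
    ("Database Analyst", "Database Admin"): 2,
    ("Database Admin", "Database Architect"): 3,
    ("Database Admin", "Masters Degree"): 1,
    ("Masters Degree", "Database Architect"): 1,
}


def paths_to_goal(edges, current, goal, max_hops):
    """Find paths of 1..max_hops edges from current to goal, heaviest first."""
    outgoing = {}
    for (src, dst), weight in edges.items():
        outgoing.setdefault(src, []).append((dst, weight))
    found = []
    stack = [([current], 0)]
    while stack:
        path, total = stack.pop()
        if path[-1] == goal and len(path) > 1:
            found.append((total, path))
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop limit reached without hitting the goal
        for dst, weight in outgoing.get(path[-1], []):
            if dst not in path:  # never revisit a node on the same path
                stack.append((path + [dst], total + weight))
    return sorted(found, reverse=True)


for total, path in paths_to_goal(edges, "Database Admin", "Database Architect", max_hops=2):
    print(total, " -> ".join(path))
```

Raising `max_hops` widens the set of candidate routes at the cost of more traversal work, which is one reason to fix a bound rather than leave it open.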

Timeline

Beta Launch!
• November 8, 2013
• Greater Boston Area
• All Job Categories and Industries

City Expansion
• Spring 2014
• All Job Categories and Industries

New Features (Coming Soon!)
• Research College Majors
• Career Path Explorer

Career Path Explorer Teaser

1. How do I become a…?
• Start at End Node and work backwards
• ? → Database Architect

2. What can I do with my degree?
• Start at Start Node and work forward
• Bachelors Degree → ?

3. Advanced Search
• Select Start and End Nodes
• Bachelors Degree → ? → Database Architect
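The first two explorer modes differ only in direction: "How do I become a…?" walks edges backwards from the end node, while "What can I do with my degree?" walks them forward from the start node. A minimal sketch, assuming a hypothetical edge list and a `reachable` helper that are not part of the talk:

```python
from collections import deque

# Hypothetical directed edges between nodes named on the slides.
edges = [
    ("Bachelors Degree", "Database Analyst"),
    ("Database Analyst", "Database Admin"),
    ("Database Admin", "Database Architect"),
    ("Bachelors Degree", "Masters Degree"),
    ("Masters Degree", "Database Architect"),
]


def reachable(edges, start, reverse=False):
    """Breadth-first search; reverse=True follows edges backwards."""
    neighbors = {}
    for src, dst in edges:
        a, b = (dst, src) if reverse else (src, dst)
        neighbors.setdefault(a, []).append(b)
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# "How do I become a Database Architect?": work backwards from the end node.
print(reachable(edges, "Database Architect", reverse=True))
# "What can I do with my degree?": work forward from the start node.
print(reachable(edges, "Bachelors Degree"))
```

The advanced search with both start and end nodes selected could then be read as the intersection of the two result sets, plus the bounded path search between the chosen endpoints.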

THANK YOU!!
