SQL gene in NoSQL
-
Upload
cihan-b -
Category
Engineering
-
view
218 -
download
2
Transcript of SQL gene in NoSQL
SQL Gene in NoSQL
How SQL Extensions are making flexible data structures Query-able
Cihan [email protected] - @cihangirb
NoSQL Now – Aug 2015 – San Jose
Agenda Nature of Data
Relational vs NoSQL Database Approach Extending SQL for JSON
1. Ingest JSON and Return JSON2. Accessing JSON Hierarchies – the dot notation3. Bring full syntax – including JOINs and Sub-Queries4. Handle missing values – MISSING , UNSET5. Fold & Unfold JSON Hierarchies6. Native Set/Collection Handling7. Adapt Indexing to JSON
Nature of Real-World Data
Rich structure Attributes, Sub-structure
Relationships To other data
Value evolution Data is updated
Structure evolution Data is reshaped
Models for Real-World Data
Data Concern Tabular Model Document Model
Rich Structure Multiple flat tables Constant assembly /
disassembly Documents No assembly required!
Relationships Represented Queried (SQL)
Represented Queried with limitations
Value Evolution Data can be updated Data can be updated
Structure Evolution
Uniform and rigid Manual change
(disruptive) Flexible Dynamic change
Models for Real World Data
Data Concern Tabular Model Document Model JSON + SQL
Rich Structure
Multiple flat tables Constant
assembly / disassembly
Documents No assembly
required! Document
Relationships
Represented Queried (SQL)
Represented Queried with
limitations SQL
Value Evolution
Data can be updated
Data can be updated
Data can be updated
Structure Evolution
Uniform and rigid Manual change
(disruptive) Flexible Dynamic change
Flexible & Dynamic
For the Best of Both, Bring SQL to NoSQL From JSON
Rich structure, no assembly Schema flexibility
From SQL General, proven, and pervasive query capabilities Querying across relationships
SQL + JSON
For the Best of Both, Bring SQL to NoSQL SQL + JSON isn’t new and many are doing it
already Couchbase Server, SQL++ , DocumentDB, Apache Spark, Datastax, Relational Guys – SQL Server,
Postgress, MySQL, Memsql And so on…
However these implementation are very different!
SQL ExtensionsRequirements for the Best SQL+JSON
Implementation
Bring SQL to JSON Requirements for Full SQL + JSON Solution
1. Ingest JSON and Return JSON2. Accessing JSON Hierarchies – the dot notation3. Bring full syntax – including JOINs and Sub-Queries4. Handle missing values – MISSING , UNSET5. Fold & Unfold JSON Hierarchies6. Native Set/Collection Handling
#1 Ingest JSON and Return JSON
Tables and Rows require constant assembly and disassembly…
VS
DEMO #1
"default": { "Company": "Some Company", "Organization": { "Research": { "Robotics": { "AI": { "People": [ { "manager": true, "name": "p2" }, { "manager": false, "name": "p3" }, { "manager": false, "name": "p4" } ]
#2 Access the JSON Hierarchies - Dot Notation
<>.Company
<>.Organization.Research
<>.Organization.Research.Robotics.AI.People[0]
#3 Bring full syntax – including JOINs and Sub-
Queries
…You don’t need JOINs & Subqueries in Document
Models…
#3Embedded vs. Referenced
{ “Name”: “Pike Brewing”, …
“City”:”Seattle”, “Country”:”USA”}
“Beers”: [{“Name”: “Blond”, …
},{“Name”:”Dark”,…}
],
Key: Brewery_Pike_Brewing{ “Name”: “Pike Brewing”,“Beers”:[“Pike_Brewing_Blond”, “Pike_Brewing_Dark”], … “City”:”Seattle”, “Country”:”USA”}
Key: Brewery_Pike_Brewing
{ “Name”:”Pike Brewing Blond”, “Type”:”Blond Beer” …}
Key: Pike_Brewing_Blond
#4 Handling Missing Values
Schema evolves fast and SQL should accommodate
SELECT * FROM …WHERE (trips IS MISSING AND departing_code = …) OR (trips IS NOT MISSING AND trips[0].departing_code = …)
v1 v2{ “flight_plan_id”:…, “departing_code”…, “arriving_code”:…, …}
{ “flight_plan_id”:…, “trips”:[ {“departing_code”…, “arriving_code”:…}, {“departing_code”…, “arriving_code”:…}, ] …}
#5 Fold and Unfold JSON Hierarchies
{ “flight_plan_id”:…, “trips”:[ {“departing_code”:…, “arriving_code”:…}, {“departing_code”:…, “arriving_code”:…}, ] …}
DEMO #2
#6 Native Set/Collection Handling
Search a Level or Sub-Hierarchy
SELECT * FROM …WHERE ANY item IN/WITHIN RoboticsSATISFY item.name=“P3”
Research
Robotics
AI
P1 P2
Speech
Listen
P3 P4
Speak
Group 1
P5
Group 2
P6
P3
DEMO #3
Conclusion
Models for Real World Data
Data Concern Tabular Model Document Model JSON + SQL
Rich Structure
Multiple flat tables Constant
assembly / disassembly
Documents No assembly
required! Document
Relationships
Represented Queried (SQL)
Represented Queried with
limitations SQL
Value Evolution
Data can be updated
Data can be updated
Data can be updated
Structure Evolution
Uniform and rigid Manual change
(disruptive) Flexible Dynamic change
Flexible & Dynamic