Presented by:

Charles Hathaway

from YottaDB

Charles works at YottaDB, a free/open-source database startup with a rich heritage of unique database designs, as a technology guru; that is to say, he spends his days writing code, working with new technology, and performing minor feats of software black magic. Prior to YottaDB, Charles spent a great deal of time studying the complexity of software engineering at Rensselaer Polytechnic Institute, with a particular interest in gaining insight into the impact software complexity has on education and the diversity of contributions.

Recent years have seen a massive rise in the prevalence of NoSQL technologies, despite the yelling and screaming of academics. Much of the popularity of these systems stems from the performance gains they have over SQL due to their ability to manipulate data directly, without needing to go through a SQL engine. This is particularly important for ACID transaction processing: where a SQL engine would need complex logic to ensure that no touched rows are updated during the transaction, a simpler NoSQL engine can perform a transaction with more ease. However, academics warn that in the end, we will need SQL to get the complex-query performance and flexibility required for any meaningful analytics on the data we store in these databases. But since when do academics get things right?

As it turns out, they got lucky this time. Users of these NoSQL engines have discovered that, although not needed for performance-critical operations such as transaction processing, SQL is needed to perform meaningful analytics using much of the tooling available. In response to this new demand, many NoSQL engines have started adding support for SQL queries. However, implementing these SQL engines can be quite a task, especially as one attempts to generate code to fetch information from these data stores that is not only correct, but also performant. Many implementations hit difficulties with the more interesting SQL features, such as outer joins, subqueries, and set operations. Of course, the task of writing a query optimizer is a research area in and of itself; decades of research have gone into the topic, and it is still alive and well. How can these new systems and developers use this expansive body of research to make things run better?

YottaDB (https://yottadb.com/) is a free/open-source NoSQL data store with full support for transaction processing, whose codebase has long been used for mission-critical applications in banking and healthcare. It stores data in a hierarchical fashion, delivering blazing performance for simple operations (such as setting a value, or verifying that a key does or doesn't exist) and complex operations (such as ACID transactions across many tables) by providing primitives to iterate over the hierarchy and features to enable transaction processing.
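
To make the hierarchical model concrete, here is a minimal sketch in Python of a tree of keys offering the three kinds of primitive mentioned above: setting a value, checking whether a key exists, and iterating over one level of the hierarchy in sorted order. It is a toy in-memory model and not YottaDB's API; the class `HierarchicalStore` and its methods `set`, `get`, `exists`, and `next_subscript` are hypothetical names chosen for illustration, and it assumes all subscripts at one level are of the same comparable type.

```python
from bisect import bisect_right


class HierarchicalStore:
    """Toy in-memory tree of keys; hypothetical, not YottaDB's actual API."""

    def __init__(self):
        # Maps a full key path (tuple of subscripts) to its value.
        self._values = {}
        # Maps a parent path to the sorted list of its child subscripts.
        self._children = {}

    def set(self, path, value):
        path = tuple(path)
        self._values[path] = value
        # Register every prefix so intermediate levels can be iterated.
        for i in range(len(path)):
            parent, child = path[:i], path[i]
            siblings = self._children.setdefault(parent, [])
            pos = bisect_right(siblings, child)
            if pos == 0 or siblings[pos - 1] != child:
                siblings.insert(pos, child)

    def get(self, path, default=None):
        return self._values.get(tuple(path), default)

    def exists(self, path):
        # A key exists if it holds a value or has descendants.
        path = tuple(path)
        return path in self._values or path in self._children

    def next_subscript(self, parent, after=None):
        """Return the child of `parent` that sorts after `after`, or None."""
        siblings = self._children.get(tuple(parent), [])
        pos = 0 if after is None else bisect_right(siblings, after)
        return siblings[pos] if pos < len(siblings) else None


if __name__ == "__main__":
    store = HierarchicalStore()
    store.set(("accounts", "bob", "balance"), 250)
    store.set(("accounts", "alice", "balance"), 100)

    # Walk one level of the hierarchy in sorted order.
    name = store.next_subscript(("accounts",))
    while name is not None:
        print(name, store.get(("accounts", name, "balance")))
        name = store.next_subscript(("accounts",), after=name)
```

Keeping each level's subscripts sorted is what lets a caller walk the hierarchy with a "give me the next key" primitive rather than scanning every stored value.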

This presentation discusses the process of implementing the SQL engine for YottaDB, an engine which provides a complete SQL '92 SELECT implementation, along with numerous optimizations to give exceptional analytical performance, in addition to the performance benefits we see from the YottaDB NoSQL engine. We discuss the pipeline that a SQL query goes through, from being parsed to being rendered as machine-executable code. Time permitting, a deep dive into some of the optimizations the engine performs will provide insights not only into YottaDB, but also into the performance constraints of all existing SQL implementations.
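
The parse-then-render pipeline described above can be illustrated in miniature. The following Python sketch is not the YottaDB engine and makes no claim about its design; it handles only a single-column `SELECT` with an optional equality `WHERE` clause, and the function names `parse`, `render`, and `compile_query` are hypothetical. It shows the general shape: SQL text becomes a small plan, the plan is rendered as source code, and that source is compiled into something executable.

```python
import re

# Grammar handled by this toy:
#   SELECT <col> FROM <table> [WHERE <col> = '<text>']
_QUERY = re.compile(
    r"^\s*SELECT\s+(\w+)\s+FROM\s+(\w+)"
    r"(?:\s+WHERE\s+(\w+)\s*=\s*'([^']*)')?\s*;?\s*$",
    re.IGNORECASE,
)


def parse(sql):
    """Stage 1: turn SQL text into a small plan dictionary."""
    match = _QUERY.match(sql)
    if match is None:
        raise ValueError("unsupported query: " + sql)
    column, table, where_col, where_val = match.groups()
    return {"column": column, "table": table,
            "where_col": where_col, "where_val": where_val}


def render(plan):
    """Stage 2: emit Python source for a function that runs the plan."""
    lines = [
        "def run(tables):",
        "    out = []",
        f"    for row in tables[{plan['table']!r}]:",
    ]
    body = f"out.append(row[{plan['column']!r}])"
    if plan["where_col"] is not None:
        lines.append(
            f"        if row[{plan['where_col']!r}] == {plan['where_val']!r}:"
        )
        lines.append("            " + body)
    else:
        lines.append("        " + body)
    lines.append("    return out")
    return "\n".join(lines)


def compile_query(sql):
    """Stage 3: compile the generated source into a callable."""
    namespace = {}
    exec(render(parse(sql)), namespace)
    return namespace["run"]


if __name__ == "__main__":
    tables = {"people": [
        {"name": "Ada", "city": "London"},
        {"name": "Grace", "city": "New York"},
    ]}
    query = "SELECT name FROM people WHERE city = 'London'"
    print(render(parse(query)))
    print(compile_query(query)(tables))
```

Generating code once per query means the filter and column choices are fixed in the emitted function instead of being re-interpreted for every row; a real engine would add a planning and optimization stage between parsing and rendering.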

Date:
2019 April 28 - 12:30
Duration:
45 min
Room:
CC-208
Conference:
LinuxFest Northwest 2019
Language:
Track:
Code
Difficulty:
Medium

Happening at the same time:

  1. Your Herd of Elephants: PostgreSQL Replication
     Start Time: 2019 April 28 12:30
     Room: HC-103 Postgres

  2. Live Coding Minesweeper in Clojurescript
     Start Time: 2019 April 28 12:30
     Room: CC-115

  3. Deep Dive into firecracker-containerd
     Start Time: 2019 April 28 12:30
     Room: CC-200

  4. Zero Knowledge Architecture, is it possible?
     Start Time: 2019 April 28 12:30
     Room: CC-114

  5. When NoSQL isn't enough, but SQL is too much
     Start Time: 2019 April 28 12:30
     Room: CC-208

  6. Ghostbusting
     Start Time: 2019 April 28 12:30
     Room: G-103

  7. From Analog to Digital and Back
     Start Time: 2019 April 28 12:30
     Room: CC-236

  8. Creating a Stronger Community by Poisoning Your Own Well
     Start Time: 2019 April 28 12:30
     Room: HC-104 Jupiter

  9. Common licensing issues for free software projects
     Start Time: 2019 April 28 12:30
     Room: HC-108

  10. Hugo: Making Building Websites Fun Again
      Start Time: 2019 April 28 12:30
      Room: CC-201 Tutorials

  11. Snapcraft Workshop
      Start Time: 2019 April 28 12:30
      Room: CC-202 Tutorials

  12. FreeBSD is Everywhere
      Start Time: 2019 April 28 12:30
      Room: CC-235