Exploring SQLite and the Litestack suite of SQLite-based tools for Ruby and Rails applications. Litestack offers a SQL database, a cache store, a job queue, a pubsub engine, full text search and performance metrics for your Ruby / Ruby on Rails apps
Litestack talk at Brighton 2024 (Unleashing the power of SQLite for Ruby apps)
1. Unleashing the power of SQLite for Ruby Applications
Mohamed Hassan
litestack
integrated, efficient, simple
https://github.com/oldmoe/litestack
3.-7. About me
Shipped my first app (for the MSX) in 1991
A long time Rubyist (2006), SQLiter (2005)
2x founder, 2x startup employee
Recently a Xoogler after 7 years at Google
Currently consulting while working on my new startup
oldmoe.blog
github.com/oldmoe
x.com/oldmoe
10. A database of course, we all need some sort of data store
A database server
11. Oh, and a cache, for the data we need nearby
A cache server
12. And a key value store for complex data structures
A key value store (Kredis)
13. We also need to be hip with instantaneous updates!
A pubsub server (ActionCable)
14. And to keep latency low, we kiq long-running jobs to the side
A queue server (ActiveJob) and job runner instance(s)
15. And we need "smart" search to support our users
A full text search server
16. So we end up with our little forest of dependencies
A database server, a cache server, a key value store (Kredis), a pubsub server (ActionCable), a queue server (ActiveJob), job runner instance(s) and a full text search server
5+ server instances running
5+ extra gems installed
6+ server and client libraries installed
17. So we end up with our little forest of dependencies
(Same dependency diagram and 5+/5+/6+ counts as above.)
Every extra service brings with it:
● Configuration requirements
● Management requirements
● Extra latencies for remote requests
20. And we end up with our little forest of dependencies
(Same dependency diagram and counts as above.)
Just use litestack
integrated, efficient, simple
24. SQLite is robust & reliable:
The most deployed DBMS in the world, with
the largest test suite of any OSS project.
25. 155.8 KLOC (source)
92.053 MLOC (tests)*
* Almost 600 lines of tests (mostly in TCL scripts) for every line of code (C)
26. SQLite is highly performant:
Very low latency due to running in-process.
Has a very efficient bytecode engine.
798K point reads/sec (~10 µs/read)
167K point updates/sec* (~50 µs/write)
* 8 concurrent processes on a Ryzen 5200u mobile CPU
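For context, a minimal sketch of how point-read throughput like this can be measured from Ruby with the sqlite3 gem; the table name, row count and pragma values here are illustrative placeholders, not the actual benchmark behind the numbers above.

require "sqlite3"
require "benchmark"

db = SQLite3::Database.new("bench.sqlite3")
db.execute("PRAGMA journal_mode = WAL")
db.execute("CREATE TABLE IF NOT EXISTS users(id INTEGER PRIMARY KEY, name TEXT)")

# seed some rows inside a single transaction
db.transaction do
  insert = db.prepare("INSERT OR IGNORE INTO users(id, name) VALUES (?, ?)")
  10_000.times { |i| insert.execute(i + 1, "user-#{i + 1}") }
  insert.close
end

# point reads through a reused prepared statement
stmt = db.prepare("SELECT * FROM users WHERE id = ?")
n = 100_000
elapsed = Benchmark.realtime { n.times { stmt.execute(rand(1..10_000)).to_a } }
puts "#{(n / elapsed).round} point reads/sec"
stmt.close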
28. SQLite is feature rich:
Partial indexes, JSON, recursive CTEs,
window functions, FTS, geopoly, virtual
columns, to name a few.
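A small hedged illustration (via the sqlite3 gem, with a schema made up for the example) of a few of these features together: a JSON payload column, virtual generated columns extracted from it, and a partial index. The ->> operator needs SQLite 3.38+.

require "sqlite3"
require "json"

db = SQLite3::Database.new(":memory:")

db.execute_batch(<<~SQL)
  CREATE TABLE users(
    id INTEGER PRIMARY KEY,
    data TEXT,                                   -- JSON payload
    email TEXT AS (data ->> '$.email') VIRTUAL,  -- virtual generated column
    status TEXT AS (data ->> '$.status') VIRTUAL
  );
  -- partial index: only covers active users
  CREATE INDEX idx_active_users_email ON users(email) WHERE status = 'active';
SQL

db.execute("INSERT INTO users(data) VALUES (?)",
           [{email: "a@example.com", status: "active"}.to_json])

p db.execute("SELECT email FROM users WHERE status = 'active'")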
29. WITH ids AS MATERIALIZED ( SELECT rowid AS id, -rank AS search_rank FROM idx(:query) ),
ids_bitmap AS MATERIALIZED ( SELECT rb_create(id) AS bitmap FROM ids),
selected_facets AS MATERIALIZED (
select value ->> '$[0]' AS f, value ->> '$[1]' AS v from ( select value from json_each(:facet_filters) )
), filtered_ids_bitmap AS MATERIALIZED (
SELECT rb_and(anded_bitmaps, (select bitmap from ids_bitmap)) AS bitmap FROM (
SELECT rb_and_all(ored_bitmaps) as anded_bitmaps FROM (
SELECT field, rb_or_all(bitmap) as ored_bitmaps FROM facets, selected_facets WHERE field = f AND value = v
GROUP BY field )) WHERE :facet_filters IS NOT NULL UNION ALL
SELECT bitmap FROM ids_bitmap WHERE :facet_filters IS NULL
), filtered_ids_by_categories_bitmap AS MATERIALIZED (
SELECT iif(rb_length(bitmap) > 0, bitmap, (select bitmap from ids_bitmap)) AS bitmap FROM (
SELECT rb_and(anded_bitmaps, (select bitmap from ids_bitmap)) AS bitmap FROM (
SELECT rb_and_all(ored_bitmaps) as anded_bitmaps FROM (
SELECT field, rb_or_all(bitmap) as ored_bitmaps FROM facets, selected_facets WHERE field = f AND value = v
AND f = 'categoryName' GROUP BY field))) WHERE :facet_filters IS NOT NULL UNION ALL
SELECT bitmap FROM ids_bitmap WHERE :facet_filters IS NULL
), filtered_ids AS MATERIALIZED (
SELECT id, search_rank FROM ids WHERE id IN (
(SELECT value FROM carray((SELECT rb_array(bitmap) FROM filtered_ids_bitmap),
(SELECT rb_length(bitmap) FROM filtered_ids_bitmap),'int64'))
) AND :facet_filters IS NOT NULL UNION SELECT id, search_rank FROM ids WHERE :facet_filters IS NULL
),
total_hits AS MATERIALIZED (
SELECT count(*) AS hits FROM filtered_ids
), facets_generator AS (
SELECT JSON_GROUP_OBJECT(field, JSON(value)) AS facets FROM (
SELECT field, JSON_GROUP_ARRAY(JSON_OBJECT('name', value, 'count', len)) AS value FROM (
SELECT field, value, len
FROM (
SELECT field, value, rb_and_length(facets.bitmap, filtered_ids_by_categories_bitmap.bitmap) AS len
FROM facets, filtered_ids_by_categories_bitmap
WHERE field = 'stars'
ORDER BY len DESC LIMIT 10) UNION
SELECT field, value, len FROM (
SELECT field, value, rb_and_length(facets.bitmap, filtered_ids_by_stars_bitmap.bitmap) AS len
FROM facets, filtered_ids_by_stars_bitmap
WHERE field = 'categoryName'
ORDER BY len DESC LIMIT 10
) ORDER BY field, len DESC ) GROUP BY field )
), documents_page AS (
SELECT JSON_GROUP_ARRAY(
JSON_OBJECT(
'id', rowid, 'asin', asin, 'categoryName', categoryName,
'title', title, 'imgUrl', imgUrl, 'productUrl', productUrl,
'stars', stars, 'reviews', reviews, 'price', price,
'isBestSeller', isBestSeller, 'boughtInLastMonth',
boughtInLastMonth, 'search_rank', search_rank
)) AS documents FROM (
SELECT rowid, asin, categoryName, title, imgUrl, productUrl, stars, reviews, price,
isBestSeller, boughtInLastMonth, search_rank
FROM products, filtered_ids WHERE products.rowid = filtered_ids.id
ORDER BY ${ORD} Limit ifnull(:limit, 15) Offset ifnull(:offset, 0) )
) SELECT hits, JSON_GROUP_OBJECT(title, JSON(value)) FROM total_hits, (
SELECT 'total_hits' AS title, hits AS value FROM total_hits UNION ALL
SELECT 'documents' AS title, documents AS value FROM documents_page UNION ALL
SELECT 'facets' AS title, facets AS value FROM facets_generator
);
A single query to fetch records from an FTS5 virtual table,
select facet values using CTEs, TVFs & UDFs, and group all
results into a JSON object*
- Load the list of matching ids from fts5 virtual table
- Convert the list to a roaring bitmap
- Calculate the count of all records in result set
- Create the facets results based on filtered id bitmaps
- Calculate the facet values
- Create a results page and load the documents as JSON objects
- Aggregate all results (page, count, facets, etc) as JSON object
* Runs under 50ms on a ~2.1M products dataset
30. SQLite is extensible:
Many user defined functions, table valued
functions, virtual tables and virtual file
systems exist as extensions.
32. SQLite is low maintenance:
Just a file on disk that you can copy and
move around. No external
processes/services.
33. # capture the current time in a sortable, shell-safe format
timestamp = Time.now.strftime("%Y%m%d_%H%M%S")
# start a read transaction to prevent WAL file truncation
Litedb.transaction(:deferred) do
  # create the backup directory
  `mkdir litedb_backup_#{timestamp}`
  # copy the database files using copy-on-write* (on Btrfs, XFS or ZFS)
  `cp --reflink=always #{Litedb.filename}* litedb_backup_#{timestamp}`
end
My preferred backup method (file is on replicated block storage)
* Gigabyte sized files are copied in ~2ms
37. Anatomy of a client-server query round trip (18 steps):
Client: 1. Encode query 2. Copy to TCP buffers 3. Send over the network
Server: 4. Receive on server side 5. Copy from TCP buffers 6. Decode query 7. Parse query 8. Generate plan 9. Execute 10. Load pages from cache 11. Load missing pages from disk 12. Encode results 13. Copy results to TCP buffers 14. Send over the network
Client: 15. Receive on client side 16. Copy from TCP buffers 17. Decode results 18. Return results to caller
38. The 18 client-server steps VS SQLite in-process (6 steps): Parse query, Generate plan, Execute, Load pages from cache, Load missing pages from disk, Return results to caller
39. What if the data we need already exists in the cache?
40. SQLite drops to 5 steps (no loading of missing pages from disk).
41. And what if we use a prepared statement?
42. SQLite drops to 3 steps: Execute, Load pages from cache, Return results to caller.
43. Even with a warm page cache and prepared statements, the client-server round trip still needs 15 steps VS SQLite's 3.
46.-48.
Reduce disk access by using a properly sized page cache!
Eliminate query compilation by using prepared statements!
Build the proper indexes to speed up queries!
Eliminate network overheads by using SQLite!
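A hedged sketch of what those knobs look like from plain Ruby with the sqlite3 gem; the cache size, table and query are placeholders (litedb applies this kind of tuning for you).

require "sqlite3"

db = SQLite3::Database.new("app.sqlite3")
db.execute("CREATE TABLE IF NOT EXISTS users(id INTEGER PRIMARY KEY, email TEXT)")

# 1. a properly sized page cache: negative values are KiB, so this keeps
#    roughly 256 MiB of pages in memory
db.execute("PRAGMA cache_size = -262144")

# 2. the right indexes for the queries you actually run
db.execute("CREATE INDEX IF NOT EXISTS idx_users_email ON users(email)")

# 3. prepared statements: compile once, bind and execute many times
stmt = db.prepare("SELECT * FROM users WHERE email = ?")
row = stmt.execute("someone@example.com").first
stmt.close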
50. sqlite> CREATE TABLE users(id INTEGER PRIMARY KEY, data TEXT, email TEXT, password TEXT, status TEXT);
sqlite> EXPLAIN SELECT * FROM users WHERE id = ?;
An example bytecode program for a compiled statement
61.-64. Example queries by query effort and result size (each with its comparison marker from the slide):
Low effort / Large result: SELECT * FROM logs LIMIT 10000; (<)
Low effort / Small result: SELECT * FROM users WHERE id = 1; (<)
High effort / Large result: SELECT avg(duration) AS d, user_id FROM logs GROUP BY user_id ORDER BY d DESC; (<=)
High effort / Small result: SELECT avg(duration) FROM logs; (=)
67.-70. Example write statements by transaction length and lock footprint (quadrants: Short transactions / Few locks, Short transactions / Many locks, Long transactions / Few locks, Long transactions / Many locks), each with its comparison marker from the slide:
UPDATE logs SET v = ? WHERE id = ?; (<)
INSERT INTO logs (..) VALUES (..); (<)
UPDATE logs SET v = ? WHERE k = ?; (>)
UPDATE logs SET v = ?; (>=)
73. Suboptimal defaults:
More optimized for a phone db than for a web
application db.
You will see SQLITE_BUSY errors when you
access the db from multiple processes.
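A hedged example of the settings that avoid most SQLITE_BUSY surprises when a plain SQLite database is shared by several processes (litedb sets up similar defaults for you); the exact values are illustrative.

require "sqlite3"

db = SQLite3::Database.new("shared.sqlite3")

# WAL lets readers keep reading while a single writer is active
db.execute("PRAGMA journal_mode = WAL")

# wait up to 5 seconds for a lock instead of failing immediately with SQLITE_BUSY
db.busy_timeout = 5_000

# NORMAL is generally considered safe with WAL and is much faster than the default FULL
db.execute("PRAGMA synchronous = NORMAL")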
74. No higher level abstractions:
Only a relational db with multiple indexing
options, but no queue, pubsub, data expiry,
etc.
90. An overview of Litestack components*
* All benchmarks were run on an AMD 5700U machine from a single process
91. litedb
Native driver on top of SQLite3
Applies many optimizations out of the box, much
faster than the vanilla driver and more concurrent
ActiveRecord & Sequel Adapters
Metrics support for query execution time and
frequency
Point query throughput in ActiveRecord
* Litedb point read performance was benchmarked using 8 processes
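In Rails the installer wires the adapter into database.yml for you; below is a hedged sketch of a manual, non-Rails setup, assuming (per the litestack README) that requiring litestack makes the litedb ActiveRecord adapter resolvable. The path and model are placeholders.

require "litestack"
require "active_record"

ActiveRecord::Base.establish_connection(
  adapter: "litedb",
  database: "db/app.sqlite3"
)

class User < ActiveRecord::Base; end  # assumes a users table already exists
User.where(email: "someone@example.com").first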
92. litesearch
A Litedb extension, loaded by default
A high performance and flexible full text search
engine
Built on top of SQLite FTS5, delivers a dynamic layer
on top of its fixed structure
Supports Porter and Trigram tokenizers
ActiveRecord and Sequel integration
class Article < ApplicationRecord
include Litesearch::Model
litesearch do |schema|
schema.fields [:body, :summary]
schema.field :title, weight: 10
end
end
Article.search("litestack benchmarks")
Meilisearch vs Litesearch 3 word phrase query throughput
93. (Same litesearch overview and model code as above, shown with the Meilisearch vs Litesearch 3 word phrase query throughput chart.)
94. litecache
A bare metal cache store with optimized database
operations
Supports size limiting and background garbage
collection of expired entries
Faster read operations than any other cache engine
for Ruby, including SQLite based ones that rely on
higher level abstractions
Rails Active Support cache adapter
Redis vs Solid Cache vs Litecache read throughput
* Litecache performance was run on an AMD 7840HS machine using a single process
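A hedged sketch of using it from Rails, assuming (per the litestack README) that the store is registered as :litecache; the key and block are placeholders. The read/write API is the standard Rails.cache one.

# config/environments/production.rb
config.cache_store = :litecache

# anywhere in the app
Rails.cache.fetch("products/trending", expires_in: 5.minutes) do
  Product.order(updated_at: :desc).limit(10).to_a
end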
95. Not all SQLite caches are created equal!
(Same litecache overview and Redis vs Solid Cache vs Litecache read throughput chart as above.)
96. litejob
A queue, a job processor and a job class interface
Supports multiple queues, with priorities
Runs in-process, no need for an external processing
service
A fiber-optimized mode for much higher concurrency
for IO-bound jobs
Rails Active Job adapter
Litejob throughput scaling from threads to fibers
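A hedged sketch of switching Active Job over, assuming (per the litestack README) the adapter is registered as :litejob; the job class and reindex! method are made up for the example.

# config/application.rb
config.active_job.queue_adapter = :litejob

# a regular Active Job class, enqueued as usual
class ReindexProductJob < ApplicationJob
  queue_as :default

  def perform(product_id)
    Product.find(product_id).reindex!
  end
end

ReindexProductJob.perform_later(product.id)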
97. litecable
An Action Cable pubsub backend based on SQLite
High performance and low latency message routing
Runs in-process, no need for an external processing
service
Dedicated reader connection that never blocks
Redis vs Litecable message processing throughput
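A hedged note on usage: point config/cable.yml at the litecable adapter (assumption: the adapter name is litecable, per the litestack README) and broadcasting then goes through the regular Action Cable API, for example:

ActionCable.server.broadcast("notifications:#{current_user.id}", {title: "Job finished"})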
98. Litemetric / Liteboard
Plugs into all other Litestack components
(configuration option)
High performance, can collect thousands of metrics
per second on a single core
Can be used to collect metrics for non-Litestack
operations
Liteboard provides a web app for per-component
custom metrics visualization
104. bundle add litestack
Step 1: Add Litestack to your Gemfile
bin/rails generate litestack:install
Step 2: Run the Litestack installer
105. bundle add litestack
Step 1: Add Litestack to your Gemfile
bin/rails generate litestack:install
Step 2: Run the Litestack installer
Step 3:
106. bundle add litestack
Step 1: Add Litestack to your Gemfile
bin/rails generate litestack:install
Step 2: Run the Litestack installer
Step 3: There is no step 3!
Congratulations!
You now have a database, a cache store, a
background job processor, a cable backend, a full
text search engine and (optionally) metrics for all
these components. All ready for you to use!
113. From 1 vCore / 1 GB RAM / 5 GB storage up to 256 vCores / 2 TB RAM / 10 TB storage:
Hardware keeps advancing at a great pace, and single machine performance is very high.
Coupled with scalable, replicated, block storage, you have all that you need for a Litestack
application that can grow as your needs grow.
You scale up instead!
123. litestack
integrated, efficient, simple
Litestack has deep integration with the Fiber scheduler, leading to faster performance and
lower memory usage when coupled with a fiber scheduler based server like Falcon
https://github.com/socketry/falcon
126. litekd
Lite keyed data, a Kredis compatible interface
backed by SQLite
Implements the Kredis API & integrates with
ActiveModel/ActiveRecord
Also integrates with Sequel::Model and any PORO
Fast, consistent (transactional!) and durable
my_list = Litekd.unique_list(
  "mylist",
  expires_in: 5.seconds
)
my_list.append([1, 2, 3])
my_list << 3 # already present, so the unique list is unchanged
[1, 2, 3] == my_list.members # => true
127. # truly almost everything
gem "litestack"
# keyed data
gem "kredis"
# database connection
gem "pg"
# cache, cable & queue
gem "redis"
gem "hiredis"
# job processing
gem "sidekiq"
# full text search
gem "elasticsearch-rails"
# performance monitoring
gem "rails_performance"
Turn this .. .. into this .. .. and get this!
(Aggregate latencies chart: Litesearch, Litedb, Litejob and Litecache vs Elasticsearch, PostgreSQL, Sidekiq and Redis.)
129. Simpler and DRYer:
- Single config file for all components
- Fewer configuration options
- More reliance on conventions
- Liteboard mounted inside your app
130. Faster and more concurrent:
- Optimized database access
- Concurrent queries
- Offloaded journal checkpointing
- Less blocking on write locks
131. Improved Litejob:
- More durable job execution
- Recurring jobs support
- Lower execution overhead
- Explicit starting of job workers