
6. 11. 2023

19 min read

SurrealDB, AWS DynamoDB and AWS Lambda

One of the most significant paradigm shifts in recent times is the concept of serverless computing. At its core, 'serverless' doesn't mean that there are no servers involved. Instead, it signifies a model where developers are free from the worries of server management and can focus on their core application logic. The infrastructure details are abstracted away by cloud providers, who dynamically manage the allocation of machine resources. In this blog, we will walk you through our journey of integrating AWS DynamoDB with SurrealDB and preparing it to run on AWS Lambda. We will cover a short introduction to relevant topics, the challenges we faced, the solutions we discovered, and the promising results of our efforts.

Roman Šelmeci

In a serverless database model, scaling, patching, and administration of databases are handled automatically by the service provider. This shift allows developers to focus on the data and application, rather than the underlying database management.

Serverless computing is not free of challenges, for example:

  • Cold starts can cause latency

  • The abstract nature of serverless might not be appropriate for all types of applications

  • Although serverless can cut costs, it can result in unexpectedly high expenses if not managed correctly

  • Relying on the cloud provider can lead to issues with vendor lock-in

Despite these challenges, the serverless model, especially in the context of databases, is increasingly becoming an attractive option for businesses and developers.

Today, we are focusing on one possible update to SurrealDB, an open-source database that has a good chance of making its mark in the developer community. SurrealDB is proving itself as a robust platform for multi-model data. Nonetheless, as with any technology, there is always room for growth and innovation. We have successfully implemented a new data storage system for SurrealDB using AWS DynamoDB. This shift to AWS DynamoDB also includes transforming SurrealDB's compute engine to run on AWS Lambda.

However, there is a catch. This move to AWS DynamoDB means we are dropping transactional support from SurrealDB. It's a big trade-off, but one we believe is justified given the overarching benefits of a serverless architecture. Additionally, please note that the export function is not supported in our implementation.

Feature highlights of SurrealDB and SurrealQL

SurrealDB, an exciting open-source project, includes an SQL-style query language (SurrealQL), real-time queries with highly efficient related-data retrieval, advanced security permissions for multi-tenant access, and support for performant analytical workloads. SurrealQL shares similar syntax and statement types with traditional SQL; however, there are a number of differences between the two, making SurrealQL a unique feature of SurrealDB.

One notable difference is the support for nested and structured data in SurrealQL, allowing for more flexible and expressive data modeling. This enables developers to efficiently work with complex data structures, such as nested objects or arrays, and perform queries that traverse and manipulate these structures seamlessly.

Another significant enhancement is the inclusion of graph-oriented querying features in SurrealQL. This allows developers to leverage the power of graph data structures and perform graph-related operations like traversing relationships, finding paths, and executing graph algorithms directly within SurrealDB.

Additionally, SurrealQL offers advanced security permissions for multi-tenant access, enabling fine-grained control over data access and ensuring data privacy and security in multi-user environments.

Here are some examples:

  • Create a new record with a random ID

CREATE person CONTENT {
    name: 'Tobie',
    company: 'SurrealDB',
    skills: ['Rust', 'Go', 'JavaScript'],
};

  • Create a new record with a specific ID (tobie)

CREATE person:tobie CONTENT {
    name: 'Tobie',
    company: 'SurrealDB',
    skills: ['Rust', 'Go', 'JavaScript'],
};

SurrealQL also introduces several new concepts that bring a lot of new power to application development. One of them is the concept of futures. Futures are values that are only computed when the data is selected and returned to the client. They can be stored inside records to enable dynamic values that are always calculated when queried.

CREATE person SET
    name = 'Jason',
    friends = [person:tobie, person:jaime],
    adult_friends = <future> { friends[WHERE age > 18].name }
;

Please note that this blog post isn't meant to be a comprehensive guide on how to use SurrealQL. Instead, it's a brief introduction to the capabilities and potential of this exciting query language. If you're interested in diving deeper and learning how to use SurrealQL, you can visit the official documentation page.

The opportunities for our extensions

SurrealDB, by design, separates the compute and storage layers, allowing each to scale independently.

  • The query layer, also known as the compute layer, handles queries from the client, determining which records need to be selected, created, updated, or deleted. The process begins with a parser that interprets the SurrealQL query, followed by an executor that breaks up each statement within the query. The statements are then run through an iterator which determines which data should be fetched from the key-value storage engine. Lastly, each record is processed by the document processor, which manages permissions and determines which data is merged, altered, and stored on disk.

  • The storage layer is responsible for the storage of data for the query layer. This layer supports a number of underlying storage engines, some of which support concurrent readers, while others support both concurrent readers and writers.

Today's applications demand databases capable of managing vast data volumes, providing rapid read and write operations, and scaling with ease. While SurrealDB's present storage system is commendably effective, it isn't devoid of limitations, signaling avenues for enhancement.

Its current storage systems include RocksDB for embedded mode, TiKV for distributed mode, and IndexedDB for running in a web browser. Each storage system comes with its unique set of advantages and challenges.

RocksDB, an embedded key-value data store, is optimized to exploit many CPU cores and make efficient use of fast storage such as SSD disks. However, being an embedded database, it inherently lacks the capability to scale horizontally across multiple nodes, which can limit its use in highly distributed systems.

Conversely, TiKV offers impressive scalability and minimal latency. It not only supports both raw and transaction-based queries with ACID compliance but also accommodates multiple concurrent readers and writers. However, setting up and maintaining a TiKV cluster can be intricate, and its significant dependence on a robust network environment may not suit every scenario.

IndexedDB excels in web browser environments, offering strong performance, mirroring the full suite of functionality that SurrealDB offers, and bringing data closer to end users. However, its scope is restricted to web browsers, making it unsuitable for some use cases.

Recognizing these constraints, we saw an opportunity for a new data storage system for SurrealDB: a system that could combine the benefits of these storage engines while mitigating their limitations. We considered AWS DynamoDB as SurrealDB's potential new backbone for data storage.

We can go further and adapt SurrealDB to a fully serverless environment. Why is SurrealDB the ideal candidate for a serverless environment? Its query unit is separated from the data layer, which allows it to scale independently from storage and run in different environments. Also, SurrealDB is entirely written in Rust, a systems programming language known for its safety and performance. Rust's focus on zero-cost abstractions, memory safety, and concurrency without data races makes it perfectly suited for high-performance serverless applications. These qualities allow Rust applications to start quickly and use less memory, thus reducing the cold start time typically associated with serverless functions. Rust's compilation strategy also helps minimize the size of the deployment package, further improving the cold start time. In conclusion, SurrealDB, with its architecture, its use of Rust, and its compatibility with AWS Lambda, represents a significant step forward in the world of serverless databases.

In the following sections, we discuss our reasoning for choosing AWS DynamoDB, the process of implementation, and the benefits it brings to SurrealDB.

Adapting SurrealDB to a serverless environment

AWS DynamoDB is a fully managed NoSQL database service offered by Amazon Web Services (AWS). DynamoDB's architecture allows it to scale horizontally, supporting massive data sizes while providing single-digit millisecond latency. Its automatic sharding and replication across multiple AWS regions provide high availability and data durability. With DynamoDB, the operational burdens of database management — like hardware provisioning, configuration, and software patching — fade away, letting developers pivot their attention to developing applications.

The choice of DynamoDB as a new storage system for SurrealDB is driven by features that address the shortcomings of the current SurrealDB storage systems. Its ability to handle large amounts of data, support for key-value and document data models, low latency, high availability, and scalability make it a strong candidate. Its scalability mirrors that of TiKV, but it further simplifies the equation by being a managed service, negating the intricacies of orchestrating a distributed storage framework. In contrast to RocksDB, DynamoDB isn't an embedded database; it's a serverless solution ready to fulfill server-based storage needs with very low latency, like RocksDB offers. Additionally, its global footprint, with data replication across multiple AWS regions, brings data closer to the end user, much like IndexedDB does.

Although AWS DynamoDB is a robust NoSQL database, it's not without its constraints. If we adopt AWS DynamoDB as the storage foundation for SurrealDB, these constraints become ours too, necessitating strategies to mitigate their effects.

Here are some key limitations to consider:

  1. Item Size Limit: AWS DynamoDB has a limit on the size of each item (a record) – it cannot exceed 400KB. This limit forces you to think about how you structure your data and encourages denormalization and efficient design patterns. For instance, if you have a large amount of data to store, it might be more efficient to break it into multiple items or store it elsewhere (e.g., S3 for large binary data) and keep a reference in AWS DynamoDB.

  2. Data Retrieval Constraints: The Scan and Query operations in AWS DynamoDB can return up to 1 MB of data per request. If the data you're looking for is not present in the first request's response, you'll have to paginate through the results, which could potentially impact performance in large-scale applications.

  3. Partition Throughput Limits: In AWS DynamoDB, a partition serves as a fundamental unit of data storage and throughput. When you create a table in DynamoDB, the system automatically partitions the data across multiple servers to ensure high availability and performance. Each partition is an independent chunk of your table's data and is managed separately, allowing for parallel read and write operations. Partitions are automatically managed by DynamoDB behind the scenes, but understanding how they work is crucial for optimizing performance, especially with respect to read and write throughput. Each partition in an AWS DynamoDB table has its own throughput limit (3,000 Read Capacity Units and 1,000 Write Capacity Units). Distributing load evenly across all partitions is crucial to avoid throttling. However, this is not always possible and can lead to issues with "hot" partitions.

  4. Other Limitations: There are also other limitations related to key lengths, attribute names, and expression lengths, as well as default quotas per table for read and write capacity units.
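For instance, the 1 MB cap on Scan and Query results means large result sets arrive page by page. The pagination loop can be sketched as follows. This is a minimal sketch with a mock `Page` type standing in for an SDK response; the real AWS SDK exposes the continuation token as `LastEvaluatedKey` on its own request and response types.

```rust
/// A mock page of results: items plus an optional continuation token,
/// mirroring DynamoDB's `LastEvaluatedKey` (illustrative, not SDK types).
struct Page {
    items: Vec<String>,
    last_evaluated_key: Option<usize>,
}

/// Simulates a paginated Scan: keep issuing requests until no
/// continuation token is returned, accumulating every page's items.
fn scan_all(pages: &[Page]) -> Vec<String> {
    let mut results = Vec::new();
    let mut cursor = Some(0);
    while let Some(idx) = cursor {
        let page = &pages[idx];
        results.extend(page.items.iter().cloned());
        cursor = page.last_evaluated_key;
    }
    results
}

fn main() {
    let pages = vec![
        Page { items: vec!["a".into(), "b".into()], last_evaluated_key: Some(1) },
        Page { items: vec!["c".into()], last_evaluated_key: None },
    ];
    // Two requests are needed to collect all three items
    assert_eq!(scan_all(&pages), vec!["a", "b", "c"]);
}
```

In a large-scale application, each extra round trip adds latency, which is why the number of pages a query touches matters.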

Constraints such as item size, data retrieval limits, and various other key and attribute restrictions often cannot be completely circumvented without a deep understanding of your application's specific requirements. However, when it comes to the challenge of partition throughput limits, SurrealDB provides viable solutions to mitigate the issue using a well-known design pattern - Single-Table Design. To properly address this issue, it's important to first understand the concept of "hot" partitions.

Hot partitions

The term "hot" partition in the AWS DynamoDB landscape denotes a situation where a certain partition receives a disproportionately high volume of read or write operations, which arises from an uneven distribution of partition key values. This imbalance can cause a number of issues, including throttling, reduced performance, and in extreme cases even disruption of your service if the demand overshoots the provisioned throughput for that table or partition.

An easy example can be found on social media platforms. Consider a platform where certain users, such as celebrities, amass followers far exceeding the average. If user data is partitioned by user ID, partitions associated with these high-profile users can rapidly become "hot" due to the surge of read requests.

Addressing the issue of hot partitions effectively requires a well-thought-out design strategy. Here are a few ways to mitigate this issue:

  1. Uniform Distribution of Partition Key: When choosing your partition key, select an attribute that has a large number of unique values and is likely to have evenly distributed access patterns.

  2. Use Composite Keys: Composite keys, which combine a partition key and a sort key, can help distribute traffic more uniformly across your partitions.

  3. Disperse High-Traffic Items: If you anticipate certain items attracting a high volume of traffic, consider breaking their data into multiple items or even tables. This can ensure a more even distribution of the read and write load.

  4. Implement Caching: Implementing a caching layer can significantly reduce the load on your AWS DynamoDB tables. Technologies like Amazon ElastiCache or in-memory databases like Redis can be particularly effective.

  5. Use AWS DynamoDB Accelerator (DAX): DAX is a fully managed, highly available, in-memory cache for AWS DynamoDB that can dramatically boost the performance of read-intensive workloads.
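As a rough illustration of the caching strategy (point 4), here is a toy read-through cache in front of an in-memory map standing in for a DynamoDB table. The `CachedTable` type and its fields are purely illustrative; a production setup would use ElastiCache, Redis, or DAX rather than a process-local map.

```rust
use std::collections::HashMap;

/// A toy read-through cache in front of a "table" (a HashMap standing in
/// for DynamoDB). On a miss we read from the table and remember the value,
/// so repeated reads of a hot key never hit the backing store again.
struct CachedTable {
    table: HashMap<String, String>,
    cache: HashMap<String, String>,
    table_reads: usize,
}

impl CachedTable {
    fn get(&mut self, key: &str) -> Option<String> {
        if let Some(v) = self.cache.get(key) {
            return Some(v.clone());
        }
        self.table_reads += 1;
        let v = self.table.get(key)?.clone();
        self.cache.insert(key.to_string(), v.clone());
        Some(v)
    }
}

fn main() {
    let mut t = CachedTable {
        table: HashMap::from([("celebrity".to_string(), "profile".to_string())]),
        cache: HashMap::new(),
        table_reads: 0,
    };
    for _ in 0..1000 {
        t.get("celebrity");
    }
    // 1000 reads of the hot key, but only one reached the table
    assert_eq!(t.table_reads, 1);
}
```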

By implementing these strategies, you can avoid the pitfalls of hot partitions and ensure a smooth, scalable experience with AWS DynamoDB. SurrealDB uses structured keys for managing data in a key-value datastore. This internal data management aligns perfectly with the first two solutions, making it an ideal candidate for implementation on AWS DynamoDB. These keys have specific prefixes that help in identifying and categorizing the data into namespaces, databases, tables, graphs, etc. Here are some examples:

  • Table: /*{ns}*{db}*{tb}

  • Thing: /*{ns}*{db}*{tb}*{id}

  • Graph: /*{ns}*{db}*{tb}~{id}{eg}{fk}

In these examples, ns stands for the namespace name, db for the database name, tb for the table name, id for the id of a record in a database, and so on. These prefixes are ideal for defining partitions in AWS DynamoDB, as they provide a logical separation of data based on its type and purpose. This is also an essential principle of the Single-Table Design Pattern.
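The key formats above can be sketched as simple string builders. These helpers are illustrative only; SurrealDB's actual key encoding is a binary format, not plain strings.

```rust
/// Builds a SurrealDB-style table key following the `/*{ns}*{db}*{tb}` format.
fn table_key(ns: &str, db: &str, tb: &str) -> String {
    format!("/*{ns}*{db}*{tb}")
}

/// Builds a record ("thing") key following the `/*{ns}*{db}*{tb}*{id}` format.
fn thing_key(ns: &str, db: &str, tb: &str, id: &str) -> String {
    format!("/*{ns}*{db}*{tb}*{id}")
}

fn main() {
    let tb = table_key("myns", "mydb", "person");
    let th = thing_key("myns", "mydb", "person", "tobie");
    // Every record key shares its table's key as a prefix, which is what
    // makes prefix-based partitioning (and range scans) possible.
    assert!(th.starts_with(&tb));
    assert_eq!(th, "/*myns*mydb*person*tobie");
}
```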

Single Table Design Pattern

The Single Table Design pattern works by using composite primary keys and secondary indexes to enable complex data access patterns. This approach not only optimizes the utilization of provisioned throughput but also curtails costs and streamlines data access methodologies.

The benefits of Single-Table Design include:

  1. Efficiency: By storing all data in a single table, we can perform batch operations on multiple items more efficiently.

  2. Cost-effectiveness: AWS charges for each read and write to DynamoDB. By minimizing the number of tables, we can reduce the number of operations and thus the cost.

  3. Simplicity: It simplifies the data access patterns and makes it easier to manage the data.

The key to this design pattern is the use of composite primary keys and secondary indexes. A composite primary key is made up of two attributes: a partition key and a sort key. The partition key is used to distribute data across multiple partitions for scalability and performance. The sort key is used to sort items within each partition. This combination allows us to store multiple types of items in the table, each with a unique combination of partition key and sort key.

Additionally, Global Secondary Indexes (GSIs) can be used to enable more flexible querying. A GSI allows querying data in different ways, using different keys than those defined by our table's primary key. This means we can create various access patterns to efficiently query our data, even when it's stored in a single table. In essence, the Single-Table Design principle allows us to store multiple types of related entities in a single table and use GSIs to enable complex access patterns.

Let's consider an example of a blogging platform. In a traditional relational database, you might have separate tables for Users, Blogs, and Comments. However, with Single Table Design in AWS DynamoDB, you could store all these entities in a single table.

In such a table, each item's Partition Key begins with a prefix that indicates its type (USER, BLOG, or COMMENT), followed by a unique identifier for that item. The Sort Key is used to group related items together. For example, all blogs and comments related to a specific user are grouped under that user's partition. To retrieve all blogs and comments for a user, we can perform a single query on the Partition Key. To retrieve all comments for a specific blog, we can perform a query on the Partition Key and a range on the Sort Key.
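These access patterns can be sketched against an in-memory collection. The `USER#`/`BLOG#`/`COMMENT#` key shapes below are one possible layout for the blogging example, not a prescribed schema, and the filters stand in for DynamoDB's key-condition expressions.

```rust
/// One item in a single-table layout: a composite primary key plus payload
/// (payload attributes omitted for brevity).
struct Item {
    pk: String,
    sk: String,
}

/// "Query on the Partition Key": everything in one user's partition.
fn query_partition<'a>(items: &'a [Item], pk: &str) -> Vec<&'a Item> {
    items.iter().filter(|i| i.pk == pk).collect()
}

/// "Query on the Partition Key and a range on the Sort Key": e.g. only
/// the comments under one blog, via a sort-key prefix.
fn query_prefix<'a>(items: &'a [Item], pk: &str, sk_prefix: &str) -> Vec<&'a Item> {
    items.iter().filter(|i| i.pk == pk && i.sk.starts_with(sk_prefix)).collect()
}

fn main() {
    let items = vec![
        Item { pk: "USER#1".into(), sk: "PROFILE".into() },
        Item { pk: "USER#1".into(), sk: "BLOG#10".into() },
        Item { pk: "USER#1".into(), sk: "BLOG#10#COMMENT#1".into() },
        Item { pk: "USER#2".into(), sk: "PROFILE".into() },
    ];
    // One query fetches the user's profile, blogs, and comments together
    assert_eq!(query_partition(&items, "USER#1").len(), 3);
    // A sort-key range narrows it down to one blog's comments
    assert_eq!(query_prefix(&items, "USER#1", "BLOG#10#COMMENT").len(), 1);
}
```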

Our improvements

Our enhancements to SurrealDB include the implementation of a new datastore and the adaptation of the server for use in AWS Lambda.

1. Datastore

To make AWS DynamoDB compatible with SurrealDB, we need to implement a new data store and a new transactional layer. Within SurrealDB, the term "datastore" denotes the foundational key-value storage mechanism upon which the database functions. This can refer to any system compatible with key-value data storage, such as AWS DynamoDB, RocksDB, or TiKV. The database communicates with the datastore via a standardized interface, regardless of the specific datastore in play. This facilitates the harnessing of distinct features from each data store while presenting a cohesive API to the clients. Each datastore can be distinguished by a unique URI.

In relation to SurrealDB, we've defined a distinct URI format to signify AWS DynamoDB as a datastore. It follows the structure: dynamodb://tablename?shards=n, where dynamodb:// is the scheme indicating the use of DynamoDB as a data store, tablename is the name of the DynamoDB table, and shards=n is an optional parameter specifying the number of shards.
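Parsing this URI format can be sketched with the standard library alone. The helper below is hypothetical, not the actual SurrealDB code; it simply shows how the scheme, table name, and optional shards parameter decompose.

```rust
/// Parses a datastore URI of the form `dynamodb://tablename?shards=n`.
/// Returns the table name and the shard count (defaulting to 1 when the
/// optional `shards` parameter is absent).
fn parse_datastore_uri(uri: &str) -> Option<(String, u8)> {
    let rest = uri.strip_prefix("dynamodb://")?;
    let (table, query) = match rest.split_once('?') {
        Some((t, q)) => (t, Some(q)),
        None => (rest, None),
    };
    // `shards` is optional; a malformed value rejects the whole URI
    let shards = query
        .and_then(|q| q.split('&').find_map(|p| p.strip_prefix("shards=")))
        .map_or(Some(1), |n| n.parse().ok())?;
    Some((table.to_string(), shards))
}

fn main() {
    assert_eq!(
        parse_datastore_uri("dynamodb://surrealdb?shards=4"),
        Some(("surrealdb".to_string(), 4))
    );
    assert_eq!(
        parse_datastore_uri("dynamodb://surrealdb"),
        Some(("surrealdb".to_string(), 1))
    );
    // A different scheme is not a DynamoDB datastore URI
    assert_eq!(parse_datastore_uri("tikv://pd:2379"), None);
}
```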

Why do we define an additional shards parameter? Even with the structured key design, there's still a risk of a hot partition if any table, database, or resource is heavily used. Therefore, we have implemented a sharding mechanism for partitions. Sharding is a technique where a partition is divided into smaller pieces, or shards, each of which can be processed independently. This allows us to distribute the load more evenly across partitions and avoid hot spots.
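Choosing the shard for a given key can be done with a simple hash-and-modulo scheme, sketched below. SurrealDB's actual shard routing may differ; this only demonstrates the general technique of spreading one logical partition over n shards.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assigns a record key to one of `n` shards of its logical partition
/// by hashing the key and taking the remainder.
fn shard_for(key: &str, n: u8) -> u8 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    (h.finish() % n as u64) as u8
}

fn main() {
    let n = 4;
    let mut counts = vec![0u32; n as usize];
    for i in 0..10_000 {
        let key = format!("/*myns*mydb*person*{i}");
        counts[shard_for(&key, n) as usize] += 1;
    }
    // Every shard receives a reasonable share of the load,
    // so no single partition absorbs all 10,000 writes.
    assert!(counts.iter().all(|&c| c > 1_000));
}
```

The trade-off is on the read side: a range scan now has to consult every shard, which is exactly why the scan function described later queries all shards concurrently.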

Our implementation of the new datastore requires only one AWS DynamoDB table with one Global Secondary Index (GSI). If you would like to create the DynamoDB table in your AWS account, you can use the following Terraform code.

resource "aws_dynamodb_table" "surrealdb" {
  name             = "${var.table_name}-${local.stage}"
  billing_mode     = "PAY_PER_REQUEST"
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"
  hash_key         = "pk"
  range_key        = "sk"

  attribute {
    name = "pk"
    type = "B"
  }

  attribute {
    name = "sk"
    type = "B"
  }

  attribute {
    name = "gsi1pk"
    type = "S"
  }

  attribute {
    name = "gsi1sk"
    type = "B"
  }

  global_secondary_index {
    name               = "GSI1"
    hash_key           = "gsi1pk"
    range_key          = "gsi1sk"
    projection_type    = "INCLUDE"
    non_key_attributes = ["pk"]
  }
}

Structured keys in SurrealDB split stored data into many groups. We do not create partitions for every group; instead, we focus on the bigger data sets. We would like to keep the implementation simple but still efficient, with a good distribution of data and application of the Single Table Design pattern. Our GSI creates partitions for the following data sets:

  • global[{shard}]://: contains all data that does not belong to any specific group, mostly metadata

  • ns[{shard}]://{ns}: namespace metadata

  • db[{shard}]://{ns}/{db}: contains data specific to a database in a namespace

  • scope[{shard}]://{ns}/{db}/{sc}: handles scope-specific metadata

  • table[{shard}]://{ns}/{db}/{tb}: contains table rows

  • graph[{shard}]://{ns}/{db}/{tb}/{id}: handles graph-specific data for an item in the database

  • index[{shard}]://{ns}/{db}/{tb}/{ix}: contains index-specific data

  • fulltext[{shard}]://{ns}/{db}/{tb}/{ix}: handles data used for full-text indexing
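Deriving the GSI partition key for, say, a table row can be sketched as follows. The string form mirrors the `table[{shard}]://{ns}/{db}/{tb}` format listed above; the actual attribute layout in the DynamoDB item may differ.

```rust
/// Computes the GSI partition key for a table row, following the
/// `table[{shard}]://{ns}/{db}/{tb}` format (illustrative string form).
fn table_partition_key(shard: u8, ns: &str, db: &str, tb: &str) -> String {
    format!("table[{shard}]://{ns}/{db}/{tb}")
}

fn main() {
    // With shards=2, rows of the same table land in one of two GSI partitions,
    // so a single heavily-used table no longer maps to a single hot partition.
    let p0 = table_partition_key(0, "myns", "mydb", "person");
    let p1 = table_partition_key(1, "myns", "mydb", "person");
    assert_eq!(p0, "table[0]://myns/mydb/person");
    assert_ne!(p0, p1);
}
```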

The datastore layer in SurrealDB is responsible for a single operation - initiating a new transaction for the underlying system. This is done through the transaction factory method. Once a transaction is started, the transactional object then exposes a number of functions that allow the manipulation of data within the context of that transaction:

  • closed: This function checks if a transaction has been closed.

  • cancel: This function cancels a running transaction. Our implementation ignores this function because DynamoDB does not support distributed transactions.

  • commit: This function commits a transaction, ensuring all changes are written to the database. Our implementation ignores this function because DynamoDB does not support distributed transactions.

  • exi: This function checks if a given key exists in the database.

  • get: This function retrieves the value associated with a given key from the database.

  • set: This function inserts or updates a key-value pair in the database.

  • put: This function inserts a key-value pair into the database, but only if the key doesn't already exist.

  • putc: Similar to put, this function also inserts a key-value pair only if the key doesn't exist. However, it also checks a condition before insertion.

  • del: This function deletes a given key from the database.

  • delc: Similar to del, this function deletes a key, but only after verifying a condition.

  • scan: This function retrieves a range of keys from the database.

The functions putc and delc leverage DynamoDB's conditional writes feature. This allows for the specification of a condition that the write operation must satisfy to be successful, ensuring data consistency and integrity, especially in environments with multiple users or distributed setups.

/// Insert a key if it doesn't exist in the database
pub async fn put<K, V>(&mut self, key: K, val: V) -> Result<(), Error>
where
    K: Into<Key>,
    V: Into<Val>,
{
    // Check to see if transaction is closed
    if self.ok {
        return Err(Error::TxFinished);
    }
    // Check to see if transaction is writable
    if !self.rw {
        return Err(Error::TxReadonly);
    }
    // Set the key if it does not exist
    let request =
        self.build_put_request(key, val).condition_expression("attribute_not_exists(pk)");
    request.send().await.map_err(|err| {
        let err = err.into_service_error();
        if let PutItemError::ConditionalCheckFailedException(_) = err {
            Error::Tx("KeyAlreadyExists".into())
        } else {
            Error::Ds(err.to_string())
        }
    })?;
    // Return result
    Ok(())
}

This implementation can be more efficient than the corresponding method in the TiKV datastore, where data needs to be read first, compared on the client side, and then written or rejected based on that comparison.

/// Insert a key if it doesn't exist in the database
pub async fn put<K, V>(&mut self, key: K, val: V) -> Result<(), Error>
where
    K: Into<Key>,
    V: Into<Val>,
{
    // Check to see if transaction is closed
    if self.ok {
        return Err(Error::TxFinished);
    }
    // Check to see if transaction is writable
    if !self.rw {
        return Err(Error::TxReadonly);
    }
    // Get the key
    let key = key.into();
    // Get the val
    let val = val.into();
    // Set the key if empty; key_exists reads data from the DB
    match self.tx.key_exists(key.clone()).await? {
        false => self.tx.put(key, val).await?,
        _ => return Err(Error::TxKeyAlreadyExists),
    };
    // Return result
    Ok(())
}

In combining AWS DynamoDB with SurrealDB, these processes are both optimized and simplified, resulting in time and resource savings.

Sharding implementation details

Shard distribution requires additional adaptations in the scan function. We have implemented an algorithm for the parallelization of these queries, so multiple queries can be executed at the same time, significantly improving the performance and efficiency of our data operations.

/// Retrieve a range of keys from the database
pub async fn scan<K>(&mut self, rng: Range<K>, limit: u32) -> Result<Vec<(Key, Val)>, Error>
where
    K: Into<Key>,
{
    // Check to see if transaction is closed
    if self.ok {
        return Err(Error::TxFinished);
    }
    let from = rng.start.into();
    let partition = Partition::new(&from);
    let to = rng.end.into();
    if to.cmp(&from) == Ordering::Less {
        return Ok(Vec::with_capacity(0));
    }
    // Scan the keys
    let from = AttributeValue::B(Blob::new(from.clone()));
    let to = AttributeValue::B(Blob::new(to));
    // Synchronize tasks with channels; every shard is scanned concurrently
    let (tx, mut rx) = tokio::sync::mpsc::channel::<Result<Vec<Key>, Error>>(10);
    for shard in 0u8..self.shards {
        let tx = tx.clone();
        let client = Arc::clone(&self.client);
        let table = Arc::clone(&self.table);
        let f = from.clone();
        let t = to.clone();
        let gsi1pk = partition.key(shard);
        // Spawn concurrent processing
        tokio::spawn(async move {
            let query = client
                .query()
                .table_name(table.as_ref())
                // The global secondary index contains only the key, to reduce index size
                .index_name("GSI1")
                .key_condition_expression("#gsi1pk = :gsi1pk and #gsi1sk between :from and :to")
                // a BETWEEN b AND c is true if a >= b and a <= c.
                // We don't want "or equal to c", hence the extra filter
                .filter_expression("#pk < :to")
                .expression_attribute_names("#gsi1pk", "gsi1pk")
                .expression_attribute_names("#gsi1sk", "gsi1sk")
                .expression_attribute_names("#pk", "pk")
                .expression_attribute_values(":gsi1pk", AttributeValue::S(gsi1pk))
                .expression_attribute_values(":from", f)
                .expression_attribute_values(":to", t)
                .limit(limit as i32);
            let keys = query
                .send()
                .await
                .map(|res| {
                    res.items.map_or(vec![], |items| {
                        items
                            .into_iter()
                            .map(|mut item| {
                                let key_att = item.remove("pk").expect("pk is defined in item");
                                match key_att {
                                    AttributeValue::B(blob) => blob.into_inner(),
                                    _ => unreachable!("key is not a blob"),
                                }
                            })
                            .collect::<Vec<_>>()
                    })
                })
                .map_err(|err| Error::Ds(err.to_string()));
            tx.send(keys).await.expect("Response from DynamoDB is processed");
        });
    }
    drop(tx);
    // Reduce all responses into one array with the IDs of the items to fetch
    let keys = {
        let mut keys = Vec::new();
        while let Some(response) = rx.recv().await {
            keys.extend(response?);
        }
        keys.sort();
        // Drop irrelevant keys
        keys.into_iter().take(limit as usize).collect::<Vec<_>>()
    };
    // Split into chunks to reduce the response size of the batch operation
    // and avoid hitting DynamoDB limits
    let chunks = keys.chunks(40).map(|keys| {
        keys.iter().fold(KeysAndAttributes::builder(), |acc, key| {
            let key = AttributeValue::B(Blob::new(key.as_slice()));
            acc.keys(HashMap::from([("pk".to_string(), key.clone()), ("sk".to_string(), key)]))
        })
    });
    let (tx, mut rx) = tokio::sync::mpsc::channel::<Result<Vec<(Key, Val)>, Error>>(40);
    // Fetch the full items for each chunk of keys
    for chunk in chunks {
        let tx = tx.clone();
        let client = Arc::clone(&self.client);
        let table = Arc::clone(&self.table);
        tokio::spawn(async move {
            let items = client
                .batch_get_item()
                .request_items(table.as_ref(), chunk.build())
                .send()
                .await
                .map(|res| {
                    res.responses
                        .map(|mut tables| {
                            tables.remove(table.as_ref()).expect("table is present in response")
                        })
                        .map_or(vec![], |items| {
                            items
                                .into_iter()
                                .map(|mut item| {
                                    let key_att =
                                        item.remove("pk").expect("pk is defined in item");
                                    let val_att =
                                        item.remove("value").expect("value is defined in item");
                                    let key = match key_att {
                                        AttributeValue::B(blob) => blob.into_inner(),
                                        _ => unreachable!("pk is not a blob"),
                                    };
                                    let value = match val_att {
                                        AttributeValue::B(blob) => blob.into_inner(),
                                        _ => unreachable!("value is not a blob"),
                                    };
                                    (key, value)
                                })
                                .collect::<Vec<_>>()
                        })
                })
                .map_err(|err| Error::Ds(err.to_string()));
            tx.send(items).await.expect("Response from DynamoDB is processed");
        });
    }
    drop(tx);
    let mut items = Vec::new();
    while let Some(response) = rx.recv().await {
        items.extend(response?);
    }
    items.sort();
    Ok(items)
}

In conclusion, the combination of SurrealDB's structured key design, the Single-Table Design principle, and our sharding and parallelization strategies provide a robust and scalable solution for managing large amounts of data on AWS DynamoDB.

2. Runtime

In our ongoing effort to make SurrealDB a more versatile and serverless database solution, we have introduced two options that provide greater flexibility and scalability for SurrealDB users.

To streamline the deployment process, we have created Terraform code that automates the setup of SurrealDB on AWS Lambda and AWS Fargate for containerization. This setup enables automatic scaling of the SurrealDB compute engine based on the workload, ensuring optimal performance and resource utilization. By using Terraform, users can easily provision and manage the infrastructure required to run SurrealDB as a serverless solution on AWS.

Option 1: AWS Lambda

We have also made modifications to the core SurrealDB code base to enable its usage as a library, rather than just a standalone binary. This modification allows us to expose the inner Warp server, which powers the compute engine of SurrealDB, as a library that can be integrated into other projects. This means that users can now leverage the power of SurrealDB's compute engine within their own applications. Highlighting SurrealDB's serverless attributes, we've created a plug-and-play AWS Lambda wrapper, embedding the Warp server within the Lambda execution environment.

use lambda_web::{run_hyper_on_lambda, LambdaError};
use std::env;
use surreal::{
    init_warp, ClientIp, CustomEnvFilter, EnvFilter, StartCommandArguments, StartCommandDbsOptions,
};

#[tokio::main]
async fn main() -> Result<(), LambdaError> {
    let table = env::var("TABLE").expect("Missing DynamoDB table name. $TABLE");
    let shards = env::var("SHARDS").unwrap_or("1".to_string());
    let strict = env::var("STRICT").map_or(false, |v| v.eq("true"));
    let username = env::var("USER").unwrap_or("root".to_string());
    let log = env::var("LOG_LVL").unwrap_or("info".to_string());
    let password = env::var("PASS").ok();
    let routes = init_warp(StartCommandArguments {
        path: format!("dynamodb://{}?shards={}", table, shards),
        username,
        password,
        allowed_networks: vec!["0.0.0.0/32".into()],
        client_ip: ClientIp::None,
        listen_addresses: vec!["0.0.0.0:80".into()],
        dbs: StartCommandDbsOptions {
            query_timeout: None,
        },
        key: None,
        kvs: None,
        web: None,
        strict,
        log: CustomEnvFilter(EnvFilter::builder().parse(format!(
            "error,surreal={log},surrealdb={log},surrealdb::txn=error",
            log = log
        ))?),
        no_banner: true,
    })
    .await
    .expect("SurrealDB is not working!");
    let warp_service = warp::service(routes);
    run_hyper_on_lambda(warp_service).await?;
    Ok(())
}

Option 2: A scalable binary on AWS Fargate

In addition to the library-based approach, we have also prepared a binary version of SurrealDB that is specifically designed to connect with AWS DynamoDB. To provide a more hands-on experience, we've demonstrated the deployment of this binary on AWS Fargate, showcasing its seamless integration and functionality within the AWS ecosystem. Recognizing the increasing prominence of ARM64 architecture in the cloud and edge computing environments, we have prepared pre-built Docker images tailored for ARM64.

Conclusion: Scalability, data storage, and flexibility

Essentially, our latest developments in SurrealDB affirm its commitment to delivering scalable, reliable, and high-performance database solutions. The integration of AWS DynamoDB as a supported datastore solution brings users a robust, highly scalable option for their data storage needs. In addition, the newly introduced serverless options with AWS Lambda and AWS Fargate provide users with unmatched flexibility and scalability.

Future plans - Introducing support for communication over WebSockets

In our first version of AWS Lambda deployment, we support communication only over HTTP(s). However, we are not stopping there. We have exciting plans to enhance the communication capabilities of SurrealDB by introducing support for communication over WebSockets. This extension will allow for real-time, bidirectional communication between clients and SurrealDB, enabling seamless integration with applications that require instant updates and interactive experiences.

To implement this WebSocket communication feature, we are leveraging the power of AWS API Gateway version 2 (API Gateway v2). AWS API Gateway v2 provides enhanced WebSocket support, allowing us to establish persistent connections between clients and the SurrealDB compute engine.
