The ScalarDB Java API mainly consists of the Administrative API and the Transactional API. This guide briefly explains what kinds of APIs exist and how to use them.
This section explains how to execute administrative operations with the Administrative API in ScalarDB. You can execute administrative operations programmatically as follows, or you can execute them through Schema Loader.
To execute administrative operations, you first need to get a DistributedTransactionAdmin instance.
You can obtain the DistributedTransactionAdmin instance from TransactionFactory as follows:
TransactionFactory transactionFactory = TransactionFactory.create("<configuration file path>");
DistributedTransactionAdmin admin = transactionFactory.getTransactionAdmin();
Please see Getting Started for the details of the configuration file.
Once you have executed all administrative operations, you should close the DistributedTransactionAdmin instance as follows:
admin.close();
Before creating tables, namespaces must be created since a table belongs to one namespace. You can create a namespace as follows:
// Create a namespace "ns". It will throw an exception if the namespace already exists
admin.createNamespace("ns");
// Create a namespace only if it does not already exist
boolean ifNotExists = true;
admin.createNamespace("ns", ifNotExists);
// Create a namespace with options
Map<String, String> options = ...;
admin.createNamespace("ns", options);
In the creation operations (creating a namespace, creating a table, etc.), you can specify options as a map of option names to values (Map<String, String>).
With the options, you can set storage-adapter-specific configurations.
Currently, you can set the following options for the storage adapters:
For Cosmos DB:
name | value | default |
---|---|---|
ru | Base resource unit | 400 |
no-scaling | Disable auto-scaling for Cosmos DB | false |
For DynamoDB:
name | value | default |
---|---|---|
no-scaling | Disable auto-scaling for DynamoDB | false |
no-backup | Disable continuous backup for DynamoDB | false |
ru | Base resource unit | 10 |
For Cassandra:
name | value | default |
---|---|---|
replication-strategy | Cassandra replication strategy, must be SimpleStrategy or NetworkTopologyStrategy | SimpleStrategy |
compaction-strategy | Cassandra compaction strategy, must be LCS, STCS or TWCS | STCS |
replication-factor | Cassandra replication factor | 1 |
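The options in the tables above are passed as plain string pairs. The following is a minimal sketch of building such a map for Cassandra; the class name CassandraOptionsExample and the chosen values (NetworkTopologyStrategy, a replication factor of 3, LCS) are illustrative assumptions, not recommendations from this guide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, used only for illustration.
final class CassandraOptionsExample {

  // Builds an options map using the Cassandra option names listed in the table.
  // Every value is passed as a String, including numeric ones.
  static Map<String, String> cassandraOptions() {
    Map<String, String> options = new HashMap<>();
    options.put("replication-strategy", "NetworkTopologyStrategy"); // assumed value
    options.put("replication-factor", "3"); // assumed value
    options.put("compaction-strategy", "LCS"); // assumed value
    return options;
  }

  public static void main(String[] args) {
    // The resulting map can be passed wherever the creation operations accept options.
    System.out.println(cassandraOptions());
  }
}
```

Options that are left out of the map fall back to the defaults shown in the tables.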
Next, we will discuss table creation.
You first need to create the TableMetadata as follows:
// Define a table metadata
TableMetadata tableMetadata =
TableMetadata.newBuilder()
.addColumn("c1", DataType.INT)
.addColumn("c2", DataType.TEXT)
.addColumn("c3", DataType.BIGINT)
.addColumn("c4", DataType.FLOAT)
.addColumn("c5", DataType.DOUBLE)
.addPartitionKey("c1")
.addClusteringKey("c2", Scan.Ordering.Order.DESC)
.addClusteringKey("c3", Scan.Ordering.Order.ASC)
.addSecondaryIndex("c4")
.build();
Here you define columns, a partition key, a clustering key including clustering orders, and secondary indexes of a table.
Please see ScalarDB design document - Data Model for the details of the ScalarDB Data Model.
And then, you can create a table as follows:
// Create a table "ns.tbl". It will throw an exception if the table already exists
admin.createTable("ns", "tbl", tableMetadata);
// Create a table only if it does not already exist
boolean ifNotExists = true;
admin.createTable("ns", "tbl", tableMetadata, ifNotExists);
// Create a table with options
Map<String, String> options = ...;
admin.createTable("ns", "tbl", tableMetadata, options);
You can create a secondary index as follows:
// Create a secondary index on a column "c5" of a table "ns.tbl". It will throw an exception if the secondary index already exists
admin.createIndex("ns", "tbl", "c5");
// Create a secondary index only if it does not already exist
boolean ifNotExists = true;
admin.createIndex("ns", "tbl", "c5", ifNotExists);
// Create a secondary index with options
Map<String, String> options = ...;
admin.createIndex("ns", "tbl", "c5", options);
You can add a new non-partition key column to a table as follows:
// Add the new column "c6" of type INT to the table "ns.tbl"
admin.addNewColumnToTable("ns", "tbl", "c6", DataType.INT);
Execute this operation with careful consideration, because the execution time may vary greatly depending on the underlying storage. Please plan accordingly, especially if the database runs in production.
You can truncate a table as follows:
// Truncate a table "ns.tbl"
admin.truncateTable("ns", "tbl");
You can drop a secondary index as follows:
// Drop a secondary index on a column "c5" of a table "ns.tbl". It will throw an exception if the secondary index does not exist
admin.dropIndex("ns", "tbl", "c5");
// Drop a secondary index only if it exists
boolean ifExists = true;
admin.dropIndex("ns", "tbl", "c5", ifExists);
You can drop a table as follows:
// Drop a table "ns.tbl". It will throw an exception if the table does not exist
admin.dropTable("ns", "tbl");
// Drop a table only if it exists
boolean ifExists = true;
admin.dropTable("ns", "tbl", ifExists);
You can drop a namespace as follows:
// Drop a namespace "ns". It will throw an exception if the namespace does not exist
admin.dropNamespace("ns");
// Drop a namespace only if it exists
boolean ifExists = true;
admin.dropNamespace("ns", ifExists);
You can get the metadata of a table as follows:
// Get the table metadata of "ns.tbl"
TableMetadata tableMetadata = admin.getTableMetadata("ns", "tbl");
Depending on the transaction manager type, you need to create coordinator tables to execute transactions. The following items describe the operations for the coordinator table.
You can create coordinator tables as follows:
// Create coordinator tables
admin.createCoordinatorTables();
// Create coordinator tables only if they do not already exist
boolean ifNotExist = true;
admin.createCoordinatorTables(ifNotExist);
// Create coordinator tables with options
Map<String, String> options = ...;
admin.createCoordinatorTables(options);
You can truncate coordinator tables as follows:
// Truncate coordinator tables
admin.truncateCoordinatorTables();
You can drop coordinator tables as follows:
// Drop coordinator tables
admin.dropCoordinatorTables();
// Drop coordinator tables if they exist
boolean ifExist = true;
admin.dropCoordinatorTables(ifExist);
This section explains how to execute transactional operations with the Transactional API in ScalarDB.
You need to get a DistributedTransactionManager instance to execute transactional operations.
You can get it in the following way:
TransactionFactory transactionFactory = TransactionFactory.create("<configuration file path>");
DistributedTransactionManager transactionManager = transactionFactory.getTransactionManager();
Once you have executed all transactional operations, you should close the DistributedTransactionManager instance as follows:
transactionManager.close();
You need to begin/start a transaction before executing transactional CRUD operations. You can begin/start a transaction as follows:
// Begin a transaction
DistributedTransaction transaction = transactionManager.begin();
Or
// Start a transaction
DistributedTransaction transaction = transactionManager.start();
You can also begin/start a transaction by specifying a transaction ID as follows:
// Begin a transaction by specifying a transaction ID
DistributedTransaction transaction = transactionManager.begin("<transaction ID>");
Or
// Start a transaction by specifying a transaction ID
DistributedTransaction transaction = transactionManager.start("<transaction ID>");
Note that you must guarantee uniqueness of the transaction ID in this case.
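One possible way to obtain a unique transaction ID is a random UUID. The sketch below is an assumption on our part: this guide only requires uniqueness and does not prescribe a format, and the class name TransactionIdExample is hypothetical.

```java
import java.util.UUID;

// Hypothetical helper class, used only for illustration.
final class TransactionIdExample {

  // Generates a random (version 4) UUID string to use as a transaction ID.
  // Collisions are extremely unlikely, but the format itself is an assumption.
  static String newTransactionId() {
    return UUID.randomUUID().toString();
  }

  public static void main(String[] args) {
    // The generated string would be passed as the "<transaction ID>" argument.
    System.out.println(newTransactionId());
  }
}
```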
You can resume a transaction that you have already begun by specifying its transaction ID as follows:
// Resume a transaction
DistributedTransaction transaction = transactionManager.resume("<transaction ID>");
This is helpful in a stateful application where a transaction spans multiple client requests.
In that case, the application can begin a transaction in the first client request.
In the following client requests, it can resume the transaction with the resume() method.
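To resume a transaction in a later request, the application has to remember the transaction ID between requests. The following sketch shows one way to do that with an in-memory map keyed by a session ID. The class TransactionIdRegistry and the notion of a session ID are hypothetical and not part of ScalarDB; a real application might keep the ID in its own session store instead.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical application-side registry; not a ScalarDB class.
final class TransactionIdRegistry {
  private final Map<String, String> txIdBySession = new ConcurrentHashMap<>();

  // Called in the first request, right after the transaction has begun.
  void remember(String sessionId, String transactionId) {
    txIdBySession.put(sessionId, transactionId);
  }

  // Called in later requests to find the ID to pass to resume().
  Optional<String> lookup(String sessionId) {
    return Optional.ofNullable(txIdBySession.get(sessionId));
  }

  // Called once the transaction has been committed or rolled back.
  void forget(String sessionId) {
    txIdBySession.remove(sessionId);
  }

  public static void main(String[] args) {
    TransactionIdRegistry registry = new TransactionIdRegistry();
    registry.remember("session-1", "tx-1");
    System.out.println(registry.lookup("session-1").isPresent());
    registry.forget("session-1");
    System.out.println(registry.lookup("session-1").isPresent());
  }
}
```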
Most CRUD operations need to specify Key objects (partition-key, clustering-key, etc.).
So, before moving on to CRUD operations, the following explains how to construct a Key object.
For a single column key, you can use the Key.ofXXX() methods (XXX is a type name) to construct it as follows:
// for a key that consists of a single column of Int
Key key1 = Key.ofInt("col1", 1);
// for a key that consists of a single column of BigInt
Key key2 = Key.ofBigInt("col1", 100L);
// for a key that consists of a single column of Double
Key key3 = Key.ofDouble("col1", 1.3d);
// for a key that consists of a single column of Text
Key key4 = Key.ofText("col1", "value");
For a key that consists of 2 - 5 columns, you can use the Key.of() methods to construct it as follows:
// for a key that consists of 2 - 5 columns
Key key1 = Key.of("col1", 1, "col2", 100L);
Key key2 = Key.of("col1", 1, "col2", 100L, "col3", 1.3d);
Key key3 = Key.of("col1", 1, "col2", 100L, "col3", 1.3d, "col4", "value");
Key key4 = Key.of("col1", 1, "col2", 100L, "col3", 1.3d, "col4", "value", "col5", false);
Similar to ImmutableMap.of() in Guava, you need to specify column names and values alternately.
For a key that consists of more than 5 columns, you can use the builder to construct it as follows:
// for a key that consists of more than 5 columns
Key key = Key.newBuilder()
.addInt("col1", 1)
.addBigInt("col2", 100L)
.addDouble("col3", 1.3d)
.addText("col4", "value")
.addBoolean("col5", false)
.addInt("col6", 100)
.build();
Get is an operation to retrieve a single record specified by a primary key.
You need to create a Get object first, and then you can execute it with the transaction.get() method as follows:
// Create a Get operation
Key partitionKey = Key.ofInt("c1", 10);
Key clusteringKey = Key.of("c2", "aaa", "c3", 100L);
Get get =
Get.newBuilder()
.namespace("ns")
.table("tbl")
.partitionKey(partitionKey)
.clusteringKey(clusteringKey)
.projections("c1", "c2", "c3", "c4")
.build();
// Execute the Get operation
Optional<Result> result = transaction.get(get);
You can also specify projections to choose which columns are returned.
The Get operation and Scan operation return Result objects.
The following shows how to handle Result objects.
You can get a column value of a result with the getXXX("<column name>") methods (XXX is a type name) as follows:
// Get a Boolean value of a column
boolean booleanValue = result.getBoolean("<column name>");
// Get an Int value of a column
int intValue = result.getInt("<column name>");
// Get a BigInt value of a column
long bigIntValue = result.getBigInt("<column name>");
// Get a Float value of a column
float floatValue = result.getFloat("<column name>");
// Get a Double value of a column
double doubleValue = result.getDouble("<column name>");
// Get a Text value of a column
String textValue = result.getText("<column name>");
// Get a Blob value of a column (as a ByteBuffer)
ByteBuffer blobValue = result.getBlob("<column name>");
// Get a Blob value of a column as a byte array
byte[] blobValueAsBytes = result.getBlobAsBytes("<column name>");
If you need to check whether the value of a column is null, you can use the isNull("<column name>") method.
// Check if a value of a column is null
boolean isNull = result.isNull("<column name>");
Please also see the Javadoc of Result for more details.
You can also execute a Get operation with a secondary index.
Instead of specifying a partition key, you can specify an index key (specifying an indexed column) to use a secondary index as follows:
// Create a Get operation with a secondary index
Key indexKey = Key.ofFloat("c4", 1.23F);
Get get =
Get.newBuilder()
.namespace("ns")
.table("tbl")
.indexKey(indexKey)
.projections("c1", "c2", "c3", "c4")
.build();
// Execute the Get operation
Optional<Result> result = transaction.get(get);
Note that if the result has more than one record, transaction.get() throws an exception.
If you want to handle multiple results, use Scan with a secondary index.
Scan is an operation to retrieve multiple records within a partition.
You can specify clustering key boundaries and orderings for clustering key columns in Scan operations.
You need to create a Scan object first, and then you can execute it with the transaction.scan() method as follows:
// Create a Scan operation
Key partitionKey = Key.ofInt("c1", 10);
Key startClusteringKey = Key.of("c2", "aaa", "c3", 100L);
Key endClusteringKey = Key.of("c2", "aaa", "c3", 300L);
Scan scan =
Scan.newBuilder()
.namespace("ns")
.table("tbl")
.partitionKey(partitionKey)
.start(startClusteringKey)
.end(endClusteringKey)
.projections("c1", "c2", "c3", "c4")
.orderings(Scan.Ordering.desc("c2"), Scan.Ordering.asc("c3"))
.limit(10)
.build();
// Execute the Scan operation
List<Result> results = transaction.scan(scan);
You can omit the clustering key boundaries, or you can specify either a start boundary or an end boundary. If you don’t specify orderings, you get results ordered by the clustering order that you defined when creating the table.
Also, you can specify projections to choose which columns are returned, and a limit to specify the number of records to return in Scan operations.
You can also execute a Scan operation with a secondary index.
Instead of specifying a partition key, you can specify an index key (specifying an indexed column) to use a secondary index as follows:
// Create a Scan operation with a secondary index
Key indexKey = Key.ofFloat("c4", 1.23F);
Scan scan =
Scan.newBuilder()
.namespace("ns")
.table("tbl")
.indexKey(indexKey)
.projections("c1", "c2", "c3", "c4")
.limit(10)
.build();
// Execute the Scan operation
List<Result> results = transaction.scan(scan);
Note that you can’t specify clustering key boundaries and orderings in Scan with a secondary index.
You can also execute a Scan operation without specifying a partition key.
Instead of calling the partitionKey() method in the builder, you can call the all() method to scan a table without specifying a partition key as follows:
// Create a Scan operation without a partition key
Scan scan =
Scan.newBuilder()
.namespace("ns")
.table("tbl")
.all()
.projections("c1", "c2", "c3", "c4")
.limit(10)
.build();
// Execute the Scan operation
List<Result> results = transaction.scan(scan);
Note that you can’t specify clustering key boundaries and orderings in Scan without a partition key.
Put is an operation to put a record specified by a primary key.
It behaves as an upsert operation for a record, i.e., updating the record if the record exists; otherwise, inserting the record.
Note that when you update an existing record, you need to read it using a Get or a Scan before a Put operation.
You need to create a Put object first, and then you can execute it with the transaction.put() method as follows:
// Create a Put operation
Key partitionKey = Key.ofInt("c1", 10);
Key clusteringKey = Key.of("c2", "aaa", "c3", 100L);
Put put =
Put.newBuilder()
.namespace("ns")
.table("tbl")
.partitionKey(partitionKey)
.clusteringKey(clusteringKey)
.floatValue("c4", 1.23F)
.doubleValue("c5", 4.56)
.build();
// Execute the Put operation
transaction.put(put);
You can also put a record with null values as follows:
Put put =
Put.newBuilder()
.namespace("ns")
.table("tbl")
.partitionKey(partitionKey)
.clusteringKey(clusteringKey)
.floatValue("c4", null)
.doubleValue("c5", null)
.build();
Delete is an operation to delete a record specified by a primary key.
Note that when you delete a record, you need to read it using a Get or a Scan before a Delete operation.
You need to create a Delete object first, and then you can execute it with the transaction.delete() method as follows:
// Create a Delete operation
Key partitionKey = Key.ofInt("c1", 10);
Key clusteringKey = Key.of("c2", "aaa", "c3", 100L);
Delete delete =
Delete.newBuilder()
.namespace("ns")
.table("tbl")
.partitionKey(partitionKey)
.clusteringKey(clusteringKey)
.build();
// Execute the Delete operation
transaction.delete(delete);
Mutate is an operation to execute multiple mutations (Put and Delete operations).
You need to create mutation objects first, and then you can execute them with the transaction.mutate() method as follows:
// Create Put and Delete operations
Key partitionKey = Key.ofInt("c1", 10);
Key clusteringKeyForPut = Key.of("c2", "aaa", "c3", 100L);
Put put =
Put.newBuilder()
.namespace("ns")
.table("tbl")
.partitionKey(partitionKey)
.clusteringKey(clusteringKeyForPut)
.floatValue("c4", 1.23F)
.doubleValue("c5", 4.56)
.build();
Key clusteringKeyForDelete = Key.of("c2", "bbb", "c3", 200L);
Delete delete =
Delete.newBuilder()
.namespace("ns")
.table("tbl")
.partitionKey(partitionKey)
.clusteringKey(clusteringKeyForDelete)
.build();
// Execute the operations
transaction.mutate(Arrays.asList(put, delete));
The operation builders provide consistency() methods, but the specified consistency is ignored, and the LINEARIZABLE consistency level is always used in transactions.
The Put and Delete builders also provide condition() methods, but the specified condition is ignored, too. Please program such conditions in a transaction if you want to implement conditional mutation.
After executing CRUD operations, you need to commit a transaction to finish it.
You can commit a transaction as follows:
// Commit a transaction
transaction.commit();
If you decide not to commit a transaction, or if an error happens during its execution, you can roll back/abort the transaction.
You can roll back/abort a transaction as follows:
// Rollback a transaction
transaction.rollback();
Or
// Abort a transaction
transaction.abort();
Please see Handle Exceptions for the details of how to handle exceptions in ScalarDB.
Handling exceptions correctly in ScalarDB is very important. If you mishandle exceptions, your data could become inconsistent. This document explains how to handle exceptions properly in ScalarDB.
Let’s look at the following example code to see how to handle exceptions in ScalarDB.
public class Sample {
  public static void main(String[] args) throws IOException, InterruptedException {
    TransactionFactory factory = TransactionFactory.create("<configuration file path>");
    DistributedTransactionManager transactionManager = factory.getTransactionManager();
    // Number of attempts made so far
    int retryCount = 0;
    while (true) {
      if (retryCount > 0) {
        // Retry the transaction three times maximum in this sample code
        if (retryCount > 3) {
          return;
        }
        // Sleep 100 milliseconds before retrying the transaction in this sample code
        TimeUnit.MILLISECONDS.sleep(100);
      }
      retryCount++;
      // Begin a transaction
      DistributedTransaction tx;
      try {
        tx = transactionManager.begin();
      } catch (TransactionException e) {
        // If beginning a transaction fails, it indicates that some failure happened, so you
        // should cancel the transaction or retry the transaction after the failure/error is
        // fixed
        return;
      }
      try {
        // Execute CRUD operations in the transaction
        Optional<Result> result = tx.get(...);
        List<Result> results = tx.scan(...);
        tx.put(...);
        tx.delete(...);
        // Commit the transaction
        tx.commit();
        // The transaction has been committed, so leave the retry loop
        return;
      } catch (CrudConflictException | CommitConflictException e) {
        // If you catch CrudConflictException or CommitConflictException, it indicates that a
        // transaction conflict occurred during the transaction, so you can retry the
        // transaction from the beginning
        try {
          tx.rollback();
        } catch (RollbackException ex) {
          // Rolling back the transaction failed. You can log it here
        }
      } catch (CrudException | CommitException e) {
        // If you catch CrudException or CommitException, it indicates that some failure
        // happened, so you should cancel the transaction or retry the transaction after the
        // failure/error is fixed
        try {
          tx.rollback();
        } catch (RollbackException ex) {
          // Rolling back the transaction failed. You can log it here
        }
        return;
      } catch (UnknownTransactionStatusException e) {
        // If you catch `UnknownTransactionStatusException` when committing the transaction,
        // you cannot be sure whether the transaction succeeded. In such a case, you need to
        // check whether the transaction was committed successfully and retry it if it failed.
        // How to identify a transaction status is delegated to users
        return;
      }
    }
  }
}
The APIs for CRUD operations (get()/scan()/put()/delete()/mutate()) could throw CrudException and CrudConflictException.
If you catch CrudException, it indicates that some failure (e.g., a database failure or a network error) happened during the transaction, so you should cancel the transaction or retry the transaction after the failure/error is fixed.
If you catch CrudConflictException, it indicates that a transaction conflict occurred during the transaction, so you can retry the transaction from the beginning, preferably with well-adjusted exponential backoff based on your application and environment.
The sample code retries three times maximum and sleeps 100 milliseconds before retrying the transaction.
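The sample code uses a fixed 100-millisecond sleep. As a rough sketch of the exponential backoff mentioned above, the following helper doubles the delay on every retry, caps it, and sleeps for a random time up to that delay. Everything here is an illustrative assumption: BackoffRetryExample, ConflictException (a stand-in for the conflict exceptions such as CrudConflictException), and the base and cap values are not part of ScalarDB. The body passed to runWithRetry is expected to cover one whole transaction attempt, including beginning it and rolling it back on failure.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper class, used only for illustration.
final class BackoffRetryExample {

  // Stand-in for a retryable transaction conflict; not a ScalarDB class.
  static final class ConflictException extends Exception {
    ConflictException(String message) {
      super(message);
    }
  }

  // Upper bound of the delay before the given retry (1-based):
  // baseMillis * 2^(retry - 1), capped at capMillis.
  static long backoffMillis(int retry, long baseMillis, long capMillis) {
    int shift = Math.max(0, Math.min(retry - 1, 20)); // bound the shift to avoid overflow
    return Math.min(baseMillis << shift, capMillis);
  }

  // Runs the body and retries it up to maxRetries times when a conflict is reported.
  static <T> T runWithRetry(Callable<T> body, int maxRetries, long baseMillis, long capMillis)
      throws Exception {
    for (int retry = 0; ; retry++) {
      try {
        return body.call();
      } catch (ConflictException e) {
        if (retry >= maxRetries) {
          throw e; // give up after the configured number of retries
        }
        long bound = backoffMillis(retry + 1, baseMillis, capMillis);
        // Sleep a random time in [0, bound] so that competing clients spread out
        Thread.sleep(ThreadLocalRandom.current().nextLong(bound + 1));
      }
    }
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulate a transaction that conflicts twice before it succeeds
    String outcome =
        runWithRetry(
            () -> {
              if (++calls[0] < 3) {
                throw new ConflictException("simulated conflict");
              }
              return "committed";
            },
            3,
            10,
            100);
    System.out.println(outcome + " after " + calls[0] + " attempts");
  }
}
```

Suitable base and cap values depend on your application and environment, as noted above.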
Also, the commit() API could throw CommitException, CommitConflictException, and UnknownTransactionStatusException.
If you catch CommitException, like the CrudException case, you should cancel the transaction or retry the transaction after the failure/error is fixed.
If you catch CommitConflictException, like the CrudConflictException case, you can retry the transaction from the beginning.
If you catch UnknownTransactionStatusException, you cannot be sure whether the transaction succeeded.
In such a case, you need to check whether the transaction was committed successfully and retry it if it failed.
How to identify a transaction status is delegated to users.
You may want to create a transaction status table and update it transactionally with other application data so that you can get the status of a transaction from the status table.
Please see Two-phase Commit Transactions.
This configuration is intended only for troubleshooting Consensus Commit transactions. By adding the following configuration, the results of Get and Scan operations will contain transaction metadata.
To see the details of the transaction metadata columns for a given table, you can use the DistributedTransactionAdmin.getTableMetadata() method, which returns the table metadata augmented with the transaction metadata columns.
This configuration can be useful for investigating transaction-related issues.
# By default, it is set to "false".
scalar.db.consensus_commit.include_metadata.enabled=true