SQLite
SQLite is a "pocket database": a full-featured relational database that dispenses with the client/server architecture of most such databases and instead stores the entire database in a single file on disk. Software written in a wide variety of common languages can read from and write to this file with standard SQL queries, using the language's standard API for database interactions, assuming one exists. The database can be transferred to another computer simply by copying the file there.
Like other databases, SQLite should not be used on a shared filesystem such as home, scratch and project, particularly when the database is being written to. Typically, you should copy your SQLite file to the local scratch directory $SLURM_TMPDIR at the beginning of a job, where you can use the database without any issues and also enjoy the best possible performance. If the job modifies the database, copy the file back to shared storage before the job ends. Note as well that SQLite is not intended for use with multiple threads or processes writing concurrently to the database; for this you should consider a client/server solution.
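The copy to local scratch described above can be sketched in Python as follows. This is a minimal illustration, not a prescribed workflow: the helper names stage_to_local_scratch and copy_back are hypothetical, and the fallback to the system temporary directory when $SLURM_TMPDIR is not set is an assumption made only so that the sketch also runs outside a job.

```python
import os
import shutil
import tempfile
from pathlib import Path


def stage_to_local_scratch(shared_db):
    """Copy a database file from shared storage to node-local scratch.

    Returns the path of the local copy.
    """
    # SLURM_TMPDIR is set inside a job; falling back to the system
    # temporary directory is an assumption made for this sketch.
    local_dir = Path(os.environ.get("SLURM_TMPDIR", tempfile.gettempdir()))
    local_db = local_dir / Path(shared_db).name
    shutil.copy2(shared_db, local_db)
    return local_db


def copy_back(local_db, shared_db):
    """Copy the local database back to shared storage.

    Call this only after every connection to local_db has been closed.
    """
    shutil.copy2(local_db, shared_db)
```

A job script would call stage_to_local_scratch once at the start, open the returned path with its usual database API, and call copy_back once at the end if the data changed.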
Using SQLite directly
You can access an SQLite database directly using the native client:
[name@server ~]$ sqlite3 foo.sqlite
If the file foo.sqlite does not already exist, SQLite will create it and the client starts in an empty database; otherwise you are connected to the existing database. You may then execute whichever queries you wish on the database, such as SELECT * FROM tablename; to print the entire contents of the table tablename to the screen.
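When a database needs to be inspected from a script rather than from the interactive client, the same kind of query can be issued through Python's built-in sqlite3 module. The sketch below is only an illustration: the helper names list_tables and dump_table are hypothetical, and the demonstration uses an in-memory database with made-up contents.

```python
import sqlite3


def list_tables(conn):
    """Return the names of the tables recorded in the database schema."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def dump_table(conn, table):
    """Return every row of a table, like SELECT * FROM tablename; in the client."""
    # Table names cannot be bound as query parameters, so the name is
    # checked against the schema before it is placed in the SQL text.
    if table not in list_tables(conn):
        raise ValueError(f"no such table: {table}")
    quoted = '"' + table.replace('"', '""') + '"'
    return conn.execute(f"SELECT * FROM {quoted}").fetchall()


if __name__ == "__main__":
    # Made-up data in an in-memory database, for demonstration only
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename(id INTEGER, label TEXT)")
    conn.execute("INSERT INTO tablename VALUES (1, 'first')")
    print(list_tables(conn))
    print(dump_table(conn, "tablename"))
    conn.close()
```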
Accessing SQLite from software
The most common way of interacting with an SQLite (or other) database is programmatically, i.e. from inside a program written in a language such as R, C++ or Python. The program uses a series of function calls to open a connection to the database, execute queries that read, update or insert data, and finally close the connection so that the changes (if any) are flushed to the SQLite file. In the simple examples below, we suppose that the database has already been created with a table called employee that has two columns: name, a string, and age, an integer.
#!/usr/bin/env python3
# For Python we can use the module sqlite3, which is part of the standard library,
# to access an SQLite database
import sqlite3
age = 34
# Connect to the database...
dbase = sqlite3.connect("foo.sqlite")
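# A parameterized query: the ? placeholders are filled in from the tuple
dbase.execute("INSERT INTO employee(name,age) VALUES(?,?);", ("John Smith", age))
# Commit the transaction, otherwise the insert is discarded when we close
dbase.commit()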
# Close the database connection
dbase.close()
# Using R, the first step is to install the RSQLite package in your R environment,
# after which you can use code like the following to interact with the SQLite database
library(DBI)
age <- 34
# Connect to the database...
dbase <- dbConnect(RSQLite::SQLite(),"foo.sqlite")
# A parameterized query: the ? placeholders are filled in from params
dbExecute(dbase,"INSERT INTO employee(name,age) VALUES(?,?);",params=list("John Smith",age))
# Close the database connection
dbDisconnect(dbase)
#include <iostream>
#include <string>
#include <sqlite3.h>
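int main(int argc,char** argv)
{
  int age = 34;
  std::string query;
  sqlite3* dbase;

  // Open the database and stop if this fails
  if (sqlite3_open("foo.sqlite",&dbase) != SQLITE_OK) {
    std::cerr << "Cannot open database: " << sqlite3_errmsg(dbase) << std::endl;
    sqlite3_close(dbase);
    return 1;
  }

  // SQL string literals are written with single quotes
  query = "INSERT INTO employee(name,age) VALUES('John Smith'," + std::to_string(age) + ");";
  if (sqlite3_exec(dbase,query.c_str(),nullptr,nullptr,nullptr) != SQLITE_OK) {
    std::cerr << "Query failed: " << sqlite3_errmsg(dbase) << std::endl;
  }

  // Close the database connection
  sqlite3_close(dbase);
  return 0;
}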
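The examples above assume that the employee table already exists. As a rough sketch of the full cycle in Python, the code below creates the table if needed, inserts several rows with bound parameters and reads them back. The function names and the second employee are hypothetical and serve only as illustration.

```python
import sqlite3


def create_employee_table(conn):
    """Create the employee table used in the examples, if it is missing."""
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS employee(name TEXT, age INTEGER)"
        )


def add_employees(conn, people):
    """Insert (name, age) pairs using bound parameters."""
    # Used as a context manager, the connection commits the transaction
    # on success and rolls it back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT INTO employee(name, age) VALUES (?, ?)", people
        )


def employees_older_than(conn, min_age):
    """Return (name, age) rows for employees older than min_age."""
    cursor = conn.execute(
        "SELECT name, age FROM employee WHERE age > ? ORDER BY name",
        (min_age,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # An in-memory database keeps the demonstration self-contained
    conn = sqlite3.connect(":memory:")
    create_employee_table(conn)
    add_employees(conn, [("John Smith", 34), ("Jane Doe", 41)])
    print(employees_older_than(conn, 40))
    conn.close()
```

Binding values with ? placeholders, rather than pasting them into the SQL string, avoids quoting mistakes and SQL injection when the values come from user input.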
Caveats
As the name suggests, SQLite is easy to use and intended for relatively simple databases that are neither excessively large (hundreds of gigabytes or more) nor too complicated in terms of their entity-relationship diagram. As your SQLite database grows in size and complexity, performance could start to degrade, in which case the time may have come to consider more sophisticated database software that uses a client/server model. The SQLite web site includes an excellent page on Appropriate Uses For SQLite, including a checklist for choosing between SQLite and client/server databases.
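To gauge how large a database has become, one option is to ask SQLite for its page count and page size, whose product is the size of the main database in bytes. The helper name database_size_bytes below is hypothetical; this is a small sketch rather than a recommended monitoring tool.

```python
import sqlite3


def database_size_bytes(conn):
    """Return the size of the main database in bytes, as reported by SQLite."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    return page_count * page_size


if __name__ == "__main__":
    # Made-up table in an in-memory database, for demonstration only
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee(name TEXT, age INTEGER)")
    print(database_size_bytes(conn), "bytes")
    conn.close()
```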