PostgreSQL extension
The postgres extension allows you to attach PostgreSQL databases to Ladybug.
Usage
Please see Install an extension and Load an extension first before getting started.
Set up a PostgreSQL server via Docker
During local development, it’s convenient to set up a PostgreSQL server using Docker. Run the following command to start a PostgreSQL server locally:
docker run --rm --name lbug-postgres \ -e POSTGRES_PASSWORD=testpassword \ -p 5432:5432 \ postgres:latestExample PostgreSQL database
To illustrate the usage of the extension, we create a sample PostgreSQL database of university students. We will use asyncpg, an asynchronous PostgreSQL client library for Python, to create the database and insert some sample data via a Python script.
pip install asyncpgimport asyncioimport asyncpg
async def main(): conn = await asyncpg.connect('postgresql://postgres:testpassword@localhost:5432/postgres') # Create and insert data to a new table try: await conn.execute("CREATE TABLE IF NOT EXISTS person (name VARCHAR, age INTEGER);") await conn.execute("INSERT INTO person (name, age) VALUES ('Alice', 30)") await conn.execute("INSERT INTO person (name, age) VALUES ('Bob', 27)") await conn.execute("INSERT INTO person (name, age) VALUES ('Carol', 19)") await conn.execute("INSERT INTO person (name, age) VALUES ('Dan', 25)") except asyncpg.exceptions.DuplicateTableError: print(f"Table already exists, skipping creation and insertion...") # Check results print(await conn.fetch("SELECT * FROM person"))
asyncio.run(main())Attach a database
ATTACH <PG_CONNECTION_STRING> AS <alias> (dbtype postgres, skip_unsupported_table = <OPTION>, schema = <SCHEMA_NAME>)PG_CONNECTION_STRING: PostgreSQL connection string with the necessary parametersalias: Database alias to use in Ladybug. If not provided, the database name from PostgreSQL will be used. When attaching multiple databases, we recommend using aliases.skip_unsupported_table: A boolean flag to indicate whether Ladybug should exit with an error or skip tables with unsupported datatypes. Defaults toFalse.schema: The name of the schema in PostgreSQL to attach. Ladybug attaches to thepublicschema of PostgreSQL by default.
PG_CONNECTION_STRING accepts the following parameters:
dbname: Database namehost: Host IP address. Defaults tolocalhost.user: PostgreSQL username. Defaults topostgres.password: PostgreSQL password. Defaults to an empty string.port: Port number. Defaults to5432.
For example, the university PostgreSQL database can be attached to Ladybug as:
ATTACH 'dbname=university user=postgres host=localhost password=testpassword port=5432' AS uw (dbtype postgres);PostgreSQL to Ladybug type mapping
The following table shows the mapping from PostgreSQL’s data types to Ladybug’s data types:
| PostgreSQL Data Type | Corresponding Data Type in Ladybug |
|---|---|
bigint (int8) | INT64 |
bigserial (serial8) | INT64 |
bit [ (n) ] | STRING |
bit varying [ (n) ] (varbit [ (n) ]) | STRING |
boolean (bool) | BOOL |
box | DOUBLE[] |
bytea | BLOB |
character [ (n) ] (char [ (n) ]) | STRING |
character varying [ (n) ] (varchar [ (n)]) | STRING |
cidr | STRING |
circle | DOUBLE[] |
date | DATE |
double precision (float8) | DOUBLE |
inet | STRING |
integer (int, int4) | INT32 |
interval [ fields ] [ (p) ] | INTERVAL |
json | JSON |
line | DOUBLE[] |
lseg | DOUBLE[] |
macaddr | STRING |
macaddr8 | STRING |
money | STRING |
numeric [ (p, s) ] (decimal [ (p, s) ]) | DECIMAL |
path | DOUBLE[] |
pg_lsn | STRING |
pg_snapshot | STRING |
point | STRUCT(x DOUBLE, y DOUBLE) |
polygon | DOUBLE[] |
real (float4) | FLOAT |
smallint (int2) | INT16 |
smallserial (serial2) | INT16 |
serial (serial4) | INT32 |
text | STRING |
time [ (p) ] [ without time zone ] | Unsupported |
time [ (p) ] with time zone (timetz) | Unsupported |
timestamp [ (p) ] [ without time zone ] | TIMESTAMP |
timestamp [ (p) ] with time zone (timestamptz) | Unsupported |
tsquery | STRING |
tsvector | STRING |
txid_snapshot | STRING |
uuid | UUID |
xml | STRING |
Scan PostgreSQL tables with Cypher
You can use the LOAD FROM statement to scan the person table from the attached uw database.
LOAD FROM uw.personRETURN *┌────────┬───────┐│ name │ age ││ STRING │ INT64 │├────────┼───────┤│ Alice │ 30 ││ Bob │ 27 ││ Carol │ 19 ││ Dan │ 25 │└────────┴───────┘Scan PostgreSQL tables with SQL
You can use the SQL_QUERY function to execute arbitrary read-only SQL queries on any attached PostgreSQL database
and retrieve its results in Ladybug.
For example, the following query filters out all people who are younger than 20:
CALL SQL_QUERY('uw', 'SELECT * FROM person WHERE age >= 20')RETURN *;┌────────┬───────┐│ name │ age ││ STRING │ INT64 │├────────┼───────┤│ Alice │ 30 ││ Bob │ 27 ││ Dan │ 25 │└────────┴───────┘Using a database without an alias
You can set a default database name to avoid having to specify the full database name as a prefix in every query.
The above example can be simplified as:
ATTACH 'university' AS uw (dbtype postgres);USE uw;LOAD FROM personRETURN *;Copy data from PostgreSQL tables
You can use the COPY FROM statement to import data from a PostgreSQL table into Ladybug.
First, create a Person table in Ladybug, with the same schema as the one defined in PostgreSQL:
CREATE NODE TABLE Person (name STRING PRIMARY KEY, age INT32);When the schemas are the same, you can copy the data from the external DBMS table to the Ladybug table as:
COPY Person FROM uw.person;If the schemas are not the same, e.g., Person contains only name property while the external uw.person contains
name and age, you can still use COPY FROM but with a subquery as:
COPY Person FROM (LOAD FROM uw.person RETURN name);Alternatively, you can use the SQL_QUERY function:
COPY Person FROM SQL_QUERY('uw', 'SELECT name FROM person');You can verify that the data was successfully imported:
MATCH (p:Person) RETURN p.*;┌────────┬───────┐│ name │ age ││ STRING │ INT64 │├────────┼───────┤│ Alice │ 30 ││ Bob │ 27 ││ Carol │ 19 ││ Dan │ 25 │└────────┴───────┘Clear the schema cache
To avoid redundantly retrieving schema information from attached databases, Ladybug maintains a schema cache
including table names and their respective columns and types. Should modifications occur in the schema
via an alternate connection to attached RDBMSs, such as creation or deletion of tables, the cached
schema data may become obsolete. You can use the clear_attached_db_cache() function to refresh the cached
schema information in such cases.
CALL clear_attached_db_cache();Detach a database
To detach a database, use DETACH [ALIAS] as follows:
DETACH uw;