Import NumPy
The .npy format is the standard binary file format in NumPy for persisting a
single arbitrary NumPy array on disk.
The primary use case for bulk loading .npy files is importing large node features or vectors
that are already stored in this format. You can use the COPY FROM statement
to import a set of *.npy files into a node table.
Import to node table
Consider a Paper table with an id column, a feature column that is an embedding (vector) with 768 dimensions,
a year column and a label column as ground truth. We first define the schema with the following statement:
CREATE NODE TABLE Paper(id INT64 PRIMARY KEY, feat FLOAT[768], year INT64, label DOUBLE);
The raw data is stored in .npy format where each column is represented as a NumPy array on disk. The files are
specified below:
"node_id.npy", "node_feat_f32.npy", "node_year.npy", "node_label.npy"
We can copy the files with the following statement:
COPY Paper FROM ("node_id.npy", "node_feat_f32.npy", "node_year.npy", "node_label.npy") BY COLUMN;
Note that the number of *.npy files must equal the number of columns in the table, and the files must be
specified in the same order as the columns are defined in the DDL.
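To make the expected file layout concrete, here is a minimal sketch of how four such files might be produced with NumPy. The row count, the random values, and the temporary output directory are hypothetical and chosen only for illustration. The dtype choices (int64 for INT64, float32 for FLOAT, float64 for DOUBLE) are an assumption about how the column types line up with NumPy dtypes, not something this page states.

```python
import os
import tempfile

import numpy as np

num_nodes = 4   # hypothetical row count, kept tiny for illustration
feat_dim = 768  # matches FLOAT[768] in the Paper schema

rng = np.random.default_rng(seed=0)

# One array per column, in the same order as the DDL: id, feat, year, label.
columns = {
    "node_id.npy": np.arange(num_nodes, dtype=np.int64),
    "node_feat_f32.npy": rng.random((num_nodes, feat_dim), dtype=np.float32),
    "node_year.npy": np.full(num_nodes, 2020, dtype=np.int64),
    "node_label.npy": rng.random(num_nodes),  # float64 by default
}

# Write each column to its own .npy file in a scratch directory.
out_dir = tempfile.mkdtemp()
for filename, array in columns.items():
    np.save(os.path.join(out_dir, filename), array)

# Sanity check before importing: every file should hold the same number of rows.
row_counts = {
    name: np.load(os.path.join(out_dir, name)).shape[0] for name in columns
}
assert len(set(row_counts.values())) == 1, row_counts
```

The first axis of each array is the row (node) axis, so the feature file has shape (num_nodes, 768) while the other three are one-dimensional.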
Ignore erroneous rows
See the Ignore erroneous rows section for more details.