Track artifacts, in-memory objects & folders [S3 storage]¶
import lamindb as ln
import pytest
ln.setup.login("testuser1")
ln.setup.init(storage="s3://lamindb-ci/test-upload")
✓ logged in with email testuser1@lamin.ai (uid: DzTjkKse)
→ go to: https://lamin.ai/testuser1/test-upload
! updating cloud SQLite 's3://lamindb-ci/test-upload/9f1dbf16f2b45bee83ebc4ce719ac4e4.lndb' of instance 'testuser1/test-upload'
! locked instance (to unlock and push changes to the cloud SQLite file, call: lamin disconnect)
→ initialized lamindb: testuser1/test-upload
Local artifacts¶
Some test data.
pbmc68k = ln.core.datasets.anndata_pbmc68k_reduced()
Subset to a mini artifact to speed up the run time of this notebook:
pbmc68k = pbmc68k[:5, :5].copy()
pbmc68k
AnnData object with n_obs × n_vars = 5 × 5
    obs: 'cell_type', 'n_genes', 'percent_mito', 'louvain'
    var: 'n_counts', 'highly_variable'
    uns: 'louvain', 'louvain_colors', 'neighbors', 'pca'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    obsp: 'connectivities', 'distances'
Upload from memory using explicit semantic key¶
Upload h5ad¶
pbmc68k_h5ad = ln.Artifact.from_anndata(pbmc68k, key="test-upload/pbmc68k.h5ad")
! no run & transform got linked, call `ln.track()` & re-run
pbmc68k_h5ad.save()
Artifact(uid='maePVVs0xpC9puGF0000', is_latest=True, key='test-upload/pbmc68k.h5ad', suffix='.h5ad', kind='dataset', otype='AnnData', size=100800, hash='QqiU2ZhKKdXUMHcmKjn3sA', space_id=1, storage_id=1, created_by_id=1, created_at=2025-01-12 14:04:22 UTC)
pbmc68k_h5ad.delete(permanent=True)
Upload zarr¶
# Runs too long, should be tested elsewhere
# pbmc68k_zarr = ln.Artifact(pbmc68k, key="test-upload/pbmc68k.zarr", format="zarr")
# ln.save(pbmc68k_zarr)
# pbmc68k_zarr.delete(permanent=True, storage=True)
Upload using id with implicit key¶
Upload h5ad¶
pbmc68k_h5ad = ln.Artifact.from_anndata(pbmc68k, description="pbmc68k.h5ad")
! no run & transform got linked, call `ln.track()` & re-run
pbmc68k_h5ad.save()
Artifact(uid='eUW7xN73GiWh3IyI0000', is_latest=True, description='pbmc68k.h5ad', suffix='.h5ad', kind='dataset', otype='AnnData', size=100800, hash='QqiU2ZhKKdXUMHcmKjn3sA', space_id=1, storage_id=1, created_by_id=1, created_at=2025-01-12 14:04:23 UTC)
pbmc68k_h5ad.delete(permanent=True, storage=True)
Upload zarr¶
# Runs too long, should be tested elsewhere
# pbmc68k_zarr = ln.Artifact(pbmc68k, name="pbmc68k.zarr", format="zarr")
# ln.save(pbmc68k_zarr)
# pbmc68k_zarr.delete(permanent=True, storage=True)
Error behaviors¶
Specified file does not exist.
with pytest.raises(FileNotFoundError):
    non_existent_h5ad = ln.Artifact(
        "s3://lamindb-ci/test-upload/non_existent_file.h5ad"
    )
! no run & transform got linked, call `ln.track()` & re-run
The specified bucket does not exist. Normally a non-existent bucket raises FileNotFoundError, but it sometimes starts to raise PermissionError instead.
with pytest.raises((FileNotFoundError, PermissionError)):
    non_existent_h5ad = ln.Artifact(
        "s3://non_existent_bucket_6612366/non_existent_file.h5ad"
    )
! no run & transform got linked, call `ln.track()` & re-run
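Outside of a test, the same tolerance for two exception types can be written as a single `except` clause. The following is a minimal sketch, not lamindb code: `probe_storage` and the three stand-in backend functions are hypothetical names invented for illustration, and the bucket path is a placeholder. It only shows that catching a tuple of exceptions treats "not found" and "permission denied" as the same outcome.

```python
def probe_storage(open_fn, path):
    """Return a short status string instead of letting storage errors escape."""
    try:
        open_fn(path)
    except (FileNotFoundError, PermissionError) as err:
        # Both are OSError subclasses; either one means the path is unusable.
        return f"unavailable ({type(err).__name__})"
    return "ok"


# Hypothetical stand-ins for a storage backend (assumptions, not real APIs).
def missing_bucket_as_not_found(path):
    raise FileNotFoundError(path)


def missing_bucket_as_forbidden(path):
    raise PermissionError(path)


def existing_object(path):
    return path


backends = (missing_bucket_as_not_found, missing_bucket_as_forbidden, existing_object)
statuses = [probe_storage(fn, "s3://example-bucket/file.h5ad") for fn in backends]
print(statuses)
```

Any other exception type still propagates, so unrelated failures are not silently swallowed.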
Test existing zarr¶
See test_artifact.py for other artifact types.
This should probably go elsewhere:
# temporarily comment out because of head bucket permission error when
# attempting to get region
# artifact = ln.Artifact("s3://lamindb-ci/lndb-storage/pbmc68k.zarr")
# artifact.save()
# artifact.open()
ln.setup.delete("test-upload", force=True)
→ deleted storage record on hub f6bca7acd1615eeaafc3aafe3b8e85fb
→ deleted instance record on hub 9f1dbf16f2b45bee83ebc4ce719ac4e4