Track notebooks & scripts¶

This guide explains how to use track() & finish() to track notebooks & scripts along with their inputs and outputs. For tracking pipelines, see: Pipelines – workflow managers.

# !pip install 'lamindb[jupyter]'
!lamin init --storage ./test-track

Track a notebook or script¶

Call track() to register your notebook or script as a transform and start tracking inputs & outputs of a run.

import lamindb as ln

ln.track()  # initiate a tracked notebook/script run

# your code

ln.finish()  # mark run as finished, save execution report, source code & environment

Load a notebook or script¶

On the hub, search or filter the transform page, then load a script or notebook via the CLI. For example:

lamin load https://lamin.ai/laminlabs/lamindata/transform/13VINnFk89PE

Query a notebook or script¶

You find your notebooks and scripts in the Transform registry (along with pipelines & functions); the Run registry stores their executions. You can use all the usual ways of querying to obtain one or multiple transform records, e.g.:

transform = ln.Transform.get(key="my_analyses/my_notebook.ipynb")
transform.source_code  # source code
transform.latest_run.report  # report of latest run
transform.latest_run.environment  # environment of latest run
transform.runs  # all runs

Sync scripts with git¶

To sync a script with a commit in your git repository, add the following line to the script:

ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>

synced-with-git.py¶

import lamindb as ln

ln.settings.sync_git_repo = "https://github.com/..."
ln.track()
# your code
ln.finish()

Track parameters¶

In addition to tracking source code, run reports & environments, you can easily track run parameters.

Track run parameters¶

Before tracking parameter values, you need to define valid parameters, e.g.:

import lamindb as ln

ln.Param(name="input_dir", dtype="str").save()
ln.Param(name="learning_rate", dtype="float").save()
ln.Param(name="preprocess_params", dtype="dict").save()

If you run the script below without having defined those parameters, you’ll get a ValidationError from which you can copy & paste the definitions.
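The check described above can be pictured with a small plain-Python sketch. This is not lamindb's implementation: `infer_dtype`, `check_params`, the use of `ValueError`, and the message format are all hypothetical and only illustrate how an error message can carry ready-to-paste definitions for undefined top-level parameters.

```python
def infer_dtype(value: object) -> str:
    """Map a Python value to one of the dtype strings used in the definitions above."""
    # Hypothetical helper: only the three dtypes shown in this guide are handled.
    for py_type, name in ((float, "float"), (str, "str"), (dict, "dict")):
        if isinstance(value, py_type):
            return name
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")


def check_params(params: dict, defined: dict) -> None:
    """Raise ValueError listing a definition for every undefined top-level parameter."""
    # Only top-level names are checked; keys nested inside dictionaries are ignored.
    missing = [name for name in params if name not in defined]
    if missing:
        hints = "\n".join(
            f'ln.Param(name="{name}", dtype="{infer_dtype(params[name])}").save()'
            for name in missing
        )
        raise ValueError(f"These parameters are not defined:\n{hints}")


defined = {"input_dir": "str"}  # pretend only one parameter was defined
try:
    check_params({"input_dir": "./mydataset", "learning_rate": 0.01}, defined)
except ValueError as err:
    print(err)
```

The printed message names only `learning_rate`, because `input_dir` is already in the hypothetical `defined` mapping. The real ValidationError text may look different.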

run-track-with-params.py¶

import argparse
import lamindb as ln

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args()
    params = {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,  # nested parameter names & values in dictionaries are not validated
            "normalization": "the_good_one",
        },
    }
    ln.track(params=params)

    # your code

    ln.finish()

Run the script.

!python scripts/run-track-with-params.py  --input-dir ./mydataset --learning-rate 0.01 --downsample
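To inspect the dictionary that the command above hands to ln.track(params=...) without needing a LaminDB instance, the argument parsing can be factored into a function. The sketch below reuses the parser from the script; the function name `build_params` is an assumption introduced here for illustration.

```python
import argparse


def build_params(argv: list) -> dict:
    """Parse CLI arguments and assemble the dictionary passed to ln.track(params=...)."""
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args(argv)
    return {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,  # nested keys are not validated
            "normalization": "the_good_one",
        },
    }


# Same arguments as in the command above
params = build_params(
    ["--input-dir", "./mydataset", "--learning-rate", "0.01", "--downsample"]
)
print(params)
```

In a script, `build_params(sys.argv[1:])` would produce the same dictionary from the real command line, which could then be passed to ln.track(params=params).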

Query by run parameters¶

Query for all runs that match certain parameters:

ln.Run.params.filter(
    learning_rate=0.01, input_dir="./mydataset", preprocess_params__downsample=True
).df()

	uid	name	started_at	finished_at	reference	reference_type	_is_consecutive	_status_code	space_id	transform_id	report_id	_logfile_id	environment_id	initiated_by_run_id	created_at	created_by_id	aux	_branch_code
id
1	caQFc6b72ztb1CQko3Fv	None	2025-01-12 14:02:32.011754+00:00	2025-01-12 14:02:33.586941+00:00	None	None	None	0	1	1	2	None	1	None	2025-01-12 14:02:32.011809+00:00	1	None	1

Note that:

preprocess_params__downsample=True traverses the dictionary preprocess_params to find the key "downsample" and match it to True
nested keys like "downsample" in a dictionary do not appear in Param and hence are not validated
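The double-underscore traversal noted above can be illustrated in plain Python. This is a conceptual sketch only, not how lamindb executes the query (which runs against the database); the helpers `lookup` and `matches` are hypothetical names.

```python
from typing import Any


def lookup(params: dict, path: str) -> Any:
    """Resolve a double-underscore path such as 'a__b' against nested dictionaries."""
    value: Any = params
    for key in path.split("__"):
        if not isinstance(value, dict) or key not in value:
            raise KeyError(path)
        value = value[key]
    return value


def matches(params: dict, **conditions: Any) -> bool:
    """Return True if every condition equals the value found at its path."""
    for path, expected in conditions.items():
        try:
            if lookup(params, path) != expected:
                return False
        except KeyError:
            return False
    return True


run_params = {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}

print(matches(run_params, learning_rate=0.01, preprocess_params__downsample=True))  # True
print(matches(run_params, preprocess_params__downsample=False))  # False
```

The key `preprocess_params__downsample` is split at `__`, so the first part selects the dictionary and the second part selects the key inside it.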

Access parameters of a run¶

Below is how you get the parameter values that were used for a given run.

run = ln.Run.params.filter(learning_rate=0.01).order_by("-started_at").first()
run.params.get_values()

Explore all parameter values¶

If you want to query all parameter values across all runs, use ParamValue.

ln.core.ParamValue.df(include=["param__name", "created_by__handle"])

	value	hash	space_id	param__name	created_by__handle
id
1	./mydataset	None	1	input_dir	testuser1
2	0.01	None	1	learning_rate	testuser1
3	{'downsample': True, 'normalization': 'the_goo...	None	1	preprocess_params	testuser1

Manage notebook templates¶

A notebook acts like a template when you load it with lamin load. Suppose you run:

lamin load https://lamin.ai/account/instance/transform/Akd7gx7Y9oVO0000

When you run the returned notebook, you’ll automatically create a new version, which you can browse via the version dropdown in the UI.

Additionally, you can:

label using ULabel, e.g., transform.ulabels.add(template_label)
tag with an indicative version string, e.g., transform.version = "T1"; transform.save()