Multi-GPU Nodes
Run your experiments on multiple GPUs with a single configuration change.
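As an illustration, a multi-GPU run might be requested with a single resource setting in the run configuration. The keys and values below are hypothetical, not taken from the Turbine documentation:

```yaml
# Hypothetical run configuration -- key names are illustrative only.
resources:
  gpus: 4                      # request four GPUs on one node
image: my-training-image:latest
command: python train.py
```

Changing `gpus: 1` to `gpus: 4` would be the kind of single-line change this refers to; the rest of the training code stays the same.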
Distributed Training
Easily scale workloads across multiple nodes to train bigger models or get results faster. MPI frameworks such as Horovod are supported out of the box.
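For context, Horovod jobs are conventionally launched with its `horovodrun` CLI; the host names and slot counts below are placeholders:

```shell
# Launch a data-parallel job across two nodes with two GPU slots each
# (node1/node2 are placeholder host names).
horovodrun -np 4 -H node1:2,node2:2 python train.py
```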
Simple Metric Logging
Easily log metrics from your training run by writing to the console or log files.
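A minimal sketch of console-based metric logging using Python's standard `logging` module. The `metric name=value step=N` line format is an assumption for illustration, not a format Turbine is documented to require:

```python
import logging

# Metrics written to the console as plain "name=value" lines can be
# picked up by any tool that tails the run's log output.
logging.basicConfig(format="%(message)s", level=logging.INFO)

def format_metric(name, value, step):
    """Render one metric observation as a single parseable log line."""
    return f"metric {name}={value} step={step}"

def log_metric(name, value, step):
    """Emit a metric line to the console / log file."""
    logging.info(format_metric(name, value, step))

for step in range(3):
    loss = 1.0 / (step + 1)          # stand-in for a real training loss
    log_metric("loss", round(loss, 4), step)
```

Writing one metric per line keeps the output trivial to parse later, whether it lands on the console or in a log file.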
Visualize Training
Get detailed insights into your model's training by visualizing and comparing metrics across runs.
Optimize Training
System metrics are automatically gathered for every run, giving you the insights you need to fully utilize the hardware.
Experiment Versioning
Every experiment automatically snapshots the code, data, parameters, and container images so you can always reproduce the experiments that worked.
Data Traceability
Turbine keeps a version history of your data sets so that results can always be traced back to the data that produced them.
Pipelines are the building blocks of machine learning workflows. Pipelines bring flexibility and modularity to your model training: whether you are separating data preparation from the training step or running a quick smoke test before starting a large distributed job, pipelines let you build a workflow to suit your project's needs.
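As a sketch, a two-step pipeline that separates data preparation from training might be declared like this. The schema, step names, and fields are hypothetical, not actual Turbine syntax:

```yaml
# Hypothetical pipeline definition -- field names are illustrative only.
pipeline:
  - name: prepare-data
    command: python prepare.py --out data/processed
  - name: train
    depends_on: [prepare-data]   # runs only after prepare-data succeeds
    command: python train.py --data data/processed
```

Declaring the dependency explicitly is what makes the steps reusable: the same `train` step could follow a quick smoke-test step instead of the full preparation step.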