maestro
coming: when it's ready...
maestro is a tool designed to streamline and accelerate the fine-tuning process for multimodal models. It provides ready-to-use recipes for fine-tuning popular vision-language models (VLMs) such as Florence-2, PaliGemma, and Qwen2-VL on downstream vision-language tasks.
install
Install the maestro package with pip in a Python>=3.8 environment.
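Before installing, it can help to confirm that the active interpreter meets the stated Python>=3.8 requirement. The snippet below is a minimal sketch using only the standard library; the helper name `meets_minimum` is made up for this example and is not part of maestro.

```python
import sys
from typing import Sequence, Tuple

# Minimum version taken from the requirement stated above.
MINIMUM_PYTHON: Tuple[int, int] = (3, 8)


def meets_minimum(version: Sequence[int], minimum: Tuple[int, int] = MINIMUM_PYTHON) -> bool:
    """Return True if the (major, minor) part of `version` is at least `minimum`."""
    return tuple(version[:2]) >= minimum


if meets_minimum(sys.version_info):
    print("Python version is recent enough.")
else:
    print("Python 3.8 or newer is required.")
```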
quickstart
CLI
VLMs can be fine-tuned on downstream tasks directly from the command line with the maestro command:
SDK
Alternatively, you can fine-tune VLMs using the Python SDK, which accepts the same arguments as the CLI example above:
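The idea that one set of arguments can serve both entry points can be sketched as follows. This is an illustration only, not maestro's confirmed interface: the `FineTuneArgs` class, its field names (`dataset`, `epochs`, `batch_size`), the `florence2 train` subcommand, and the flag spelling are all assumptions made for the example. The code does not import or call maestro; check the maestro documentation for the real argument names.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class FineTuneArgs:
    """Hypothetical argument set; names are placeholders, not maestro's real API."""

    dataset: str
    epochs: int = 10
    batch_size: int = 8


def to_cli_argv(model: str, args: FineTuneArgs) -> List[str]:
    """Render the arguments as a hypothetical `maestro <model> train` command line."""
    argv = ["maestro", model, "train"]
    for name, value in asdict(args).items():
        # Assumed convention: snake_case field name becomes a --kebab-case flag.
        argv.append(f"--{name.replace('_', '-')}={value}")
    return argv


# The same object could be passed to a Python training function or rendered for a shell.
args = FineTuneArgs(dataset="<DATASET_PATH>")
print(" ".join(to_cli_argv("florence2", args)))
```

Keeping the arguments in a single structure means a run defined in Python can be reproduced from a shell (or the other way round) without restating each value by hand.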