PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools, implemented in PaddlePaddle, that help users train better models and apply them in practice. PaddleOCR supports a variety of cutting-edge OCR algorithms and has developed industrial solutions. PaddleOCR now comes with a Weights & Biases integration for logging training and evaluation metrics, along with model checkpoints and their corresponding metadata.
Example Blog & Colab
Read here to see how to train a model with PaddleOCR on the ICDAR2015 dataset. It comes with a Google Colab, and the corresponding live W&B dashboard is available here. There is also a Chinese version of this blog here: W&B对您的OCR模型进行训练和调试 ("Train and debug your OCR models with W&B").
Using PaddleOCR with Weights & Biases
1. Sign up and Log in to wandb
Sign up for a free account, then install the wandb library from the command line in a Python 3 environment. To log in, you'll need to be signed in to your account at www.wandb.ai; you will then find your API key on the Authorize page.
- Command Line
pip install wandb
- Notebook
!pip install wandb
2. Add wandb to your config.yml file
PaddleOCR requires configuration variables to be provided using a yaml file. Adding the following snippet at the end of the configuration yaml file will automatically log all training and validation metrics to a W&B dashboard along with model checkpoints:
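A minimal sketch of that snippet, assuming the `use_wandb` flag that PaddleOCR reads from the `Global` section of its configuration. If your config file already contains a `Global` section, add the key there rather than declaring `Global` a second time, since a repeated top-level key may override the earlier one.

```yaml
# Assumed flag: enables the Weights & Biases logger in PaddleOCR
Global:
  use_wandb: True
```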
Any additional, optional arguments that you might like to pass to wandb.init can also be added under the wandb header in the yaml file:
wandb:
  project: CoolOCR # (optional) this is the wandb project name
  entity: my_team # (optional) if you're using a wandb team, you can pass the team name here
  name: MyOCRModel # (optional) this is the name of the wandb run
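As a further illustration of passing extra arguments, the sketch below adds `tags` and `notes`, which are standard `wandb.init` parameters. The values are placeholders, and the sketch assumes that every key under the `wandb` header is forwarded unchanged to `wandb.init`, as described above.

```yaml
wandb:
  project: CoolOCR
  # Assumption: keys below are passed straight through to wandb.init
  tags: [icdar2015, baseline]      # hypothetical tags for filtering runs
  notes: "first detection run"     # hypothetical free-text description
```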
3. Pass the config.yml file to train.py
The yaml file is then provided as an argument to the training script available in the PaddleOCR repository.
python tools/train.py -c config.yml
Once you run your train.py file with Weights & Biases turned on, a link will be generated to bring you to your W&B dashboard.