By Phoebe Parker, Senior Consultant, Altis Consulting
LabelImg is a graphical image annotation tool that lets you draw bounding boxes around the objects in each image and automatically saves the annotations for your labelled images as XML files. It is a free and easy way to label your images.
If you are using Python 3, the easiest way to install LabelImg is with pip. Open your command line and run ‘pip3 install labelImg’. To launch the tool, run ‘labelImg’ at the command prompt.
Organise your dataset
Once you have opened LabelImg:
- Create a folder titled “images” and copy all of your images there
- Create another folder titled “annotations”
- Go to the LabelImg “View” menu and check that “Auto Save Mode” is selected
- Click on “Open Dir” on the left-hand side and select your “images” directory
- LabelImg will then ask where to save annotations. Select your “annotations” directory.
- If no image comes up click the “Next Image” button a few times and it should begin to flick through your images.
- Before labelling your images, check which label file format you would like to use. VOC XML is more of a universal standard when it comes to object detection. LabelImg also supports YOLO format if you would prefer to use that. To change the format, click on the “</> PascalVOC” or “YOLO” box on the left-hand side.
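The folder setup in the steps above can also be scripted. The following is a minimal sketch, not part of LabelImg itself: the function name `organise_dataset`, the `IMAGE_SUFFIXES` set and the example paths are assumptions made for illustration.

```python
import shutil
from pathlib import Path

# Assumed set of image file extensions; adjust it to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def organise_dataset(source_dir, dataset_dir):
    """Copy images into dataset_dir/images and create dataset_dir/annotations.

    Hypothetical helper for illustration; LabelImg does not provide it.
    """
    source = Path(source_dir)
    images = Path(dataset_dir) / "images"
    annotations = Path(dataset_dir) / "annotations"

    # Create both folders; do nothing if they already exist.
    images.mkdir(parents=True, exist_ok=True)
    annotations.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(source.iterdir()):
        # Copy only files whose extension looks like an image.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            shutil.copy2(path, images / path.name)
            copied.append(path.name)
    return images, annotations, copied
```

For example, `organise_dataset("raw_photos", "dataset")` would leave you with “dataset/images” and “dataset/annotations” folders that you can then select in LabelImg (both paths here are placeholders).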
Annotating your Dataset
You are all set up to begin labelling your images now. In this example, we’re using the Kaggle cats v dogs dataset.
To annotate your images:
- Click on “Create RectBox”
- Use your mouse to draw a box around the object you are wanting to label
- Enter the name of the object in the pop-up box and select “OK”
- If there is more than one object in the picture, select “Create RectBox” again and annotate all the objects
- Click on the “Next Image” button and repeat the process until you have finished annotating all your images. All the image annotations will be automatically saved in your “annotations” folder. They will be XML files with names corresponding to your image names.
Keep the following in mind as you label:
- Label the whole object. It is better to include a buffer than to cut out part of the object you are labelling.
- If an object is obscured by another object, label it as if you can see the full object. This helps the model to understand the actual size and area of the object.
- If an object is partially out of the frame, label it anyway.
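Once you have finished, it can be worth checking the saved annotations. The sketch below assumes you chose the VOC XML format and that each file follows the usual Pascal VOC layout (an “object” element containing “name” and “bndbox” with “xmin”, “ymin”, “xmax” and “ymax”). The function names `read_voc_boxes` and `find_unlabelled_images` are hypothetical helpers, not part of LabelImg.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def read_voc_boxes(xml_path):
    """Return a list of (label, xmin, ymin, xmax, ymax) from one VOC XML file."""
    root = ET.parse(str(xml_path)).getroot()
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Coordinates are pixel values; go via float() in case they
        # were written with a decimal point.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, *coords))
    return boxes


def find_unlabelled_images(images_dir, annotations_dir):
    """List image file names that have no XML file with a matching name."""
    labelled = {p.stem for p in Path(annotations_dir).glob("*.xml")}
    return sorted(
        p.name
        for p in Path(images_dir).iterdir()
        if p.is_file() and p.stem not in labelled
    )
```

Running `find_unlabelled_images("images", "annotations")` returns the images you have not annotated yet, and `read_voc_boxes` lets you spot-check the labels and box coordinates in any one file.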
If you’re interested in this topic, you can watch the recording of the Data4Good Webinar, where Phoebe talked about the technology used in this project and what they hope to achieve as a long-term goal.