This blog post will provide a comprehensive overview of Detectron2, highlighting its key features and advantages. We'll guide you through the installation and the practical usage of Detectron2. We'll address common challenges such as installation issues, compatibility concerns, and algorithmic intricacies.
The deep learning landscape is enriched with numerous tools and libraries designed to simplify complex tasks. In the domain of Computer Vision, object detection is one of the tasks that has attracted the most attention. With the release of Detectron2, Facebook AI Research (FAIR) took the challenge head-on, offering a cutting-edge platform for this purpose.
What is Detectron2?
Detectron2 is an open-source project from Facebook AI Research (FAIR) and represents the second version of the Detectron library. Unlike its predecessor, Detectron2 is written in PyTorch, one of the most popular deep learning libraries. This transition provides developers and researchers with greater flexibility, extensibility, and ease of use.
Key features of Detectron2
1. Modular and flexible design: Detectron2 is built with modularity in mind. This allows researchers and developers to easily plug in new components or tweak existing ones without much hassle.
2. Extensive model zoo: It comes with a plethora of pre-trained models. Whether you are looking to implement instance segmentation, panoptic segmentation, or plain object detection, Detectron2 has a pre-trained model available.
3. Native PyTorch implementation: Unlike its predecessor, which was built on Caffe2, Detectron2 leverages the capabilities of PyTorch, making it much easier to use and integrate with other PyTorch-based tools.
4. Training and evaluation utilities: Detectron2 provides out-of-the-box functionalities that streamline the process of training, evaluating, and fine-tuning models.
Detectron2 model zoo: models for every computer vision task
Detectron2 provides a wide range of models in its model zoo, each tailored for specific computer vision tasks. Here's a breakdown of the main models Detectron2 offers for different tasks:
1. Object detection
Faster R-CNN: This is a pioneering model that combines Region Proposal Networks (RPN) with Fast R-CNN for end-to-end object detection.
TridentNet: An object detection model that introduces multi-branch architectures, called "tridents," to handle objects of various scales more effectively.
RetinaNet: This model uses a Feature Pyramid Network (FPN) backbone and a novel focal loss to address the problem of class imbalance during object detection.
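To make the focal-loss idea concrete, here is a minimal NumPy sketch of its binary form, FL(p_t) = -α_t (1 - p_t)^γ log(p_t); the function name and default values are ours, chosen to mirror the common settings from the RetinaNet paper, not code from Detectron2 itself:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss: down-weights easy, well-classified examples.

    p: predicted probabilities of the positive class
    y: ground-truth labels (0 or 1)
    """
    p_t = np.where(y == 1, p, 1.0 - p)            # probability assigned to the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# An easy example (p_t = 0.9) is penalized far less than a hard one (p_t = 0.1)
losses = focal_loss(np.array([0.9, 0.1]), np.array([1, 1]))
```

With γ = 0 this reduces (up to the constant α) to standard cross-entropy; raising γ focuses training on hard, misclassified examples, which is how RetinaNet copes with the extreme foreground/background imbalance of dense detection.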
2. Semantic segmentation
DeepLabv3+: An encoder-decoder model known for strong performance in semantic segmentation tasks, leveraging atrous convolutions and atrous spatial pyramid pooling (ASPP).
3. Instance segmentation
Mask R-CNN: Building upon Faster R-CNN, Mask R-CNN adds a segmentation mask prediction branch, allowing it to predict object masks along with bounding boxes.
PointRend: A technique that iteratively refines segmentation masks by focusing on uncertain regions and employs point-based rendering to produce high-resolution and detailed object boundaries.
4. Panoptic segmentation
Panoptic FPN: A model that addresses both semantic and instance segmentation tasks. It unifies the typically distinct semantic segmentation and instance segmentation tasks under a single framework.
5. Keypoint detection
Keypoint R-CNN: An extension of Mask R-CNN, it predicts object keypoints in addition to bounding boxes and masks, making it useful for tasks such as human pose estimation.
6. Dense pose estimation
DensePose R-CNN: A model that maps all human pixels in an RGB image to the 3D surface of the human body. It's useful for detailed human pose estimation.
How to use Detectron2?
In this section, we will walk through the Detectron2 documentation for instance segmentation. Before jumping in, we recommend reviewing the entire process, as we encountered several problematic steps. Reading through our various attempts will save you time and energy.
Detectron2 recommends specific OS, Python, and PyTorch versions for optimal results:
Linux or macOS with Python ≥ 3.7
PyTorch ≥ 1.8 and a torchvision version that matches the PyTorch installation. Install them together from pytorch.org to ensure compatibility.
OpenCV, which is optional but required for the demo and visualization utilities.
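Before installing anything, it is worth confirming that your interpreter satisfies the Python floor. A minimal check (the helper name is ours, not part of Detectron2):

```python
import sys

def meets_minimum(version_info, minimum=(3, 7)):
    """Return True when the interpreter version satisfies Detectron2's stated minimum."""
    return tuple(version_info[:2]) >= minimum

print(meets_minimum(sys.version_info))  # True on Python 3.7+
```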
That said, we are venturing forward on a Windows machine, and for all the Windows users reading this, let's make it happen!
Setting up a working environment begins with creating a Python virtual environment and then installing the Torch dependencies.
Creating the virtual environment
Python 3.7 or higher is suggested; we are opting for Python 3.10:
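A typical setup looks like the following sketch (Linux/macOS commands shown; the environment name is our choice, and Windows users would adapt the activation line):

```shell
# Create an isolated virtual environment for the Detectron2 experiments
python3 -m venv detectron2-env

# Activate it (on Windows: detectron2-env\Scripts\activate)
. detectron2-env/bin/activate
python -m pip install --upgrade pip

# The exact PyTorch install command depends on your OS and CUDA version;
# copy it from the selector at pytorch.org, e.g.:
#   pip install torch torchvision
```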
This time, the compilation error stack was so extensive that the command prompt interface wouldn't even display the beginning of it.
It was disappointing to discover that the Detectron2 documentation did not provide any further installation alternatives apart from Docker images, which are ephemeral and complex to set up. Additionally, the 'common installation issues' section didn't address the specific error we encountered.
Windows Support: an oversight?
When facing issues with a particular repository, the 'issues' section is typically a reliable resource for potential solutions.
At the time of writing this post, there were several open issues related to support for Windows users. Unfortunately, the lack of response from the facebookresearch Detectron2 team suggests that Windows support may not be forthcoming.
Considering the 2023 Stack Overflow survey indicates Windows remains the dominant operating system for developers (both in personal and professional spheres), the absence of Windows support is indeed perplexing.
Attempt 3: successful installation of Detectron2
Given the lack of Windows support, we forked the repository and edited it so that it compiles correctly across operating systems.
We selected the ‘mask_rcnn_R_50_FPN_3x’ model and its corresponding config file from the model zoo. To demonstrate the built-in configurations, we utilized the ‘demo.py’ provided. Note that ‘demo.py’ can be found in the ‘detectron2/demo’ directory.
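A typical invocation looks like the following (the paths are illustrative; adjust them to where you cloned the repository and downloaded the model weights):

```shell
# Run the built-in demo from the detectron2 repository root:
# - the config file comes from the model zoo's COCO instance segmentation configs
# - MODEL.WEIGHTS should point at the downloaded checkpoint for that config
python demo/demo.py \
  --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml \
  --input input.jpg \
  --output results/ \
  --opts MODEL.WEIGHTS path/to/model_final.pkl
```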
Navigating the Detectron2 setup proved to be a time-consuming challenge, taking over an hour of significant effort to implement successfully.
Although the demo was executed seamlessly, identifying the right combination of model weights and configuration files for more extensive testing is less than intuitive.
In the following section, we'll demonstrate how to simplify the installation and usage of Detectron2 via the Ikomia API, significantly reducing both the steps and time needed to execute object detection tasks.
Easier Detectron2 object detection with a Python API
With the Ikomia team, we've been working on a prototyping tool to avoid dependencies and compatibility issues, thereby speeding up the often tedious processes of installation and testing.
We wrapped it in an open source Python API. Now we're going to explain how to use all the Detectron2 models in less than 5 minutes.
If you have any questions, please join our Discord.
Then the only thing you need to install is Ikomia API:
pip install ikomia
Detectron2 instance segmentation inference
You can also directly run the notebook we have prepared.
from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_detectron2_instance_segmentation", auto_connect=True)

# Set parameters (names and values vary per algorithm; check its page on Ikomia HUB)
algo.set_parameters({"conf_thres": "0.5"})

# Run on your image
wf.run_on(path="path/to/your/image.jpg")

# Display the results
display(algo.get_image_with_mask_and_graphics())
Fast Detectron2 execution: from setup to results in just 5 minutes
To carry out instance segmentation, we simply installed Ikomia and ran the workflow code snippets. All dependencies were seamlessly handled in the background. We progressed from setting up a virtual environment to obtaining results in approximately 5 minutes.
Explore further with the Detectron2 algorithms
We've implemented all the Detectron2 algorithms for both inference and training. You can conveniently find code snippets tailored to your needs on the Ikomia HUB.
Crafting production-ready Computer Vision applications with ease
One of the standout benefits of the API, aside from simplifying dependency installation, is its ability to seamlessly interlink algorithms from diverse frameworks such as Detectron2, OpenMMLab, YOLO, and Hugging Face.
Once you have crafted your solution with Ikomia's Python API, you can deploy it yourself, or opt for SCALE, our automated deployment SaaS platform.