Real-Time Object Detection Using TensorFlow.NET in .NET Core

Tapesh Mehta | Published on: Jul 05, 2024 | Est. reading time: 7 minutes

In this post we'll look at how to implement real-time object detection using TensorFlow.NET in .NET Core. Object detection is a fundamental computer vision task that underpins applications such as surveillance, autonomous driving and image search. Detecting and classifying objects in real time lets a system react and make decisions based on visual data. For a .NET development company, this kind of AI capability can add real value to your apps and give you a competitive edge.

This tutorial shows how to implement real-time object detection using TensorFlow.NET and camera feeds in a .NET Core application. Whether you're a veteran developer or just getting started, this step-by-step guide walks you through setting up the project, downloading and loading a pre-trained model, processing a live video feed and drawing the detection results. By the end you should have a solid foundation for using TensorFlow.NET for real-time object detection.

For those interested in learning more about .NET development, check out our .NET Development blogs. Stay updated with the latest insights and best practices!

Prerequisites
Step 1: Setting Up the .NET Core Project
Step 2: Downloading a Pre-trained Model
Step 3: Loading the Model in .NET Core
Step 4: Capturing Real-Time Camera Feed
Step 5: Object Detection Logic
Step 6: Drawing Detections on the Frame
Full Code
Conclusion

Prerequisites

Before we start, make sure you have the following installed on your machine:

.NET Core SDK
Visual Studio or Visual Studio Code
TensorFlow.NET (added as a NuGet package in Step 1)
A webcam or an IP camera for real-time feed

Step 1: Setting Up the .NET Core Project

First, create a new .NET Core console application.

dotnet new console -n ObjectDetection
cd ObjectDetection

Next, install the necessary NuGet packages.

dotnet add package TensorFlow.NET
dotnet add package SciSharp.TensorFlow.Redist
dotnet add package OpenCvSharp4
dotnet add package OpenCvSharp4.runtime.win

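These packages provide TensorFlow.NET for model inference and OpenCvSharp for handling camera feeds and image processing. OpenCvSharp4.runtime.win contains the native OpenCV binaries for Windows; on other platforms, install the matching OpenCvSharp4 runtime package instead.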

Step 2: Downloading a Pre-trained Model

For this tutorial, we’ll use a pre-trained SSD MobileNet V2 model, which is efficient and suitable for real-time applications. Download the model and labels from TensorFlow’s Model Zoo.

Extract the model files and place them in a folder named models in your project directory.

Step 3: Loading the Model in .NET Core

Let’s start by loading the TensorFlow model and labels.

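using System;
using System.Collections.Generic;
using System.IO;
using Tensorflow;
using static Tensorflow.Binding;

class Program
{
    // Graph.Import reads a frozen GraphDef file, not a SavedModel directory.
    // Assumes frozen_inference_graph.pb from the model archive was extracted here.
    private static string modelPath = "models/ssd_mobilenet_v2_coco/frozen_inference_graph.pb";
    private static string labelsPath = "models/coco-labels-paper.txt";
    private static List<string> labels;

    static void Main(string[] args)
    {
        // Load labels (one class name per line)
        labels = new List<string>(File.ReadAllLines(labelsPath));

        // Load model
        var session = LoadModel();

        // Implement the rest of the application
    }

    private static Session LoadModel()
    {
        if (!File.Exists(modelPath))
            throw new FileNotFoundException("Model file not found.", modelPath);

        // Frozen graphs run in graph mode; recent TensorFlow.NET releases default to eager mode.
        tf.compat.v1.disable_eager_execution();

        var graph = new Graph().as_default();
        graph.Import(modelPath);
        return tf.Session(graph);
    }
}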
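
The labels list is indexed from 0, while detection models trained on COCO typically report class IDs starting at 1. The sketch below shows one way to map an ID to a name safely. It assumes the labels file lists one class name per line in ID order; LabelMap and Lookup are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;

public static class LabelMap
{
    // Maps a 1-based class ID to its label. Falls back to a generic name
    // when the ID is outside the list (assumption: line 1 holds ID 1).
    public static string Lookup(IReadOnlyList<string> labels, int classId)
    {
        int index = classId - 1;
        if (index < 0 || index >= labels.Count)
            return "class " + classId;
        return labels[index];
    }
}
```

With the labels loaded above, LabelMap.Lookup(labels, 1) would return the first line of the labels file, and an unexpected ID yields a placeholder instead of an exception.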

Step 4: Capturing Real-Time Camera Feed

We will use OpenCvSharp to capture the camera feed. Install the OpenCvSharp packages if you haven’t already.

using OpenCvSharp;

class Program
{
    static void Main(string[] args)
    {
        // Initialize camera
        VideoCapture capture = new VideoCapture(0);
        if (!capture.IsOpened())
        {
            Console.WriteLine("Failed to open camera.");
            return;
        }

        // Load model and labels
        labels = new List<string>(File.ReadAllLines(labelsPath));
        var session = LoadModel();

        // Process camera feed
        Mat frame = new Mat();
        while (true)
        {
            capture.Read(frame);
            if (frame.Empty())
                break;

            // Process frame for object detection
            var detectionResult = DetectObjects(frame, session);

            // Draw detections on frame
            DrawDetections(frame, detectionResult);

            // Display frame
            Cv2.ImShow("Object Detection", frame);
            if (Cv2.WaitKey(1) == 'q')
                break;
        }

        capture.Release();
        Cv2.DestroyAllWindows();
    }

    // Implement DetectObjects and DrawDetections methods
}
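
Running the model on every frame can be slower than the camera's frame rate. One common workaround, which is not part of the code above, is to run detection only on every Nth frame and reuse the previous result in between. A minimal sketch, where FrameGate is a hypothetical helper name:

```csharp
using System;

public sealed class FrameGate
{
    private readonly int _interval;
    private long _count;

    public FrameGate(int interval)
    {
        if (interval < 1)
            throw new ArgumentOutOfRangeException(nameof(interval));
        _interval = interval;
    }

    // Returns true for the first frame and for every interval-th frame after it.
    public bool ShouldProcess()
    {
        bool process = _count % _interval == 0;
        _count++;
        return process;
    }
}
```

Inside the capture loop, calling ShouldProcess() once per frame and invoking DetectObjects only when it returns true keeps the display responsive, at the cost of boxes that lag slightly behind fast-moving objects.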

Step 5: Object Detection Logic

Next, we’ll implement the DetectObjects method to process the frame through the TensorFlow model.

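// Requires: using System; using System.Runtime.InteropServices; using OpenCvSharp;
// using Tensorflow; using Tensorflow.NumPy;
// Assumption: a recent TensorFlow.NET release where NDArray lives in Tensorflow.NumPy
// (older releases use NumSharp) and inference goes through Session.run with FeedItem.

private static NDArray PrepareInputTensor(Mat frame)
{
    using (var resized = new Mat())
    using (var rgb = new Mat())
    {
        // Resize to 300x300 and convert OpenCV's BGR channel order to RGB.
        Cv2.Resize(frame, resized, new OpenCvSharp.Size(300, 300));
        Cv2.CvtColor(resized, rgb, ColorConversionCodes.BGR2RGB);

        // Copy the 300 * 300 * 3 pixel bytes out of the Mat.
        var imageData = new byte[300 * 300 * 3];
        Marshal.Copy(rgb.Data, imageData, 0, imageData.Length);

        // Shape [1, 300, 300, 3]: a batch containing one uint8 RGB image.
        return np.array(imageData).reshape(new Shape(1, 300, 300, 3));
    }
}

private static object[] DetectObjects(Mat frame, Session session)
{
    NDArray inputTensor = PrepareInputTensor(frame);
    var graph = session.graph;

    // Input and output tensors exposed by the frozen detection graph.
    Tensor imageTensor = graph.OperationByName("image_tensor").outputs[0];
    var fetches = new[]
    {
        graph.OperationByName("detection_boxes").outputs[0],
        graph.OperationByName("detection_scores").outputs[0],
        graph.OperationByName("detection_classes").outputs[0],
        graph.OperationByName("num_detections").outputs[0]
    };

    NDArray[] output = session.run(fetches, new FeedItem(imageTensor, inputTensor));

    float[] flatBoxes = output[0].ToArray<float>();   // [1, N, 4], flattened
    float[] scores = output[1].ToArray<float>();      // [1, N]
    float[] classes = output[2].ToArray<float>();     // [1, N]
    float numDetections = output[3].ToArray<float>()[0];

    // Rebuild the N x 4 box table that DrawDetections indexes as boxes[i, j].
    var boxes = new float[flatBoxes.Length / 4, 4];
    Buffer.BlockCopy(flatBoxes, 0, boxes, 0, flatBoxes.Length * sizeof(float));

    return new object[] { boxes, scores, classes, numDetections };
}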
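
Passing results around as an object[] works, but a typed structure is easier to read and harder to misuse. The sketch below is an illustration rather than part of the original code: it assumes the model outputs have already been copied into plain float arrays, with boxes flattened as [ymin, xmin, ymax, xmax] per detection. Detection and DetectionParser are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public sealed class Detection
{
    public int ClassId { get; }
    public float Score { get; }
    public float YMin { get; }
    public float XMin { get; }
    public float YMax { get; }
    public float XMax { get; }

    public Detection(int classId, float score, float yMin, float xMin, float yMax, float xMax)
    {
        ClassId = classId;
        Score = score;
        YMin = yMin;
        XMin = xMin;
        YMax = yMax;
        XMax = xMax;
    }
}

public static class DetectionParser
{
    // Converts parallel output arrays into Detection objects,
    // keeping only entries whose score is at least minScore.
    public static List<Detection> Parse(
        float[] flatBoxes, float[] scores, float[] classes, int count, float minScore)
    {
        // Never read past the shortest of the arrays.
        int n = Math.Min(count, Math.Min(scores.Length, Math.Min(classes.Length, flatBoxes.Length / 4)));
        var result = new List<Detection>();
        for (int i = 0; i < n; i++)
        {
            if (scores[i] < minScore)
                continue;
            int o = i * 4; // offset of this detection's four box values
            result.Add(new Detection((int)classes[i], scores[i],
                flatBoxes[o], flatBoxes[o + 1], flatBoxes[o + 2], flatBoxes[o + 3]));
        }
        return result;
    }
}
```

A drawing routine can then loop over the returned list without any casts, and the score threshold lives in one place.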

Step 6: Drawing Detections on the Frame

Finally, we will implement the DrawDetections method to visualize the detected objects.

private static void DrawDetections(Mat frame, object[] detectionResult)
{
    // Each box row holds normalized [ymin, xmin, ymax, xmax] coordinates.
    var boxes = (float[,])detectionResult[0];
    var scores = (float[])detectionResult[1];
    var classes = (float[])detectionResult[2];
    var numDetections = (int)(float)detectionResult[3];

    for (int i = 0; i < numDetections; i++)
    {
        float score = scores[i];
        if (score < 0.5f)
            continue;

        // COCO class IDs start at 1, so shift by one to index the labels list.
        int classId = (int)classes[i];
        string label = classId >= 1 && classId <= labels.Count
            ? labels[classId - 1]
            : "class " + classId;

        // Scale the normalized coordinates to pixel coordinates.
        var box = new Rect(
            (int)(boxes[i, 1] * frame.Width),
            (int)(boxes[i, 0] * frame.Height),
            (int)((boxes[i, 3] - boxes[i, 1]) * frame.Width),
            (int)((boxes[i, 2] - boxes[i, 0]) * frame.Height)
        );

        Cv2.Rectangle(frame, box, Scalar.Red, 2);
        // Keep the caption inside the frame when the box touches the top edge.
        Cv2.PutText(frame, label, new Point(box.Left, Math.Max(box.Top - 10, 15)), HersheyFonts.HersheySimplex, 0.9, Scalar.Red, 2);
    }
}
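
The box arithmetic above converts normalized coordinates into pixels. The sketch below isolates that step and adds clamping, since a model can report values slightly outside the 0 to 1 range. It assumes the [ymin, xmin, ymax, xmax] ordering used in this tutorial; BoxMath and ToPixelRect are hypothetical names.

```csharp
using System;

public static class BoxMath
{
    // Converts a normalized box to a pixel rectangle (x, y, width, height),
    // clamped so that it stays inside the frame.
    public static (int X, int Y, int Width, int Height) ToPixelRect(
        float yMin, float xMin, float yMax, float xMax, int frameWidth, int frameHeight)
    {
        int left = Clamp((int)(xMin * frameWidth), frameWidth);
        int top = Clamp((int)(yMin * frameHeight), frameHeight);
        int right = Clamp((int)(xMax * frameWidth), frameWidth);
        int bottom = Clamp((int)(yMax * frameHeight), frameHeight);
        return (left, top, Math.Max(0, right - left), Math.Max(0, bottom - top));
    }

    // Limits value to the range [0, max].
    private static int Clamp(int value, int max)
    {
        return Math.Max(0, Math.Min(value, max));
    }
}
```

For a 640x480 frame, a box with ymin 0.25, xmin 0.125, ymax 0.75 and xmax 0.5 maps to x = 80, y = 120, width = 240 and height = 240.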

Full Code

Here’s the complete code for reference:

using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using OpenCvSharp;
using Tensorflow;
using Tensorflow.NumPy; // Assumption: recent TensorFlow.NET release (older ones use NumSharp)
using static Tensorflow.Binding;

class Program
{
    // Graph.Import reads a frozen GraphDef file, not a SavedModel directory.
    // Assumes frozen_inference_graph.pb from the model archive was extracted here.
    private static string modelPath = "models/ssd_mobilenet_v2_coco/frozen_inference_graph.pb";
    private static string labelsPath = "models/coco-labels-paper.txt";
    private static List<string> labels;

    static void Main(string[] args)
    {
        // Load labels (one class name per line)
        labels = new List<string>(File.ReadAllLines(labelsPath));

        // Load model
        var session = LoadModel();

        // Initialize camera (index 0 is the default device)
        VideoCapture capture = new VideoCapture(0);
        if (!capture.IsOpened())
        {
            Console.WriteLine("Failed to open camera.");
            return;
        }

        // Process camera feed
        Mat frame = new Mat();
        while (true)
        {
            capture.Read(frame);
            if (frame.Empty())
                break;

            // Process frame for object detection
            var detectionResult = DetectObjects(frame, session);

            // Draw detections on frame
            DrawDetections(frame, detectionResult);

            // Display frame; press 'q' to quit
            Cv2.ImShow("Object Detection", frame);
            if (Cv2.WaitKey(1) == 'q')
                break;
        }

        capture.Release();
        Cv2.DestroyAllWindows();
    }

    private static Session LoadModel()
    {
        if (!File.Exists(modelPath))
            throw new FileNotFoundException("Model file not found.", modelPath);

        // Frozen graphs run in graph mode; recent TensorFlow.NET releases default to eager mode.
        tf.compat.v1.disable_eager_execution();

        var graph = new Graph().as_default();
        graph.Import(modelPath);
        return tf.Session(graph);
    }

    private static NDArray PrepareInputTensor(Mat frame)
    {
        using (var resized = new Mat())
        using (var rgb = new Mat())
        {
            // Resize to 300x300 and convert OpenCV's BGR channel order to RGB.
            Cv2.Resize(frame, resized, new OpenCvSharp.Size(300, 300));
            Cv2.CvtColor(resized, rgb, ColorConversionCodes.BGR2RGB);

            // Copy the 300 * 300 * 3 pixel bytes out of the Mat.
            var imageData = new byte[300 * 300 * 3];
            Marshal.Copy(rgb.Data, imageData, 0, imageData.Length);

            // Shape [1, 300, 300, 3]: a batch containing one uint8 RGB image.
            return np.array(imageData).reshape(new Shape(1, 300, 300, 3));
        }
    }

    private static object[] DetectObjects(Mat frame, Session session)
    {
        NDArray inputTensor = PrepareInputTensor(frame);
        var graph = session.graph;

        // Input and output tensors exposed by the frozen detection graph.
        Tensor imageTensor = graph.OperationByName("image_tensor").outputs[0];
        var fetches = new[]
        {
            graph.OperationByName("detection_boxes").outputs[0],
            graph.OperationByName("detection_scores").outputs[0],
            graph.OperationByName("detection_classes").outputs[0],
            graph.OperationByName("num_detections").outputs[0]
        };

        NDArray[] output = session.run(fetches, new FeedItem(imageTensor, inputTensor));

        float[] flatBoxes = output[0].ToArray<float>();   // [1, N, 4], flattened
        float[] scores = output[1].ToArray<float>();      // [1, N]
        float[] classes = output[2].ToArray<float>();     // [1, N]
        float numDetections = output[3].ToArray<float>()[0];

        // Rebuild the N x 4 box table that DrawDetections indexes as boxes[i, j].
        var boxes = new float[flatBoxes.Length / 4, 4];
        Buffer.BlockCopy(flatBoxes, 0, boxes, 0, flatBoxes.Length * sizeof(float));

        return new object[] { boxes, scores, classes, numDetections };
    }

    private static void DrawDetections(Mat frame, object[] detectionResult)
    {
        // Each box row holds normalized [ymin, xmin, ymax, xmax] coordinates.
        var boxes = (float[,])detectionResult[0];
        var scores = (float[])detectionResult[1];
        var classes = (float[])detectionResult[2];
        var numDetections = (int)(float)detectionResult[3];

        for (int i = 0; i < numDetections; i++)
        {
            float score = scores[i];
            if (score < 0.5f)
                continue;

            // COCO class IDs start at 1, so shift by one to index the labels list.
            int classId = (int)classes[i];
            string label = classId >= 1 && classId <= labels.Count
                ? labels[classId - 1]
                : "class " + classId;

            // Scale the normalized coordinates to pixel coordinates.
            var box = new Rect(
                (int)(boxes[i, 1] * frame.Width),
                (int)(boxes[i, 0] * frame.Height),
                (int)((boxes[i, 3] - boxes[i, 1]) * frame.Width),
                (int)((boxes[i, 2] - boxes[i, 0]) * frame.Height)
            );

            Cv2.Rectangle(frame, box, Scalar.Red, 2);
            // Keep the caption inside the frame when the box touches the top edge.
            Cv2.PutText(frame, label, new Point(box.Left, Math.Max(box.Top - 10, 15)), HersheyFonts.HersheySimplex, 0.9, Scalar.Red, 2);
        }
    }
}

Conclusion

This article showed how to implement real-time object detection in a .NET Core application using TensorFlow.NET and OpenCvSharp. We set up the project, downloaded a pre-trained model and captured a live camera feed. We then walked through the code that runs each frame through the TensorFlow model and draws the detection results on it.

For a .NET development company, AI-powered features such as object detection can pave the way for new applications and services. Following this guide gives you the knowledge and tools to start building and deploying object detection systems, which extends what your applications can do and helps keep your company at the forefront of technical developments in the market. Continue experimenting with configurations and models to improve performance and accuracy for your particular use cases.

Tapesh Mehta

Verified

Expert in Software Development

Tapesh Mehta is a seasoned developer who has been building web, mobile and desktop apps for more than 14 years. He works across many programming languages and frameworks. For robust web solutions, he specializes in ASP.NET, PHP and Python. He also builds hybrid mobile apps with Ionic, Xamarin and Flutter to deliver consistent cross-platform user experiences. In addition, Tapesh has extensive experience building complex desktop apps with WPF. His work is marked by a constant desire to learn and adapt.
