
Getting started with HoloLens for Computer Vision

This is the first article in an upcoming series of tutorials for people who want to get started with developing Computer Vision applications on the Microsoft HoloLens. It's based on experience gained from the recently delivered Smart Prague project (Czech language only).

Our tool stack for this series will be:

  • Visual Studio 2017 with a recent Windows [SDK][3] installed (version 1809 was used at the time of writing).

  • OpenCV – as our computer vision algorithms library.

  • Unity – as our rendering and prototyping engine.

Please be sure to install Visual Studio and Unity with UWP platform support if you want to follow along with the series.

In this article I will explain how to compile OpenCV for UWP, wrap it in a UWP plugin, and call that plugin from Unity.

Compiling OpenCV

Although the process is straightforward, it seems many people struggle with it. That's why here is a step-by-step guide for compiling OpenCV from source for UWP applications.

1) Go to the OpenCV GitHub releases page, download the Source code (zip) file of the desired version and extract the archive. I will be using v3.4.1 (my choice was based on the fact that this is the version I was able to compile without any problems, in contrast to earlier releases).

1.a. This step is optional; follow it only if you want to use OpenCV contrib modules in your project. Go to the [OpenCV_contrib](https://github.com/opencv/opencv_contrib/releases) GitHub releases page, download the *zip* file and extract the archive. It is important to choose the same version as the OpenCV version you downloaded earlier; again, I downloaded v3.4.1.

2) Download and install [CMake][10]. We will be using the CMake GUI for our purposes, so go ahead and open it. Create an empty directory for the build; I named mine opencv_3_4_1_uwp_build. Now choose the OpenCV source folder by clicking the Browse Source... button and the build folder by clicking the Browse Build... button. Next you need to define CMAKE_SYSTEM_NAME=WindowsStore and CMAKE_SYSTEM_VERSION=10.0 to let CMake know that you want to compile the project for UWP with SDK 10.0.x support. You can do so by clicking the Add Entry button. Next press the Configure button, select Visual Studio 15 2017 and press Finish.

2.a. If you want to compile the opencv_contrib modules, locate the entry OPENCV_EXTRA_MODULES_PATH (you can do so by typing modules in the Search field), set it to the downloaded [path_to_opencv_contrib_source]/modules folder and press the Configure button again.

[Image: CMake GUI configuration]

Now you can configure the OpenCV build options…

Disable these entries: BUILD_opencv_highgui, BUILD_opencv_videoio, BUILD_opencv_python_bindings_generator, BUILD_opencv_java_bindings_generator and BUILD_opencv_ts; if you don't, you won't be able to compile the OpenCV libs without errors. I would also recommend disabling the BUILD_TESTS and BUILD_PERF_TESTS entries to speed up the compilation process later on. Now press Configure followed by Generate; upon completion this will create a VS solution. Open the generated solution, switch to Release, then build the ALL_BUILD project and the INSTALL project.
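
If you prefer to script the configuration rather than clicking through the GUI, a roughly equivalent command line, run from inside the empty build folder, might look like this (a sketch based on the options above; adjust the source path to your setup):

cmake -G "Visual Studio 15 2017" ^
    -DCMAKE_SYSTEM_NAME=WindowsStore ^
    -DCMAKE_SYSTEM_VERSION=10.0 ^
    -DBUILD_opencv_highgui=OFF ^
    -DBUILD_opencv_videoio=OFF ^
    -DBUILD_opencv_python_bindings_generator=OFF ^
    -DBUILD_opencv_java_bindings_generator=OFF ^
    -DBUILD_opencv_ts=OFF ^
    -DBUILD_TESTS=OFF ^
    -DBUILD_PERF_TESTS=OFF ^
    [path_to_opencv_source]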

3) Open [build_folder_created_earlier]/install; here you should find the ready-to-use OpenCV libs as well as the include directory and the license file. Cascade files are also present here (in the etc folder), but they are out of scope for this tutorial, so you can just ignore them.

Creating UWP plugin

1) Open Visual Studio and create a new project. Select Visual C++ -> Windows Universal -> DLL (Universal Windows); name it DevGuerillaCVPlugin and click OK.

2) Now we need to set up paths for the OpenCV include and lib folders. I like to keep things organized and be able to easily share a project between coworkers, so here is one of many ways you could achieve that across the variety of projects we will be implementing. Open your DevGuerillaCVPlugin solution directory and create the folder tree Dependencies/UWP/x86/; inside this directory create include, lib and bin folders. Copy the contents of [opencv_build]/install/include to the include folder, the contents of [opencv_build]/install/x86/vc15/lib to the lib folder and the contents of [opencv_build]/install/x86/vc15/bin to the bin folder. Open the Visual Studio project properties page, set Configuration to Release and Platform to Win32, then in the C/C++->General tab locate the Additional Include Directories field, add ; at the end of the field and enter $(SolutionDir)Dependencies/UWP/$(PlatformTarget)/include. ![Image][12] Open the Linker->General tab and in the Additional Library Directories field enter $(SolutionDir)Dependencies/UWP/$(PlatformTarget)/lib. ![Image][13] Save the changes and close the property pages window.
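
For reference, after saving these settings the relevant fragment of DevGuerillaCVPlugin.vcxproj should look roughly like the sketch below; this is what the property pages write for you, so there is no need to edit it by hand, but it is handy for verifying the setup when sharing the project:

<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
  <ClCompile>
    <AdditionalIncludeDirectories>$(SolutionDir)Dependencies/UWP/$(PlatformTarget)/include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
  </ClCompile>
  <Link>
    <AdditionalLibraryDirectories>$(SolutionDir)Dependencies/UWP/$(PlatformTarget)/lib;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>
  </Link>
</ItemDefinitionGroup>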

3) OK, we are all set up to write some code. Let's add some declarations to our plugin:

DevGuerillaCVPlugin.h

#pragma once

// These macros are used just to keep track of the direction of the arguments
#define IN
#define OUT

namespace CVPlugin {
// Our incoming frame data
typedef struct FRAME_DATA
{
    int width;
    int height;
    int channels;
    unsigned char * pixels;
} FrameData;

// Our first processing function
extern "C" void __declspec(dllexport) __stdcall
processFrameCanny(IN FrameData *, OUT unsigned char *);
}

And now the definitions:

DevGuerillaCVPlugin.cpp

#include "pch.h"
#include "DevGuerillaCVPlugin.h"
#include "opencv2/core.hpp"
#include "opencv2/imgproc.hpp"

// For every opencv module we link against here, the matching dll must later be copied into the Unity project
#pragma comment(lib, "opencv_core341.lib")
#pragma comment(lib, "opencv_imgproc341.lib")

namespace CVPlugin { 
    using namespace cv;

extern "C" void __declspec(dllexport) _stdcall
processFrameCanny(_IN_ FrameData * indata, _OUT_ unsigned char * outdata)
{
    /// Fill in Mat objects
    Mat inmat(Size(indata->width, indata->height), CV_8UC4, &indata->pixels[0]);
    Mat outmat(Size(indata->width, indata->height), CV_8UC1, &outdata[0]);

    /// Convert to grayscale
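    /// (Unity hands us 4-channel RGBA data built from Color32 pixels, hence COLOR_RGBA2GRAY)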
    cvtColor(inmat, outmat, COLOR_RGBA2GRAY);
    /// Blur out image
    GaussianBlur(outmat, outmat, Size(5, 5), 0, 0);
    /// Extract contours
    Canny(outmat, outmat, 20, 50 * 4, 3);
}
}

Now we are ready to build the project, so go ahead and do it. Don't forget to switch Configuration to Release and Platform to x86. It should compile and link without errors. If you do get errors, here are a few troubleshooting tips:

  • #include <opencv2/core.hpp> says cannot open source file. You entered an invalid path in Additional Include Directories in the C/C++->General tab, or forgot to switch Platform/Configuration in the property pages when setting this value.

  • When building the solution you get linker errors, something like: DevGuerillaCVPlugin.obj : error LNK2001: unresolved external symbol "private: char * __thiscall cv::String::allocate(unsigned int)" (?allocate@String@cv@@AAEPADI@Z). As before, but now double-check that Additional Library Directories under Linker->General points to the OpenCV lib folder for the Platform/Configuration you are building.

This should be it. Let's move on and use our plugin in a Unity project.

Calling plugin from Unity

Let's start by creating an empty project with the 3D template selected. Switch the build target to UWP; set the default Quality setting to Very Low; in the Player configuration enable Virtual Reality Supported. I'm using Unity 2019.1, so I don't have any scripting backend option besides IL2CPP; if you are using an older version you might have .NET selected, and my suggestion is to switch to IL2CPP if performance is something you care about. Also enable WebCam in Publish Settings->Capabilities. I assume you are familiar with the basic scene/camera setup for HoloLens, so I will skip explaining it.

Now we need to set up the following functionality:

  • We need to capture data from the HoloLens PVC (PhotoVideoCamera).
  • We want to display this data (at least for debugging purposes).
  • We want to be able to call the plugin function implemented earlier, sending the PVC data as input and getting the processed output back.
  • We want to display the received output (keeping in mind that we receive 1-channel data).

Structure:

  • Create 3 script files CameraController.cs, Compositor.cs, CVPlugin.cs.
  • Create unlit shader AlphaDisplay.shader.
  • Create 2 materials TextureMaterial.mat with Unlit->Texture shader assigned and AlphaTextureMaterial.mat with Unlit->AlphaDisplay shader assigned.
  • Create the following folder structure: Plugins/WSA/x86, and copy the dlls there; right now that's DevGuerillaCVPlugin.dll, opencv_core341.dll and opencv_imgproc341.dll. Also don't forget to include the OpenCV license file in your project.
  • In the scene create 2 quads; these are going to be our displays for the raw image from the camera and the processed one. Since the locatable camera on HoloLens supports only 16:9 aspect ratios, let's resize them accordingly: I set the scale of both to (1, 0.5625, 1), since 9/16 = 0.5625. Place them side by side. Assign TextureMaterial to one of them and name it RawCapture; assign AlphaTextureMaterial to the second and name it ProcessedCapture.

Implementation:

First let's set up our camera stream. Unity provides a class for this purpose, WebCamTexture, which allows us to start a webcam stream and read its pixel data. That's pretty much all we need at this stage.

CameraController.cs

using System.Collections;
using UnityEngine;

public class CameraController : MonoBehaviour {
WebCamTexture webcamTexture;
// color buffer to reduce GC overhead
Color32[] buffer = null;

MaterialPropertyBlock prop;
public MeshRenderer visualizeCapture;

// Basic getters, should be self-explanatory
public bool IsPlaying
{
    get
    {
        return webcamTexture && webcamTexture.isPlaying;
    }
}

public int Width
{
    get
    {
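        // 896x504 @ 30 FPS is one of the video profiles supported by the HoloLens locatable camera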
        return 896;
    }
}

public int Height
{
    get
    {
        return 504;
    }
}

public Color32[] Pixels {
    get
    {
        if (webcamTexture)
            webcamTexture.GetPixels32(buffer);
        return buffer;
    }
}

public static CameraController Instance 
{
    get; private set;
}

void CreateTexture()
{
    // Create Unity webcam texture
    webcamTexture = new WebCamTexture(Width, Height, 30);
    // Init buffer
    buffer = new Color32[Width * Height];
    // Start camera stream
    webcamTexture.Play();
    if (visualizeCapture)
    {
        prop = new MaterialPropertyBlock();
        prop.SetTexture("_MainTex", webcamTexture);
        visualizeCapture.SetPropertyBlock(prop);
    }
}

IEnumerator Start()
{
    if (Instance)
    {
        Destroy(this);
        yield break;
    }
    Instance = this;
    FindWebCams();

    // Ask the user for permission to use the webcam first
    yield return Application.RequestUserAuthorization(UserAuthorization.WebCam);
    if (Application.HasUserAuthorization(UserAuthorization.WebCam))
    {
        Debug.Log("webcam found");
        CreateTexture();
    }
    else
    {
        Debug.Log("webcam not found");
    }
}

void FindWebCams()
{
    // Iterate over available devices
    foreach (var device in WebCamTexture.devices)
    {
        Debug.Log("Name: " + device.name);
    }
}

private void OnDestroy()
{
    if (webcamTexture)
    {
        webcamTexture.Stop();
        Destroy(webcamTexture);
        buffer = null;
        Instance = null;
    }
}
}

Next let’s setup a “bridge” between Unity and plugin implemented earlier.

CVPlugin.cs

using System.Runtime.InteropServices;

// Create a struct layout to match the memory layout of the struct in the plugin
[StructLayout(LayoutKind.Sequential), System.Serializable]
public struct FrameData
{
    public int width;
    public int height;
    public int channels;
    public byte[] pixels;
}

public static class CVPlugin
{
    // Import the function call from our plugin
    [DllImport("DevGuerillaCVPlugin.dll", EntryPoint = "processFrameCanny")]
    public static extern void ProcessFrame(ref FrameData fd, byte[] outData);
}

And finally, putting it all together.

Compositor.cs

using System;
using UnityEngine;

public class Compositor : MonoBehaviour {
// Buffer to hold processed data
byte[] processedPixelsBuffer;
// Buffer to hold raw data
byte[] rawPixelBuffer;
// Texture for processed data
Texture2D processedTexture;

bool processingReady = false;
bool shouldUpdateFrame = false;

public MeshRenderer visualizeProcessing;
MaterialPropertyBlock prop;

bool CreateTexture()
{
    if (processedTexture)
    {
        return true;
    }
    if (CameraController.Instance && CameraController.Instance.IsPlaying)
    {
        var cc = CameraController.Instance;

        // Prepare texture and buffer
        processedTexture = new Texture2D(cc.Width, cc.Height, TextureFormat.Alpha8, false);
        processedPixelsBuffer = new byte[cc.Width * cc.Height];
        rawPixelBuffer = new byte[cc.Width * cc.Height * 4];
        processedTexture.Apply();
        if (visualizeProcessing)
        {
            // Set texture to visualize processed data
            prop = new MaterialPropertyBlock();
            prop.SetTexture("_MainTex", processedTexture);
            visualizeProcessing.SetPropertyBlock(prop);
        }
        return true;
    }
    return false;
}

void Update()
{
    if (processingReady)
    {
        if (shouldUpdateFrame)
        {
            shouldUpdateFrame = false;
            ProcessUpdateFrame();
        }
    }
    else
    {
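        // Intentional assignment: keep trying until the texture can be created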
        if (processingReady = CreateTexture())
        {
            shouldUpdateFrame = true;
        }
    }
}

private void ProcessUpdateFrame()
{
    var cc = CameraController.Instance;
    if (!cc || !cc.IsPlaying)
    {
        Debug.LogError("Something went wrong, CameraController is not valid");
        return;
    }
    Debug.Log("Updating frame");
    var pix = cc.Pixels;
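    // Repack the Color32 structs into a tight RGBA byte buffer for the plugin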
    for (int i = 0; i < pix.Length; i++)
    {
        rawPixelBuffer[i * 4] = pix[i].r;
        rawPixelBuffer[i * 4 + 1] = pix[i].g;
        rawPixelBuffer[i * 4 + 2] = pix[i].b;
        rawPixelBuffer[i * 4 + 3] = pix[i].a;
    }
    try
    {
        FrameData fd = new FrameData()
        {
            width = cc.Width,
            height = cc.Height,
            channels = 4, // Color32 struct has 4 channels (r,g,b,a)
            pixels = rawPixelBuffer
        };
        // Call plugin function
        CVPlugin.ProcessFrame(ref fd, processedPixelsBuffer);
        // Update texture
        processedTexture.LoadRawTextureData(processedPixelsBuffer);
        processedTexture.Apply();
        shouldUpdateFrame = true;
    }
    catch (Exception e)
    {
        Debug.LogError("Error during processing");
        Debug.LogError(e.Message);
    }
}
}

 

Attach both MonoBehaviour scripts to some GameObject in the scene; I added them to the Camera. Assign the renderer from the RawCapture GameObject to the CameraController component's visualizeCapture property. Assign the renderer from the ProcessedCapture GameObject to the Compositor component's visualizeProcessing property. Here is the final layout for the scene:

[Image: final scene layout]

We're almost there, but one thing is still missing: we never implemented AlphaDisplay.shader to properly display a single-channel texture, so let's do that. The processed texture uses the Alpha8 format, whose single channel is sampled as alpha, so the shader copies the alpha value into the RGB channels.

AlphaDisplay.shader

Shader "Unlit/AlphaDisplay"
{
    Properties
    {
        _MainTex ("Texture", 2D) = "white" {}
    }
    SubShader
    {
        Tags { "RenderType"="Opaque" }

    Pass
    {
        CGPROGRAM
        #pragma vertex vert
        #pragma fragment frag

        #include "UnityCG.cginc"

        struct appdata
        {
            float4 vertex : POSITION;
            float2 uv : TEXCOORD0;
        };

        struct v2f
        {
            float2 uv : TEXCOORD0;
            float4 vertex : SV_POSITION;
        };

        sampler2D _MainTex;
        float4 _MainTex_ST;

        v2f vert (appdata v)
        {
            v2f o;
            o.vertex = UnityObjectToClipPos(v.vertex);
            o.uv = TRANSFORM_TEX(v.uv, _MainTex);
            return o;
        }

        fixed4 frag (v2f i) : SV_Target
        {
            // sample the texture
            fixed4 col = tex2D(_MainTex, i.uv);
            // set all colors to alpha value
            col.rgb = col.a;
            return col;
        }
        ENDCG
    }
    }
}

Now you can build the solution and deploy it to your HoloLens. Congratulations, you are all set up to use OpenCV in your project.

Although this is far from being the most functional Computer Vision application, we have set up our foundation. In future articles we will discuss how to improve performance (by introducing a multithreaded processing setup) and usability, as well as how to implement useful debugging functionality without needing to deploy the app to the device.

You can grab the source code from the GitHub repository.

Happy coding.
