I am looking into using my OAK-D Lite (or Pro) to stream into StreamDiffusion for Img2Img. I thought this would be an easy task, since there are quite a lot of examples in OAKExamples.toe, but to be honest, I don't understand much of what's going on in there.
What I want to achieve is streaming RGB as the image prompt and the Sobel edge detection stream for ControlNet, but the examples only do one or the other, and I really don't understand how to merge these two together.
I looked for tutorials, but most of the YouTube material is quite outdated. I wasn't able to find good help getting started, beyond how to use the ready-made examples, without ending up in a lot of Python coding work I was hoping to avoid.
I am quite sure, though, that in the end getting these streams merged will be no complex task at all.
Can anyone help me out with some hints, links, documentation, or tutorials?
I really have no clue where to start without hard-coding everything myself (which is what I want to avoid; that's why I use TD for this project).
As I said: it's just two outputs, showing RGB and edge detection. That's all I need.
I used the "rgbStreamAndControl" example in the OAK file as the starting point here and edited the createPipeline() function a bit:
import depthai as dai  # usually already imported at the top of the callbacks DAT

def createPipeline(oakDeviceOp):
    # This example creates an RGB camera and adds an on-device Sobel edge detector.
    # https://docs.luxonis.com/projects/api/en/latest/samples/ColorCamera/rgb_preview/
    # https://docs.luxonis.com/projects/api/en/latest/components/nodes/color_camera/
    # https://docs.luxonis.com/projects/api/en/latest/components/messages/camera_control/

    # Get custom parameters
    OakProject = parent.OakProject
    fps = OakProject.par.Rgbfps.eval()
    # The resolution parameter stores the enum as a string, hence the eval()
    resolution = eval(OakProject.par.Rgbresolution.eval())
    ispscale1, ispscale2 = OakProject.parGroup.Ispscale.eval()

    # Create pipeline
    pipeline = dai.Pipeline()

    # Define source and outputs
    controlIn = pipeline.create(dai.node.XLinkIn)
    camRgb = pipeline.create(dai.node.ColorCamera)
    xoutVideo = pipeline.create(dai.node.XLinkOut)
    # Create an out node for the edge stream as well as the edge detector node
    xoutEdge = pipeline.create(dai.node.XLinkOut)
    edgeDetector = pipeline.create(dai.node.EdgeDetector)

    # Camera properties (set these before querying the video size below,
    # so the edge detector buffer matches the actual frame size)
    camRgb.setBoardSocket(dai.CameraBoardSocket.RGB)
    camRgb.setResolution(resolution)
    camRgb.setIspScale(ispscale1, ispscale2)
    camRgb.setFps(fps)

    # Initial settings for the edge detector; the output is single-channel,
    # so one byte per pixel is enough
    edgeDetector.setMaxOutputFrameSize(camRgb.getVideoWidth() * camRgb.getVideoHeight())
    # Standard 3x3 Sobel kernels for horizontal and vertical gradients
    sobelHorizontalKernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    sobelVerticalKernel = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
    edgeDetector.initialConfig.setSobelFilterKernels(sobelHorizontalKernel, sobelVerticalKernel)

    # Name all streams
    controlIn.setStreamName('cameraControl')
    xoutVideo.setStreamName('rgb')
    xoutEdge.setStreamName('edge')

    # Link the nodes together: the RGB video feeds both its own output
    # and the edge detector, whose result goes to the second output
    controlIn.out.link(camRgb.inputControl)
    camRgb.video.link(xoutVideo.input)
    camRgb.video.link(edgeDetector.inputImage)
    edgeDetector.outputImage.link(xoutEdge.input)

    return pipeline
The main thing is to create the EdgeDetector depthai node, connect the RGB stream to it, and connect the detector's output to the previously created XLinkOut node.
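Outside of TouchDesigner (where the OAK Device OP handles the stream queues for you), a condensed standalone version of the same pipeline can be tested directly with depthai and OpenCV. This is just a sketch for testing; the fixed resolution and FPS below are stand-ins for the parent.OakProject parameters:

import cv2
import depthai as dai

# Condensed standalone version of createPipeline() above, with the
# TouchDesigner parameter lookups replaced by fixed values
pipeline = dai.Pipeline()

camRgb = pipeline.create(dai.node.ColorCamera)
edgeDetector = pipeline.create(dai.node.EdgeDetector)
xoutVideo = pipeline.create(dai.node.XLinkOut)
xoutEdge = pipeline.create(dai.node.XLinkOut)

xoutVideo.setStreamName('rgb')
xoutEdge.setStreamName('edge')

camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
camRgb.setFps(30)
edgeDetector.setMaxOutputFrameSize(camRgb.getVideoWidth() * camRgb.getVideoHeight())

# Same linking as above: RGB feeds both its own output and the edge detector
camRgb.video.link(xoutVideo.input)
camRgb.video.link(edgeDetector.inputImage)
edgeDetector.outputImage.link(xoutEdge.input)

with dai.Device(pipeline) as device:
    rgbQueue = device.getOutputQueue('rgb', maxSize=4, blocking=False)
    edgeQueue = device.getOutputQueue('edge', maxSize=4, blocking=False)
    while True:
        cv2.imshow('rgb', rgbQueue.get().getCvFrame())    # NV12 converted to BGR
        cv2.imshow('edge', edgeQueue.get().getCvFrame())  # single-channel edges
        if cv2.waitKey(1) == ord('q'):
            break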
It kind of works like TouchDesigner; I just always have to consult the documentation as well.
As I supposed, it isn't actually that complicated; it's just about understanding the ideas behind the nodes in the code. Thanks for helping me get the knots out of my head. It works fine, and I can now decide where to get the edges from (RGB or one of the IRs), which is great news.
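For anyone wanting the IR variant: switching the edge source is just a matter of relinking inside createPipeline(). A rough sketch (untested on my side) using the MonoCamera node and the left socket, assuming the pipeline and edgeDetector objects from the function above:

# Inside createPipeline(): feed the edge detector from the left IR mono
# camera instead of the RGB video stream
monoLeft = pipeline.create(dai.node.MonoCamera)
monoLeft.setBoardSocket(dai.CameraBoardSocket.LEFT)
monoLeft.setResolution(dai.MonoCameraProperties.SensorResolution.THE_400_P)

# Size the edge detector buffer for the mono frames instead of the RGB ones
edgeDetector.setMaxOutputFrameSize(
    monoLeft.getResolutionWidth() * monoLeft.getResolutionHeight())

# This replaces camRgb.video.link(edgeDetector.inputImage)
monoLeft.out.link(edgeDetector.inputImage)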