Recurse SP2'23 #3: Towards a cel-shader

I worked late Monday and Tuesday, and today the rest of life just sorta caught up with me. So this post is mostly just research notes.

On Monday, a fellow Recurser, Efron, suggested writing a (2D) cel-shader. I want to pause my work on Vidrato to do some math, so this seems like a great next step.

This seems to have two main components:

Edge detection and manipulation
Color quantization (reducing the image to a much smaller palette, possibly with dithering)

I don’t really know anything about any of this! For now, I’m going to focus on researching the edge detection angle.

OpenCV makes this too easy

I threw a Canny edge detection webcam filter together last weekend, but OpenCV is doing 100% of the legwork. (Funny enough, I got into OpenCV after reading a great intro by RC alum Sher Minn Chong! This article helped a lot with my initial explorations.) I still don’t understand the parameterization since I don’t understand the underlying algorithms yet, but I added some trackbars for today’s post so that the thresholds can be edited interactively.

Here’s a sample video, in which I’m sweeping the low threshold from 0 to 255, sweeping the high threshold, tuning the high threshold, and finally tuning the low threshold.

(Can’t see the video? Here’s an MP4 version for iOS users.)

The code:

#!/usr/bin/env python3

import cv2 as cv
import numpy as np

cap = cv.VideoCapture(0)
# Some webcams report 0 FPS; the 30.0 fallback is an assumed default
fps = cap.get(cv.CAP_PROP_FPS) or 30.0

_, frame = cap.read()
y, x, _ = frame.shape

def noop(x):
    pass

cv.namedWindow('Edgy')
cv.createTrackbar('Low Threshold', 'Edgy', 0, 255, noop)
cv.createTrackbar('High Threshold', 'Edgy', 0, 255, noop)
output = cv.VideoWriter('edgy.mp4',
        cv.VideoWriter_fourcc(*'mp4v'), fps, (x, y))

while True:
    try:
        ret, frame = cap.read()
        if not ret:
            # Stop cleanly if the camera fails to deliver a frame
            break

        blurred = cv.GaussianBlur(frame, (7,7), 0)
        blurred_gray  = cv.cvtColor(blurred, cv.COLOR_BGR2GRAY)

        t1 = cv.getTrackbarPos('Low Threshold', 'Edgy')
        t2 = cv.getTrackbarPos('High Threshold', 'Edgy')
        edges_2d = cv.Canny(blurred_gray, threshold1=t1, threshold2=t2)
        # Black on white instead of white on black
        edges_2d = cv.bitwise_not(edges_2d)
        # Convert back to 3-channel RGB
        edges = cv.cvtColor(edges_2d, cv.COLOR_GRAY2RGB)

        # mirror for monitor, not file
        cv.imshow('Edgy', np.flip(edges, axis=1))
        output.write(edges)

        if cv.waitKey(1) & 0xFF == ord('q'):
            break
    except KeyboardInterrupt:
        break

cap.release()
output.release()
cv.destroyAllWindows()
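
For context on the two trackbars: per the OpenCV documentation, the smaller of the two Canny thresholds is used for edge linking and the larger one for finding initial segments of strong edges. Here is a toy 1D sketch of that hysteresis idea. The `hysteresis_1d` helper and its sample numbers are made up for illustration, and the exact comparisons (`>` versus `>=`) are an assumption. Real Canny does this in 2D, on gradient magnitudes, after further processing.

```python
def hysteresis_1d(magnitudes, low, high):
    """Keep strong samples, plus weak samples connected to a kept one."""
    strong = [m > high for m in magnitudes]
    weak = [low < m <= high for m in magnitudes]
    keep = list(strong)

    # Grow outward from kept samples through neighbouring weak samples
    # until a full pass changes nothing.
    changed = True
    while changed:
        changed = False
        for i, is_weak in enumerate(weak):
            if is_weak and not keep[i]:
                left = i > 0 and keep[i - 1]
                right = i + 1 < len(keep) and keep[i + 1]
                if left or right:
                    keep[i] = True
                    changed = True
    return keep


# Hypothetical "edge strength" values, not taken from a real image.
mags = [10, 60, 120, 70, 5, 65, 8]
print(hysteresis_1d(mags, low=50, high=100))
# [False, True, True, True, False, False, False]
```

The 65 at index 5 clears the low threshold but is dropped because it never touches a strong sample. That would fit with the low threshold mostly controlling how far existing edges get extended.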

Some notes

So, the above code is pretty representative of what I know about edge detection, which is to say all I really know so far is this:

Basically, we’re interested in identifying discontinuities in a data set.
This task seems to boil down to “choice of smoothing function” and “choice of edge strength computation”, along with parameterization of each.
- The smoothing function is meant to reduce the ambiguity around potential edges. An example that helped make this more concrete compared two simple data sets: [5, 7, 6, 4, 152, 148, 149] and [5, 7, 6, 41, 133, 148, 149]. They both have a clear edge after the fourth element, but the second data set could also be argued to have edges after the third and fifth elements.
- Computing edge strength seems to be a matter of computing a first- or second-order derivative and, in the first-order case, considering the gradient’s magnitude at each point.
  - Canny edge detection sounds like the most approachable method, and it seems to rely specifically on Gaussian smoothing.
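
To make the two data sets above concrete, here is a minimal sketch that uses the difference between neighbours as a stand-in for edge strength. The function names and the threshold of 30 are arbitrary choices for illustration, not part of any particular algorithm.

```python
def first_difference(values):
    """Absolute difference between each element and the next one."""
    return [abs(b - a) for a, b in zip(values, values[1:])]


def edges_above(values, threshold):
    """Indices i where the jump from values[i] to values[i + 1] exceeds threshold."""
    return [i for i, d in enumerate(first_difference(values)) if d > threshold]


clean = [5, 7, 6, 4, 152, 148, 149]
ambiguous = [5, 7, 6, 41, 133, 148, 149]

print(first_difference(clean))      # [2, 1, 2, 148, 4, 1]
print(first_difference(ambiguous))  # [2, 1, 35, 92, 15, 1]

# A threshold of 30 is an arbitrary illustration value.
print(edges_above(clean, 30))       # [3]
print(edges_above(ambiguous, 30))   # [2, 3]
```

Index 3 is the jump after the fourth element. In the second data set the answer depends on the threshold, which is the ambiguity that smoothing is meant to reduce.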

My goals:

Understand and implement Gaussian smoothing, which seems like the most common pre-processing step.
- This in turn requires convolution.
- In particular, we’re interested in convolution matrices, which are called kernels in image processing. (The word also shows up in kernel methods in machine learning; I don’t know yet how closely the two are related, and I haven’t gotten around to tackling those.)
Understand and implement the rest of Canny edge detection.
- See the process outlined on Wikipedia. There’s still a lot more to unpack about that.
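
As a starting point for the first goal, here is a rough sketch of a Gaussian kernel and a naive 2D convolution. It uses plain Python lists to keep the arithmetic visible. The names `gaussian_kernel` and `convolve2d`, and the choice to clamp at the image border, are assumptions for illustration; this is not how OpenCV implements it.

```python
import math


def gaussian_kernel(size, sigma):
    """Build a normalized size x size Gaussian kernel (size should be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def convolve2d(image, kernel):
    """Naive 2D convolution; reads past the border reuse the nearest edge pixel."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oy, ox = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # The kernel is indexed "backwards" relative to the image;
                    # that flip is what separates convolution from correlation.
                    sy = min(max(y + oy - j, 0), h - 1)
                    sx = min(max(x + ox - i, 0), w - 1)
                    acc += image[sy][sx] * kernel[j][i]
            out[y][x] = acc
    return out


# A tiny grayscale "image" with a hard vertical edge.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
blurred = convolve2d(image, gaussian_kernel(3, 1.0))
print([round(v) for v in blurred[2]])  # [0, 0, 70, 185, 255, 255]
```

The hard 0-to-255 step becomes a ramp, which is the smoothing effect in miniature. For real images NumPy would be the natural choice, since four nested Python loops get slow quickly.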

In the context of the above code, this means I want to implement my own versions of cv::GaussianBlur and cv::Canny, as well as fast convolution.
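
One possible route to a faster Gaussian blur is separability: a 2D Gaussian kernel is the outer product of two 1D Gaussians, so the blur can be done as a horizontal pass followed by a vertical pass. The sketch below assumes that approach; "fast convolution" could just as well mean something else, such as an FFT-based method. All function names here are hypothetical.

```python
import math


def gaussian_kernel_1d(size, sigma):
    """Normalized 1D Gaussian of odd length `size`."""
    half = size // 2
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with a 1D kernel, clamping at the borders."""
    half = len(kernel) // 2
    w = len(image[0])
    return [
        [sum(k * row[min(max(x + half - i, 0), w - 1)]
             for i, k in enumerate(kernel))
         for x in range(w)]
        for row in image
    ]


def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]


def separable_gaussian_blur(image, size, sigma):
    """Horizontal pass, then vertical pass (done as rows of the transpose)."""
    kernel = gaussian_kernel_1d(size, sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
blurred = separable_gaussian_blur(image, size=3, sigma=1.0)
print([round(v) for v in blurred[2]])  # [0, 0, 70, 185, 255, 255]
```

For a k-by-k kernel this costs about 2k multiplications per pixel instead of k². With the (7,7) kernel used in the script above, that is 14 instead of 49.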

(Around here I switched off to working in a Jupyter notebook for a while, which I’m hoping to have more to share from soon! In the meantime, I’ll have a lot of math to sort through.)