Blinking into Morse code / Хабр

Explaining the main algorithm

For a while I’ve been thinking of writing a scientific article, and I wanted it to be of practical use.

Morse code is binary: every symbol is built from only two signals, a dot (short) or a dash (long). I decided that short (s) would be a blink of both eyes, while long (l) would be a blink of the left eye alone. That raised another question: how do I tell where one symbol ends?

The gap between two symbols is marked by a right-eye blink, r. After I enter a single symbol as a series of shorts (dots) and longs (dashes), I blink my right eye once to mark the gap before the next symbol. (In the code below, the right-eye blink is stored as the character "p".)

To separate words, I blink my right eye twice, which gives rr.

As a result, I collect an ordered sequence of symbols (r, l, s) that can be converted into full-fledged text. Once the conversion is done, I get the answer.
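
To make the scheme concrete, here is a minimal sketch that goes in the opposite direction and turns text into the blink sequence a person would have to produce. The helper name encode_to_blinks and the reduced table MORSE_SUBSET are hypothetical and are not part of the original code; the separator defaults to "r" as in the description above, while the article's own code stores it as "p".

```python
# Hypothetical helper, not part of the article's code.
# Only a handful of letters are listed to keep the sketch short.
MORSE_SUBSET = {"E": "s", "T": "l", "I": "ss", "S": "sss", "H": "ssss", "O": "lll"}


def encode_to_blinks(text, sep="r"):
    """Return the blink sequence for text.

    "s" is a blink of both eyes (dot), "l" a left-eye blink (dash),
    one sep separates letters and two sep separate words.
    """
    words = []
    for word in text.upper().split():
        letters = [MORSE_SUBSET[ch] for ch in word]  # KeyError for letters outside the subset
        words.append(sep.join(letters))
    return (sep * 2).join(words)


print(encode_to_blinks("SOS"))    # sssrlllrsss
print(encode_to_blinks("hi to"))  # ssssrssrrlrlll
```

Passing sep="p" produces strings in the format the decoding function later in the article expects.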

Deciphering Python Code

Let’s take a closer look at the functions I used.

eye_aspect_ratio(eye). I use the dlib library for face detection. Then, out of 68 facial landmarks (points spread across the face that trace its outline and features), I pick the 6 that locate each eye. The function determines whether an eye is open or closed by computing two Euclidean distances between the upper and lower eyelids, equally offset from the center (A and B), and the Euclidean distance between the right and left corners of the eye (C). It returns the ratio (A + B) / (2C): the bigger the number, the more open the eye is.

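from scipy.spatial import distance as dist

def eye_aspect_ratio(eye):
    # Vertical distances between the upper and lower eyelid landmarks
    A = dist.euclidean(eye[1], eye[5])
    B = dist.euclidean(eye[2], eye[4])
    # Horizontal distance between the two eye corners
    C = dist.euclidean(eye[0], eye[3])

    # Eye aspect ratio: larger means the eye is more open
    ear = (A + B) / (2.0 * C)
    return ear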
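
As a quick sanity check of the ratio, the sketch below evaluates the same formula on two made-up sets of six points, one for an open eye and one for a nearly closed eye. The coordinates are invented for illustration, and math.dist (Python 3.8+) stands in for the SciPy call so that the example runs without extra packages.

```python
import math


def eye_aspect_ratio_points(eye):
    """Same formula as eye_aspect_ratio, using only the standard library."""
    a = math.dist(eye[1], eye[5])  # first upper/lower eyelid pair
    b = math.dist(eye[2], eye[4])  # second upper/lower eyelid pair
    c = math.dist(eye[0], eye[3])  # corner to corner
    return (a + b) / (2.0 * c)


# Invented landmark coordinates: corner, two upper points, corner, two lower points.
open_eye = [(0, 0), (2, 2), (4, 2), (6, 0), (4, -2), (2, -2)]
closed_eye = [(0, 0), (2, 0.5), (4, 0.5), (6, 0), (4, -0.5), (2, -0.5)]

# The article scales the ratio by 100 and truncates it to an integer.
print(int(100 * eye_aspect_ratio_points(open_eye)))    # 66
print(int(100 * eye_aspect_ratio_points(closed_eye)))  # 16
```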

build_plots(name, value, plot) adds the latest value to a live plot and displays the plot in a window on the screen.

import cv2

def build_plots(name, value, plot):
    # Add the newest value to the live plot and show the rendered plot image
    plot = plot.update(value)
    cv2.imshow(name, plot)

draw_outline(frame, eye) takes a camera frame and the eye coordinates, then outlines the eye with a thin green contour (its convex hull).

def draw_outline(frame, eye):
    eyeHull = cv2.convexHull(eye)
    cv2.drawContours(frame, [eyeHull], -1, (0, 255, 0), 1)

get_args() parses the command-line argument the script requires: the path to the shape predictor file (--shape-predictor).

import argparse

def get_args():
    ap = argparse.ArgumentParser()
    ap.add_argument("-p", "--shape-predictor", required=True, help="path to facial landmark predictor")
    args = vars(ap.parse_args())
    return args
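
One detail that is easy to miss: argparse turns the dash in --shape-predictor into an underscore, which is why the code later reads args["shape_predictor"]. The sketch below shows this by parsing an explicit argument list instead of the real command line. The helper name build_parser is hypothetical, and the file name is only an example value.

```python
import argparse


def build_parser():
    """Hypothetical helper that builds the same parser as get_args()."""
    ap = argparse.ArgumentParser()
    ap.add_argument("-p", "--shape-predictor", required=True,
                    help="path to facial landmark predictor")
    return ap


# Parse an explicit list so the example does not depend on sys.argv.
args = vars(build_parser().parse_args(["-p", "shape_predictor_68_face_landmarks.dat"]))
print(args["shape_predictor"])  # shape_predictor_68_face_landmarks.dat
```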

open_video() loads the face detector and the landmark predictor, opens my front camera, and returns the video stream together with both objects.

import time
import dlib
from imutils.video import VideoStream

def open_video():
    args = get_args()
    print("[INFO] loading facial landmark predictor")

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(args["shape_predictor"])

    print("[INFO] starting video stream thread...")
    vs = VideoStream(0).start()  # 0 selects the default camera
    time.sleep(1)  # give the camera a moment to warm up

    return vs, detector, predictor

calibration(name, flag). Every person’s average EAR (eye aspect ratio) is different, so no universal threshold exists, and this is where calibration comes in handy. It helps to get numbers for several conditions: both eyes open, both eyes closed, left eye closed with right eye open, and vice versa. The code below samples the first two, selected by flag (0 for open, 1 for closed). Each pass beeps, waits 3 seconds, samples for about 5 seconds, and beeps again: after the first beep the video opens and I keep my eyes open until the next beep, then on the second pass I close my eyes and wait patiently for the last beep. From the open-eye samples I return the minimum of the EAR averaged over the left and right eyes; from the closed-eye samples I return the maximum. These two values serve as thresholds for the recording step.

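import imutils
from imutils import face_utils
from cvzone.PlotModule import LivePlot  # assumption: the article does not say where LivePlot comes from

def calibration(name, flag):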
    plotLeft = LivePlot(640, 360, [5, 35], invert=True)
    plotRight = LivePlot(640, 360, [5, 35], invert=True)

    (lStart, lEnd) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
    (rStart, rEnd) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]

    vs, detector, predictor = open_video()
    print('\a')
    time.sleep(3)
    time_start = time.time()
    both_eyes_open = []
    both_eyes_close = []

    while True:
        frame = vs.read()
        frame = imutils.resize(frame, width=450)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        rects = detector(gray, 0)
        for rect in rects:
            shape = predictor(gray, rect)
            shape = face_utils.shape_to_np(shape)

            leftEye = shape[lStart:lEnd]
            rightEye = shape[rStart:rEnd]
            leftEAR = int(100 * eye_aspect_ratio(leftEye))
            rightEAR = int(100 * eye_aspect_ratio(rightEye))

            if flag == 0:
                earAvg = (leftEAR + rightEAR) / 2.0
                both_eyes_open.append(earAvg)
            elif flag == 1:
                earAvg = (leftEAR + rightEAR) / 2.0
                both_eyes_close.append(earAvg)

            draw_outline(frame, leftEye)
            draw_outline(frame, rightEye)

            build_plots("ImagePlotLeft", leftEAR, plotLeft)
            build_plots("ImagePlotRight", rightEAR, plotRight)
        cv2.imshow(name, frame)

        time_dif = time.time() - time_start
        if cv2.waitKey(25) == ord("q"):
            break
        if time_dif > 5:
            print('\a')
            break

    cv2.destroyAllWindows()
    vs.stop()

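    # Fail with a clear message if no face was detected during sampling
    samples = both_eyes_open if flag == 0 else both_eyes_close
    if not samples:
        raise RuntimeError("no face detected during calibration")
    return min(samples) if flag == 0 else max(samples)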
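
To illustrate what the two calibration passes produce, here is a minimal sketch on made-up EAR readings (already scaled by 100, as in the code above). The helper names averaged_samples and thresholds are hypothetical, and the numbers are invented rather than measured.

```python
def averaged_samples(left_values, right_values):
    """Average the left and right EAR for every sampled frame."""
    return [(left + right) / 2.0 for left, right in zip(left_values, right_values)]


def thresholds(open_samples, closed_samples):
    """Return (both_open, both_close) as calibration() would for flag 0 and flag 1."""
    if not open_samples or not closed_samples:
        raise ValueError("calibration needs at least one sample per condition")
    # Lowest value seen with eyes open, highest value seen with eyes closed.
    return min(open_samples), max(closed_samples)


# Invented readings for three frames per condition.
open_samples = averaged_samples([30, 28, 31], [29, 27, 30])
closed_samples = averaged_samples([14, 16, 15], [15, 17, 14])

both_open, both_close = thresholds(open_samples, closed_samples)
print(both_open, both_close)  # 27.5 16.5
```

Taking the minimum of the open readings and the maximum of the closed readings gives the narrowest gap between the two states, so any frame that falls inside that gap is a candidate for a one-eye blink.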

morse_code_from_eyes(). This is the main function of my code. It monitors and analyzes the eyes in real time. Calibration goes first, followed by a beep, after which recording starts. Recording works the same way as calibration, except that now I compare each frame's values with the thresholds obtained during calibration. I use a counter to record only one symbol per blink: after a symbol is stored, the next few frames are skipped. Without the counter, a single blink would span several frames and be stored as many symbols. After all symbols are entered, I press the “q” key to finish recording and close the front camera. The function then returns the recorded string of symbols.

def morse_code_from_eyes():
    both_open = calibration("both_eyes_open", 0)
    both_close = calibration("both_eyes_close", 1)

    plotLeft = LivePlot(640, 360, [5, 35], invert=True)
    plotRight = LivePlot(640, 360, [5, 35], invert=True)

    (lStart, lEnd) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
    (rStart, rEnd) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]

    vs, detector, predictor = open_video()
    print('\a')
    time.sleep(3)

    counter = 0
    points = ""

    while True:
        frame = vs.read()
        frame = imutils.resize(frame, width=450)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        rects = detector(gray, 0)

        for rect in rects:
            shape = predictor(gray, rect)
            shape = face_utils.shape_to_np(shape)

            leftEye = shape[lStart:lEnd]
            rightEye = shape[rStart:rEnd]
            leftEAR = int(100 * eye_aspect_ratio(leftEye))
            rightEAR = int(100 * eye_aspect_ratio(rightEye))

            earAvg = (leftEAR + rightEAR) / 2.0

            draw_outline(frame, leftEye)
            draw_outline(frame, rightEye)

            build_plots("ImagePlotLeft", leftEAR, plotLeft)
            build_plots("ImagePlotRight", rightEAR, plotRight)

            if counter == 0:
                if earAvg <= both_close + 1:
                    points += "s"
                    counter += 1
                    print(earAvg, "ssssssssssssssssssss")
                elif leftEAR - rightEAR >= 0 and earAvg <= both_open - 3:
                    points += "p"
                    counter += 1
                    print(leftEAR - rightEAR, earAvg, "pppppppppppppppppppp")
                elif rightEAR - leftEAR >= 0 and earAvg <= both_open - 3:
                    points += "l"
                    counter += 1
                    print(rightEAR - leftEAR, earAvg, "llllllllllllllllllll")
            else:
                if counter == 5:
                    counter = 0
                else:
                    counter += 1

        cv2.imshow("Frame", frame)
        if cv2.waitKey(25) == ord("q"):
            break

    cv2.destroyAllWindows()
    vs.stop()
    return points
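
The thresholding and the frame counter are easier to follow without the camera code around them. The sketch below replays the same decision logic on a list of made-up (leftEAR, rightEAR) pairs. The function names classify and collect_symbols are hypothetical, and the threshold values are the invented ones from a calibration with both_open = 27.5 and both_close = 16.5.

```python
def classify(left_ear, right_ear, both_open, both_close):
    """Return "s", "p", "l" or None for one frame, using the article's thresholds."""
    ear_avg = (left_ear + right_ear) / 2.0
    if ear_avg <= both_close + 1:
        return "s"  # both eyes closed
    if ear_avg <= both_open - 3:
        # The eye with the lower EAR is the closed one.
        return "p" if left_ear >= right_ear else "l"
    return None  # eyes open, nothing to record


def collect_symbols(frames, both_open, both_close, skip=5):
    """Replay the counter logic: after a symbol, ignore frames until the counter resets."""
    counter = 0
    points = ""
    for left_ear, right_ear in frames:
        if counter == 0:
            symbol = classify(left_ear, right_ear, both_open, both_close)
            if symbol is not None:
                points += symbol
                counter += 1
        elif counter == skip:
            counter = 0
        else:
            counter += 1
    return points


OPEN, BOTH, LEFT, RIGHT = (30, 30), (15, 15), (12, 28), (28, 12)
frames = [OPEN] * 3 + [BOTH] * 4 + [OPEN] * 6 + [LEFT] * 4 + [OPEN] * 6 + [RIGHT] * 4 + [OPEN] * 3

print(collect_symbols(frames, 27.5, 16.5))      # slp
print(collect_symbols([BOTH] * 7, 27.5, 16.5))  # ss
```

The second call shows a limit of this approach: a blink held for longer than the skipped window is recorded twice, so blinks need to be reasonably short.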

text_from_morse_code(points) converts the recorded string of symbols into readable text. In the code, the right-eye blink is stored as "p", so the string consists of p, l and s. First, I define a dictionary whose keys are Morse sequences and whose values are letters. I split the string on "pp" to get separate words. Then, within each word, I split on "p" and convert each sequence into a letter; a sequence that is not in the dictionary becomes "-". Finally, letters are combined into words and words into sentences. The function returns the resulting text.

def text_from_morse_code(points):
    alphabet = {"sl": "A", "lsss": "B", "lsls": "C", "lss": "D", "s": "E",
                "ssls": "F", "lls": "G", "ssss": "H", "ss": "I", "slll": "J",
                "lsl": "K", "slss": "L", "ll": "M", "ls": "N", "lll": "O",
                "slls": "P", "llsl": "Q", "sls": "R", "sss": "S", "l": "T",
                "ssl": "U", "sssl": "V", "sll": "W", "lssl": "X", "lsll": "Y",
                "llss": "Z",
                "sllll": "1", "sslll": "2", "sssll": "3", "ssssl": "4",
                "sssss": "5", "lssss": "6", "llsss": "7", "lllss": "8",
                "lllls": "9", "lllll": "0"}

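    # "pp" (two right-eye blinks) separates words; a single "p" separates letters
    words = []
    for word in points.split("pp"):
        new_word = ""
        for letter in word.split("p"):
            if not letter:
                continue  # skip empty tokens left by leading or trailing separators
            # Unknown sequences become "-" so recognition errors stay visible
            new_word += alphabet.get(letter, "-")
        if new_word:
            words.append(new_word)
    return " ".join(words)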
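
To see the two-level split at work without a camera, the sketch below decodes a few hand-written strings. It uses a reduced table (MORSE_SUBSET) and a hypothetical function name (decode), and it skips empty tokens, which is an assumption about the desired behavior rather than something the article specifies.

```python
# Reduced table for illustration; the full alphabet is in text_from_morse_code().
MORSE_SUBSET = {"s": "E", "l": "T", "ss": "I", "sss": "S", "ssss": "H", "lll": "O"}


def decode(points, table=MORSE_SUBSET):
    """Split on "pp" into words, then on "p" into letters, and look each one up."""
    words = []
    for word in points.split("pp"):
        # Empty tokens appear when the string starts or ends with a separator.
        letters = [table.get(code, "-") for code in word.split("p") if code]
        if letters:
            words.append("".join(letters))
    return " ".join(words)


print(decode("sssplllpsss"))     # SOS
print(decode("sssspsspplplll"))  # HI TO
print(decode("slslpsss"))        # -S  ("slsl" is not in the reduced table)
```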

Conclusion

First you need to run the morse_code_from_eyes() function, and save the result to a variable. After that, pass the resulting string to the text_from_morse_code() function and get the final result.