Video to ASCII

published 03/13/2022

so i wanted to make a video into text,,, as we all do… Here in this article, I will outline how I did it.

Note

This is all on GitHub here.

Planning

It’s that time again!!! The first idea I had was to convert each image to grayscale, then scale the image down to console size. The grayscale values could then be mapped from the 0-255 range onto 0-9 in order to pick a character for each pixel. For the pixel characters, here is what I came up with:

0 ....................
1 ,,,,,,,,,,,,,,,,,,,,
2 ++++++++++++++++++++
3 ^^^^^^^^^^^^^^^^^^^^
4 oooooooooooooooooooo
5 ********************
6 &&&&&&&&&&&&&&&&&&&&
7 00000000000000000000
8 ####################
9 @@@@@@@@@@@@@@@@@@@@
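To make the 0-255 to 0-9 mapping concrete, here is a minimal sketch. The names `RAMP`, `ramp_index` and `ramp_char` are made up for this illustration and are not taken from the project.

```rust
/// Character ramp from darkest to brightest, matching the table above.
const RAMP: [char; 10] = ['.', ',', '+', '^', 'o', '*', '&', '0', '#', '@'];

/// Map a grayscale value (0-255) onto a ramp index (0-9).
/// Hypothetical helper for illustration only.
fn ramp_index(gray: u8) -> usize {
    // Scale 0..=255 onto 0..=9; integer division rounds down.
    gray as usize * (RAMP.len() - 1) / 255
}

/// Pick the character for a grayscale value.
fn ramp_char(gray: u8) -> char {
    RAMP[ramp_index(gray)]
}
```

Because the division rounds down, only pure white (255) lands on `@`. Rounding to the nearest index instead would spread the characters a bit more evenly.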

I also thought that this would be a good time for dithering. I will talk more about that later.

Programming

Image Preprocessing

The first step was to open and scale the image. For this task, I made use of the image crate. Here is the bit of code for that.

use image::imageops;

// resize() keeps the aspect ratio and fits the result inside 100x100
let img = image::open(i.path()).unwrap();
let img = img
    .resize(100, 100, imageops::FilterType::Triangle)
    .into_rgb8();
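Since `resize` preserves the aspect ratio, the result is usually not exactly 100 by 100. The sketch below shows roughly how a "fit inside the box" size can be worked out. `fit_within` is a hypothetical helper written for this explanation, not the image crate's own code, and the crate's exact rounding may differ. It assumes the source dimensions are non-zero.

```rust
/// Largest (width, height) that fits inside max_w x max_h while keeping
/// the source aspect ratio. Illustrative helper, not the crate's own code.
/// Assumes src_w and src_h are both greater than zero.
fn fit_within(src_w: u32, src_h: u32, max_w: u32, max_h: u32) -> (u32, u32) {
    // Use the tighter of the two scale factors so both sides fit.
    let scale = f64::min(
        max_w as f64 / src_w as f64,
        max_h as f64 / src_h as f64,
    );
    // Round to whole pixels and never collapse a side to zero.
    let w = ((src_w as f64 * scale).round() as u32).max(1);
    let h = ((src_h as f64 * scale).round() as u32).max(1);
    (w, h)
}
```

With this helper, a 1920 by 1080 frame comes out as 100 by 56, so later code should not assume the width and height are equal.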

Image Conversion

The next thing to do was to convert the image into a grayscale pixel array. This function I made takes in the image and returns a 2D array with one value from 0 to 1 per pixel. 0 is black and 1 is white, and everything in between is a gray of some sort.

use image::RgbImage;

fn im_load(img: RgbImage) -> Vec<Vec<f32>> {
    let mut image = Vec::new();
    let dim = img.dimensions();

    // Loop through image pixels, row by row
    for y in 0..dim.1 {
        let mut v = Vec::new();
        for x in 0..dim.0 {
            // Get pixel value as [r, g, b]
            let px = img.get_pixel(x, y).0;

            // Convert to grayscale by summing the channels (0-765)
            let px = px[0] as u16 + px[1] as u16 + px[2] as u16;

            // Average the channels and convert to a fraction (0-1)
            let per = px as f32 / 3.0 / 255.0;

            v.push(per);
        }
        image.push(v);
    }
    image
}
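The function above uses a plain average of the three channels. A common alternative, which is not what this project does, is to weight the channels by how bright they look to the eye. Here is a small sketch using the Rec. 601 luma weights. The function name `luma` is just for illustration.

```rust
/// Convert an RGB pixel to perceived brightness in the range 0.0 to 1.0
/// using the Rec. 601 luma weights. This is a swapped-in alternative to
/// the plain average used in im_load, not the project's own method.
fn luma(px: [u8; 3]) -> f32 {
    let r = px[0] as f32;
    let g = px[1] as f32;
    let b = px[2] as f32;
    // Green contributes most, blue least; the weights sum to 1.
    (0.299 * r + 0.587 * g + 0.114 * b) / 255.0
}
```

With these weights a pure green pixel comes out much brighter than a pure blue one, whereas the plain average treats them the same.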

ASCIIfication

Now onto the most important part of this system: turning the image into text! This function builds a string by picking the best character for each pixel. The difference between the chosen character's brightness and the actual pixel value is then passed on to the surrounding pixels, which is the dithering I mentioned earlier.

const IMG_CHARS: [char; 10] = ['.', ',', '+', '^', 'o', '*', '&', '0', '#', '@'];

fn asciify(mut image: Vec<Vec<f32>>) -> String {
    let dim = (image[0].len(), image.len());

    let mut out = String::new();
    for y in 0..dim.1 as usize {
        for x in 0..dim.0 as usize {
            // Get pixel value (0-1)
            let mut px = image[y][x];
            if px > 1.0 {
                px = 1.0;
            }

            // Convert pixel value (0-1) to a char index (0-9)
            let steps = (IMG_CHARS.len() - 1) as f32;
            let index = (px * steps).floor();
            let chr = IMG_CHARS[index as usize];

            // Get error: actual value minus the brightness of the chosen level
            let err = px - index / steps;

            // Apply error (Floyd–Steinberg dithering) to the pixels not yet
            // visited; edge pixels are skipped so every index stays in bounds
            if x > 0 && x < dim.0 - 1 && y < dim.1 - 1 {
                image[y + 0][x + 1] += err * 7.0 / 16.0;
                image[y + 1][x - 1] += err * 3.0 / 16.0;
                image[y + 1][x + 0] += err * 5.0 / 16.0;
                image[y + 1][x + 1] += err * 1.0 / 16.0;
            }

            // Add char to line
            // Added twice to maintain aspect ratio
            out.push(chr);
            out.push(chr);
        }
        out.push('\n');
    }

    out
}
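The function above skips error diffusion on edge pixels. As a hedged sketch of another way to do it, the version below checks each neighbour individually, so edge pixels still pass their error to whichever neighbours exist. `KERNEL`, `quantize` and `diffuse` are names invented for this example.

```rust
/// Floyd-Steinberg weights as (dx, dy, weight); the weights sum to 1.
const KERNEL: [(isize, isize, f32); 4] = [
    (1, 0, 7.0 / 16.0),
    (-1, 1, 3.0 / 16.0),
    (0, 1, 5.0 / 16.0),
    (1, 1, 1.0 / 16.0),
];

/// Snap `px` down to one of `levels` evenly spaced values between 0 and 1.
/// Returns the level index and the leftover error.
fn quantize(px: f32, levels: usize) -> (usize, f32) {
    let px = px.clamp(0.0, 1.0);
    let steps = (levels - 1) as f32;
    let index = (px * steps).floor();
    (index as usize, px - index / steps)
}

/// Spread `err` from pixel (x, y) onto whichever neighbours exist.
fn diffuse(image: &mut [Vec<f32>], x: usize, y: usize, err: f32) {
    for (dx, dy, weight) in KERNEL {
        let nx = x as isize + dx;
        let ny = y as isize + dy;
        // Skip neighbours that fall off the left or top edge.
        if nx < 0 || ny < 0 {
            continue;
        }
        // get_mut returns None past the right or bottom edge.
        if let Some(cell) = image
            .get_mut(ny as usize)
            .and_then(|row| row.get_mut(nx as usize))
        {
            *cell += err * weight;
        }
    }
}
```

Inside the pixel loop this would be used as `let (index, err) = quantize(image[y][x], IMG_CHARS.len());` followed by `diffuse(&mut image, x, y, err);`. Error aimed at a neighbour outside the image is simply dropped.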

Playback

Once the ASCIIfication is complete, all the frames are stored in a text file to be played back. The player reads it and displays the frames at the correct frame rate.
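The code that writes the text file is not shown in this article. Judging by the player, which splits the text on blank lines, a layout like the one sketched here would work. `join_frames` and `split_frames` are hypothetical names, and the sketch assumes every frame ends with a single newline, the way `asciify` builds them.

```rust
/// Join rendered frames into one string, separated by a blank line.
/// Each frame is assumed to end with a single '\n'.
fn join_frames(frames: &[String]) -> String {
    // Frame text already ends in '\n', so one more '\n' makes the blank line.
    frames.join("\n")
}

/// Split the stored text back into frames on blank lines.
fn split_frames(data: &str) -> Vec<&str> {
    data.split("\n\n").filter(|f| !f.is_empty()).collect()
}
```

This only works because no row inside a frame is ever empty. An empty row would look like a frame separator.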

use std::io::Write;

use rodio::{Decoder, OutputStream, Source};

fn play(data: String, audio: Vec<u8>, fps: u16) {
    // Turn frames per second into milliseconds per frame
    let ms_per_frame = 1000.0 / fps as f32;

    // Get the frames: drop carriage returns, then split on blank lines
    let data = data.replace("\r", "");
    let frames = data.split("\n\n");

    // Play the audio
    let (_stream, stream_handle) = OutputStream::try_default().unwrap();
    let file = std::io::Cursor::new(audio);
    let source = Decoder::new(file).unwrap();
    stream_handle.play_raw(source.convert_samples()).unwrap();

    // Loop through each frame and print it
    for i in frames {
        let start = std::time::Instant::now();
        // Move the cursor to the top left so the frame overwrites the last one
        std::io::stdout().write_all("\x1B[H".as_bytes()).unwrap();
        std::io::stdout().write_all(i.as_bytes()).unwrap();
        std::io::stdout().flush().unwrap();

        // Busy-wait until enough time has gone by before printing the next frame
        while (start.elapsed().as_millis() as f32) < ms_per_frame {}
    }
}
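The loop above spins the CPU while it waits, and it times each frame on its own, so small delays can add up over a long video. Here is a hedged sketch of a different approach that sleeps until a deadline measured from the start of playback. `frame_duration` and `play_frames` are made-up names, the `show` callback stands in for the printing code, and `fps` is assumed to be greater than zero.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Time budget for one frame at the given frame rate (fps must be > 0).
fn frame_duration(fps: u16) -> Duration {
    Duration::from_secs_f64(1.0 / fps as f64)
}

/// Call `show` for every frame, sleeping until each frame's deadline.
/// Deadlines are measured from the start of playback, so one slow frame
/// does not push every later frame back.
fn play_frames<'a, I, F>(frames: I, fps: u16, mut show: F)
where
    I: IntoIterator<Item = &'a str>,
    F: FnMut(&str),
{
    let per_frame = frame_duration(fps);
    let start = Instant::now();
    for (n, frame) in frames.into_iter().enumerate() {
        show(frame);
        // Deadline for the end of frame n, relative to the start.
        let deadline = per_frame * (n as u32 + 1);
        // If we are already late, checked_sub returns None and we skip the sleep.
        if let Some(wait) = deadline.checked_sub(start.elapsed()) {
            thread::sleep(wait);
        }
    }
}
```

In the player this could be called as `play_frames(data.split("\n\n"), fps, |f| { /* print f */ });`. Sleeping is less precise than spinning, but it leaves the CPU free while each frame is on screen.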

Showcase

Here is a little clip from Doja Cat’s Say So in ASCII. It probably looks horrible if you’re on mobile… sorry.

  

Conclusion

This was a really cool project. Much better than doing homework at least! (i am a master of procrastinating)

well im off to go find something else to waste time on,,, those antique instructional films aren’t gonna watch themselves!