AutoOCR

Repository

Effortless OCR for Windows. Convert screenshots to text instantly and stay productive with a simple global hotkey.

Thread Handling

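Since my goal was to write a minimal program that leaves almost no footprint unless used, I decided to handle hotkey detection on a second background thread. The thread polls the keyboard state every 50 ms and, when Shift + Alt + O is held, runs OCR on the image in the clipboard and replaces it with the recognized text.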

thread::spawn(move || {
    let device_state = DeviceState::new();
    let mut clipboard = Clipboard::new().unwrap();

    // Define path to tessdata relative to the EXE
    let mut tessdata_path = std::env::current_exe().unwrap();
    tessdata_path.pop();
    tessdata_path.push("tessdata");

    loop {
        let keys = device_state.get_keys();

        // Trigger: Shift + Alt + O
        if keys.contains(&Keycode::LShift)
            && keys.contains(&Keycode::LAlt)
            && keys.contains(&Keycode::O)
        {
            if let Ok(image) = clipboard.get_image() {
                if let Some(text) = perform_ocr(&image, &tessdata_path) {
                    let cleaned = text.trim().to_string();
                    if !cleaned.is_empty() {
                        let _ = clipboard.set_text(cleaned);
                        notify("AutoOCR", "Text copied to clipboard!");
                    } else {
                        notify("AutoOCR", "OCR finished, but no text found.");
                    }
                } else {
                    notify("AutoOCR", "OCR Failed. Check tessdata folder.");
                }
            }
            // Cooldown to prevent multiple triggers in one press
            thread::sleep(Duration::from_millis(1000));
        }
        // Low sleep to keep CPU usage minimal
        thread::sleep(Duration::from_millis(50));
    }
});
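
The loop above relies on a one-second cooldown so that a single key press does not trigger OCR several times. As a possible alternative, here is a minimal sketch of edge-triggered detection, which fires only when the combination goes from "not held" to "held". It is not taken from AutoOCR: the `Key` enum, `hotkey_down`, and `HotkeyEdge` are hypothetical names made up for this illustration, and `Key` merely stands in for whatever key type the polling library returns.

```rust
/// Hypothetical stand-in for the key type returned by the polling library.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    LShift,
    LAlt,
    O,
    A,
}

/// The combination that should trigger OCR: Shift + Alt + O.
pub const HOTKEY: [Key; 3] = [Key::LShift, Key::LAlt, Key::O];

/// Returns true while every key of the combination is currently held.
pub fn hotkey_down(keys: &[Key]) -> bool {
    HOTKEY.iter().all(|k| keys.contains(k))
}

/// Remembers the previous poll so that a held hotkey fires only once.
#[derive(Default)]
pub struct HotkeyEdge {
    was_down: bool,
}

impl HotkeyEdge {
    /// Feed one poll result. Returns true only on the transition
    /// from "not held" to "held".
    pub fn update(&mut self, keys: &[Key]) -> bool {
        let is_down = hotkey_down(keys);
        let fired = is_down && !self.was_down;
        self.was_down = is_down;
        fired
    }
}
```

In a polling loop, `update` would be called once per iteration with the current key list, and OCR would run only when it returns `true`. Holding the keys for longer than a second would then no longer start a second OCR run, at the cost of a little extra state.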

System Tray Menu

Since AutoOCR runs as a background task, I felt it didn't need to be visible in the taskbar and moved it into the system tray. Here you can see the simple setup for keeping it running in the tray.

// Setup Tray Menu
let tray_menu = Menu::new();
let quit_item = MenuItem::new("Quit AutoOCR", true, None);
let quit_id = quit_item.id();
tray_menu.append(&quit_item).unwrap();

let icon = load_icon();
// Keep this variable in scope to keep the tray icon alive
let _tray_icon = TrayIconBuilder::new()
    .with_menu(Box::new(tray_menu))
    .with_tooltip("AutoOCR - Shift+Alt+O")
    .with_icon(icon)
    .build()
    .unwrap();

Performing the Optical Character Recognition

This function initializes the Tesseract OCR engine and performs optical character recognition. It walks over the clipboard image in 4-byte RGBA chunks to strip the alpha channel, re-encodes the resulting RGB pixels as an in-memory PNG, passes that PNG to Tesseract, and returns the recognized text.

fn perform_ocr(img: &ImageData, path: &PathBuf) -> Option<String> {
    // Check for existence to avoid crashing
    if !path.exists() {
        return None;
    }

    // Initialize with your 5 language packs
    let mut lt = LepTess::new(Some(path.to_str()?), "eng+deu+hin+pol+rus").ok()?;

    // Convert RGBA to RGB (strip Alpha channel)
    let mut rgb_data = Vec::with_capacity(img.width * img.height * 3);
    for chunk in img.bytes.chunks_exact(4) {
        rgb_data.push(chunk[0]); // R
        rgb_data.push(chunk[1]); // G
        rgb_data.push(chunk[2]); // B
    }

    // Prepare image for Tesseract via Leptonica
    let img_buffer = RgbImage::from_raw(img.width as u32, img.height as u32, rgb_data)?;
    let mut buffer = std::io::Cursor::new(Vec::new());
    DynamicImage::ImageRgb8(img_buffer)
        .write_to(&mut buffer, image::ImageFormat::Png)
        .ok()?;

    lt.set_image_from_mem(buffer.get_ref()).ok()?;
    lt.get_utf8_text().ok()
}
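
To make the alpha-stripping step easier to see in isolation, here is a small standalone sketch of the same RGBA to RGB conversion using only the standard library. The function name `rgba_to_rgb` and the explicit length check are assumptions added for illustration; the function above does the conversion inline and does not perform this check itself.

```rust
/// Drops the alpha byte from tightly packed RGBA pixels.
///
/// Returns `None` if the buffer length does not equal `width * height * 4`
/// (this check is an addition for the sketch, not part of the original code).
pub fn rgba_to_rgb(rgba: &[u8], width: usize, height: usize) -> Option<Vec<u8>> {
    let pixels = width.checked_mul(height)?;
    if rgba.len() != pixels.checked_mul(4)? {
        return None;
    }

    let mut rgb = Vec::with_capacity(pixels * 3);
    for px in rgba.chunks_exact(4) {
        // Copy R, G, B and skip the fourth byte (alpha).
        rgb.extend_from_slice(&px[..3]);
    }
    Some(rgb)
}
```

For example, the two RGBA pixels `[255, 0, 0, 255, 0, 128, 64, 0]` with a width of 2 and a height of 1 become the six RGB bytes `[255, 0, 0, 0, 128, 64]`.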

Try it out yourself!

Download

If you want to try this app out for yourself, click on the download button and install from GitHub.