28 Jul 2021

Tips and Tricks for Taking Screenshots and Selecting Text From Images on Sway

Several days ago I saw someone rant that there were no good programs for copying text from an image. They were using the command-line tool tesseract, but found it too clunky. Their workflow was to take a screenshot of the text they wanted to copy, save it to a file, and run tesseract on that file to generate a new file containing the recognized text. After all that, they would open the resulting text file, copy the text, and delete the file.

This is something that I actually used to do a lot with my Android phone, before they decided to remove the feature for some reason (or at least move it somewhere that I haven’t been able to find for years). So I decided to see how easy it would be to hook up grim and slurp with tesseract to copy text from images the same way.

It turns out, it was not that hard to do:

grim -g "$(slurp)" - | tesseract - - | wl-copy

This works by selecting the region using "$(slurp)", and using grim -g to take a screenshot of that region. The file is then “saved” to standard output (that’s what the - represents here). tesseract is then able to read the image from - (standard input, which it gets from the standard output of grim), and write the text it finds to - (standard output). Finally, we stuff the text that was found using tesseract into the clipboard using wl-copy, and voilà! We now have the text from some image in our clipboard by just drawing a box around it, and we did it with no temporary files on the disk.
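One edge case the one-liner does not cover is a cancelled selection: if slurp exits without a region (for example after pressing Escape), grim receives an empty geometry. Here is a minimal sketch of a guard, assuming slurp returns a non-zero status or prints nothing when cancelled. The name ocr_region is hypothetical, not part of any of these tools:

```shell
#!/bin/sh
# Hypothetical helper: print the text found in a user-selected region.
# Assumes slurp fails or prints nothing when the selection is cancelled.
ocr_region() {
    region="$(slurp)" || return 1   # selection cancelled or slurp failed
    [ -n "$region" ] || return 1    # nothing was selected
    # Screenshot the region to stdout and let tesseract read it from stdin.
    grim -g "$region" - | tesseract - -
}

# Usage: only touch the clipboard when a region was actually selected.
# text="$(ocr_region)" && printf '%s\n' "$text" | wl-copy
```

Keeping wl-copy out of the function means a cancelled selection leaves the current clipboard contents alone.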

However, calling this from the command line each time you want to select text from an image could become cumbersome if it’s something you want to do frequently. I decided to set this up using a wofi menu, and several options for taking screenshots.

Here is the script that I created:

#!/bin/bash
# bash is needed for the += and echo -e used below.

screenshot_copy_all_displays="Screenshot all displays to clipboard"
screenshot_all_displays_to_file="Screenshot all displays to file"
screenshot_copy_area="Screenshot area to clipboard"
screenshot_copy_area_ocr="Screenshot area to copy text"
screenshot_area_to_file="Screenshot area to file"
screenshot_copy_window="Screenshot focused window to clipboard"
screenshot_window_to_file="Screenshot focused window to file"
screenshot_copy_monitor="Screenshot focused monitor to clipboard"
screenshot_monitor_to_file="Screenshot focused monitor to file"

# Store each option in a single string separated by newlines.
options="$screenshot_copy_all_displays\n"
options+="$screenshot_all_displays_to_file\n"
options+="$screenshot_copy_area\n"
options+="$screenshot_copy_area_ocr\n"
options+="$screenshot_area_to_file\n"
options+="$screenshot_copy_window\n"
options+="$screenshot_window_to_file\n"
options+="$screenshot_copy_monitor\n"
options+="$screenshot_monitor_to_file"

# Prompt the user with wofi.
choice="$(echo -e "$options" | wofi -d)"

# Make sure that all pictures are saved in the screenshots folder; stop if it is missing.
cd ~/Pictures/Screenshots || exit 1

case $choice in
    $screenshot_copy_all_displays)
        grim - | wl-copy
        ;;
    $screenshot_all_displays_to_file)
        grim
        ;;
    $screenshot_copy_area)
        grim -g "$(slurp)" - | wl-copy
        ;;
    $screenshot_copy_area_ocr)
        grim -g "$(slurp)" - | tesseract - - | wl-copy
        ;;
    $screenshot_area_to_file)
        grim -g "$(slurp)"
        ;;
    $screenshot_copy_window)
        grim -g "$(swaymsg -t get_tree | jq -j '.. | select(.type?) | select(.focused).rect | "\(.x),\(.y) \(.width)x\(.height)"')" - | wl-copy
        ;;
    $screenshot_window_to_file)
        grim -g "$(swaymsg -t get_tree | jq -j '.. | select(.type?) | select(.focused).rect | "\(.x),\(.y) \(.width)x\(.height)"')"
        ;;
    $screenshot_copy_monitor)
        grim -o "$(swaymsg -t get_outputs | jq -r '.[] | select(.focused) | .name')" - | wl-copy
        ;;
    $screenshot_monitor_to_file)
        grim -o "$(swaymsg -t get_outputs | jq -r '.[] | select(.focused) | .name')"
        ;;
esac

This script assumes that the directory ~/Pictures/Screenshots exists. Either make sure that folder exists, or modify the script to save the files in a different folder.
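Rather than relying on the folder being there, the script could create it on demand. Here is a small sketch, assuming that silently creating the directory is acceptable. The name enter_screenshot_dir is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: create the screenshot directory if needed, then enter it.
enter_screenshot_dir() {
    dir="${1:-$HOME/Pictures/Screenshots}"
    # -p creates missing parents and succeeds if the directory already exists.
    mkdir -p "$dir" && cd "$dir"
}
```

Replacing the cd line with `enter_screenshot_dir || exit 1` would stop the script instead of saving into whatever directory it was started from.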

This script has a bunch of options that I find useful. You can either choose to save the screenshot to a file, or to the clipboard. You can choose whether to screenshot a region, the focused window, the focused monitor, or all monitors. These could of course all be bound to different shortcuts, but I personally like having a single screenshot button for simplicity.
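The menu relies on += and echo -e, which are bash features rather than guaranteed behavior of every /bin/sh. Here is a portable sketch of the same idea using printf, shown with three of the labels. The names menu_options and choose are hypothetical, and wofi -d is the same dmenu-style invocation the script uses:

```shell
#!/bin/sh
# Print one menu entry per line; printf reuses its format for each argument.
menu_options() {
    printf '%s\n' \
        "Screenshot area to clipboard" \
        "Screenshot area to copy text" \
        "Screenshot area to file"
}

# Hypothetical wrapper: show the entries in wofi and print the chosen one.
choose() {
    menu_options | wofi -d
}
```

The result of `choice="$(choose)"` can then be fed to the same case statement.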

Speaking of binding shortcuts, let's go over making this script into a shortcut on Sway. I recommend you save this script to a directory in your PATH. I personally have this saved to ~/.local/bin/screenshot-menu.sh. Remember to mark the script as executable using chmod +x.

Make sure to check that the script works properly for you by running the script from the command line. In my case, I can open a new terminal and type screenshot-menu.sh and the wofi menu will appear.
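If the shell reports that the command cannot be found, the directory holding the script may not be on your PATH. A quick check could look like this sketch, where dir_in_path is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is an entry in the
# colon-separated list $2 (defaults to the current PATH).
dir_in_path() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
# dir_in_path "$HOME/.local/bin" || echo "$HOME/.local/bin is not on PATH"
```

Wrapping both sides in colons avoids false matches on partial directory names.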

If that’s working fine, you can now add a binding to your Sway config to run the script. Add a new line for the binding to your config (mine is located at ~/.config/sway/config). It should look something like:

    bindsym $meh+a exec screenshot-menu.sh

The $meh key is a special modifier on my keyboard; you should use something that makes sense for you.

Happy screenshotting!

Robby Zambito
