
OCR

GSM's OCR feature captures text directly from the game screen using optical character recognition, serving as an alternative to text hooking (Agent/Textractor). It's particularly useful for games that can't be hooked, visual novels with hard-subbed text, or console games played via emulator.

OCR detecting and capturing game text in real-time

How It Works

GSM supports a two-pass OCR pipeline:

  1. OCR1 (Fast Engine) — continuously scans the screen at a configurable rate to detect text changes.
  2. OCR2 (Accurate Engine) — once stable text is detected, a second, more accurate engine refines the result.

The recognized text is sent to the texthooker page just like hooked text, so your normal mining workflow remains the same.
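The two-pass idea can be illustrated with a small simulation. This is only a sketch under stated assumptions, not GSM's actual implementation: `two_pass_ocr`, `fast_ocr`, and `accurate_ocr` are hypothetical names, frames are represented as plain strings instead of images, and "stable" is assumed to mean that two consecutive fast scans returned the same non-empty text.

```python
from typing import Callable, Iterable, Iterator, Optional


def two_pass_ocr(
    frames: Iterable[str],
    fast_ocr: Callable[[str], str],
    accurate_ocr: Callable[[str], str],
) -> Iterator[str]:
    """Yield one refined result each time the fast engine's output settles."""
    previous: Optional[str] = None   # fast-engine text from the previous scan
    last_sent: Optional[str] = None  # stable text that was already forwarded
    for frame in frames:
        text = fast_ocr(frame)
        # Assumed stability rule: non-empty, unchanged since the previous
        # scan, and not already forwarded.
        if text and text == previous and text != last_sent:
            last_sent = text
            yield accurate_ocr(frame)  # second pass on the settled frame only
        previous = text


# Simulated typewriter effect that settles, followed by a new line of text.
frames = ["He", "Hell", "Hello", "Hello", "Hello", "Bye", "Bye"]
results = list(two_pass_ocr(frames, lambda f: f, lambda f: "refined:" + f))
print(results)
```

In this sketch the accurate engine runs only twice for seven frames, which is the point of keeping the slower, more accurate engine out of the continuous scan loop.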

Supported Engines

| Engine | Type | Role | Notes |
|---|---|---|---|
| OneOCR | Local (Windows) | OCR1 / OCR2 | Default OCR1. Fast. Provides word-level bounding boxes for the overlay. |
| Google Lens | Cloud | OCR2 | Default OCR2. High accuracy. Recommended. |
| Bing | Cloud | OCR2 | High accuracy, similar to Google Lens. |
| ScreenAI | Local | OCR1 / OCR2 | Local alternative to Google Lens. Supports overlay coordinates. |
| Gemini | Cloud | OCR2 | High accuracy. Requires API key. |
| MeikiOCR | Local | OCR1 / OCR2 | Experimental. Text detection with bounding-box stability checks. Japanese only. |
| Meiki Text Detector | Local | OCR1 | Experimental, faster variant of MeikiOCR for stability detection. |
| Apple Live Text | Local (macOS) | OCR1 / OCR2 | macOS only. |
| Local LLM OCR | Local | OCR1 / OCR2 | Use a local LLM for OCR. |
| MLKit OCR | Local | OCR1 / OCR2 | Requires Android server. |
| Google Vision / Azure / OCRSpace | Cloud | OCR2 | Enterprise cloud OCR services. Require API keys. |

Setup

1. Select OCR Area

Before OCR can run, you need to define the screen region(s) to scan. The area selector provides three types of selection boxes:

OCR area selector with colored boxes
The area selector showing green, orange, and purple selection boxes

| Box | Input | Purpose |
|---|---|---|
| Green | Left Click | Primary area: dialogue or text that should be captured automatically. |
| Orange | Shift + Left Click | Exclusion zone: text within this area is ignored entirely. |
| Purple | Ctrl + Left Click | Menu area: text captured only via the Manual/Menu OCR hotkey, not automatically. |

There are no hard limits on how many boxes you can create, but keep it minimal for performance.

To set up:

  1. Start your game and ensure OBS is capturing it.
  2. Use the hotkey Ctrl+Shift+O (or the menu option) to open the area selector.
  3. Draw rectangles around the text areas using the appropriate click modifier.

Note: You need a valid OBS scene set up in order to configure OCR for a game. The area configuration is saved per OBS scene (or per window), so each game remembers its own OCR region.

If all text captured automatically falls inside a purple box, that text is ignored. This is useful for suppressing OCR when a game menu is open — text from the menu won't be sent to the texthooker.

Purple box covering a game menu
Purple boxes prevent menu text from being captured during automatic scanning
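The orange and purple rules described above can be sketched as a small filter. This is an illustration under assumptions, not GSM's code: `Rect`, `TextLine`, and `automatic_capture` are hypothetical names, each detected line is reduced to its center point, and a scan that mixes menu text with other text is assumed to be kept in full, since the text above only specifies the case where all text falls inside a purple box.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    w: float
    h: float

    def contains(self, px: float, py: float) -> bool:
        return self.x <= px <= self.x + self.w and self.y <= py <= self.y + self.h


@dataclass(frozen=True)
class TextLine:
    text: str
    cx: float  # center of the detected line, in screen coordinates
    cy: float


def automatic_capture(
    lines: List[TextLine], exclusion: List[Rect], menu: List[Rect]
) -> List[str]:
    """Apply the orange (exclusion) and purple (menu) rules to one scan."""
    # Orange boxes: text inside them is dropped entirely.
    kept = [l for l in lines if not any(r.contains(l.cx, l.cy) for r in exclusion)]
    # Purple boxes: if everything that remains is menu text, send nothing.
    if kept and all(any(r.contains(l.cx, l.cy) for r in menu) for l in kept):
        return []
    return [l.text for l in kept]


menu_boxes = [Rect(0, 0, 100, 100)]
exclusion_boxes = [Rect(200, 0, 50, 50)]
hud = TextLine("HP 120", 210, 10)
print(automatic_capture([TextLine("dialogue", 150, 150), hud], exclusion_boxes, menu_boxes))
print(automatic_capture([TextLine("Save Game", 10, 10), hud], exclusion_boxes, menu_boxes))
```

The first call keeps only the dialogue line because the HUD text sits in an exclusion zone; the second returns nothing because the only remaining text lies inside a menu box.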

Import/Export Boxes

You can import and export your box configuration via the clipboard. This is useful for sharing setups with others playing the same game, or for backing up complex configurations.

Import/Export buttons in the area selector
Import and export OCR area configurations via the clipboard

2. Configure Settings

In GSM's settings under the OCR tab, the settings panel has two modes toggled by the Advanced Mode checkbox in the header.

Basic Mode (default) — ideal for most users. Just set Text Appearance Speed to match how fast text appears in your game and you're done. Engines are pre-configured to OneOCR + Google Lens.

Advanced Mode — exposes engine selection (OCR1/OCR2), scan rate, two-pass settings, and more. See the Settings section below for full details.

3. Start Scanning

OCR can be started in several ways:

  • Automatically via the AutoLauncher when a game is detected.
  • Manually from GSM's menu or system tray.
  • Via hotkey (see below).

Hotkeys

| Hotkey | Function |
|---|---|
| Ctrl+Shift+O | Area Select OCR: select a region and perform one-shot OCR |
| Ctrl+Shift+G | Menu OCR: run OCR on secondary (menu) rectangles |
| Ctrl+Shift+W | Whole Window OCR: run one-shot OCR on the entire window |
| Ctrl+Shift+P | Pause/Resume: toggle continuous OCR scanning |

Settings

The OCR tab has two modes: Basic and Advanced. Toggle between them using the Advanced Mode checkbox in the settings header.

OCR Basic settings mode
Basic mode shows simplified settings for most users

Basic Mode

Basic mode is the default and is suitable for most users. It hides engine selection and scan rate details behind a single dropdown:

| Setting | Description |
|---|---|
| Text Appearance Speed | Controls the scan rate based on how quickly your game displays text. Options: Instant (0.2s, fastest scan, most CPU), Normal (0.5s, default), Slow (0.8s), Very Slow (1.0s, least CPU). |
| Language | OCR language (default: Japanese). |
| Furigana Filter | Slider to filter out furigana from results. |
| Send to Clipboard | Copy recognized text to the clipboard. |
| Hotkeys | Area Select, Manual OCR, Whole Window OCR, and Pause/Resume hotkeys. |

In Basic mode, the engines default to OneOCR (OCR1) and Google Lens (OCR2) with two-pass enabled.
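To make the CPU trade-off behind Text Appearance Speed concrete, the sketch below maps each option to the scan interval listed in the table and derives how many fast scans that implies per minute. The names `SCAN_INTERVAL_SECONDS` and `scans_per_minute` are hypothetical and not part of GSM; only the option names and intervals come from the table above.

```python
# Scan interval in seconds for each Text Appearance Speed option (from the table).
SCAN_INTERVAL_SECONDS = {
    "Instant": 0.2,
    "Normal": 0.5,
    "Slow": 0.8,
    "Very Slow": 1.0,
}


def scans_per_minute(speed: str = "Normal") -> int:
    """Approximate number of OCR1 scans per minute for a given option."""
    return round(60 / SCAN_INTERVAL_SECONDS[speed])


for option in SCAN_INTERVAL_SECONDS:
    print(option, scans_per_minute(option))
```

Under these numbers, Instant triggers five times as many scans as Very Slow, which is why it is the most CPU-intensive option.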

Advanced Mode

Advanced mode exposes full control over the OCR pipeline:

OCR Advanced settings mode
Advanced mode provides full engine and scan rate control

| Setting | Description | Default |
|---|---|---|
| Text Stability (OCR1) | The fast engine that scans continuously to detect text changes. | OneOCR |
| Main OCR (OCR2) | The accurate engine triggered once text stabilizes. | Google Lens |
| Scan Rate | Delay between screen captures in seconds. Lower = faster detection but more CPU. | 0.5 |
| Two-Pass OCR | Use OCR1 for stability detection, then OCR2 for accuracy. Recommended. | Enabled |
| Optimize 2nd Scan | Crop image to detected text region before running OCR2 for better performance. | Enabled |
| Scan Image Quality | Image scale factor (50–100%) before OCR. Lower values are faster but less accurate. | 75% |
| Language | OCR language. | Japanese |
| Furigana Filter | Slider (0–100) to filter out furigana from results. | 0 |
| Send to Clipboard | Copy recognized text to the clipboard. | |
| Keep Newlines | Preserve line breaks in OCR output. | |
| OCR Clipboard Screenshots | Take OCR screenshots and send to clipboard. | |
| Hotkeys | Area Select, Manual OCR, Whole Window OCR, and Pause/Resume. | |
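The geometry behind Optimize 2nd Scan and Scan Image Quality can be sketched with plain arithmetic. This is an assumption-laden illustration rather than GSM's implementation: `union_box` and `scaled_size` are hypothetical helpers, boxes are `(left, top, right, bottom)` pixel tuples, and the padding value and the order in which cropping and scaling are applied are not specified by the settings above.

```python
from typing import Iterable, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def union_box(boxes: Iterable[Box], padding: int, image_size: Tuple[int, int]) -> Box:
    """Smallest padded rectangle covering all boxes, clamped to the image."""
    boxes = list(boxes)
    width, height = image_size
    left = max(0, min(b[0] for b in boxes) - padding)
    top = max(0, min(b[1] for b in boxes) - padding)
    right = min(width, max(b[2] for b in boxes) + padding)
    bottom = min(height, max(b[3] for b in boxes) + padding)
    return (left, top, right, bottom)


def scaled_size(width: int, height: int, quality_percent: int) -> Tuple[int, int]:
    """Image dimensions after applying a 50-100% scale factor."""
    if not 50 <= quality_percent <= 100:
        raise ValueError("quality_percent must be between 50 and 100")
    return (max(1, width * quality_percent // 100),
            max(1, height * quality_percent // 100))


# Two text lines detected by the first pass on a 1920x1080 capture.
detected = [(100, 800, 400, 850), (120, 860, 900, 910)]
print(union_box(detected, padding=10, image_size=(1920, 1080)))
print(scaled_size(1920, 1080, 75))
```

In this example the second pass would only need to process an 820 by 130 pixel region instead of the full frame, and the default 75% quality reduces a 1920 by 1080 capture to 1440 by 810.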

Extra & Debug Tools

A collapsible section at the bottom provides additional options:

| Setting | Description |
|---|---|
| Process Priority | Set OCR process priority on Windows (Low → High). Default: Normal. |
| Default Furigana Sensitivity | Default furigana filter value for new scenes. |
| Optional Dependencies | Install/uninstall optional engine dependencies (Faster PNG, Google Vision, Azure, OCRSpace). |

Overlay Integration

When using engines that provide word-level bounding boxes (OneOCR, MeikiOCR, ScreenAI), GSM can send coordinates to the GSM Overlay. This enables Yomitan dictionary lookups directly on the game screen by hovering over recognized words.
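As a rough illustration of why word-level bounding boxes matter for the overlay, the sketch below finds which recognized word lies under the mouse cursor. It is not the GSM Overlay's code: `WordBox` and `word_at` are hypothetical names, and the coordinates are assumed to be screen pixels in the same space as the cursor position.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class WordBox:
    text: str
    left: int
    top: int
    right: int
    bottom: int


def word_at(words: List[WordBox], x: int, y: int) -> Optional[str]:
    """Return the text of the first word whose box contains the point (x, y)."""
    for word in words:
        if word.left <= x < word.right and word.top <= y < word.bottom:
            return word.text
    return None  # the cursor is not over any recognized word


words = [
    WordBox("今日", 100, 800, 160, 840),
    WordBox("は", 160, 800, 190, 840),
]
print(word_at(words, 170, 820))  # cursor over the second word
print(word_at(words, 0, 0))      # cursor outside all boxes
```

An engine that returns only a single block of text, with no coordinates, gives the overlay nothing to hit-test against, which is why only engines that report bounding boxes support this feature.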

Troubleshooting

Only the first part of text is detected

This happens when game text appears slowly (typewriter effect) and OCR captures the screen before the full line is displayed.

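Fix: Set your game to display text instantly if possible. If not, give OCR more time before it captures: choose a slower Text Appearance Speed in Basic mode, or increase the Scan Rate value (the delay between captures, in seconds) in Advanced mode.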

Prerequisites

  • Windows recommended for OneOCR. Most other engines work cross-platform; Apple Live Text is macOS only and MLKit OCR requires an Android server.
  • OBS connected.
  • Internet connection required for cloud-based engines (Google Lens, Azure, etc.).