What is GameSentenceMiner?
GameSentenceMiner (GSM) is a powerful application designed to supercharge your language learning through video games and visual novels. By automatically capturing context-rich materials from your gaming sessions, GSM transforms every game into an immersive language learning experience.
- Short Demo: Watch this first
- Installation Guide: Full video tutorial
- Discord Community: Join us
Core Features
Anki Card Enhancement
GSM significantly enhances your Anki cards with rich contextual information:
- Voice Audio: Automatically records the voice line associated with the text.
- Screenshot: Captures a screenshot of the game at the moment the voice line is spoken.
- Multi-Line: Capture multiple lines at once, with sentence audio, using GSM's very own Texthooker.
- AI Translation: Integrates AI to provide quick translations of the captured sentence. Custom prompts are also supported. (Optional; bring your own key.)
Game Example (Has Audio)
VN Example (Has Audio)
OCR
GSM runs a fork of OwOCR to provide accurate text capture from games that do not have a hook. Here are some improvements GSM makes on stock OwOCR:
- Easier Setup: With GSM's managed Python install, setup is only a matter of clicking a few buttons.
- Exclusion Zones: Instead of choosing an area to OCR, you can choose an area to exclude from OCR. This is useful if your game has a static interface and text appears at unpredictable positions on screen.
- Two-Pass OCR: To cut down on API calls and keep output clean, GSM features a "Two-Pass" OCR system. A local OCR engine runs continuously, and when the text on screen stabilizes, GSM runs a second, more accurate scan whose result is sent to the clipboard/WebSocket.
- Consistent Audio Timing: With the two-pass system, GSM can still record accurately timed audio and get it into Anki without crazy offsets or hacks.
- More Language Support: Stock OwOCR is hard-coded to Japanese, while in GSM you can use a variety of languages.
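The two-pass idea can be sketched roughly as follows. This is a minimal illustration, not GSM's actual code: the class name `TwoPassGate`, the `accurate_ocr` callback, and the 0.5 second stability threshold are all assumptions made for the example. A cheap first pass feeds text in continuously, and the expensive second pass runs only once the text has stopped changing.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TwoPassGate:
    """Decides when the expensive second OCR pass should run (hypothetical)."""

    accurate_ocr: Callable[[], str]  # assumed second-pass scanner
    stable_seconds: float = 0.5      # assumed stability threshold
    _last_text: str = field(default="", init=False)
    _changed_at: float = field(default=0.0, init=False)
    _emitted: bool = field(default=True, init=False)

    def feed(self, fast_text: str, now: float) -> Optional[str]:
        """Take one first-pass result; return second-pass text once stable."""
        fast_text = fast_text.strip()
        if fast_text != self._last_text:
            # Text is still changing (e.g. a typewriter effect): restart the timer.
            self._last_text = fast_text
            self._changed_at = now
            self._emitted = False
            return None
        if self._emitted or not fast_text:
            # Already sent this text, or the screen has no text at all.
            return None
        if now - self._changed_at >= self.stable_seconds:
            self._emitted = True
            return self.accurate_ocr()
        return None


# Usage: simulated first-pass readings as (text, timestamp in seconds).
gate = TwoPassGate(accurate_ocr=lambda: "こんにちは、世界")
readings = [("こん", 0.0), ("こんにちは", 0.2), ("こんにちは", 0.4),
            ("こんにちは", 0.8), ("こんにちは", 1.0)]
results = [gate.feed(text, t) for text, t in readings]
```

In this sketch only the fourth reading triggers the second pass, so a line that scrolls in character by character produces one output instead of several partial ones.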
Overlay
GSM also features an overlay that allows on-screen Yomitan lookups. While the overlay is on, it scans the screen once each time a text event from any source arrives in GSM. You can then hover over the actual characters in-game for Yomitan lookups and mining.
Stats
GSM has a statistics page that currently has 32 graphs chock full of pretty data.

The stats are not just pretty.
They are designed to help you grow.
Set goals and see exactly what daily tasks you need to do to achieve them:

See all the Kanji you've read in whatever order you want:

And click on them to see every sentence you've read with that Kanji:

Use Anki? Find Kanji you read a lot that aren't in Anki yet:

Clean up your data any way you want with advanced tools:

These statistics aren't just meant to look pretty; they are meant to help you answer questions:
- What can I play to maximise both fun and learning?
- Do I read better in the evening or in the morning?
- Am I progressing in this language?
- How long should I immerse to reach my goals?
Basic Requirements
To get started with GSM, you'll need:
- An Anki card creation tool: Yomitan (recommended), JL, etc.
- A text extraction method: Agent, Textractor, LunaTranslator, or GSM's built-in OCR
- A game in your target language
- OBS Studio: For audio and screenshot capture
- Anki: For creating and reviewing flashcards
Getting Started
Ready to begin? Head over to our Getting Started Guide for detailed installation instructions for your platform.
How GSM Works
GSM works by coordinating multiple tools to create rich, context-aware flashcards:
- Text Event Detection: A texthooker (Agent, Textractor, OCR) captures text from your game, marking the beginning of a voice line
- Audio Capture: GSM uses Voice Activity Detection (VAD) to automatically detect when speech begins and ends
- Screenshot Capture: At the moment of the text event, GSM saves a screenshot from OBS
- Card Enhancement: When you create an Anki card via Yomitan, GSM automatically adds the audio clip and screenshot
This process relies on accurately timed text events to capture corresponding audio. GSM provides extensive settings to accommodate various games and text sources, ensuring consistent results across different setups.
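To make the timing relationship concrete, here is a minimal sketch of how a text event and VAD output could be combined into clip boundaries. It is an illustration under stated assumptions, not GSM's implementation: the function name `clip_bounds`, the representation of speech as sorted `(start, end)` pairs in seconds, and the rule of cutting off at the next text event are all hypothetical.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start, end) of detected speech, in seconds


def clip_bounds(
    text_event: float,
    next_event: Optional[float],
    speech: List[Segment],
) -> Optional[Segment]:
    """Return (start, end) of the voiced audio belonging to one text event.

    `speech` is assumed to be sorted by start time, as a VAD would emit it.
    """
    window_end = next_event if next_event is not None else float("inf")
    # Keep speech overlapping this line's window, clamped to the window.
    in_window = [
        (max(start, text_event), min(end, window_end))
        for start, end in speech
        if end > text_event and start < window_end
    ]
    if not in_window:
        return None  # unvoiced line: nothing to attach
    return (in_window[0][0], in_window[-1][1])


# Usage: two speech bursts follow the first text event, one follows the second.
speech = [(0.2, 1.0), (1.3, 2.4), (5.1, 6.0)]
first_line = clip_bounds(0.0, 5.0, speech)
second_line = clip_bounds(5.0, None, speech)
```

The sketch shows why accurately timed text events matter: if the text event arrives late, the window starts late and the beginning of the voice line is clipped.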
Support & Community
If you encounter issues or have questions:
- Join our Discord server for real-time help
- Check the Troubleshooting Guide for common issues
- Open an issue on GitHub for bug reports
Acknowledgements
- OwOCR for their outstanding OCR implementation, which I've integrated into GSM.
- chaiNNer for the idea of installing Python within an Electron app.
- exSTATic for inspiration for GSM's Stats.
- Jiten.moe for metadata.
- MeikiOCR by rtr46. Make sure to check out his cool project Meikipop if you need something simpler than GSM Overlay.
Donations
If you've found this or any of my other projects helpful, please consider supporting my work through GitHub Sponsors or Ko-fi.