What is Capture2Text?
Capture2Text enables users to quickly OCR a portion of the screen using a keyboard shortcut. The resulting text will be saved to the clipboard by default.
Capture2Text is free and licensed under the terms of the GNU General Public License.
Download
The latest version can be found on the Capture2Text download page hosted by SourceForge.
System Requirements
Supported operating systems:
- Windows 7
- Windows 8/8.1
- Windows 10
Note: Windows XP support has been dropped as of Capture2Text v4.0.
How to Launch Capture2Text (no installation required)
- Unzip the contents of the zip file.
- Double-click Capture2Text.exe. The Capture2Text icon should appear in the bottom-right corner of your screen. If it is hidden, click the 'Show hidden icons' arrow to reveal it.
Installing Additional OCR Languages
By default Capture2Text comes packaged with the following languages: English, French, German, Japanese, Korean, Russian, and Spanish.
Follow these steps if you would like to install additional OCR languages:
- Download the appropriate OCR language dictionary.
- Open the '.zip' file you just downloaded with 7-Zip or similar decompression software.
- Drag all files contained within the zip file to the tessdata folder.
- Restart Capture2Text.
Afrikaans (afr) | Greek (ell) | Odiya (ori) |
Albanian (sqi) | Gujarati (guj) | Panjabi (pan) |
Amharic (amh) | Haitian (hat) | Persian (fas) |
Ancient Greek (grc) | Hebrew (heb) | Polish (pol) |
Arabic (ara) | Hindi (hin) | Portuguese (por) |
Assamese (asm) | Hungarian (hun) | Pushto (pus) |
Azerbaijani (aze) | Icelandic (isl) | Romanian (ron) |
Basque (eus) | Indic (inc) | Russian (rus) |
Belarusian (bel) | Indonesian (ind) | Sanskrit (san) |
Bengali (ben) | Inuktitut (iku) | Serbian (srp) |
Bosnian (bos) | Irish (gle) | Sinhala (sin) |
Bulgarian (bul) | Italian (ita) | Slovak (slk) |
Burmese (mya) | Japanese (jpn) | Slovenian (slv) |
Catalan (cat) | Javanese (jav) | Spanish (spa) |
Cebuano (ceb) | Kannada (kan) | Swahili (swa) |
Central Khmer (khm) | Kazakh (kaz) | Swedish (swe) |
Cherokee (chr) | Kirghiz (kir) | Syriac (syr) |
Chinese - Simplified (chi_sim) | Korean (kor) | Tagalog (tgl) |
Chinese - Traditional (chi_tra) | Kurukh (kru) | Tajik (tgk) |
Croatian (hrv) | Lao (lao) | Tamil (tam) |
Czech (ces) | Latin (lat) | Telugu (tel) |
Danish (dan) | Latvian (lav) | Thai (tha) |
Dutch (nld) | Lithuanian (lit) | Tibetan (bod) |
Dzongkha (dzo) | Macedonian (mkd) | Tigrinya (tir) |
English (eng) | Malay (msa) | Turkish (tur) |
Esperanto (epo) | Malayalam (mal) | Uighur (uig) |
Estonian (est) | Maltese (mlt) | Ukrainian (ukr) |
Finnish (fin) | Marathi (mar) | Urdu (urd) |
Frankish (frk) | Math/Equations (equ) | Uzbek (uzb) |
French (fra) | Middle English (1100-1500) (enm) | Vietnamese (vie) |
Galician (glg) | Middle French (1400-1600) (frm) | Welsh (cym) |
Georgian (kat) | Nepali (nep) | Yiddish (yid) |
German (deu) | Norwegian (nor) |
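The installation steps above could also be scripted. The following is a minimal sketch and is not part of Capture2Text: the function name `install_language_pack` is hypothetical, and it assumes the downloaded dictionary is a plain .zip archive whose files all belong directly in the tessdata folder (any folders inside the archive are flattened).

```python
import zipfile
from pathlib import Path, PurePosixPath


def install_language_pack(zip_path, tessdata_dir):
    """Extract every file in the archive directly into tessdata_dir.

    Hypothetical helper: folders inside the archive are flattened so
    that all files land in the tessdata folder itself.
    Returns the sorted list of file names that were written.
    """
    tessdata = Path(tessdata_dir)
    tessdata.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Zip entries always use '/' separators; keep only the file name.
            name = PurePosixPath(info.filename).name
            (tessdata / name).write_bytes(archive.read(info))
            written.append(name)
    return sorted(written)
```

Capture2Text still has to be restarted afterwards so that the new language is picked up.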
How to Perform a Standard OCR Capture
Follow these steps to perform a standard OCR capture using the capture box:
- Position your mouse pointer at the top-left corner of the text that you want to OCR.
- Press the OCR hotkey (Windows Key + Q) to begin an OCR capture.
- Move your mouse to resize the blue capture box over the text that you want to OCR. You may hold down the right mouse button and drag to move the entire capture box.
- Press the OCR hotkey again (or left-click or press ENTER) to complete the OCR capture. The OCR'd text will be placed in the clipboard and a popup showing the captured text will appear (the popup may be disabled in the settings).
As with all OCR captures, you must manually select the language that you would like to OCR from the settings.
To change the OCR language, right-click the Capture2Text tray icon, select the OCR Language option and then select the desired language.
To quickly switch between 3 languages, use the OCR language quick access keys: Windows Key + 1, Windows Key + 2, and Windows Key + 3. The quick access languages may be specified in the settings.
When Chinese or Japanese is selected, you should specify the text direction (vertical/horizontal/auto) using the text direction hotkey: Windows Key + O. If auto is selected, horizontal will be used when the capture width is more than twice the height, otherwise vertical will be used. The text direction also affects how furigana is stripped from Japanese text.
(For Japanese) Capture2Text will attempt to automatically strip out furigana.
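The auto text direction rule described above can be written as a short function. This is an illustrative sketch only: the function name and the string values 'auto', 'horizontal', and 'vertical' are assumptions, not Capture2Text identifiers.

```python
def resolve_text_direction(setting, width, height):
    """Return 'horizontal' or 'vertical' for a capture of the given size.

    An explicit setting is used as-is. With 'auto', horizontal is chosen
    only when the capture is more than twice as wide as it is tall.
    """
    if setting in ("horizontal", "vertical"):
        return setting
    if setting != "auto":
        raise ValueError(f"unknown text direction: {setting!r}")
    return "horizontal" if width > 2 * height else "vertical"
```

Under this rule a 300x100 capture is treated as horizontal, while a 200x100 capture (exactly twice as wide as tall, not more) is treated as vertical.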
How to Perform a Text Line OCR Capture
Capture2Text can automatically capture the line of text that is closest to the mouse pointer.
Follow these steps to perform a Text Line OCR Capture:
- Position your mouse pointer on or near the line of text to capture.
- Press the Text Line OCR Capture hotkey (Windows Key + E).
- Capture2Text will outline the captured text and save the OCR result to the clipboard.
How to Perform a Forward Text Line OCR Capture
Capture2Text can automatically capture the line of text starting at the character that is closest to the mouse pointer and working forward.
Follow these steps to perform a Forward Text Line OCR Capture:
- Position your mouse pointer on or near the character to start at.
- Press the Forward Text Line OCR Capture hotkey (Windows Key + W).
- Capture2Text will outline the captured text and save the OCR result to the clipboard.
How to Perform a Bubble OCR Capture
Capture2Text can automatically capture text contained within a comic book speech/thought bubble as long as the bubble is completely enclosed.
Follow these steps to perform a Bubble OCR Capture:
- Position your mouse pointer in the empty part of the bubble (not on the text).
- Press the bubble OCR Capture hotkey (Windows Key + S).
- Capture2Text will outline the captured text and save the OCR result to the clipboard.
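This page does not describe how Capture2Text finds the bubble. One common way to locate an enclosed region is a flood fill from the pointer position, which also shows why the bubble must be completely enclosed and why the pointer must be on an empty part of it. The sketch below is an assumption for illustration, not Capture2Text's actual algorithm; it works on a hypothetical binarized image where 0 is empty space and 1 is ink (outline or text).

```python
from collections import deque


def find_bubble_bounds(pixels, start_x, start_y, empty=0):
    """Flood-fill the empty area around (start_x, start_y).

    Returns (min_x, min_y, max_x, max_y) of the filled area, or None if
    the start pixel is not empty or the fill reaches the image edge
    (meaning the region is not enclosed).
    """
    height, width = len(pixels), len(pixels[0])
    if pixels[start_y][start_x] != empty:
        return None  # pointer is on ink, not on the empty part
    seen = {(start_x, start_y)}
    queue = deque(seen)
    min_x = max_x = start_x
    min_y = max_y = start_y
    while queue:
        x, y = queue.popleft()
        if x in (0, width - 1) or y in (0, height - 1):
            return None  # fill leaked to the border: bubble is open
        min_x, max_x = min(min_x, x), max(max_x, x)
        min_y, max_y = min(min_y, y), max(max_y, y)
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (nx, ny) not in seen and pixels[ny][nx] == empty:
                seen.add((nx, ny))
                queue.append((nx, ny))
    return (min_x, min_y, max_x, max_y)
```

With a closed outline the fill stops at the outline and the returned box surrounds the text inside it. With a gap in the outline the fill escapes to the image border and no bubble is found.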
How to Specify the Active OCR Language
To specify the active OCR language, right-click the tray icon, click on OCR Language, and select an OCR language from the list.
Translation
To enable the translation feature, start by opening the settings dialog (right-click tray icon and select 'Settings...'), and clicking on the Translate tab.
Check the 'Append translation to clipboard' checkbox to append the translated text to the clipboard using the provided separator.
Check the 'Show translation in popup window' checkbox to display the translated text alongside the OCR text in the popup window.
Each installed OCR language may be translated to a different language.
Note 1: Some OCR languages do not have translation support. Unsupported languages will not be displayed.
Note 2: The translation feature requires Internet access.
Settings
Right-click the Capture2Text tray icon in the bottom-right of your screen and then select the 'Settings...' option to bring up the Settings dialog. You may hover over many of the option labels to display a helpful tooltip explaining the option.
The Hotkeys tab allows you to specify which key and modifiers to use for each hotkey. To disable a hotkey, select '<Unmapped>' from the drop-down list.
Current OCR language: Specify the active OCR language to use. You may also specify the active OCR language in the tray icon menu.
Quick-Access Languages: The languages to use for each of the quick-access language hotkeys.
Whitelist: Inform the OCR engine that the captured text will only contain the provided characters.
Blacklist: Inform the OCR engine that the captured text will never contain the provided characters.
Text Orientation: The orientation of the text that will be captured. This option is only used when Chinese or Japanese is set as the active OCR language. If Auto is selected, horizontal will be used when the capture width is more than twice the height, otherwise vertical will be used. The text direction also affects how furigana is stripped from Japanese text. You may also specify the text orientation in the tray icon menu or with the Text Orientation hotkey.
Tesseract Config File: An advanced feature that allows you to specify a Tesseract config file.
Trim Capture: During OCR preprocessing, trim captured image to foreground pixels and add a thin border. OCR accuracy will be more consistent and may even be improved.
Deskew Capture: During OCR preprocessing, attempt to compensate for slanted text found in an OCR capture.
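To make the Trim Capture option described above more concrete, here is a minimal sketch of trimming an image to its foreground pixels and adding a thin border. It is not Capture2Text's implementation: it assumes an already binarized image stored as a list of rows (0 for background, non-zero for foreground), and the function name and the default border width of 1 pixel are hypothetical.

```python
def trim_capture(pixels, border=1, background=0):
    """Crop to the bounding box of the foreground, then pad with a border."""
    rows = [y for y, row in enumerate(pixels)
            if any(p != background for p in row)]
    if not rows:
        return [list(row) for row in pixels]  # no foreground: nothing to trim
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] != background for row in pixels)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    cropped = [list(row[left:right + 1]) for row in pixels[top:bottom + 1]]

    # Surround the cropped image with `border` background pixels on each side.
    width = (right - left + 1) + 2 * border
    side = [background] * border
    result = [[background] * width for _ in range(border)]
    result += [side + row + side for row in cropped]
    result += [[background] * width for _ in range(border)]
    return result
```

Because the margin around the text is always the same after trimming, the OCR engine sees a similar layout no matter how loosely the capture box was drawn, which is one plausible reason the results become more consistent.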
Contains options for configuring the automatic captures. Hover over the option labels for more information.
Allows you to specify the colors of the OCR Capture Box. The transparency can be changed by adjusting the 'Alpha channel' value in the color selection dialog.
Allows you to specify the preview position, color, and font. You may disable the preview by unchecking the 'Show Preview Box' checkbox.
Save to clipboard: Save the captured OCR text to the clipboard.
Show popup window: Show the captured OCR text in a popup window.
Keep line breaks: Check this option if you don't want carriage returns and line feeds to be stripped from the captured text.
Logging: Allows you to save all captures to the specified file in the specified format. The following tokens may be used in the format: ${capture}, ${translation}, ${timestamp}, ${linebreak}, ${tab}. The default format is: '${capture}${linebreak}'.
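As a rough illustration of how such a format string expands, the sketch below substitutes the five logging tokens in a single pass. It is not Capture2Text code: the function name is hypothetical, and the timestamp layout and the use of '\n' for ${linebreak} are assumptions, since this page does not specify them.

```python
import re
from datetime import datetime

# Matches only the five documented tokens.
TOKEN = re.compile(r"\$\{(capture|translation|timestamp|linebreak|tab)\}")


def format_log_entry(fmt, capture, translation="", timestamp=None):
    """Expand the logging tokens in fmt and return the resulting text."""
    if timestamp is None:
        # Assumed layout; the real timestamp format may differ.
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    values = {
        "capture": capture,
        "translation": translation,
        "timestamp": timestamp,
        "linebreak": "\n",  # assumption: could be "\r\n" on Windows
        "tab": "\t",
    }
    # Single pass, so token-like text inside the capture is left alone.
    return TOKEN.sub(lambda m: values[m.group(1)], fmt)
```

With the default format '${capture}${linebreak}', each capture ends up on its own line of the log file. A format such as '${timestamp}${tab}${capture}${tab}${translation}${linebreak}' would produce tab-separated rows instead.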
Call Executable: An advanced feature that allows you to call an executable after OCR is complete. The following tokens may be used: ${capture}, ${translation}, ${timestamp}.
Allows you to perform text replacements. Supports regular expressions. The text on the left will be replaced with the text on the right. Different replacements may be specified for each OCR language.
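The idea of per-language replacement rules can be sketched as follows. This is an illustration only: the function name, the rule layout, and the example rules are hypothetical, and Python's `re` syntax may differ from the regular expression dialect that Capture2Text accepts.

```python
import re


def apply_replacements(text, language, rules_by_language):
    """Apply each (pattern, replacement) rule for the language, in order."""
    for pattern, replacement in rules_by_language.get(language, []):
        text = re.sub(pattern, replacement, text)
    return text


# Hypothetical rules: left side is the pattern, right side the replacement.
EXAMPLE_RULES = {
    "English": [
        (r"\|", "I"),     # a vertical bar is often a misread capital I
        (r" {2,}", " "),  # collapse runs of spaces
    ],
    "Japanese": [
        (r"\s+", ""),     # remove stray whitespace between characters
    ],
}
```

For instance, `apply_replacements("| saw  it", "English", EXAMPLE_RULES)` gives "I saw it", while text captured in a language without rules is returned unchanged.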
See the translation section.
This page allows you to enable the text-to-speech feature, set the volume, and select the options (voice, rate, pitch) to use for each OCR language.
Enable Text-to-speech: Enable text-to-speech when text is captured.
When this option is checked and the voice is not set to '<Disabled>', the 'Say' button will appear in the popup dialog.
Volume: Master volume of the text-to-speech feature. Applies to all languages.
OCR Language: Specify speech options for the selected OCR language.
- Rate: Rate of text-to-speech voice.
- Pitch: Pitch of text-to-speech voice.
- Voice: Voice to use for the text-to-speech feature. Set to '<Disabled>' to disable the text-to-speech feature for just the selected OCR language.
Preview: Preview the current rate, pitch, and voice.
Command Line Options
Troubleshooting & FAQ
- I'm getting a message about a missing DLL file when I double-click Capture2Text.exe.
Solution: Install the Visual Studio 2015 redistributable.
- Capture2Text doesn't work at all. What can I do?
Possible solutions:
Make sure that you have unzipped Capture2Text. Search Google if you do not know how to unzip a file.
Make sure that your antivirus software is not blocking Capture2Text. Refer to the documentation that was bundled with your antivirus software.
Make sure that you have downloaded the latest version from SourceForge.
Restart your computer.
Ask one of your grandchildren to help you :)
- I found a bug!
Great! Create a ticket and describe the bug.
- I want to make a suggestion.
Great! Create a ticket and describe your suggestion.
- Capture2Text is outputting garbage characters.
Solution: Specify the correct OCR language.
- The language that I'm interested in doesn't appear in the OCR language menu.
Read Installing Additional OCR Languages.
- I don't see the Capture2Text tray icon.
Click the 'Show hidden icons' button (it looks like a triangle or a ^ character).
- I've clicked on the Capture2Text tray icon but it doesn't do anything.
Right-click it instead.
- Capture2Text isn't working on my Mac.
Capture2Text is Windows-only software. If you have a technical background, feel free to port it (but don't ask me to help).
- Where is the uninstaller?
There isn't one. Capture2Text doesn't have an installer either. To remove Capture2Text from your computer, simply delete the Capture2Text directory.
- Where is the settings .ini file located?
Type '%appdata%\Capture2Text' into Windows Explorer.
You may delete it to restore default settings.
- How do I make Capture2Text portable?
Call Capture2Text.exe using the --portable option. You may want to create a shortcut for this. Setting this option will make Capture2Text store the .ini settings file in the same directory as Capture2Text.exe (as opposed to '%appdata%\Capture2Text', which is the normal location).
- Where is the source code located?
The source code is located on SourceForge.
Related Tools for Japanese Language Learners
- JGlossator (Windows)
Automatically lookup Japanese words that you have OCR'd with Capture2Text. Supports de-inflected expressions, readings, audio pronunciation, example sentences, pitch accent, word frequency, kanji information, and grammar analysis. Supports both EDICT and EPWING dictionaries.
- OCR Manga Reader (Android)
Free and open source manga reader app for Android that allows you to quickly OCR and look up Japanese words in real time. There are no ads and no mysterious network permissions. Supports both EDICT and EPWING dictionaries.