ocr - text recognition

October 22, 2022

Added in Pro 9.2

The $ocr module provides optical character recognition (OCR) for recognizing text in images. This built-in module is based on PaddleOCR; before using it, download the official PaddleOCR plugin from the plugin store in Auto.js Pro. When packaging, the plugin can be bundled into the APK, so it does not need to be installed separately.

In addition, an official OCR plugin based on Google ML Kit is also available; see Official MLKitOCR Plugin.

Tip

Special thanks to Auto.js enthusiast L (QQ: 2056968162, [7Zip plug-in author](https://blog.autojs.org/2022/09/30/7zip-plugin/)) for providing the initial integration code and for later help with bug fixes and optimization, which greatly reduced development time ❤️.

$ocr.create([options])

  • options {object} Optional parameters. The available options are:
    • models {string} The model to use. slim selects a faster model with relatively lower accuracy; if not specified, the default model is used, which is more accurate but slower. You can also pass the absolute path of a custom model.
    • labelsFile {string} The label file of the model. Defaults to null. It must be used together with the models option.
    • cpuPowerMode {string} The CPU power mode. Defaults to LITE_POWER_HIGH. Possible values:
      • LITE_POWER_HIGH Binds to the big cores. If the ARM CPU supports big.LITTLE, the Big cluster is preferred and threads are bound to it. If the configured number of threads is greater than the number of big cores, it is automatically scaled down to the number of big cores. If the system has no big cores, or core binding fails (as happens on some phones when the battery is low), it falls back to the mode without core binding.
      • LITE_POWER_LOW Binds to the little cores. If the ARM CPU supports big.LITTLE, the Little cluster is preferred and threads are bound to it. If the configured number of threads is greater than the number of little cores, it is automatically scaled down to the number of little cores. If no little cores can be found, it falls back to the mode without core binding.
      • LITE_POWER_FULL Uses big and little cores together. The number of threads may exceed the number of big cores; if it exceeds the total number of cores, it is automatically scaled down to the number of cores.
      • LITE_POWER_NO_BIND Runs without core binding (recommended). The system schedules tasks onto idle CPU cores based on load.
      • LITE_POWER_RAND_HIGH Binds to the big cores in rotation. If the Big cluster has multiple cores, the binding switches to the next core after every 10 predictions.
      • LITE_POWER_RAND_LOW Binds to the little cores in rotation. If the Little cluster has multiple cores, the binding switches to the next core after every 10 predictions.
    • parallelThreads {number} The number of parallel threads. Defaults to 4.
    • useOpenCL {boolean} Whether to use OpenCL. Defaults to false.
  • return {OCR} The newly created OCR object.

Creates an OCR object for text recognition according to the given options. Customizing the parameters is generally unnecessary; calling $ocr.create() with no arguments returns a usable OCR object.
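
The thread-scaling rules listed for cpuPowerMode can be modeled with a small function. This is only an illustrative sketch of the documented behavior, not the actual scheduler: effectiveThreads and its core-count parameters are hypothetical, LITE_POWER_FULL is read here as a cap at the total number of cores, and the modes for which no scaling rule is stated are assumed to leave the thread count unchanged.

```javascript
// Illustrative model of the documented rules; not the real scheduler.
// `bigCores` and `littleCores` are hypothetical inputs describing the device.
function effectiveThreads(mode, requested, bigCores, littleCores) {
    switch (mode) {
        case "LITE_POWER_HIGH":
            // Capped at the number of big cores; with no big cores, binding is skipped.
            return bigCores > 0 ? Math.min(requested, bigCores) : requested;
        case "LITE_POWER_LOW":
            // Capped at the number of little cores; with none, binding is skipped.
            return littleCores > 0 ? Math.min(requested, littleCores) : requested;
        case "LITE_POWER_FULL":
            // Assumption: capped at the total number of cores.
            return Math.min(requested, bigCores + littleCores);
        default:
            // Assumption: the remaining modes do not scale the thread count.
            return requested;
    }
}

// A hypothetical device with 4 big and 4 little cores.
console.log(effectiveThreads("LITE_POWER_HIGH", 4, 4, 4));  // 4
console.log(effectiveThreads("LITE_POWER_HIGH", 6, 4, 4));  // 4
console.log(effectiveThreads("LITE_POWER_FULL", 12, 4, 4)); // 8
```

Under this reading, raising parallelThreads above the size of the bound cluster has no effect in the bound modes.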

A simple screenshot and text recognition example is as follows:

// Create an OCR object. The official PaddleOCR plugin must first be downloaded from the plugin store in Auto.js Pro.
let ocr = $ocr.create({
    models: 'slim', // Faster model with relatively lower accuracy; if omitted, the default model is used (more accurate but slower)
});

requestScreenCapture();

for (let i = 0; i < 5; i++) {
    let capture = captureScreen();

    // Recognize the text in the screenshot and measure how long it takes; the first detection is slow
    // Detection time depends on the image size, content, and amount of text
    // Performance can be tuned through the thread count, CPU mode, and other parameters of $ocr.create()
    let start = Date.now();
    let result = ocr.detect(capture);
    let end = Date.now();
    console.log(result);

    toastLog(`Detection ${i + 1}: ${end - start}ms`);
    sleep(3000);
}

ocr.release();

For more information, see the PaddleOCR Documentation.
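
Because the first detection is much slower than later ones, it can help to report it separately when comparing configurations. The following is a minimal sketch under that assumption; summarizeTimings and timeDetections are hypothetical helper names, not part of the $ocr module, and the Auto.js part only runs where $ocr is defined.

```javascript
// Report the first (warm-up) run separately from the average of the rest.
function summarizeTimings(timesMs) {
    if (timesMs.length === 0) {
        return { first: null, averageRest: null };
    }
    const rest = timesMs.slice(1);
    const averageRest = rest.length > 0
        ? rest.reduce((sum, t) => sum + t, 0) / rest.length
        : null;
    return { first: timesMs[0], averageRest: averageRest };
}

// Call `detectOnce` (supplied by the caller) `runs` times and time each call.
function timeDetections(detectOnce, runs) {
    const times = [];
    for (let i = 0; i < runs; i++) {
        const start = Date.now();
        detectOnce();
        times.push(Date.now() - start);
    }
    return summarizeTimings(times);
}

// Only runs inside Auto.js Pro, where $ocr is defined.
if (typeof $ocr !== "undefined") {
    requestScreenCapture();
    const ocr = $ocr.create({ models: "slim" });
    try {
        // Reuse one screenshot so that only detection is timed.
        const capture = captureScreen();
        console.log(timeDetections(() => ocr.detect(capture), 5));
    } finally {
        ocr.release();
    }
}
```

Running the same measurement with different parallelThreads or cpuPowerMode values gives a rough basis for choosing between them on a given device.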

OCR

The object returned by $ocr.create(), used to perform text recognition. When the object is no longer needed, call release() to free its resources.

OCR.detect(image[, options])

  • image {Image} The image in which to recognize text.
  • options {object} Optional parameters. The available options are:
    • max {number} The maximum number of text results to recognize. Defaults to 1000.
    • detectRotation {boolean} Whether to detect text rotation. Defaults to false.
    • region {Array} The region of the image to recognize, given as an array of two or four elements. (region[0], region[1]) is the upper-left corner of the region, and region[2] and region[3] are its width and height. If region has only two elements, the region extends from (region[0], region[1]) to the lower-right corner of the image. If region is not specified, the entire image is recognized. **This option was added in version 9.3.**
  • return {Array<OCRResult>} An array of text recognition results, each including the confidence, text content, text bounds, and so on.

Perform text recognition on the given image according to the given options, and return the text recognition results as an array.
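
The two-element and four-element forms of region can be expanded into an explicit [x, y, width, height] array as follows. This is a sketch of the rule as described above, not the module's internal code; resolveRegion is a hypothetical helper, and the Auto.js usage assumes the image exposes getWidth() and getHeight().

```javascript
// Expand a region option into [x, y, width, height] for an image of the given size.
function resolveRegion(region, imageWidth, imageHeight) {
    if (region === undefined || region === null) {
        // No region: the whole image is recognized.
        return [0, 0, imageWidth, imageHeight];
    }
    if (region.length === 2) {
        // Two elements: from (x, y) to the lower-right corner of the image.
        return [region[0], region[1], imageWidth - region[0], imageHeight - region[1]];
    }
    if (region.length === 4) {
        return region.slice();
    }
    throw new Error("region must have two or four elements");
}

// x = 100, y = 200, width = 980, height = 1720 on a 1080x1920 image
console.log(resolveRegion([100, 200], 1080, 1920));

// Only runs inside Auto.js Pro (region requires version 9.3 or later).
if (typeof $ocr !== "undefined") {
    requestScreenCapture();
    const ocr = $ocr.create();
    const capture = captureScreen();
    // Recognize only the top half of the screenshot.
    const topHalf = [0, 0, capture.getWidth(), capture.getHeight() / 2];
    console.log(ocr.detect(capture, { region: topHalf }));
    ocr.release();
}
```

Restricting the region to the part of the screen that matters reduces the amount of image data to process, which may shorten detection time.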

requestScreenCapture();
sleep(1000);

let ocr = $ocr.create();

let capture = captureScreen();
let result = ocr.detect(capture);
// Iterate over the results and print their text and confidence
result.forEach(item => {
    console.log(item.text, item.confidence);
});
// Keep only the results with confidence above 0.9
let filtered = result.filter(item => item.confidence > 0.9);
// Find the first result whose text contains "Auto.js"
let autojs = filtered.find(item => item.text.includes("Auto.js"));
console.log(autojs);
// If found, print its confidence, bounds, and center point, then click it
if (autojs) {
    console.log(`confidence = ${autojs.confidence}, bounds = ${autojs.bounds}, center = (${autojs.bounds.centerX()}, ${autojs.bounds.centerY()})`);
    autojs.clickCenter();
}

ocr.release();
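
When several results contain the same keyword, find() returns whichever comes first in the array. One possible alternative is to pick the match with the highest confidence. The sketch below uses a hypothetical helper, findBestMatch, and plain objects standing in for OCRResult values so that it can be tried outside Auto.js.

```javascript
// Return the result containing `keyword` with the highest confidence,
// ignoring results below `minConfidence`; returns null if nothing matches.
function findBestMatch(results, keyword, minConfidence) {
    let best = null;
    results.forEach(item => {
        if (item.confidence < minConfidence) return;
        if (!item.text.includes(keyword)) return;
        if (best === null || item.confidence > best.confidence) {
            best = item;
        }
    });
    return best;
}

// Stand-in data shaped like OCRResult (only text and confidence are used).
const sampleResults = [
    { text: "Auto.js Pro", confidence: 0.97 },
    { text: "Auto.js", confidence: 0.99 },
    { text: "Auto.js docs", confidence: 0.62 },
];
console.log(findBestMatch(sampleResults, "Auto.js", 0.9).text); // Auto.js
```

In a script, the array returned by ocr.detect(capture) would be passed in place of sampleResults.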

OCR.release()

Releases the OCR resources. They are released automatically when the script exits, but you should call release() as soon as the OCR object is no longer needed.
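
One way to make sure release() is always called, even when recognition throws, is to wrap the work in try/finally. The helper name withOcr below is hypothetical; the pattern itself is plain JavaScript.

```javascript
// Create an OCR object, run `task` with it, and always release it afterwards.
function withOcr(createOcr, task) {
    const ocr = createOcr();
    try {
        return task(ocr);
    } finally {
        ocr.release();
    }
}

// Only runs inside Auto.js Pro, where $ocr is defined.
if (typeof $ocr !== "undefined") {
    requestScreenCapture();
    const texts = withOcr(
        () => $ocr.create(),
        ocr => ocr.detect(captureScreen()).map(item => item.text)
    );
    console.log(texts);
}
```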

OCRResult

Each element of the array returned by OCR.detect(). It contains the confidence of the recognition, the text content, the text bounds, the text rotation, and the confidence of the rotation.

OCRResult.confidence

  • {number}

The confidence of the recognized text, in the range [0, 1]. The closer the value is to 1, the more reliable the result.

OCRResult.text

  • {string}

Text content recognized by OCR.

OCRResult.bounds

The bounds of the recognized text within the image.

OCRResult.rotation

  • {number}

The rotation angle of the recognized text in the image, in the range [0, 360); it is generally either 0 or 180 degrees. This field is only valid when detectRotation is set to true during detection.

OCRResult.rotationConfidence

  • {number}

The confidence of the detected rotation angle, in the range [0, 1]. This field is only valid when detectRotation is set to true during detection.
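
When detectRotation is enabled, rotation and rotationConfidence can be combined to separate upright text from rotated text. The sketch below is one possible way to do that; splitByRotation and the 0.8 threshold are hypothetical choices, and the sample objects only imitate the two fields used.

```javascript
// Group results into upright, rotated, and uncertain (low rotation confidence).
// Only meaningful for results obtained with { detectRotation: true }.
function splitByRotation(results, minRotationConfidence) {
    const groups = { upright: [], rotated: [], uncertain: [] };
    results.forEach(item => {
        if (item.rotationConfidence < minRotationConfidence) {
            groups.uncertain.push(item);
        } else if (item.rotation === 0) {
            groups.upright.push(item);
        } else {
            groups.rotated.push(item);
        }
    });
    return groups;
}

// Stand-in data shaped like OCRResult.
const rotationSample = [
    { text: "Settings", rotation: 0, rotationConfidence: 0.98 },
    { text: "sgnitteS", rotation: 180, rotationConfidence: 0.95 },
    { text: "l1I", rotation: 180, rotationConfidence: 0.40 },
];
const groups = splitByRotation(rotationSample, 0.8);
console.log(groups.upright.length, groups.rotated.length, groups.uncertain.length); // 1 1 1
```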

OCRResult.javaObject

  • {object}

The original Java object of the OCR recognition result. It carries no additional information with the official PaddleOCR plugin; other official OCR plugins may expose additional information through it, such as lines, fields, and word segmentation.

OCRResult.clickCenter()

  • return {boolean}

Clicks the on-screen position corresponding to the center of the OCR result's bounds in the image, and returns whether the click succeeded. Equivalent to click(result.bounds.centerX(), result.bounds.centerY()).
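
The stated equivalence can be written out as a small helper that takes the click function as a parameter. This is only a sketch: clickResultCenter is a hypothetical name, and the object below imitates just the bounds methods used above so the example can be tried outside Auto.js.

```javascript
// Compute the center of a result's bounds and pass it to `clickFn`.
function clickResultCenter(result, clickFn) {
    const x = result.bounds.centerX();
    const y = result.bounds.centerY();
    return clickFn(x, y);
}

// Stand-in for an OCRResult whose bounds are centered at (540, 960).
const fakeResult = { bounds: { centerX: () => 540, centerY: () => 960 } };
const recordedClicks = [];
const ok = clickResultCenter(fakeResult, (x, y) => {
    recordedClicks.push([x, y]);
    return true;
});
console.log(ok, recordedClicks[0][0], recordedClicks[0][1]); // true 540 960
```

In Auto.js Pro, passing click as the second argument would mirror clickCenter(); passing a different gesture function, such as longClick, is one reason to write it this way.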

Contributors: hyb1996