Citrix Automation. OCR Data Extraction

Note: This video is deprecated. Please see Citrix Automation and Advanced Citrix Automation for the newer versions.

Data Extraction using OCR on Citrix or remote desktops

In this video, you will learn how to get data that a Citrix app presents as an image by creating a workflow that converts images into text.

You will learn to scrape data using OCR (Optical Character Recognition), a technique that converts an image into text. You will also be introduced to the Screen Scraper Wizard, which offers two options for scraping data: Native and OCR. For image or Citrix automations, only OCR is relevant.

Why use OCR?

When we extract data from Citrix, some of it cannot be captured with regular keyboard and mouse actions. Extracting it requires special software, and this is where OCR comes in. UiPath provides this capability. The default OCR engine in UiPath is Google’s Tesseract, but we strongly recommend using Microsoft’s MODI. You can install it in the Settings section of UiPath.

The Process

We use the Find activity to limit the image region where OCR runs. We used Total Expense ($): as our reference image so that we can scrape the number shown next to it.
Now that UiPath knows which image to find and where, we can start the screen scraping by highlighting the actual text, which is the total expense amount.
The Screen Scraper Wizard will do the rest for you. We used Microsoft MODI as our OCR engine because we find it much more reliable than the default engine.
That’s it! UiPath’s automation tools combined with OCR technology allow you to extract data with ease using easy-to-follow screen scraping wizards.
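The idea behind the process can be illustrated outside UiPath with a minimal Python sketch. It is not UiPath's API, and the wizard does this work for you. The names Box, region_right_of and parse_amount are hypothetical, and the coordinates and sample text are made up for illustration. The sketch assumes the reference image (the Total Expense ($): label) has already been located on screen, derives the region to its right where OCR should run, and then turns the text an OCR engine might return into a number.

```python
import re
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Box:
    """A screen rectangle in pixels (hypothetical helper type)."""
    left: int
    top: int
    width: int
    height: int


def region_right_of(anchor: Box, width: int, gap: int = 0) -> Box:
    """Return the region of the given width directly to the right of the
    anchor, on the same row and with the same height."""
    return Box(anchor.left + anchor.width + gap, anchor.top, width, anchor.height)


def parse_amount(ocr_text: str) -> Decimal:
    """Extract the first number from OCR output, dropping thousands separators."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", ocr_text)
    if match is None:
        raise ValueError(f"no number found in OCR text: {ocr_text!r}")
    return Decimal(match.group(0).replace(",", ""))


# Assumed position of the reference image on screen (illustrative values).
label = Box(left=40, top=200, width=150, height=20)

# Region where the amount is expected to appear, and therefore where OCR runs.
ocr_region = region_right_of(label, width=100, gap=5)

# Assumed text returned by an OCR engine for that region.
amount = parse_amount(" 1,234.50 ")
```

Restricting OCR to a small region next to a known label keeps unrelated screen text out of the result, which is why the reference image is located first.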

Resources