A few weeks ago I've come across a job posting on a freelance website saying — I need a person to write some RPAs for now, and to evolve them into an enterprise solution later. It just got my attention since I hear a lot about RPAs, but I never got my hands dirty on that. Thus, I wanted to try it.

Let’s describe a job posting for more context.
There are a lot of processes happening in the hospital — many data sources, employees have to copy things around, put them into Excel, add some pre-defined formulas, come up with reports on that, spend countless hours doing the routine. There is a thing — these processes are predictable and routine, there is no real need to have human input on that except that you want to have that done.
So, instead, the proposed solution is to automate that collecting of data, preparing it, processing, outputting into some other data storage.

This is easily automatable since the sequence of steps is predefined, and could be considered with some frequency — such as

Once a day I want the data to be polled from three websites, joined into similar format, and put to Google Sheets. Once you put that in Google sheets, I want to have formulas and formatting be applied to that so human can understand that easily with corresponding highlights.

This is a thing — as RPA developer you must understand what is automatable and what is not. Processes with strict patterns are, processes where human input is mandatory are not.

Let’s start with the definition of RPA as it is now — robotic process automation — this is pretty much what’ve been to up until now. There are processes, the cronjobs doing these processes are called robots. The main aim of that whole segment of products is to automate routines, so humans can do more intelligent work instead of copy-pasting excel rows here and there. And that’s great, in the end. Automation leads the progress nowadays.

Going back to our job posting, the test task was required — it was paid, had decent description, a video explaining what’s exactly to be done.
The test task was to create an app which would daily perform a set of actions:

  • Go to health tracking app website
  • Login into that given login and password
  • Go onto corresponding page to request calories report for last month to be sent out to user’s email
  • Then, get that email from Google inbox, in HTML of the email layout you can find a link to download the report
  • Once you’ve downloaded the report, you have a breakdown of calories spent/earned during the last 30 days, including information about carbs, fats, workouts, etc. That report is archive with multiple files, each describing a particular type of data — food data, workout data, etc.
  • Then, all that data must be unified in format, grouped by day, and put to Google sheet which contains of multiple sheets— Target and a sheet for each month, e.g. March 2021.
  • Then, once the data is inside the March 2021 sheet, it’s required to add some Excel formulas on top of that, calculating the deltas between Target and Actual for a particular month. Also, it’s required to do some custom formatting — such as if the target of carbs per day is met, then it displays green, otherwise red, so it’s easy for a human being to quickly understand what’s going on.
No fancy colors, but you get the idea — compare with target on Target sheet, and then come up with percentage of completeness
No fancy colors, but you get the idea — you have the data per day — how many calories spent, then compare with Target, and calculate completeness.

Ok, it’s clear now, let’s get to planning and implementation!

Since that’s a small thing, it makes great sense to try something code-less, but I was not able to find any platform that out-of-the box offered something as going onto a website and entering some input, clicking some buttons by xpath. Thus, I’d need to settle with something rather low-code. One would think of any scripting tool as Python, but, given the context of having RPAs as only feasible option at the time, and then a wish to convert it into enterprise solution once the integration with various APIs are done, I’ve decided to give it a go with C# — the exact language which nowadays can do both — scripting & easy development, and converting that to enterprise level solution sooner or later.

So, from the design point of view it’d be a C# .NET console app, and Azure Functions as a platform to deploy it. I’ve taken Azure Functions as something extremely easy to start, light-weight, and, the most important, something free ;)

A very basic, and sharp design with a sequence of steps:

Tools used:

  • Selenium web driver — initially a tool for UI testing, but since has great support for web UI manipulations, is a great choice for our app going onto websites, and clicking stuff there.
  • OpenPop.NET — for reading email and extracting link to download report from.
  • Iterating over email items until hit the last received fitness data export. Then, use good old HtmlAgilityPack to find the link by XPath expression. Later on, that link to be used for downloading exported data to a local machine.
  • Once the data is converted to C# objects, we are free to manipulate that and put it into Google Sheet. For connecting to Google Sheets, I use official SDK — Google.Apis.Sheets.v4.

Once the data is inside Google Sheet, it seems to be all done!

All done from code POV, but as full-cycle engineer I have to take care of all the aspects of software lifecycle, to name one — deployment and operations.

For that, Azure Functions could be utilized — a service available on Azure. It can run our software based on some schedule, and we won’t need to pay for that a grand.

For more details on that, please consult with this article from Microsoft.

So, done and done now!

In the end, I was able to build a software that automated the process consisting of multiple actions previously done by a human. Isn’t it great?
Having modern C# .NET as platform for that, the job becomes extra smooth and easy.

Going forward, the solution could be evolved into a set of jobs running on Azure Functions, which could be triggered not only by time trigger, but also via HTTP trigger from Microsoft Power Automate (former Microsoft Flow) which is an RPA solution provided by Microsoft. It has most common cases covered with drag-n-drop user experience, allowing code-less solutions to business problems.

Sample of Microsoft Power Automate flow.

Overall, it was a great journey into RPAs with Microsoft Tech stack, I wish to get deeper into that someday. The whole direction is extremely innovative at the time, and seems to have great potential.

Let me know if you are looking for RPA consulting.

Ukraine, Lviv. Young & mad. IT, .NET, F#, C#, Azure, software developer. Leading @lvivdotnet. Traveling, seeing things, living life.