foehl
Level 6
Status: New
I've called this a request for an overhaul of the archiving function because there are so many problems with the existing implementation that to me it makes sense to completely redo it. There's a list of requirements for a solution at the end if you don't feel like reading through the issues in detail.

The session log archiving function as it stands in Blue Prism is not well executed, and is something I feel embarrassed about every time I have to explain it to a client.

Depending on logging settings, the session log table in the database quickly fills up and for customers with little space for databases, this causes issues. This is widely covered in Blue Prism documentation, and the recommended solution is to undertake regular archiving of the session logs.
Issues
Here are the problems with the recommended solution:
  1. Archiving fails if you attempt to archive too much data, either by selecting too many logs to archive or because a single log file is too big. When a single log file is too big to archive, there is no way to archive that file at all. I have encountered this a couple of times and consider the amount of data that can be handled in a single archiving action too low for enterprise use.
  2. Automatic archiving consumes a license. Why? I still don't understand why this is handled 'by a bot' and cannot just be built into the product. Explaining to a client that you can do this automatically, but it might cost more is never easy.
  3. You cannot specify a schedule for automatic archiving. Since I have to use a bot to do the archiving, at least let me specify for it to happen when the bot has some free time anyway or a time that is convenient to me.
  4. Archiving does not actually remove the data from the database table. If your database reaches capacity because the session log table is too big (causing Blue Prism to stop working), even archiving all session logs does not drastically reduce the size of the table; in fact, it has a minimal effect on database size. If archiving is the recommended solution to the database becoming too full, it should actually solve the problem.
  5. Archiving is done to a drive location relative to the local machine on which you are carrying out the archiving. This means that archiving must be done via the Blue Prism client on the same machine each time, or (if archiving from clients on different machines) each machine must have a shared location mapped to exactly the same network drive and this must be set up as the archiving location separately on each client.
  6. The archiving drive location is set and remembered at the client level. If you have a setup where a client has connections to multiple environments, the archive location will be set the same for both. If, for example, I were tasked with being the member of staff responsible for archiving for both development and test environments, it would use my single client-based setting for both environments, requiring me to overwrite the archiving location each time I want to archive on a different environment.
Solution
Please re-develop the archiving function and provide a solution that
  1. Can run automatically on a user-defined schedule, or in the background provided that it does not impact performance.
  2. Does not consume a license to run automatically.
  3. Allows any size or quantity of data to be archived, or handles a large data load intelligently rather than just reporting 'Archiving Failed' after an inordinate wait.
  4. Reduces the size of the session log database table appropriately. (e.g. if you archive 70% of the session logs, I would expect the database size to reduce by 60-70%, not 5%).
  5. Allows archiving to be done to shared or common drive locations.
  6. Maintains a single, persistent archive location per environment, not per client.
2 Comments
ChristianPanhan
Level 6
I agree, but I feel I need to say something about deletes in databases. Deleting data does not by itself reduce the size of the database. A delete means the database engine marks the deleted areas as free space so they can be overwritten when new data comes in. If you really wanted the database to consume less space after archiving, you would need to shrink it, but that is probably not a good idea: when new data comes in, the database file has to be expanded again to fit it, and allocating that space from the OS takes time.
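The delete-versus-shrink distinction is easy to demonstrate. Here is a small illustrative sketch (not Blue Prism code) using SQLite as a stand-in for the session log database: deleting every row barely changes the file size, because the pages are only marked as free; an explicit shrink (VACUUM here, the equivalent of a shrink operation on SQL Server) is what actually returns space to the OS.

```python
import os
import sqlite3
import tempfile

# A throwaway database standing in for the session log table.
path = os.path.join(tempfile.mkdtemp(), "sessionlog.db")
con = sqlite3.connect(path)
con.execute("CREATE TABLE sessionlog (id INTEGER PRIMARY KEY, data TEXT)")
con.executemany(
    "INSERT INTO sessionlog (data) VALUES (?)",
    [("x" * 1000,) for _ in range(5000)],
)
con.commit()
size_full = os.path.getsize(path)

# "Archive" everything: rows are gone, pages are merely marked free.
con.execute("DELETE FROM sessionlog")
con.commit()
size_after_delete = os.path.getsize(path)

# Only an explicit shrink rewrites the file and releases the space.
con.execute("VACUUM")
size_after_vacuum = os.path.getsize(path)
con.close()

print(size_full, size_after_delete, size_after_vacuum)
```

The middle number stays close to the first; only the shrink step makes the file small again, which matches the behaviour described in issue 4 of the original post.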
LesZatony
Level 3
I agree with most of what you've laid out here with a few comments - 
  • My understanding is that the BOT will archive when it is not busy, so it can run on a BOT that is used for other things
  • I use a UNC path for the archive location, which eliminates the need to make sure all BOTs have the same drive mapping
  • For better archive control I prefer to schedule AutomateC with the /archive switch
One additional thing I wish archiving could handle is better recovery when a prior run failed. Specifically, if archiving fails, a file has often already been generated but the data has not yet been removed from the database. When you try again, archiving fails because the file already exists. It would be much better if the file were renamed with a -2 (for example) or some other suffix.
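The suggested retry behaviour could be sketched as below. This is a hypothetical helper, not anything Blue Prism provides today: given the archive file name a failed run left behind, it picks the next free -2, -3, ... suffixed name instead of failing because the file exists.

```python
from pathlib import Path

def next_free_name(path: Path) -> Path:
    """Return path itself if unused, else the first free -2/-3/... variant.

    Hypothetical sketch of the recovery behaviour suggested above: a
    retry after a failed archive run would write to the suffixed name
    rather than aborting because the stale file already exists.
    """
    if not path.exists():
        return path
    n = 2
    while True:
        candidate = path.with_name(f"{path.stem}-{n}{path.suffix}")
        if not candidate.exists():
            return candidate
        n += 1
```

For example, if a failed run left session-2021-01-01.bpa on disk, a retry would archive to session-2021-01-01-2.bpa (the .bpa name and date format here are illustrative only).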