Database Management Issues - Looking for Tips

pors924s
Level 3
Hi all,

I am looking for some pointers on managing the data our processes generate. We are continuing to build and run processes that output a sizeable amount of data: about 4TB in roughly six months of logging, in the session logs table alone. We are allowed to archive data once it is six months old. I just built a script to archive it, but it seems to freeze after a small transfer (~2GB). This is reminiscent of issues I had in the past archiving automatically/manually with the BP Interactive Client, which suggests the database is already too large to archive properly. We are taking care of the more obvious things, such as logging only necessary data and only necessary processes, but I would love to hear from the community. What tips do you have for this situation?

Thank you,

Andrew Castillo
2 REPLIES

bruce.liu
Staff
Hi Andrew,

I would suggest you reach out to our Support team for details on how they can assist in turning this around. You have certainly hit an issue that many of our customers have experienced while scaling up their digital workers; in fact, it is a key factor that can grind a Blue Prism database to a halt.

We have a version of the housekeeping script that trims records in batches of a set size, to avoid locking up the database while performing archiving. This should be suitable to help with the situation you have on hand. Again, I would suggest you reach out to our Support team so they can take you through the process.
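To illustrate the idea (this is not the actual Blue Prism housekeeping script, just a minimal sketch, with sqlite3 standing in for the real SQL Server database and a hypothetical `session_log` table): deleting in small batches with a commit between each batch keeps any single transaction, and therefore any lock, short-lived.

```python
import sqlite3

def trim_in_batches(conn, cutoff, batch_size=1000):
    """Delete rows older than `cutoff` in small batches so that no
    single transaction holds locks on the table for long.

    Table and column names (session_log, logged_at) are hypothetical,
    not Blue Prism's actual schema.
    """
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM session_log WHERE rowid IN ("
            "  SELECT rowid FROM session_log WHERE logged_at < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # release locks between batches
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

The same pattern translates to SQL Server with `DELETE TOP (n)` in a loop; the key design choice is committing per batch rather than deleting months of rows in one transaction.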

Another key takeaway is that you should build plans to monitor the growth of the various tables, the session log table being the key one. This will give you an idea of how newly introduced processes affect the growth of your database. In some instances, you may be able to catch processes that are logging more than they are supposed to. You should also review your archiving process regularly so it remains adequate for the increasing volume of data in the database.
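A simple way to monitor that growth is to record per-table row counts on a schedule and diff the snapshots. The sketch below is a hypothetical monitor, again with sqlite3 as a stand-in; the watched table list is an assumption, not Blue Prism's actual schema.

```python
import sqlite3

def snapshot_counts(conn, tables):
    """Return {table: row_count} for the watched tables."""
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
            for t in tables}

def growth(before, after):
    """Rows added per table between two snapshots."""
    return {t: after[t] - before[t] for t in before}
```

Run `snapshot_counts` daily (e.g. from a scheduled job), persist the results, and alert when `growth` for the session log table jumps after a new process goes live.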

Lastly, the Blue Prism database is quite transactional, and the volume of data can be quite high. The requirement to retain the data does not necessarily mean it must be hosted within the database. You can certainly consider piping the data out to a third-party system, e.g. a dashboard solution, so the data can be made visible there instead. For many large customers, we have advised keeping session log data for no longer than two weeks. This keeps the underlying database as performant as possible.
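The "pipe it out, then trim" idea can be sketched as: export rows past the retention cutoff to an external store, then delete them from the database. Here a JSON-lines file stands in for the dashboard or log store, sqlite3 stands in for the real server, and the schema names are hypothetical.

```python
import json
import sqlite3

def export_and_trim(conn, cutoff, path):
    """Export session-log rows older than `cutoff` to a JSON-lines
    file, then remove them from the database."""
    rows = conn.execute(
        "SELECT logged_at, message FROM session_log "
        "WHERE logged_at < ? ORDER BY logged_at", (cutoff,)).fetchall()
    with open(path, "w") as fh:
        for logged_at, message in rows:
            fh.write(json.dumps({"logged_at": logged_at,
                                 "message": message}) + "\n")
    conn.execute("DELETE FROM session_log WHERE logged_at < ?", (cutoff,))
    conn.commit()
    return len(rows)
```

In production you would write the export before committing the delete (and verify it), so a failed export never loses data.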

Hope the above helps.

Walter.Koller
Level 11
We also have requirements to keep a long history of logs, and the databases become quite big easily.
But we only keep the last couple of days online and archive the rest to the file system using Blue Prism. We can easily and quickly restore this archived data back to Blue Prism in case it is needed for investigation, analysis, or audit questions.
The size of the logs on the file system is only a fraction of what they use in the DB, so regular archiving helps reduce space consumption.

Archiving in BP can be tricky though. If a single process has generated a lot of log data, the archiving step might fail. To avoid this, try to keep the execution time of processes as short as possible. When a process has to work over a longer period of time (e.g. wait and check for a file, poll for new data, wait for an event), consider reviewing the overall design (e.g. don't wait for the event, have the event trigger/start the process), or keep each run short and iterate via a recurring schedule.

Also select smaller portions to be archived in Blue Prism. Don't select one month, but several days. We only archive one day at a time, sometimes even less, and use AutomateC for command-line archiving (but be aware that _debug sessions might not be archived at the command line and may have to be archived in the UI; this behaviour depends on your BP version).
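The "one day at a time" approach amounts to iterating over small date windows instead of archiving one big range. A minimal helper for that (purely illustrative; how each window is then fed to the archiving step depends on your setup):

```python
from datetime import date, timedelta

def day_windows(start, end):
    """Yield (window_start, window_end) pairs covering [start, end),
    one day per window, oldest first, so each archiving run only
    touches a small slice of the session logs."""
    d = start
    while d < end:
        yield d, d + timedelta(days=1)
        d += timedelta(days=1)
```

Archiving window by window also means a failure only loses one day's progress, not the whole run.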

Generally we have quite a lot of logging turned on, just to have some basis for investigating issues that happen at runtime.
Only a few of our processes run for more than a few hours. This, possibly along with upgrading to 6.9 and archiving only one day at a time, made archiving much easier for us.

Anyway, manual housekeeping of BPASession (and especially BPASessionLog) is necessary. We just had a case where one of our databases had grown to 200GB even though BP showed only one week of log data online. (We have several BP databases; in total we are always between 500GB and 1TB of data, even with regular archiving.)