cancel
Showing results for 
Search instead for 
Did you mean: 

Comparing two collections based on three variables to produce exceptions report

ChristaSmit
Level 2

Hi all, 

We are trying to automate an internal process regarding employee leave. 

Outline: Two data sets from two systems, HR and Timesheet. Need to check if all data from Timesheet is present in HR data set . Flag items that are not. Reason for not checking the other way around is that should data be captured in HR first, there is an API that populates Timesheet, but it's not bi-directional. User behavior is causing the problem, and it's been prevalent for a number of years.

HR variables: Name, Surname in two different columns. Leave Type (named similar but slightly different from Timesheet version, Date range (From date, To date), number of days.

Timesheet variable: Name Surname in one column, Task = Leave type, Individual days, Number of hours. 

Objective: Compare the Collections, find records that don't match, E.g Someone captured Sick Leave in Timesheet, but did not capture Sick Leave in HR. Someone captured a full day (8 hours) in Timesheet, vs 0.5 days in HR. 

HR Collection: 35909.png

Example 1: Timesheet Collection:

35910.png

Expected result: Match, no issues. 

Example 2: 

HR collection

35911.png

Timesheet Collection

35912.png

Expected Result: 2023/07/24 and 2023/07/28 should be flagged because it appears in Timesheet, but not in HR. Data should be extracted and placed in a new exception report. 

Any assistance, guidance on this will be helpful. We've struggled to compare data because it's in a range (HR) and checking data because the naming conventions aren't the same (Leave Type vs Task). 

1 REPLY 1

Hi Christa, 

Do you know if your team has a KS package? It might be best to schedule a call for a KS Session if you have hours. One of the ways I've dealt with this in the past is by creating a set of conventions (with the help of the business) and placing those in a choice statement. If the value is one of the recognized conventions, assign the comparison variable to an arbitrary value. If it isn't, go into the otherwise flow and mark the item as a Business Exception and write the unrecognized convention in the detail or tag. You could then add the new convention into the choice statement and start to see a decrease in deviation. 

For the range, you would have to calculate the number of hours minus any stat holidays that could fall in that range. What I have done in the past is create a database table or a static Excel file to read and compare the date range, so if the range, one of the days falls under stat, minus that from the number of days and then convert to hours. 

I hope this helps.