Filtering large collections

FelipeDantas1
Level 2
Hello, I'm trying to filter a large collection (70 thousand rows), but it fails with an out-of-memory error (System.OutOfMemoryException). Is there any other way to filter large collections?
Best Answers

PvD_SE
Level 12
Hi Felipe,

Typically, you'd filter a collection with the appropriate action on the BP Collection object. As you have found out, this works well for all but the larger collections, which tend to trigger the infamous OutOfMemory error. There are two easy ways to avoid this:
  1. Filter the large collection in smaller chunks
  2. Filter the source of the collection
1. Smaller chunks:
You filter the first 10k rows of your collection using the BP Collection object and add the results to a new collection. Then you do the same with the next 10k rows of the large collection, again adding to the same new collection. You repeat this until everything is filtered. The 10k chunk size is only an indication: depending on the number of columns in your large collection, you may use larger or smaller chunks. In my experience, even this method can sometimes lead to the OutOfMemory exception.
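
To make the chunking idea concrete, here is a minimal sketch in Python. It is not Blue Prism code and does not use the BP Collection object; a list of dictionaries stands in for the collection, and the names `iter_chunks`, `filter_in_chunks`, and the columns `Id` and `Status` are all made up for illustration.

```python
from typing import Callable, Dict, Iterator, List

# A "row" here is a plain dict; this stands in for one collection row.
Row = Dict[str, object]


def iter_chunks(rows: List[Row], chunk_size: int) -> Iterator[List[Row]]:
    """Yield consecutive slices of `rows`, each at most `chunk_size` long."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def filter_in_chunks(
    rows: List[Row],
    keep: Callable[[Row], bool],
    chunk_size: int = 10_000,
) -> List[Row]:
    """Filter one chunk at a time, appending matches to a single result list."""
    result: List[Row] = []
    for chunk in iter_chunks(rows, chunk_size):
        result.extend(row for row in chunk if keep(row))
    return result


# Hypothetical data: 70,000 rows where every 7th row has Status "Open".
big_collection = [
    {"Id": i, "Status": "Open" if i % 7 == 0 else "Closed"}
    for i in range(70_000)
]
open_rows = filter_in_chunks(big_collection, lambda r: r["Status"] == "Open")
print(len(open_rows))
```

The chunk size is a parameter so that it can be lowered for wide collections, in line with the remark above that 10k is only an indication.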

2. Filter the source:
If the process got the collection data from a CSV or Excel file, you might want to filter while reading in the CSV or Excel data, rather than later in the process. The best way to do this, imho, is by using OLEDB. There are a number of posts on this community that describe in great detail how to do that. Notably, this method is very fast and uses little memory, and it would be my preferred solution to your problem.
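
As a rough illustration of filtering at the source, the Python sketch below drops unwanted rows while the CSV is being read, so the full data set is never held in memory. It deliberately does not use OLEDB (which would express the filter as a query against the file); it only shows the same principle with Python's standard `csv` module. The function name `filter_csv_rows` and the columns `Id` and `Status` are hypothetical.

```python
import csv
import io
from typing import Callable, Dict, Iterator, TextIO


def filter_csv_rows(
    source: TextIO,
    keep: Callable[[Dict[str, str]], bool],
) -> Iterator[Dict[str, str]]:
    """Read CSV rows one at a time and yield only those that pass `keep`."""
    reader = csv.DictReader(source)  # the first line is taken as the header
    for row in reader:
        if keep(row):
            yield row


# Hypothetical in-memory sample; a real file would be opened instead.
sample = io.StringIO("Id,Status\n1,Open\n2,Closed\n3,Open\n")
matches = list(filter_csv_rows(sample, lambda r: r["Status"] == "Open"))
print([row["Id"] for row in matches])
```

For a file on disk, the `csv` documentation recommends opening it with `open(path, newline="")` and passing the file object in place of `sample`.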


Happy coding!
Paul, Sweden
(By all means, do not mark this as the best answer!)


Neel1
MVP
Hello Felipe - If you are loading the data from a CSV file into a collection, one thing you can try is the Chunking action in Utility - File Management, and then apply the filter query to each part in turn.

Also, if you can filter at the source (CSV, Excel), that is better than filtering in the collection.
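
The sketch below illustrates this read-in-parts approach in Python rather than with the Blue Prism action itself: the file is read a fixed number of rows at a time and each part is filtered before the next is loaded. The helper names `read_csv_in_chunks` and `filter_csv_by_chunk` and the sample columns are assumptions made for the example.

```python
import csv
import io
from itertools import islice
from typing import Callable, Dict, Iterator, List, TextIO


def read_csv_in_chunks(
    source: TextIO, chunk_size: int
) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most `chunk_size` rows until the file is exhausted."""
    reader = csv.DictReader(source)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


def filter_csv_by_chunk(
    source: TextIO,
    keep: Callable[[Dict[str, str]], bool],
    chunk_size: int = 10_000,
) -> List[Dict[str, str]]:
    """Filter each chunk as it is read; only the matches are accumulated."""
    kept: List[Dict[str, str]] = []
    for chunk in read_csv_in_chunks(source, chunk_size):
        kept.extend(row for row in chunk if keep(row))
    return kept


# Hypothetical sample with five data rows, read two rows at a time.
sample_csv = "Id,Status\n1,Open\n2,Closed\n3,Open\n4,Closed\n5,Open\n"
kept_rows = filter_csv_by_chunk(
    io.StringIO(sample_csv), lambda r: r["Status"] == "Open", chunk_size=2
)
print([row["Id"] for row in kept_rows])
```

Only one chunk plus the rows kept so far are in memory at any time, which is what keeps the peak usage down.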

johan.m
Level 4
I agree with Paul's solutions.

You can try our "Filter" action, which we rewrote to use less memory (screenshot in the original post: 36407.png).