
Utility - File Management::Get Files and Wildcards (or not)

stepher
Level 6

I have a process that a consultancy built for us years ago.  While troubleshooting an issue, I noticed that they were [consistently] doing something different than I might.  When calling the "Utility - File Management::Get Files" action, they use 'Patterns CSV' values such as:

  • "*"
  • ".jpeg"
  • ".xlsx"
  • "."

The first one caught my attention because, just on muscle memory alone, I would have written "*.*".  For the others, I would have put an asterisk/star in front of the dot.  I checked the documentation and it explicitly states that 'This is the way.'  But this process has been running, albeit imperfectly, for years.  And if I open File Explorer in Windows 10 and search on each variation (including the "."), it returns the expected results.

So... In "Utility - File Management::Get Files", are wildcards necessary? Or are they implied?

And, yes, the OCD Devil on my shoulder is thoroughly proud of himself for making me even ask the question.

Thanks, in advance,

Red

Robert "Red" Stephens Application Developer, RPA Sutter Health Sacramento, CA
1 BEST ANSWER


david.l.morris
Level 14

Hi Red,

This nearly derailed my day when I saw it, but it turns out the answer, while a little confusing, makes sense. I am making some assumptions here, so I would say this post is right in general but maybe not in the specific details as to why.

First, this Stack Overflow post explains part of it, at least why "." works: https://stackoverflow.com/questions/14915473/directory-getfiles-not-working-with-a-pattern-of .

I tested this in multiple versions of .NET, including .NET 8, .NET Framework 3.5, and .NET Framework 4.7.2. As the person in the Stack Overflow post points out, the implementation of Directory.GetFiles() differs depending on the version of .NET. Note that Microsoft's documentation says you cannot use regular expressions in the file pattern of Directory.GetFiles(): https://learn.microsoft.com/en-us/dotnet/api/system.io.directory.getfiles?view=net-8.0#system-io-directory-getfiles(system-string-system-string) . However, I guess at some point they wanted to expand the wildcard support in the file pattern. It has always supported at least the question mark ? and the asterisk *. But as of at least .NET Framework 4.5, if you provide just a period, it converts that to an asterisk * before doing the search.

This still doesn't answer the question of why those developers used patterns such as ".jpeg" and ".xlsx". Maybe I'll find the answer at some point, but at the moment I suspect that the object they used was edited in some way or is simply a different version than I have access to. Perhaps it is a very old version of the object and it has been changed since then. I'm not sure. I tested this in VB.NET and C# and in different .NET Framework versions and couldn't get ".txt" to work. I also tested an unedited version of the File Management object from the Digital Exchange with an input of ".txt" for Patterns CSV, and it did not pull back a .txt file, whereas "." or "*" or "*.txt" all return that text file.

The other possibility is of course that the .NET Framework changed at some point, and I just haven't found the version where ".txt" would return all files with that file extension. For example, there were apparently other changes to the file pattern functionality in .NET 5 too. Anyway, I'm sure there's a simple explanation, but now I am curious whether your current object successfully returns results with ".xlsx" and such. If you don't mind, please post the code from that code stage here. I'd be interested in seeing it.

As for why those developers used "." or "*", it's probably just because they found that it works and it's a simple input. Like you, I prefer "*.txt" and "*.*", but I think "*" is also acceptable.

Edit: I never actually answered the question. I believe the answer is that wildcards are necessary and are not implied. However, in some cases such as ".", some versions of .NET Framework will convert it to a wildcard character. That was an interesting thing to look up.
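
To make the "necessary, not implied" point concrete, here is a minimal sketch in Python. It is only an illustration, not the .NET implementation: it uses Python's `fnmatch` module as a stand-in for wildcard matching, and `normalize_pattern` is a hypothetical helper that mimics the "." special case described above. Real Windows matching is case-insensitive and has other quirks that this sketch ignores.

```python
from fnmatch import fnmatchcase


def normalize_pattern(pattern: str) -> str:
    """Hypothetical helper mimicking the special case reported in this thread.

    Assumption: a lone "." is treated as "*"; every other pattern is used as-is.
    """
    return "*" if pattern == "." else pattern


def match_files(names, pattern):
    """Return the names matching one pattern. No wildcard is added implicitly."""
    effective = normalize_pattern(pattern)
    # fnmatchcase is case-sensitive and treats "." as a literal character.
    return [name for name in names if fnmatchcase(name, effective)]


names = ["report.txt", "photo.jpeg", "budget.xlsx", ".txt"]

print(match_files(names, "*.txt"))  # ['report.txt', '.txt']
print(match_files(names, ".txt"))   # ['.txt'] (only the file named exactly ".txt")
print(match_files(names, "."))      # all four names, via the "." special case
print(match_files(names, "*"))      # all four names
```

Under these assumptions, ".txt" without a leading asterisk only finds a file whose whole name is ".txt", which matches what I saw when testing.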


Dave Morris, 3Ci at Southern Company


4 REPLIES

Asilarow
MVP

If I understand correctly, you are saying the Object in your process is used like so:

[Screenshot: Asilarow_0-1726136865111.png]

Is that right?

If so, then I would advise checking the code stage used, to see if it has been modified in some way.

  • The first pattern "*" will include all files in the directory.
  • The second pattern ".jpeg" will only match files named exactly ".jpeg".
  • The third pattern ".xlsx" will only match files named exactly ".xlsx".
  • The fourth pattern "." will match files with any extension.

With the original, unmodified code, supplying these patterns together will mostly cause the "*" pattern to dominate, because it matches all files. The other patterns will not provide the intended functionality, as they do not correctly target files with .jpeg and .xlsx extensions.
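
As a rough illustration of why "*" dominates, here is a small Python sketch. It is not the code from the Blue Prism object: `get_files` is a hypothetical function, and it assumes that 'Patterns CSV' is split on commas and that a file is returned if it matches any one of the patterns. Python's `fnmatch` stands in for the wildcard matching, and the "." special case is left out.

```python
from fnmatch import fnmatchcase


def get_files(names, patterns_csv):
    """Hypothetical stand-in for a 'Get Files' call with a Patterns CSV input.

    Assumption: the CSV is split on commas and the result is the union of
    the matches for each individual pattern.
    """
    patterns = [p.strip() for p in patterns_csv.split(",") if p.strip()]
    # Keep a name if at least one pattern matches it (case-sensitive here).
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


names = ["a.jpeg", "b.xlsx", "notes.txt"]

print(get_files(names, "*,.jpeg,.xlsx"))   # ['a.jpeg', 'b.xlsx', 'notes.txt']
print(get_files(names, ".jpeg,.xlsx"))     # []
print(get_files(names, "*.jpeg,*.xlsx"))   # ['a.jpeg', 'b.xlsx']
```

If the real action behaves like this, a list containing "*" would return every file regardless of the other entries, which could explain why the process appears to work.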

Andrzej Silarow

Thanks, Dave,

With only a single exception, the 'Patterns CSV' values that the consultants used were passed to the standard, vanilla version of the 'Utility - File Management::Get Files' action.

The single exception is a customized version of 'Get Files' that excludes hidden files.  It used the "*" variant, and it was the stage that caught my attention.  Not being a Visual Basic expert (everything I know I learned from VBA), I reviewed the custom code, and it is pretty straightforward.  I did not see any part that translated the "*".

I appreciate it,

Red

Andrzej,

Your interpretations are in line with my expectations, and that is really what got me thinking about this.  The process, as written, seems to be performing as I think the consultants expected.  So either our mutual interpretations are not quite correct, or the consultants had something a bit different in mind.

Thanks so much,

Red
