
How to check whether page numbers are present in a Word document?

HarishM2
Level 6

Hi,

I have some VB.NET code that reads page numbers from a Word document. The problem is that the code returns a page count whether or not page numbers are actually displayed in the document. What I'm really looking for is a way to check whether the page number is visibly present, and to update the data item accordingly.

I know I can read the header and footer text and check whether a number is present there. The problem is that the header or footer may have other numbers appended to the end of its text, so the header and footer text alone is not reliable.

Is there a better way to do this?

Thanks,

Harish



------------------------------
Harish
RPA Developer
------------------------------
3 REPLIES

james.man
Staff
If this is a solution that you need for a specific document as opposed to a generic solution that can be used across multiple documents, I would look into using surface automation, since that is the "spy mode" (region) that most matches the "visual" page number that you are looking for.

If the Word document can be saved as an XML document, you can search the XML version of that document to see whether it contains the "page number" element.
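
As a rough illustration of the XML check, here is a minimal Python sketch. It assumes the document is a .docx file (a zip package) and that a displayed page number is stored as a PAGE field inside one of the header or footer parts, either as a simple field (`w:fldSimple`) or as complex-field instruction text (`w:instrText`). The function names are made up for this example. It only detects that the field exists in a header or footer; it cannot tell whether the field is hidden by formatting or whether that header or footer is actually used by a section, and it will miss a field whose instruction text is split across several runs.

```python
import re
import zipfile

# Header/footer parts in a .docx package are named word/header1.xml, word/footer2.xml, ...
HEADER_FOOTER_PART = re.compile(r"^word/(header|footer)\d*\.xml$")

# A PAGE field is written either as a simple field (w:fldSimple w:instr=" PAGE ...")
# or as a complex field whose instruction sits in a w:instrText element.
# The trailing \b keeps PAGEREF from matching; NUMPAGES does not match because
# the instruction must start with PAGE.
PAGE_FIELD = re.compile(
    r'<w:fldSimple\b[^>]*\bw:instr="\s*PAGE\b'
    r'|<w:instrText\b[^>]*>\s*PAGE\b',
    re.IGNORECASE,
)


def xml_has_page_field(xml_text: str) -> bool:
    """Return True if the XML text contains a PAGE field instruction."""
    return PAGE_FIELD.search(xml_text) is not None


def docx_has_page_number_field(path) -> bool:
    """Return True if any header or footer part of the .docx has a PAGE field.

    `path` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(path) as package:
        for name in package.namelist():
            if HEADER_FOOTER_PART.match(name):
                xml_text = package.read(name).decode("utf-8", errors="replace")
                if xml_has_page_field(xml_text):
                    return True
    return False
```

Restricting the search to the header and footer parts is a deliberate choice here: it avoids matching fields in the document body, such as a table of contents, that exist regardless of whether a page number is printed on each page.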
If you can predict what other numbers or text may be present in the header or footer along with your page number, you can try using regex to clean/remove that text, leaving you with just the page number.


------------------------------
James Man
Professional Services
Blue Prism
Asia/Hong_Kong
------------------------------

Thanks, James. I looked into the XML in the past, but the page element always returns something even if the page number is not displayed in Word, so that option is ruled out.


It looks like surface automation is the only option if we are not able to predict the pattern of the page numbers.



------------------------------
Harish
RPA Developer
------------------------------

RachmaSalim
Level 3
Hi Harish,

If you do not want to use the XML option, another alternative is to read the header and/or footer text and check whether the page value is being displayed there. I would do the following:

1. have a consistent syntax in your header or footer,
2. use the action to read the footer/header,
3. then extract the actual page value and remove any extra details there might be (text or number before or after the actual page value).

For instance, you might have the following in the footer:
"Page 2 - Extra details including numbers". If you only want the value "2" and want to ignore everything else, extract it using the Blue Prism string functions that are already available. I have added a hyphen to the footer to separate the page value (which could be 1, 2, 3 or even more digits) from the rest. You would extract everything after "Page", stop at "-", and then trim the result to remove any spaces. This way, you should be able to see whether there is a page value in the footer, and exclude any other irrelevant information.

Let me know if this helps.
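
For illustration only, the extraction described above could be sketched in Python as follows. In Blue Prism the same steps would be done with the built-in string functions; the function name and the exact footer convention ("Page", digits, then a hyphen) are assumptions taken from the example footer.

```python
import re
from typing import Optional

# Assumed footer convention: "Page <digits> - <anything else>".
# The hyphen marks where the page value ends, so later numbers are ignored.
PAGE_VALUE = re.compile(r"\bPage\s+(\d+)\s*-")


def extract_page_value(footer_text: str) -> Optional[int]:
    """Return the page value from the footer text, or None if it is absent."""
    match = PAGE_VALUE.search(footer_text)
    return int(match.group(1)) if match else None
```

With the example footer "Page 2 - Extra details including numbers", this returns 2. A footer without the "Page ... -" pattern returns None, which can be used to set the data item that records whether a page number is displayed.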

Rachma

------------------------------
Rachma Salim
------------------------------