To start: everything I describe below assumes you're building this logic on your own. If you have the resources for it, the easiest route would be a commercial service that already knows how to extract invoices from a large PDF. I don't have a specific product to suggest on that front.
If these are scanned PDFs, you'll have to use some kind of OCR to read the text. Are you already using an OCR solution? Either way, I'll assume you're using AWS Textract or a tool that provides similar results. How I handle this depends heavily on the content of the PDFs. Unless you use a service that splits the file up for you, you'll need to come up with some logic to identify each page and group the pages that belong together. For example, I check whether one or more sets of phrases appear on the first or last page of a group. If these invoices can come from any company and could contain any text, that will be difficult. Let us know if there is some kind of pattern to them, such as set phrases to look for on the first page.
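Here's a rough sketch of the phrase-based grouping, assuming you already have the OCR text of each page as a plain string (from Textract or whatever you use). The function name, the `phrase_sets` parameter, and the sample phrases are all made up for illustration. I'm treating a "set" as a group of phrases that must all appear on a page for it to count as a first page; adjust that to whatever your documents actually look like.

```python
from typing import List, Sequence


def group_pages_by_start_phrases(
    pages: Sequence[str],
    phrase_sets: Sequence[Sequence[str]],
) -> List[List[int]]:
    """Group page indexes, starting a new group on each detected first page.

    A page counts as a first page when every phrase in at least one
    phrase set appears in its text (case-insensitive). This is an
    assumed rule for illustration, not a general-purpose detector.
    """
    normalized_sets = [[p.lower() for p in s] for s in phrase_sets]
    groups: List[List[int]] = []

    for index, text in enumerate(pages):
        lowered = text.lower()
        is_first_page = any(
            all(phrase in lowered for phrase in phrase_set)
            for phrase_set in normalized_sets
        )
        # Open a new group on a first page, or when nothing is open yet.
        if is_first_page or not groups:
            groups.append([index])
        else:
            groups[-1].append(index)

    return groups


# Hypothetical OCR output for a four-page PDF holding two invoices.
sample_pages = [
    "INVOICE\nBill To: Acme Corp\nInvoice Number: 1001",
    "Line items continued",
    "Invoice\nBill To: Beta LLC\nInvoice Number: 1002",
    "Total due: 40.00",
]
sample_groups = group_pages_by_start_phrases(
    sample_pages, [["bill to", "invoice number"]]
)
```

Each inner list holds the zero-based page indexes of one invoice, which you can then hand to whatever PDF library you use to write out the split files.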
The other thing you can do is try extracting the page numbers. Invoices and similar documents often say 'Page 1 of 6' or just '1 of 6'. You could use a regular expression to find every instance of the word 'of' surrounded by two numbers, then compare the two numbers. When they match, as in '6 of 6', the current page is the last one in its group, so the next page starts a new group.
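A minimal sketch of the page-number approach, again assuming one OCR text string per page. The names here are my own, and the regex is deliberately simple: it will also match things like '2 of 5 items' in body text, so you'd probably want to restrict it to the header or footer region if your OCR output gives you positions.

```python
import re
from typing import List, Optional, Sequence, Tuple

# Matches "Page 1 of 6" as well as a bare "1 of 6".
PAGE_OF_RE = re.compile(
    r"\b(?:page\s+)?(\d{1,4})\s+of\s+(\d{1,4})\b", re.IGNORECASE
)


def find_page_marker(text: str) -> Optional[Tuple[int, int]]:
    """Return (current, total) from the first plausible 'N of M' match."""
    for match in PAGE_OF_RE.finditer(text):
        current, total = int(match.group(1)), int(match.group(2))
        if 1 <= current <= total:
            return current, total
    return None


def group_pages_by_page_numbers(pages: Sequence[str]) -> List[List[int]]:
    """Close a group on 'M of M'; also start a new one on '1 of M'."""
    groups: List[List[int]] = []
    current_group: List[int] = []

    for index, text in enumerate(pages):
        marker = find_page_marker(text)
        # A "1 of M" page while a group is still open means the previous
        # document never showed its last-page marker, so close it here.
        if marker and marker[0] == 1 and current_group:
            groups.append(current_group)
            current_group = []
        current_group.append(index)
        # "M of M" is the last page of the current document.
        if marker and marker[0] == marker[1]:
            groups.append(current_group)
            current_group = []

    if current_group:
        groups.append(current_group)
    return groups


# Hypothetical OCR output: a 2-page, a 1-page, and a 3-page invoice.
sample_pages = [
    "Invoice A\nPage 1 of 2",
    "Totals\nPage 2 of 2",
    "Invoice B\n1 of 1",
    "Invoice C\nPage 1 of 3",
    "OCR missed the footer on this page",
    "Page 3 of 3",
]
sample_groups = group_pages_by_page_numbers(sample_pages)
```

Pages where no marker is found just get attached to the group that's currently open, so a missed footer in the middle of an invoice doesn't break the split.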
Another option is to read entity names or similar identifying details off the pages (this might need some kind of NLP/NLU) and then work that into your page-grouping logic.
------------------------------
Dave Morris
Cano Ai
Atlanta, GA
------------------------------
Dave Morris, 3Ci at Southern Company