Question

I'm trying to set up calibre (calibre-ebook.com) to automatically extract metadata from imported PDF files into the library. I usually name my files this way:

Author. Title. Local. Publisher. Published. ISBN.pdf

Example:

C:\Test\RANCIÊRE, Jacques. O mestre ignorante. Belo Horizonte. Autêntica. 2010. 978-85-7526-045-6.pdf


I'm stuck trying to get the first parameter, Author, using the regex:

([^\\]+)\.

I'm getting this value:

RANCIÊRE, Jacques. O mestre ignorante. Belo Horizonte. Autêntica. 2010. 978-85-7526-045-6


Since a regex reads from left to right, shouldn't it stop at the first dot (.) matched by \.?

The desired value on this example is:

RANCIÊRE, Jacques

Any hints for the other fields? For example, for Title the desired value is:

O mestre ignorante

Thanks in advance!

Solution

^.+?\. will get you C:\Test\RANCIÊRE, Jacques.

It matches everything from the start of the string up to and including the first dot, so the folder path is still part of the result.

If you want only RANCIÊRE, Jacques, then use:

(?!(.*\\))(.+?\.)

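The negative lookahead (?!(.*\\)) only lets the match start at a position with no backslash ahead of it, that is, after the folder path. This will give you RANCIÊRE, Jacques. (the trailing dot is part of the match).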
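
To check both patterns outside calibre, here is a minimal sketch using Python's re module. It assumes the full path from the question is the string being searched; calibre may apply the expression to a different part of the file name, so treat this only as a way to see what each pattern matches. The variable names are illustrative.

```python
import re

# Example path from the question (raw string so backslashes stay literal).
path = r"C:\Test\RANCIÊRE, Jacques. O mestre ignorante. Belo Horizonte. Autêntica. 2010. 978-85-7526-045-6.pdf"

# Lazy match anchored at the start: stops at the first dot,
# but the folder prefix is still included.
with_prefix = re.search(r"^.+?\.", path).group(0)

# Negative lookahead: the match can only start where no backslash
# remains ahead, i.e. right after the last path separator.
m = re.search(r"(?!(.*\\))(.+?\.)", path)
with_dot = m.group(2)            # group 1 sits inside the lookahead and is never set
author = with_dot.rstrip(".")    # drop the trailing dot that the pattern includes

print(with_prefix)
print(with_dot)
print(author)
```

Because the second pattern keeps the final dot inside its capture group, the dot has to be stripped afterwards or moved outside the group.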

OTHER TIPS

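Regex quantifiers are greedy by default, meaning they try to match as much as possible. In your pattern, [^\\]+ first runs to the end of the file name and then backtracks only as far as the last dot, which is why you get everything up to the extension. Try the non-greedy version: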

([^\\]+?)\.

Note that the only difference is the addition of a ? after the +.
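
The difference between the two quantifiers can be seen side by side in a short Python sketch. It assumes the pattern is searched against the full path from the question; the variable names are illustrative.

```python
import re

path = r"C:\Test\RANCIÊRE, Jacques. O mestre ignorante. Belo Horizonte. Autêntica. 2010. 978-85-7526-045-6.pdf"

# Greedy: [^\\]+ grabs as much as it can, then backtracks to the LAST dot.
greedy = re.search(r"([^\\]+)\.", path).group(1)

# Lazy: [^\\]+? grows one character at a time and stops at the FIRST dot.
lazy = re.search(r"([^\\]+?)\.", path).group(1)

print(greedy)
print(lazy)
```

The greedy result reproduces the value reported in the question, and the lazy one yields only the author.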

Afterwards, you should be able to retrieve the author's name ("RANCIÊRE, Jacques") with just \1.
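
For the other fields, one possible approach is a single expression with one named group per field, each separated by a dot and optional whitespace. The sketch below is in Python and rests on two assumptions: no field contains a dot of its own, and the group names (author, title, local, publisher, published, isbn) are chosen here for illustration. Which group names calibre itself recognizes should be checked in its documentation.

```python
import re

path = r"C:\Test\RANCIÊRE, Jacques. O mestre ignorante. Belo Horizonte. Autêntica. 2010. 978-85-7526-045-6.pdf"

# One named group per field; [^.]+ stops at the next dot, so no lazy
# quantifier is needed. The author group also excludes backslashes so
# that the folder path is skipped.
FIELDS = re.compile(
    r"(?P<author>[^\\.]+)\.\s*"
    r"(?P<title>[^.]+)\.\s*"
    r"(?P<local>[^.]+)\.\s*"
    r"(?P<publisher>[^.]+)\.\s*"
    r"(?P<published>\d{4})\.\s*"
    r"(?P<isbn>[\d-]+)\.pdf$"
)

m = FIELDS.search(path)
fields = m.groupdict() if m else {}

for name, value in fields.items():
    print(f"{name}: {value}")
```

A title such as "Vol. 2" would break this pattern, since the dot doubles as the field separator; a less common separator in the file names would avoid that.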

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow