Question

I'm trying to automate converting a PDF to a PNG file with SCons. The tool I use for the conversion is convert from ImageMagick.

Here are the raw commands:

  1. convert input.pdf temp/temp.png
  2. convert temp/*.png -append output.png

The first command generates one PNG file per page of the PDF, so the target of the first command is a dynamic file list.

Here's the SConstruct file I'm working on:

convert = Builder(action=[
    Delete("${TARGET.dir}"),
    Mkdir("${TARGET.dir}"),
    "convert $SOURCE $TARGET"])
combine = Builder(action="convert $SOURCE -append $TARGET")

env = Environment(BUILDERS={"Convert": convert, "Combine": combine})

pdf = env.PDF("input.tex")
pngs = env.Convert("temp/temp.png", pdf) # I don't know how to specify target in this line
png = env.Combine('output.png', pngs)
Default(png)

The line pngs = env.Convert("temp/temp.png", pdf) is actually wrong: the target is multiple files, and I don't know how many there are until env.Convert has run, so the final output.png contains only the first page of the PDF.

Any hint is appreciated.

UPDATE:

I just found that I can use the command convert input.pdf -append output.png to avoid the two-step conversion.

Still, I'm curious how to handle the case where the intermediate temporary file list is unknown beforehand and requires a dynamic target list.

Solution

If you want to know how to handle the original (convert and combine) situation you proposed, I would suggest creating a builder with a SCons Emitter. The emitter allows you to modify the list of source and target files. This works nicely for generated files that don't yet exist on a clean build.

As you mentioned, the convert step will generate multiple targets. The trick is that you need to be able to "calculate" those targets in the emitter based on the source. For example, I recently created a wsdl2java builder and was able to do some simple wsdl parsing in the emitter to calculate all of the target java files to be generated (the source being the wsdl).

Here is a general idea of what the build scripts should look like:

def convert_emitter(source, target, env):
    # Both source and target will be lists of nodes.
    # In this case the incoming target can be ignored, and you need
    # to calculate all of the generated targets based on the
    # source pdf file. You will need to open the source file
    # with standard python code. All of the targets will be
    # removed when cleaned (scons -c)
    target = [] # fill in accordingly
    return (target, source)

# Optionally, you could supply a function for the action
# which would have the same signature as the emitter
convert = env.Builder(emitter=convert_emitter,
                      action=[
                         Delete("temp"),
                         Mkdir("temp"),
                         "convert $SOURCE $TARGET"])
env.Append(BUILDERS={'Convert' : convert})

# $SOURCES (plural) expands to every page image, not just the first one
combine = env.Builder(action="convert $SOURCES -append $TARGET")
env.Append(BUILDERS={'Combine' : combine})

pdf = env.PDF('input.tex')
# You can omit the target in this call, as it will be filled-in by the emitter
pngs = env.Convert(source=pdf)
png = env.Combine(target='output.png', source=pngs)
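To make the "fill in accordingly" part concrete, here is a minimal sketch of what such an emitter might look like. Everything in it is an assumption rather than part of the answer above: the helper names count_pdf_pages and page_png_names are hypothetical, the page count is a rough heuristic that scans the raw bytes for "/Type /Page" (it can miscount PDFs that use compressed object streams), and the file names assume ImageMagick's usual temp-0.png, temp-1.png, ... numbering for multi-page output.

```python
import os
import re


def count_pdf_pages(path):
    """Rough page count: occurrences of '/Type /Page' (not '/Pages') in the raw bytes."""
    with open(path, "rb") as f:
        data = f.read()
    return len(re.findall(rb"/Type\s*/Page\b", data))


def page_png_names(page_count, stem="temp/temp"):
    """Names assumed for the per-page images written by 'convert in.pdf temp/temp.png'."""
    if page_count <= 1:
        # Assumption: a single-page PDF yields the plain name with no index
        return [stem + ".png"]
    return ["%s-%d.png" % (stem, i) for i in range(page_count)]


def convert_emitter(target, source, env):
    """Replace the target list with one PNG per page of the first source."""
    pdf_path = str(source[0])
    if not os.path.exists(pdf_path):
        # The PDF has not been built yet, so nothing can be calculated here
        return target, source
    return page_png_names(count_pdf_pages(pdf_path)), source
```

With targets named this way, the convert command would probably need to write to the fixed stem (for example "convert $SOURCE temp/temp.png") rather than to $TARGET, since $TARGET would expand to the first page file. The early return for a missing PDF is exactly the case where this approach stops being enough.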

Other tips

Depending on what qualifies as "dynamic" for you, I believe the correct answer is: not possible.

As long as the source from which you would like to "dynamically" compute a target set is present when SCons is run, @Brady's solution should work fine. However, if the source in question is itself the target of some other command, it will not work. This is a fundamental limitation of SCons, as it assumes that the set of build targets can be statically determined from the base set of input (non-intermediate) sources. It computes a build/target/dependency graph in one sweep, then executes it in the next. It has no ability to run through some known portion of the build graph, stop to inspect some intermediate targets to dynamically compute the rest of the build graph, and then continue. I'd frankly love to have this ability in the work that I do with SCons, but I'm afraid this is just a fundamental limitation.

The best you can do is set the build up so that on the first run it stops at the construction of the PDF (if no PDF target exists when the build script is executed). Once the PDF has been built, you can rerun the build and set things up so the rest of the build steps execute based on the PDF built by the last run. This works decently, except for one problem: if the PDF ends up changing (and gaining some new pages, for instance), you'll have to run the build twice in order to capture the changes, since any page counts etc. will be based on the old version of the PDF.
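The two-pass idea can be sketched as a small helper that decides what to request on a given run. This is only an illustration under assumptions: the function name plan_default_targets is hypothetical, and the file names are taken from the question.

```python
import os


def plan_default_targets(pdf_path, final_png="output.png"):
    """Pick the targets to request on this run of the build.

    First run (no PDF on disk yet): only the PDF can be built.
    Later runs: the PDF exists, so the page images can be derived from it.
    """
    if not os.path.exists(pdf_path):
        return [pdf_path]
    return [final_png]
```

In an SConstruct this might be used as Default(plan_default_targets("input.pdf")). It does not remove the caveat described above: when the PDF changes, the page list is still calculated from the old file, so a second run is needed.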

I'd love for someone to prove me wrong here, but such is the way of things.

Looking at this, there's no requirement for the individual temp/*png files to be kept. If there were, you shouldn't be putting them in a temp directory, and in any case you'd have to do quite a bit of work if you wanted to work out which pages to generate.

So it looks more sensible to do this as one step. You'd have something like this:

png = env.Convert('output.png', 'input.pdf')

where the action list for the Convert builder is something like this:

Delete('temp'),
Mkdir('temp'),
# writes one image per page under temp/
'convert $SOURCE temp/$TARGET',
# stacks the page images into the single target file
'convert temp/*.png -append $TARGET',
Delete('temp')

Though frankly you might do better writing the whole thing as a single callable script, to make sure you get the page sorting correct.
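The sorting concern is that a plain shell glob orders names as text, so a file ending in -10.png comes before one ending in -2.png. A minimal sketch of how a script could order the pages numerically follows. It assumes the page images end in -N.png, and the helper names page_index, sort_pages and build_append_command are hypothetical.

```python
import re


def page_index(path):
    """Extract the trailing page number from names like 'temp/output-12.png'."""
    match = re.search(r"-(\d+)\.png$", path)
    # Files without an index (single-page case) sort first
    return int(match.group(1)) if match else 0


def sort_pages(paths):
    """Order page images by page number rather than alphabetically."""
    return sorted(paths, key=page_index)


def build_append_command(paths, output):
    """Argument list for stacking the pages into one image, in page order."""
    return ["convert"] + sort_pages(paths) + ["-append", output]
```

The resulting list could be handed to subprocess.run from a Python action function, which also avoids relying on the shell's glob order.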

License: CC-BY-SA with attribution
Not affiliated with StackOverflow