How To: Concatenate PDF files in Linux
When I am working in Inkscape I often want to put together multi-page graphics. Sadly Inkscape doesn’t yet support multi-page documents, although the feature is planned, so it shouldn’t be too long before it does. My workaround is to create multiple documents, one for each page, then use “Print to file” to save each one as a PDF. Now I have, say, 3 PDFs. All I need to do now is concatenate them to create one single PDF that I can send to the customer.
Here’s one way to do it using GhostScript from the command line:
gs -q -sPAPERSIZE=a4 -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=logo-concepts.pdf logo-concepts-1.pdf logo-concepts-2.pdf logo-concepts-3.pdf
In this example logo-concepts.pdf is the finished PDF with all pages appended, and logo-concepts-1.pdf, logo-concepts-2.pdf and logo-concepts-3.pdf are the individual pages. You can tweak “-sPAPERSIZE=” to suit your needs.
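With more than a handful of pages, typing every file name gets tedious, and a plain shell glob would list logo-concepts-10.pdf before logo-concepts-2.pdf. Here is a minimal sketch, not part of the original recipe, that wraps the same gs command in a shell function. The name merge_pdfs and the DRY_RUN variable are made up for illustration, and it assumes a sort that supports -V (such as GNU sort) and file names without newlines.

```shell
# merge_pdfs OUTPUT INPUT...
# Concatenate the INPUT PDFs into OUTPUT with Ghostscript, in natural
# numeric order. Set DRY_RUN=1 to print the command instead of running it.
# (Hypothetical helper; the body runs in a subshell so nothing leaks out.)
merge_pdfs() (
    if [ "$#" -lt 2 ]; then
        echo "usage: merge_pdfs OUTPUT INPUT..." >&2
        exit 2
    fi
    output=$1
    shift

    # sort -V puts page-2 before page-10, unlike plain glob order.
    sorted=$(printf '%s\n' "$@" | sort -V)

    # Split the sorted list on newlines only, with globbing switched off,
    # so names containing spaces or wildcards survive intact.
    IFS='
'
    set -f
    set -- $sorted
    unset IFS   # back to default; "$*" now joins with spaces

    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "gs -q -sPAPERSIZE=a4 -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=$output $*"
    else
        gs -q -sPAPERSIZE=a4 -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
            "-sOutputFile=$output" "$@"
    fi
)
```

For example, merge_pdfs logo-concepts.pdf logo-concepts-*.pdf should build the same file as the command above. Setting DRY_RUN=1 first prints the gs command so you can check the page order before running it.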
Another Method
As “anon” points out in the comments, it is also possible to use the command “pdfjoin”, if you have it installed. The program is part of the “pdfjam” package which, at a hefty 389 MB of dependencies, may not be ideal for you. Installing the package is easy though:
sudo apt-get install pdfjam
After the package and its dependencies have been installed you can simply use
pdfjoin logo-concepts-1.pdf logo-concepts-2.pdf logo-concepts-3.pdf
pdfjoin will choose a filename for the joined file for you. In this example it would create a file named logo-concepts-3-joined.pdf.
Using ImageMagick
As suggested by kosmo in the comments, another great method is to use ImageMagick:
convert 1.pdf 2.pdf 3.pdf target.pdf
This doesn’t work with ImageMagick on my Mac, but I expect it would work fine on a Linux system with GhostScript installed.
pdfjoin file1.pdf file2.pdf file3.pdf
Yes, granted that’s an easier way of doing it if you have the command installed. You need to install the “pdfjam” package first though, on Ubuntu:
sudo apt-get install pdfjam
I’ll update the article to include this method, thanks for your comment. :)
I just discovered that I already had pdfjam installed on my computer (Ubuntu 12.10) as a part of texlive-extra-utils – this should explain why it has this slew of dependencies. Interestingly, it actually calls pdflatex and is not just a sophisticated wrapper for Ghostscript.
Another alternative would be pdftk (PDF Toolkit), which in turn requires GCJ and therefore has quite some overhead, too. For simple tasks like concatenation, bare-bones Ghostscript (which should already be a part of your system) is probably the easiest solution, even if the command-line syntax is slightly more complicated than for a dedicated concatenation tool.
pdfunite from Poppler is also an option.
Try pdftk (23 MB).
pdftk is no longer available for CentOS/RHEL 7.
Just what I wanted, many thanks.
The easiest way is to use ImageMagick:
convert 1.pdf 2.pdf 3.pdf target.pdf
Ha, I’ve been using ImageMagick for years, never knew it could manipulate PDFs! :)
Just tried it on my Mac and it failed, but I think that’s more to do with not having GhostScript.
Well, you can kind of use ImageMagick, but remember that ImageMagick is a pixel graphics manipulator and cannot deal with vector graphics. Therefore, the first step is that it forces rasterization, then it joins the pictures and finally outputs them as a PDF. As pointed out by Sadeq, this results in poor quality and/or huge file size.
Personally, I prefer pdfunite (it comes with Poppler’s command-line utilities rather than pdftk, as far as I know).
However, there are cases when ImageMagick is one of the few solutions that work. This is, for example, the case for read-only-encrypted PDFs, which cannot simply be merged. In these cases I work around the problem using ImageMagick in this way:
convert -units PixelsPerInch -density 300 readonly.pdf readonly.png
convert readonly.png readonly-rasterized.pdf
pdfunite file1.pdf file2.pdf readonly-rasterized.pdf merged.pdf
Note on resolution:
-units PixelsPerInch -density 300 specifies a resolution of 300dpi. It must be given BEFORE the file name, because ImageMagick imports the PDF at whatever resolution is in effect when the file name appears (the default is 72dpi). Pick something lower if file size is the main concern (though 150dpi is the bare minimum if you might ever print the file), or something higher for better print quality (600dpi is considered good). The 72dpi default is low enough to look crappy even on today’s computer screens (a Full HD 15″ display is around 150dpi), let alone on printers. Also note that resolutions other than 150dpi, 300dpi, 600dpi and 1200dpi (and sometimes 2400dpi at the high end) are unusual for printers, so text rasterized at 400dpi might look worse in print than text rasterized at 300dpi.
Note on PNG intermediate:
You see that I convert first to PNG and then back to PDF. This gives the same quality as going directly to PDF, but with the benefit of PNG compression. This is going to help reduce file size dramatically. Probably, there is a way to tell ImageMagick to switch on compression without having to go through the intermediate PNG step, but I’m fine with that and haven’t looked for a way around it.
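One possible way to skip the intermediate PNG is ImageMagick’s -compress option, which sets the compression used when writing the output file. Below is a rough sketch under that assumption. The function name rasterize_pdf and the DENSITY and DRY_RUN variables are invented here for illustration, Zip is only one of the available compression types, and the resulting file size may or may not match the PNG route.

```shell
# rasterize_pdf INPUT.pdf OUTPUT.pdf
# Rasterize INPUT at DENSITY dpi (default 300) and write a Zip-compressed
# PDF in one step. Set DRY_RUN=1 to print the command instead of running it.
# (Hypothetical helper; on ImageMagick 7 the command may be "magick".)
rasterize_pdf() (
    if [ "$#" -ne 2 ]; then
        echo "usage: rasterize_pdf INPUT.pdf OUTPUT.pdf" >&2
        exit 2
    fi
    density=${DENSITY:-300}

    # -density comes before the input so the PDF is read at that resolution;
    # -compress comes before the output so it applies when writing.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "convert -units PixelsPerInch -density $density $1 -compress Zip $2"
    else
        convert -units PixelsPerInch -density "$density" "$1" -compress Zip "$2"
    fi
)
```

Calling rasterize_pdf readonly.pdf readonly-rasterized.pdf would then stand in for the two convert commands above, and the result can be passed to pdfunite as before.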
The convert method sucks. It seems to convert the input PDF files to images and then build the output PDF from those, which gives either a huge file or low quality.
Which conversion? Or “convert” as in ImageMagick?