pdf rubbish text

General discussions about Inkscape.
kleinempfaenger
Posts: 2
Joined: Thu Sep 15, 2011 5:35 am

Postby kleinempfaenger » Thu Sep 27, 2012 12:05 am

Good day,
I want to open a PDF in Inkscape, but the text imports as rubbish. The same PDF file displays correctly in Evince.
What happens?

Ubuntu 12.04

Thanks and greetings, kl

brynn
Posts: 10309
Joined: Wed Sep 26, 2007 4:34 pm
Location: western USA

Re: pdf rubbish text

Postby brynn » Fri Sep 28, 2012 10:26 am

Hi kl,
There's a small dialog window that appears when you import. Usually it will ask how to handle text. And usually the choices are text as text and text as paths. Whichever one you used the first time, try the other. Although to be honest, sometimes it only gives you one choice (text as text).

I know flowed text is a problem when you're making a document in Inkscape that's destined for the web (if I understand correctly). I don't know if it affects imports (for example, if the PDF was made with flowed text).

Was the PDF originally made with Inkscape? Or otherwise, do you know anything about its origins?

I'm not sure what else to say... I guess if you could provide the PDF, we could try the import. Well, I could try the import on my system, and others here might be able to spot the problem.

Sorry, I wish I could be more helpful :)

~suv
Posts: 2272
Joined: Sun May 10, 2009 2:07 am

Re: pdf rubbish text

Postby ~suv » Fri Sep 28, 2012 2:27 pm

brynn wrote:There's a small dialog window that appears when you import. Usually it will ask how to handle text. And usually the choices are text as text and text as paths. Whichever one you used the first time, try the other. Although to be honest, sometimes it only gives you one choice (text as text).
The only available option is 'Text as text'. Importing text as paths from a PDF file is not supported so far (yes, it's among the well-known feature requests, but not implemented).

kleinempfaenger wrote:What happens?
Could be a problem with the text encoding which might not be handled properly by Inkscape's poppler-based import, or protected text content. You might consider sharing the PDF file to allow others to take a closer look.

Jelle
Posts: 78
Joined: Sat Nov 06, 2010 11:25 am

Re: pdf rubbish text

Postby Jelle » Tue Oct 09, 2012 5:50 pm

I think I may know what he is referring to. If you import a PDF, it works brilliantly, but if you then want to edit the text of the imported file, you'll find that it is a bit problematic. The PDF import creates one Text element with a number of Spans, and all the text inside is "manually" kerned to make up for Inkscape's rather primitive capabilities in this regard. As a result, if you change a text, your newly typed text will NOT end up where you'd expect, but at the position defined by the dx/dy values of the kerning. Ehm... if you know SVG you'll get the point; otherwise just import some PDF text and edit it for fun.
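For those who want to see what that kerning looks like in the file itself, here is a minimal Python sketch. It assumes the per-glyph offsets sit in `dx`/`dy` attributes on the `text` and `tspan` elements, as described above; the sample SVG and the function name `strip_manual_kerning` are made up for illustration, and a real imported file may store the positions differently (for example as lists in `x`/`y`), so treat it as a starting point rather than a fix.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical sample resembling imported PDF text: one <text> with a
# <tspan> whose glyphs are nudged individually via dx/dy lists.
SAMPLE = f'''<svg xmlns="{SVG_NS}">
  <text x="10" y="20"><tspan x="10" y="20" dx="0 -1.5 0.8 -0.4" dy="0 0 0.2 0">Test</tspan></text>
</svg>'''


def strip_manual_kerning(svg_source: str) -> str:
    """Return the SVG with dx/dy removed from every text and tspan element."""
    # Keep the SVG namespace as the default one when serializing again.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_source)
    text_tags = (f"{{{SVG_NS}}}text", f"{{{SVG_NS}}}tspan")
    for element in root.iter():
        if element.tag in text_tags:
            for attr in ("dx", "dy"):
                # pop() with a default does nothing if the attribute is absent.
                element.attrib.pop(attr, None)
    return ET.tostring(root, encoding="unicode")


cleaned = strip_manual_kerning(SAMPLE)
print(cleaned)
```

The absolute `x`/`y` start positions are left alone, so each line keeps its place and only the per-character offsets go away. Expect the line lengths to change afterwards, since the font's own spacing takes over.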

There is, however, a fairly easy way to solve the problem. Use the Split text tool from the Extensions menu, set to split by lines, and apply it to the text you want to edit. It will automatically straighten out all the text and remove the kerning positioning. After that you can easily edit all the text. Unfortunately, a tool to put split text back into a single Text with Spans has not been created yet (though that happens in the PDF converter, so I guess the code could be found there).

If your PDF is protected, just open it with LibreOffice Draw and save it as PDF again (or SVG ;-). LibreOffice Draw ignores the protection and turns it into usable content. It also creates nice SVG font libraries, for those interested, when exporting text to SVG.

nigeldodd
Posts: 1
Joined: Wed Dec 18, 2013 11:36 pm

Re: pdf rubbish text

Postby nigeldodd » Wed Dec 18, 2013 11:57 pm

This seems to be a persistent problem across different PDFs. I'm using Inkscape 0.48.

The imported text comes in single-line chunks, but the kerning is wrong: the characters are too close together. If I try to edit the text using the text tool, the new characters are placed on top of one another.

I can change the font, but the bad kerning remains. Some lines of text seem to be rendered properly.

Further investigation reveals that if I select Text > Remove Manual Kerns, the text expands in length and the kerning is correct, but now the line is too long to fit on the page, so I have to reduce the font size. Then I find that some text is missing off the end.

I have had this problem with PDFs from completely different sources.

