C# HTML to PDF is missing some characters (iTextSharp)

Question

Asked by slayer35

I want to export a GridView to PDF using the iTextSharp library. The problem is that some Turkish characters such as ?, ?, ?, ? etc. are missing from the PDF document. The code used to export the PDF is:

 protected void LinkButtonPdf_Click(object sender, EventArgs e)
    {
        Response.ContentType = "application/pdf";
        Response.ContentEncoding = System.Text.Encoding.UTF8;
        Response.AddHeader("content-disposition", "attachment;filename=FileName.pdf");
        Response.Cache.SetCacheability(HttpCacheability.NoCache);
        System.IO.StringWriter stringWrite = new StringWriter();
        System.Web.UI.HtmlTextWriter htmlWrite = new HtmlTextWriter(stringWrite);
        GridView1.RenderControl(htmlWrite);
        StringReader reader = new StringReader(textConvert(stringWrite.ToString()));
        Document doc = new Document(PageSize.A4);
        HTMLWorker parser = new HTMLWorker(doc);
        PdfWriter.GetInstance(doc, Response.OutputStream);
        doc.Open();
        parser.Parse(reader);
        doc.Close();
    }
    public static string textConvert(string S)
    {
        if (S == null) { return null; }
        try
        {
            System.Text.Encoding encFrom = System.Text.Encoding.UTF8;
            System.Text.Encoding encTo = System.Text.Encoding.UTF8;
            string str = S;
            Byte[] b = encFrom.GetBytes(str);
            return encTo.GetString(b);
        }
        catch { return null; }
    }

Note: when I add text to the PDF document directly, the otherwise missing characters are displayed correctly. I insert the characters with this code:

   BaseFont bffont = BaseFont.CreateFont(@"C:\WINDOWS\Fonts\arial.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
        Font fontozel = new Font(bffont, 12, Font.NORMAL, new Color(0, 0, 0));
        doc.Add(new Paragraph("????????????", fontozel));

Answer 1

Accepted answer by slayer35

Finally I think I found the solution: I changed the iTextSharp source code slightly so that it shows Turkish characters. (The Turkish code page is Cp1254.)

I added public const string CP1254 = "Cp1254"; to [BaseFont.cs] in the source code.
After that I modified [FactoryProperties.cs] like this:

public Font GetFont(ChainedProperties props)
{
    // The rest of the method is unchanged and not shown; only the end of it was modified.

    // ---- Default iTextSharp code ----
    // if (encoding == null)
    //     encoding = BaseFont.WINANSI;
    // return fontImp.GetFont(face, encoding, true, size, style, color);

    // ---- Modified code ----
    encoding = BaseFont.CP1254;
    return fontImp.GetFont(@"C:\WINDOWS\Fonts\arial.ttf", encoding, true, size, style, color);
}

After I compiled the new DLL, the missing characters were shown.
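
To illustrate why the default encoding loses these letters: BaseFont.WINANSI is Cp1252, which has no slots for the Turkish-specific letters, while Cp1254 does. The sketch below checks this with the .NET encoding classes only (no iTextSharp involved). The class and method names are made up for the illustration, and the RegisterProvider call is an assumption about the runtime: it is needed on .NET Core / .NET 5+ and can be dropped on the classic .NET Framework.

```csharp
using System;
using System.Text;

public static class CodePageCheck   // hypothetical helper, not part of iTextSharp
{
    // Returns true if every character of 'text' can be encoded in the given code page.
    public static bool CanEncode(int codePage, string text)
    {
        // Assumption: running on .NET Core / .NET 5+, where code pages such as
        // 1252 and 1254 must be registered before use.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // ExceptionFallback makes unmappable characters throw instead of
        // silently becoming '?' or a "best fit" substitute.
        Encoding enc = Encoding.GetEncoding(
            codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            enc.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        string sample = "ğĞıİşŞ";
        Console.WriteLine(CanEncode(1252, sample)); // False
        Console.WriteLine(CanEncode(1254, sample)); // True
    }
}
```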

Answer 2

Answered by paracycle

I am not familiar with the iTextSharp library; however, you seem to be converting the output of your gridview component to a string and reading from that string to construct your PDF document. You also have a strange conversion from UTF-8 to UTF-8 going on.

From what I can see (given that your GridView is outputting characters correctly), if you are outputting the characters to a string, they are represented as UTF-16 in memory. You probably need to pass this string directly into the PDF library (just as you pass the raw UTF-16 .NET string "???????????" as it is).
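
As a small check of the point about the UTF-8 to UTF-8 conversion, here is a sketch (class and method names are illustrative only) showing that encoding a .NET string to UTF-8 bytes and decoding it again with the same encoding returns an identical string, so the textConvert helper in the question does not change anything for well-formed text:

```csharp
using System;
using System.Text;

public static class RoundTripDemo   // hypothetical name for this illustration
{
    // Mirrors what textConvert does: UTF-8 encode, then UTF-8 decode.
    public static string RoundTrip(string s)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(s);
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        string original = "ığşİĞŞ çöü";
        Console.WriteLine(RoundTrip(original) == original); // True
    }
}
```

Since the string comes back unchanged, the helper can be removed and stringWrite.ToString() passed to the StringReader directly.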

Answer 3

Answered by Axl

For Turkish encoding

CultureInfo ci = new CultureInfo("tr-TR");
Encoding enc = Encoding.GetEncoding(ci.TextInfo.ANSICodePage);

If you're outputting HTML, try different DOCTYPE tags at the top of the page.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

Note if using HTML you may need to HTMLEncode the characters.

Server.HtmlEncode()

HttpServerUtility.HtmlEncode()

Answer 4

Answered by Murat

No need to change the source code.

Try this:

iTextSharp.text.pdf.BaseFont STF_Helvetica_Turkish = iTextSharp.text.pdf.BaseFont.CreateFont("Helvetica","Cp1254", iTextSharp.text.pdf.BaseFont.NOT_EMBEDDED);    

iTextSharp.text.Font fontNormal = new iTextSharp.text.Font(STF_Helvetica_Turkish, 12, iTextSharp.text.Font.NORMAL);

Answer 5

Answered by xoraxbx

BaseFont bF = BaseFont.CreateFont(@"c:\arial.ttf", "windows-1254", true);
Font f = new Font(bF,12f,Font.NORMAL);
Chunk c = new Chunk();
c.Font = f;
c.Append("Turkish characters: ?ü???? ?ü????");
document.Add(c);

In the first line, you can use any of the following instead of "windows-1254". They all work:

Cp1254
iso-8859-9
windows-1254

Answer 6

Answered by dungnguyen

You can use:

iTextSharp.text.pdf.BaseFont Vn_Helvetica = iTextSharp.text.pdf.BaseFont.CreateFont(@"C:\Windows\Fonts\arial.ttf", "Identity-H", iTextSharp.text.pdf.BaseFont.EMBEDDED);
iTextSharp.text.Font fontNormal = new iTextSharp.text.Font(Vn_Helvetica, 12, iTextSharp.text.Font.NORMAL);

Answer 7

Answered by Fatih ?engel

I solved the problem. Here is another kind of solution:

try
{
        BaseFont bf = BaseFont.CreateFont(@"c:\windows\fonts\calibrib.ttf",
            BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
        Document document = new Document(PageSize.A4, 25, 25, 30, 30);
        PdfWriter writer = PdfWriter.GetInstance(document, fs);

        Font f = new Font(bf, 12f, Font.NORMAL);
        // Open the document to enable you to write to the document
        document.Open();
        // Add a simple and wellknown phrase to the document
        for (int x = 0; x != 100; x++)
        {
            document.Add(new Paragraph("Paragraph - This is a test! ??????????üü",f));
        }

        // Close the document
        document.Close();          
}
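catch (Exception)
{
    // Exceptions are swallowed here as in the original snippet;
    // log or rethrow them in real code so failures are not hidden.
}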

Answer 8

Answered by VahidN

Don't change the iTextSharp source code. Define a new style:

        var styles = new StyleSheet();
        styles.LoadTagStyle(HtmlTags.BODY, HtmlTags.FONTFAMILY, "tahoma");
        styles.LoadTagStyle(HtmlTags.BODY, HtmlTags.ENCODING, "Identity-H");

and then pass it to the HTMLWorker.ParseToList method.
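
A minimal sketch of how that might look end to end, assuming iTextSharp 5.x (where HTMLWorker.ParseToList(TextReader, StyleSheet) is a static method) and assuming tahoma.ttf sits at the usual Windows font path. The class and method names (HtmlPdfSketch, WriteHtmlAsPdf) and the font path are illustrative assumptions, not part of the answer above:

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.html;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

public static class HtmlPdfSketch   // hypothetical helper
{
    public static void WriteHtmlAsPdf(string html, Stream output)
    {
        // Register the font file so the "tahoma" family can be resolved.
        // The path is an assumption; adjust it for your machine.
        FontFactory.Register(@"C:\Windows\Fonts\tahoma.ttf");

        // Same style sheet as in the answer: body text uses tahoma with
        // the Identity-H (Unicode) encoding.
        var styles = new StyleSheet();
        styles.LoadTagStyle(HtmlTags.BODY, HtmlTags.FONTFAMILY, "tahoma");
        styles.LoadTagStyle(HtmlTags.BODY, HtmlTags.ENCODING, "Identity-H");

        var doc = new Document(PageSize.A4);
        PdfWriter.GetInstance(doc, output);
        doc.Open();
        using (var reader = new StringReader(html))
        {
            // ParseToList returns the parsed elements; add each one to the document.
            foreach (IElement element in HTMLWorker.ParseToList(reader, styles))
            {
                doc.Add(element);
            }
        }
        doc.Close();
    }
}
```

In the click handler from the question, this could be called as HtmlPdfSketch.WriteHtmlAsPdf(stringWrite.ToString(), Response.OutputStream); in place of the parser.Parse(reader) sequence.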

Answer 9

Answered by Vinit Patel

I finally found a solution for this problem. With it you can print all Turkish characters.

String htmlText = html.ToString();

    Document document = new Document();

    string filePath = HostingEnvironment.MapPath("~/Content/Pdf/");
    PdfWriter.GetInstance(document, new FileStream(Path.Combine(filePath, "pdf-" + Name + ".pdf"), FileMode.Create));
    document.Open();

    iTextSharp.text.html.simpleparser.HTMLWorker hw = new iTextSharp.text.html.simpleparser.HTMLWorker(document);
    FontFactory.Register(Path.Combine(_webHelper.MapPath("~/App_Data/Pdf/arial.ttf")),  "Garamond");   // just give a path of arial.ttf 
    StyleSheet css = new StyleSheet();
    css.LoadTagStyle("body", "face", "Garamond");
    css.LoadTagStyle("body", "encoding", "Identity-H");
    css.LoadTagStyle("body", "size", "12pt");

    hw.SetStyleSheet(css);

     hw.Parse(new StringReader(htmlText));
     document.Close();   // close the document so the PDF file is finalized

Answer 10

Answered by ekarakus

Thank you very much to everyone who posted samples.

I use the solution below from CodeProject, and it had Turkish character set problems because of the font.

If you use HTMLWorker, you should register the font and pass it to HTMLWorker:

http://www.codeproject.com/Articles/260470/PDF-reporting-using-ASP-NET-MVC3

      StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();
                styles.LoadTagStyle("h3", "size", "5");
                styles.LoadTagStyle("td", "size", ".6");
                FontFactory.Register(@"c:\windows\fonts\arial.ttf", "Garamond");   // just give the path of arial.ttf
                styles.LoadTagStyle("body", "face", "Garamond");
                styles.LoadTagStyle("body", "encoding", "Identity-H");
                styles.LoadTagStyle("body", "size", "12pt");
                using (var htmlViewReader = new StringReader(htmlText))
                {
                    using (var htmlWorker = new HTMLWorker(pdfDocument, null, styles))
                    {
                        htmlWorker.Parse(htmlViewReader);
                    }
                }
