While extracting text from a PDF file using iTextSharp, I am getting the error "Could not find image data or EI".

I have included my code and a sample file below. Please help.


Dim simg, tmp, sImgPDFLst As String
simg = "AOTD_SU_20131118_006.pdf"
' Open the PDF; the reader must be constructed before it is used.
Dim reader As New iTextSharp.text.pdf.PdfReader(simg)
tmp = pdf.parser.PdfTextExtractor.GetTextFromPage(reader, 1, New pdf.parser.SimpleTextExtractionStrategy())
' No extractable text on page 1 is taken to mean an image-only PDF.
If tmp.Length = 0 Then
    sImgPDFLst = "Following files are IMAGE PDF"
End If
reader.Close()
reader = Nothing

Link: https://drive.google.com/file/d/0B_nzYHWVJJ7KbnFSRWx5ZVNpSkk/edit?usp=sharing

In the line that assigns simg, you have given only the file name and not the path of the PDF file. Check it and supply the exact path to the file.
 
 
Comments
vinodh107 15-Mar-14 1:14am    
I have specified the path in my actual code. I only showed the file name in this snippet for reference.
vinodh107 17-Mar-14 0:00am    
Is there any solution for this? Please help.
sohail awr 17-Mar-14 2:15am    
I am working on it and will post a solution soon.
Member 10885947 15-Jun-14 16:28pm    
Did you find a solution to this problem? I am getting the same error. Can you please help?
Thank you.
VB
Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports System.IO

Partial Public Class WebForm2
    Inherits System.Web.UI.Page
    Dim path As String = Server.MapPath("PDFs")

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        ' Extract once and reuse the result instead of parsing the PDF twice.
        Dim pdfText As String = GetTextFromPDF(path + "/AOTD_SU_20131118_006.pdf")
        If pdfText <> "" Then
            Me.Label1.Text = pdfText
        Else
            Me.Label1.Text = "The PDF has Images"
        End If
    End Sub

    Public Shared Function GetTextFromPDF(ByVal PdfFileName As String) As String
        Dim oReader As New iTextSharp.text.pdf.PdfReader(PdfFileName)
        Dim sOut As String = ""

        Try
            ' Page numbers in iTextSharp are 1-based.
            For i As Integer = 1 To oReader.NumberOfPages
                Dim its As New iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy

                sOut &= iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(oReader, i, its)
            Next
        Finally
            ' Release the file handle even if extraction throws.
            oReader.Close()
        End Try

        Return sOut
    End Function

End Class
 
Comments
vinodh107 25-Mar-14 1:47am    
Hi sohail, this is the same code that I posted. Your code reads the pages one by one, whereas mine is hardcoded to a single page of a single PDF.

Even after using your code, I am getting the same error.
sohail awr 25-Mar-14 9:03am    
Have you created the folder named "PDFs" in your solution? I was able to display the PDF using the code above.
vinodh107 25-Mar-14 9:24am    
Yes sohail, I created it inside the solution folder. But I have one question: what is the purpose of creating the folder named "PDFs"?
sohail awr 25-Mar-14 11:32am    
Your PDF file "AOTD_SU_20131118_006.pdf" should be placed there.
Did you get a solution for this one? I am also facing the same issue. Could you please help me with it?
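Since the same error is reported with both snippets above, one way to narrow it down is to extract page by page and catch the failure per page, so that a single page iTextSharp cannot parse does not abort the whole file. The following is only a sketch under assumptions: PdfTextProbe, TryGetTextFromPdf and failedPages are hypothetical names, and it catches the general Exception type because the exact exception class raised for "Could not find image data or EI" is not stated in this thread. It uses only the iTextSharp calls already shown above.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports iTextSharp.text.pdf
Imports iTextSharp.text.pdf.parser

Public Module PdfTextProbe
    ' Hypothetical helper: returns the text of every page that parses and
    ' records the 1-based numbers of pages whose extraction threw.
    Public Function TryGetTextFromPdf(ByVal pdfFileName As String, ByVal failedPages As List(Of Integer)) As String
        Dim reader As New PdfReader(pdfFileName)
        Dim extracted As New StringBuilder()

        Try
            For pageNumber As Integer = 1 To reader.NumberOfPages
                Try
                    extracted.Append(PdfTextExtractor.GetTextFromPage(reader, pageNumber, New SimpleTextExtractionStrategy()))
                Catch ex As Exception
                    ' Assumption: the parse error surfaces here as an exception;
                    ' the page is noted and the loop continues with the next page.
                    failedPages.Add(pageNumber)
                End Try
            Next
        Finally
            ' Always release the file handle.
            reader.Close()
        End Try

        Return extracted.ToString()
    End Function
End Module
```

A caller could pass in an empty List(Of Integer) and treat any page number that ends up in it as one that needs manual inspection. This does not fix the underlying parse error, it only isolates which pages trigger it.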
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


