|
There are several easy ways to update this code to make it more readable and functional as well as perform better.
1) Always try to avoid using string concatenation. Strings in C# are immutable, and string + string + string requires allocating 2 new temporary strings. There's really no reason for AddWithComma to be as complicated and slow as your implementation. Just do this:
private string AddWithComma(string strSource,string strAdd)
{
return String.Format("{0}, {1}", strSource, strAdd );
}
2) Many of your functions look like this:
try {
} catch( Exception ex ) {
throw ex;
}
That's completely redundant. If all you plan to do with the exception is throw it, then don't catch it in the first place.
3) Many of your variables look like this:
private bool _blnKeepConnectionOpen=false;
public KeepConnectionOpen {
get { return _blnKeepConnectionOpen; }
set { _blnKeepConnectionOpen = value; }
}
This code is also completely redundant. If all you do is pass through the value, just make the variable public in the first place. Also, if you are going to use some form of Hungarian notation, consider using standard Hungarian (bKeepConnectionOpen, not blnKeepConnectionOpen). Also, why do you name your private variables in Hungarian-ish style and then not have the public properties use the same notation? It's confusing.
4) Don't ever catch and drop exceptions unless you have a very good reason to.
public String[] GetExcelSheetNames()
{
try {
// do the work
}
catch() {
return null;
}
}
Why did you do that? Anything that failed will just die silently, which is no help to anyone.
Please consider cleaning up your code before releasing it publicly. Readability and maintainability is as important as functionality.
|
|
|
|
|
I searched the code files with: AddWithComma, KeepConnectionOpen, GetExcelSheetNames, but nothing found.
Why you post this here?
|
|
|
|
|
With regard to item #4, one good reason to use the style shown is when the exception is considered not fatal to the whole program itself. Why catch it in the first place? Well if you don't, the program crashes. If you catch it without returning, the program crashes the same way.
For example, I modified the Decode function in the BOF.cs to read like this:
public override void Decode()
{
MemoryStream stream = new MemoryStream(Data);
BinaryReader reader = new BinaryReader(stream);
try
{
this.BIFFversion = reader.ReadUInt16();
this.StreamType = reader.ReadUInt16();
this.BuildID = reader.ReadUInt16();
this.BuildYear = reader.ReadUInt16();
this.FileHistoryFlags = reader.ReadUInt32();
this.RequiredExcelVersion = reader.ReadUInt32();
}
catch
{
return;
}
}
Why did I do this? Because the program crashes whenever it reads past the end of file.
|
|
|
|
|
In response to 3, please don't do that!
Hungarian should be dropped completely, but keep the properties. Even though it's just a wrapper around the member it does provide a good mechanism for encapsulation and therefore internal implementation changes later. There's rarely a need to expose member variables and false optimizations like this usually isn't a good enough reason.
|
|
|
|
|
Hello Phantasm,
Thank you for your partial constructive comments. However you did forgot to start with some positive feedback to the author. It's some good code, that will everybody give a headstart.
about the string concatenation; you are not 100% right. The compiler will optimize a "string" + "string" for you. If you run some performance test, you will notice that a string format is not perse faster. See http://blogs.msdn.com/ricom/archive/2004/03/12/88715.aspx[^]
I do agree on the exception handling (2 and 4). Don't catch it if you don't handle it.
|
|
|
|
|
Hi,
thanks for the great library.
how could i modify the code to replace empty cells with empty strings ?
Thx.
|
|
|
|
|
Thank you for your contribution!
|
|
|
|
|
Hi,
I have found a bug and have a fix for it.
Sometimes I get the following error when loading a file.
"The output char buffer is too small to contain the decoded characters, encoding 'Unicode (UTF-8)' fallback 'System.Text.DecoderReplacementFallback'. Parameter name: chars"
record.cs lines 83 and 133 appear to be the cause of the problem.
Just change (reader.PeekChar() == -1) to (reader.BaseStream.Position == reader.BaseStream.Length).
This solution came from
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=127647&SiteID=1
This appears to fix the issue.
Thanks,
Douglas
|
|
|
|
|
Thanks.
|
|
|
|
|
I was trying to read the Excel file which has 84 columns, an exception was thrown from EscherRecord.cs:
System.IO.EndOfStreamException was unhandled
Message="Unable to read beyond the end of the stream."
Source="mscorlib"
StackTrace:
at System.IO.__Error.EndOfFile()
at System.IO.BinaryReader.FillBuffer(Int32 numBytes)
at System.IO.BinaryReader.ReadUInt16()
at QiHe.Office.Excel.EscherRecord.ReadBase(Stream stream) in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Excel\EscherRecord.cs:line 42
at QiHe.Office.Excel.EscherRecord.Read(Stream stream) in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Excel\EscherRecords\EscherRecord.cs:line 12
at QiHe.Office.Excel.MsofbtContainer.Decode() in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Excel\Extended\MsofbtContainer.cs:line 18
at QiHe.Office.Excel.MsofbtContainer.Decode() in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Excel\Extended\MsofbtContainer.cs:line 19
at QiHe.Office.Excel.MSOCONTAINER.Decode() in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Excel\Extended\MSOCONTAINER.cs:line 19
at QiHe.Office.Excel.Worksheet.PopulateCells() in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Excel\Worksheet.cs:line 80
at QiHe.Office.Excel.Workbook.Read(Stream stream) in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Excel\Workbook.cs:line 51
at QiHe.Office.Tool.Form1.LoadExcelSheets() in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Tool\Form1.cs:line 87
at QiHe.Office.Tool.Form1.tabControl1_Selected(Object sender, TabControlEventArgs e) in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Tool\Form1.cs:line 74
at System.Windows.Forms.TabControl.OnSelected(TabControlEventArgs e)
at System.Windows.Forms.TabControl.WmSelChange()
at System.Windows.Forms.TabControl.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
at System.Windows.Forms.UnsafeNativeMethods.SendMessage(HandleRef hWnd, Int32 msg, IntPtr wParam, IntPtr lParam)
at System.Windows.Forms.Control.SendMessage(Int32 msg, IntPtr wparam, IntPtr lparam)
at System.Windows.Forms.Control.ReflectMessageInternal(IntPtr hWnd, Message& m)
at System.Windows.Forms.Control.WmNotify(Message& m)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.ScrollableControl.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
at System.Windows.Forms.UnsafeNativeMethods.CallWindowProc(IntPtr wndProc, IntPtr hWnd, Int32 msg, IntPtr wParam, IntPtr lParam)
at System.Windows.Forms.NativeWindow.DefWndProc(Message& m)
at System.Windows.Forms.Control.DefWndProc(Message& m)
at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.TabControl.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
at System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG& msg)
at System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32 dwComponentID, Int32 reason, Int32 pvLoopData)
at System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
at System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
at System.Windows.Forms.Application.Run(Form mainForm)
at QiHe.Office.Tool.Program.Main() in C:\Documents and Settings\qgv02722\My Documents\Visual Studio 2005\Projects\ExcelReader_src\Office\Tool\Program.cs:line 17
It works for me with the Excel file which has less than 69 columns. I wonder if anyone knows the tricky.
Thanks,
Wen
|
|
|
|
|
I'm receiving the same error whenever I'm working with spreadsheets that contain large amounts of data. I haven't had the time yet to figure out if it's a certain number of columns causing it or what, but what I do know is that when it is attempting to load the "directoryStream". What happens on my sheets is that it takes the value for header.FirstSectorIDofDirectoryStream and gives that to the function ReadArrayOfInt32, it then reaches private static List<int> GetSIDChain(List<int> sat, int startSID... The iterationg, it's ok, but the value that it assigns to sid is a -3. When it iterates the second time, it tries to use the -3 as an array index and kills over.
So any insite would be great as my project will commonly be working wtih spreadsheets containing large amounts of data.
Thanks!
Steve
|
|
|
|
|
I found a problem with reading array formulas (e.g. TRANSPOSE). I fixed this problem by inserting the following lines in Worksheet.cs , after line 36 (i.e. within the switch statement)
if (last_record is FORMULA)
{
record = last_record;
}
break;
Thanks very much for an extremely useful project
|
|
|
|
|
As per earlier messages still cannot get the date format. Any help on that. I have gone and go the XFindex as per you last message but nothing.
|
|
|
|
|
Hi,
here is example, how to read DateTime cell from sheet
CompoundDocument doc;
doc = CompoundDocument.Read(PathToExcelFile);
byte[] bookdata = doc.GetStreamData("Workbook");
Workbook book = new Workbook();
book.Read(new MemoryStream(bookdata));
Worksheet sheet = book.Worksheets[0];
//if cell[1,1] have datetime value you must write
DateTime S_Date =(sheet.Cells[1,1].DateTimeValue);
//after that you can format this value using this
string s1 = S_Date.ToString("dd-M-yyyy");
That's it
NEO
|
|
|
|
|
I totally forgot that I had posted this. This is how i did it
if ( cell.Sheet.GetCellXF(keyIndex,count).FormatIndex == 164 )
//I changed the code around a little bit..to get to the GetCellXF..
That being said I tested a variety of spreadsheets. The FormatIndex is 164 for datetime.
|
|
|
|
|
The integer number is the number of days since 1/1/1900 minus 2.
I used the method below: the method gets the date in string format and attempts to convert to int.
If it doesn't work, it replaces expressions like "1st" "2nd" "3rd" "4th" and tries again.
If it still doesn't work, it gives a final attempt to convert it straight to a datetime.
private static string ConvertExcelDate(string strDate)
{
int iDate = 0;
if (!int.TryParse(strDate, out iDate))
{
string strConvertedDate = strDate.Replace("st.", ",");
strConvertedDate = strConvertedDate.Replace("nd.", ",");
strConvertedDate = strConvertedDate.Replace("rd.", ",");
strConvertedDate = strConvertedDate.Replace("th.", ",");
if (!int.TryParse(strConvertedDate, out iDate))
{
DateTime dtConvertedDate = new DateTime();
if( DateTime.TryParse( strConvertedDate, out dtConvertedDate ) )
return dtConvertedDate.ToShortDateString();
return "";
}
}
DateTime dtDate = new DateTime(1900, 1, 1);
dtDate = dtDate.AddDays(iDate - 2);
return dtDate.ToShortDateString();
}
|
|
|
|
|
Line 97 from Record.cs :
Encoding encoding = Encoding.Default;
(a lot of accents in the French language )
Thanks !
|
|
|
|
|
2007-5-17
display each Sheet in separate tabpage
fixed some bugs in FORMULA.cs and Form1.cs
2007-5-26
decode FORMULAR result
|
|
|
|
|
Hey I liked what u did.!
Moim Hossain
Sr. Software Engineer
Onirban Orion Technologies.
|
|
|
|
|
Thanks!
|
|
|
|
|
I get this error on some files
System.Exception: String not ended.
at QiHe.Office.Excel.Record.ReadString(BinaryReader reader, Int32 lengthbits) in C:\Visual Studio 2005\Projects\Personal\GDC\ExcelReader_src\Office\Excel\Record.cs:line 101
at QiHe.Office.Excel.FORMULA.Decode() in C:\Visual Studio 2005\Projects\Personal\GDC\ExcelReader_src\Office\Excel\Records\FORMULA.cs:line 39
at QiHe.Office.Excel.Worksheet.PopulateCells() in C:\Visual Studio 2005\Projects\Personal\GDC\ExcelReader_src\Office\Excel\Worksheet.cs:line 73
at QiHe.Office.Excel.Workbook.Read(Stream stream) in C:\Visual Studio 2005\Projects\Personal\GDC\ExcelReader_src\Office\Excel\Workbook.cs:line 51
at QiHe.Office.Excel.Workbook.Open(Stream file) in C:\Visual Studio 2005\Projects\Personal\GDC\ExcelReader_src\Office\Excel\Workbook.cs:line 37
at GDC.ImportExcelReportForm.ImportFile(String path, Int32 officeNumber) in C:\Visual Studio 2005\Projects\Personal\GDC\GDC\ImportExcelReportForm.vb:line 214
|
|
|
|
|
Yes, I also get this exception all the time. I have tried the code on some files, but it crashes. Only very basic Excel files seem to work.
Great initiative though! It would be great if you could fix the code, because a free Excel reader for .NET which doesn't involve COM is really needed out there!
Tomat coder
|
|
|
|
|
I have been getting the same errors off and on and I believe the problem is Functions.
Remove all the functions from the spreadsheet (do a "cut" and a "paste special - values") and re-read it and see if that helps.
It seems to work for me.
Martin.
|
|
|
|
|
Thanks for your message.
When coding in .NET it is really important to do away with everything related to COM. It is a major source of annoyances and strange bugs.
I would have used a bug-free .NET Excel reader to put the contents of the spreadsheets into DataTables or something. If I manually have to edit the Excel files and resave them in order to clean them from things this code cannot handle, I might just as well open Excel and save the files as CSV files straight away and forget about this code.
There are tons of features in Excel files. What about graphics, formatting, macros... I suspect this code won't handle it. I have seen there are commercial .NET Excel readers but I'm not sure if they do the trick.
|
|
|
|
|
OK, this bug is fixed.
It results in:
public partial class FORMULA : Record
{
...
public override void Decode()
{
...
FormulaData = Record.ReadString(reader,16);
}
}
FormulaData is not of string type.
-- modified at 3:16 Monday 28th May, 2007
|
|
|
|
|