END GOAL: Efficiently (in one pass) read all CellRecords
on a huge (30,000+ row), protected Worksheet
.
Problem:
Using the HSSF.EventUserModel
, how can I read all Record
s (including CellRecords
) for an XLS file with both Workbook and Worksheet protection?
Create Input Spreadsheet (in Excel 2010):
- Create new Blank workbook.
- Set value of A1 to number: 50
- Set value of A2 to string: fifty
- Set value of A3 to formula: =25*2
- Review (ribbon) -> Protect Sheet -> Password: pass1
- Review (ribbon) -> Protect Workbook -> Password: pass1
- File (ribbon) ->Save As... -> Save as type: Excel 97-2003 Workbook
Progress thus far:
- The XLS file opens without a password in Excel. Therefore, you shouldn't need the password to open it in POI.
- The XLS file opens successfully with
new HSSFWorkbook(Stream fs)
. However, I need the efficiency of EventUserModel
for my actual spreadsheet.
- Setting
NPOI.HSSF.Record.Crypto.Biff8EncryptionKey.CurrentUserPassword = "pass1";
did not work.
- The
ProcessRecord( )
function catches a PasswordRecord
, but I can't find any documentation on how to properly handle it.
- Perhaps, the
EncryptionInfo
or Decryptor
classes may be of some use.
Note:
I'm using NPOI. However, I can translate any java examples to C#.
Code:
I use the following code to capture Record
events. My Book1-unprotected.xls
(without protection) shows all Record
events (including cell values). My Book1-protected.xls
displays some records and throws an exception.
I just view processedEvents
in the debugger.
using System;
using System.Collections.Generic;
using System.IO;
using NPOI.HSSF.Record;
using NPOI.HSSF.Model;
using NPOI.HSSF.UserModel;
using NPOI.HSSF.EventUserModel;
using NPOI.POIFS;
using NPOI.POIFS.FileSystem;
namespace NPOI_small {
class myListener : IHSSFListener {
List<Record> processedRecords;
private Stream fs;
public myListener(Stream fs) {
processedRecords = new List<Record>();
this.fs = fs;
HSSFEventFactory factory = new HSSFEventFactory();
HSSFRequest request = new HSSFRequest();
MissingRecordAwareHSSFListener mraListener;
FormatTrackingHSSFListener fmtListener;
EventWorkbookBuilder.SheetRecordCollectingListener recListener;
mraListener = new MissingRecordAwareHSSFListener(this);
fmtListener = new FormatTrackingHSSFListener(mraListener);
recListener = new EventWorkbookBuilder.SheetRecordCollectingListener(fmtListener);
request.AddListenerForAllRecords(recListener);
POIFSFileSystem poifs = new POIFSFileSystem(this.fs);
factory.ProcessWorkbookEvents(request, poifs);
}
public void ProcessRecord(Record record) {
processedRecords.Add(record);
}
}
class Program {
static void Main(string[] args) {
Stream fs = File.OpenRead(@"c:\users\me\desktop\xx\Book1-protected.xls");
myListener testListener = new myListener(fs); // Use EventModel
//HSSFWorkbook book = new HSSFWorkbook(fs); // Use UserModel
Console.Read();
}
}
}
UPDATE (for Juan Mellado):
Below is the exception. My best guess right now (in the answer by Victor Petrykin) is that the HSSFEventFactory
uses RecordInputStream
which cannot natively decrypt protected records. Upon receiving the exception, processedRecords
contains 22 records including the following potentially significant ones:
- processedRecords[5] is a
WriteAccessRecord
with a garbled (probably encrypted) value for .name
- processedRecords[22] is a
RefreshAllRecord
and is the last Record
in the list
Exception:
NPOI.Util.RecordFormatException was unhandled
HResult=-2146233088
Message=Unable to construct record instance
Source=NPOI
StackTrace:
at NPOI.HSSF.Record.RecordFactory.ReflectionConstructorRecordCreator.Create(RecordInputStream in1)
at NPOI.HSSF.Record.RecordFactory.CreateSingleRecord(RecordInputStream in1)
at NPOI.HSSF.Record.RecordFactory.CreateRecord(RecordInputStream in1)
at NPOI.HSSF.EventUserModel.HSSFRecordStream.GetNextRecord()
at NPOI.HSSF.EventUserModel.HSSFRecordStream.NextRecord()
at NPOI.HSSF.EventUserModel.HSSFEventFactory.GenericProcessEvents(HSSFRequest req, RecordInputStream in1)
at NPOI.HSSF.EventUserModel.HSSFEventFactory.ProcessEvents(HSSFRequest req, Stream in1)
at NPOI.HSSF.EventUserModel.HSSFEventFactory.ProcessWorkbookEvents(HSSFRequest req, POIFSFileSystem fs)
at NPOI_small.myListener..ctor(Stream fs) in c:\Users\me\Documents\Visual Studio 2012\Projects\myTest\NPOI_small\Program.cs:line 35
at NPOI_small.Program.Main(String[] args) in c:\Users\me\Documents\Visual Studio 2012\Projects\myTest\NPOI_small\Program.cs:line 80
at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
InnerException: NPOI.Util.RecordFormatException
HResult=-2146233088
Message=Expected to find a ContinueRecord in order to read remaining 137 of 144 chars
Source=NPOI
StackTrace:
at NPOI.HSSF.Record.RecordInputStream.ReadStringCommon(Int32 requestedLength, Boolean pIsCompressedEncoding)
at NPOI.HSSF.Record.RecordInputStream.ReadUnicodeLEString(Int32 requestedLength)
at NPOI.HSSF.Record.FontRecord..ctor(RecordInputStream in1)