Excel file read

If you are using the main C++ distribution of wxWidgets, Feel free to ask any question related to wxWidgets development here. This means questions regarding to C++ and wxWidgets, not compile problems.
PB
Part Of The Furniture
Part Of The Furniture
Posts: 2512
Joined: Sun Jan 03, 2010 5:45 pm

Post by PB » Fri Dec 10, 2010 2:31 pm

Disclaimer: I provide no warranty the code I wrote works. Before I started reading this thread (hoping to find some neat solution that might come handy if I needed it), I had no idea how automation works. Don't just blindly copy&paste the code removing wxMessageBox() calls, read the code carefully, properly handle errors, debug with various data sets etc.

Following sample code shows how to:
1. Create MS Excel instance
2. Open a workbook
3. Determine worksheet count
3. Iterate through all worksheets
4. Obtain data from a range (using UsedRange), both by columns and by rows

Code: Select all

#ifndef WX_PRECOMP
    #include "wx/wx.h"   
#endif

#include <wx/stopwatch.h>
#include <wx/msw/ole/automtn.h>
#include <vector>
#include <wx/filedlg.h>

typedef std::vector<wxVariant> variantVector;


// obtains all values from range
// do some sanity check on Range.Count
// before calling this function, so you don't attempt to grab 
// tens of thousands or more values at once
void GetDataFromRange(wxAutomationObject& range, variantVector& data)
{
   wxVariant values = range.GetProperty("Value");            
   
   if (values.IsNull()) // no values in the range
      return;

   if (values.IsType("list")) {      
      size_t count = values.GetCount();
      data.reserve(count);      
      for (size_t i = 0; i < count; i++) {                                         
         data.push_back(values[i]);            
      }
   } else { // a single value
      data.push_back(values);
   }   
}

// gets either Row or Column Range depending on variable "column" at "index"
bool GetColumnOrRow(wxAutomationObject& range, long index, bool column, wxAutomationObject& item)
{
   wxString itemName;
   wxVariant params[2];

   params[0] = index;           
   if (column)
      itemName = "Columns";
   else
      itemName = "Rows";

   return range.GetObject(item, itemName, 1, params);
}

bool ProcessWorksheet(wxAutomationObject& worksheet)
{
   wxAutomationObject dataRange;
   if (worksheet.GetObject(dataRange, "UsedRange")) {
      wxString sheetName;
      wxString usedRangeAddress;
      long valueCount;
      long columnCount, rowCount;
      wxString s;

      // Get and display some info about the worksheet and usedRange
      sheetName = worksheet.GetProperty("Name").GetString();
      usedRangeAddress = dataRange.GetProperty("Address").GetString();      
      valueCount = dataRange.GetProperty("Count").GetLong();
      columnCount = dataRange.GetProperty("Columns.Count").GetLong();
      rowCount = dataRange.GetProperty("Rows.Count").GetLong();
      s.Printf("Next step will be reading data from the following sheet:\n"
         "Sheet.Name: %s\nSheet.UsedRange properties:\n"
         "  Address: %s\n  Count: %ld\n  Columns.Count: %ld\n  Rows.Count: %ld",
         sheetName.c_str(), usedRangeAddress.c_str(),
         valueCount, columnCount, rowCount);
      wxMessageBox(s);            

      // Get the data
      wxVariant result;
      wxVariant params[2];
      variantVector data;
      size_t valuesRead;
      wxStopWatch stopWatch;   
      long time;

      // First, I show how to get data column by column            
      wxAutomationObject column;
      
      stopWatch.Start();
      valuesRead = 0;
      for (long l = 1; l <= columnCount; l++) {          
          if (GetColumnOrRow(dataRange, l, true, column)) {    
            data.clear();
            
            result = column.GetProperty("Count");            
            if (result.GetLong() > 5000) { 
               // in production code you should proceed getting data in chunks instead of attempting
               // to grab them all at once
               stopWatch.Pause();
               s.Printf("There are %ld values in the column %ld. Continue anyway?", result.GetLong(), l);
               if (wxMessageBox(s, "Question", wxYES_NO | wxNO_DEFAULT) != wxYES) {
                  stopWatch.Resume();
                  continue;
               }
               stopWatch.Resume();
            }
            
            GetDataFromRange(column, data);               
            if (!data.empty()) {     
               // do something with the data here

               valuesRead += data.size();
               // for debugging purposes display first and last value in the column
               wxString firstVal, lastVal;
               wxVariant v;
               v = data.front();
               firstVal = v.IsNull() ? "<blank>" : v.GetString();
               v = data.back();
               lastVal = v.IsNull() ? "<blank>" : v.GetString();               
               v = column.GetProperty("Address");               
               s.Printf("Column %ld with address %s (%d values):\n First value: %s\n Last value: %s", 
                  l, v.GetString().c_str(), data.size(), firstVal.c_str(), lastVal.c_str());            
            } else {
               s.Printf("Column %ld has no data", l);
            }
            stopWatch.Pause();
            wxMessageBox(s);
            stopWatch.Resume();
          } else {            
            s.Printf("Failed to get the Column object for column %ld", l);
            stopWatch.Pause();
            wxMessageBox(s);   
            stopWatch.Resume();
          }
      }
   time = stopWatch.Time();
      s.Printf("Get data by columns:\n obtaining %d values in total (from %ld columns) took %d ms.", valuesRead, columnCount, time);

      // And this is how would you get data by rows
      // which method is better for you depends on your needs
      wxAutomationObject row;
      
      stopWatch.Start();
      valuesRead = 0;
      for (long l = 1; l <= rowCount; l++) {          
          if (GetColumnOrRow(dataRange, l, false, row)) {    
            data.clear();
            GetDataFromRange(row, data);               
            if (!data.empty()) {

               // do something with the data here
                valuesRead += data.size();
               // for debugging purposes display first and last value in the row
               wxString firstVal, lastVal;
               wxVariant v;
               v = data.front();
               firstVal = v.IsNull() ? "<blank>" : v.GetString();
               v = data.back();
               lastVal = v.IsNull() ? "<blank>" : v.GetString();               
               s.Printf("Row %ld (%d values):\n First value: %s\n Last value: %s", l, data.size(), firstVal.c_str(), lastVal.c_str());            
            } else {
               s.Printf("Row %ld has no data", l);
            }
            /* don't flood me with information about every  row
            stopWatch.Pause();
            wxMessageBox(s);
            stopWatch.Resume();
				*/ 
          } else {            
            s.Printf("Failed to get the Row object for row %ld", l);
            stopWatch.Pause();
            wxMessageBox(s);   
            stopWatch.Resume();
          }
      }
      
      time = stopWatch.Time();
      s.Printf("Get data by rows:\n obtaining %d values in total (from %ld rows) took %d ms.", valuesRead, rowCount, time);
      wxMessageBox(s);
   }
   return false;
}

bool ProcessWorkbook(wxAutomationObject& workbook)
{   
   wxString s;
   wxString fullWorkbookName;
   long worksheetCount;   
         
   fullWorkbookName = workbook.GetProperty("FullName").GetString();   
      
   wxAutomationObject worksheets;   
   if (!workbook.GetObject(worksheets, "Worksheets")) {
      wxMessageBox("Failed to get the Worksheets");
      return false;
   }
   worksheetCount = worksheets.GetProperty("Count").GetLong();   
   s.Printf("File %s has %ld worksheets", fullWorkbookName.c_str(), worksheetCount);
   wxMessageBox(s);
   
   wxVariant params[2];
   wxAutomationObject worksheet;
   for (long l = 1; l <= worksheetCount; l++) {
      params[0] = l;
      if (workbook.GetObject(worksheet, "Worksheets", 1, params)) {
         ProcessWorksheet(worksheet);
      }      
   }
   return true;
}


/**** MyApp ****/
class MyApp : public wxApp
{
public:   
   virtual bool OnInit()
   {
      if (!wxApp::OnInit())
         return false;   

   wxString fileName = "C:\\test.xls";

   // uncomment the following line to invoke file open dialog
   // fileName = "";
   if (fileName.IsEmpty()) {
      wxFileDialog dlg(NULL, _("Open Excel file"), "", "",
                           "XLS* files (*.xls*)|*.xls*", wxFD_OPEN|wxFD_FILE_MUST_EXIST);
      if (dlg.ShowModal() == wxID_CANCEL)
         return false;
      fileName = dlg.GetPath();
   }
   
   wxAutomationObject excel;     
   if (!excel.CreateInstance("Excel.Application")) {
      wxMessageBox("Failed to create Excel instance");
      return false;
   }
   
   wxVariant result = excel.CallMethod("Workbooks.Open", fileName);
   if (result.IsNull()) {
      wxString s;
      s.Printf("Failed to open %s", fileName.c_str());
      wxMessageBox(s);      
   }  else {
      wxAutomationObject workbook;
      if (!excel.GetObject(workbook, "ActiveWorkbook")) {
         wxMessageBox("Failed to get ActiveWorkbook");         
      } else {         
         ProcessWorkbook(workbook);
      }
   }   
   
   excel.CallMethod("Quit"); // Close Excel 
   
    return false; // quit
   }
};

IMPLEMENT_APP(MyApp)
As I wrote before, I expect you to be familiar with UsedRange. For example:
* if first column is empty UsedRange starts at B1,
* if first row is empty UsedRange starts at A2
* if both first row and columns are empty UsedRange starts at B2.
Depending on your needs, you might want to always grab first rows/columns, even if they're blank. This is how you might proceed, if you want your range to start always at A1 cell (I haven't time to test if it really works):

Code: Select all

// if we want to get our data always starting from the cell A1
// even if first row(s)/column(s) are completely blank
bool GetMyRange(wxAutomationObject& worksheet, wxAutomationObject& myRange)
{
   bool result = false;

   wxAutomationObject usedRange;
   if (worksheet.GetObject(usedRange, "UsedRange")) {
      wxVariant result = usedRange.GetProperty("Address");
      wxString usedRangeAddress = result.GetString();
      if (!usedRangeAddress .IsEmpty()) {
         wxString myRangeAddress;
         wxString s = usedRangeAddress.AfterFirst(':');
         if (s.IsEmpty()) { // colon not found, so usedRange is a single cell, e.g. $C$12
            myRangeAddress = "$A$1: " + usedRangeAddress;
         } else { // cell range, e.g. $C$12:$D$100
            myRangeAddress = "$A$1:" + s;
         }
         wxVariant params[2];
         params[0] = myRangeAddress;
         return worksheet.GetObject(myRange, "Range", 1, params);         
      }   
      return true;
   }
   return false;
}
I've also seen sheets where although the cells appeared to be blank, UsedRange covered all cells. That means about 65,000 * 255 values for Excel 2003 or older, MUCH more for version 2007 or newer. If you want absolutely fastest way you might try getting all data from a range (UsedRange) at once. For example, if you have sheet with 20 columns, where every column has 100 rows, it's 2000 values - should be doable. You might try to do something like that:
1. Call GetDataFromRange() where range is UsedRange
2. The values in data will be organized like this: values 0-99 are from column 1, values 100-199 are from column 2 and so on. I didn't test if it really works like that, but it should. Debug first. Then profile - it might not be that much faster to be worth the hassle.

greg
Earned some good credits
Earned some good credits
Posts: 111
Joined: Fri Sep 12, 2008 7:17 am

Post by greg » Sat Dec 11, 2010 7:53 am

PB
I tried Your code but here is a problem below:

Code: Select all

void GetDataFromRange(wxAutomationObject& range, variantVector& data)
{
   wxVariant values = range.GetProperty("Value");

   if (values.IsNull()) // no values in the range
      return;
}
values always is null !
I tried xls with one sheet and 3 cols 2 rows

Code: Select all

      columnCount = dataRange.GetProperty("Columns.Count").GetLong();
      rowCount = dataRange.GetProperty("Rows.Count").GetLong();
number od columns and number of rows is proper.


what's wrong?
Regards
Greg

PB
Part Of The Furniture
Part Of The Furniture
Posts: 2512
Joined: Sun Jan 03, 2010 5:45 pm

Post by PB » Sat Dec 11, 2010 9:11 am

Are there any values in those cells? What is valueCount? valueCount (= Range.Count) is 1 even when there are no values, it's never 0.

Is range you pass to GetDataFromRange() a valid wxAutomationObject, i.e does it return true if you call range.IsOK()? Is it really a Range object? Does it return proper address if you call range.GetProperty("Address").GetString())?

I suspect if there are indeed some values and it still returns null there might be bug in your code. It works as charm for me, I have only Excel 2007 to test my code with, but I don't think Excel version matters here. Of course, it's still possible that there's something wrong with my code, but I have no idea what that would be.

Just compile the code exactly as it is and test it on a workbook with some data.

Edit
Compile application below exactly as it is and see if it still doesn't work. It's slightly changed code from my previous post, the only changes are 1. the info messages go into the log instead of being displayed in MessageBoxes, 2. I've added a function call that should suppress Excel alerts like "This workbook was saved in previous version of MS Excel, recompute formulas?". Also check the log for traces from OLE errors, if there are any.

Code: Select all

#ifndef WX_PRECOMP
    #include "wx/wx.h"   
#endif

#include <wx/stopwatch.h>
#include <wx/msw/ole/automtn.h>
#include <vector>
#include <wx/filedlg.h>

#include <wx/textctrl.h>
#include <wx/sizer.h>
#include <wx/frame.h>

typedef std::vector<wxVariant> variantVector;


// obtains all values from range
// do some sanity check on Range.Count
// before calling this function, so you don't attempt to grab 
// tens of thousands or more values at once
void GetDataFromRange(wxAutomationObject& range, variantVector& data)
{
   wxVariant values = range.GetProperty("Value");            
   
   if (values.IsNull()) // no values in the range
      return;

   if (values.IsType("list")) {      
      size_t count = values.GetCount();
      data.reserve(count);      
      for (size_t i = 0; i < count; i++) {                                         
         data.push_back(values[i]);            
      }
   } else { // a single value
      data.push_back(values);
   }   
}

// gets either Row or Column Range depending on variable "column" at "index"
bool GetColumnOrRow(wxAutomationObject& range, long index, bool column, wxAutomationObject& item)
{
   wxString itemName;
   wxVariant params[2];

   params[0] = index;           
   if (column)
      itemName = "Columns";
   else
      itemName = "Rows";

   return range.GetObject(item, itemName, 1, params);
}


bool ProcessWorksheet(wxAutomationObject& worksheet)
{
   wxAutomationObject dataRange;
   if (worksheet.GetObject(dataRange, "UsedRange")) {
      wxString sheetName;
      wxString usedRangeAddress;
      long valueCount;
      long columnCount, rowCount;
      wxString s, s1;

      // Get and display some info about the worksheet and usedRange
      sheetName = worksheet.GetProperty("Name").GetString();
      sheetName.Replace("%", "%%"); sheetName.Replace("\\", "\\\\"); // get rid of possible format specifies in user data strings
      usedRangeAddress = dataRange.GetProperty("Address").GetString();      
      valueCount = dataRange.GetProperty("Count").GetLong();
      columnCount = dataRange.GetProperty("Columns.Count").GetLong();
      rowCount = dataRange.GetProperty("Rows.Count").GetLong();
      wxLogMessage("\n\n*************************");            
      s.Printf("Next step will be reading data from the following sheet:\n"
         "Sheet.Name: %s\nSheet.UsedRange properties:\n"
         "  Address: %s\n  Count: %ld\n  Columns.Count: %ld\n  Rows.Count: %ld",
         sheetName.c_str(), usedRangeAddress.c_str(),
         valueCount, columnCount, rowCount);      
      wxLogMessage(s);            

      // Get the data
      wxVariant result;
      wxVariant params[2];
      variantVector data;
      size_t valuesRead;
      wxStopWatch stopWatch;   
      long time;

      // First, I show how to get data column by column            
      wxAutomationObject column;
      
      stopWatch.Start();
      valuesRead = 0;
      for (long l = 1; l <= columnCount; l++) {          
          if (GetColumnOrRow(dataRange, l, true, column)) {    
            data.clear();            

            result = column.GetProperty("Count");            
            if (result.GetLong() > 5000) { 
               // in production code you should proceed getting data in chunks instead of attempting
               // to grab them all at once
               stopWatch.Pause();               
               s.Printf("There are %ld values in column %ld. Continue anyway?", result.GetLong(), l);
               if (wxMessageBox(s, "Question", wxYES_NO | wxNO_DEFAULT) != wxYES) {
                  stopWatch.Resume();
                  continue;
               }
               stopWatch.Resume();
            }
            
            GetDataFromRange(column, data);               
            if (!data.empty()) {
               valuesRead += data.size();
               // do something with the data here
               
               // for debugging purposes display first and last value in the column
               wxString firstVal, lastVal;
               wxVariant v;
               v = data.front();
               firstVal = v.IsNull() ? "<blank>" : v.GetString();
               v = data.back();
               lastVal = v.IsNull() ? "<blank>" : v.GetString();               
               v = column.GetProperty("Address");               
               s.Printf("Column %ld with address %s (%d values):\n First value: %s\n Last value: %s", 
                  l, v.GetString().c_str(), data.size(), firstVal.c_str(), lastVal.c_str());            
            } else {
               s.Printf("Column %ld has no data", l);
            }                        
            stopWatch.Pause();            
            s.Replace("%", "%%"); s.Replace("\\", "\\\\"); // get rid of possible format specifies in user data strings
            wxLogMessage(s);            
            stopWatch.Resume();
          } else {            
            s.Printf("Failed to get the Column object for column %ld", l);
            stopWatch.Pause();
            wxMessageBox(s);   
            stopWatch.Resume();
          }
      }
      time = stopWatch.Time();
      s.Printf("Get data by columns:\n obtaining %d values in total (from %ld columns) took %d ms.\n", valuesRead, columnCount, time);      
      wxLogMessage("---------\n");
      wxLogMessage(s);


      // And this is how would you get data by rows
      // which method is better for you depends on your needs
      wxAutomationObject row;
      
      stopWatch.Start();
      valuesRead = 0;
      for (long l = 1; l <= rowCount; l++) {          
          if (GetColumnOrRow(dataRange, l, false, row)) {    
            data.clear();
            GetDataFromRange(row, data);               
            if (!data.empty()) {
               valuesRead += data.size();
               // do something with the data here
               
               // for debugging purposes display first and last value in the row
               wxString firstVal, lastVal;
               wxVariant v;
               v = data.front();
               firstVal = v.IsNull() ? "<blank>" : v.GetString();
               v = data.back();
               lastVal = v.IsNull() ? "<blank>" : v.GetString();               
               s.Printf("Row %ld (%d values):\n First value: %s\n Last value: %s", l, data.size(), firstVal.c_str(), lastVal.c_str());            
            } else {
               s.Printf("Row %ld has no data", l);
            }            
            stopWatch.Pause();            
            s.Replace("%", "%%"); s.Replace("\\", "\\\\"); // get rid of possible format specifies in user data strings
            wxLogMessage(s);            
            stopWatch.Resume();            
          } else {            
            s.Printf("Failed to get the Row object for row %ld", l);
            stopWatch.Pause();
            wxMessageBox(s);   
            stopWatch.Resume();
          }
      }
      
      time = stopWatch.Time();
      s.Printf("Get data by rows:\n obtaining %d values in total (from %ld rows) took %d ms.\n", valuesRead, rowCount, time);
      wxLogMessage(s);
   }
   return false;
}

bool ProcessWorkbook(wxAutomationObject& workbook)
{   
   wxString s;
   wxString fullWorkbookName;
   long worksheetCount;   
         
   fullWorkbookName = workbook.GetProperty("FullName").GetString();   
      
   wxAutomationObject worksheets;   
   if (!workbook.GetObject(worksheets, "Worksheets")) {
      wxMessageBox("Failed to get the Worksheets");
      return false;
   }
   worksheetCount = worksheets.GetProperty("Count").GetLong();   
   s.Printf("File %s has %ld worksheets", fullWorkbookName.c_str(), worksheetCount);
   wxLogMessage(s);
   
   wxVariant params[2];
   wxAutomationObject worksheet;
   for (long l = 1; l <= worksheetCount; l++) {
      params[0] = l;
      if (workbook.GetObject(worksheet, "Worksheets", 1, params)) {
         ProcessWorksheet(worksheet);
      }      
   }
   return true;
}

class MyFrame : public wxFrame {
public:
   MyFrame()
      : wxFrame(NULL, wxID_ANY, "Excel Automation Test", wxDefaultPosition, wxSize(600,500))
   {

      SetSizeHints( wxDefaultSize, wxDefaultSize );      
       
      wxBoxSizer* pSizer = new wxBoxSizer(wxVERTICAL);      
      wxTextCtrl* pTextCtrl = new wxTextCtrl(this, wxID_ANY, wxEmptyString, wxDefaultPosition, wxDefaultSize, wxTE_MULTILINE|wxTE_READONLY);
      wxLog::SetActiveTarget(new wxLogTextCtrl(pTextCtrl));            
      wxLog::SetTimestamp("");
      wxLog::AddTraceMask(wxTRACE_OleCalls);
      pSizer->Add(pTextCtrl, 1, wxALL|wxEXPAND, 0 );      
      SetSizer(pSizer);
      Layout();
      Centre();

   }   
};




/**** MyApp ****/
class MyApp : public wxApp
{
public:   
   virtual bool OnInit()
   {
      if (!wxApp::OnInit())
         return false;   

   MyFrame* pFrame = new MyFrame();            
   pFrame->Show();
   SetTopWindow(pFrame);

   wxString fileName = "C:\\test.xls";   

   // uncomment the following line to invoke file open dialog
   // fileName = "";
   if (fileName.IsEmpty()) {
      wxFileDialog dlg(pFrame, "Open Excel file", "", "",
                           "XLS* files (*.xls*)|*.xls*", wxFD_OPEN|wxFD_FILE_MUST_EXIST);
      if (dlg.ShowModal() == wxID_CANCEL)
         return false;
      fileName = dlg.GetPath();
   }
   
   wxAutomationObject excel;     
   wxStopWatch stopWatch;
   
   stopWatch.Start();
   wxLogMessage("Attempting to create Excel instance...");
   if (!excel.CreateInstance("Excel.Application")) {
      wxMessageBox("Failed to create Excel instance");
      return false;
   }
   wxLogMessage("Launching Excel took %d ms", stopWatch.Time());

   excel.PutProperty("DisplayAlerts", false);

   stopWatch.Start();
   wxLogMessage("Attempting to open file %s", fileName.c_str());
   wxVariant result = excel.CallMethod("Workbooks.Open", fileName);
   if (result.IsNull()) {
      wxString s;
      s.Printf("Failed to open %s", fileName.c_str());
      wxMessageBox(s);      
   }  else {
      wxAutomationObject workbook;
      if (!excel.GetObject(workbook, "ActiveWorkbook")) {
         wxMessageBox("Failed to get ActiveWorkbook");         
      } else {         
         wxLogMessage("Opening file took %d ms", stopWatch.Time());
         ProcessWorkbook(workbook);
      }
   }   
   
   excel.CallMethod("Quit"); // Close Excel 
   
    return true;
   }
};

IMPLEMENT_APP(MyApp)


greg
Earned some good credits
Earned some good credits
Posts: 111
Joined: Fri Sep 12, 2008 7:17 am

Post by greg » Sat Dec 11, 2010 1:11 pm

I'm really confused.
I pasted Your code in my project, and still not working.
I'm testing on Excel 2003. I've attached my *.xls file named a.xls .
It is really a very simple file.

Here is my output.
: Attempting to create Excel instance...
: Launching Excel took 172 ms
: Attempting to open file C:\a.xls
: Opening file took 860 ms
: File C:\a.xls has 3 worksheets
:

*************************
: Next step will be reading data from the following sheet:
Sheet.Name: Arkusz1
Sheet.UsedRange properties:
Address: $A$1:$B$2
Count: 4
Columns.Count: 2
Rows.Count: 2
: Error: wxAutomationObject::ConvertOleToVariant: Unknown variant value type
: Column 1 has no data
: Error: wxAutomationObject::ConvertOleToVariant: Unknown variant value type
: Column 2 has no data
: ---------

: Get data by columns:
obtaining 0 values in total (from 2 columns) took 0 ms.

: Error: wxAutomationObject::ConvertOleToVariant: Unknown variant value type
: Row 1 has no data
: Error: wxAutomationObject::ConvertOleToVariant: Unknown variant value type
: Row 2 has no data
: Get data by rows:
obtaining 0 values in total (from 2 rows) took 0 ms.

:

*************************
: Next step will be reading data from the following sheet:
Sheet.Name: Arkusz2
Sheet.UsedRange properties:
Address: $A$1
Count: 1
Columns.Count: 1
Rows.Count: 1
: Column 1 has no data
: ---------

: Get data by columns:
obtaining 0 values in total (from 1 columns) took 0 ms.

: Row 1 has no data
: Get data by rows:
obtaining 0 values in total (from 1 rows) took 0 ms.

:

*************************
: Next step will be reading data from the following sheet:
Sheet.Name: Arkusz3
Sheet.UsedRange properties:
Address: $A$1
Count: 1
Columns.Count: 1
Rows.Count: 1
: Column 1 has no data
: ---------

: Get data by columns:
obtaining 0 values in total (from 1 columns) took 0 ms.

: Row 1 has no data
: Get data by rows:
obtaining 0 values in total (from 1 rows) took 0 ms.
Regards
Greg

PB
Part Of The Furniture
Part Of The Furniture
Posts: 2512
Joined: Sun Jan 03, 2010 5:45 pm

Post by PB » Sat Dec 11, 2010 3:13 pm

greg wrote:I'm really confused.
I pasted Your code in my project, and still not working.
I'm testing on Excel 2003. I've attached my *.xls file named a.xls .
It is really a very simple file.
I don't see any file attached to your post. Please post the file so I can test it.
greg wrote: Here is my output.
Sheet.UsedRange properties:
Address: $A$1:$B$2
Count: 4
Columns.Count: 2
Rows.Count: 2
: Error: wxAutomationObject::ConvertOleToVariant: Unknown variant value type
: Column 1 has no data
: Error: wxAutomationObject::ConvertOleToVariant: Unknown variant value type
As you can see there are OLE errors, I don't get these with my files. What are the cell types in A1:B2 of the first sheet? Well, I guess I will see it when you post the file in question. Unfortunately, it looks I won't be able to do anything about it, as you can see it fails in wxWidgets/OLE, unable to convert the type. See function
wxConvertOleToVariant in %WXWIN%\src\msw\ole\oleutils.cpp. But post the file anyway, I'd like to see what's in those cells.

PB
Part Of The Furniture
Part Of The Furniture
Posts: 2512
Joined: Sun Jan 03, 2010 5:45 pm

Post by PB » Sun Dec 12, 2010 1:45 pm

I've spent some more time looking into the issue and I think I have bad news. I believe that OLE wrapper layer in wxWidgets is fundamentally flawed. This is not a surprise, as it is a part of wxWidgets specific only to MSW port and probably very rarely used. See e.g. lack of responses in this thread.

wxWidgets OLE wrapper is based on wxVariant, which provides only a subset of types VARIANT does. Among those wxVariant doesn't support is unfortunately a type often in used in MS Excel: CY (currency, implemented as 8-byte integer scaled by 10,000). wxVariant<->VARIANT conversion functions in oleutils.cpp fail with those unknown types, setting the function return value to false and variant type to null. I don't think there's an easy way to fix it in wxWidgets, aside from adding new types (at least CY, DECIMAL (128-bit integer), SCODE (32-bit integer) and maybe R_4 (float)) to wxVariant and handling those types in conversion functions. This might not be as easy as it sounds, because some types are exactly same C types (e.g. (in 32-bit code) SCODE is a long), resulting in ambiguous constructors and operators.

I've also noticed that e.g. conversion from wxVariant to VARIANT.boolVal doesn't follow the specs: boolVal is is supposed to be either 0xFFFF (true) or 0 (false), all other values are illegal. wxWidget seem to use C++ built-in bool to short conversion which I believe results to 1 or 0. Nevertheless, it seems to work.

I think it's also not possible to call methods with variable number of arguments while omitting some. This is done using VT_ERROR and scode, also not possible with wxVariant.

One more thing I believe would be nice is to have access to HRESULT from the last OLE call, currently it is swallowed in wxWidgets functions and unavailable in user code.

But I will be pleased when someone smarter and more experienced among forum readers proves that my conclusion is utterly wrong. If no one does, ask in the mailing list, someone there (Vadim Zeitlin?) might be able to help you.

PB
Part Of The Furniture
Part Of The Furniture
Posts: 2512
Joined: Sun Jan 03, 2010 5:45 pm

Post by PB » Sun Dec 12, 2010 4:19 pm

One last thing. Below is a quick hack that allows you to get (and only get) values from Excel even if the type of VARIANT is VT_CY. wxVariant type will be set to double. Precision is just 4 digits to the right, if the cyVal is REALLY big it might overflow.
Insert the following lines right before the last default label in wxConvertOleToVariant() in oleutils.cpp:

Code: Select all

case VT_CY:               
   variant = wxLongLong(oleVariant.cyVal.Hi, oleVariant.cyVal.Lo).ToDouble() / 10000.;
   break;
Recompile wxWidgets. No guarantee provided, use at your own risk when you get really desperate. I have no solution for DECIMAL, but I don't think this type is often used (or even can be used) in MS Excel.

The other and probably much better (but still not ideal) solution would be to use Range.Value2 property instead of Range.Value. If this is the case, MS Excel won't use date and currency types and return these types as doubles. You can easily recreate wxDateTime from double, you only have to know that you're dealing with dates.

I still don't have any solution for VT_ERROR, which is returned for cells having error status, usually as a result from a formula, e.g. division by zero, referencing unknown range etc. :(

greg
Earned some good credits
Earned some good credits
Posts: 111
Joined: Fri Sep 12, 2008 7:17 am

Post by greg » Mon Dec 13, 2010 9:27 am

Many many Thanks to PB for Your help.
Your code really works!
I've compiled using WX ver 2.8.2 and that was my fault.
Now ported to 2.8.10 and started to work finally.
Can You tell me what WX version did You tested Your code?
Regards
Greg

PB
Part Of The Furniture
Part Of The Furniture
Posts: 2512
Joined: Sun Jan 03, 2010 5:45 pm

Post by PB » Mon Dec 13, 2010 9:39 am

I use wxWidgets 2.9 current SVN trunk, of course in Unicode.
I don't get how did you solve the problem I deemed to be unsolvable? I mean lack of certain VARIANT types support in wxVariant. Did you use Range.Value2 to bypass currency problems? Or that ugly hack in oleutils.cpp I suggested earlier? Even if you did one or another, you still won't get values from cells that have content like "#DIV/0!" or "#REF!". Even if there's only one cell with such a value, wxConvertOleToVariant and thus my GetDataFromRange() fails for the whole range. The only solution that comes to my mind, still far from perfect:
1. obtain the range count for range used in call of GetRangeData(range, data)
2. call GetRangeData(range, data)
3. if data.size() < range.count get values one by one.
The problem is that even empty range.Count returns 1 so there's no way how to differentiate between empty range and a range consisting of just 1 error value.

User avatar
doublemax
Moderator
Moderator
Posts: 15068
Joined: Fri Apr 21, 2006 8:03 pm
Location: $FCE2

Post by doublemax » Mon Dec 13, 2010 10:37 am

Hi PB,

you seem to know more about wxAutomation than anyone else. Could you please post your findings to the wx-users mailing list, i don't want your knowledge to get lost ;)
Use the source, Luke!

PB
Part Of The Furniture
Part Of The Furniture
Posts: 2512
Joined: Sun Jan 03, 2010 5:45 pm

Post by PB » Mon Dec 13, 2010 1:11 pm

Hi doublemax!

Unfortunately, I don't know much about wxAutomation. As I wrote in my previous post, before I started reading this thread I had no idea how it works. I also I don't have any previous experience with OLE/Automation. The code I posted here is only demonstrating how to interact with MS Excel Automation interface, there's not much specific to either wxWidgets or MS Automation/OLE. My knowledge of Excel interface comes from a project I had to do in VBA in Excel (oh, the painful memories).

I hope that my "OLE wrapper layer in wxWidgets is fundamentally flawed" didn't came out as too harsh - I'm really struggling when trying to express myself in English. But still, for very understandable reasons I listed before, wxAutomation is not exactly a centre of attention of wxWidgets developers. When I was looking for information on wxWidgets Trac I think I even saw a suggestion of moving automation out of wxWidgets to contrib.

It could be possible to enhance wxVariant so it supports more VARIANT types: VT_CY (MS Windows type CY, 64-bit integer, could be based on wxLongLong), VT_ERROR (MS Windows type SCODE = HRESULT = 32-bit long), I don't think wxWidgets have a type compatible with VT_DECIMAL (MS Windows type DECIMAL, 128-bit integer).
But this all would probably be useful only for wxAutomation. I don't believe wxWidgets maintainers would be willing to accept such changes, useful only for a part of library that is available only on one platform and hardly anyone ever uses. Also, as of wxWidgets 2.9.1, wxVariant is considered to be somewhat deprecated, we are supposed to use wxAny instead.
(Assuming familiarity with wxVariant implementation) I also have to admit I don't have any idea how to "fix" wxVariant without breaking its current implementation. I mean, if you write variant = 12L, how do you know if the variant type should be "long" or "errorcode" - operator=(long) or wxVariant(long) would be ambiguous (similarly for "longlong"/"currency" or for VT_R4 "double"/"float").
Last but not least there might be other issues with wxAutomation, e.g. OLE interface reference counting not being properly implemented. I believe I saw a recentish bug report hinting that, no idea if the report was right and if it was fixed.

TL;DR: For reasons listed above, I am not going to bother wxWidgets developers, particularly when I don't have an ideal solution (=code) at hand.

Maybe Greg, who unlike me needs to use wxAutomation right now, would be willing to provide his solution?

Best regards,
Petr B.

Post Reply