[ANN] wxJSON new versions planned

luccat
Knows some wx things
Posts: 31
Joined: Tue Oct 23, 2007 9:10 am
Location: Italy

Post by luccat »

Hi all,
the wxJSON library is now a complete implementation of the JSON specification, but I recently received some feature requests from wxJSON users, so I plan to release a few new versions of the library.

The first one is a workaround to correctly build the library on MinGW.
The second one is an extension of the JSON syntax.
The third version will not add new features, only speed improvements. Compared to other JSON implementations, wxJSON is very slow at parsing JSON streams: some simple optimisations will surely speed it up.

To learn more about the new releases, read the following page:

http://luccat.users.sourceforge.net/wxj ... er_01.html


Regards
Luciano
Last edited by luccat on Sat Oct 03, 2009 5:06 pm, edited 1 time in total.
catalin
Moderator
Posts: 1618
Joined: Wed Nov 12, 2008 7:23 am
Location: Romania

Post by catalin »

Hi Luciano,

Great job with wxJSON!

If I may add a request: please make it work with wxW 2.9.x.

I made some changes to v1.0 so that it compiles, and I actually use it with 2.9.x, but reading and writing with streams is broken in my hack. Reading and writing through wxString (with wxString::utf8_str() for writing) works, though.
My changes made it incompatible with v2.8.x because wxString::GetChar is different there.

Thanks!

Post by luccat »

Hi,
I will make wxJSON compile and work with wxW 2.9, although I think that it is not wise to break compatibility with a stable release.
I read the docs for wxW 3.0 and I noticed that wxString has totally changed.
By the way wxString::GetChar() is very inefficient on platforms that use UTF-8 as the internal encoding.

Just a question: I have never compiled a development release of wxWidgets. Do I have to check out the SVN sources, or is it easier to get the daily snapshot?
I use Linux for development; do you have any tips for compiling the library, or can you post some links?

Luciano

Post by catalin »

luccat wrote:I think that it is not wise to break the compatibility with a stable release.
I think so too; there may be better ways but maybe you'll know better :)

luccat wrote:wxString::GetChar() is very inefficient on platforms that use UTF-8 as the internal encoding.
For character access I don't think there is an obvious alternative.. But you were saying something about converting the whole string..?

luccat wrote:Do I have to checkout the SVN sources or is it easier to get the daily snapshot?
AFAIK there is no daily-snapshot-like archive, so yes, I checked out the files from SVN.
The closest thing to a stable 2.9.x branch is WX_2_9_0_BRANCH (currently in RC6 state; it will just be renamed when it reaches a "release" state), while the trunk, where all the changes are made, is in TRUNK.
Compiling 2.9.x is just like compiling from a 2.8.x archive: put the sources in a folder and build. It may even be easier, as it contains some VS project files (not applicable for you on Linux..). At least this is how I did it.
I used CodeBlocks to import the VS project/solution and then just built; maybe you can do that too.
One thing: you'll need to remove the -D compiler option from the 'Release' build. Debug information when building with gcc is usually not wanted for Release builds.

I'd vote for using 'the latest branch' rather than the trunk for building wxJSON, but theoretically neither of them should fail at compilation.. If it matters, I built my version of wxJSON with both and the result was the same (OK).

Post by luccat »

Well,
I am now working on the wxW 2.9 port of wxJSON and I think that this is the right place to do the speed optimisation that I left for last in my future plans.
So the next releases of wxJSON will be:

1.1: compatible with both wxW 2.8 and 2.9
1.2: use of STL containers
1.3: implementation of the new 'binary buffer' JSON data type.

I will update the documentation page in the next few days.

Luciano

Post by catalin »

Hi there,

I thought of saying hello :)


..And a few more things:

- wxLogTrace changed somewhere in the trunk after 2.9.0 was released. I think it should still be OK if WXWIN_COMPATIBILITY_2_8 is set to 1, and I suppose it should work with the flag set to 0 too, but I thought I'd let you know in case you want to take a look there.

- I also have a request, or rather a suggestion, because I'm not 100% sure it is feasible: can the error/warning system be extended so that an incorrect access is recorded, or at least a flag is set?
I'll give you an example: a wxJSONValue contains an int, but the code tries to get the value using AsString; it will always return _something_. That is OK, but I suggest that it should somehow 'signal' that an incorrect access was performed.
Why I find this useful: if validation is needed, currently every read of a value needs an additional call to IsXXX(). With a signaling mechanism, after reading some (or all) of the values a single check [at an upper level] for a 'valid' state would be enough.

- some links in the docs on the website are not working - in the detailed description of a function (wxJSONValue::IsBool), the links in "References" and "Referenced by" seem to be wrong.

Regards,
Catalin

Post by luccat »

catalin wrote:wxLogTrace changed somewhere in the trunk, after 2.9.0 was released; I think it should still be ok if WXWIN_COMPATIBILITY_2_8 will be set to 1, but just thought to let you know if you want to take a look there too, and I suppose it should work with the flag set to 0 too
Thanks for the hint.
catalin wrote:I also have a request, or better said a suggestion because I'm not 100% sure it is feasible: can the error/warning system be extended so that an incorrect access is accounted for, or at least a flag is set
This is surely possible. Maybe the best solution is to add an overloaded version of all AsXXXXX functions, like this:

Code: Select all

bool AsInt( int& i )
which returns FALSE if the type is not an INT. Another solution would be to set a static data member to a non-NULL value every time an incorrect access is performed (for example the pointer to the offending wxJSONValue object).
Do you have better ideas?
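
To make the first proposal concrete, here is a minimal sketch of how such an overload could behave. It is only an illustration under stated assumptions: `Value` is a hypothetical stand-in that holds an int or a string, not the real wxJSONValue, and leaving the output argument untouched on a mismatch is my assumption, not a decided behaviour.

```cpp
#include <string>

// Hypothetical stand-in for wxJSONValue: holds either an int or a string.
class Value {
public:
    enum Type { Int, String };

    explicit Value(int i) : m_type(Int), m_int(i) {}
    explicit Value(const std::string& s) : m_type(String), m_int(0), m_str(s) {}

    // Existing style: always returns something, even on a type mismatch.
    int AsInt() const { return m_int; }

    // Proposed style: copies the value into 'i' and returns true only if the
    // stored type is an int; 'i' is left untouched otherwise (an assumption).
    bool AsInt(int& i) const {
        if (m_type != Int) return false;
        i = m_int;
        return true;
    }

    // Same pattern for strings.
    bool AsString(std::string& s) const {
        if (m_type != String) return false;
        s = m_str;
        return true;
    }

private:
    Type m_type;
    int m_int;
    std::string m_str;
};
```

With this shape, several reads can be combined into one result, for example `bool ok = a.AsInt(x) && b.AsString(y);`, so only one check is needed at the end.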

catalin wrote:some links in the docs on the website are not working - in the detailed description of a function (wxJSONValue::IsBool), the links in "References" and "Referenced by" seem to be wrong
All references are generated automatically by doxygen... I will have to look into the doxygen docs.

Luciano

Post by catalin »

luccat wrote:Maybe the best solution is to add a overloaded version of all AsXXXXX functions, like this:

Code: Select all

bool AsInt( int& i )
which returns FALSE if the type is not a INT.
Hmm, this is indeed very good, and just as simple :)
luccat wrote:Another solution would be to set a static data member to a non-NULL value every time an incorrect access is performed (for example the pointer to the offending wxJSONValue object).
I think I'm missing something here.. How would this work if there are multiple json 'trees' used at the same time in the app, and more than 1 are faulty? ..pointers become invalid?
luccat wrote:Do you have better ideas?
I tried to imagine something, but not necessarily a better idea :)
I was thinking about a way to have the errors cached somehow inside each 'tree'.
Maybe if a node is created alone, it will also create its own errorList; otherwise (if it is created by another node) it will use the parent's errorList (..somehow). The errorList could contain pointers to offending nodes (this is stolen from your idea) or even something more complicated (a pair of <nodePtr, strError> ?). When a node is removed, the entries referring to it will also be removed from the tree's errorList, or maybe better: moved to the removed node's new errorList?
If performance is an issue, maybe the constructor could take a flag saying whether to track incorrect accesses.
It definitely sounds more complicated than your first idea, and maybe not extremely useful... I'm not entirely convinced by it either :)

Post by luccat »

catalin wrote:

Code: Select all

bool AsInt( int& i )
Hmm, this is indeed very good, and just as simple :)
This will be implemented in version 1.1
catalin wrote:
luccat wrote:Another solution would be to set a static data member to a non-NULL value every time an incorrect access is performed (for example the pointer to the offending wxJSONValue object).
I think I'm missing something here.. How would this work if there are multiple json 'trees' used at the same time in the app, and more than 1 are faulty? ..pointers become invalid?
Well, it is just a flag: it can only contain the pointer to the last object that was incorrectly accessed.
It should also be reset to NULL by the user just before accessing JSON values and tested when finished.
Not very useful, I think.
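
For reference, a minimal sketch of that static-flag variant. `Value`, `ResetBadAccess()` and `LastBadAccess()` are hypothetical names used only for illustration; nothing like this exists in wxJSON.

```cpp
#include <cstddef>

// Hypothetical stand-in for wxJSONValue, holding an int or a double.
class Value {
public:
    enum Type { Int, Double };

    explicit Value(int i) : m_type(Int), m_int(i), m_double(0.0) {}
    explicit Value(double d) : m_type(Double), m_int(0), m_double(d) {}

    int AsInt() const {
        if (m_type != Int) s_lastBadAccess = this;   // remember the offender
        return m_int;
    }
    double AsDouble() const {
        if (m_type != Double) s_lastBadAccess = this;
        return m_double;
    }

    // The caller resets the flag before a batch of reads and tests it after.
    static void ResetBadAccess() { s_lastBadAccess = NULL; }
    static const Value* LastBadAccess() { return s_lastBadAccess; }

private:
    Type m_type;
    int m_int;
    double m_double;
    static const Value* s_lastBadAccess;   // only the last offender is kept
};

const Value* Value::s_lastBadAccess = NULL;
```

The sketch shows the limitation directly: after two bad reads only the second object is remembered, and the stored pointer is only meaningful while that object is still alive.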
catalin wrote: I was thinking about a way to have the errors cached somehow inside each 'tree'.
Maybe if a node is created alone, it will also create its errorsList, otherwise (is created by another node) it'll use the parent's errorList (..somehow). The errorList could contain pointers to offending nodes (this is stolen from your idea) or even something more complicated (a pair of <nodePtr, strError> ?). When a node is removed, the entries referring to it will also be removed from the tree's errorList - or maybe better: moved to the removed node's new errorList?
If performance would be an issue, maybe the constructor could receive a flag, if to use this for incorrect accesses.
It definitely sounds more complicated than your first idea, and maybe not extremely useful... I'm not entirely convinced by it either :)
Mhh, handling pointers is a nightmare. What do you think about keeping an array of strings that contains the paths of incorrectly accessed objects, for example:

Code: Select all

key1/key2/key3[array-item]
Paths are similar to XMLPaths. Also there is a JSONPath implementation which is interesting:

http://goessner.net/articles/JsonPath/
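
A rough sketch of how such path strings could be built, just to fix the format. `PathStep`, `Key()`, `Index()` and `FormatPath()` are hypothetical helpers, and rendering the array item as a numeric index in brackets is an assumption.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One step of a path: either an object key or an array index.
struct PathStep {
    std::string key;   // used when isIndex is false
    int index;         // used when isIndex is true
    bool isIndex;
};

PathStep Key(const std::string& k) {
    PathStep s;
    s.key = k;
    s.index = 0;
    s.isIndex = false;
    return s;
}

PathStep Index(int i) {
    PathStep s;
    s.index = i;
    s.isIndex = true;
    return s;
}

// Formats the steps as "key1/key2/key3[2]": keys are joined with '/',
// array indexes are appended in square brackets.
std::string FormatPath(const std::vector<PathStep>& steps) {
    std::ostringstream out;
    for (std::size_t n = 0; n < steps.size(); ++n) {
        if (steps[n].isIndex) {
            out << '[' << steps[n].index << ']';
        } else {
            if (n > 0) out << '/';
            out << steps[n].key;
        }
    }
    return out.str();
}
```

Each incorrect access would then append one such string to the array.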


Luciano

Post by catalin »

Hi,
I'm rather slow in replying these days... I hope it is because of the bad weather outside :)

luccat wrote:

Code: Select all

bool AsInt( int& i )
This will be implemented in version 1.1
Thanks!

luccat wrote:
catalin wrote: I was thinking about a way to have the errors cached somehow inside each 'tree'.
Maybe if a node is created alone, it will also create its errorsList, otherwise (is created by another node) it'll use the parent's errorList (..somehow). The errorList could contain pointers to offending nodes (this is stolen from your idea) or even something more complicated (a pair of <nodePtr, strError> ?). When a node is removed, the entries referring to it will also be removed from the tree's errorList - or maybe better: moved to the removed node's new errorList?
If performance would be an issue, maybe the constructor could receive a flag, if to use this for incorrect accesses.
It definitely sounds more complicated than your first idea, and maybe not extremely useful... I'm not entirely convinced by it either :)

mhh, handling pointers is a nightmare.
I don't think it will be in this case.
I suggested the pointers (the memory addresses) as unique ids for the nodes, not for managing the nodes. It should be seen as handling ids rather than pointers.
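
Roughly what I mean, as a sketch only: `Node` and `ErrorList` are hypothetical names, and the addresses are stored and compared but never dereferenced.

```cpp
#include <set>

// Hypothetical stand-in for a JSON node.
struct Node {
    int value;
};

// Per-tree list of offending nodes, keyed by address used as an opaque id.
class ErrorList {
public:
    void Report(const Node& n) { m_ids.insert(&n); }   // incorrect access seen
    void Forget(const Node& n) { m_ids.erase(&n); }    // node removed from the tree
    bool Contains(const Node& n) const { return m_ids.count(&n) != 0; }
    bool HasErrors() const { return !m_ids.empty(); }

private:
    std::set<const void*> m_ids;   // ids only: compared, never dereferenced
};
```

A single `HasErrors()` call after reading all the values would be the check at the upper level.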

luccat wrote:What do you think about keeping an array of strings which contains the path to incorrectly accessed objects such as, for example:

Code: Select all

key1/key2/key3[array-item]
I see two drawbacks here: string comparison, and the case where more than one 'tree' has the same name for the first node.

luccat wrote:there is a JSONPath implementation which is interesting:

http://goessner.net/articles/JsonPath/
I agree that it is interesting, but I see it more like "something else", unless I'm missing something..
It would be useful for retrieving data from multiple elements, for using wildcard-like expressions, or for checking whether an explicit path exists, but not for checking whether a value is accessed correctly (e.g. asking for a double from a node that exists but contains a bool).

Catalin

Post by luccat »

Hi,
I have had a totally new idea for keeping track of incorrectly accessed values. For now it is only a draft; I still have to solve some problems:

the errorList itself would be a wxJSONValue structure that mirrors the tree of the value it refers to.
An access to a value will create a node in the error list with two possible values:

1. TRUE if correctly accessed
2. a string describing the error in the opposite case

For example consider the following wxJSONValue:

Code: Select all

{
  "key1" : 10,
  "key2" : {
    "subkey1" : 100,
    "subkey2" : "a string"
  }
}
We access two nodes, the first correctly and the second incorrectly:

Code: Select all

  int i = root["key1"].AsInt();   // OK
  double d = root["key2"]["subkey2"].AsDouble();   // incorrect: "subkey2" is a string
The error list would contain:

Code: Select all

{
  "key1" : TRUE,
  "key2" : {
    "subkey2" : "accessing a string as double"
  }
}
No entry in the error list would be created for "key2/subkey1" because it was never accessed.
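
A rough sketch of the bookkeeping, only to make the draft easier to discuss. It uses plain standard containers instead of wxJSONValue, and `AccessNode`, `RecordAccess()` and `FindAccess()` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical node of an error list that mirrors the accessed part of a tree.
struct AccessNode {
    bool accessedOk;     // true: accessed correctly (the TRUE entry)
    std::string error;   // non-empty: description of the incorrect access
    std::map<std::string, std::shared_ptr<AccessNode> > children;
    AccessNode() : accessedOk(false) {}
};

// Records one access. 'path' lists the keys from the root to the value;
// an empty 'error' means the access was correct.
void RecordAccess(AccessNode& root, const std::vector<std::string>& path,
                  const std::string& error) {
    AccessNode* node = &root;
    for (std::size_t i = 0; i < path.size(); ++i) {
        std::shared_ptr<AccessNode>& child = node->children[path[i]];
        if (!child) child = std::make_shared<AccessNode>();
        node = child.get();
    }
    node->accessedOk = error.empty();
    node->error = error;   // overwrites whatever an earlier access recorded
}

// Returns the entry at 'path', or a null pointer if it was never accessed.
const AccessNode* FindAccess(const AccessNode& root,
                             const std::vector<std::string>& path) {
    const AccessNode* node = &root;
    for (std::size_t i = 0; i < path.size(); ++i) {
        std::map<std::string, std::shared_ptr<AccessNode> >::const_iterator it =
            node->children.find(path[i]);
        if (it == node->children.end()) return nullptr;
        node = it->second.get();
    }
    return node;
}
```

Note that in this sketch a later access to the same path simply replaces the earlier entry.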

The problems to solve are:

1. If a node is accessed twice, first incorrectly and then correctly, only the second access would be registered, as a correct access.

2. If a node is accessed using subnodes, should the error list still register the access? For example:

Code: Select all

  wxJSONValue key2 = root["key2"];
  double d = key2["subkey2"].AsDouble();
3. Because wxJSONValue does not have a parent link, it is hard to know whether a subnode was accessed through the root node or through a subnode.

4. I have no idea (for now) how to track subnode access in the operator[] function.

5. If an object / array node was incorrectly accessed, then its subnodes cannot be traced further for incorrect access. Example:

Code: Select all

  // "key2" is a key/value pair
  int i = root["key2"].AsInt();

  // the error list would contain:
  // {
  //   "key1" : TRUE,
  //   "key2" : "accessing a key/value pair as int"
  // }
But when the "key2" subnodes are accessed later, that access will delete the string reporting the incorrect access to "key2".
A big advantage of this solution is that we can have a visual representation of the accessed values, both correctly and incorrectly accessed ones, simply by writing the error list via a wxJSONWriter object.

What do you think about this? Could this be the correct path (provided that I solve the above problems)?

Luciano

Post by catalin »

Hi,
luccat wrote:I have had a totally new idea for keeping track of incorrect accessed values.
[...]
the errorList itself would be a wxJSONValue structure that contains the same tree of the value it refers to.
Hmm.. but won't it then look more like logging than a quick/simple check for errors?
Maybe the start should be with [simple] error catching and later it could be extended to contain more info about each access.
luccat wrote:1. If a node is accessed twice, the first one incorrectly and the second one correctly, only the second access would be registered as a correct access.
Don't do anything for correct access.
luccat wrote:2. if a node is accessed using subnodes, should the error list still register the access?
IMO it should...
luccat wrote:3. Because wxJSONValue does not have a parent link it is hard to know if a subnode was accessed using the root node or a subnode.

4. I have no idea (by now) to track subnodes access in the operator[] function
These are the parts which need the best ideas :)
luccat wrote:5. If an object / array node was incorrectly accessed than its subnodes cannot be further traced for incorrect access.
This is why I thought of unique ids.
luccat wrote:when an access to the "key2" subnodes is done it will delete the string reporting the incorrect access to "key2".
Same as 5.
luccat wrote:A big advantage of this solution is that we can have a visual representation of accessed values - either correctly and incorrectly - simply by writing the error list via a wxJSONWriter object.
I can see the advantages, but I'm afraid it will get relatively complicated, and not very fast because of the strings..
luccat wrote:What do you think about this?
I'd vote for a simpler initial approach, and I think it could be easily extended to include all the other details.

Post by luccat »

Hi,
I have now committed version 1.1 of wxJSON to the SVN repository. It is compatible with 1.0 except for one aspect:

in ANSI builds the reader no longer stores Unicode escape sequences for unrepresentable characters, such as a Greek letter in a Latin-1 environment:

Code: Select all

\u03B1
This is not a compatibility break but a bug fix for the following reasons:

1. Unrepresentable chars are stored as 4 hex digits, so only the first Unicode plane (the BMP) can be represented
2. The \uXXXX sequence should be used only for control characters
3. Writing the sequence back to UTF-8 streams does not revert it to UTF-8, so I think that this is not valid JSON text (although the wxJSON reader is able to read it correctly)

Now the reader tries to convert the UTF-8 stream containing a string value to a wxString object using wxString::FromUTF8(). In Unicode builds the conversion always succeeds, but in ANSI builds it may fail because of unrepresentable chars. If the conversion fails, the UTF-8 buffer is simply copied to the wxString object. Note that in this case writing the string back to UTF-8 gives a different result.
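
The fallback logic, sketched with plain std::string so it can be tried without wxWidgets. This is not the wxJSON code: `Utf8ToLatin1()` and `ReadJsonString()` are hypothetical helpers, and Latin-1 is assumed as the ANSI encoding.

```cpp
#include <cstddef>
#include <string>

// Tries to convert UTF-8 text to Latin-1 (one byte per character).
// Returns false if the input is malformed or contains a code point above U+00FF.
bool Utf8ToLatin1(const std::string& utf8, std::string& out) {
    std::string result;
    std::size_t i = 0;
    while (i < utf8.size()) {
        unsigned char c = static_cast<unsigned char>(utf8[i]);
        if (c < 0x80) {
            // Single byte: plain ASCII.
            result += static_cast<char>(c);
            i += 1;
        } else if ((c & 0xE0) == 0xC0 && i + 1 < utf8.size()) {
            // Two-byte sequence: may or may not fit into Latin-1.
            unsigned char c2 = static_cast<unsigned char>(utf8[i + 1]);
            if ((c2 & 0xC0) != 0x80) return false;      // bad continuation byte
            unsigned int cp = ((c & 0x1Fu) << 6) | (c2 & 0x3Fu);
            if (cp < 0x80 || cp > 0xFF) return false;   // overlong or unrepresentable
            result += static_cast<char>(cp);
            i += 2;
        } else {
            // 3- and 4-byte sequences always encode code points above U+00FF;
            // stray continuation bytes and truncated input also end up here.
            return false;
        }
    }
    out = result;
    return true;
}

// Mirrors the described behaviour: convert if possible,
// otherwise copy the raw UTF-8 bytes unchanged.
std::string ReadJsonString(const std::string& utf8) {
    std::string converted;
    if (Utf8ToLatin1(utf8, converted)) return converted;
    return utf8;
}
```

In the sketch, "café" in UTF-8 becomes four Latin-1 bytes, while a Greek alpha (bytes CE B1) cannot be converted and is kept as its two raw bytes.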

The new features are:

1. the library and test application compile on both wxGTK 2.8 and 2.9; please let me know if you are compiling on different systems / compilers.

2. added the

Code: Select all

bool wxJSONValue::AsXxxxx(T&)
function which can be used to get the value and test if it is of the expected type in only one call.

3. the reader and the writer were totally reorganized: they now process UTF-8 streams, and strings are converted in a single step, no longer char by char, so there is a speed improvement of 30-50%.

The documentation is not yet fully updated. When it is finished I will upload a new release to SF.