Question

I've been banging my head against this for hours, but I'm obviously lacking fundamental Regex knowledge to do what I want.

I have a WKT (well-known text, see http://en.wikipedia.org/wiki/Well-known_text) string that looks like this:

PROJCS["MGI / Austria GK Central",GEOGCS["MGI",DATUM["Militar_Geographische_Institute",SPHEROID["Bessel 1841",6377397.155,299.1528128000009,AUTHORITY["EPSG","7004"]],AUTHORITY["EPSG","6312"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433],AUTHORITY["EPSG","4312"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",13.33333333333333],PARAMETER["scale_factor",1],PARAMETER["false_easting",0],PARAMETER["false_northing",-5000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AUTHORITY["EPSG","31255"]]

I want to parse this string into key / value pairs. So, as an example:

SPHEROID["Bessel 1841",6377397.155,299.1528128000009,AUTHORITY["EPSG","7004"]] would become:

key: SPHEROID

value: "Bessel 1841",6377397.155,299.1528128000009,AUTHORITY["EPSG","7004"]

By matching against \[(.*?)\] I'm getting all the values (see http://rubular.com/r/6SxMbRMufJ), but I'm losing the keys. How can I create a Regex where the first group is the key, and the second group is the value?

Also, is there a way to split nested values (like key[key[value]]) as well, or do I have to use recursion on every match?


Solution

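A regular expression that does the minimum you are asking for is ([^\[]+?)\[(.*)\]. The first group captures the key (everything before the first opening bracket) and the second group, being greedy, captures everything up to the last closing bracket. It does not split nested values; for those you would have to apply it again to each value, i.e. use recursion.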
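As a quick check, the pattern can be tried on the SPHEROID example from the question. This is only a sketch in Python (the question does not name a language, so that choice and the variable names are assumptions); the same pattern should behave the same way in other common regex flavors.

```python
import re

# Pattern from the answer: group 1 is the key, group 2 is the value.
PATTERN = re.compile(r'([^\[]+?)\[(.*)\]')

wkt = 'SPHEROID["Bessel 1841",6377397.155,299.1528128000009,AUTHORITY["EPSG","7004"]]'

# fullmatch requires the pattern to cover the whole string.
match = PATTERN.fullmatch(wkt)
key, value = match.group(1), match.group(2)

print("key:  ", key)
print("value:", value)

# Caveat: the greedy (.*) runs to the LAST closing bracket, so two
# sibling elements in one string are not separated correctly.
sibling_value = PATTERN.search('A[1],B[2]').group(2)
```

Because the second group is greedy, the pattern is only reliable when the string holds exactly one outer element. In the last line, sibling_value ends up spanning both elements, which is why nested or repeated elements need a different approach.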


However, since you are parsing a specific format you should look for existing parsers that do that.

For example, you can look at the code from http://www.dupuis.me/node/28

Also, http://gis.stackexchange.com has answers that mention other libraries: https://gis.stackexchange.com/questions/13078/how-to-unproject-wkt-to-wkt-in-net
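If pulling in a library is not an option, the nesting part of the question can be handled with a small hand-written recursive parser instead of a regex. The following is a minimal sketch, not a complete WKT implementation: the function names parse_wkt and parse_value are made up for illustration, it assumes quoted strings contain no escaped quotes, and it keeps numbers as strings rather than converting them.

```python
def parse_value(text, pos):
    """Parse one value starting at text[pos]; return (value, next_pos)."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text):
        raise ValueError("unexpected end of input")

    # Quoted string: return its contents without the quotes.
    if text[pos] == '"':
        end = text.index('"', pos + 1)  # assumes no escaped quotes
        return text[pos + 1:end], end + 1

    # Otherwise read a bare token up to the next structural character.
    start = pos
    while pos < len(text) and text[pos] not in '[],':
        pos += 1
    token = text[start:pos].strip()

    # A token followed by '[' is a key with comma-separated children.
    if pos < len(text) and text[pos] == '[':
        children = []
        pos += 1  # skip '['
        while True:
            child, pos = parse_value(text, pos)
            children.append(child)
            while pos < len(text) and text[pos].isspace():
                pos += 1
            if pos >= len(text):
                raise ValueError("missing closing bracket")
            if text[pos] == ',':
                pos += 1
            elif text[pos] == ']':
                return (token, children), pos + 1
            else:
                raise ValueError(f"unexpected character at {pos}")

    # Bare token such as a number; kept as a string.
    return token, pos


def parse_wkt(text):
    """Parse a whole WKT string into nested (key, [values]) tuples."""
    node, pos = parse_value(text, 0)
    if text[pos:].strip():
        raise ValueError("trailing characters after the outer element")
    return node


tree = parse_wkt(
    'SPHEROID["Bessel 1841",6377397.155,299.1528128000009,'
    'AUTHORITY["EPSG","7004"]]'
)
print(tree)
```

Each element becomes a (key, values) tuple, and nested elements such as AUTHORITY appear as tuples inside the parent's value list, so the recursion happens once in the parser rather than on every regex match.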

Licensed under: CC-BY-SA with attribution