Question

The technical details

I want to EXTRACT values from a text file containing parameter names and values. For each line that starts with "request.config." (there are also empty lines, comment lines, etc. that I don't want to extract anything from) I want to extract the parameter name and its value (here my_param_1 and "some random string"):

request.config.my_param_1 = "some random string";

I thought the best way to do this might be using REGEX, but how can I do this?

I thought there would be something like a regular expression that would extract the 2 values request.config.${1} = ${2}; and retrieve ${1} and ${2}, for each line, but only if it matches.

I tried experimenting but it did not work:

<cfset str = "request.config.MY_PARAM_NAME = 'The parameter VALUE!!';">
<cfset arrSearch = rematch("^request.config.(.*?) = (.*?);$", str) >
<cfdump var="#arrSearch#" label="Extracted values">

Unfortunately, this code gives me the FULL STRING I already had, I just want the 2 extracted values!

Some META : WHAT I'm trying to do

I am building a web app that lets end-users modify some application parameters which are stored in a params_file.cfm. Instead of having developers change the variables manually in the file, we want to be able to do it from within the application.

My application first makes an AJAX call to the backend, which reads the params file, gets all the data pairs (param_name, param_value and possibly later on a param_description) and returns them as JSON to populate a list that I can search by name with an autocomplete tool (Typeahead.js for the curious). When I select a parameter name, its value appears along with some controls to modify it (the controls depend on the data type; jQuery is used to determine the type).

The thing is, the param_value can take many forms: because this params file is maintained by different people, it can have different syntax. For example, a boolean can be stored as "TRUE", 'true', TRUE, true (you get the idea).

Since the SerializeJSON handles the types (booleans, numbers, strings) I thought my REGEX should return me the text WITHOUT the quotes (single or double) but I am having trouble crafting that expression.

I got

<cfset match = REFind("^request\.config\.(\S+) = ['|""]?(.*)['|""]?;$", str, 1, "Yes")>

and I tested it with request.config.my_param_1 = 'MYTEST123'; and it ONLY REMOVES THE FIRST SINGLE QUOTE, for some reason the expression returns me MYTEST123' when I don't want any surrounding quote. I need HELP with my REGEX


Solution

You don't want REMatch, you want REFind (docs):

REFind(reg_expression, string [, start, returnsubexpressions ] )

returnsubexpressions is what you need, so...

<cfset str = "request.config.MY_PARAM_NAME = 'The parameter VALUE!!';">
<cfset match = REFind("^request\.config\.(\S+) = (.*);", str, 1, "Yes")>

<cfdump var="#match#">

match will be a Struct with two keys (POS and LEN), each holding an array that lists the positions and lengths of the overall match and of each sub-match.

You can then feed this information to Mid() and cut out the actual substrings.

Don't forget to check whether REFind succeeded: ArrayLen(match.POS) must be 3 in your case (1 overall match plus two match groups, think $0 .. $2).
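As a minimal sketch of the steps above, the following feeds the POS and LEN arrays to Mid() after checking the array length. The variable names paramName and paramValue are made up for illustration, and the sketch assumes the value is not empty. Note that CFML arrays are 1-based, so index 1 is the overall match and indexes 2 and 3 are the two groups.

```cfml
<cfset str = "request.config.MY_PARAM_NAME = 'The parameter VALUE!!';">
<cfset match = REFind("^request\.config\.(\S+) = (.*);", str, 1, true)>

<!--- Three entries: the overall match plus both captured groups --->
<cfif ArrayLen(match.pos) EQ 3>
    <!--- Index 1 is the whole match; 2 and 3 are the groups --->
    <cfset paramName  = Mid(str, match.pos[2], match.len[2])>
    <cfset paramValue = Mid(str, match.pos[3], match.len[3])>
    <cfoutput>#paramName# = #paramValue#</cfoutput>
<cfelse>
    <cfoutput>No match</cfoutput>
</cfif>
```

With this pattern the captured value still includes its surrounding quotes, because (.*) takes everything up to the last semicolon.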

To find all occurrences in the entire file, either

  • run this function in a loop, setting start to match.POS[1] + match.LEN[1] for the next iteration
  • or loop through the file in a line-by-line manner, via <cfloop list> with newline Chr(10) as delimiter or via <cfloop array> and ListToArray(file, Chr(10)).
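The line-by-line option could look roughly like the sketch below. The file path, the params struct and the variable names are assumptions for illustration, not part of the original code. The pattern also differs from the one above: it is one possible way to drop the surrounding quotes, using a lazy (.*?) together with the $ anchor so that the optional closing quote gets a chance to match instead of being swallowed by a greedy (.*).

```cfml
<!--- Hypothetical location of the params file; adjust as needed --->
<cfset fileContent = FileRead(ExpandPath("./params_file.cfm"))>
<cfset params = StructNew()>

<!--- cfloop list skips empty elements, so blank lines are ignored --->
<cfloop list="#fileContent#" index="line" delimiters="#Chr(10)#">
    <!--- Trim() drops the Chr(13) left behind by Windows line endings --->
    <cfset line = Trim(line)>
    <!--- Lazy (.*?) plus $ lets the optional closing quote match --->
    <cfset match = REFind("^request\.config\.(\S+) = ['""]?(.*?)['""]?;$", line, 1, true)>
    <cfif ArrayLen(match.pos) EQ 3>
        <cfset paramName = Mid(line, match.pos[2], match.len[2])>
        <!--- An empty value has length 0, so skip Mid() for it --->
        <cfset paramValue = "">
        <cfif match.len[3] GT 0>
            <cfset paramValue = Mid(line, match.pos[3], match.len[3])>
        </cfif>
        <cfset params[paramName] = paramValue>
    </cfif>
</cfloop>

<cfdump var="#params#" label="Extracted parameters">
```

Lines that do not start with request.config. (comments, other code) simply fail the match and are skipped. The pattern does not check that the opening and closing quotes are the same character, so a stricter version may be needed if the file can contain mismatched quotes.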

Other tips

You'll want to try out refind(), not rematch(), because with returnsubexpressions set to true it returns array data that can be used to get the found subexpressions:

arrsearch = refind("^request\.config\.(.*?) = (.*?);$", line, 1, true)

Returns the whole match: #Mid(line,arrsearch.pos[1],arrsearch.len[1])#

Returns the first subexpression (varname): #Mid(line,arrsearch.pos[2],arrsearch.len[2])#

Returns the second subexpression (value): #Mid(line,arrsearch.pos[3],arrsearch.len[3])#

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow