tcl - explanation on the following regexp and regsub commands

https://stackoverflow.com/questions/23094188

regex
tcl

04-07-2023

Question

OK, so a Tcl expert here (Brad Lanam) wrote the following regexp and regsub commands in a Tcl script to parse my file format (Liberty, .lib, used in chip design). I just want to know what they mean (if not why they were used, since you don't have the context). I have used the references on the Tcl wiki but simply cannot seem to connect the dots. Here's the snippet of his code:

set fh [open z.lib r]
set inval false
while { [gets $fh line] >= 0 } {
  if { [regexp {\);} $line] } {
    set inval false
  }
  if { [regexp {index_(\d+)} $line all idx] } {
    regsub {^[^"]*"} $line {} d
    regsub {".*} $d {} d
    regsub -all {,} $d {} d
    dict set risedata constraints indexes $idx $d
  }
  if { $inval } {
    regsub {^[^"]*"} $line {} d
    regsub {".*} $d {} d
    regsub -all {[ ,]+} $d { } d
    set row [expr {$rcount % 5}]
    set column [expr {$rcount / 5}]
    set i 0
    foreach {v} [split $d { }] {
      set c [lindex [dict get $risedata constraints indexes 3] $i]
      dict set risedata constraints constraint $c $row $column $v
      incr i
    }
    incr rcount
  }
  if { [regexp {values} $line] } {
    set inval true
    set row 0
    set rcount 0
  }
}
close $fh

In particular, what does

if { [regexp {index_(\d+)} $line all idx] } {
        regsub {^[^"]*"} $line {} d
        regsub {".*} $d {} d
        regsub -all {,} $d {} d

mean? Does \d+ search the line variable for one or more digits and match them against the string in line? And what does regsub {^[^"]*"} $line {} d do?

Big thanks for helping a noob like me understand.

Reference: Brad Lanam

Solution

I'll take it line-by-line and explain what it appears to be doing.

if { [regexp {index_(\d+)} $line all idx] } {

This first line checks whether the string stored in line contains the substring index_ followed by one or more digits. If so, it stores the matching substring in all (which the rest of the code appears to ignore) and stores the digits in the variable idx.

So if line were set to "stuff index_123 more stuff", you would end up with all set to index_123 and idx set to 123.
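To try that out in tclsh, here is a minimal sketch. The sample strings are the made-up one from the sentence above plus a hypothetical non-matching one; the variable names all2, idx2 and matched are only for illustration and are not part of the original script.

```tcl
# Sample input taken from the explanation above, not from a real .lib file.
set line {stuff index_123 more stuff}

# regexp returns 1 on a match and 0 otherwise.
# "all" receives the whole match, "idx" the first parenthesised group.
if { [regexp {index_(\d+)} $line all idx] } {
  puts "all=$all idx=$idx"
}

# When nothing matches, regexp returns 0 and does not set the match variables.
set matched [regexp {index_(\d+)} {no index here} all2 idx2]
puts "matched=$matched idx2 exists: [info exists idx2]"
```

Because regexp returns 0 or 1, it can sit directly inside the if condition, which is how the original script uses it.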

regsub {^[^"]*"} $line {} d

This regsub will remove everything from the beginning of line up to and including the first double-quote. It stores the result in d.
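The pattern can be read piece by piece, as in the sketch below. The input line is a hypothetical Liberty-style line invented for illustration, since the question does not show the real file, and n and d2 are illustrative names.

```tcl
# Hypothetical Liberty-style line, made up for illustration.
set line {index_1 ("0.01, 0.05, 0.10");}

# ^      anchors the match at the start of the string
# [^"]*  matches any run of characters that are not a double-quote
# "      then matches the first double-quote itself
# The matched text is replaced by the empty string {} and the result lands in d.
# regsub itself returns the number of substitutions made.
set n [regsub {^[^"]*"} $line {} d]
puts "n=$n d=$d"

# If the line has no double-quote, nothing is replaced and d2 is a plain copy.
set n2 [regsub {^[^"]*"} {no quotes here} {} d2]
puts "n2=$n2 d2=$d2"
```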

regsub {".*} $d {} d

The next regsub operates on the value now in d. It looks for a double-quote and removes that character and everything afterward, storing the result again in d.

regsub -all {,} $d {} d

Finally, this line deletes any commas found in d, storing the result back in d.
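Put together, the three steps can be traced as follows. This is only a sketch on a hypothetical input line; my guess from the later lindex call in the script is that the commas are dropped so that d can be read as a space-separated Tcl list.

```tcl
# Hypothetical Liberty-style line, made up for illustration.
set line {index_3 ("0.01, 0.05, 0.10");}

# Step 1: drop everything up to and including the first double-quote.
regsub {^[^"]*"} $line {} d
set step1 $d

# Step 2: drop the next double-quote and everything after it.
regsub {".*} $d {} d
set step2 $d

# Step 3: delete every comma (-all replaces all matches, not just the first).
regsub -all {,} $d {} d

puts "step1: $step1"
puts "step2: $step2"
puts "step3: $d"

# With the commas gone, d behaves like a three-element Tcl list.
puts "second element: [lindex $d 1]"
```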

The next set of regexp/regsub lines performs a similar set of operations, except for the last line in the group:

regsub -all {[ ,]+} $d { } d

After the previous lines have removed everything except the section that had been in double-quotes, this line replaces every run of one or more spaces and commas with a single space.
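The reason for collapsing runs probably has to do with the split on a single space that follows in the script: split produces an empty element for every extra separator. A small sketch, using a hypothetical row with deliberately uneven spacing (naive and fields are illustrative names):

```tcl
# Hypothetical row from a "values" block, with uneven spacing on purpose.
set line {    "0.10, 0.20 ,0.30,  0.40",}

# Keep only the text between the double-quotes, as in the earlier steps.
regsub {^[^"]*"} $line {} d
regsub {".*} $d {} d

# Deleting only the commas leaves a double space behind,
# so split yields an extra empty element.
regsub -all {,} $d {} naive
puts "naive:     [llength [split $naive { }]] fields"

# [ ,]+ matches each run of spaces and/or commas; every run becomes one space.
regsub -all {[ ,]+} $d { } d
set fields [split $d { }]
puts "collapsed: [llength $fields] fields -> $fields"
```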

Let me know if that is clear.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow