Regex: rounding all the real numbers in a text file (keeping the 15 decimal digits)

StackOverflow https://stackoverflow.com/questions/21835386

  •  12-10-2022

Question

I have a text file with thousands of entries (the nodes of a mesh) like

7.40000000000060391E+01   7.40000866410523770E+00    
1.05000000970718801E+01   6.40000007900613273E+01
2.40500000000000321E+02   2.40000000428227065E+00   
6.00000000000000000E+00   3.70000085530326430E+01   
7.40000019596598406E+01   6.40000000000000000E+01
3.10000144967919340E+01   1.92000112854581857E+01
6.40000000000000000E+01   6.40004500000000000E+01

where some of my entries have a small error that I would like to remove. I am using TextPad. I would like to keep the leading digits up to the first triple zero and set all the remaining decimal digits to zero. With that rule, the example above would read:

7.40000000000000000E+01   7.40000000000000000E+00    
1.05000000000000000E+01   6.40000000000000000E+01
2.40500000000000000E+02   2.40000000000000000E+00   
6.00000000000000000E+00   3.70000000000000000E+01   
7.40000000000000000E+01   6.40000000000000000E+01
3.10000000000000000E+01   1.92000000000000000E+01
6.40000000000000000E+01   6.40000000000000000E+01

Any suggestions? Thanks, Alberto


Solution

If you don't have JGSoft or a .NET engine at hand, you can try regexhero.net. It's an online regular expression tester powered by .NET.

I have built a demo based on Tim's excellent proposal.

You can:

  • quickly cut and paste your entries
  • have them modified by regexhero
  • and then cut-and-paste the result back to your editor

Note: I am not affiliated in any way with regexhero :)

OTHER TIPS

Interesting problem. This can be easily solved in JavaScript using a regex and the String.replace() method with a callback function as the replacement value. First, a regex needs to be crafted that matches those high-precision floating point numbers which have non-zero digits following three consecutive zero digits in the mantissa. Here is just such a regex, written in Python free-spacing (re.VERBOSE) mode with comments:

A regex to match numbers to be truncated:

import re

re_truncatableFloat = re.compile(r"""
    # Match real/float number having form: 1.23450009012345678E+12
    \b             # Anchor to word boundary.
    (              # $1: Part to be preserved.
      \d+\.        # Integer portion and dot.
      [1-9]*       # Zero or more non-zero decimal digits.
      (?:          # Zero or more significant zeros.
        0          # Allow a zero digit, but only
        (?!00)     # if not start of a triple 000.
        [1-9]*     # Zero or more non-zero decimal digits.
      )*           # Zero or more significant zeros.
      000+         # Three or more zeros mark end of first part.
    )              # End $1: Part to be preserved.
    ([1-9]\d*)     # $2: Mantissa digits to be zeroed out.
    ([Ee][+-]\d+)  # $3: Well formed exponent.
    \b             # Anchor to word boundary.
    """, re.VERBOSE)

Note that this regex matches only those floats which need to be modified. It captures the initial part of the number in capture group $1, the digits to be zeroed in group $2, and finally the exponent in group $3.
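To make the three capture groups concrete, here is a minimal sketch that applies the same pattern (as a one-line JavaScript literal, since JavaScript has no free-spacing mode) to a single number. The helper name splitFloat and the property names keep, zeroOut and exponent are made up for this illustration and are not part of the original answer.

```javascript
// Same pattern as the commented regex above, written on one line.
// No "g" flag here, so exec() always starts matching at index 0.
const reTruncatable = /\b(\d+\.[1-9]*(?:0(?!00)[1-9]*)*000+)([1-9]\d*)([Ee][+-]\d+)\b/;

// Hypothetical helper: return the three captured parts, or null when the
// number does not need fixing (the regex only matches "noisy" floats).
function splitFloat(s) {
    const m = reTruncatable.exec(s);
    return m ? { keep: m[1], zeroOut: m[2], exponent: m[3] } : null;
}

// A noisy entry from the question: trailing "321" after a long run of zeros.
const noisy = splitFloat('2.40500000000000321E+02');
console.log(noisy.zeroOut);   // the digits that would be zeroed: 321
console.log(noisy.exponent);  // E+02

// A clean entry: nothing follows the zeros, so there is no match.
const clean = splitFloat('6.00000000000000000E+00');
console.log(clean);           // null
```

Note how the single zero in "2.405" stays inside the preserved part: the (?!00) lookahead only lets a zero through when it does not start a triple zero.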

A JavaScript function to fix up the text:

The following is a tested JavaScript function which utilizes the above regex with a callback function to solve the problem at hand:

function processText(text) {
    var re = /\b(\d+\.[1-9]*(?:0(?!00)[1-9]*)*000+)([1-9]\d*)([Ee][+-]\d+)\b/g;
    return text.replace(re,
        function(m0, m1, m2, m3){
            m2 = m2.replace(/[1-9]/g, '0');
            return m1 + m2 + m3;
        });
}
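As a quick sanity check, the function can be run on a couple of rows from the question, for example under Node.js or in a browser console. This is only a sketch: processText is repeated so that the snippet is self-contained, and the variable names input and output are chosen for the illustration.

```javascript
// Copy of processText from above, repeated so this snippet runs on its own.
function processText(text) {
    var re = /\b(\d+\.[1-9]*(?:0(?!00)[1-9]*)*000+)([1-9]\d*)([Ee][+-]\d+)\b/g;
    return text.replace(re,
        function(m0, m1, m2, m3){
            // Zero out every non-zero digit that follows the triple zero.
            m2 = m2.replace(/[1-9]/g, '0');
            return m1 + m2 + m3;
        });
}

// First and last rows of the sample data in the question.
const input = [
    '7.40000000000060391E+01   7.40000866410523770E+00',
    '6.40000000000000000E+01   6.40004500000000000E+01'
].join('\n');

const output = processText(input);

// Column widths are unchanged because digits are replaced one for one.
console.log(output);
console.log(output.length === input.length);
```

The printed rows should correspond to the first and last rows of the desired output shown in the question; entries that are already clean pass through untouched because the regex never matches them.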

Single Page Stand Alone Web Application Solution:

The following is a stand-alone web page which incorporates the above JavaScript function:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><title>Process Text 20140217_1300</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style type="text/css" media="screen">
    body {margin: 2em; color:#333; background:#DDB;}
    h1, p {font-family: monospace; text-align: center;}
    textarea {width: 99%;}
</style>
<script type="text/javascript">/* <![CDATA[ */
// Process the input text.
function processText(text) {
    var re = /\b(\d+\.[1-9]*(?:0(?!00)[1-9]*)*000+)([1-9]\d*)([Ee][+-]\d+)\b/g;
    return text.replace(re,
        function(m0, m1, m2, m3){
            m2 = m2.replace(/[1-9]/g, '0');
            return m1 + m2 + m3;
        });
}
/* Read input, process, then write to output */
function handleOnclick() {
    var el_in  = document.getElementById('inbox'),
        el_out = document.getElementById('outbox');
    el_out.value = processText(el_in.value);
    return false;
} /* ]]> */</script>
</head><body>
<h1>Process Text</h1>
<form action="" method="get">
<h2>Input:</h2>
<p>
    <textarea id="inbox" name="inbox" rows="10" cols="80"></textarea>
    <input type="button" id="process" name="process" value="Process"
        onclick="return handleOnclick();"/>
</p>
<h2>Output:</h2>
<p>
    <textarea id="outbox" name="outbox" rows="10" cols="80"></textarea>
Note: Line endings may not be preserved! (i.e.
LF may be changed to CRLF or vice versa)
</p>
</form>
</body></html>

Just save this as an HTML file on your desktop and open it up with your favorite browser. It requires no external resources and can be easily modified/re-purposed to solve similar problems.

Happy regexing!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow