Formatting a txt file of equations into the same format and then manipulating them for linear algebra calculations in Python

StackOverflow https://stackoverflow.com/questions/15983553

Question

I'm looking for a universal way of transforming equations in Python 3.2. I've only recently begun playing around with it and stumbled upon some of my old MATLAB homework. I'm able to calculate this in MATLAB, but pylab is still a bit of a mystery to me.

So, I have a text file of equations that I'm trying to convert into the same form, A x = b, and then solve some linear algebra problems associated with them in pylab.

The text file, "equations.txt", contains collections of linear equations in the following format:

-38 y1  +  35 y2  +  31 y3  = -3047

11 y1  + -13 y2  + -34 y3  = 784

34 y1  + -21 y2  +  19 y3  = 2949

etc.

The file contains four sets of equations, each set with a different number of variables. Each set is of the exact form shown (3 example equations above), with one empty line between sets.

I want to write a program to read all the sets of equations in the files, convert sets of equations into a matrix equation A x = b, and solve the set of equations for the vector x.
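As a starting point for reading all the sets, here is a minimal sketch of splitting the file contents into blocks at the empty lines. The helper name `split_blocks` and the tiny sample text are made up for illustration; the real file would be read with `open('equations.txt').read()`.

```python
import re

def split_blocks(text):
    """Split the file contents into blocks of equation lines.

    Blocks are assumed to be separated by one or more empty
    (or whitespace-only) lines, as described in the question.
    """
    blocks = re.split(r"\n\s*\n", text.strip())
    return [
        [line.strip() for line in block.splitlines() if line.strip()]
        for block in blocks
    ]

# Made-up sample with two blocks of different sizes
sample = (
    "1 y1 + 2 y2 = 3\n"
    "4 y1 + 5 y2 = 6\n"
    "\n"
    "7 c1 = 8\n"
)
blocks = split_blocks(sample)
print(len(blocks))   # number of blocks found
print(blocks[0])     # lines of the first block
```

Each block can then be turned into its own A and b, since the number of unknowns differs from block to block.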

My approach has been very "MATLABy", which is a problem because I want to be able to write a program that will solve for all of the variables.

I've tried reading a single equation as a text line, stripping the carriage return at the end, and splitting the line at the = sign. The second element of the split is the right-hand side of the equation, which goes into the vector b.

The first element of the split is the part from which you have to get the coefficients that go into the A matrix. If you split this at whitespace (' '), you get a list like

['-38', 'y1', '+', '35', 'y2', '+', '31', 'y3']

Note that you can now take every third element of this list to get the coefficients that go into the matrix A.
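The splitting described here could be sketched as follows. The function name `parse_equation` is hypothetical, and the sketch assumes every line has exactly the "coefficient variable + coefficient variable ... = number" layout shown in the question. It uses `split()` with no argument because the sample lines contain runs of several spaces, which `split(' ')` would turn into empty strings.

```python
def parse_equation(line):
    """Return (coefficients, rhs) for one equation line.

    Assumes the layout '<coef> <var> + <coef> <var> ... = <number>',
    so the coefficients sit at positions 0, 3, 6, ... of the
    whitespace-separated left-hand side.
    """
    lhs, rhs = line.strip().split("=")
    tokens = lhs.split()  # e.g. ['-38', 'y1', '+', '35', 'y2', '+', '31', 'y3']
    coefficients = [float(tok) for tok in tokens[::3]]
    return coefficients, float(rhs)

coeffs, rhs = parse_equation("-38 y1  +  35 y2  +  31 y3  = -3047")
print(coeffs, rhs)
```

The coefficient list becomes one row of A and the right-hand side becomes the matching entry of b.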
Partial answers would be:

y1 = 90; c2 = 28; x4 = 41; z100 = 59

I'm trying to manipulate them to give me the sum of the entries of the solutions y1, ..., y3 for the first block of equations, the sum of the entries of the solutions c1, ..., c6 for the second block of equations, the sum of the entries of the solutions x1, ..., x13 for the third block of equations, and the sum of the entries of the solutions z1, ..., z100 for the fourth block of equations.
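For one block, solving and summing might look like the sketch below. The matrix and vector are typed in by hand from the three sample equations above, and `numpy.linalg.solve` is used on the assumption that each block is square and non-singular.

```python
import numpy as np

# Coefficients and right-hand sides of the three sample equations
A = np.array([[-38.0,  35.0,  31.0],
              [ 11.0, -13.0, -34.0],
              [ 34.0, -21.0,  19.0]])
b = np.array([-3047.0, 784.0, 2949.0])

y = np.linalg.solve(A, b)  # solves A y = b
total = y.sum()            # sum of the entries y1 + y2 + y3
print(y, total)
```

For this block y1 comes out as 90 (up to floating-point rounding), which agrees with the partial answer quoted in the question.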

Like I said, I'm able to do this in MATLAB but not in Python, so I'm probably approaching this the wrong way, but this is what I have so far:

import pylab
f = open('equations.txt', 'r')

L=f.readlines()

list_final = []

for line in L:
    line_l = line.rstrip()
    list_l = line_l.split(";")
    list_l = filter(None, list_l)

    for expression in list_l:

and ending it with

f.close()

This was just my go at trying to format the equations to all look the same. I realise it's not a lot, but I was really hoping someone could get me started, because even though I know some Python I normally don't use it for math since I have MATLAB for that.

I think this could be useful for many of us who have prior MATLAB experience but not pylab. How would you go about this? Thank you!


Solution 2

An alternative approach that is possibly more robust to unstructured input is to use a combination of the Python symbolic math package (sympy), and a few parsing tricks. This scales to the variables in the equations being written in an arbitrary order.

Although sympy has some tools for parsing (your input is very close in appearance to Mathematica), it appears that the sympy.parsing.mathematica module can't deal with some of the input, particularly leading minus signs.

import numpy
import sympy
from sympy.parsing.sympy_parser import parse_expr
import re

def text_to_equations(text):
    lines = text.split('\n')
    lines = [line.split('=') for line in lines]
    eqns = []
    for lhs, rhs in lines:
        # clobber all the spaces
        lhs = lhs.replace(' ','')
        # *assume* that a number followed by a letter is an
        # implicit multiplication
        lhs = re.sub(r'(\d)([a-z])', r'\g<1>*\g<2>', lhs)
        eqns.append( (parse_expr(lhs), parse_expr(rhs)) )
    return eqns

def get_all_symbols(eqns):
    symbs = set()
    for lhs, rhs in eqns:
        for sym in lhs.atoms(sympy.Symbol):
            symbs.add(sym)
    return symbs

def text_to_eqn_matrix(text):
    eqns = text_to_equations(text)
    # sort by name so the column order is deterministic (a set has no order)
    symbs = sorted(get_all_symbols(eqns), key=lambda sym: sym.name)
    n = len(eqns)   # number of equations (rows of A)
    m = len(symbs)  # number of unknowns (columns of A)
    A = numpy.zeros((n, m))
    b = numpy.zeros((n, 1))
    for i, (lhs, rhs) in enumerate(eqns):
        d = lhs.as_coefficients_dict()
        b[i] = int(rhs)
        for j, s in enumerate(symbs):
            A[i, j] = d[s]
    x = sympy.Matrix([list(symbs)]).T
    return sympy.Matrix(A), x, sympy.Matrix(b)

s = '''-38 y1  +  35 y2  +  31 y3  = -3047
11 y1  + -13 y2  + -34 y3  = 784
34 y1  + -21 y2  +  19 y3  = 2949'''
A, x, b = text_to_eqn_matrix(s)
print(A)
print(x)
print(b)
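The code above stops at building A, x and b. One possible way to go on and actually solve the system symbolically is sketched below. It is a variation rather than part of the original answer: it assumes a SymPy version that provides `sympy.linear_eq_to_matrix`, uses that in place of the `as_coefficients_dict` loop, and repeats the same "digit followed by a letter means multiplication" assumption.

```python
import re
import sympy
from sympy.parsing.sympy_parser import parse_expr

text = '''-38 y1  +  35 y2  +  31 y3  = -3047
11 y1  + -13 y2  + -34 y3  = 784
34 y1  + -21 y2  +  19 y3  = 2949'''

equations = []
for line in text.splitlines():
    lhs, rhs = line.split('=')
    # remove spaces, then make the implicit multiplication explicit
    lhs = re.sub(r'(\d)([a-z])', r'\g<1>*\g<2>', lhs.replace(' ', ''))
    equations.append(sympy.Eq(parse_expr(lhs), parse_expr(rhs.strip())))

# collect the unknowns and fix their order by name
symbols = sorted(set().union(*(eq.free_symbols for eq in equations)),
                 key=lambda sym: sym.name)

A, b = sympy.linear_eq_to_matrix(equations, *symbols)
solution = A.LUsolve(b)   # exact (rational) solution of A x = b
print(solution.T)
print(sum(solution))      # sum of the entries of the solution
```

Because SymPy works with exact integers and rationals here, the result has no floating-point rounding, at the cost of being slower than NumPy for large blocks such as the 100-variable one.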

OTHER TIPS

For your example format, it's very easy to process it by numpy.loadtxt():

import numpy as np
data = np.loadtxt("equations.txt", dtype=str)[:, ::3].astype(float)
a = data[:, :-1]
b = data[:, -1]
x = np.linalg.solve(a, b)

The steps are:
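Reading the code above, the steps appear to be: load every whitespace-separated token as a string, keep every third column (the numbers), convert to float, split off the last column as b, and solve. The sketch below walks through those steps on the three sample equations. It builds the string table with `str.split` on an in-memory sample instead of calling `np.loadtxt` on the file, which is a substitution made here so the example is self-contained.

```python
import numpy as np

sample = """-38 y1  +  35 y2  +  31 y3  = -3047
11 y1  + -13 y2  + -34 y3  = 784
34 y1  + -21 y2  +  19 y3  = 2949"""

# Step 1: a 2-D array of string tokens, one row per equation
tokens = np.array([line.split() for line in sample.splitlines()])

# Step 2: columns 0, 3, 6, 9 hold the numbers; the rest are names, '+' and '='
numbers = tokens[:, ::3].astype(float)

# Step 3: the last column is the right-hand side, the others form the matrix
a = numbers[:, :-1]
b = numbers[:, -1]

# Step 4: solve a x = b
x = np.linalg.solve(a, b)
print(x)
```

Note that `np.loadtxt` expects every row to have the same number of columns, so for a file holding several blocks with different numbers of variables each block would have to be loaded separately.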

Licensed under: CC-BY-SA with attribution