Is this “Valid mathematical expression” problem P, or NP?

https://stackoverflow.com/questions/975626

13-09-2019
|

Question

This question is purely out of curiosity. I am off school for the summer, and was going to implement an algorithm to solve this just for fun. That led to the above question, how hard is this problem?

The problem: you are given a list of positive integers, a set of mathematical operators and the equal sign(=). can you create a valid mathematical expression using the integers (in the same order) and the operators (any number of times)?

An example will should clarify any questions:

given: {2, 3, 5, 25} , {+, -, *, /} , {=}
output: YES

the expression (only one i think) is (2 + 3) * 5 = 25. you only need to output YES/NO.

I believe the problem is in NP. I say this because it is a decision problem (YES/NO answer) and I can find a non-deterministic poly time algorithm that decides it.

a. non-deterministically select a sequence of operators to place between the integers.
b. verify you answer is a valid mathematical expression (this can be done in constant time).

In this case, the big question is this: Is the problem in P? (i.e. Is there a deterministic poly time algorithm that decides it?) OR Is the problem NP complete? (i.e. Can a known NP Complete problem be reduced to this? or equivalently Is every NP language poly time reducable to this problem?) OR neither? (i.e. problem in NP but not NP Complete)

Note: This problem statement assumes P not equal to NP. Also, although I am new to Stack Overflow, I am familiar with the homework tag. This is indeed just curiosity, not homework :)

Solution

An straightforward reduction from the Partition problem (which is NP-Complete) - given a set of N integers S, the input to the "Valid Math" problem would be - the elements of S, N-2 '+' operators and an '=' sign.

OTHER TIPS

There seems to be some sort of confusion about how to check for NP-completeness. An NP-complete problem is at least as hard, in a particular sense, as any other problem in NP. Suppose we were comparing to 3SAT, as some posters are trying to do.

Now, reducing the given problem to 3SAT proves nothing. It is then true that, if 3SAT can be solve efficiently (meaning P=NP), the given problem can be solved efficiently. However, if the given problem can be solved efficiently, then perhaps it corresponds only to easy special cases of 3SAT.

We would have to reduce 3SAT to the given problem. This means that we would have to make up a rule to transform arbitrary 3SAT problems to examples of the given problem, such that the solution of the given problem would tell us how to solve the 3SAT problem. This means that 3SAT couldn't be harder than the given problem. Since 3SAT is the hardest possible, then the given problem must also be the hardest possible.

The reduction from the Partition problem works. That problem works like this: given a multiset S of integers, can we divide this into two disjoint subsets that between them include each member of S, such that the sums of the disjoint subsets are equal?

To do this, we construct a sequence beginning with 0, containing each element of S, and then 0. We use {+, -} as the operation set. This means that each element of S will be either added or subtracted to total to 0, meaning that the sum of the added elements is the same as the sum of the subtracted elements.

Therefore, this problem is at least as hard as the Partition problem, since we can solve a example Partition program if we can solve the given one, and is therefore NP-complete.

OK, first, you specify "set" of integers but a set is by definition unordered, so you mean a "list" of integers.

Also, I am going to make an assumption here which may be wrong, which is that the = sign always appears exactly once, between the second to last and the last integer on your list. If you allow the equals sign in the middle, it becomes more complicated.

Here is an actual proof that "Valid Mathematical Expression" (VME) is NP complete. We can do a reduction from Subset sum. NOTE that Wikipedia's definition of subset sum requires that the subset is non-empty. In fact, it is true that the more general problem of subset sum allowing empty subsets is NP complete, if the desired sum is also part of the input. I won't give that proof unless requested. Given the instance of subset sum {i_1, i_2, ..., i_n} along with desired sum s, create the following instance of VME:

{0, i_1, 0, i_2, 0, ..., i_n, s}, {+, *}, {=}

IF the instance of subset sum is solvable, then there is some subset of the integers that adds to 0. If the integer i1 is part of the sum, add it with its corresponding zero (immediately to the left) and if i1 is not part of the sum, multiply it. Between each zero and the term to the right, insert an addition sign.

Taking the Wikipedia example

{−7, −3, −2, 5, 8}

where { −3, −2, 5} sums to 0, we would encode it as

{0, -7, 0, -3, 0, -2, 0, 5, 0, 8, 0}

and the resulting expression would be

{0*7 + 0 + -3 + 0 + -2 + 0 + 5 + 0*8 = 0}

Now we also need to show that any solution to this instance of VME results in a solution to the instance of subset sum. This is easier than you think. When we look at a resulting expression, we can group the numbers into those which are multiplied with a 0 (including as part of a chain multiplication) and those that are not. Any number that is multiplied with a zero is not included in the final sum. Any number that is not multiplied with a zero must be added into the final sum.

So we have shown that this instance of VME is solvable IF and ONLY IF the corresponding instance of subset sum is solvable, so the reduction is complete.

EDIT: The Partition reduction (with the comment) works as well, and is better because it allows you to put the equals sign anywhere. Neat!

Don't have the time for the full answer right now, but you can describe a reduction from this problem to the Knapsack Problem.

Using dynamic programming you can achieve pseudo-polynomial time solution. Note that this does not conflict with the fact that the problem is indeed NP Complete.

There are two properties that need to be satisfied for it to be NP Complete.

A decision problem C is NP-complete if:

C is in NP, and
Every problem in NP is reducible to C in polynomial time.

We have established 1. 2 results from the fact that every problem in NP is reducible to 3SAT and 3SAT is reducible to the current problem.

Therefore it is NP-complete.

(edit) Answer to the comment below:

I will prove that SAT is reducible to the current problem, and since 3SAT is reducible to SAT, the result follows.

Input formula is the conjunction of the following expressions:

(x₁ V x₂ V x₃ V ... x_n V y₁)

(x₁ V x₂ V x₃ V ... x_n V y₂)

(x₁ V x₂ V x₃ V ... x_n V y₃)

(x₁ V x₂ V x₃ V ... x_n V y₆₄)

where each y_i is a boolean based on what the order of the operators applied between all the x_i's is. i.e., y_i can take a total of 4x4x4x4x1 values (assuming that only +, -, x, / are the operators and = is always the last operator; this can be changed if the operator set is modified to include other operators)

If none of the expressions is true, then the complete expression will evaluate to FALSE, and there is no way to check unless we substitute all possible values, i.e, x₁ through x_n as the n numbers and y₁ through y₆₄ as the various ways in which the operators can be applied (This takes care of order)

This conversion is in POLY-time, and the given boolean formula is satisfiable iff the mathematical expression is valid, etc.

Anyone notice a flaw?

This isn't really an answer to your complexity question, but your problem sounds a bit like the Countdown problem. A quick search turned up this paper: http://www.cs.nott.ac.uk/~gmh/countdown.pdf

I don't have to time to work out a proof at the moment, but a hunch tells me that it may not be in P. you can define a grammar for arithmetic, and then this question amounts to finding if there's a valid parse tree that uses all these terminals. i believe that that problem is in NP but outside of P.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow