Question

I know some people that are currently working on a project for the US military (low security level, non-combat human resources type data).

An initial state of the project code was submitted to the military for review, and they ran the program through some sort of security analyzer tool. It returned a report of known security issues in the code and required changes that needed to be implemented before delivery of the final product.

One of the items that needed to be resolved was the removal of the part of the project that was written in Ruby, as it is a dynamic language.

What is the background/reason for not allowing a dynamic language to be used in a secure setting? Is this the government being slow to adopt new technologies? Or do dynamic languages pose an additional security risk compared to static languages (a la C++ or Java)?


Solution

There are a number of 'neat' things that can be done in dynamic languages that can be tucked away in parts of the code where the functionality of a given piece of code isn't immediately obvious to another programmer or auditor.

Consider this sequence in irb (interactive ruby shell):

irb(main):001:0> "bar".foo
NoMethodError: undefined method `foo' for "bar":String
        from (irb):1
        from /usr/bin/irb:12:in `<main>'
irb(main):002:0> class String
irb(main):003:1> def foo
irb(main):004:2> "foobar!"
irb(main):005:2> end
irb(main):006:1> end
=> nil
irb(main):007:0> "bar".foo
=> "foobar!"

What happened there is that I tried to call the method foo on a String literal. This failed. I then opened up the String class, defined the method foo to return "foobar!", and called it again. This worked.

This is known as an open class and gives me nightmares every time I think of writing code in Ruby that has any sort of security or integrity requirement. Sure, it lets you do some neat things quite fast... but I could make it so that every time someone stored a string, it was also written to a file or sent over the network. And this little bit of redefining String can be tucked away anywhere in the code.
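
For instance, a few lines tucked into any file that happens to get loaded could quietly capture data. A minimal sketch (the choice of method and the log path are hypothetical, purely for illustration):

class String
    alias_method :original_strip, :strip

    # Silently copy every string that passes through a common method.
    def strip
        File.open('/tmp/exfiltrated.log', 'a') { |f| f.puts(self) }
        original_strip
    end
end

" secret personnel record ".strip   # behaves exactly as before, but also logs the value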

Many other dynamic languages have similar things that can be done. Perl has Tie::Scalar, which can change how a given scalar works behind the scenes (this is a bit more obvious and requires a specific command that you can see, but a scalar that is passed in from somewhere else could be a problem). If you have access to the Perl Cookbook, look up Recipe 13.15, "Creating Magic Variables with tie".

Because of these things (and others that are often part of dynamic languages), many approaches to static security analysis of code don't work. Perl and Undecidability shows this to be the case, and points out that even such seemingly trivial problems as syntax highlighting are affected: whatever / 25 ; # / ; die "this dies!"; poses challenges because whatever can be defined at runtime to take arguments or not, completely defeating a syntax highlighter or static analyzer.
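
Ruby has the same kind of ambiguity. A minimal sketch (the method names are made up) showing how the identical characters parse either as a division followed by a comment, or as a regexp argument followed by live code, depending only on what the parser has already seen:

def version_a
    whatever = 1
    whatever /25 ; # / ; raise "this dies!"
end

def version_b
    def whatever(*args) ; 100 ; end
    whatever /25 ; # / ; raise "this dies!"
end

version_a        # whatever is a local variable: this is 1 / 25 and the rest of the line is a comment
begin
    version_b    # whatever is a method: /25 ; # / is a regexp argument, and the raise runs
rescue RuntimeError => e
    puts e.message   # => this dies!
end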


This can get even more interesting in Ruby with the ability to access the environment that a closure was defined in (see YouTube: Keeping Ruby Reasonable from RubyConf 2011 by Joshua Ballanco). I was made aware of this video from an Ars Technica comment by MouseTheLuckyDog.

Consider the following code:

def mal(&block)
    puts ">:)"
    block.call
    # Reach into the binding the block was created in and pick, at random,
    # one of the methods defined on that object (excluding those from Object)...
    t = block.binding.eval('(self.methods - Object.methods).sample')
    # ...then redefine it so that it raises the next time it is called.
    block.binding.eval <<-END
        def #{t}
          raise 'MWHWAHAW!'
        end
    END
end

class Foo
    def bar
        puts "bar"
    end

    def qux
        mal do
            puts "qux"
        end
    end
end

f = Foo.new
f.bar
f.qux

f.bar
f.qux

This code is fully visible, but the mal method could be somewhere else... and with open classes, of course, it could be redefined somewhere else.

Running this code:

~/$ ruby b.rb
bar
>:)
qux
bar
b.rb:20:in `qux': MWHWAHAW! (RuntimeError)
    from b.rb:30:in `<main>'
~/$ ruby b.rb
bar
>:)
qux
b.rb:20:in `bar': MWHWAHAW! (RuntimeError)
    from b.rb:29:in `<main>'

In this code, the closure was able to access all of the methods and other bindings defined in the class at that scope. It picked a random method and redefined it to raise an exception. (see the Binding class in Ruby to get an idea of what this object has access to)

The variables, methods, value of self, and possibly an iterator block that can be accessed in this context are all retained.

A shorter version that shows the redefinition of a variable:

def mal(&block)
    block.call
    # Rebind a local variable in the scope where the block was defined.
    block.binding.eval('a = 43')
end

a = 42
puts a
mal do 
  puts 1
end
puts a

Which, when run, produces:

42
1
43

This goes beyond the open classes I mentioned above, which already make static analysis impossible. What is demonstrated here is that a closure that is passed somewhere else carries with it the full environment it was defined in. This is known as a first-class environment (just as functions that can be passed around are first-class functions, here it is the environment, and all of the bindings available at that point, that gets passed around). One could redefine any variable that was defined in the scope of the closure.

Good or bad, and whether one wants to complain about Ruby or not (there are cases where one would want to be able to get at the environment of a method; see Safe in Perl), the question of "why would Ruby be restricted for a government project" really is answered in that video linked above.

Given that:

  1. Ruby allows one to extract the environment from any closure
  2. Ruby captures all bindings in the scope of the closure
  3. Ruby maintains all bindings as live and mutable
  4. Ruby has new bindings shadow old bindings (rather than cloning the environment or prohibiting rebinding)

The implication of these four design choices is that it is impossible to know, from reading it in isolation, what any given bit of code does.

More about this can be read on the Abstract Heresies blog. The particular post is about Scheme, where such a debate was had (related on Stack Overflow: Why doesn't Scheme support first class environments?). The author writes:

    Over time, however, I came to realise that there was more difficulty and less power with first-class environments than I had originally thought. At this point I believe that first-class environments are useless at best, and dangerous at worst.

I hope this answer shows the danger of first-class environments and why one would be asked to remove Ruby from the provided solution. It's not just that Ruby is a dynamic language (as mentioned in other answers, other dynamic languages have been allowed on other projects), but that there are specific features that make some dynamic languages even more difficult to reason about.

Other tips

Assuming the evaluation was security-only, and not just an acceptance scan (that is, they do not accept Ruby because they don't want to support Ruby) then:

Security analysis tools typically have a bad time with dynamic behaviors.
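
For instance, a static analyzer generally cannot tell what the following Ruby call will do, because the target method only exists as data at runtime (the class and environment variable here are hypothetical, purely for illustration):

class ReportService
    def export_csv ; puts "exporting csv" ; end
    def delete_all ; puts "deleting everything" ; end
end

# The method name arrives as data (a config file, a request parameter, ...).
action = ENV.fetch('REPORT_ACTION', 'export_csv')

# No tool scanning this line can know which method runs, or whether it even exists.
ReportService.new.send(action)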

For example:

Run any .NET project written with modern features like ASP.NET MVC and Entity Framework through something like Veracode and see what kind of laundry list of false positives you receive in your report.

Veracode even lists many basic techniques within the .NET 4 core libraries as unsupported or beta-only frameworks, even though most of them are several years old at this point.

If you are dealing with an entity that relies heavily on such a tool, and that entity does not have the technical expertise and the resources to manually evaluate a project and see whether it is properly written and secure, it is almost forced to treat those flagged items as insecure.

In civilian operations, where computer systems generally don't control anything dangerous or terribly expensive, the mitigation is that you discuss the false positives and they are generally accepted as such in bulk.

In banking operations you still have a chance of a false positive mitigation, but you are going to spend a lot more time discussing the minutiae of each item. This rapidly becomes cost prohibitive and you start using more traditional methods.

In the military, aviation, heavy industry and the like, systems can control things with terrible failure modes, so those systems may have very strict rules about languages, compilers, etc.

Organizations also generally write their security policy for the worst case they know of, so even if you are writing something trivial, if you are writing it for an organization that has non-trivial systems, the default is generally going to be to hold it to a higher standard unless someone requests a specific exception.

Dynamic languages can be used in defense and military applications. I've personally used and delivered Perl and Python in DoD applications. I've also seen PHP and JavaScript used and deployed. In my experiences, most of the non-compiled code that I've seen has been shell scripts and Perl because the environments required are approved and installed on a variety of possible target systems.

The fact that these languages are dynamic most likely isn't the problem. The interpreters for these languages must be approved for use on the target systems. If the interpreter is not approved for use (or, perhaps, it is, but it is not deployed on the target systems), then the language can't be used. Using a given interpreter (or any application) on a secure system means clearing any number of security hurdles: analysis of the source, the ability to compile from source for the target environments, additional analysis of the binaries, ensuring no conflicts with existing infrastructure, etc.

I spent some time interviewing with the DoD (Department of Defense) for a position writing code for the F-16's MMU. Without violating any non-disclosures: the MMU is the computer unit which controls nearly all the F-16's functions. It is (obviously) critical that no errors, such as run-time bugs, occur during flight. It's equally critical that the system perform real-time computing operations.

For this and other historical reasons, all code for this system is written in or compiled to Ada, a statically typed, object-oriented programming language.

Because of Ada's safety-critical support features, it is now used not only for military applications, but also in commercial projects where a software bug can have severe consequences, e.g. avionics and air traffic control, commercial rockets (e.g. Ariane 4 and 5), satellites and other space systems, railway transport and banking. For example, the fly-by-wire system software in the Boeing 777 was written in Ada.

I hate to quote too much, but this explains really well exactly why statically typed languages (like Ada) are used for projects like this:

A large number of compile-time checks are supported to help avoid bugs that would not be detectable until run-time in some other languages or would require explicit checks to be added to the source code. For example, the syntax requires explicitly named closing of blocks to prevent errors due to mismatched end tokens. The adherence to strong typing allows detection of many common software errors (wrong parameters, range violations, invalid references, mismatched types, etc.) either during compile-time, or otherwise during run-time. As concurrency is part of the language specification, the compiler can in some cases detect potential deadlocks. Compilers also commonly check for misspelled identifiers, visibility of packages, redundant declarations, etc. and can provide warnings and useful suggestions on how to fix the error.

Ada also supports run-time checks to protect against access to unallocated memory, buffer overflow errors, range violations, off-by-one errors, array access errors, and other detectable bugs. These checks can be disabled in the interest of runtime efficiency, but can often be compiled efficiently. It also includes facilities to help program verification. For these reasons, Ada is widely used in critical systems, where any anomaly might lead to very serious consequences, e.g., accidental death, injury or severe financial loss. Examples of systems where Ada is used include avionics, railways, banking, military and space technology.

Ada's dynamic memory management is high-level and type-safe. Ada does not have generic (and vague) "pointers"; nor does it implicitly declare any pointer type. Instead, all dynamic memory allocation and deallocation must take place through explicitly declared access types. Each access type has an associated storage pool that handles the low-level details of memory management; the programmer can either use the default storage pool or define new ones (this is particularly relevant for Non-Uniform Memory Access). It is even possible to declare several different access types that all designate the same type but use different storage pools. Also, the language provides for accessibility checks, both at compile time and at run time, that ensures that an access value cannot outlive the type of the object it points to.

Both the DoD and NASA have a long history of programming failures that cost them billions of dollars. Both institutions have adopted processes that should protect them from repeating the same mistakes.

Is this the government being slow to adopt new technologies?

This is a misconception - dynamic languages are not a new technology; they are quite old. The problem is that if you ever had a problem caused by a dynamic language (e.g. by weak/dynamic typing) and that problem cost you a lot of money, you might adopt a policy to prevent you from making the same mistake again - e.g. banning the use of dynamic languages in sensitive systems.

Dynamic languages will often "swallow" bugs and end up with some unexpected behavior. This is very dangerous in sensitive systems. If something is going wrong, you want to know about it as soon as possible.
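
As a minimal sketch (the record and attribute names are made up), here is the kind of mistake a compiler would reject outright but that Ruby quietly swallows:

require 'ostruct'

record = OpenStruct.new(pay_grade: 7)

# Typo: pay_garde does not exist, but the lookup silently returns nil
# instead of failing when the program is built.
grade = record.pay_garde

# nil.to_i is 0, so the bug is swallowed and the program keeps running
# with silently wrong data, far from the point of the mistake.
adjusted = grade.to_i + 1
puts adjusted   # => 1, not 8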

Where security is concerned, one would need to look at the actual use case. For example, I don't think that a Ruby on Rails web application would automatically be less secure than a Java web application.

I'd like to add to the existing answers by describing Drupal's SA-CORE-2014-005, a highly critical vulnerability that enables SQL injection and ultimately arbitrary code execution. It is caused by PHP's dynamic typing and lax runtime typing rules.

The entirety of the patch for this issue is:

-      foreach ($data as $i => $value) {
+      foreach (array_values($data) as $i => $value) {

This code is part of an SQL abstraction layer designed to prevent SQL injection. It takes an SQL query with named parameters, and an associative array which provides a value for each named parameter. The value is allowed to be an array, for cases like WHERE x IN (val1, val2, val3), where all three values can be passed in as a single array value for a single named parameter.

The vulnerability occurs because the code assumes that $i in $i => $value must be an integer index of the value. It goes on to concatenate this "index" directly into the SQL query as part of a parameter name, because integers don't need any escaping, right?

Unfortunately for Drupal, PHP does not provide such a guarantee. It's possible to pass in another associative array, whose keys are strings, and this loop will happily concatenate the string key into the query, as-is (remember the code thinks it can only ever be an integer).
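
The same class of mistake is easy to reproduce in any dynamically typed language. A hedged Ruby sketch of the same assumption (the helper and placeholder scheme are made up, not Drupal's actual code):

# Expand a named placeholder into one placeholder per value, assuming the
# collection's keys are integer indices.
def expand_placeholders(name, values)
    values.map { |i, _v| "#{name}_#{i}" }.join(', ')
end

# Intended use: keys really are the integers 0, 1, ...
puts expand_placeholders(':roles', { 0 => 'admin', 1 => 'editor' })
# => :roles_0, :roles_1

# Attacker-controlled input: the "index" is a string and lands in the SQL as-is.
puts expand_placeholders(':roles', { '0; DROP TABLE users; --' => 'admin' })
# => :roles_0; DROP TABLE users; --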

While there are ways to have a similar error in a statically typed language, they are unlikely. A good developer would consider what possible things $i could be before concatenating it into the query. With a statically typed language, it's very easy to enforce that $i must be an integer, and in security sensitive code like this, that would most certainly be done.

Furthermore, the code actually checks whether the value is an array before iterating over the items. And herein lies the second part of the failure that enables this vulnerability: both an associative array and a "normal" array return true for is_array. While it's also true that in C#, both dictionaries and arrays are IEnumerable, it is difficult to construct code that would conflate dictionary keys with array indices like this even intentionally, let alone accidentally.

As far as I can tell, official Department of Defense policy does not generally forbid dynamic languages.

Standards for software developed or procured by the DoD are promulgated by the Defense Information Systems Agency (DISA). Their Application Security and Development Security Technical Implementation Guide (STIG) does not prohibit any particular language. It doesn't mention Ruby, but it does mention Perl and Python, which are similarly dynamic. It mentions them in the context of various topics (following established Coding Standards, avoiding Command Injection vulnerabilities, etc.).

Probably what you are seeing is an overly-strict scanning tool (there are several different ones mentioned in the STIG, each may have its own interpretation of the rules) and/or overly-strict interpretation of its output.

Whether a codebase is secure or not depends on how you write your code, how you test it, and how you validate and monitor your development and deployment process. Languages are neither secure nor insecure; it's how you code that matters.

The majority of security incidents are due to malicious input (SQL injection, buffer overflows), viruses, rootkits and trojans. No language can protect you from that.
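
To illustrate that it is how you code rather than which language you use, a minimal sketch (assuming the sqlite3 gem; the table and input are made up):

require 'sqlite3'

db = SQLite3::Database.new(':memory:')
db.execute("CREATE TABLE users (name TEXT)")
db.execute("INSERT INTO users (name) VALUES ('alice')")

user_input = "alice' OR '1'='1"

# Vulnerable in any language: attacker-controlled input concatenated into SQL.
unsafe = db.execute("SELECT * FROM users WHERE name = '#{user_input}'")
puts unsafe.length   # => 1, the OR '1'='1' clause matched every row

# Safe in any language: the input is passed as a bind parameter, never parsed as SQL.
safe = db.execute("SELECT * FROM users WHERE name = ?", [user_input])
puts safe.length     # => 0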

So banning whole classes of languages for being "insecure" is not a valid approach.

I suspect that someone, for whatever reason - informed or not - decided to ban these languages. After a while it became an organizational truth. It may have been true at that point in time for some projects, but control cultures are not keen on changing decisions (admitting they were wrong) and prefer simple rules instead. They thrive on rules and regulations, and it does not matter whether they make sense or not; it is the perceived safety that counts.

This happens all the time in control cultures. I see it more or less daily. It makes no sense, but that's how it goes. If you want to read more about this highly relevant topic, I recommend Schneider's book "The Reengineering Alternative". There is also a culture diagram by Michael Sahoto/Agilitrix, based on Schneider's book, that illustrates this.

Licensed under: CC-BY-SA with attribution