How does Python's lack of static typing affect maintainability and extensibility in larger projects? [closed]

https://stackoverflow.com/questions/9050068

20-04-2021
|

Question

After reading this very informative (albeit somewhat argumentative) question I would like to know your experience with programming large projects with Python. Do things become un manageable as the project becomes larger? This concern is one thing that keeps me attached to Java. I would therefore be particularly interested in informed comparisons of maintainability and extensibility of Java and Python for large projects.

Solution

I work on a large scale commercial product done in Python. I give a very rough estimate of 5000 files x 500 lines each. That's about 2.5 millions lines of Python. Mind you the complexity of this project is probably equivalent to 10 mil+ lines of code in other languages. I've not heard from a single engineer/architecture/manager who complain about Python code being unmaintainable. From what I've seen from our bug tracker, I do not see any systemic problem that could be avoided by static type checking. In fact there is very few bugs spawn from incorrect use of object type at all.

I think this is a very good academic subject to empirically study why static class based language does not seems to be as critical as one might think.

And about extensibility. We just added a database 2 on top of the database 1 in our product, both of them non-SQL. There is no issue related to type checking. First of all we have designed an API flexible enough to anticipate different underlying implementation. I think dynamic language is a helps rather than hindrance in this regard. When we went on to testing and bug fixing phrase, we were working on the kind of bugs people working on any language would have to face. For example, memory usage issues, consistence and referential integrity issues, error handling issues. I don't see static type checking have much help on any of these challenges. On the other hand we have benefited greatly from dynamic language by being able to inject code mid-flight or after simple patching. And we are able to test our hypothesis and demonstrate our fixes quickly.

It is safe to say most of our 100+ engineers are happy and productive using Python. It is probably unthinkable for us to build the same product using a static typed language in the same amount of time with the same quality.

OTHER TIPS

From my experience statically typed languages can be difficult to maintain. For instance lets say you have a utility function which accepts a custom class as a parameter. If down the road you adopt a new naming convention than this class's name will have to change, and then then all of your utility functions will have to change as well. In a language like python it doesn't matter as long at the class implements the same methods.

Personally I despise a language that gets in my way. Speed of expressing your ideas is value, and this is the advantage Python has over Java.

A large code base in python without good test coverage might be an issue. But thats just one part of the image. It's all about people and suitable approaches to do the job.

Without

Source Control
Bug Tracking
Unit Tests
Committed Team

you might fail with any kind of language.

Try tracing back the source of an apparently malformed object in a large, dynamically-typed framework with lots of IoC or other design patterns where the object cannot be traced directly up the stack.

Now try doing this in a statically-typed language.

Unless the type of the object is documented close to the use-site (e.g. via type annotations, a-la Python's typesafe library) or somewhere on the stack, deducing where it came from can be virtually impossible. I speak from experience, having tried to debug parts of the BuildBot framework. It involved an immense amount of raw text searching through the framework, even using fancy IDEs such as PyDev, Komodo and Wingware.

I do not doubt that it is possible to impose some type constraints on dynamic languages, but the lack of any standardisation on this seems to be an impediment to anyone trying to debug part of a large, existent framework.

EDIT: since 2014, Guido added PEP484, MyPy and the typing module. This has made my experience much, much better in terms of maintaining large projects.

I remember the days before and after the innovation of IntelliJ IDEA. There are huge differences. Before, static typing was only for compilation, development basically treats source code as text files. After, source code is structured information, many development tasks are must easier, thanks to static typing.

However, it's not like the old days were living hell. We took it as is, do whatever necessary, use the tools available to date, get the system built, satisfaction. There weren't too many unhappy memories. That's probably what dynamic typing programmers feel now. It's not that bad.

Of course, I'll never go back to the old days. If I'm forbidden to use such an IDE, I guess I'll give us programming all together.

In my experience, maintainability depends on low coupling, good documentation, good development process, and excellent testing. Static typing has very little to do with any of this.

The errors that Java will catch at compile time are only a small subset of the errors that can occur. They're also almost always the most trivial to detect by testing; there's no way you can miss calling a method on an object of the wrong class if you're testing that your code produces the right answer! In that respect you could argue that Python actually is better for ensuring quality; by forcing you to test at least a bit to ensure your code is free of simple typos, it ensures that you actually do test at least a bit.

In fact Java is not even a very good example of a language with strong static checks for catching lots of bugs. Try programming in Haskell or Mercury to see what I mean, or better yet try programming in Scala and interfacing with Java libraries; the difference in how much "correctness" the compiler is able to guarantee for you is striking when you compare the normal idiomatic Scala code using Scala libraries to the code that has to deal with Java libraries (I have actually done this, since I program a bit in Scala on Android).

Your ability to write good maintainable code in large code-bases worked on by many developers over long periods of time, despite the shortcomings of Java's static error detection compared to languages like Scala, depends on exactly the same techniques Python programmers use to do the same thing in their large code-bases, despite the shortcomings of Python's static error detection compared to Java.

I've used Python for many projects, from a few hundred lines to several thousand lines. Dynamic typing is a great time saver and it makes OO concepts like polymorphism way easier to use. The type system does not make projects unmaintainable. If you have trouble imagining that, try writing a few things in Python and see how they go.

I am working in a big data start-up company which use python as main language. My projects is about 30k lines of python. from my experience, if your team adopts a good programming practice, fox example, adding type hints and extensively unit testing, it maybe not so affect maintainability. since Pycharm can automatically detect type some type errors if there are type hints.

the real issues are: 1. performance, this may not related to maintainability, but it is an issue. 2. not every python code base you handled are well written. since python is easy to learn and code. some people who do not have some basic CS training would develop a python project which is impossible to maintain. i worked a python project which has a lot files that has several thousand lines each without type hints. and that guy do not know about OOP. he basically write python in a way like writing C. he just utilize some python language features but completely imperative programming. Good written python projects really rely on well trained engineers. if you can not hire good-enough engineer, it would better to rely on tools itself. 3. in a python big data company, product managers and some non-technical people do not care data type in consistence. those people would design a data product which are not type-safe. for example, in a Json if a field which is usually a str, but when it is empty, some people would make it null. this would fail at Runtime.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow