This was written as a piece of coursework for my Principles of Programming Languages module, where we had free choice to write an essay about programming languages. I opted for a critique of some predictions made by John Ousterhout. The essay suffers from a neutral/skeptical academic tone. Despite this, I think the points are still valid, so I might restructure some of it into a more passionate article later.
Introduction
Programming language paradigms provide a way to classify programming languages. Common paradigms include imperative, functional, logical, and object-oriented. In the late 20th century, a new paradigm was being discussed: scripting languages. Writing in Computer, the flagship publication of the IEEE Computer Society, John K. Ousterhout made bold predictions about this paradigm [1].
He identified this paradigm as languages which enable developers to link together separate components with minimal obstructions. In this report I will compare Ousterhout’s predictions against two modern scripting languages: Python and JavaScript. By evaluating where his predictions failed and where they succeeded, we can improve our own predictions about current programming languages and use them to inform language design decisions.
Scripting Languages
In his 1998 essay Ousterhout observed that the way people wrote code was changing. He noticed that more people were using programming languages to join pre-existing components together and were less focused on the specific implementations of these components.
He defined scripting languages as generally typeless languages designed to glue together existing components, where the relaxed type system simplifies the links between components. He observes that these components will be written in lower-level languages such as C, C++, and Java. He also says that a defining feature of a scripting language is that its primitive operations do much more work than those of lower-level languages.
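To make the point about primitive operations concrete, here is a minimal sketch in Python. It is my own illustration rather than an example from Ousterhout’s essay; the function name and sample text are hypothetical.

```python
from collections import Counter


def most_common_words(text: str, n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent words in text, ignoring case."""
    # lower(), split(), and Counter are each a single high-level operation.
    # A C version would need explicit loops and a hand-built hash table.
    return Counter(text.lower().split()).most_common(n)


sample = "the cat and the dog and the bird"
print(most_common_words(sample, 2))
```

The whole task is one expression, and the heavy lifting happens inside components the programmer never has to read.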
To facilitate a comparison with modern programming languages, we can slightly modify Ousterhout’s type system requirement for a scripting language. In his initial definition he states that “scripting languages are generally typeless”, but later in his essay he states that “Visual Basic [classic], … JavaScript and Perl are used for scripting”. The two languages chosen for comparison in this essay, Python and JavaScript, have different type systems. JavaScript is a dynamic, weakly typed language [2] whilst Python is a dynamic, strongly typed language [3]. Although neither language strictly meets Ousterhout’s original definition, he names both of them as scripting languages, so their inclusion in this essay is valid. Further research could explore the impact of strong and weak typing on a language’s ability to be used for scripting. This essay’s consideration of type systems will focus on the impact of dynamic and static types on a language’s ability to be used for scripting.
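The difference between the dynamic and strong axes can be sketched in a few lines of Python. This is an illustrative example of my own; the helper name try_add is hypothetical.

```python
# Dynamic typing: a name has no declared type and can be rebound freely.
value = 42
value = "forty-two"  # allowed; the type belongs to the object, not the name


def try_add(left, right):
    """Attempt left + right, reporting a type error instead of raising it."""
    try:
        return left + right
    except TypeError as error:
        return f"TypeError: {error}"


# Strong typing: Python will not silently coerce a number into a string.
# JavaScript, being weakly typed, evaluates 1 + "1" to the string "11".
print(try_add(1, "1"))
print(try_add("1", "1"))
```

Both languages defer type checks to runtime; they differ in how willing they are to convert between types once they get there.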
In modern terms we can describe a scripting language as a language with dynamic typing, a rich ecosystem, and interoperability with a higher-performance, lower-level programming language.
In addition to his bolder claims, Ousterhout also claims that scripting languages are used by casual programmers: people who are not employed as software engineers but use their programming skills to do their work more efficiently. He also states that future scripting languages will allow for a large increase in this group of programmers.
Types
Ousterhout claims that type systems slow down the programmer and that a weak type system is a prerequisite for fast development in a scripting language. His claims are reflected in the language design of that era. The late 1990s and early 2000s were marked by the rise of dynamically typed languages, where type checks are performed at runtime. Examples include JavaScript, Python, Lua, and Perl. However, later developments have brought static type analysis to these languages.
JavaScript to TypeScript
In 2012, Microsoft released TypeScript, a language developed to combat the rising complexity of Microsoft’s JavaScript codebase. Among other features, TypeScript adds type annotations, interfaces, and compile-time type checking. On release, TypeScript was met with positive reviews. Community members [4] praised the easy transition from “vanilla” JavaScript to a typed version of JavaScript, the new ability to catch errors before deploying the code, and the intelligent type inference.
Empirical analysis of GitHub repositories contradicts Ousterhout’s claim that scripting languages empower developers. A study [5] compared code-quality metrics in JavaScript and TypeScript. I believe the following metrics are the most important results from the study: code quality, code understandability, and bug resolution time. Code quality can be evaluated by checking the code against anti-patterns in the language, for example, not declaring class methods as static when they do not require an instance of the class. Code understandability was examined through cognitive complexity, a value which expresses how convoluted the control flow of a program is. Finally, bug resolution time is the mean time between the submission of a bug report on GitHub and the point at which that issue was marked as resolved. Together, these three metrics can distinguish a pleasant codebase that is easy to extend from one where the code does not follow set patterns and issues consequently take a long time to resolve.
Their results showed that TypeScript applications have fewer anti-patterns than JavaScript applications and a lower cognitive complexity. They also showed that using stricter types, such as number[], rather than the top type any was negatively correlated with bug resolution times.
Python Type Annotations
A similar refutation of Ousterhout’s claims can be seen in the Python ecosystem. Much like JavaScript, Python was dynamically typed on release, with no compile-time type checking. However, a community project, mypy [6], adds static typing to Python. Similar to how TypeScript aggressively infers types, mypy is often able to infer types once annotations have been added to a function signature.
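As a rough illustration of inference from a signature, consider the sketch below. The function is hypothetical, and the comments describe the behaviour I would expect from a static checker such as mypy (installed separately and run as mypy file.py), not output I have reproduced here.

```python
def mean_length(words: list[str]) -> float:
    """Average length of the given words; 0.0 for an empty list."""
    if not words:
        return 0.0
    total = 0           # unannotated: a checker can infer int from the literal
    for word in words:  # word can be inferred as str from the parameter type
        total += len(word)
    return total / len(words)


print(mean_length(["glue", "script"]))

# Without a checker, a call such as mean_length([1, 2, 3]) only fails when
# len(1) is evaluated at runtime. With the signature annotated, a static
# checker would be expected to reject that call before the program runs.
```

Only the signature is annotated; the types of the local variables follow from it.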
Python’s optional type hinting is less widespread than TypeScript, which prevents meaningful empirical study of its benefits. However, one study [7] examined how often developers choose to use mypy and the errors they encounter. Of the 70,826 sampled public Python repositories on GitHub, only 2,678 used type information at all. This does not mean they had complete type information, just that they were using some form of annotation in the codebase. In fact, the most used type was Any, the top-level type in the annotation system.
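The cost of annotating with Any can be sketched as follows. This is my own hypothetical example rather than code from the study; it assumes only the standard typing module.

```python
from typing import Any


def shout_any(message: Any) -> Any:
    # Any is compatible with every type, so a static checker accepts any
    # argument and any attribute access inside this function.
    return message.upper()


def shout_str(message: str) -> str:
    # With a precise annotation, passing an int becomes a static type error.
    return message.upper()


try:
    shout_any(42)  # accepted by a checker, fails only when it runs
except AttributeError as error:
    print(f"runtime failure: {error}")

print(shout_str("typed"))
```

A codebase annotated mostly with Any therefore counts as using type information while gaining very little of the protection.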
However, mypy’s inferred types often contradict the types annotated by the developer. Only 318, or 15%, of the repositories with type annotations had no errors; all of the remaining repositories had type errors. The small sample size means it is impossible to conclude why so few codebases had correct type annotations, but it shows that correctly typed Python programs are very rare.
Whereas TypeScript presents a clear refutation of Ousterhout’s claim that types inhibit the developer, there is a lack of evidence that community efforts to add types to Python projects have a benefit. Further analysis would lead to a more conclusive result once mypy version 1 is released.
Ecosystems
As previously observed, Ousterhout believed that a scripting language requires a rich ecosystem of performant code to reduce the effort required of the developer: they can spend their time focussing on the connections between components and less time considering how to implement the functionality of their application. Both Python and JavaScript have mature ecosystems, with 3,979,533 releases on PyPI [8], the leading Python package repository, and 1.3 million packages on npm [9], JavaScript’s largest repository. These large repositories show a vibrant open-source community and a culture of sharing code. However, with so many packages available, programmers can grow too dependent on other people’s code, something Ousterhout failed to predict.
A Python ecosystem success story
One of Python’s most famous libraries is NumPy [10]. Created in 2005 by merging two existing Python mathematics packages, it now powers almost every scientific Python library. NumPy exposes an array class with very efficient operations, written in C. Despite being a community-driven project and not directly related to the development of the Python language, it benefits from features added to Python specifically to enable certain NumPy operations. This close collaboration provides a good platform for the NumPy contributors to build on, and I believe it contributes to the success of the package.
Following NumPy’s implementation of array-based programming, many other scientific packages have built on NumPy’s foundations. A prominent example is the eht-imaging library, developed for the Event Horizon Telescope. The scientists working on the black hole imaging did not need to consider performance when writing their code because they were using NumPy, allowing them to focus on manipulating their data and the scientific meaning of their results.
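A small sketch of what array-based programming looks like in practice, assuming NumPy is installed. The readings are made-up values and have nothing to do with the eht-imaging code.

```python
import numpy as np

# Hypothetical measurements; the values are illustrative only.
values = [1.0, 2.0, 3.0, 4.0]
readings = np.array(values)

# Array style: one expression centres every element, and the loop runs
# inside NumPy's compiled code rather than in the Python interpreter.
centred = readings - readings.mean()

# The same calculation in plain Python needs an explicit loop per step.
average = sum(values) / len(values)
centred_loop = [value - average for value in values]

print(centred)
print(centred_loop)
```

The array version states what is being computed and leaves the iteration to the library, which is the division of labour Ousterhout described.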
NumPy follows Ousterhout’s vision of well-written packages enabling programmers to focus on domain-specific issues rather than implementation of performant data structures.
The wide world of JavaScript’s ecosystem
NumPy appears to demonstrate the power of Ousterhout’s claims. However, I believe that NumPy is an outlier, especially when compared to the JavaScript ecosystem.
One of the largest studies into the JavaScript ecosystem [11] explored the idea of a micropackage. This is what Ousterhout would describe as a component taken to the extreme: a small package with a minimal feature set designed to accomplish only one task. This means that the developer is able to write in a declarative style, stringing together functions without needing to understand the precise implementation. Micropackages are prevalent in the npm registry. The median number of functions per library is 2, and micropackages make up almost half of all libraries. One of the largest concerns when leaning on a language’s ecosystem is the dependency chain: including one library with many dependencies can cause an exponential increase in the total number of dependencies for a project. This often causes security concerns, which will be explored later in this section.
Analysis of the dependency chains of micropackages showed that they were not significantly longer than those of more wide-ranging libraries. The authors conjecture that, due to the limited functionality of a micropackage, there is no need for many dependencies, whereas a more feature-complete library would require more dependencies to deliver a wide range of features. However, the authors do suggest that having so many packages available hints at duplicated functionality across modules. To combat this, they suggest that certain features be merged into the standard library, much as the Joda-Time Java library was merged into java.time. This mirrors the earlier analysis of the Python ecosystem, where NumPy was given special privileges when requesting changes to the Python programming language.
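The growth of a dependency chain can be sketched with a small graph traversal. The package names and the graph below are entirely hypothetical and do not come from the npm registry or from the study.

```python
# Hypothetical dependency graph: package name -> direct dependencies.
DEPENDENCIES = {
    "app": ["web-framework", "left-pad"],
    "web-framework": ["router", "logger"],
    "router": ["path-utils"],
    "logger": ["left-pad"],
    "path-utils": [],
    "left-pad": [],
}


def transitive_dependencies(package: str, graph: dict[str, list[str]]) -> set[str]:
    """Return every package reachable through the dependencies of package."""
    seen: set[str] = set()
    stack = list(graph.get(package, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        stack.extend(graph.get(current, []))
    return seen


print(len(DEPENDENCIES["app"]), "direct dependencies")
print(len(transitive_dependencies("app", DEPENDENCIES)), "packages installed in total")
```

Here a project that declares two dependencies ends up installing five packages, and every one of them is code the developer is trusting without having read it.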
However, over-reliance on ecosystems can increase the fragility of programs, and npm has many such examples. One recent instance is a package which was required by Vue.js, a commonly used JavaScript library with 12 million downloads over the week of writing [12]. As a protest against the Russian invasion of Ukraine, the author pushed a change to the package which would delete files belonging to users in Russia and Belarus. This led to users of Vue.js in those regions being affected. The issue was quickly detected and fixed, but it highlighted the security concerns of a language’s ecosystem.
From these two ecosystems, there is strong evidence in favour of Ousterhout’s claims. The Python ecosystem shows the power of computer specialists writing performant libraries which domain specialists can use to accomplish tasks easily, whilst the JavaScript ecosystem highlights how abstracting functions behind libraries does not compromise performance or dependency management. However, due to the primitive ecosystems at the time, there is no evidence that Ousterhout considered the security implications of such a programming style when forecasting the rise of scripting languages.
Casual Programmers
Comparing the users of a particular programming language is a harder task than empirically surveying GitHub repositories or comparing syntax. However, to evaluate Ousterhout’s prediction that scripting languages would cause a rise in casual programmers, we can use Stack Overflow’s Trends. Stack Overflow is a forum where programmers ask technical questions and more knowledgeable users are able to help them. Stack Overflow publishes data about which languages are being interacted with the most, and this can act as a proxy for language popularity. Python and JavaScript routinely top the list of question views [13], whereas C and C++ have far fewer views. Users are also able to tag their questions with the appropriate language. This is another list dominated by scripting languages: Python and JavaScript have high growth, whereas the number of questions asked about Lua, Haskell, Visual Basic, C and C++ remains constant. The author of the analysis suggests that this could be because Python is much newer than the other languages. However, when compared to new languages, such as R, Scala, Go and Rust, Python dominates. Python accounted for 10% of all question views in 2017, whereas Scala, Go and Rust had less than 1% each. Ousterhout also predicted that languages contemporary to his essay, such as Perl and Lua, would increase in popularity; the data shows that this did not happen.
Ousterhout predicted that scripting languages would increase in popularity as more casual programmers entered the workplace. Although this is a hard claim to evaluate, Stack Overflow’s data suggests that the prediction was correct. His specific claims about language popularity were incorrect, but I do not believe this invalidates his wider prediction.
Conclusion
John Ousterhout made bold predictions about the future of programming languages. Looking at two modern scripting languages, I believe his claims were correct. His prediction on the rise of “glue” languages was correct; his prediction on high-performance components proved to be correct; and his prediction about an increasing number of casual programmers adopting scripting languages proved to be correct. Although he made these predictions over 20 years ago, I do not see them being invalidated in the near future. Python and JavaScript are mature languages, and attempts to disrupt them have not resulted in widespread adoption. Despite incorrect claims regarding specific languages, I believe his predictions show a deep understanding of programming languages.
[1] J. K. Ousterhout, “Scripting: higher level programming for the 21st Century,” in Computer, vol. 31, no. 3, pp. 23-30, March 1998, doi: 10.1109/2.660187.
[2] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures
[3] https://wiki.python.org/moin/Why%20is%20Python%20a%20dynamic%20language%20-and%20also%20a%20strongly%20typed%20language
[4] https://tirania.org/blog/archive/2012/Oct-01.html
[5] Bogner, J. and Merkel, M., 2022. To Type or Not to Type? A Systematic Comparison of the Software Quality of JavaScript and TypeScript Applications on GitHub.
[7] Rak-Amnouykit, Ingkarat, et al. “Python 3 types in the wild: a tale of two type systems.” Proceedings of the 16th ACM SIGPLAN International Symposium on Dynamic Languages. 2020.
[8] https://pypi.org
[9] https://blog.npmjs.org/post/615388323067854848/so-long-and-thanks-for-all-the-packages
[10] Harris, Charles R., et al. “Array programming with NumPy.” Nature 585.7825 (2020): 357-362.
[11] Gaikovina Kula, Raula, et al. “On the Impact of Micro-Packages: An Empirical Study of the npm JavaScript Ecosystem.” arXiv e-prints (2017): arXiv-1709.
[12] https://www.npmjs.com/package/vue
[13] https://stackoverflow.blog/2017/09/06/incredible-growth-python