Why Obix? Reliable code and software developer productivity

Christian Neumanns

March 2011

Abstract

This article explains why the Obix programming language was created.


Introduction

Do you prefer to write code or to debug code?

Suppose your application needs to replace text in a file. Which code would you rather write?

  1. code that uses open/close file commands; loops to read from and write to fixed-sized buffers in memory; and try-catch-finally blocks to handle exceptions

  2. a single instruction that calls a standard library function which takes care of everything and tells you if the operation has succeeded

Would you like greater simplicity in the process of software creation and maintenance?

If you answer these questions with 'write code', '2', 'yes', then you are not alone.

Practice shows that:

  • Most programs contain many bugs and require a lot of debugging.
  • Developing professional, production-ready software takes a lot of time.
  • Writing good code is not easy and requires years of experience.

These are three major problems that have persisted for decades in the software development industry, and they often lead to high development costs.

But these are also precisely the three problems that Obix tries to address - with the highest priority. Obix has been specifically designed to:

  • produce more reliable software (fewer bugs)
  • increase developer productivity
  • simplify the software development process

The following chapters explain how this is done.

1. More reliability

Two indisputable facts exist in the world of software development:

  1. It is very hard to create error-free software; in most cases this is almost impossible.

  2. The costs generated through errors in software can be huge, and the costs often increase dramatically when the errors are detected late in the process of developing-testing-using the software application.

Every programmer probably agrees that too many errors remain in the great majority of software delivered to customers, and that it takes a lot of time, discipline and experience to find and repair those errors. Still, most programmers are surprised when they hear the 'real numbers' reported in several prominent studies undertaken to prove and quantify the above facts. The highly praised book Code Complete, second edition, 2004 (ISBN 0-7356-1967-0), written by Steve McConnell, contains the following interesting conclusions, which are the results of studies done by organizations such as IBM and NASA:

  • Concerning the number of errors:

    • [Industry average experience is about 1 - 25 errors per 1000 lines of code for delivered software] (page 521)

      This means that a mid-size application consisting of 50000 lines of code contains between 50 and 1250 errors when the software is delivered to the customer(s)! And if it contains 'only' 50 errors, it is nevertheless considered to be of 'high reliability', because there is only 1 error per 1000 lines of code.

  • Concerning the costs generated by these errors:

    • [Researchers at HP, IBM, Hughes Aircraft, TRW, and other organizations have found that purging an error by the beginning of construction allows rework to be done 10 to 100 times less expensively than when it's done in the last part of the process, during system test or after release] (page 29)

    • [... software defect removal is actually the most expensive and time-consuming form of work for software] (study at IBM, page 474)

A famous example of how dramatic the consequences of a bug can be is the Ariane 5 launcher that crashed on June 4, 1996. The crash was due to an arithmetic overflow error at runtime, when a 64-bit floating-point number was converted to a 16-bit signed integer. The (uninsured!) cost was estimated to be USD 500,000,000!
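The kind of conversion error described above can be reproduced in a few lines. The following Java sketch is only an illustration and is not related to the actual flight software; the class and method names are hypothetical. It contrasts an unchecked narrowing conversion, which silently yields a wrong value, with a checked one that fails immediately.

```java
// Hypothetical illustration of a 64-bit float to 16-bit integer conversion.
final class NarrowingDemo {

    /** Unchecked narrowing: out-of-range values are silently turned into wrong values. */
    static short uncheckedToShort(double value) {
        return (short) value;
    }

    /** Checked narrowing: fails fast instead of returning a wrong value. */
    static short checkedToShort(double value) {
        if (Double.isNaN(value) || value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException("value does not fit into 16 bits: " + value);
        }
        return (short) value;
    }

    public static void main(String[] args) {
        // 40000 does not fit into a signed 16-bit integer (maximum 32767).
        System.out.println(uncheckedToShort(40000.0)); // prints -25536, no error reported
        try {
            checkedToShort(40000.0);
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The checked variant does not make the overflow impossible, but it turns a silent wrong result into an error that is reported at the point where it occurs.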

A more frequent example would be an error detected in an ERP application after the software has been delivered to hundreds or thousands of customers. Although it might only take a few minutes for the programmer to fix the bug in the source code, the total costs can easily be orders of magnitude higher. This is a consequence of the need for redeployment and retesting, the need for informing and updating all customers, perhaps the need for correcting wrong results stored in databases, and so on. To this we must add the customers' loss of time, frustration and decrease of trust in the software and the software provider.

The exponential increase in costs for program errors is shown schematically in the following figure:

Figure 1. Costs of program errors (bugs) in software developments


Let us do a simple calculation. According to the above facts, an error that costs $7 if it is detected at compile-time might cost $700 if it is detected at production-time (a factor of 100, the upper end of the range quoted above). If the delivered software contained 100 such errors (and most software contains more), the total cost to repair the defects would be $70,000, instead of only $700 if they were detected at compile-time. Obviously, such a big difference can determine the success or failure of the project.

Of course, this is just a theoretical and oversimplified example. Real costs depend on many factors and can vary widely. However, the cumulative costs of program errors can add up to huge amounts in any case. Detecting errors early can even save lives! Just think about the dramatic consequences a single undetected error could have in a medical application or a national defense application.

The lesson is clear and leads to the following very important Fail fast! rule:

To increase reliability and maintainability and reduce costs, software errors must be detected and repaired as early as possible. Errors should preferably be automatically detected at compile-time, or else as early as possible at run-time.

The Fail fast! rule in software development is very similar to a well known rule in medicine: Prevention is better than cure! It is always better and cheaper to prevent or cure a problem early than to try to cure it later, and sometimes a life can be saved.

Experience proves that the costs of software development can be reduced considerably if we apply the Fail fast! rule.

Experience also shows that the benefits of bug-reducing features grow exponentially with:

  • the size of the application
  • the number of programmers involved in the project, as well as the number of programmers who are added or replaced during the project
  • the number of changes and extensions during the application's lifecycle
  • the number of people using the software

Fail fast! concepts built into Obix

Although no programming language can prevent programmers from writing bad or erroneous programs, the goal pursued with Obix is to provide a language with facilities that reduce the number of program errors (bugs). As a consequence, Obix helps programmers to write more reliable and maintainable code in less time.

This goal is achieved through a unique combination of proven and innovative concepts which all support the Fail fast! rule. Moreover, error-prone programming techniques such as automatic type conversions are prevented whenever possible.

In addition to being a compiled, statically typed language, the most important error-preventing concepts built into Obix are:

  • Unit testing.
  • Contract programming (Design by Contract).
  • Feature redefinition in child types.
  • Generic types (without type erasure at runtime).
  • Objects are immutable by default.
  • Void (null) values are not allowed by default.

The full list of the 24 Fail fast! concepts integrated in Obix, with examples of why and how they reduce the number of bugs, can be found in “Obix's 24 Fail fast! concepts for more reliable software”.
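To see why defaults such as immutability and the rejection of void (null) values matter, consider what a language without them requires. The following Java sketch uses a hypothetical class name and is not taken from Obix. It shows the extra code a programmer has to remember to write by hand to get an immutable object that can never hold null.

```java
import java.util.Objects;

// Hypothetical example class; in Java both guarantees must be coded explicitly.
final class CustomerName {

    // 'final' prevents reassignment after construction (immutability).
    private final String value;

    CustomerName(String value) {
        // Fail fast: reject null when the object is created,
        // not later when the value is used somewhere else.
        this.value = Objects.requireNonNull(value, "value must not be null");
    }

    String value() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(new CustomerName("Smith").value());
        try {
            new CustomerName(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If the 'final' keyword or the null check is forgotten, the Java compiler does not complain. With strict defaults, the safe behavior is what the programmer gets without doing anything.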

Advantages of the Fail fast! rule applied in Obix

If the support for fewer bugs is not embedded in the language, then much more time, discipline and experience are required from the programmers. This distracts from the main task of solving a business problem.

From the outset Obix has been designed with these concepts in mind, in order to immediately integrate them in the language. This approach allows a better implementation than where these features are merely added later to a language, or offered as an optional, possibly third-party extension. First, the language does not suffer from restrictions or exceptional cases which are sometimes inevitable or necessary because of the need for backward compatibility. Second, the concepts seamlessly evolve with new versions of the language because they are part of the language, and there are no version conflicts. Third, they are easier to understand and use, and no special syntax constructs need to be invented. Finally, the application does not depend on optional extensions to the language. As a result, programmers are much more motivated (or compelled) to use these features, and this leads to better software that is easier to maintain.

Designed to work seamlessly together, the unique combination and interaction of these Fail fast! concepts leads to more robust code and lower development costs. In this context, it is again interesting to quote the book Code complete:

  • [... if project developers are striving for a higher defect-detection rate, they need to use a combination of techniques] (page 470)

Besides being available for writing new programs, all Fail fast! concepts are also systematically applied in Obix's standard libraries. Again, the goal is to help to detect bugs earlier. For example, contract programming is applied as follows in the standard libraries: If a script calls a string command that extracts a substring from position 'from' to position 'to', then neither input argument can be void (null), and both values must be less than or equal to the length of the string. Moreover, 'to' must be equal to or greater than 'from'. Furthermore, the result returned by the command cannot be void, and the length of the substring returned is guaranteed to be equal to 'to' - 'from' + 1. A runtime error occurs immediately if any condition is violated.
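The substring contract described above can be written out by hand in Java as a rough sketch. The class and method names are hypothetical, and two details are assumptions that the text does not state: positions are taken to be 1-based and inclusive, and 'from' is required to be at least 1. In Obix such conditions are part of the library; here they are ordinary checks.

```java
import java.util.Objects;

// Hypothetical sketch of a substring command with explicit contract checks.
final class ContractDemo {

    /** Returns the characters at positions from..to (assumed 1-based, inclusive). */
    static String substring(String s, Integer from, Integer to) {
        // Preconditions: no void (null) arguments.
        Objects.requireNonNull(s, "s must not be null");
        Objects.requireNonNull(from, "from must not be null");
        Objects.requireNonNull(to, "to must not be null");
        // Preconditions: positions must lie within the string, and to >= from.
        if (from < 1 || from > s.length()) {
            throw new IllegalArgumentException("from out of range: " + from);
        }
        if (to > s.length()) {
            throw new IllegalArgumentException("to out of range: " + to);
        }
        if (to < from) {
            throw new IllegalArgumentException("to must not be less than from");
        }

        String result = s.substring(from - 1, to);

        // Postcondition: the result has exactly to - from + 1 characters.
        if (result.length() != to - from + 1) {
            throw new IllegalStateException("postcondition violated");
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(substring("Hello", 2, 4)); // prints "ell"
    }
}
```

A call such as substring("Hello", 4, 2) fails immediately with a clear message, instead of returning an empty or meaningless result that causes trouble later.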

Furthermore, the software libraries are designed in a way that more errors can be detected at compile-time, or early at runtime. For example, instead of defining a single file type that is used for absolute and relative files, as well as for absolute and relative directories, Obix provides four distinct types with some common functionalities and with some type-specific functionalities. Therefore, writing code that tries to delete a relative file, or code that tries to store a string into a directory instead of a file, results in compile-time errors.
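The idea of distinct file types can be sketched in Java as follows. All type and method names are hypothetical and do not reproduce Obix's library. Only three of the four types are shown to keep the sketch short. The point is that operations like 'delete' and 'store' exist only on the type for which they make sense.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical types; each one accepts only the kind of path it stands for.
final class AbsoluteDirectory {
    final Path path;

    AbsoluteDirectory(Path path) {
        if (!path.isAbsolute()) throw new IllegalArgumentException("not absolute: " + path);
        this.path = path;
    }
}

final class RelativeFile {
    final Path path;

    RelativeFile(Path path) {
        if (path.isAbsolute()) throw new IllegalArgumentException("not relative: " + path);
        this.path = path;
    }

    // A relative file has no location of its own; it must be resolved first.
    AbsoluteFile resolveIn(AbsoluteDirectory directory) {
        return new AbsoluteFile(directory.path.resolve(path));
    }
}

final class AbsoluteFile {
    final Path path;

    AbsoluteFile(Path path) {
        if (!path.isAbsolute()) throw new IllegalArgumentException("not absolute: " + path);
        this.path = path;
    }

    // Only an absolute file can be written to or deleted.
    void store(String text) throws IOException {
        Files.writeString(path, text);
    }

    void delete() throws IOException {
        Files.delete(path);
    }
}

// The following lines would be rejected by the compiler:
//   new RelativeFile(Path.of("note.txt")).delete();   // RelativeFile has no delete()
//   new AbsoluteDirectory(someDir).store("text");     // AbsoluteDirectory has no store()
```

With a single general-purpose file type, both mistakes in the commented lines would compile and only show up at runtime.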

Because Obix is statically typed, and also because it supports generic types with no type erasure at runtime, it will be possible to write static analysis tools in the future, or to integrate static analysis in the compiler. This will allow even more errors to be found automatically. Tools like FindBugs (http://findbugs.sourceforge.net/) prove that it is possible to find bugs with static analysis.

2. Greater productivity

It goes without saying that greater productivity is another important goal pursued in software development projects.

Obviously, the above-mentioned concepts for increased reliability and maintainability also increase productivity, because fewer bugs mean that less time needs to be spent in finding and repairing bugs.

Greater productivity is also achieved by making recurring tasks as easy as possible. Here are some examples:

  • Besides applying the Fail fast! concepts, Obix's libraries also aim to provide single commands for common tasks that would otherwise require writing a whole script. For example, replacing a string in a text file with another string can be done with a single command (that supports regular expressions), instead of having to read from and write to the file in a loop using buffers, and embed the operation in a try-catch-finally block to properly close the file and handle resource errors. Such high-level commands not only increase productivity, but they also help to reduce the number of bugs, because writing a single command is much less error-prone than writing a script with loops and try-catch-finally blocks.

  • It is easy to create the different kinds of applications required in practice. These range from quickly executing a single instruction, up to developing a complex web application. For example, if you just want to write and execute a small script to test some code or to create a small utility, you don't need to create and maintain a project. You can simply put the source code into a text file located anywhere on the disk or the network, and then execute it. For more information and examples of different kinds of applications (e.g. command-line utilities, terminal input/output applications, desktop applications or web applications) please refer to Part II, “Versatility” in the tutorial.

  • The Obix compiler automatically takes care of some frequent tasks that have to be done manually in other programming languages and lead to so-called boilerplate code. For example:

    • You don't have to add and maintain package and import statements in the head of source code files. The compiler creates them automatically, based on the location of the source code files in the directory structure.

    • You don't have to write 'getters' and 'setters', except when special behavior is required.

    • Simple object constructors that take one input argument for each attribute of an object are created by the compiler, if desired.

      For example, a Java constructor like this:

      public Customer ( int identifier, String name, String address, String city ) {
         this.identifier = identifier;
         this.name = name;
         this.address = address;
         this.city = city;
      }

      is reduced to the following single instruction in Obix:

      creator create kind:in_all end

  • Java source code can easily be embedded in Obix source code. This means that you can use the vast number of existing Java libraries in your Obix applications. The inverse is true too. Obix code can be called from within Java code, and data can easily be exchanged between Obix and Java. For more information, please refer to Chapter 15, Embedded Java source code.

  • Applications can be developed on Windows or Linux systems, and then be deployed on Windows and/or Linux systems. The libraries take care of the differences between the Unix- and Windows-world, such as different line feed characters and different file path separators.
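The first item in the list above, a single command that replaces text in a file, can be approximated in Java by wrapping the file handling in one helper. This is only a sketch under stated assumptions: the class and method names are hypothetical, the file is assumed to be UTF-8 text, and the whole file is read into memory, so it suits small files only. It is not the Obix library command.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper that hides open/read/write/close behind one call.
final class FileText {

    /**
     * Replaces every match of the regular expression in the file.
     * Returns true if the file content was changed, false otherwise.
     */
    static boolean replaceAll(Path file, String regex, String replacement) throws IOException {
        String before = Files.readString(file, StandardCharsets.UTF_8);
        String after = before.replaceAll(regex, replacement);
        if (after.equals(before)) {
            return false; // nothing matched, file left untouched
        }
        Files.writeString(file, after, StandardCharsets.UTF_8);
        return true;
    }
}
```

A caller then needs one line, for example FileText.replaceAll(Path.of("config.txt"), "localhost", "example.org"), and no loops, buffers or finally blocks.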

Note

At the time of writing, a web framework for Obix is under construction. This framework aims to considerably simplify and speed up the development of professional websites with a modern user interface. Its main goals are:

  • automatic creation of dynamic websites to interact with Obix objects (no need to write GUI code; ideal for prototyping)
  • customization of the framework's default websites by writing website code in Java or Obix (no need to write HTML, JavaScript and CSS code, except in case of very specific requirements)
  • complete separation of business code and UI code
  • provision of rich user interfaces (e.g. sorting, filtering, sizing and paging of tables)

The availability of this framework will be announced on www.rps-obix.com.

3. Greater simplicity

Besides increased reliability and productivity, a programming language should also be easy to learn and use.

Nobody said it better than Albert Einstein: “Everything should be as simple as possible, but not simpler.”

Productivity and simplicity go hand in hand. If a programming language increases productivity, then it is generally also simpler to use. Therefore, all the points mentioned in the previous chapter about greater productivity also lead to greater simplicity. Moreover, the following design choices add to simplicity:

  • Important object-oriented concepts are automatically enforced through the language itself, so that the programmer cannot violate them. For example, the hiding of implementation details is automatically enforced through the concept of types (i.e. the interface) and factories (i.e. the implementation) (see the section called “Type” and the section called “Factory”). The types define what you can do with an object. The factories define how this is realized.

  • Obix favors strict default values. This helps programmers to write better code without extra effort. For example, objects are immutable by default, void values are not allowed by default, and code is statically typed by default.

    Default values for input arguments in the standard libraries are also defined in a strict, non-error-prone way.

  • The source code syntax has intentionally been designed to be easy to understand. This implies that we may have to type a little more sometimes, but the reasons for favoring readability over writability are:

    • Programmers spend more time reading and trying to understand code than writing it. It is generally assumed that about 70% of programmers' time is spent in maintaining code.
    • Quickly understanding code is important when we have to read code written by somebody else (or written by ourselves two years ago).
    • Less experienced or occasional programmers should not have to struggle with cryptic or ambiguous syntax constructs.
    • Obix supports source code templates. Besides helping to avoid code duplication, templates also reduce the amount of code that has to be typed and maintained.

  • As stated previously, confusing and error-prone programming techniques, such as automatic type casts, are prevented.

  • Because everything is an object in Obix, you don't have to deal with different and sometimes surprising or restrictive behaviors of primitive data types and arrays co-existing with 'real' objects.
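The last item in the list above can be illustrated with Java, where arrays behave differently from 'real' objects such as lists. The class name in this small sketch is hypothetical. Two arrays with the same content are not equal according to 'equals', whereas two lists with the same content are.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class showing arrays and objects behaving differently.
final class ArrayVsObjectDemo {

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {1, 2, 3};
        List<Integer> x = List.of(1, 2, 3);
        List<Integer> y = List.of(1, 2, 3);

        System.out.println(a.equals(b));         // false: arrays are compared by identity
        System.out.println(Arrays.equals(a, b)); // true: a separate helper is needed
        System.out.println(x.equals(y));         // true: lists are compared by content
    }
}
```

A programmer has to know which rule applies to which kind of value. When every value is an object that follows the same rules, this source of surprises disappears.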

Conclusion

Figure 2. Obix programming language: reliable code, developer productivity, simple programming
