Wednesday, 7 October 2020

ADT, type theory and set theory.

Intro

Mathematical models underpin most (arguably all) software. To create models we need to understand the pieces they are made of: sets, types and algebraic structures.

Type theory and set theory provide the formal definitions behind programming today. I believe that by understanding how these pieces fit together we can create "better" software: more readable and easier to debug.

There is a saying that "the ideal solution for a professor to a problem was to introduce 10 classes instead of creating 1 function", and I think this blog post might shed light on it by distinguishing types from sets, algebraic structures and ADTs.

Sets and types

Sets and types are similar concepts, and at a high level they play the same roles:

  1. They are used in logic.
  2. They are used to create models.

Since programming is about representing concepts and creating models, sets and types are important.

An example of a type is PERSON. In a program we can represent this type in many ways, the most common being a class, so we can create a class called PERSON which represents the type PERSON. We can equally view PERSON as a set (the set of all persons); for our purposes the set view is identical to the type view.

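class PERSON {
public:
 void Op();        // an operation on PERSON
private:
 bool HasName();   // an explicit axiom that PERSON must satisfy
 int data;         // internal representation, hidden from callers
};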

Algebraic structure as ADT as set/type

An algebraic structure is defined by a carrier set, operations on that set, axioms and distinguished members. The class above is expressed as an Abstract Data Type (ADT): PERSON is the carrier set (or type), Op() is an operation, and HasName() is an example of an explicit axiom.
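To make the four parts concrete, here is a minimal sketch using a structure with well-known axioms: hours on a 12-hour clock under addition. The names Hour, Add, Zero, IdentityHolds, AssociativityHolds and CheckAxioms are hypothetical and chosen only for this illustration; they are not taken from the PERSON class above.

```cpp
#include <cassert>

// Carrier set: the twelve values 0..11 (hypothetical example type).
class Hour {
public:
    // Any int is reduced into the carrier set 0..11.
    explicit Hour(int value) : value_(((value % 12) + 12) % 12) {}

    // Distinguished member: the identity element for Add.
    static Hour Zero() { return Hour(0); }

    // Operation: closed over the carrier set, the result is always an Hour.
    Hour Add(Hour other) const { return Hour(value_ + other.value_); }

    bool operator==(Hour other) const { return value_ == other.value_; }
    int Value() const { return value_; }

private:
    int value_;
};

// Axioms, written as checkable predicates.
inline bool IdentityHolds(Hour a) {
    return a.Add(Hour::Zero()) == a && Hour::Zero().Add(a) == a;
}

inline bool AssociativityHolds(Hour a, Hour b, Hour c) {
    return a.Add(b).Add(c) == a.Add(b.Add(c));
}

// The carrier set is finite, so the identity axiom can be checked exhaustively.
inline void CheckAxioms() {
    for (int a = 0; a < 12; ++a) {
        assert(IdentityHolds(Hour(a)));
    }
}
```

In this sketch the axioms live outside the class as plain functions, which is one possible reading of "explicit axiom"; the PERSON class keeps its axiom as a private member instead.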

Since we can represent algebraic structures with ADTs, and algebraic structures can represent types/sets, we can represent types/sets with ADTs. The definition is recursive: a set of sets can be represented using two ADTs.
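The "set of a set" idea can be sketched with two ADTs, where each value of the second ADT is itself a set of values of the first. Person and Group below are hypothetical names, and identifying a person by name alone is an assumption made to keep the example short.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// First ADT: the carrier set is all persons (identified by name here).
class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}
    const std::string& Name() const { return name_; }

    // Ordering so that Person values can be members of a std::set.
    bool operator<(const Person& other) const { return name_ < other.name_; }

private:
    std::string name_;
};

// Second ADT: each value is a set of Person values, so its carrier set
// is a set of sets.
class Group {
public:
    // Distinguished member: the empty group.
    static Group Empty() { return Group(); }

    // Operation: returns a new group that also contains p.
    Group With(const Person& p) const {
        Group result = *this;
        result.members_.insert(p);
        return result;
    }

    // Operation: set union, closed over groups.
    Group Union(const Group& other) const {
        Group result = *this;
        result.members_.insert(other.members_.begin(), other.members_.end());
        return result;
    }

    bool Contains(const Person& p) const { return members_.count(p) > 0; }
    std::size_t Size() const { return members_.size(); }

private:
    std::set<Person> members_;
};
```

Group has its own operations and its own distinguished member, independent of whatever operations Person defines, which is what lets the construction be repeated at the next level up.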

Conclusion

By using classes we have bound the operations to the type/set explicitly, which means we can more easily represent individual types than individual sets. However, the unification of the two means that while programming, "names run out" and we often find ourselves creating strange types to manage other types. If we separate the carrier set from the operations performed on it, we can create more sensible programs. To summarise: by using classes to create individual types, and favouring modules over types to perform functions, we might end up with more semantic and formalized programs.
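As a rough sketch of that separation, the type below only describes what a person is, while a module (a C++ namespace here) holds the functions that work on it. The names Person, registry, IsAdult and Adults are hypothetical, and the age threshold of 18 is an assumption made for the example.

```cpp
#include <string>
#include <vector>

// The type: the carrier set and its representation, nothing else.
struct Person {
    std::string name;
    int age;
};

// The module: functions over the carrier set, named after what the
// program does rather than after a "manager" type invented to hold them.
namespace registry {

// Assumed rule for illustration: adulthood starts at 18.
inline bool IsAdult(const Person& p) { return p.age >= 18; }

// Returns the adults in people, preserving their order.
inline std::vector<Person> Adults(const std::vector<Person>& people) {
    std::vector<Person> result;
    for (const Person& p : people) {
        if (IsAdult(p)) {
            result.push_back(p);
        }
    }
    return result;
}

}  // namespace registry
```

The alternative this avoids is a class such as PersonManager or PersonFilter that exists only to give the functions somewhere to live.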

Separating OOP objects and modules using this reasoning gives us a huge amount of flexibility in how we structure our programs. On the one hand we can create descriptive real-life hierarchies by following the type view, where PERSON is-a HUMAN, and use polymorphism for dynamic dispatch. On the other hand we can functionally decompose the program's system into modules which more directly represent what the software is doing.
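Both halves of that flexibility can sit side by side. The sketch below uses a small is-a hierarchy with a virtual function for dynamic dispatch, plus a module function that works on any member of the hierarchy. Human, Person, Describe and the report namespace are hypothetical names used only for illustration.

```cpp
#include <string>
#include <utility>

// Type view: a real-life hierarchy where Person is-a Human.
class Human {
public:
    virtual ~Human() = default;
    virtual std::string Describe() const { return "human"; }
};

class Person : public Human {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}

    // Chosen at run time through the virtual call (dynamic dispatch).
    std::string Describe() const override { return "person named " + name_; }

private:
    std::string name_;
};

// Module view: what the software does, decomposed by function.
namespace report {

// Works for Human and anything derived from it.
inline std::string Line(const Human& h) { return "[" + h.Describe() + "]"; }

}  // namespace report
```

Calling report::Line with a Person yields the Person description even though the parameter is declared as Human, so the module does not need to know which concrete type it was given.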

I believe that a common error with modern OOP (where real-life concept = object) is that everything is strictly a type, and never viewed as an algebraic structure. Functions are bound to the objects as types, and then more types are created to hold functions that manage other types. This view may seem ideal and may seem to represent real life perfectly (say, a chest holds a number of items); however, as we can see above, type theory is not the only way to create programs. This is reflected in Java, where heavy use of types can make an application overly verbose and lead to abstractions that are wasteful and make no sense.

References

  1. https://en.wikipedia.org/wiki/Abstract_data_type
  2. https://en.wikipedia.org/wiki/Algebraic_structure
  3. Discrete Mathematics in Computer Science - STANAT, DONALD F - 1977 - ISBN 0-13-2160528 - Chapter 7 Algebras, 7.1 Structure of algebras

1 comment:

  1. Great post. I think I've almost got my head around 'objects' now. Obviously not as clear cut as some would like us to believe. The question in my mind now is do you not need some element of 'functionality' to reside in 'types'. Maybe at a lower level. For example, in simulating say a library - if I've got this correct - books and dvd's would be classed types. A library function would be say 'allow_reading_of_items' - i.e. not taking the item out but 'reading' it in the library. In this case you would need two separate functions to read one for books and another for dvd's. Books you can pick up and read without too much bother. However, a dvd needs you to load it into a reader and display the content. Would these lower level functions not be better associated with the 'type' as they are specific to that particular 'type'? Or have I gone completely bonkers trying to rationalise this too much?
