
Class vs data structure

In object-oriented programming, a custom class (like a Person class with a name, a list of addresses, etc.) holds data and can include collection objects too. A data structure is also used to hold data. So, is a class conceptually considered an advanced data structure? And in the design of efficient systems (in the object-oriented world and in large systems), are classes treated like data structures, with algorithmic analysis applied to class design for greater efficiency (at companies like Google or Facebook)?

Carbonizer asked Nov 22 '10 05:11




2 Answers

I recommend reading chapter 6 of Clean Code, "Objects and Data Structures". The whole chapter is about this. If you don't want to buy the book, you can read a summary; it can be found here.

According to that chapter, you can use classes effectively in two different ways. This phenomenon is called data/object anti-symmetry. Depending on your goals, you have to decide whether your classes will follow the open/closed principle (OCP) or not.
If they follow the OCP, they will be polymorphic, and their instances will be used as objects. They hide their data and implementation behind a common interface, and it is easy to add a new type that implements that interface as well. Most design patterns fulfill the OCP, for example MVC, IoC, every wrapper, adapter, etc.
If they don't follow the OCP, they won't be polymorphic, and their instances will be used as data structures. They expose their data, and that data is manipulated by other classes. This is the typical approach in procedural programming as well. There are several examples that don't use the OCP, for example DTOs, exceptions, config objects, the visitor pattern, etc.
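To make the two styles concrete, here is a minimal sketch built on the Person example from the question. All names (`PersonData`, `PersonLabels`, `Labelled`, `Person`) are hypothetical and chosen only for illustration; they do not come from Clean Code.

```java
import java.util.ArrayList;
import java.util.List;

// Data-structure style: exposes its data and has no behaviour of its own.
final class PersonData {
    final String name;
    final List<String> addresses;

    PersonData(String name, List<String> addresses) {
        this.name = name;
        this.addresses = addresses;
    }
}

// The behaviour lives outside the data structure, in a separate class.
final class PersonLabels {
    static String label(PersonData p) {
        return p.name + " (" + p.addresses.size() + " addresses)";
    }
}

// Object style: callers only see the interface, never the fields.
interface Labelled {
    String label();
}

final class Person implements Labelled {
    private final String name;
    private final List<String> addresses;

    Person(String name, List<String> addresses) {
        this.name = name;
        this.addresses = new ArrayList<>(addresses); // defensive copy
    }

    @Override
    public String label() {
        return name + " (" + addresses.size() + " addresses)";
    }
}
```

Adding a new operation is cheap in the first style (one more static method), while adding a new type is cheap in the second (one more class implementing `Labelled`).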

A typical pattern where you should think about following the OCP and moving the code to a lower abstraction level:

    class Manipulator {
        void doSomething(Object dataStructure) {
            if (dataStructure instanceof MyType1) {
                // doSomething implementation 1
            } else if (dataStructure instanceof MyType2) {
                // doSomething implementation 2
            }
            // ...
        }

        void doSomethingElse(Object dataStructure) {
            if (dataStructure instanceof MyType1) {
                // doSomethingElse implementation 1
            } else if (dataStructure instanceof MyType2) {
                // doSomethingElse implementation 2
            }
            // ...
        }
    }

    class MyType1 {}
    class MyType2 {}
    // If you want to add a new type, every method of Manipulator has to change.

Fix: move the implementation to a lower abstraction level and follow the OCP:

    interface MyType {
        void doSomething();
        void doSomethingElse();
    }

    class MyType1 implements MyType {
        @Override
        public void doSomething() {
            // doSomething implementation 1
        }

        @Override
        public void doSomethingElse() {
            // doSomethingElse implementation 1
        }
    }

    class MyType2 implements MyType {
        @Override
        public void doSomething() {
            // doSomething implementation 2
        }

        @Override
        public void doSomethingElse() {
            // doSomethingElse implementation 2
        }
    }

    // The recently added new type: no existing class has to change.
    class MyType3 implements MyType {
        @Override
        public void doSomething() {
            // doSomething implementation 3
        }

        @Override
        public void doSomethingElse() {
            // doSomethingElse implementation 3
        }
    }

A typical pattern where you should think about violating the OCP and moving the code to a higher abstraction level:

    interface MyType {
        void doSomething();
        void doSomethingElse();
        // If you add a new method here, every class that implements
        // this interface has to be modified.
    }

    class MyType1 implements MyType {
        @Override
        public void doSomething() {
            // doSomething implementation 1
        }

        @Override
        public void doSomethingElse() {
            // doSomethingElse implementation 1
        }
    }

    class MyType2 implements MyType {
        @Override
        public void doSomething() {
            // doSomething implementation 2
        }

        @Override
        public void doSomethingElse() {
            // doSomethingElse implementation 2
        }
    }

or

    interface MyType {
        void doSomething();
        void doSomethingElse();
    }

    class MyType1 implements MyType {
        @Override
        public void doSomething() {
            // doSomething implementation 1
        }

        @Override
        public void doSomethingElse() {
            // doSomethingElse implementation 1
        }
    }

    class MyType2 implements MyType {
        @Override
        public void doSomething() {
            // doSomething implementation 2
        }

        @Override
        public void doSomethingElse() {
            // doSomethingElse implementation 2
        }
    }

    // Adding a new type for which one or more of the methods are meaningless.
    class MyType3 implements MyType {
        @Override
        public void doSomething() {
            throw new UnsupportedOperationException(
                "Not implemented, because it does not make any sense.");
        }

        @Override
        public void doSomethingElse() {
            // doSomethingElse implementation 3
        }
    }

Fix: move the implementation to a higher abstraction level and violate the OCP:

    class Manipulator {
        void doSomething(Object dataStructure) {
            if (dataStructure instanceof MyType1) {
                // doSomething implementation 1
            } else if (dataStructure instanceof MyType2) {
                // doSomething implementation 2
            }
            // ...
        }

        void doSomethingElse(Object dataStructure) {
            if (dataStructure instanceof MyType1) {
                // doSomethingElse implementation 1
            } else if (dataStructure instanceof MyType2) {
                // doSomethingElse implementation 2
            }
            // ...
        }

        // The recently added new method: no existing type has to change.
        void doAnotherThing(Object dataStructure) {
            if (dataStructure instanceof MyType1) {
                // doAnotherThing implementation 1
            } else if (dataStructure instanceof MyType2) {
                // doAnotherThing implementation 2
            }
            // ...
        }
    }

    class MyType1 {}
    class MyType2 {}

or splitting up the classes into subclasses.
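One possible reading of the "splitting up" option, sketched under the assumption that it means giving each operation its own narrower type: a class then implements only the operations that make sense for it, so nothing has to throw. The interface names `CanDoSomething` and `CanDoSomethingElse` are hypothetical, and the methods return strings only so the behaviour is observable.

```java
// Narrow interfaces, one per operation (hypothetical names).
interface CanDoSomething {
    String doSomething();
}

interface CanDoSomethingElse {
    String doSomethingElse();
}

// Supports both operations.
class MyType1 implements CanDoSomething, CanDoSomethingElse {
    @Override
    public String doSomething() {
        return "doSomething 1";
    }

    @Override
    public String doSomethingElse() {
        return "doSomethingElse 1";
    }
}

// Only doSomethingElse is meaningful here, so only that interface is implemented.
class MyType3 implements CanDoSomethingElse {
    @Override
    public String doSomethingElse() {
        return "doSomethingElse 3";
    }
}
```

Code that needs `doSomething` now asks for a `CanDoSomething`, and the compiler rejects a `MyType3` there instead of failing at runtime.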

People usually follow the OCP once the method count goes above one or two, because repeating the same if-else chain in every method is not DRY.

I don't recommend using mixed classes that partially fulfill and partially violate the OCP, because the code then becomes very hard to maintain. You should decide case by case which approach to follow. This is usually an easy decision, and if you make a mistake, you can still refactor your code later.

inf3rno answered Oct 11 '22 09:10


Whether a custom class is a data structure depends on whom you ask. At the very least, those who say yes would acknowledge that it's a user-defined data structure which is more domain-specific and less established than data structures such as arrays, linked lists or binary trees. For this answer, I consider them distinct.

While it's easy to apply Big O algorithm analysis to data structures, it's a little more complex for classes since they wrap many of these structures, as well as other instances of other classes... but a lot of operations on class instances can be broken down into primitive operations on data structures and represented in terms of Big O. As a programmer, you can endeavour to make your classes more efficient by avoiding unnecessary copying of members and ensuring that method invocations don't go through too many layers. And of course, using performant algorithms in your methods goes without saying, but that's not OOP specific. However, functionality, design and clarity should not be sacrificed in favour of performance unless necessary. And premature optimisation is the devil yada yada yada.
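As a rough illustration of breaking a class operation down into operations on the data structures it wraps, here is a minimal sketch using the Person example from the question. The class names `PersonWithList` and `PersonWithSet` and their methods are hypothetical; the complexities in the comments are the usual ones for `ArrayList` and `HashSet`.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Addresses kept in a list: the lookup is a linear scan.
final class PersonWithList {
    private final List<String> addresses = new ArrayList<>();

    void addAddress(String address) {    // amortised O(1)
        addresses.add(address);
    }

    boolean hasAddress(String address) { // O(n)
        return addresses.contains(address);
    }
}

// Addresses kept in a hash set: the lookup is O(1) on average,
// at the cost of losing insertion order and duplicates.
final class PersonWithSet {
    private final Set<String> addresses = new HashSet<>();

    void addAddress(String address) {    // O(1) on average
        addresses.add(address);
    }

    boolean hasAddress(String address) { // O(1) on average
        return addresses.contains(address);
    }
}
```

Both classes expose the same methods, so the cost of `hasAddress` is decided entirely by the wrapped collection. Whether the difference matters depends on how many addresses a person realistically has.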

I'm certain that some academic, somewhere, has attempted to formulate a metric for quantifying class performance or even a calculus for classes and their operations, but I haven't come across it yet. However, there exists QA research like this which measures dependencies between classes in a project... one could possibly argue that there's a correlation between the number of dependencies and the layering of method invocations (and therefore lower class performance). But if someone has researched this, I'm sure you could find a more relevant metric which doesn't require sweeping inferences.

asdfjklqwer answered Oct 11 '22 10:10