Semantic versioning: changing a non-opaque struct that *should* be allocated through library functions

My C library, in version 1.0.0, defines a struct and some functions to allocate and use the struct:

typedef struct { int x; int y; } MyStruct;
MyStruct *allocate( int, int );
void destroy( MyStruct* );
void print( MyStruct* );

Users are never supposed to allocate the struct themselves or to copy it by value. This is the main difference from the question "Semantic versioning: minor or major change?". For instance, a program should use it like this:

void f(){
  MyStruct *ms = allocate(0,0);
  ms->x += 1;
  print(ms);
  destroy(ms);
}

Now I need to add a new field to the struct, while the function signatures stay the same:

typedef struct { int x; int y; int z; } MyStruct;

The new struct takes more memory than the old one. If a program allocates a MyStruct instance directly or copies it by value, it might break when it is linked against a version of the library different from the one it was built with.
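To make the size problem concrete, here is a minimal sketch. MyStructV1 and MyStructV2 are hypothetical stand-ins for the 1.0.0 layout and the new layout; in the real library both are called MyStruct and live in different versions of the header. The memcpy imitates what a by-value copy does in a client that was compiled against the old header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: layout baked into a client built against 1.0.0. */
typedef struct { int x; int y; } MyStructV1;
/* Hypothetical stand-in: layout the newer library actually uses. */
typedef struct { int x; int y; int z; } MyStructV2;

/* The new layout is strictly larger, so old size assumptions are wrong. */
static_assert(sizeof(MyStructV2) > sizeof(MyStructV1),
              "new layout is larger than the old one");

/* Number of trailing bytes an old by-value copy never touches. */
size_t bytes_missed_by_old_copy(void) {
    return sizeof(MyStructV2) - sizeof(MyStructV1);
}

/* Simulates an old client copying by value: only sizeof(MyStructV1)
 * bytes are moved, so the destination's z keeps its previous value. */
int z_after_old_style_copy(const MyStructV2 *src) {
    MyStructV2 dst = { 0, 0, -1 };           /* -1 marks "z never written" */
    memcpy(&dst, src, sizeof(MyStructV1));   /* copies x and y only */
    return dst.z;
}
```

A client that only ever holds the pointer returned by allocate() never performs such a copy, which is why the documented usage keeps working.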

However, that's not how programs use MyStruct: as long as they follow the documentation, everything works fine. But nothing in the code prevents them from misusing the struct.

I'm using Semantic Versioning to version my library. In the case above should I increase the minor version (functionality added in a backwards compatible manner) or the major one (incompatible API changes)?

Blue Nebula asked Feb 25 '21 17:02




1 Answer

You made a breaking change to a publicly visible structure, which in C is not backwards compatible with the earlier version. Besides the change in structure size, your own example demonstrates exactly why this is a breaking change:

void f(){
  MyStruct *ms = allocate(0,0);
  ms->x += 1;
  print(ms);   // Indicates you are aware of external dependencies.
  destroy(ms);
}

You are exposing a data structure that your customers may incorporate into a data set of some kind. You may have some ideas about how your library will be used, but I can assure you that your customers will always surprise you.

Given the code you posted and your stated intent with respect to how your library is to be used, I'd say you need to redesign your API so that MyStruct is entirely opaque to the user and never changes size (customers often cache things like this). Borrow a page from the standard library and use a handle:

typedef int MyHandle;
MyHandle allocate(int x, int y);
void destroy(MyHandle h);
void print(MyHandle h);

The handle can be range-checked by your internal code and then used as an index into a table of structs or struct pointers, or it could be a key into a binary tree. The point is that you are free to do whatever you like without disrupting your API.
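As a rough illustration of the table variant, here is a minimal sketch. The Slot type, the fixed capacity MY_MAX_OBJECTS, and the error value MY_INVALID_HANDLE are assumptions made for this example and are not part of the original API. Only the three function signatures come from the declarations above.

```c
#include <stdio.h>

typedef int MyHandle;
#define MY_INVALID_HANDLE (-1)  /* hypothetical error value */
#define MY_MAX_OBJECTS 64       /* hypothetical fixed capacity */

/* Private to the library: fields can be added here at any time,
 * because clients only ever see the integer handle. */
typedef struct { int in_use; int x; int y; int z; } Slot;
static Slot slots[MY_MAX_OBJECTS];

/* Range-checks the handle; returns NULL for out-of-range or free slots. */
static Slot *lookup(MyHandle h) {
    if (h < 0 || h >= MY_MAX_OBJECTS || !slots[h].in_use) return NULL;
    return &slots[h];
}

MyHandle allocate(int x, int y) {
    for (MyHandle h = 0; h < MY_MAX_OBJECTS; ++h) {
        if (!slots[h].in_use) {
            slots[h] = (Slot){ 1, x, y, 0 };
            return h;
        }
    }
    return MY_INVALID_HANDLE;   /* table is full */
}

void destroy(MyHandle h) {
    Slot *s = lookup(h);
    if (s) s->in_use = 0;       /* invalid handles are ignored */
}

void print(MyHandle h) {
    const Slot *s = lookup(h);
    if (s) printf("(%d, %d)\n", s->x, s->y);
}
```

Because a stale or bogus handle fails the range check instead of dereferencing a bad pointer, misuse by a client is easier to detect than with a raw MyStruct pointer.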

If you intend for the x and y fields to be visible, use an extendable struct:

typedef struct { 
    int x; 
    int y; 
    void* reserved; // For internal use only!
} MyStruct;

The MyStruct.reserved field should always be NULL until you need it for internal use. Just keep in mind that once you expose data fields like this, how your customers use them is completely out of your hands. When you expose a structure in this way, you are making a commitment to your customers.
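One way a later release might start using the reserved field is sketched below. MyStructExt and get_z are hypothetical names introduced for this example, and the sketch assumes the library, not the client, always performs the allocation.

```c
#include <stdlib.h>

typedef struct {
    int x;
    int y;
    void *reserved; /* For internal use only! */
} MyStruct;

/* Hypothetical private extension block, defined only inside the library.
 * It can grow in later releases without changing sizeof(MyStruct). */
typedef struct { int z; } MyStructExt;

MyStruct *allocate(int x, int y) {
    MyStruct *ms = malloc(sizeof *ms);
    MyStructExt *ext = malloc(sizeof *ext);
    if (!ms || !ext) {          /* free(NULL) is a no-op, so this is safe */
        free(ms);
        free(ext);
        return NULL;
    }
    ext->z = 0;
    ms->x = x;
    ms->y = y;
    ms->reserved = ext;
    return ms;
}

void destroy(MyStruct *ms) {
    if (!ms) return;
    free(ms->reserved);
    free(ms);
}

/* Hypothetical accessor: falls back to 0 when no extension is attached. */
int get_z(const MyStruct *ms) {
    const MyStructExt *ext = ms->reserved;
    return ext ? ext->z : 0;
}
```

The public layout stays at two ints plus one pointer no matter how many fields MyStructExt gains, which is the property the size-sensitive clients depend on.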

Regarding getters and setters:

// Using the MyHandle type described earlier:
typedef struct My_XY {   // no leading underscore: _My_XY is a reserved identifier in C
    int x;
    int y;
} My_XY;

My_XY GetXY(MyHandle h);

Problem solved.
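A minimal sketch of what such a getter, plus a matching setter, might look like behind the handle. The Private type, the eight-entry table, and SetXY are assumptions for illustration only; the answer itself declares just GetXY.

```c
#include <stddef.h>

typedef int MyHandle;
typedef struct My_XY { int x; int y; } My_XY;

/* Hypothetical private storage; a tiny fixed table stands in for
 * whatever container the library really uses. */
typedef struct { int x; int y; int z; } Private;
static Private table[8];

static Private *lookup(MyHandle h) {
    return (h >= 0 && h < 8) ? &table[h] : NULL;
}

/* Getter: returns a copy, so callers never alias internal state.
 * An invalid handle yields {0, 0} in this sketch. */
My_XY GetXY(MyHandle h) {
    My_XY out = { 0, 0 };
    const Private *p = lookup(h);
    if (p) {
        out.x = p->x;
        out.y = p->y;
    }
    return out;
}

/* Hypothetical setter: returns 0 on success, -1 for a bad handle. */
int SetXY(MyHandle h, My_XY v) {
    Private *p = lookup(h);
    if (!p) return -1;
    p->x = v.x;
    p->y = v.y;
    return 0;
}
```

Note that My_XY itself becomes part of the ABI once it is returned by value, so it should be treated as frozen; new data such as z would get its own accessor.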


I will add that since adding the z field is a breaking change anyway, and you seem to feel the need to extend the functionality of your product, this would be an opportune time for a radical API rewrite. But if you must extend it without breaking your customer base, you can hide the z data point internally with something like:

// Public API
typedef struct { int x; int y; } MyStruct;
MyStruct *allocate( int, int );
void destroy( MyStruct* );
void print( MyStruct* );
// Implementation

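typedef struct XY_Node {      // tag renamed: _XY_Node is a reserved identifier in C
    MyStruct* xy;             // the public struct handed out to the client
    int z;                    // hidden extra field
    struct XY_Node *pNext;
} XY_Node;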

XY_Node *root = NULL;  // head of the list; empty until the first allocate()

XY_Node* AddNode(int x, int y, int z) {...}
XY_Node* RemoveNode(int x, int y, int z) {...}
XY_Node* FindNode(int x, int y, int z) {...}

int DeriveZ(int x, int y) {...}

MyStruct *allocate(int x, int y)
{
    return AddNode(x, y, DeriveZ(x, y))->xy;  // AddNode returns a pointer, so use ->
}

// etc...

You can use a hash table or some form of binary tree rather than a list. The point is that you don't have to break your API in order to extend it. You could do this as a final release in the 1.y.z series and then completely overhaul your API, so that you have a cleaner customer experience and a more efficient implementation.
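For reference, here is one way the list-based outline could be filled in so that it compiles. This is a sketch under several assumptions, not the answer's actual implementation: the node embeds MyStruct instead of pointing to it (one allocation instead of two), nodes are looked up by the client's pointer instead of by (x, y, z), and derive_z and lookup_z are hypothetical helpers with a placeholder rule for z.

```c
#include <stdlib.h>

/* Public API type, unchanged from 1.0.0. */
typedef struct { int x; int y; } MyStruct;

/* Private node pairing each public struct with its hidden z. */
typedef struct XY_Node {
    MyStruct xy;              /* embedded; clients receive &node->xy */
    int z;
    struct XY_Node *next;
} XY_Node;

static XY_Node *root = NULL;

/* Hypothetical placeholder; the real rule is elided in the answer. */
static int derive_z(int x, int y) { return x + y; }

MyStruct *allocate(int x, int y) {
    XY_Node *n = malloc(sizeof *n);
    if (!n) return NULL;
    n->xy.x = x;
    n->xy.y = y;
    n->z = derive_z(x, y);
    n->next = root;           /* push onto the front of the list */
    root = n;
    return &n->xy;
}

void destroy(MyStruct *ms) {
    /* Walk the links so the matching node can be unlinked in place. */
    for (XY_Node **link = &root; *link; link = &(*link)->next) {
        if (&(*link)->xy == ms) {
            XY_Node *dead = *link;
            *link = dead->next;
            free(dead);
            return;
        }
    }
}

/* Hypothetical internal helper: stores z and returns 0 when ms came
 * from allocate(), or returns -1 for pointers the library never issued. */
int lookup_z(const MyStruct *ms, int *z_out) {
    for (const XY_Node *n = root; n; n = n->next) {
        if (&n->xy == ms) {
            *z_out = n->z;
            return 0;
        }
    }
    return -1;
}
```

sizeof(MyStruct) is the same as in 1.0.0, so existing binaries keep working. A struct the client allocated on its own is simply unknown to the list, which lets the library detect that misuse instead of reading past the end of the object. Because z is computed once in allocate(), it goes stale if the client later modifies x or y directly.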

jwdonahue answered Nov 05 '22 05:11
