This problem confuses me the most when I think about using feature toggles in applications. Most features require some change to the database. How can a feature flag be implemented so that database changes can be reverted smoothly when the toggle is turned on or off? Especially when a database migration tool such as Flyway or Liquibase is used.
Let's use an example. Say I have a simple Playlist table with Id and Name columns. I want to add the ability to give a Playlist a description. The table needs to change: a new column is added, and its value is not mandatory. Simple so far. After some time the feature is turned off in production. The solution is quite easy but messy: we can leave the column in the db and stop using it in the code. Migration tools work incrementally, so I don't see any option to get back to the previous db state when I turn off the feature toggle (or am I wrong?). A harder example is when the field needs to be mandatory. Then I can't just leave that field in the db, because it would be backward incompatible. So what then? How should this situation be handled? I think it's quite a common one.
Additionally, even with slight changes to the db, my application model for that class will change too (assuming some ORM is used). This is not as easy as swapping an implementation with the strategy pattern, unless you put a lot of abstraction around your ORM model usage. So adding such a feature flag seems very complicated. Can anyone help me understand how it's possible to use feature toggles then? Or maybe someone has a full example to show? Preferably in Java.
Feature toggles are a way of changing behaviour without changing code - which means they are also not associated with database schema changes. Turning off a toggle should do something reversible like hiding a field from the UI, not something permanent like dropping a database column. With that in mind, your example would look like:
- Ship v1 of Playlist
- In v2, add UI for descriptions, disabled behind a feature flag. Migrate database to add new description column. Ship.
- Enable feature flag. Users can now use descriptions.
- Disable feature flag. Users cannot use descriptions, but the code is still in the application and the column is still in the db.
- In v3, remove dormant UI elements and logic. Migrate database to drop unwanted description column. Ship.
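Since a Java example was requested, here is a minimal sketch of what the v2 step might look like in code. Everything named here (`Playlist`, `PlaylistView`, `PlaylistPresenter`, and the `BooleanSupplier` standing in for a real flag client) is hypothetical, and it assumes Java 16+ for records. The point is only that the flag decides whether the description is exposed, while the nullable column stays in place either way.

```java
import java.util.Optional;
import java.util.function.BooleanSupplier;

// Hypothetical domain model: "description" maps to the nullable column added in v2.
record Playlist(long id, String name, String description) {}

// Hypothetical view model handed to the UI.
record PlaylistView(long id, String name, Optional<String> description) {}

class PlaylistPresenter {
    // Stands in for whatever flag client the application really uses.
    private final BooleanSupplier descriptionEnabled;

    PlaylistPresenter(BooleanSupplier descriptionEnabled) {
        this.descriptionEnabled = descriptionEnabled;
    }

    PlaylistView present(Playlist playlist) {
        // Flag off: behave like v1 and never expose the column, even if it holds data.
        Optional<String> description = descriptionEnabled.getAsBoolean()
                ? Optional.ofNullable(playlist.description())
                : Optional.empty();
        return new PlaylistView(playlist.id(), playlist.name(), description);
    }
}
```

Turning the flag off changes what `present` returns and nothing else; no migration runs and no data is lost, so turning it back on is safe.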
The problem of supporting multiple versions of the same system at once is at the root of this question, and it's one I'm actively pondering. Let's go one piece at a time!
What is a feature toggle?
Sorry other answer! Feature toggles need to do one thing only: decouple feature releases from deployments. How a toggle does so for any arbitrary change is the tricky bit, as this question identifies.
Generally, the system powering a toggle has to satisfy two conditions to be considered complete:
- Backward compatibility. Older features must still work regardless of this toggle's state. This includes the old behaviour of this feature!
- Forward compatibility. New features have to run properly regardless of this toggle's state.
For any non-trivial change, this is... challenging. That's one word for it anyway 🙄
Example 1: UI styling
Pretend you have a form whose input styles aren't accessible. Your job is to fix the styling to be accessible, but you use trunk-based development and your team expects all work to be integrated with your project's only branch, master, by the end of each day.
This change is extremely isolated. It affects:
- A single form
- No functionality (i.e. no business logic has to change)
- No non-functional requirements (this won't introduce scaling issues, for example)
As a result, switching which CSS stylesheet is loaded based on the state of a toggle is enough:
- Backward compatibility: There's no functionality affected, so old stuff should work by default. Turning the flag off requires no special behaviour, so simple conditional logic is all that's needed.
- Forward compatibility: This is trickier, but assuming the form is properly DRYed out, any new inputs will automatically inherit the styles indicated by the flag. Assuming good separation of concerns, changes to these styles won't affect any other components and vice versa.
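The "simple conditional logic" for this example could be as small as the following sketch. The class name and both stylesheet paths are made up for illustration.

```java
class FormStylesheet {
    // Hypothetical asset paths; substitute whatever the project actually serves.
    static final String LEGACY = "/css/form.css";
    static final String ACCESSIBLE = "/css/form-accessible.css";

    // Picks the stylesheet to load based on the toggle's current state.
    static String href(boolean accessibleStylesEnabled) {
        return accessibleStylesEnabled ? ACCESSIBLE : LEGACY;
    }
}
```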
Example 2: A new form field
I hope you enjoyed that beautiful simplicity, because we're now in trouble. This is exactly the kind of case OP is describing.
This change spans multiple systems. It affects (at a minimum):
- The UX form
- The back-end's data API, since there's a new field
- The database layer, since there's a new field
Such a small difference makes things much more challenging. We'll go system by system here.
UX form:
- Backward compatibility: Identical to the previous example. If this is truly a new field, then old code shouldn't care. Any code path that does has to be covered by this feature toggle.
- Forward compatibility: The major concern here is that a field could exist one day, then be gone the next when the switch is flipped back. New logic may require a default be set in the front-end state management, or be provided by the back-end.
Data API:
- Backward compatibility: This field represents a change to the API's contract. In order to support certain use-cases (validation comes to mind), defaults may need to be provided if the toggle is off. Otherwise, old stuff should be okay, though YMMV.
- Forward compatibility: Once again, the tricky part comes down to making sure there's something for new code to consume if this toggle gets turned off. In the worst cases, special conditional logic may need to be coded into new features to handle the case where the flag is turned off.
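To make the API part a bit more concrete, here is a rough sketch assuming a create endpoint whose response always carries a description. The names (`CreatePlaylistRequest`, `PlaylistResponse`, `PlaylistApi`), the empty-string default, and the `BooleanSupplier` flag check are all assumptions for illustration; it needs Java 16+ for records.

```java
import java.util.function.BooleanSupplier;

// Hypothetical request/response shapes; "description" is the new contract field.
record CreatePlaylistRequest(String name, String description) {}
record PlaylistResponse(String name, String description) {}

class PlaylistApi {
    // Assumed default; the right value depends on what consumers expect.
    static final String DEFAULT_DESCRIPTION = "";

    private final BooleanSupplier descriptionEnabled;

    PlaylistApi(BooleanSupplier descriptionEnabled) {
        this.descriptionEnabled = descriptionEnabled;
    }

    PlaylistResponse create(CreatePlaylistRequest request) {
        if (request.name() == null || request.name().isBlank()) {
            throw new IllegalArgumentException("name is required");
        }
        // Old clients omit the field (null); with the flag off, a submitted value is ignored.
        boolean useSubmitted = descriptionEnabled.getAsBoolean() && request.description() != null;
        String description = useSubmitted ? request.description() : DEFAULT_DESCRIPTION;
        return new PlaylistResponse(request.name(), description);
    }
}
```

Old clients that never send a description keep working in both flag states, and newer code always has a non-null value to consume.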
Database layer:
- Backward compatibility: At a database level, backwards compatibility requires us to only add optional fields. Requiredness can be enforced elsewhere in our application, but a schema can't be considered backwards compatible if it adds a new required field. Old inserts and updates will immediately break. So, your data migration adds a new, optional field. Easy?
- Forward compatibility: Okay, new code comes in. Should it expect the field or not? If it must, this is where defaults come into play. Note I'm not specifying what should declare the default, since this will depend on the application, but something has to be there. In the worst cases, special conditional logic will have to cover the possibility that the field is the default.
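For OP's "mandatory field" case, one way to read the above is: the migration only ever adds a nullable column, and the application enforces requiredness while the flag is on. A sketch under those assumptions follows; `DescriptionColumnPolicy`, the default value, and the flag supplier are hypothetical names, not anything from Flyway, Liquibase, or an ORM.

```java
import java.util.function.BooleanSupplier;

class DescriptionColumnPolicy {
    // Assumed default for rows written while the flag was off.
    static final String DEFAULT_DESCRIPTION = "";

    private final BooleanSupplier descriptionRequired;

    DescriptionColumnPolicy(BooleanSupplier descriptionRequired) {
        this.descriptionRequired = descriptionRequired;
    }

    // Value to store in the nullable column when inserting a new row.
    String valueForInsert(String submitted) {
        if (!descriptionRequired.getAsBoolean()) {
            return null; // old write path: the column is simply left NULL
        }
        if (submitted == null || submitted.isBlank()) {
            throw new IllegalArgumentException("description is required while the flag is on");
        }
        return submitted.strip();
    }

    // Value handed to application code when reading a row.
    String valueOnRead(String stored) {
        return stored == null ? DEFAULT_DESCRIPTION : stored;
    }
}
```

Because the schema never requires the column, old inserts keep working, and rows written while the flag was off read back as the default instead of breaking newer code.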
This sounds like madness! How do I keep it manageable?
There are three major principles to follow here to keep things sane:
- Keep your changes as small as possible, and refactor pain points that prevent small changes. This means more flags, but less complexity. Incremental improvement is the name of the game.
- Consider long-lived flags to be critical technical debt. Your flags shouldn't last long in production. Have clear rules about when a change is "stable", and a clear window to clean up related flags. Cleaning up your flags as a regular part of your maintenance is essential for controlling how many code paths need to be supported.
- Don't be dogmatic! Throw away extremes, and use long-lived feature branches when you have to. Some changes are too complex already, and the extra overhead of the flagging isn't worth it. If you follow points #1 and #2, this should happen less and less.
Best of luck!