Mastering feature flags: take control of your production features

Gabriela Cavalcante
August 29, 2024

As software complexity grows, the challenge of safely delivering new features increases. The feature flag is one powerful yet simple technique we use at Vinta to evolve our systems without overwhelming our teams or our users. In this article, we will share practical examples of the benefits you can achieve with it and the insights we gained along the way. Hugo Bessa has also written another article that explains how to implement feature flags and recommends tools.

So, what are feature flags? They are a development technique that allows developers to toggle a feature on and off, or show different versions of a feature, without deploying new code. This control over feature releases offers a flexible, risk-mitigating approach to software development. With the simple flip of a switch, you can release new functionality quickly and safely.

if flag_is_active('NEW_PRICING_CALCULATION_ENABLED'):
    run_new_algorithm()
else:
    run_old_algorithm()

In practical terms, feature flags are simply if-else statements controlling a flow according to the switch state. But how can they revolutionize your development workflow? And, mainly, what trade-offs do we need to be careful about when we implement them in the real world?
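To make the snippet above concrete, here is a minimal sketch of what could sit behind flag_is_active. The in-memory FLAGS dictionary, the calculate_price wrapper, and the placeholder algorithms are assumptions for illustration; real projects usually keep flag state in a database, a config service, or a feature flag library.

```python
# Hypothetical in-memory flag store, for illustration only.
FLAGS = {"NEW_PRICING_CALCULATION_ENABLED": False}


def flag_is_active(name: str) -> bool:
    """Return the current state of a flag, treating unknown flags as off."""
    return FLAGS.get(name, False)


def run_old_algorithm(price: float) -> float:
    return price  # placeholder for the existing behavior


def run_new_algorithm(price: float) -> float:
    return round(price * 0.9, 2)  # placeholder for the new behavior


def calculate_price(price: float) -> float:
    """Pick the code path according to the switch state."""
    if flag_is_active("NEW_PRICING_CALCULATION_ENABLED"):
        return run_new_algorithm(price)
    return run_old_algorithm(price)
```

Flipping the value in FLAGS changes which branch runs on the next call, with no new deployment. Treating unknown flags as off is a design choice that keeps unfinished code hidden by default.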

The scenario

Imagine you’re starting a new task. Let's say you are implementing a dark-mode version of a site. You branch out to start coding. You first notice you will need to add an external lib to the project to handle the behavior. After adding it, you start testing and changing all the components that will be affected. You then notice you will need a database migration to create new fields that control whether each user's default mode will be dark or light.

Then, you dive into refactoring a particularly chaotic function before adding your new code. Finally, you write some tests to safeguard your changes. All that work goes on for a few days, and after it's all done, you open your pull request and - oh god! The diff is huge. 

If you’ve been coding long enough, you know the drill. Working on long-running branches can feel like you’re building a whole world in a silo. Another headache happens when it's time to merge all that code with other long-running branches. It’s a mess to review, merge, debug, and understand.

The solution is to integrate your code in short cycles, and to enable that, you need to break the work into small batches. But how can you deliver code that is still a work in progress?

Here’s a game-changer: feature flags. With feature flags, you can deliver in manageable increments. This approach significantly reduces the time needed to merge, review, and receive feedback, lowers the risk of conflicts, and leads to fewer deployment nightmares.

Feature Flags benefits

Synchronize code seamlessly

Going back to our initial scenario, let’s think about strategies to prevent the problems introduced by long-running and isolated branches. First, create the new fields in your models and open a PR for that change alone. This can be reviewed and merged independently of your larger changes.

Next, you could wrap the refactor of the chaotic function under the feature flag. Ask reviewers to focus just on this change. Once approved, you can merge your refactored code with the main code and deploy it without worry as long as you keep the feature flag disabled.

Now, you can implement the dark mode for each component in small batches, each with its own PR. This simplifies the review process and keeps changes manageable. You can also break the functionality into blocks that can be turned on for specific tester users, allowing you to catch issues early with a shorter feedback loop. We will come back to this benefit shortly.

Integrating feature flags into your workflow transforms a daunting review and deployment process into a streamlined, safer, and more flexible workflow. Each step forward is secure, manageable, and less prone to errors, making your development cycle more efficient and less stressful.

Decouple deployment from release

The most immediate benefit of feature flags is the ability to separate code deployment from its release. Imagine deploying code that isn't immediately visible to all users but can be activated when needed. This flexibility allows development and release cycles to progress independently, streamlining the process and enhancing team efficiency.

Let's bring back our dark-mode feature! By wrapping this change in a feature flag, the deployment can proceed without exposing the new functionality to users. This separation ensures that ongoing development isn't bottlenecked by staggered release schedules, allowing for continuous delivery and more frequent deployments without disrupting user experience.

Impact reduction 

But what do I mean when I say “without disrupting user experience”? Suppose a critical issue or incident happens during the dark mode release. For example, the external dependency you introduced for the feature breaks other libraries, or the UI doesn’t behave as expected.

Resolving such issues might involve reverting the pull request, running the build and the CI, and deploying again, consuming valuable time and disrupting the user experience. However, with feature flags, the situation is handled much more smoothly. You can turn off a feature without needing a new deployment to roll back to a previous version. This reduces downtime and minimizes the impact on users.

A/B testing with real users

By using feature flags, you can gradually enable the dark mode feature for a subset of users.  You can enable this interface for just 10% of your users, specifically those who might benefit most or are more receptive to the new change, making it possible to monitor how the feature performs in a controlled environment.

It’s possible to plan these controlled rollouts by percentage or by attribute, such as device type, size, location, etc. This targeted approach allows you to collect valuable feedback and identify any issues before a full rollout, significantly reducing risk. If you are eager to know more about A/B tests and how to perform them, read this great article.
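One common way to implement a percentage rollout is to hash the user id into a stable bucket, so the same user always gets the same answer. The sketch below assumes integer user ids, and the function names are hypothetical; feature flag tools typically offer this behavior out of the box.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: int) -> int:
    """Map a user to a stable bucket from 0 to 99 for a given flag."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def flag_is_active_for_user(flag_name: str, user_id: int, percentage: int) -> bool:
    """Enable the flag for roughly `percentage` percent of users."""
    return rollout_bucket(flag_name, user_id) < percentage
```

Including the flag name in the hash means different flags pick different subsets of users, so the same people are not always the first to receive every experiment. Raising the percentage only adds users; anyone already in the rollout stays in it.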

The real-time feedback obtained during these controlled rollouts is invaluable. It enables your team to make adjustments based on actual user experiences rather than assumptions. This proactive issue identification helps fine-tune the feature to meet user needs better and ensures a smoother, more successful launch.

While Feature Flags are great for reducing risk and enabling continuous delivery, there are some concerns to be aware of during the implementation. 

Feature Flags: best practices and practical insights from our experience

Implementing new tools, architectures, or patterns like feature flags isn't just about adoption; it’s about strategic integration. Some challenges only become apparent once you use the technique in practice. We want to share some lessons from our experience using feature flags in various projects.

Let’s elaborate on the hard part: naming things 

Naming is a common struggle in many areas of software development, and creating a new flag is no exception. Clear and consistent naming conventions are critical to keeping feature flags understandable and manageable, especially as your project grows. Here’s how you can develop effective naming practices:

  • Descriptive Names: Always choose clear and descriptive names over short but ambiguous ones. For example, instead of NEW_FEATURE_ENABLED, use DARKMODE_ON_DASHBOARD_ENABLED. This immediately informs any developer of the flag's purpose.
  • Avoid Double Negatives: When enabled, names like DISABLE_NEW_DASHBOARD can create confusion. To maintain clarity, it’s better to use positive assertions such as NEW_DASHBOARD_ENABLED.
  • Use patterns to identify usage: If you have flags with different purposes, use prefixes like TEMP_ for flags intended as temporary measures or OPS_ for flags focused on infrastructure changes. A TEMP_ prefix, for example, emphasizes that the flag is expected to be removed in the future.
  • Scope and Context: Ensure that the flag’s name reflects its scope and context. Avoid vague terms that do not specify what the flag controls or the conditions under which it applies. For example, avoid IMPROVEMENTS_ENABLED. That doesn’t say anything about which improvement it controls.
  • Avoid Name Reuse: Reusing names from old flags can lead to serious incidents if someone mistakenly changes the wrong flag. Always create new, unique names for new features.
  • Prevent Misuse: Inaccurate flag naming can lead to errors, such as someone misunderstanding a name and disabling a flag that should stay enabled.
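Some of the conventions above can be checked automatically, for example in a pre-commit hook or a unit test. The following is only a sketch: the _ENABLED suffix requirement, the list of negative words, and the check_flag_name helper are assumptions, so adapt them to your own conventions.

```python
import re

# Hypothetical conventions, loosely based on the list above.
NAME_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*_ENABLED$")
NEGATIVE_WORDS = {"DISABLE", "DISABLED", "NOT", "NO", "HIDE"}
VAGUE_NAMES = {"NEW_FEATURE_ENABLED", "IMPROVEMENTS_ENABLED"}


def check_flag_name(name: str, retired_names: set[str]) -> list[str]:
    """Return a list of naming problems; an empty list means the name passes."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append("use UPPER_SNAKE_CASE ending in _ENABLED")
    if NEGATIVE_WORDS & set(name.split("_")):
        problems.append("prefer a positive assertion")
    if name in VAGUE_NAMES:
        problems.append("name does not say what the flag controls")
    if name in retired_names:
        problems.append("name was used by an older flag")
    return problems
```

A check like this cannot judge whether a name is truly descriptive, but it catches the mechanical mistakes (negated names, reused names) before they reach production.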

The big challenge: keeping your code clean

As software grows and scales, we keep adding new features and new flags to control their flows. Maintaining them can quickly become a nightmare. Tracking the context of each flag will require a high cognitive effort, and the risk of introducing bugs will increase. Every time someone changes a piece of code with flags, they'll feel stressed about that. The tests will need to set different combinations of flags to guarantee consistency.  
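To illustrate the testing cost, the sketch below enumerates every on/off combination for a set of flags so a test can run against each one. The render_dashboard function is a hypothetical stand-in for the code under test.

```python
from itertools import product

FLAG_NAMES = ["DARKMODE_ON_DASHBOARD_ENABLED", "NEW_PRICING_CALCULATION_ENABLED"]


def all_flag_combinations(names):
    """Yield one {flag: state} dict per possible on/off combination."""
    for states in product([False, True], repeat=len(names)):
        yield dict(zip(names, states))


def render_dashboard(flags):
    """Stand-in for code under test that depends on two flags."""
    theme = "dark" if flags["DARKMODE_ON_DASHBOARD_ENABLED"] else "light"
    pricing = "v2" if flags["NEW_PRICING_CALCULATION_ENABLED"] else "v1"
    return f"{theme}:{pricing}"
```

Two flags give four combinations, and each additional flag on the same code path doubles that number, which is a good argument for removing flags as soon as they are no longer needed.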

Effective management of feature flags is essential to prevent them from becoming a source of technical debt and system inefficiency. Two main concerns helped us reduce the complexities associated with feature flags.

Understanding flag types and purposes

Distinguish between flags meant for short-term experiments or beta features and those intended for long-term use. If a flag becomes permanent, or you already know it won’t be removed, consider transitioning it to a system setting or user configuration option. Flags add complexity to your code, so avoiding permanent complexity will help with maintainability.

For flags that are genuinely needed on a long-term basis, such as switches that disable a feature entirely, ensure they are clearly documented and managed. For example, suppose you have an integration with a third-party system you want to gracefully disable in case of failure.

In this situation, the switch is expected to be permanent. One approach to making things obvious is to choose a clear name to identify that, like the OPS_ prefix. 
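A permanent operational switch of this kind could look like the sketch below. The flag name, the shipping quote integration, and the flat fallback value are all hypothetical.

```python
# Hypothetical flag store and integration; names are illustrative only.
FLAGS = {"OPS_SHIPPING_QUOTES_INTEGRATION_ENABLED": True}


def flag_is_active(name: str) -> bool:
    return FLAGS.get(name, False)


def fetch_quote_from_partner(order_total: float) -> float:
    """Stand-in for a network call to a third-party system."""
    return round(order_total * 0.05, 2)


def get_shipping_quote(order_total: float, fallback: float = 10.0) -> float:
    """Use the partner quote when the integration is on, else a flat fallback."""
    if not flag_is_active("OPS_SHIPPING_QUOTES_INTEGRATION_ENABLED"):
        return fallback
    try:
        return fetch_quote_from_partner(order_total)
    except ConnectionError:
        # The partner is failing; degrade gracefully instead of erroring out.
        return fallback
```

When the partner has an outage, turning the switch off stops the calls entirely, so users get the fallback immediately instead of waiting for each request to fail.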

Managing flag lifecycles

Treat feature flags like technical debt and plan their removal as you would the removal of legacy code. Once a flag is enabled for everyone, the code path behind its disabled state is dead code in your project. Leaving it there makes the system harder to maintain, since too many feature flags can lead to undefined or untested system behaviors. So, regularly review your feature flags to remove outdated, unused, or fully enabled flags.

This process might involve aligning with the product team to assess risks associated with flag removal and planning with the development team. Assigning a specific person or team to manage the feature flag can help you control this process. The person responsible for creating it also monitors its impact, communicates status updates to stakeholders, and removes it after it's done.
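One lightweight way to support these reviews is to keep a small registry with an owner and a creation date for each flag, and to list the temporary flags that have outlived an agreed age. The FlagRecord structure and the 90-day default below are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class FlagRecord:
    name: str
    owner: str               # person or team responsible for removing the flag
    created_on: date
    permanent: bool = False  # e.g. OPS_ switches that are meant to stay


def flags_due_for_review(records, today, max_age_days=90):
    """Return names of temporary flags older than `max_age_days`."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        r.name
        for r in records
        if not r.permanent and r.created_on < cutoff
    ]
```

Running a check like this on a schedule, and sending the result to each flag's owner, turns flag cleanup into a routine task rather than something people have to remember.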

Don’t lose track

A robust monitoring and logging system allows teams to track the performance of their flagged features, monitor user engagements, and quickly identify and address any issues that may arise. 

Consider whether you need to track flags' creation dates or record who enabled them. Traceability can be crucial for auditing changes and understanding feature flag impact over time, especially in environments where many team members can alter flag settings. 
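If your flag tool does not record this for you, a thin wrapper around flag changes can provide a basic audit trail. This is a minimal in-memory sketch, and AuditedFlagStore is a hypothetical name; in practice the history would be persisted.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditedFlagStore:
    states: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def set_flag(self, name: str, enabled: bool, changed_by: str) -> None:
        """Change a flag and record who changed it and when."""
        self.history.append(
            {
                "flag": name,
                "old": self.states.get(name, False),
                "new": enabled,
                "changed_by": changed_by,
                "changed_at": datetime.now(timezone.utc),
            }
        )
        self.states[name] = enabled

    def is_active(self, name: str) -> bool:
        return self.states.get(name, False)
```

With the old and new values stored side by side, you can line up a flag change with a spike in errors or support requests when investigating an incident.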

Set up alerts to notify relevant teams when critical thresholds are breached. For example, an immediate alert can ensure prompt attention and action if a new feature under a flag causes a spike in error rates. Some monitoring tools integrate seamlessly with your feature flag platform and provide comprehensive monitoring and alerting capabilities.

Ensuring consistency, no matter the state 

Feature flags often control critical system states, introducing challenges in maintaining consistency. In our experience, some scenarios require extra attention.

Keep the data integrity 

Careful planning is essential when a new feature requires database changes, like creating new tables or changing existing structures. Consider the case where you must migrate a plain string stored in a text field to a more complex data structure, like another table with more associated data. You choose to use a feature flag to cover your changes because you want to make incremental changes and be careful about the deployment.

However, there is a tricky detail about this case: the flag controls flows that don't share the same source of data. When your flag is off, you fetch from the old field, and if it’s on, from the new field. How do you guarantee both flows will be consistent and that you can safely change the flag state without incidents?

When we talk about planning, that means thinking about the whole behavior of your code, independent of the flag state. One approach to safely handle this type of change with flags is redundancy. You can decouple data ingestion from flag states by simultaneously writing data to both the old and new structures regardless of the flag's state.

Use the feature flag only to determine which data structure to read from, not which one to write to. This ensures that your system operates consistently regardless of the feature flag being turned on or off.

This strategy prevents the need for large-scale data migrations triggered by flag changes. You can quickly identify discrepancies (for example, a code path that still updates only the old field) and address them without disrupting the system’s functionality. Moreover, avoid deleting data before you are 100% sure the new flow is working correctly. If you need to turn the flag off, the old data must be there.
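The dual-write approach can be sketched as follows. The dictionaries stand in for the old text field and the new table, and the flag name and address example are hypothetical.

```python
# Hypothetical in-memory stand-ins for the old text column and the new table.
FLAGS = {"TEMP_STRUCTURED_ADDRESS_ENABLED": False}
old_address_field = {}   # user_id -> "street, city"
new_address_table = {}   # user_id -> {"street": ..., "city": ...}


def flag_is_active(name):
    return FLAGS.get(name, False)


def save_address(user_id, street, city):
    """Always write to both structures, regardless of the flag state."""
    old_address_field[user_id] = f"{street}, {city}"
    new_address_table[user_id] = {"street": street, "city": city}


def get_address(user_id):
    """Only the read path depends on the flag."""
    if flag_is_active("TEMP_STRUCTURED_ADDRESS_ENABLED"):
        row = new_address_table[user_id]
        return f"{row['street']}, {row['city']}"
    return old_address_field[user_id]


def find_discrepancies():
    """Return user ids whose old and new records disagree or are missing."""
    mismatched = []
    for user_id in set(old_address_field) | set(new_address_table):
        row = new_address_table.get(user_id)
        new_value = f"{row['street']}, {row['city']}" if row else None
        if old_address_field.get(user_id) != new_value:
            mismatched.append(user_id)
    return sorted(mismatched)
```

Because both structures receive every write, flipping the flag in either direction returns the same data. A periodic discrepancy check like find_discrepancies helps surface code paths that still write to only one side.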

Don’t lose user trust 

Feature flags should enhance, not complicate, the user experience. Variability in user experience based on flag status can lead to confusion and unreliability. If the user sees a different behavior every time they open your system, this can increase support requests and affect user feedback. We found some strategies that helped us avoid a negative impact on the user experience.

  • Organization-level consistency: If users are grouped by organization, consider activating flags at the organization level rather than for individual users to maintain consistency across the user base. This prevents members of the same organization from seeing different behaviors and noticing them as bugs.
  • Communication strategies: Implement tooltips or notifications to inform users about changes or new features under test. This helps set expectations and reduces confusion.
  • Monitoring user reactions: Track which users have features enabled, and watch their reactions and system performance. This data is vital for assessing the impact of new features and making informed decisions about full-scale rollouts.
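For the organization-level strategy above, the flag check can key on the user's organization rather than on the user. The User type and the ENABLED_ORGANIZATIONS store in this sketch are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    id: int
    organization_id: int


# Hypothetical store: flag name -> ids of organizations with the flag enabled.
ENABLED_ORGANIZATIONS = {"DARKMODE_ON_DASHBOARD_ENABLED": {42}}


def flag_is_active_for(flag_name: str, user: User) -> bool:
    """Evaluate the flag by organization so teammates see the same behavior."""
    return user.organization_id in ENABLED_ORGANIZATIONS.get(flag_name, set())
```

Every member of organization 42 gets the same answer, so two colleagues comparing screens will not mistake the rollout for a bug.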

Managing asynchronicity

Imagine you have a feature flag that controls a new section in the weekly email updates about promotions. Those emails are dispatched through a task queue. You enable the flag and trigger the emails but then discover an error in the new section and immediately turn off the flag. However, emails continue to be sent with the error because all the tasks were queued before the flag was disabled.

def send_promotional_emails():
    if flag_is_active('SEND_PROMO_EMAIL'):
        for user in users:
            send_async_notification(user.id)

Working in the async world requires us to pay attention to consistency. Be careful about where you check your flag state and how you handle the side effects of that choice. For example, don’t pass the flag state as an argument to the task.

# Don't do this!
send_async_notification(user.id, flag_is_active('SEND_PROMO_EMAIL'))

def send_async_notification(user_id, send_promo_email):
    # send_promo_email was evaluated when the task was queued, so it may be stale
    if send_promo_email:
        ...  # logic to send email

Changing the flag state after you assign this task to the queue won’t impact the task. Check the feature flag status at a more granular level, ideally per individual task, to provide flexibility in halting operations if needed:

for user in users:
    send_async_notification(user.id)

def send_async_notification(user_id):
    # the flag is evaluated when the task runs, so a later change takes effect
    if flag_is_active('SEND_PROMO_EMAIL', user_id):
        ...  # logic to send email

By designing your asynchronous systems to check feature flag status at the point of task execution rather than at the time of task creation, you can significantly reduce the risk of executing outdated or erroneous tasks.
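The difference is easy to see in a small simulation. Here a deque stands in for a real task queue, and the helper names (enqueue_promotional_emails, run_next_task) are hypothetical; the point is only that each task reads the flag when it executes.

```python
from collections import deque

FLAGS = {"SEND_PROMO_EMAIL": True}
task_queue = deque()   # stand-in for a real task queue
sent_emails = []


def flag_is_active(name):
    return FLAGS.get(name, False)


def send_async_notification(user_id):
    """Task body: the flag is checked when the task runs, not when queued."""
    if flag_is_active("SEND_PROMO_EMAIL"):
        sent_emails.append(user_id)


def enqueue_promotional_emails(user_ids):
    for user_id in user_ids:
        task_queue.append((send_async_notification, user_id))


def run_next_task():
    task, user_id = task_queue.popleft()
    task(user_id)
```

If you queue three emails, run one task, and then turn the flag off, the two remaining tasks still run but send nothing. The queue drains normally and the faulty email stops going out.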

Conclusion 

We've been using feature flags for a long time at Vinta and have learned how powerful this technique is, as long as you pay attention to keeping complexity under control.

I hope these insights can empower you to use feature flags to their full potential, enhance flexibility, reduce risks, and enable more innovative and responsive development practices.