Event-Driven Services in the Cloud: Azure Event Grid, AWS EventBridge, and Google Eventarc

I mentioned Azure Event Grid in a scenario with D365FO Business Events in a previous blog post. It is a Platform as a Service (PaaS) capability in Azure, described variously as an eventing platform or event bus (I see different terms used for the service), that allows you to centrally manage events. In addition, it supports event filtering based on event type, subject prefix, or suffix, so your application only receives the events that are relevant to it.

Whether you want to handle built-in Azure events, such as a file being added to storage, or create your own custom events and event handlers, Event Grid supports both options via the same underlying model. Thus, regardless of the service or use case, intelligent routing and filtering capabilities apply to every event scenario and ensure that your apps focus on core business logic rather than worrying about event routing.

In this blog post, I would like to dive into Azure Event Grid and the competing offerings from the two other big cloud providers, AWS and Google.

Azure Event Grid

In 2017, Microsoft introduced Azure Event Grid as a fully managed event routing service and claimed it was the first of its kind in the public cloud. Dan Rosanova, previously Principal Program Manager Lead at Microsoft and now Director of Program Management at Confluent, said in an InfoQ news item on the introduction:

Azure Event Grid fills a gap in the current cloud messaging space, not just in Azure but also across all cloud providers. We have services for messaging, queuing, and telemetry, but nothing for comprehensive eventing, particularly for cross-service or cross-cloud scenarios.

Within Azure, services that support Event Grid generate events that are routed to one or more event handlers, ranging from Azure Functions to webhooks, with support for event filtering and reliable delivery. Furthermore, under the hood, the service relies on Service Fabric and can thus scale automatically to handle millions of events per second.

Event Grid model of sources and handlers

Source: https://docs.microsoft.com/en-us/azure/event-grid/overview

The Event Grid concept revolves around events emitted from a source (publisher) – an Azure service or a third-party source – that adhere to the event schema (the proprietary Event Grid schema or the CNCF CloudEvents schema). For example, IoT Hub, Storage, and others are all event publishers in Azure. The events are sent to a topic in Event Grid, and each topic can have one or more subscribers (event handlers). A topic can be set up by the event publisher, or it can be a custom topic for custom events. Finally, event handlers respond to and process the events. Functions, WebHooks, and Event Hubs are examples of event handlers in Azure.
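To make the publisher side concrete, here is a minimal sketch of sending a custom event to a custom Event Grid topic from .NET using the Azure.Messaging.EventGrid package; the topic endpoint, access key, and event type are placeholder values.

using System;
using System.Threading.Tasks;
using Azure;
using Azure.Messaging.EventGrid;

class PublishCustomEvent
{
    static async Task Main()
    {
        // Placeholder endpoint and access key of a custom Event Grid topic.
        var client = new EventGridPublisherClient(
            new Uri("https://my-custom-topic.westeurope-1.eventgrid.azure.net/api/events"),
            new AzureKeyCredential("<topic-access-key>"));

        // A custom event using the default Event Grid schema.
        var orderEvent = new EventGridEvent(
            subject: "orders/12345",
            eventType: "Contoso.Orders.OrderCreated",
            dataVersion: "1.0",
            data: new { OrderNumber = "12345", Amount = 99.95 });

        // Event Grid routes the event to all subscriptions whose filters match it.
        await client.SendEventAsync(orderEvent);
    }
}

Any subscription on the topic with a matching event type or subject filter then receives the event and hands it to its configured handler.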

Azure Event Grid became generally available (GA) in February 2018, and Clemens Vasters, Principal Architect for Messaging Services at Microsoft, said:

Event Grid is catching everyone’s attention because it unlocks new architectural possibilities for cloud platforms and applications: it’s the glue that enables information flow between services, and Event Grid allows expanding the capabilities of existing services by extension.

That also got the attention of AWS, which released EventBridge in July 2019, labeled as a serverless event bus that allows AWS services, Software-as-a-Service (SaaS) applications, and custom applications to communicate with each other using events.

Since GA, Azure Event Grid has received several updates, including advanced filtering, retry policies, and support for CloudEvents. More details and samples are available in the Microsoft documentation and on GitHub. Note that Clemens also has an introductory paper on Azure Event Grid available on GitHub.

AWS EventBridge

You can use EventBridge to build and manage event-driven solutions by centrally controlling event ingestion, delivery, security, authorization, and error handling. Furthermore, you do not have to manage any infrastructure or scaling, and you only pay for the events that your applications consume, similar to Azure Event Grid. The concepts are largely the same, too.

How Amazon EventBridge connects applications using events

Source: https://aws.amazon.com/eventbridge/

However, Amazon EventBridge surpasses Azure Event Grid in features (as you can see from the diagram above). It has a schema registry that allows you to discover, create, and manage OpenAPI schemas for events on EventBridge. According to the documentation, you can find schemas for existing AWS services, create and upload custom schemas, or generate a schema based on events on an event bus. Furthermore, EventBridge enables you to generate and download code bindings for all event schemas to help you quickly build applications that use those events.

Next to the schema registry, the service integrates easily with third-party services like Zendesk, PagerDuty, and SignalFx. Amazon has set up an extensive partner program for these integrations. Event Grid supports partner events too (still in preview), yet it currently has only one partner, Auth0.

Another capability Amazon EventBridge offers is event archive and replay – allowing you to archive events so that you can easily replay them later by starting an event replay. This capability is not available in Azure Event Grid, although you can find something similar in Azure Event Hubs. You can configure the archive capability from the actions menu in the EventBridge console and set the events' retention period (indefinite or a fixed number of days). Optionally, you can set a pattern-matching filter that determines which events to archive. Later, when events have run through the event bus, you can replay them by selecting the appropriate archive.
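As a rough idea of how that looks programmatically, below is a minimal sketch of creating an archive with the AWS SDK for .NET (assuming the AWSSDK.EventBridge package); the archive name, event bus ARN, retention period, and event pattern are placeholder values.

using System;
using System.Threading.Tasks;
using Amazon.EventBridge;
using Amazon.EventBridge.Model;

class CreateEventArchive
{
    static async Task Main()
    {
        var client = new AmazonEventBridgeClient();

        var response = await client.CreateArchiveAsync(new CreateArchiveRequest
        {
            ArchiveName = "orders-archive",                                                // placeholder name
            EventSourceArn = "arn:aws:events:eu-west-1:123456789012:event-bus/orders-bus", // placeholder bus ARN
            RetentionDays = 30,                                                            // 0 means retain events indefinitely
            EventPattern = "{\"source\":[\"com.demo.orders\"]}"                            // optional filter for which events to archive
        });

        // Archived events can later be replayed onto the bus with StartReplayAsync.
        Console.WriteLine($"Created archive: {response.ArchiveArn}");
    }
}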

Sample Implementation with AWS EventBridge

Since the inception of Event Grid, I have followed its evolution and have written and presented about it. Moreover, I have followed its competing solution on AWS and, besides writing about it on InfoQ, built a simple demo around it using .NET in combination with AWS EventBridge. Below you will find a diagram of the demo I created.

Amazon EventBridge Demo

From .NET code, I send an event to a custom event bus that contains a rule routing the event to a destination, an Amazon Simple Queue Service (SQS) queue. Subsequently, an AWS Lambda function can poll the queue and receive the message – the diagram below shows the steps up to the SQS queue.

EventBridge Demo Steps
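To give an impression of the .NET side of the demo, below is a minimal sketch of publishing an event to a custom EventBridge event bus using the AWSSDK.EventBridge package; the bus name, source, and detail-type are placeholders and should match the rule configured on the bus.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Amazon.EventBridge;
using Amazon.EventBridge.Model;

class PublishToEventBridge
{
    static async Task Main()
    {
        var client = new AmazonEventBridgeClient();

        var request = new PutEventsRequest
        {
            Entries = new List<PutEventsRequestEntry>
            {
                new PutEventsRequestEntry
                {
                    EventBusName = "demo-event-bus",        // placeholder custom bus
                    Source = "com.demo.orders",             // placeholder source, matched by the rule
                    DetailType = "OrderCreated",            // placeholder detail-type
                    Detail = "{ \"orderNumber\": \"12345\", \"amount\": 99.95 }"
                }
            }
        };

        // The rule on the custom bus matches the event and forwards it to the SQS queue target.
        var response = await client.PutEventsAsync(request);
        Console.WriteLine($"Failed entries: {response.FailedEntryCount}");
    }
}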

You can find a live demo on YouTube demoing the above (around minute 19). Furthermore, you can look at other samples in the AWS documentation or on GitHub.

Google Eventarc

With Azure and AWS offering a service to centrally manage events, Google followed in October 2020 with Eventarc to provide customers with a service to connect Cloud Run services with events from various sources, adhering to the CloudEvents standard. It became generally available in January 2021.

Eventarc's underlying delivery mechanism is Pub/Sub, which includes topics and subscriptions similar to the previously discussed Event Grid and EventBridge. Event sources create events and publish them in any format on a Pub/Sub topic. The events are then delivered to the Cloud Run sinks. For applications running on Cloud Run, you can use Eventarc to let a Cloud Storage event (via Cloud Audit Logs) trigger a data processing pipeline, or let an event from a custom source (published to Cloud Pub/Sub) signal between microservices.

Eventarc Overview

Source: https://codelabs.developers.google.com/codelabs/cloud-run-events#1

The diagram above shows what Google hopes to achieve with Eventarc. Currently, you can use a Cloud Run service as a destination, and recently Cloud Run for Anthos has been added. Additionally, you can leverage a UI in the Google Cloud console that allows you to view, edit, and delete Eventarc triggers. Lastly, you can find more details and samples on GitHub.
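As an impression of the custom-source scenario, here is a minimal sketch of publishing a message to a Pub/Sub topic from .NET using the Google.Cloud.PubSub.V1 package; an Eventarc trigger on that topic could then route the event to a Cloud Run service. The project and topic ids are placeholders.

using System;
using System.Threading.Tasks;
using Google.Cloud.PubSub.V1;

class PublishToPubSub
{
    static async Task Main()
    {
        // Placeholder project and topic; Eventarc wraps the message in a CloudEvent for the Cloud Run sink.
        var topicName = TopicName.FromProjectTopic("my-gcp-project", "orders-topic");
        var publisher = await PublisherClient.CreateAsync(topicName);

        string messageId = await publisher.PublishAsync("{ \"orderNumber\": \"12345\" }");
        Console.WriteLine($"Published message {messageId}");

        // Flush any pending messages before the process exits.
        await publisher.ShutdownAsync(TimeSpan.FromSeconds(10));
    }
}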

CloudEvents Schema

Before I end this blog post with some conclusions, I would like to discuss the CloudEvents schema. CloudEvents is an open-source specification for consistently describing event data, intended to make event declaration and delivery easier across services, platforms, and beyond. The Cloud Native Computing Foundation (CNCF) is the driving force behind the specification, which reached the version 1.0 milestone in October 2019.

Clemens Vasters, Principal Architect Messaging Services at Microsoft, stated in an InfoQ news item on CloudEvents:

The goal was to provide an industry definition and open framework for what an “event” is, what its minimal semantic elements are, and how events are encoded for transfer and how they are transferred and do so using the major encodings and application protocols in use today rather than inventing new ones.

Earlier, I mentioned that Azure Event Grid has its own proprietary schema and also supports the CloudEvents schema. The differences between the two are shown below:

CloudEvent vs Event Grid Schema

Note that Azure Event Grid and Google Eventarc support the CloudEvents schema; however, AWS EventBridge does not, which leaves any CloudEvents mapping up to you as custom work.
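To make the schema difference tangible, here is a minimal sketch of publishing a custom event using the CloudEvents schema with the Azure SDK (the CloudEvent type from the Azure.Messaging namespace), assuming the custom topic was created with the CloudEvents v1.0 input schema; the endpoint, key, source, and type are placeholders.

using System;
using System.Threading.Tasks;
using Azure;
using Azure.Messaging;
using Azure.Messaging.EventGrid;

class PublishCloudEvent
{
    static async Task Main()
    {
        // The custom topic must be configured for the CloudEvents v1.0 schema.
        var client = new EventGridPublisherClient(
            new Uri("https://my-cloudevents-topic.westeurope-1.eventgrid.azure.net/api/events"),
            new AzureKeyCredential("<topic-access-key>"));

        // source, type, and data map to the CloudEvents attributes shown in the comparison above.
        var cloudEvent = new CloudEvent(
            source: "/contoso/orders",
            type: "Contoso.Orders.OrderCreated",
            jsonSerializableData: new { OrderNumber = "12345", Amount = 99.95 });

        await client.SendEventAsync(cloudEvent);
    }
}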

Conclusion

From this blog post, you can probably conclude that AWS with EventBridge delivers a more complete event bus or eventing platform in the cloud than Event Grid and Eventarc. If I rank them based on features and maturity, AWS comes first, Azure second, and Google third. The services overlap in concepts, yet implementation, support, and features differ dramatically. Interestingly, they all support events for changes in their respective storage services: Azure Event Grid supports events such as when blobs are created, EventBridge supports S3 notifications, and Eventarc offers triggers for Cloud Storage. You can think of various scenarios combining storage and events, for instance, the pipes and filters pattern implementation discussed in my first blog post.

Secret Management in the Cloud

I have been using Azure Key Vault for secret management for the last two or three years in my projects, or advising the peers, clients, and colleagues I work with to do so. Azure Key Vault is a service for storing and managing secrets, governed by access policies, with the ability to access them from .NET code. Moreover, it is not just .NET; a service principal can also access it to get a secret for establishing a connection or a pipeline. The secrets can be API keys, connection strings, credentials, certificates, and so on. In this blog post, I would like to discuss a secret management use case and dive into its details.

Use Case: Key Vault and D365 FO Business Events

In a recent project about unlocking data from a Dynamics 365 Finance and Operations (FO) instance, I leveraged the concept of Business Events, where a Logic App subscribes to a specific event published on a custom Event Grid topic. Let me further explain the scenario and where Key Vault comes into play. Below you see a diagram of the integration between D365 FO and a third-party system. The latter receives data from D365 based on a specific business event.

D365 FO Business Events

Within D365 FO, you can define a destination for a business event. As shown in the diagram, the destination is an Event Grid topic. When following the Microsoft documentation on Business Events and Event Grid, you will notice that a Key Vault is required to keep the access key of the Event Grid topic as a secret. Furthermore, you will need to create a so-called app registration in Azure Active Directory. Azure app registrations are a simple and effective way to configure authentication and authorization workflows for many client types – in this case, a client identifying D365, allowing access to the Key Vault instance to extract the access key for the custom Event Grid topic.

Once the app registration is in place, the next step is to add it to the access policies in the Key Vault instance. The registration represents D365, and it needs access to the Key Vault to extract the access key for the Azure Event Grid topic. The app registration only requires the Get and List secret permissions to retrieve the Key Vault secrets.

Once the app registration and policy are in place, the custom Event Grid topic is available, and its access key is stored as a secret in Key Vault, the next step is the endpoint configuration. The screenshot below shows the configuration of an actual endpoint (destination) for the events – the custom Event Grid topic.

Business Event Endpoint Configuration

To configure the endpoint (destination), you need to provide a name. The endpoint type is filled in by default, followed by the endpoint URL (the destination endpoint, i.e., the Event Grid topic URL) and then the details for the Key Vault. These details are the client id of the app registration, its secret, the DNS name of the Key Vault instance, and the Key Vault secret name – the secret that holds the access key to the custom Event Grid topic. Finally, you can press OK to create the endpoint. Once the endpoint is created, you can attach it to the necessary business event and activate it.

Once the endpoint is active and a specific business event is attached to it, the event will end up with the subscriber – the Logic App. An example of a business event is shown below:

{
  "BusinessEventId": "PurchaseOrderConfirmedBusinessEvent",
  "ControlNumber": 5637365024,
  "EventId": "9D42A382-12E8-48F6-9BB2-29A1G4E39773",
  "EventTime": "/Date(1642759229000)/",
  "LegalEntity": "fnl1",
  "MajorVersion": 0,
  "MinorVersion": 0,
  "PurchaseJournal": "PO1-002342-11",
  "PurchaseOrderDate": "/Date(1642723200000)/",
  "PurchaseOrderNumber": "PO1-002342",
  "PurchaseType": "Purch",
  "TransactionCurrencyAmount": 1553.46,
  "TransactionCurrencyCode": "EUR",
  "VendorAccount": "IFF1095"
}

The Logic App can use these details to retrieve more information (through OData calls) about, in this case, the purchase order. As shown in the diagram, it then sends the enriched JSON to a Service Bus queue to hand it over to another Logic App, which transforms it into XML to be sent to Basware (a provider of software for financial processes, purchase-to-pay, and financial management).

Managing Key Vault

To properly set up the process around Key Vault and secrets, the administrator (Azure Ops) is responsible for creating the app registration and managing the Key Vault. Moreover, in my view, this person is also the one who does the endpoint configuration. The integration developer then only needs to connect the Logic App to the Event Grid topic. Similarly, an SFTP connection requiring credentials or certificates can also leverage the Key Vault and the same administrator.

The diagram below shows what the administrator can do regarding the app registration and managing the Key Vault instance. It also shows the authentication process from the application side – in our case, creating the endpoint from D365. D365 uses the app registration to authenticate against Azure AD and retrieve a token necessary to access the Key Vault secret.

Key Vault Management
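To illustrate the same flow from code, below is a minimal sketch of retrieving the Event Grid topic access key from Key Vault with an app registration (client credentials flow), using the Azure.Identity and Azure.Security.KeyVault.Secrets packages; the tenant, client, vault, and secret names are placeholders.

using System;
using System.Threading.Tasks;
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

class GetTopicAccessKey
{
    static async Task Main()
    {
        // The app registration (client id + secret) authenticates against Azure AD to obtain a token.
        var credential = new ClientSecretCredential(
            tenantId: "<tenant-id>",
            clientId: "<app-registration-client-id>",
            clientSecret: "<app-registration-secret>");

        // The Get/List secret permissions in the access policy are sufficient for this call.
        var secretClient = new SecretClient(
            new Uri("https://my-keyvault.vault.azure.net/"),
            credential);

        KeyVaultSecret secret = await secretClient.GetSecretAsync("eventgrid-topic-access-key");
        Console.WriteLine($"Retrieved secret {secret.Name}");
    }
}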

One thing I would like to point out regarding this scenario is that business events might need to be set up again after a database refresh. Note that when the endpoint configuration fails, you can see an error like:

Unable to get secret from Key Vault DNS: <dns of the key vault instance> Secret name: <name secret>

In that case, either the app registration client id or secret is wrong, or worse, the app registration's client secret has expired (the error message will not tell you that!). A client secret expires (the maximum is two years). Hence, be aware that once it has expired, events will not reach the Event Grid topic, and errors will occur on the D365 side. Therefore, I recommend monitoring the expiration of the app registration's client secret, and the Key Vault secrets can have an expiry date as well – so keep an eye on that too!

Other Public Cloud Providers

Interestingly, Azure is not the only public cloud platform with a secret, certificate, and key management service. AWS actually has three services – AWS Secrets Manager, AWS Certificate Manager, and AWS CloudHSM. With AWS Secrets Manager, users can manage access to secrets using a fine-grained set of policies, control the lifecycle of secrets, and secure and audit secrets centrally. Furthermore, it is a managed service with a pay-as-you-go model available in most AWS regions. Sound familiar? Azure Key Vault is similar, right? Almost – Key Vault combines most of the capabilities found in the three AWS services mentioned earlier.

What about the Google Cloud Platform? On GCP, you will find Secret Manager, which also enables users to store and manage secrets, including policies and rotation. Furthermore, the service offers management of certificates. And lastly, GCP has a separate service for key management, Cloud Key Management Service (KMS).

A High-Level View of Cloud Governance

Something that intrigues me in the cloud is governance. Technical integration architect is the role/function I have in my current day-to-day job. Yet, while designing solutions, I usually do not think about governance or talk to a customer set on moving to the cloud – that is a cloud migration process, which I am generally not involved with. Still, it should have my attention, and now it does.

If the term sounds unfamiliar, you might ask: what is governance? You could look up the term on Wikipedia, where the first lines define it as a process of interactions through laws, norms, power, or language of an organized society over a social system such as a tribe, family, or formal or informal organization. Yet how does this relate to the cloud? Very simply, it is still a process of interactions; however, it is defined by what a cloud provider deems necessary to keep costs, access to data, consistency, and deployments under control.

Cloud providers like Microsoft, AWS, and Google can provide you with guidance regarding governance to manage costs, secure resources and access to data, and ensure consistency in the deployment of resources – each provides a framework for that:

The Google Cloud Adoption Framework whitepaper mentions governance regarding data, cost control, security, and cloud resource management, while the AWS CAF has governance as one of its six perspectives, and Microsoft has a Govern section in its Cloud Adoption Framework and a dedicated landing page.

Microsoft Cloud Adoption Framework

Source: https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/overview

I would now like to zoom in further on governance on Microsoft Azure, since I predominantly work as a (solution/integration) architect on that cloud platform. Furthermore, I will not look at the process extensively described in the CAF, but more at some of the services and capabilities available in Azure, and add some of my views and relevant resources I found.

Azure Resources

Microsoft provides Azure Policy to allow you to keep resources compliant. When a policy is assigned and triggered, it evaluates whether a resource adheres to its definition. You can use these policies to implement governance for resource consistency, regulatory compliance, security, cost, and management. For more details, see Azure Policy on GitHub.

Next to policies, tagging is another aspect of governance in Azure or any cloud platform. With tags, you can assign helpful information to any resource within your cloud infrastructure – usually information not included in the name or available in the overview of the resource. Tagging is critical for cost management, operations, and management of resources. More details on how to apply tags are available in the decision guide.

If you work at a company with many subscriptions, or the customer you work for has them, you can leverage management groups – a level of scope above subscriptions. They provide a way to organize subscriptions into containers and thus create a logical structure. Moreover, you can apply specific governance conditions to management groups, as each subscription in a group inherits them.

Diagram of a sample management group hierarchy.

Source: https://docs.microsoft.com/en-us/azure/governance/management-groups/overview

More details on management groups are available on the GitHub page.

Another intriguing service is Azure Resource Graph, a capability in Azure to query, explore, and analyze your cloud resources. It includes an Explorer you can use in the Azure portal, and it can also be used programmatically through the Azure CLI, Azure PowerShell, and the Azure SDK for .NET.

You can use Graph Explorer to explore resources based on your governance requirements and assess the impact of applying policies in your environments. The query language is based on the Kusto query language used by Azure Data Explorer. More details are available on the GitHub page.

And lastly, Azure Blueprints can enable you to define a repeatable set of Azure resources that implements and adheres to an organization’s standards, patterns, and requirements. As a result, you can orchestrate the deployment of various resource templates and other artifacts such as the earlier mentioned policies, role assignments, ARM templates, and resource groups in a declarative way. With blueprints, you can consistently deploy predefined environments. Other public cloud providers offer blueprints as well: AWS Blueprints and GCP Blueprints. You can find more details on Azure blueprints on GitHub.

Cost Management

The Cost Management + Billing service and its features are available in any subscription in the Azure portal. It allows you to perform administrative tasks around billing, set spending thresholds, and proactively analyze Azure costs. A key aspect of cost control is setting up budgets at the beginning, once a subscription exists and before workloads land or resources are provisioned for the development of cloud solutions. Furthermore, once consumption of Azure resources starts, you can look at recommendations for cost optimization. Moreover, Azure Advisor can help identify underutilized or unused resources that can be optimized or shut down.

Example of the Subscription Overview tab showing Offer and Offer ID

Source: https://docs.microsoft.com/en-us/azure/cost-management-billing/costs/understand-cost-mgt-data

Security

An essential aspect of governance is security, for example, who gets access to which resources in Azure. A consistent way to set that up is by applying the earlier mentioned blueprints. Azure AD plays a role as well when you add accounts, service principals (an identity created for use with applications, hosted services, and automated tools to access Azure resources – similar to a service account on Windows), and app registrations (application objects).

Azure AD is an identity and access management solution with several features, such as conditional access, Multi-Factor Authentication (MFA), and Single Sign-On (SSO) support. In addition, it is an essential service with regard to governance, providing applications (services) and people access to Azure resources – and you want that to be consistent and accurate when it comes to who is responsible for what. And lastly, Microsoft provides best practices and guidance on this service that you can look into.

Data Governance

Microsoft launched Purview in public preview for data governance in December 2020, and it became generally available in October 2021. With Azure Purview, the company delivers an Azure service that can help you understand what data your company has, provide means to manage the data's compliance with privacy regulations, and derive valuable insights.

Implementing the Pipes and Filters Pattern using Azure Integration Services

In my first blog post, I would like to discuss implementing the pipes and filters pattern using Azure Integration Services (AIS).

The pipes and filters pattern uses multiple queues and topics to streamline the flow of events across numerous cloud services. You can implement it using queues or topics in a Service Bus namespace together with Logic Apps or Azure Functions. Implementing the pattern with these services allows you to process messages in multiple stages, such as receiving, validating, processing, and delivering. Moreover, you can also opt for an event-driven approach using Logic Apps, Event Grid, and Functions.

The pipes and filters pattern is described in the Microsoft docs as a way to decompose complex processing into a series of separate elements that can be reused, providing improved performance, scalability, and reusability. In the context of Azure Integration Services (AIS), each element can be either a Logic App or an Azure Function, connected through a topic – either a Service Bus topic or an Event Grid topic. The latter allows an event-driven approach to the pattern.

To process messages in a more timely manner, choosing Event Grid is more efficient than using a Service Bus queue. You can certainly use a queue or topic as the pipe in the pipes and filters pattern, having each filter subscribe and publish to it; however, it is less efficient, as you will need to poll the queue or topic quite frequently.

Assume you receive multiple order files as a batch during the day, and these need to be processed as soon as possible in the shortest amount of time. You would need a way to receive or pick up the files, validate them, process them, and subsequently deliver them to one or more sub-systems or locations. The diagram below outlines the scenario.

Pipes and filter pattern implementation diagram

In the scenario, Logic Apps function as the on- and off-ramps, receiving and sending data. The benefit of leveraging Logic Apps for data transport is their versatility in connectors. By leveraging the out-of-the-box connectors, you can consistently use a standard connector, in this case the SFTP connector.

Next, the functions act as single processing units: one stores the data, including adding instance properties (context), and another is triggered because the files are stored as blobs in a storage container (BlobCreated event). The chaining of the functions is done through the event mechanism – each stored blob results in an event (BlobCreated) that another function can subscribe to. And lastly, a Logic App can be triggered to deliver the processed file. Moreover, multiple Logic Apps could be activated to deliver the file to various locations using various connectors if necessary.
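To sketch what such a filter could look like, below is a minimal example of an Azure Function (in-process model) triggered through an Event Grid subscription on the BlobCreated event; the function name, container names, and processing logic are placeholders.

using Azure.Messaging.EventGrid;
using Azure.Messaging.EventGrid.SystemEvents;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.EventGrid;
using Microsoft.Extensions.Logging;

public static class ValidateOrderFileFilter
{
    [FunctionName("ValidateOrderFileFilter")]
    public static void Run([EventGridTrigger] EventGridEvent eventGridEvent, ILogger log)
    {
        // The event data contains the URL of the blob just created by the previous filter.
        var blobCreated = eventGridEvent.Data.ToObjectFromJson<StorageBlobCreatedEventData>();
        log.LogInformation("Processing blob {Url}", blobCreated.Url);

        // ... validate/process the file and write the result to the next storage container,
        // which in turn raises a BlobCreated event for the next filter in the chain.
    }
}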

The benefit of using functions is that they scale automatically (and thus provide elasticity) when using serverless mode (the Consumption plan) as the hosting option. Or, if you need more compute, you can choose the Premium or Dedicated plans. By adding more compute, the performance of processing the files can be increased.

With the implementation of the pipes and filters pattern described above, you can see that you can easily reuse components (functions), add or remove components, or shift them around if necessary – hence you also have flexibility (agility). As described in the enterprise integration pipes and filters pattern, with the context of this implementation added in parentheses:

Each filter (Function) exposes a straightforward interface (Binding): it receives messages on the inbound pipe (Event Grid Topic), processes the message, and publishes the results to the outbound pipe (Storage Container). The pipe (Event Grid Topic) connects one filter to the next, sending output messages from one filter to the next. Because all components use the same external interface, they can be composed into different solutions by connecting the components to other pipes. We can add new filters, omit existing ones or rearrange them into a new sequence — all without changing the filters themselves. The connection between filter and pipe is sometimes called port. Each filter component has one input port (Event Grid Topic) and one output port (Storage Account) in the basic form.

To summarize, the pipe and filter pattern brings the following benefits:

  • loose and flexible coupling of components (filters)
  • flexibility as it allows filters to be changed without modifications to other filters
  • parallel processing
  • reusability as each component (filter) can be called and used repeatedly

Note that the pattern is less suitable, performance-wise, when there are too many components (filters). Individually, the components – functions in our example – can be tuned using the available hosting options; however, the sum of all the components determines the overall performance. Furthermore, the pattern is not suitable for interactive systems or long-running processes.

And lastly, according to the Microsoft documentation, there are also some challenges or considerations when implementing the pipes and filters pattern. For example, it mentions complexity when filters are distributed across servers. However, in our example, every component is presumably in the same Azure region, and messages are persisted in storage, benefiting from the built-in redundancy (so reliability is covered as well).

Finally, again in our example, when a message fails to process, reprocessing it (a retry) is an option; however, it depends on whether the message has to be reprocessed or dropped into a different container in the storage account. When that happens, it comes down to how monitoring and notifications are set up to troubleshoot and correct the error.