Toon Vanhoutte explains how using Webhooks is more efficient than polling and walks through some Webhook best practices. He then demos how to manage Webhooks with BizTalk and Event Grid and how to synchronize contacts and documents.
Integrate 2018, June 4-6, etc.venues, London
He is a Microsoft Azure MVP and works as a lead architect at Codit. He is going to present on the topic “Hybrid Integration with BizTalk Server and Webhooks.” Let us welcome Toon on the stage.
Toon: Thank you very much for the introduction. So good morning or noon, everybody. My name is Toon Vanhoutte. I’m working as a solution architect at Codit and this session is all about unlocking the power of hybrid integration with BizTalk Server and Webhooks. At Codit, we believe that connecting is everything and we do that in four domains. It all started with integration but lately we also did a lot of API management, internet of things and Azure solutions.
On the agenda for today we will start with a little introduction to webhooks, then some design considerations, if you want to implement webhooks, what do you need to think about? And then I want to look at the publisher side of webhook. So if you are the provider of the webhook, what are your responsibilities? I’ll do some demos on that and then next we go to the consumer side where you also see on the responsibilities and how you can consume webhooks in BizTalk Server. And I want to conclude with a nice conclusion and then we have lunch. So I’ll hurry up.
So as an introduction, I think it’s always good, if you talk about webhooks, to compare it with polling. So polling, you have a client and a server and the client is interrogating the server at regular time intervals mostly through APIs. So he’s asking the server, “Hey, do you have new modified employees for me or do you have new events?” Server says, “Nope. Sorry. Nothing for you.” Five seconds later, again hammering the database…sorry, the server with no responses. And so you constantly are hammering the API, calling the API but actually nothing happens. Suddenly a server side event occurs. The client is not aware of that. The client is only aware the next time it polls and then you get the event back. And so you’re constantly hammering the API which leads to a lot of API calls and the resource utilization which is unneeded.
So that’s why webhooks were introduced. So with webhooks, you have the client who registers first at the server. So he does an API call and says, “Hey, server. If you have an event, please inform me on this URL, on this endpoint.” The server says, “Okay. I’ll do that.” And he registers that subscription in his own database or internal stuff. Then if a server side event occurs, the server will initiate an HTTP connection to the client. The client will say, “Thanks, mate,” and will start processing the event. In this way, you really have reactive programming on the server.
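The register-then-push flow described here can be sketched in a few lines. This is only an illustration, not code from the talk: the `WebhookServer` class and the event names are hypothetical, and a plain Python callback stands in for the client's HTTP endpoint so the sketch runs without a network.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for a webhook server. A callback plays the
# role of the client's HTTP endpoint (assumption made for illustration only).
Callback = Callable[[dict], None]


class WebhookServer:
    def __init__(self) -> None:
        # The server's own record of who subscribed to which event type.
        self._subscriptions: Dict[str, List[Callback]] = defaultdict(list)

    def register(self, event_type: str, callback: Callback) -> None:
        """Client: 'if you have this event, please inform me on this endpoint'."""
        self._subscriptions[event_type].append(callback)

    def raise_event(self, event_type: str, payload: dict) -> int:
        """A server-side event occurs: push it to every registered client."""
        subscribers = self._subscriptions.get(event_type, [])
        for callback in subscribers:
            callback({"eventType": event_type, "data": payload})
        return len(subscribers)


received: List[dict] = []
server = WebhookServer()
server.register("employee.modified", received.append)

# Only the event type that was registered reaches the client.
delivered = server.raise_event("employee.modified", {"employeeId": 42})
ignored = server.raise_event("employee.deleted", {"employeeId": 7})
```

The client never asks for anything after registering; it is only called when something actually happened, which is the difference with the polling loop.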
Some real life analogies. If we talk about the police, for example, it’s really, “Don’t call me. I’ll call you.” I don’t think there’s any police department in the world who calls his citizens every day to ask, “Hey, are you fine?” No, if you are in trouble, you pick up the phone…you take the phone and you call the police.
Also with the postman. You don’t go every day to the post office. No, the postman comes to you and drops the letter in your mailbox. Of course, there’s still a little polling involved here because you go to your mailbox every day. Unless you’re Jeff Holland and you have an IoT enabled mailbox. You get notified and then it’s completely a webhook system.
The last analogy I want to do is with a baby monitor. So if you have little kids, you don’t want to go upstairs every 5 seconds or every 5 minutes. So we get notified through the baby monitor. And I take this analogy because a baby monitor…it’s a concept of webhook but actually it’s more a websocket implementation because you have a long connection, a long running connection and you have bidirectional communication. So this is the idea of webhooks, so event driven programming but actually it’s more the websocket implementation.
So if you look at the advantages of webhooks, they are more efficient, like you see in the comparison with polling. They are faster because you get notified in near real time. No client side state, because if you need to poll, you typically need to keep the last state, for example, the date modified and so on. And it’s an excellent way to provide extensibility to your application. So for example if you are a SaaS provider, you have your core functionality and then you have your API combined with webhooks to provide extensibility points for your customers.
Some disadvantages, of course. It’s not standardized at all. Recently, there has been CloudEvents, the initiative that Clemens talked about. So that’s a first step in standardizing how webhooks could work. It was great to see that Microsoft is really betting on that and it’s already implemented in Azure Event Grid.
It comes with extra responsibilities and that’s the main focus of this session. Both the server and the client have responsibilities and if one of them does not fulfil their responsibilities, the webhook integration will not work. And then as the last thing, a webhook system is often considered as a black-box. Earlier this year, at the Codit Connect Events, I had a demo where I changed something in the SaaS application but the webhook was not fired. The only thing I could do was pressing F5 and waiting. The webhook didn’t come in. When I went off stage, the webhook came in. So fingers crossed for today but at least at that moment, it’s a black-box. You don’t have any visibility in the logs of your SaaS application. So that’s important to take into account.
Some examples of webhooks. I think if you want to implement webhooks, GitHub is a good reference. It’s really a state of the art implementation for webhooks. You can provide a payload URL, the content type but very important, you can also provide a secret when you register your webhook and that secret will be used to compute a hash-based message authentication code. And that can then be used by the client to validate the authenticity of the message and to check that it hasn’t been changed on the way.
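As a rough sketch of what the hash-based message authentication code looks like on both sides: the publisher signs the raw request body with the shared secret, and the consumer recomputes the signature and compares. GitHub sends its HMAC-SHA256 signature as `sha256=<hex digest>` in the `X-Hub-Signature-256` header; the function names and sample values below are made up for illustration.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Publisher side: HMAC-SHA256 over the raw body, GitHub-style prefix."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Consumer side: recompute the signature and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, signature_header)


# Hypothetical secret and payload, for illustration only.
shared_secret = "my-shared-secret"
payload = b'{"event":"order.received","orderId":"PO-1"}'
signature = sign_payload(shared_secret, payload)

ok = verify_signature(shared_secret, payload, signature)
tampered = verify_signature(shared_secret, payload + b" ", signature)
```

The signature must be computed over the exact bytes that were sent; re-serializing the JSON before verifying would usually change the bytes and break the check.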
Another example, and I’ll show this in the demos, is Teamleader. That’s a CRM and project management tool mostly used in Europe where you have a nice webhook system where you can say, “Okay. After a company has been added, when an invoice has been booked, please notify me on this URL.”
Webhooks are also very well implemented in Azure. Azure Monitor is a good example of that. If you have metrics and a certain threshold is exceeded, then you can set up an alert. A typical alert is just sending an email but you can also fire a webhook and then have some more logic for advanced notification.
And then Event Grid, Azure Event Grid, the first class citizen for webhooks embedded in the Azure ecosystem. So Dan talked already a little bit about it. So it’s really a push-push model. As an event source you push your events to Azure Event Grid. Azure Event Grid will have a look who is interested in this event, who has subscribed and then Azure Event Grid will forward, will push your message to the event handler. So it’s a push-push model. Event Grid takes care of all the reliability so it retries for 24 hours with a back-off mechanism and it even has dead letter queue functionality too.
So some design considerations. First of all, think about your event names. Make sure that you are clear, descriptive and consistent. Typically, it’s a combination of an entity, the object type and the state change. You can do this CRUD-based, user created and employee deleted as well, but you can also go more fine grained. You can say more declarative, a user address changed or an employee retired or an order that has been shipped, invoice has been booked. You could also do a combination of the two depending on your business scenarios and the needs for your customers.
The event data is also very important. What kind of data? What kind of information will you put in the events? And actually, there are two ways to do it. The most common way to do it is only add the entity ID in the payload. For example, if an invoice has changed, you only add the invoice ID to your event. The consumer of your events needs then to do an additional call to get the details of your invoice. In that way, you keep your payload small and you have also an additional security check because you need to authenticate to the API to get all the details. So there’s also some security reasons behind that.
Another way to do it is just include the entity data already in your events and that’s to overcome chatty integrations where constantly you need to go to the API to get the details. So choose the right one for your solution or maybe do a combination of the two but make sure that it’s consistent across your events.
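To make the naming and payload choices concrete, here is a small sketch of both event shapes. The field names, the `entity.stateChange` naming pattern and the sample invoice are assumptions for illustration, not a format from the talk.

```python
import json


def thin_event(entity: str, state_change: str, entity_id: str) -> dict:
    """ID-only event: the consumer calls the API afterwards for the details."""
    return {"eventType": f"{entity}.{state_change}", "id": entity_id}


def fat_event(entity: str, state_change: str, entity_data: dict) -> dict:
    """Event that already carries the entity data, avoiding the extra call."""
    return {
        "eventType": f"{entity}.{state_change}",
        "id": entity_data["id"],
        "data": entity_data,
    }


# Hypothetical invoice used for both variants.
invoice = {"id": "INV-001", "customer": "Contoso", "total": 125.50, "status": "booked"}

thin = thin_event("invoice", "booked", invoice["id"])
fat = fat_event("invoice", "booked", invoice)

# The thin event stays small; the fat one grows with the entity.
print(len(json.dumps(thin)), len(json.dumps(fat)))
```

Whichever shape is chosen, keeping the same top-level fields across all event types makes life easier for the consumer.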
The registration. So at a certain moment in time, you, as a consumer, you need to register, you need to say that you’re interested in certain events. So make sure that that is simple, user friendly and well-documented, actually like any API, and Azure API Management could help in that.
Lately, I had a customer. They wanted to integrate with three systems. We looked at the APIs. Two of them had a wonderful well-documented API. The last one had terrible documentation. In the end we didn’t integrate with that third one. Just those two. So you see, for good adoption, for fast adoption, a well-documented API and webhooks is very important.
So the registration can be done UI-based or you can also offer it through an API which is ideal for production ready environments.
Unregistration is also important of course because you don’t want to get notified forever. Again two options over there. It mostly ends with an explicit opt out. So you need to unregister yourself if you don’t want to receive any events anymore. Or you could give it an expiration. If I’m not mistaken, SharePoint works like that. So you register yourself and your webhook registration is valid for two months. So you need to make sure that you renew before it expires.
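Both unregistration options can live in the same subscription store. The sketch below is a hypothetical in-memory registry, with a 60-day lifetime chosen only to mirror the "two months" example; it is not how any particular product implements it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Subscription:
    url: str
    event_type: str
    expires_at: datetime


class SubscriptionRegistry:
    """Hypothetical registry supporting explicit opt-out and expiration."""

    def __init__(self, lifetime: timedelta) -> None:
        self._lifetime = lifetime
        self._subscriptions: Dict[str, Subscription] = {}

    def register(self, sub_id: str, url: str, event_type: str, now: datetime) -> Subscription:
        sub = Subscription(url, event_type, now + self._lifetime)
        self._subscriptions[sub_id] = sub
        return sub

    def renew(self, sub_id: str, now: datetime) -> None:
        # The consumer must call this before the subscription expires.
        self._subscriptions[sub_id].expires_at = now + self._lifetime

    def unregister(self, sub_id: str) -> bool:
        # Explicit opt-out; returns False if the subscription was unknown.
        return self._subscriptions.pop(sub_id, None) is not None

    def active(self, event_type: str, now: datetime) -> List[Subscription]:
        # Expired subscriptions are simply skipped when events are dispatched.
        return [s for s in self._subscriptions.values()
                if s.event_type == event_type and s.expires_at > now]
```

Passing `now` in explicitly keeps the expiry logic easy to test without waiting two months.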
Let’s have a look now at the publisher side of webhooks, and this is all about the responsibilities. And as a publisher, your first responsibility is reliability. Make sure that you implement it asynchronously, not inline. Make sure you have retries because the internet could be down or your consumer could be recycling or down for several seconds. And optionally, you could have a fallback protocol, like putting the message on a queue after you did some retries.
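A minimal sketch of the retry-then-fallback idea, assuming the actual HTTP call is hidden behind a `send` callable and the fallback is a simple queue. The function name, attempt count and back-off values are illustrative; in a real publisher this would run on a background worker rather than inline with the business transaction.

```python
import time
from collections import deque
from typing import Callable, Deque


def deliver_with_retry(
    send: Callable[[dict], None],
    event: dict,
    fallback: Deque[dict],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to deliver an event; after the last failure, park it on a queue."""
    for attempt in range(max_attempts):
        try:
            send(event)
            return True
        except Exception:
            # Exponential back-off between attempts: 1s, 2s, 4s, ...
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    # Fallback protocol: keep the event so it can be replayed or picked up later.
    fallback.append(event)
    return False
```

The `sleep` parameter is injectable only so the sketch can be exercised without real waiting.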
Another responsibility is security of course. HTTPS, needless to say, but still we see a lot of HTTP implementations. Give that option like GitHub does and calculate a hash-based message authentication code. So you give more security options to your webhook consumer. The validation of the endpoint is also quite important. If somebody registers, at registration time it’s maybe good to see if you can access the webhook. If not, you can give some error details so the client can actually solve the issue.
So let’s go into the demo right now. So in my demo, I will have an API where you can register for webhooks. Under the hood, I will use Azure Event Grid, but I will shield that away from my customers, and the events will originate from BizTalk. So on the right side, we have our consumer, the third party consumer and it calls an API which is actually on a [inaudible 00:11:53] Azure Function and in the register webhook, you give an event type. What type of events are you interested in? In this case, it will be the order received events. You give also the URL because that’s the address, the HTTP endpoint where you want to be notified, and you provide also the shared secret that will be used to compute the hash-based message authentication code. And the Azure Function underneath will register your subscription in Azure Event Grid.
From that moment on, you are registered and you will be notified if something happens. That something will be a fancy EDI order that is dropped on my BizTalk machine. BizTalk will pick it up, push it to Azure Event Grid. Azure Event Grid will have a look who is subscribing on the order received event and then will forward the message not directly to your third-party consumer, in this case, but it will forward it to an intermediate Azure Function that acts a little bit as a proxy. And why do I have that Azure Function in between? Several reasons.
First of all, it will remove the Event Grid envelope because as you’ve seen on the first day, Event Grid has a common envelope. So it removes all that so your client is not aware that you’re using Event Grid. It’s also computing the hash-based message authentication code like GitHub does. And also, as you maybe know, at registration time Azure Event Grid requires you to echo back a validation code. I don’t want to force my clients to implement such logic so that’s why I do it also at the Azure Function side.
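The first two jobs of the proxy (stripping the envelope and signing the payload) could look roughly like this. This is a sketch, not the Azure Function from the demo: the function name and the `X-Signature` header are assumptions, and the validation handshake is left out. It relies on Event Grid delivering a JSON array of events, each with its business payload under `data`.

```python
import hashlib
import hmac
import json
from typing import Dict, List, Tuple


def unwrap_and_sign(envelope_body: bytes, secret: str) -> List[Tuple[bytes, Dict[str, str]]]:
    """Turn one Event Grid delivery into (body, headers) pairs for the consumer."""
    events = json.loads(envelope_body)  # Event Grid delivers an array of events
    outgoing = []
    for event in events:
        # Forward only the business payload; the envelope fields are dropped.
        body = json.dumps(event["data"], separators=(",", ":")).encode("utf-8")
        digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
        headers = {
            "Content-Type": "application/json",
            # Hypothetical header name, modeled on GitHub's signature header.
            "X-Signature": "sha256=" + digest,
        }
        outgoing.append((body, headers))
    return outgoing
```

Each returned pair would then be posted to the URL the consumer registered, so the consumer only ever sees a plain JSON payload plus a signature.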
So from that moment on, every order that is received gets pushed to Azure Event Grid and there we can have 0 subscribers or a 100 subscribers. We don’t need to take care of that. Azure Event Grid is doing everything.
And then the third-party consumer can also unregister. So it calls the API to delete the subscription and then the subscription will be removed via an Azure Function so it will be removed from Azure Event Grid.
So let’s have a look at the implementation of that. I have here my register webhook call. This is…I just need to do a post on my Azure Function. I provide the URL. It’s webhook.site. It’s an alternative to RequestBin. So I should see my requests coming in over there. This is my shared secret and I want to subscribe on the order received event. So I click send. I don’t need to blame the Wi-Fi because I’m wired up but I’m using Azure Functions with a consumption plan. So I’m now facing a cold start issue. I think it will last for 15 seconds. You can also solve that but for demo purposes, I didn’t invest in that, but you could use precompiled Azure Functions, for example, or an App Service plan. Then you don’t face the cold start.
So you see I have here my subscription ID returned and if I now go to Azure Event Grid and do a refresh, you see I have a subscription created. The subscription is a15, the GUID, and if I go here, it’s the same GUID, a15. If I have a look at the details of the subscription, I can copy the URL.
There it is. So the URL is actually…first of all, this is the address of my Azure Function that will act as a proxy and forward the requests. This is my secret and then I also have the URL, which is URL-encoded. All righty. So this is all the information that the Azure Function in between needs to forward your message to the consumer. The secret should of course be encrypted for real production usage.
So I have here my webhook tester waiting for the requests. So this one is listening on the URL I provided. And if I now drop a message, an EDI order, I can open it with EDI Notepad. And you can see, it’s just an old school order. I think I use it already 10 years in demos and stuff. And it’s actually an order for several books. I heavily recommend you to read the “Robust Cloud Integration with Azure.” It’s a recommended read, that one.
So if I now drop this message here in the queue, it will get picked up by BizTalk. Just a normal receive location and then we have the send port that will push the event to Azure Event Grid. Actually it’s nothing more than the web HTTP binding that I’m using, sending to the URL of Azure Event Grid. Yeah. And I need to provide here my SAS key in the HTTP header. And the next small thing I developed is an endpoint behavior over here. It’s actually a message inspector because I don’t want, in my BizTalk environment, to create the whole envelope that Event Grid needs. The only thing I send to the send port is my JSON, my JSON object and this behavior will wrap that JSON object in the envelope for Azure Event Grid so you don’t need to take care of that. The only information that is needed, you need to provide it on your port or dynamically via context properties.
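What the endpoint behavior does can be approximated as follows. This is a Python sketch of the idea, not the WCF message inspector from the demo; the function names and sample values are made up. It builds on the Event Grid event schema fields (`id`, `eventType`, `subject`, `eventTime`, `data`, `dataVersion`) and on the `aeg-sas-key` header that Event Grid accepts for key-based authentication when publishing to a topic.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Dict, Optional


def wrap_in_event_grid_envelope(
    data: dict,
    event_type: str,
    subject: str,
    now: Optional[datetime] = None,
    event_id: Optional[str] = None,
) -> str:
    """Wrap a plain JSON object in the envelope Event Grid expects."""
    event = {
        "id": event_id or str(uuid.uuid4()),
        "eventType": event_type,
        "subject": subject,
        "eventTime": (now or datetime.now(timezone.utc)).isoformat(),
        "data": data,
        "dataVersion": "1.0",
    }
    # Event Grid accepts a JSON array, so a single event is sent as a list.
    return json.dumps([event])


def build_headers(sas_key: str) -> Dict[str, str]:
    """HTTP headers for publishing to the topic endpoint with a key."""
    return {"Content-Type": "application/json", "aeg-sas-key": sas_key}


# Hypothetical order payload; only this part would come from BizTalk.
body = wrap_in_event_grid_envelope(
    {"orderId": "PO-1", "lines": 3}, "order.received", "orders/PO-1"
)
```

The event type and subject play the role of the values that, in the demo, are configured on the port or set through context properties.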
And hopefully if I now go back to my request bin alternative, you see here that the order has been received. So for every order that gets received, I get notified here through this URL via webhooks. So just to be sure, let’s do it another time. Let’s press F5 over here and you see another event coming in. Now I want to unregister as a client. So I have here…I need to take my subscription key, copy, paste it, go to the unregister which is just a delete over here. I need to provide in my URL… I click Send. Should be a little bit faster now. And if I now go to Azure Event Grids and do refresh again, the webhook is gone.
So you see a very flexible system and a nice collaboration between BizTalk Server and Azure Event Grid. So my whole API infrastructure, I keep it in the cloud because it’s public facing but BizTalk Server can be used to connect actually legacy systems towards Azure Event Grid and there you don’t need to take care of the reliability. That’s what’s done by Azure Event Grid.
Also the security is quite good. So we have an HTTPS connection between BizTalk and Azure Event Grid and we also implemented the hash-based message authentication code and the validation is also done in this scenario.
Let’s move now to the consumer side of webhooks and that’s mostly the side where we are because we need to talk with SaaS applications. Your first responsibility is high availability. So again, if you’re not available, the webhook will fail and eventually it will not arrive at your place. Scalability is also very important because if you are consuming webhooks, you don’t control the load. It’s the sender who controls the load. So it could be that you have peak loads and you need to make sure that you can scale to handle that peak. And on the other hand, you need to be able to throttle a little bit towards your backend systems because your old legacy applications or your slow databases will not always be able to handle a lot of concurrent calls. So on the receive side, you need to be scalable but on the other hand you need to be able to throttle if needed.
Reliability is very important and I always stress this. If you want to consume webhooks, the right thing to do is receive the webhook, persist it for reliability, acknowledge it and then process it to your backend system. Don’t start processing it in your synchronous call because then the webhook publisher will time out and so on. So persist it and make sure that you have retry capabilities towards your backend system. So Azure Logic Apps has that functionality out of the box. Azure Functions doesn’t really have that retry mechanism, so you need to build it yourself, for example with Polly. A good advice could also be to use Azure Durable Functions because there you have reliable retries out of the box. And of course, our BizTalk Server can also do this very well because there you have the retry mechanisms on the send port.
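The receive, persist, acknowledge, then process pattern can be sketched like this. It is a hypothetical in-memory version: the `WebhookReceiver` class is made up, and the `deque` stands in for durable storage such as a queue or a message box.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class WebhookReceiver:
    """Hypothetical consumer: persist first, acknowledge, process afterwards."""

    def __init__(self, process: Callable[[dict], None], max_attempts: int = 3) -> None:
        self._process = process
        self._max_attempts = max_attempts
        self._inbox: Deque[dict] = deque()  # stand-in for durable storage

    def receive(self, event: dict) -> int:
        # Synchronous part of the webhook call: store the event and return
        # quickly, so the publisher does not time out on slow backends.
        self._inbox.append(event)
        return 202  # HTTP 202 Accepted

    def drain(self) -> Tuple[int, List[dict]]:
        # Asynchronous part: push events to the backend with retries.
        processed, failed = 0, []
        while self._inbox:
            event = self._inbox.popleft()
            for _ in range(self._max_attempts):
                try:
                    self._process(event)
                    processed += 1
                    break
                except Exception:
                    continue
            else:
                # All attempts failed: keep the event for manual follow-up.
                failed.append(event)
        return processed, failed
```

The backend is only touched in `drain`, which is also the natural place to throttle calls towards a slow legacy system.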
If you look at the security perspective, if you want to consume webhooks mostly you need to have a publicly available endpoint. So make sure that you can expose it in a secure way: either you do it through Azure Relay as Wagner explained or you need to have a reverse proxy infrastructure and manage it yourself. Make sure if there is an API key in the header or there’s a hash-based message authentication code, that you don’t ignore it but that you validate it. So you make sure that you know from whom the event originates and you know the authenticity of it and you know that it hasn’t been changed on the way.
The last responsibility and often forgotten is sequencing. The producer of events will not guarantee the sequence, because there is network in between so it won’t do it in a sequenced way. So a typical example, somebody creates a contact. He clicks Save, and then suddenly he sees he made a typo. He corrects the typo and clicks Save again. Then you need to make sure on your side that only the latest version gets updated. So do you need to do explicit sequencing? I don’t think so. But you need to have some logic to make sure that old events don’t override new ones. So a little bit of state is needed to make sure that you can ignore these.
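The "little bit of state" can be as small as one timestamp per entity. Below is a sketch with a hypothetical in-memory `ContactStore`; it assumes each event (or the details fetched for it) carries a last-modified timestamp that can be compared.

```python
from datetime import datetime
from typing import Dict, Optional


class ContactStore:
    """Hypothetical store that ignores events older than what it already has."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def upsert(self, contact_id: str, modified_at: datetime, fields: dict) -> bool:
        current = self._rows.get(contact_id)
        # An event that is not newer than the stored version is dropped.
        if current is not None and current["modified_at"] >= modified_at:
            return False
        self._rows[contact_id] = {"modified_at": modified_at, **fields}
        return True

    def get(self, contact_id: str) -> Optional[dict]:
        return self._rows.get(contact_id)


store = ContactStore()
first_save = datetime(2018, 6, 5, 10, 0, 0)   # contact saved with a typo
second_save = datetime(2018, 6, 5, 10, 0, 5)  # typo corrected

# The events arrive out of order: the correction first, the typo afterwards.
applied_new = store.upsert("c-1", second_save, {"name": "Toon Vanhoutte"})
applied_old = store.upsert("c-1", first_save, {"name": "Toon Vanhoute"})
```

No explicit sequencing is needed; the late-arriving older event is simply ignored.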
So again demo time. For this demo, I’ll do some contact synchronization. So we have the SaaS application Teamleader, that CRM application, and there I’ll create a contact. When I do that, I subscribe of course with the webhook. An event will be triggered and I pass the event through Azure API Management. And Azure API Management will forward the request to BizTalk. My BizTalk is running here on this virtual machine and I use the WCF relay, as Wagner showed to you, to expose my local endpoints to the cloud.
What is now the problem with Azure Relay, WCF relay? If you want to authenticate with that, you need to add an HTTP header. But that is not available in Teamleader. There you don’t have the option to add an HTTP header. I only have a URL. So that’s why API management is introduced in between. API management will have the API key in the query string. It will validate that and if that one is correct, then API management will forward your request to BizTalk Server and will add the security header that is needed to consume your WCF relay. So in that way, you have security both on your front end but also on the back end side which is often forgotten.
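The role of API management in this setup comes down to two steps: check the key in the query string, then add the header the relay needs. The sketch below illustrates that logic in Python; it is not an API Management policy, the `gateway` function is hypothetical, and `subscription-key` is used as the query parameter name because that is API Management's default.

```python
import hmac
from typing import Dict, Tuple
from urllib.parse import parse_qs, urlsplit


def gateway(url: str, expected_key: str, relay_token: str) -> Tuple[int, Dict[str, str]]:
    """Validate the key from the query string, then add the backend auth header."""
    query = parse_qs(urlsplit(url).query)
    supplied = query.get("subscription-key", [""])[0]
    # Constant-time comparison of the supplied key against the expected one.
    if not hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8")):
        return 401, {}
    # Front end is secured by the key; back end by the header added here.
    return 200, {"Authorization": relay_token}
```

The webhook publisher only ever knows the URL with the key; the relay credentials stay inside the gateway.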
BizTalk will receive the events and then it will consume Teamleader to get the details of the contact because Teamleader only sends the contact ID. So we do get details, get contact details. Then asynchronously, the message is pushed into a database with a stored procedure and that procedure has some logic to ignore old events. So there is some little state in it to make sure that old events don’t override new ones.
So let’s see this in action. Let’s start with Teamleader. So here we have Teamleader. And this is the place where I register the webhooks. So you have after a contact has been added, deleted and edited, I ask them to invoke this URL. If you look at this URL…oh, I don’t think I copied it well.
Yes. You see this is just invoking my API management and adding the subscription key to it. So this is a security towards my API management. If I look in API management, I have here my contacts API and I added here, in my inbound processing, a simple policy to add a header because we need to have the authorization header with a shared access signature to authenticate against the WCF relay.
Then we have the WCF relay. It should be available over here. You see I have two relays registered. For this demo, I only need the contacts relay. And this is actually pretty cool. Wagner also showed it. I won’t show you the configuration because he covered it already but the cool thing is if I stop my receive location over here and I do a refresh…there’s no refresh button, so I need to do it this way. You see it’s gone. So at the moment we start our receive location there will be a handshake from BizTalk towards Azure and we will register the WCF relay at that moment in time. So if I now start it again like this, do again a cumbersome refresh, now you see it’s added again. So that’s the way it works.
The connection is outbound, so BizTalk will do an outbound firewall connection towards Azure and then the connection stays open. So it’s a firewall-friendly way to expose your on-premise web services to the cloud, firewall-friendly and secure. Because your attack surface is not local. It’s in the Azure data center where they have all the things to cover denial of service attacks and so on.
I think that’s what I need to show. So then the message comes into BizTalk and then we just have a simple send port that consumes the API of Teamleader. So this is just web HTTP. We call the Teamleader API and the only special thing I do is I need to add the contact ID over here. So this is a context property that gets find-and-replaced in the URL to make sure that I have the dynamic contact. And then there is a send port towards my SQL Server which will just insert here in this database.
So let’s go to Teamleader, to the contacts. Where are they? Over here. I’ll add a new contact. Let’s add myself. Yes, like this, 8520 in Kuurne. I click SAVE. I go to the database, I refresh, it’s already there. So quite fast. I can also do an edit, just here, add my street. So if you are in the neighborhood, you are always welcome. SAVE. Refresh. So very fast because it works in a push-push model, yeah.
The last thing I can show you is also just deleting it. Delete. Are you sure? Yes, I’m sure. It’s deleted also. So we are GDPR compliant now too.
So if you look at this demo, a lot of things are involved. API management and stuff. High availability is good because API management is highly available. Your BizTalk Server could be highly available if you set it up the right way and also with Azure Relay you could have multiple listeners. So it’s a highly available service. If one is down, the other one will take over. And for the rest, it’s just round-robin between the two listeners.
Scalability. I think BizTalk scales well until a certain level and you have also a separation between your receive processing and your send processing. The reliability, we have it, thanks to the retry mechanisms on the send ports. The security. That’s where API management comes in. And then the sequencing, I’ve done it with some stateful logic in the stored procedure to make sure that old events don’t override new events. Just for every contact, I keep the latest modified date and make sure that I check that one when I do an insert.
A final demo I want to show you is a document synchronization. So here, a document will be uploaded on blob storage and we will configure Azure Event Grids as already demoed a few times to push an event again to our API management system that will forward it again to BizTalk. And BizTalk will then get the details through the blob storage API because you can just access it through HTTP and that will put it on the file, yeah.
The important thing here is that API management will also reflect and return the validation code. Because if you want to register your endpoint in Event Grid, at the registration time, you need to return a validation code. So you could do that in BizTalk. I wrote an IOperationInvoker, a WCF operation invoker for that, that checks whether it is a validation request. If yes, I echo that back. So you can use that and that way you don’t need to implement that logic in your application. You can just do it on the adapter. In this case I’ve chosen to use Azure API Management because you can write a policy actually that takes care of that. That policy is written by a community member so I just did a copy-paste of that. And it works smoothly.
So maybe let’s have a look at this API management policy. Where is it? Over here. All the rest is a little bit the same as the previous setup. This is the API management policy. It looks in the headers: is it a subscription validation request or just a notification? If it’s a notification, we just send the request on to BizTalk and add the authorization header. If it’s a subscription validation request, we create a JSON message that contains the validation code and we return that validation code. So we [inaudible 00:30:32] in Azure Event Grid. And this way I don’t need to do anything in BizTalk. API management covers it for me.
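The decision the policy makes can be sketched as follows. This is an illustration in Python, not the API Management policy from the demo; the function name and return shape are hypothetical. It relies on Event Grid sending an `aeg-event-type` header (`SubscriptionValidation` or `Notification`), putting the code in `data.validationCode` of the validation event, and expecting a JSON response with a `validationResponse` property.

```python
import json
from typing import Dict, Tuple


def handle_event_grid_request(headers: Dict[str, str], body: bytes) -> Tuple[int, str, bool]:
    """Return (status, response body, forward_to_backend) for an incoming call."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    event_type = normalized.get("aeg-event-type", "")

    if event_type == "SubscriptionValidation":
        # Handshake at registration time: echo the validation code back.
        events = json.loads(body)
        code = events[0]["data"]["validationCode"]
        return 200, json.dumps({"validationResponse": code}), False

    # A normal notification: nothing to answer here, pass it on to the backend.
    return 202, "", True
```

Because the handshake is answered before the request reaches the backend, the backend only ever sees real notifications.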
If you then look over here, I’m here in the storage account. And you have here the top events. This is the Event Grid integration and here I added a webhook subscription. So BizTalk here is subscribing with this endpoint. This is my API management endpoint that will route back to my BizTalk. So if I now upload over here in this document library, I upload a PDF, click Upload, yeah, it should go all the way to BizTalk. I won’t show you the details of the ports. Nothing fancy in that but here, my document arrived one minute ago. Yes, it worked. Thanks to the demo gods.
So to wrap this up, we have also the high availability, scalability, reliability, security again and the sequencing. Sequencing is not important here because a document is a document.
So as a conclusion, webhooks are very powerful if you want to have event driven programming. So the advantages: It’s more efficient. Faster, it’s almost near real time. You’ve seen it in the demo, it was quite quick. No client side state you need to keep unless you need to do some sequencing. And it’s a perfect way to provide extensibility. Disadvantages: No standardization but we’re getting there. Mostly the extra responsibilities. So if a customer…if they dare suggest doing webhooks, you always focus on these responsibilities. You need to make sure that both client and server are aware of these responsibilities because otherwise, it can fail at any time. And of course, consider it as a black-box. Mostly you’re depending on your provider of the webhooks for that.
The responsibilities for the publisher. It’s all about reliability, security and endpoint validation. For the consumer, make sure you’re highly available, scalable, reliable again, security is very important because your endpoint is mostly on the public internet unless you have some VPN connections with your provider, and sequencing.
Thank you very much and enjoy the lunch.