
DevOps with Azure Kubernetes


Let's get straight into the topic, DevOps with Azure Kubernetes Service, here at CloudNow Technologies. I hope you have had a great conference so far. My name is Nagarjoon and I'm a Microsoft Azure MVP working at Active Solution. We work a lot with applications based on Azure, and over the last couple of years we have seen a growing interest in Kubernetes, specifically around Azure Kubernetes Service (AKS). So I want to share some experiences around DevOps-related topics for Kubernetes.

The premise of this session is: you created a Kubernetes cluster, you deployed your first applications there. Now what? What are the things you need to think about and solve that are not strictly related to the application itself, but to the cluster, its health, and your DevOps practices?

I want to talk a little bit about infrastructure as code: how you can provision not just your application but also the cluster itself from code. Related to that, automated deployments, again not just for the applications but also for the cluster, so that you have a consistent set of cluster infrastructure that you can deploy in a safe and repeatable way. Getting feedback from the cluster and from the application about health, performance and so on is of course super important, so I'm going to talk about how you deal with this in Azure using Azure Monitor and how AKS integrates with it. And then we have the topics of availability and scalability: making sure the application stays alive and responsive even when it comes under high load, or when there is a problem with one of the data centers in Azure, so that you can fail over, for example to another region, and still be available to your end users.

Just a quick word about Azure Kubernetes Service. This is the managed Kubernetes service in Azure, where Microsoft handles everything related to the control plane, that is, the master nodes. You don't see them, you can't access them, and you don't pay for them; you only pay for the worker nodes, the virtual machines that run your applications. AKS handles things like upgrades, patching of the nodes, autoscaling and so on, which is a huge benefit compared to running Kubernetes on-premises where you deal with all of that yourself. Kubernetes can run everywhere, both on-premises and at every major cloud provider, but some things are specific to Azure when you run it in AKS.

First, the virtual machines, your nodes: when you deploy a cluster in Azure they become part of a virtual machine scale set. This is what allows AKS to scale your cluster up and down based on how you configure the autoscaling.

When you want to expose your application in Kubernetes you create what's called a service, and you can give that service the type LoadBalancer. When you do this in AKS, it automatically configures the Azure load balancer to route traffic from outside the cluster to the correct IPs inside it (a minimal manifest is sketched below). With Kubernetes you also very often use an ingress, which gives you more powerful HTTP-based routing, for example based on URL paths, and if you do this in AKS you can also have it automatically configure Azure DNS records for you, so you can reach various parts of your system by DNS name.
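As a minimal sketch of that LoadBalancer service, something like the following; the app name and ports are hypothetical, not taken from the talk:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: qbox-web            # hypothetical frontend service name
spec:
  type: LoadBalancer        # AKS wires this up to the Azure load balancer
  selector:
    app: qbox-web           # route to pods carrying this label
  ports:
  - port: 80                # public port on the load balancer
    targetPort: 8080        # container port, assumed for this example
```

When this is applied, AKS allocates a public IP on the Azure load balancer and configures the routing into the cluster for you.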

In terms of authentication and identity: every time you create a cluster you need to provide it with an identity. This can either be a service principal or, better yet, a managed identity, which is a fairly recent addition to AKS. Either way, you define an identity for the cluster, and that matters whenever the cluster needs to access other resources and services in Azure. A very common example is pulling your Docker images from a container registry, in this case Azure Container Registry: you need to make sure the cluster identity has the necessary permissions to do so.

AKS also supports Kubernetes role-based access control (RBAC), and this integrates with Azure Active Directory. When you need to grant access to the cluster, you can do it based on the users and groups in Azure AD while still using standard Kubernetes RBAC. Typically you map roles to groups, so you can for example give a cluster admins group the permissions needed to administer the cluster.

When you need some kind of external storage for your containers, Kubernetes calls this a persistent volume. In Azure you can map a persistent volume either to Azure Files (a storage account) or to an Azure disk attached to your virtual machines, so how you persist that storage is configurable.
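As a small illustration of the persistent volume part, a claim like the following (names hypothetical) can be backed by an Azure disk through one of the storage classes AKS ships with, such as managed-premium:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: qbox-db-data                 # hypothetical claim for a database pod
spec:
  accessModes:
  - ReadWriteOnce                    # an Azure disk attaches to one node at a time
  storageClassName: managed-premium  # built-in AKS class backed by premium Azure disks
  resources:
    requests:
      storage: 5Gi
```

Swapping the storage class to azurefile would back the volume with an Azure Files share instead.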
Finally, Azure Monitor. As I mentioned, this is the service in Azure that basically every resource integrates with, and so does AKS. Everything you run, your applications, the nodes, everything related to the cluster, will send data to Azure Monitor if you tell it to, and that gives you a lot of insight into the health and state of your applications and the cluster.

So let's talk a little bit about infrastructure as code. This means that instead of creating a cluster manually through the portal and fiddling with settings there, you define the cluster in source code. You have various options for how to do this, but the important thing is that you check it into source control, either together with your application or in a separate infrastructure repository. By doing this you get consistent infrastructure. It's very common to deploy not just one cluster but multiple: clusters for various environments such as dev, test and prod, or multiple clusters where each is dedicated to a particular application. Either way you want these clusters to be consistent, and the only real way to achieve that is to provision them from code.

This also gives you repeatable deployments: you can deploy over and over again, making sure you have the correct state, and avoiding configuration drift where someone modifies a cluster on the fly. That is not a good thing to do, and you shouldn't really hand out the permissions to do it; everything should go through a deployment pipeline. It also allows you to spin up clusters on demand, which is quite a common thing to do. A typical example is pull requests: you create a branch to develop some new functionality, and a pipeline creates a cluster for that particular pull request branch, deploys the application into it, runs some automated or other kinds of testing, and when you're done you simply remove the cluster and delete it. This is a very efficient way to work with Kubernetes, it's actually pretty fast, and of course it saves cost by not having clusters running all the time unless you actually need them.

You have multiple options for how to implement this. ARM templates (Azure Resource Manager templates) are the native option in Azure, and that's what I'm going to show you, but there are other options too: I highly recommend looking into both Terraform and Pulumi, which do infrastructure as code not just for Azure but across clouds. You can in fact also use the Azure CLI to implement this. So, multiple options; you need to decide which one is best for you.

Now, when we talk about deploying the infrastructure for AKS, we're not just talking about the cluster, because you can see Kubernetes as a platform onto which different teams deploy their different applications, and that typically requires more than just creating a bare cluster. First, every cluster needs to run in a virtual network, so typically you make sure you have a virtual network set up and configured properly before you create the cluster. Next you deploy the cluster itself. Once that is done, there are more things you typically want to do as part of the infrastructure pipeline. You create namespaces: namespaces in Kubernetes are how you divide the cluster into multiple parts, for example a test namespace in the same cluster, or separate namespaces for multiple applications sharing a cluster; if you run multiple applications in the same cluster, you should divide them up using namespaces. Consider these part of the infrastructure and create them at the same time as the cluster. Next you typically want some kind of ingress controller and, related to that, a certificate manager, for handling HTTP and HTTPS traffic going from the outside into your cluster. Again, this is often considered part of the underlying infrastructure rather than part of the applications you deploy later, so you want it already in place when you create the cluster, and the applications can simply deploy and use these resources. You also want to set up the necessary permissions: you create service accounts in Kubernetes, the necessary Azure AD groups, and the roles and role bindings that go with them, as sketched below. For each team you need to set up permissions so they can deploy their applications through some kind of deployment pipeline, and if they need access to the cluster itself, you should provision that too as part of the infrastructure pipeline.
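Here is a minimal sketch of that kind of provisioning: a namespace plus a role binding that grants an Azure AD group deploy rights in it. The names and the group object ID are placeholders, and the built-in edit ClusterRole is my choice for the example, not something from the talk:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a                          # hypothetical team namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-deployers
  namespace: team-a
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: "<azure-ad-group-object-id>"    # with AAD integration, groups are referenced by object ID
roleRef:
  kind: ClusterRole
  name: edit                            # built-in role: manage most namespaced resources
  apiGroup: rbac.authorization.k8s.io
```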
So let's talk about automated deployment. Say you have two clusters: a dev/test cluster separated into two namespaces, and a separate production cluster, typically running more, and maybe more powerful, nodes. A typical deployment pipeline for your applications would have some kind of build first, which produces the container images and pushes them to Azure Container Registry (you can of course select whichever registry you want here). Then you have a staged deployment pipeline that first deploys the new version into the dev namespace, runs some testing, moves on to the test namespace, and so on, and finally deploys into production.

That is the application pipeline, but as I said you should also have a pipeline for the infrastructure, which creates the clusters in the first place and performs all the steps I just mentioned. If you have clusters dedicated to a single application you might consider making this the same pipeline; you can combine the two, so that when you deploy a new version of the application you also make sure the cluster is up to date. Otherwise it's quite common to treat the cluster as shared infrastructure that lives in an infrastructure repository with its own pipeline, perhaps owned by a separate team, and any change to the cluster goes through that pipeline to keep everything up to date and consistent.

Talking about application pipelines, I'm going to use Azure Pipelines in this session to show you how to set this up. You can of course use any CI/CD automation tool here, but there are some Kubernetes-specific things in Azure Pipelines I want to show you. A pipeline in Azure Pipelines typically consists of a set of stages, where each stage corresponds to an environment, for example dev or test, and each stage runs one or more jobs on some agent. If you want a pipeline to deploy into a specific namespace in your cluster, you create what's called an environment in Azure Pipelines. The environment encapsulates a namespace in a specific Kubernetes cluster, and it contains a service connection, which holds the credentials needed to deploy into that namespace. This environment concept also gives you great visibility into the cluster, and I will show you how that looks: it lets developers look inside the cluster through Azure Pipelines, for example to troubleshoot deployments, which is a great addition.

Azure Pipelines also provides functionality around pull requests. As I mentioned before, it's very common to create a separate deployment for every pull request, so you're not just doing code review but actually deploying the pull request branch and perhaps running automated tests against it. There is specific functionality for Kubernetes in Azure Pipelines where, if the run is a pull request deployment, it automatically creates a new environment resource and a new namespace, and deploys into that. Set up this way, every new pull request gives us a new namespace in the cluster with that particular version running, which we can access and run tests against; a very powerful flow (a sketch of such a deployment job follows below).
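To make the environment concept concrete, a deployment job in the application pipeline can target an environment resource roughly like this. The names are hypothetical, and the qbox-devtest environment with its dev namespace resource would have to be set up in Azure DevOps first:

```yaml
stages:
- stage: DeployDev
  jobs:
  - deployment: DeployWeb
    pool:
      vmImage: ubuntu-latest
    environment: qbox-devtest.dev    # <environment>.<Kubernetes namespace resource>
    strategy:
      runOnce:
        deploy:
          steps:
          - checkout: self
          - script: echo "helm/kubectl deployment steps go here"
```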
So let me show you some demos around this, starting with infrastructure as code. For this demo I have a sample application called QBox, a simple quiz application, and I have two repositories: qbox, which holds the actual application, and a separate repository for the environment, where I store my ARM templates and so on. All this code will be available on GitHub after the session, so you can take a look there.

Let's look at a few of these ARM templates. First, the network, which as I mentioned is the first part we need to create. A few things to point out: this template uses the virtual network resource type, and when you create the network you need to make sure it has the correct IP address prefix, so that you have enough IP addresses for your cluster and for the pods running in it. I also create two subnets; the first one is the subnet we are going to deploy the cluster into, so I want to make sure it's available up front.

That's the network, but the more interesting part is the cluster itself, which is a separate ARM template (it might be easiest to look at it in the outline view). This uses the managed cluster resource type, and let me point out a few interesting things. First, your applications run on the nodes, which belong to what's called a node pool, and down here we have the profile for that node pool. It specifies what your nodes should look like when the underlying virtual machines are created: how many machines to start with, what size to use (the number of cores, memory and so on), what operating system, and the subnet ID, which points to the subnet I just showed you so the cluster is deployed into it. There are many more options here that I'll come back to later, things like autoscaling and availability zones.

Also important when you create a cluster is the identity. In this case I'm using a managed identity, which I specify by setting the identity type to system-assigned; Azure will then create a managed identity for me, and this becomes the identity of the cluster. Then I need to give that identity the necessary permissions, and I have three role assignments down here. The first makes sure the cluster has permission to send data to Azure Monitor; that's a specific role called Monitoring Metrics Publisher. The second grants permissions on the subnet I just created, because the cluster needs to be able to create new nodes dynamically and attach them to that subnet. And finally I give it permission to pull images from the container registry: a role assignment scoped to the container registry, with the AcrPull (Azure Container Registry pull) role. This is what allows the cluster to pull the images I build and publish to that registry and run the application. A trimmed sketch of what the cluster template can look like follows below.
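A heavily trimmed sketch of the managed cluster resource, assuming parameters like clusterName and subnetId are defined elsewhere in the template; property names follow the Microsoft.ContainerService/managedClusters schema, but verify against the API version you target:

```json
{
  "type": "Microsoft.ContainerService/managedClusters",
  "apiVersion": "2021-07-01",
  "name": "[parameters('clusterName')]",
  "location": "[resourceGroup().location]",
  "identity": { "type": "SystemAssigned" },
  "properties": {
    "dnsPrefix": "[parameters('clusterName')]",
    "agentPoolProfiles": [
      {
        "name": "nodepool1",
        "mode": "System",
        "count": 3,
        "vmSize": "Standard_DS2_v2",
        "osType": "Linux",
        "vnetSubnetID": "[parameters('subnetId')]",
        "enableAutoScaling": true,
        "minCount": 3,
        "maxCount": 10,
        "availabilityZones": [ "1", "2", "3" ]
      }
    ]
  }
}
```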
Those are the ARM templates; let's look at the pipelines that actually run them. Again I have two pipelines, one for the environment and one for the actual application. Let's look at the environment pipeline first, starting with the latest run. You can see that this is a staged deployment, where the first stage creates or updates the dev/test cluster, and after that I have two more stages that deploy one cluster in Europe and another in the US. That's my production environment, running in two regions, while the dev/test cluster is a single cluster that multiple teams can use.

Let's look at what the pipeline looks like underneath. This is a YAML pipeline, which is what we use nowadays in Azure Pipelines, and you can see the stages here; I'll collapse it a bit so you can see them. Inside each stage I'm using a job template, and the reason is that I want to deploy the cluster in exactly the same way in every environment. I don't want specific things going on in a specific environment, and a template is a great way to avoid repeating myself and to guarantee the same set of steps everywhere (roughly as sketched below).

The template is a separate file, and I'll show you what it looks like. There are a lot of details I won't go through one by one, but you can see it does all the things I talked about on the slide. First I create the Azure Active Directory groups, one called cluster readers and another called cluster admins, so that they are guaranteed to exist, and later on I can assign permissions to those groups and add users to them. Next I run the ARM template for the network, then the ARM template for the cluster, then I create namespaces and permissions, and finally I install the Traefik ingress controller. These are simply the various steps of the pipeline: the ARM template step that deploys the network, the one that deploys the cluster, and so on. The details here are not super important; the point is what we are actually doing as part of this pipeline.
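The stage layout of that environment pipeline, sketched with hypothetical template paths and parameter names:

```yaml
trigger:
- main

stages:
- stage: DevTest
  jobs:
  - template: templates/provision-cluster.yml   # hypothetical shared job template
    parameters:
      clusterName: qbox-devtest
      location: westeurope

- stage: ProdEU
  dependsOn: DevTest
  jobs:
  - template: templates/provision-cluster.yml
    parameters:
      clusterName: qbox-prod-eu
      location: westeurope

- stage: ProdUS
  dependsOn: DevTest
  jobs:
  - template: templates/provision-cluster.yml
    parameters:
      clusterName: qbox-prod-us
      location: eastus
```

Reusing one job template across all stages is what guarantees every cluster is provisioned with an identical set of steps.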
So that's the environment pipeline; let's look at the other one, the application pipeline, which runs every time I make a change to the application. Looking at the latest run: this is also a staged pipeline, where the first stage is the build part (you can consider build to simply be the first stage of your pipeline). That's where I build the Docker images and push them to the container registry. Then it deploys to the dev/test cluster, waits for approval, and when I approve it goes on to deploy into my production clusters. If we edit the pipeline to see what it looks like, it's actually simpler: the first stage is the build, where I'm using docker-compose to build and push the images, and then we have the various stages for Europe, US and the dev/test cluster, again using a template for the actual deployment steps.

Let's take a quick look at that deployment template; I'll switch repository and pull it up. This one is a little simpler: I'm using Helm, which is a common way to package and deploy applications. I do a Helm install step to make sure Helm is available on the build agent, and then I deploy the Helm chart, which is also an output from the build pipeline.

One interesting thing now: if we go back to the latest run, we have this concept of environments that I talked about before. We can see that this particular run was deployed into these environments, and I can view the actual environment. This is also available in the pipelines hub under Environments: here you can see the dev/test, prod-EU and prod-US environments, and this is where you get the visibility into the cluster. First you see the deployments: I can go in and see all the deployments to this particular environment, the dev/test one. But more than that, I can drill into the environment itself, which is a live view into the cluster. When I click on this I see live information coming from the cluster. This is a great feature, in particular for development teams, to make sure the deployments are working smoothly: if you get some deployment-related error you can go in here, look at the status of the various deployments, drill down into the pods, the actual containers running in the cluster, and verify that the details and the manifests that were deployed are correct. You can even watch live logs streaming from the cluster. Of course, as you will see soon, this information is also available through the portal and the Kubernetes CLI tools, but it's quite common not to give all developers access to the cluster itself, and even if you do, using those tools requires more Kubernetes knowledge. This is a great way for developers to get easy access without being granted access to the cluster directly.

The other feature I talked about is automatically deploying pull requests and having these environments created for us, and I'm going to show you that. I've prepared a draft pull request, so I'll just publish it, and that kicks off a new run of this pipeline against this particular branch, to verify the branch is OK, but it will also deploy it into a new namespace. While this is running, let me show you how it's implemented. Going back to the pipeline I showed you before, you can see that in addition to the dev/test stage I also have a stage called dev test PR. This is a specific stage for pull request builds, with a condition to make sure it only runs if the source branch is a pull request branch. In this stage I'm using another template, mostly the same as the first one I showed you, but in addition to the Helm install and Helm deploy it has an extra step called Review App (sketched below), plus a step for creating the namespace.
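The PR stage looks roughly like this; a sketch based on the review apps feature in Azure Pipelines, where the reviewApp step clones an existing environment resource (here named dev) into a new resource for the pull request. The names are placeholders and the exact placement of the reviewApp step may differ from the talk's template:

```yaml
- stage: DevTestPR
  condition: eq(variables['Build.Reason'], 'PullRequest')   # only run for PR builds
  jobs:
  - deployment: DeployPullRequest
    pool:
      vmImage: ubuntu-latest
    environment:
      name: qbox-devtest
      resourceName: review-app-$(System.PullRequest.PullRequestId)
    strategy:
      runOnce:
        deploy:
          steps:
          - reviewApp: dev              # clone the 'dev' namespace resource for this PR
          - script: |
              kubectl create namespace review-app-$(System.PullRequest.PullRequestId)
            displayName: Create PR namespace
```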
This Review App task is built in; you can just add it, and it's the one that dynamically creates the environment resource and sets up the permissions needed to deploy into the namespace we're creating. That is what does the magic here: creating a new namespace, creating an environment resource in Azure Pipelines that points to that namespace, and then we can run the normal deployment steps.

Let's see what happens with the pipeline; it can take a little while to build, but hopefully it will be done fairly soon. Yes, it has built the images and is now pushing them to the container registry, and as soon as that is done it will start deploying into the new namespace. Working with pull requests like this is really powerful; we do this all the time. The ability to create a new pull request, push it, and a few minutes later have it available in an isolated namespace in the cluster means we can run tests against it, have stakeholders look at the functionality, and make them part of the approval process for the pull request. It's a really powerful workflow, and this feature makes the whole thing much easier to implement.

Let's see if we're done here. It has built, and you can see that it skipped the dev/test stage, because this is a pull request; instead it triggered the deploy to the pull request stage, which is now done. The interesting thing now is to go back to our environments: we have dev/test, prod-EU and prod-US.

If I now drill down into the dev/test environment, you will see that we have an additional resource here called review-app-122; 122 is the ID of the pull request, and the name is something you can configure. This resource was created for us automatically, and if we drill down into it we can see the deployments that were just created in this namespace. Again I can drill all the way down, for troubleshooting any issues and so on. So I now have a new namespace in my cluster with the new version deployed into it, and I still have all the environment functionality in Azure Pipelines. This is a really nice integration between Azure Pipelines and Kubernetes, and if you are doing Kubernetes deployments I strongly recommend you look into it.

All right, next up let's talk about monitoring and feedback. If you're running things in Azure you have definitely come in touch with Azure Monitor, which is the umbrella service for everything related to monitoring, logs, analytics and so on. Basically everything running in Azure sends data into Azure Monitor: your applications, the guest operating systems, all the various resources and services, even subscriptions and tenants send different types of telemetry into its stores. Azure Monitor basically consists of two stores: the metric store and the log store. On the right you can see the options you have once the data is there. First there are the insight tools: for specific workloads there is a dedicated insight experience, for example Application Insights, which gives you a lot of insight into your applications, and the one we're going to look at in this session, Azure Monitor for containers, which is specific tooling for Kubernetes that gives you even more insight on top of viewing the raw metrics and logs. You can also visualize all this information using dashboards and Power BI, analyze it yourself using metrics explorer against the metric store, and use Log Analytics to run queries against the log store, build charts, set up alerts and work with dashboards.

Azure Monitor for containers is, then, a kind of add-on to Azure Monitor. When you enable it, anything your containers log is sent to the log store, and Azure Monitor also picks up metrics from your containers, things like CPU and memory usage, and sends them to the data stores. In addition to your containers, everything related to the Kubernetes cluster itself, the nodes, the system pods running inside it, the guest operating system, also goes in and becomes available to you for queries, dashboards, alerts and so on. Enabling this for your cluster is optional; you don't have to use it (one way to enable it is sketched below).
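Enabling the monitoring add-on on an existing cluster from the CLI looks like this; the resource names and workspace ID are placeholders:

```bash
az aks enable-addons \
  --resource-group qbox-rg \
  --name qbox-devtest \
  --addons monitoring \
  --workspace-resource-id "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.OperationalInsights/workspaces/<workspace>"
```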
When you do enable it, AKS deploys the Log Analytics agent in a containerized version, so it runs inside the cluster, and you connect it to a Log Analytics workspace, which is where all the information is stored in those data stores (you can have multiple workspaces if you want). The agent collects memory and processor metrics through the Kubernetes Metrics API, covering everything from controllers to nodes to containers, and writes them to the metric store, while simultaneously shipping the logs to the log store. Enabling it is really easy: however you create the cluster, for example in the ARM template, you just specify that you want it enabled and give it the resource ID of the Log Analytics workspace the data should be sent to.

So let's take a look at Azure Monitor, specifically for AKS. This is the start page for Azure Monitor, and you can see the different insights I talked about: Application Insights for your application telemetry, insights for virtual machines, storage accounts and so on, and then the containers hub, which is the functionality built on top of the AKS monitoring. Here I get an overview of all my clusters; I have the three clusters I showed before, the dev/test cluster and my two production clusters, and I get a quick glimpse of the status and health of these clusters. If anything were faulty I would see a different status here, and of course I can drill down into these clusters to get more insight.

Let's go into the dev/test cluster. This gives me a quick overview of the nodes, the CPU and memory utilization, the node count, and the number of pods running in various states. It looks pretty good right now; I currently have around 28 pods running in this cluster. I can also turn on the live interval, which switches to a live view showing information as it comes in, and of course I can select different time periods here.

Another thing you can and should do here is set up alerts. You can create all types of alerts, and I will show you that in a minute, but in addition there is a set of predefined, recommended alerts covering things you typically want to be alerted on, and this feature makes it super simple to enable them. For example, if containers have CPU usage greater than 95 percent over a certain period of time, I can enable that alert and have it sent to the team on call. Or if the number of pods in a failed state is greater than zero, so if pods are failing, I can set up that alert just by flipping a switch. This is something you really want to enable, and then you can build more customized, advanced alerts on top if you want to.

From this overview I can drill down into my cluster to get more specific information. These are my nodes, and although it's a bit small here, you can see the trend in CPU usage for each of these resources; this node is rather underutilized. Drilling down from a node I see the controllers, and all the way down to the containers, so I can drill into a specific container, see its CPU usage, or switch to the memory view to see how memory has developed over time. This is really good information.

When you're troubleshooting something that has happened, you can look back and see what the trend has been: have we seen increasing memory or increasing CPU, and so on. And I don't have to navigate all the way from nodes down to containers; I can go directly to the containers view. I can also filter it, because in a Kubernetes cluster you always have a lot of system pods running and you might only be interested in your application, so I filter on my namespace and now only see my containers: my three deployments, a frontend, a backend and a database. Looking at one of these containers I get information about its status, environment variables and so on, and I can also get the live data view, similar to what we saw in Azure Pipelines, which again is great for troubleshooting a running instance. If we go to the application, every time I select a category something should show up in the log, and you can see the logs live streaming as I load the selected category, which is great for troubleshooting. I can also see the events related to the pod itself: the image was pulled, which version it was, that it was created, started, and scheduled to a specific node. Great information available in this view.

Then we can work with the two data stores I talked about. First the metric store, where all types of metrics are stored. Here I can easily build charts, for example showing the number of pods across different states; I can add multiple metrics, create charts, and pin them to a dashboard, so I can build dashboards with exactly the information I want. Basically all resources and services in Azure have this metric store available, so you can easily create your own metric charts, and the standard charts you saw earlier are built from the same data; if you go into a virtual machine, for example, you will see similar charts based on the metrics for virtual machines.

In addition you have Log Analytics, where all the log information is stored, and here you can run queries using a language called Kusto. For example, let's do a container CPU query. It's a SQL-like language: I select from the performance table, do some filtering, and aggregate, and I can view the result either as a table or a chart. This is a huge amount of data, and even so it runs really fast. Once I've created a chart like this I can pin it to a dashboard, but I can also create an alert from it: this is how you build more advanced alerts, by writing a query and then alerting, for example, when the number of results exceeds a specific threshold value over a specific period of time. Really powerful. The query syntax takes a little time to get used to, but once you know it, it's quite simple to write these queries (a sketch follows below).
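A sketch of such a container CPU query against the Perf table, using the counter names Azure Monitor for containers writes; adjust to your own schema and time range:

```kusto
Perf
| where ObjectName == "K8SContainer" and CounterName == "cpuUsageNanoCores"
| where TimeGenerated > ago(1h)
| summarize AvgCpu = avg(CounterValue) by bin(TimeGenerated, 5m), InstanceName
| render timechart
```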
And everything running in Azure sends data here, so you can run queries across multiple resources and combine information, for example from Application Insights, in the same query. A really powerful thing.

All right, the last part is availability and scalability. When you talk about availability in Azure, you have the concept of regions, West Europe, East US and so on, around the world, and inside each region you have something called availability zones. Each zone is physically separated from the others, precisely so that a major event like an earthquake or a thunderstorm cannot wipe out all the availability zones at the same time. Each availability zone in turn consists of one or more physical data centers, which have independent power, water supply and cooling. This makes it possible to build highly available applications that are not susceptible to the problems that can occur: data centers can go down, or be the target of a denial-of-service attack, for example, and then a particular region or availability zone will have degraded performance.

So when you deploy something into Azure and want to make use of availability zones, you make sure you distribute it across multiple zones. For example, with virtual machines or virtual machine scale sets, you deploy the machines into multiple availability zones and put a load balancer in front of them to direct the traffic. With AKS you can take advantage of this by specifying that you want multiple availability zones: if I deploy three nodes and specify three availability zones, AKS automatically distributes these nodes so that I get one in each zone, and as I add more nodes they keep being distributed across the zones. This is very easy to enable; you just specify in the ARM template how many availability zones you want, or actually which ones (see the sketch below).
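The equivalent when creating a cluster from the CLI instead of an ARM template, with placeholder names; the --zones argument spreads the nodes across the zones you list:

```bash
az aks create \
  --resource-group qbox-rg \
  --name qbox-prod-eu \
  --node-count 3 \
  --zones 1 2 3
```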
Within a region you then have high availability across the zones, and a whole region doesn't really tend to go down, but you might still want more. You might have users across the world and want to send each of them to the closest cluster, or some kind of networking issue might cause problems in one of the regions. To handle this you deploy the same cluster into multiple regions and put something like Azure Traffic Manager in front of them. Traffic Manager can direct users to the clusters either by geography, sending you to the closest one, or by performance, detecting which cluster currently has the best response time and sending you to the one that performs best at the moment.

When you do this you also want to replicate your container registry. If you push images to the registry in your primary region, a cluster running in a secondary region would have to pull images from across the world; it can do that, of course, but it tends to increase the time it takes to deploy new versions, for example. Instead you can enable geo-replication on the container registry: when you push a new image to the primary region it's automatically replicated to the secondary region, so that each cluster pulls from the registry closest to it.

Related to availability is of course scalability, and there are different types of scaling you can do with AKS. The first is horizontal scaling of your pods: you want to scale out the number of instances running, and this is a built-in feature of Kubernetes that you can enable and configure, called the horizontal pod autoscaler (HPA). With it configured, Kubernetes scales the number of pods up and down, depending on how you configure it, to handle increased load. The next part is scaling the cluster itself: you can scale out the number of pods up to a point, but eventually you will run out of resources on your nodes. Then you can enable the cluster autoscaler, which adds additional nodes to your cluster, automatically schedules the new pods onto them, and thereby balances the load over the nodes.

The pod autoscaler works like this. You have a deployment in Kubernetes where you specify which pods and how many pods you want to run. One way of configuring the HPA is the kubectl autoscale command. Here I'm saying that if my pods are running above 80 percent CPU the autoscaler should act, that I want at least three pods running, and that it may scale up to a maximum of ten pods, so I don't run the risk of totally consuming all the resources in my cluster; the autoscaler will then keep the deployment between three and ten replicas depending on the CPU load. All pods running in the cluster report to the metrics server API, and this is what the pod autoscaler reads to base its decisions on whether to scale your deployments up or down. And if you want to scale on CPU percentage, you also need to define in your deployment what your application actually requests: here I'm requesting at least 250 millicores of CPU, and setting an upper limit as well, and the autoscaler uses this in the algorithm that calculates whether, and how much, to scale (see the sketch below).
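The requests/limits part of the deployment would look something like this fragment; the 250 millicores is the value from the talk, the rest are assumed:

```yaml
# Fragment of a Deployment's pod spec
containers:
- name: qbox-web                               # hypothetical frontend container
  image: myregistry.azurecr.io/qbox-web:latest # hypothetical image
  resources:
    requests:
      cpu: 250m          # the 250 millicores mentioned in the talk
      memory: 256Mi      # assumed value
    limits:
      cpu: 500m          # assumed upper limit
      memory: 512Mi      # assumed value
```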
The cluster autoscaler is simple to enable: you just turn it on and specify the minimum and maximum number of nodes; there is more you can configure, but that's the basics of it. It's hard to demo cluster autoscaling live because it takes a few minutes to set everything up, so I ran it yesterday to show you what it looks like after the fact. At the top you can see the number of pods running in the cluster, and on the bottom left the node count. At the point where the pod count starts rising, that is when I added some load to the cluster, and you can also see the number of pending pods increasing: this means the pod autoscaler wants to create more pods, but there are not enough resources in the cluster to do so. That is the trigger for the cluster autoscaler to create a new node, and once it is up and running you can see the pods being scheduled and no more pending pods. Finally, when the load has gone down for a while, the cluster autoscaler removes the node again and goes back to the minimum, which here is three nodes.

There is a final type of scaling in AKS which is kind of cool: call it serverless scaling, using the concept of a virtual node. In Azure we can take advantage of a thing called Azure Container Instances (ACI), a very fast and simple way of running containers. Instead of creating new nodes, which takes a couple of minutes, we can use a virtual node, which under the hood, instead of provisioning a physical virtual machine, schedules my pods into Azure Container Instances. This is based on an open-source project called Virtual Kubelet, which is integrated into AKS. The virtual kubelet registers itself as a node, so to Kubernetes it looks like a normal node, but when Kubernetes schedules pods onto that node, the kubelet creates container instances instead. This is really powerful in certain burst scenarios. Say you have a queue-based system with workers pulling messages from a queue, and suddenly you get a lot of messages to handle: instead of running a huge cluster or waiting for new nodes to scale up, you can use these virtual nodes, because starting a container instance is much faster than creating a new virtual machine. You can scale up from zero, process those messages, and scale back down to zero, which of course also means you don't pay for nodes unless you really need them. So take a look at virtual nodes; it's something you can just enable (for example as sketched below) and use when you find a need for it.
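Virtual nodes are an add-on as well; a CLI sketch with placeholder names (the subnet must already exist in the cluster's virtual network):

```bash
az aks enable-addons \
  --resource-group qbox-rg \
  --name qbox-devtest \
  --addons virtual-node \
  --subnet-name virtual-node-subnet
```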
So let's finish up with some demos on availability and autoscaling, starting with the availability part. This is the cluster again, and we have the node pool; it says I have two instances in my node pool. Under the hood, as I talked about before, this is a virtual machine scale set, which you can drill into: each node pool in my cluster is a scale set, and going into mine you can see that the location says multiple zones. I can't see the zones at this level, but if I go into the first virtual machine I can see it's running in zone 1 in West Europe, and the other one is running in zone 2 in West Europe. So AKS has automatically distributed my nodes across multiple availability zones, and if I added one more node it would be placed in the third zone. Again, this is what we can do to make the cluster highly available within a region.

Then, to support the multi-region functionality I talked about before, you set this up using Traffic Manager. The Traffic Manager profile is the endpoint your users browse to, and it redirects the traffic based on how you configure it; in this case I configured the geographic routing method, meaning it sends traffic to the closest cluster. So this is the URL, and if I go to it I'm automatically routed to the closest cluster: you can see here that it resolves to the EU cluster, but if I were in or close to the US it would automatically direct me to that cluster. I could also use the performance routing method instead, to make sure I always land on the cluster that is most performant at the current time.

Last up, let's do the autoscaling part, and let's take a look inside my cluster. If I get the deployments in my namespace, you can again see my three deployments, the frontend, the backend and the database, and I'm running only one instance of each pod at this point, so it's not really suited for high load. So I'm going to enable autoscaling with kubectl autoscale: I want to scale my frontend web app (I could of course autoscale the backend too, if I wanted to), and I'm specifying a CPU target of just 10 percent, which isn't realistic but makes the demo easier to show, with a minimum of 3 and a maximum of 10 replicas. Then I add some load using a tool called bombardier, which fires a lot of HTTP requests from my machine (the commands are recapped below).

We can go back to the insights view to watch the scaling happen: if we switch to the last 30 minutes and enable the live view, we should see the pod count going up in a few seconds. Basically, the autoscaler looks at the CPU load of my pods, and when it sees that the average CPU is above ten percent it increases the number of replicas. I can also monitor this by watching the HPA status: you can see the current utilization is around 60 percent and it has already scaled up to six instances, which you should see showing up in the chart. It might be hard to see at this scale, unfortunately, but it has scaled up to six replicas, and if the load continues it will go all the way up to ten instances based on the CPU load. You can see the average is now 31 percent, because we scaled up a bit, but as long as it's above 10 it should continue to scale even more.
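Roughly, the commands behind this demo; the namespace, deployment name and URL are hypothetical stand-ins for what's on screen:

```bash
# See the current deployments in the application namespace
kubectl get deployments -n qbox

# Enable the HPA: unrealistically low 10% CPU target, 3..10 replicas
kubectl autoscale deployment qbox-web -n qbox --cpu-percent=10 --min=3 --max=10

# Watch the autoscaler's view of current load and replica count
kubectl get hpa -n qbox -w

# Generate load with bombardier (125 connections for two minutes)
bombardier -c 125 -d 120s https://qbox.example.com
```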

So yes, now it has actually scaled up to 10 instances, and if I get the deployments again you can see that I have 10 instances of my frontend, done entirely by the horizontal pod autoscaler; you can also see the pod count going up in the chart. This is how you can handle increased demand, and if you've configured the cluster autoscaler, you will eventually run out of resources on the nodes, at which point it kicks in and increases the number of nodes as well; I showed you that on the slide, since we can't really wait for it now.

All right, I think I'm actually out of time, but that is what I wanted to show you: infrastructure as code and automated deployments, to make sure you have a safe, repeatable way of provisioning both the cluster and your applications; how you can use Azure Monitor to get insight into the cluster and the applications, creating alerts, charts and so on; and how you can think about and work with availability zones and regions, and the scaling capabilities both in Kubernetes itself and in AKS. CloudNow Technologies offers DevOps consulting and DevOps services for its clients, and is ranked as a top DevOps services company in the USA.