Are Azure Certifications Valuable?

It had been years since I stopped taking certification exams, but lately I decided to get back to that adventure. In September 2020 I passed the exams to become a Microsoft Certified Azure Solutions Architect Expert.

Arian Celina - Certified Azure Solutions Architect Expert

Before and after passing the exams, I asked myself a question I have seen many people ask as well: are Azure certifications valuable? I believe the question applies not only to Azure certifications but to any industry certification. Are they valued by the community and employers? Do they make a difference in your career? I have written about this topic in the past, and yet the same questions keep coming up. Let me share my updated opinion with you.

From a motivational perspective, it is important to clarify the answer to these questions so we know why we are putting in all that effort to learn and prepare for the exams. I also think the questions should be rephrased from a reversed perspective. Before we modify the question, let us first discuss the value itself.

What is the value and who defines it? Who can tell whether a certification is valuable? I think there is no single source of truth for that. From experience, I have encountered employers and peers who value a certified professional more than someone who is not certified. I have also seen the opposite. As someone who has interviewed more than 100 candidates so far, I can say both can be right and wrong.

I have met certified candidates who could not back up their title with knowledge. Likewise, I have met people who were not certified on a topic but were more knowledgeable than their certified peers. I have also met certified people who were subject matter experts and knew their topic in the kind of detail we do not easily pick up by randomly playing with the technology.

When I reflect on my recent Azure certification journey, as well as on my earlier .NET and Java certifications, what I remember is that while preparing for the exams I often learned hidden details about topics I had not encountered in my daily work. Those details have later saved me time and effort and enabled me to bring better solutions to life. That in itself is valuable to me.

Considering this, the revised question I think we need to ask ourselves would be:

What do we gain from this certification?


In my opinion, we should not get certified so that other people value us more because we hold a title. We should pursue certifications to learn the technology we like more thoroughly. Of course, one can do that without taking exams or getting certified; it is just that taking the exams pushes you to follow a curriculum reviewed by experts, which often gives more structure to the learning.

I have been using Azure for years and have quite some experience deploying and running web applications on various kinds of workloads, containerised and non-containerised. Yet, when I took the recent exams for the Azure Solutions Architect Expert certification, I learned a lot about topics that were less familiar to me, like migrating virtual machines from an on-premises datacenter to Azure, backing up virtual machines, or ExpressRoute. Taking those exams pushed me to learn about topics I probably would not have touched in my daily work.

In conclusion, I believe industry certifications are valuable because they push us to get better at a topic, and that inherently makes us a more valuable contributor to our team, company and community. Once we see it that way, whether the certification is valued by potential future employers plays a smaller role in answering the question.

Automated resource creation in Azure

When creating single resources, e.g. a virtual machine or a web app service, it is very convenient to do it from the Azure Portal. It is so convenient that quite often we rush into creating a whole setup straight from the portal. It is only when we need to replicate that setup that we see this does not scale. In this article, I will go through our current options for automated resource creation in Azure and their advantages and disadvantages, based on my personal experience.

When is automated resource creation necessary?

Before digging into technical details, let us first understand when we need to create resources automatically. Automation is my personal choice whenever I am setting up something that will live longer than a day or two. In other words, if I am testing an idea and need a quick web app service or a storage account, I go ahead and create it from the portal. If I am setting up a solution to deploy an application, then I know that eventually I will need to recreate that setup: we might need to replicate the environment, e.g. create a staging environment, or replicate the same setup in another region. In such scenarios, automating the resource creation will save you a great deal of effort later.

Another reason to have resource creation automated is disaster recovery. One has to be prepared for a catastrophic situation, e.g. your existing setup is no longer accessible; having it automated will enable you to recreate it quickly and lower your downtime.

Note: although this post focuses on options for Azure, equivalent options exist for AWS as well.

So, what are our automation options?

ARM Templates

Azure Resource Manager (ARM) templates are Microsoft’s recommended way of automating resource creation. These templates are essentially JSON files with a “simple” structure. They can be written in any text editor, but VS Code has particularly good support for them.

Advantages

There are certain advantages to using ARM templates, the most important one in my opinion being the first-class support inside Azure. They integrate very well with Azure DevOps to automate infrastructure creation. I can also download the template of any manually created resource (mostly accurate), put it in source control and connect it to a pipeline. Voilà, from manual to automated resource creation in a few simple steps.
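As an illustration, a template exported this way can be deployed from the command line or from a pipeline step with the Azure CLI. A minimal sketch, assuming a resource group named my-rg and exported files azuredeploy.json and azuredeploy.parameters.json (all names here are placeholders):

# Sign in and select the subscription to deploy into
az login
az account set --subscription "my-subscription"
# Deploy the template into an existing resource group
az deployment group create --resource-group my-rg --template-file azuredeploy.json --parameters @azuredeploy.parameters.json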

These templates can also be used to make sure the infrastructure has not drifted. A typical scenario: if you run the infrastructure pipeline regularly and someone has changed a setting or removed a resource manually, the pipeline will correct the change and bring the infrastructure back to the desired state.

Also, the good integration with Azure Key Vault allows us to store secrets safely in the vault and access them easily from the pipeline, which is particularly useful.

Disadvantages

There are certain disadvantages to ARM templates too. They are an Azure-specific feature, so the knowledge is not reusable on other cloud platforms. Also, writing them is not particularly easy (though this is improving every day with better support in VS Code).

I would also classify the documentation for ARM templates as a disadvantage. Microsoft is improving it continuously with more samples and tutorials, but at the time of writing, if you need something beyond a simple setup, finding your way around is not easy.

Terraform

Terraform is an infrastructure automation tool created by HashiCorp. Its support for Azure is solid and mature, and its multi-cloud support makes it an appealing tool.

Advantages

One of the main advantages of Terraform is its multi-cloud support. Even if you are not using multiple cloud platforms today, you can reuse the knowledge later when you need another cloud provider. The documentation is decent, and you can find plenty of blog posts and samples on the internet. Another big advantage is the ability to preview changes: I can see all the changes my script would make before I apply it, which gives me the chance to stop a dangerous operation. This gives me confidence before executing any scripts, especially against production environments.

Another favourite feature of mine is workspaces, which make it easy to manage different environments of the same infrastructure setup.
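To give an idea of the workflow, here is a minimal sketch of the commands involved; the workspace name staging is just an example:

# Initialise the working directory and download the providers
terraform init
# Create and switch to a workspace for the staging environment
terraform workspace new staging
# Preview the changes the configuration would make
terraform plan -out=tfplan
# Apply exactly the changes that were previewed
terraform apply tfplan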

Disadvantages

Although Terraform is advertised as a multi-cloud solution, the abstraction over cloud providers is not at the level you might wish for. The building blocks of Terraform files are cloud specific: the code that creates an Azure storage account is not the same code that creates an AWS S3 bucket.

Another big problem that can arise with Terraform is the state file. It contains the current state of the infrastructure, and if it gets corrupted or lost, you can no longer safely execute changes against the cloud environment. It is therefore important to store it in a shared location that all users of the scripts can access. Also, since the state file contains the secrets that were applied, it becomes a secret itself that you want to protect well.

Azure SDKs

Azure offers SDKs for different programming languages, which provide another programmatic route to automated resource creation. This can be very useful in specific scenarios, e.g. if you want to build a custom one-stop shop for resource creation in your organisation, where you can enforce your standards and hook resource creation into your organisation’s workflows and needs.

My experience with the Azure SDKs has been very limited so far, and I have not developed a very favourable opinion of them. One big disadvantage I have faced is that the SDK was not up to date with Azure itself. In one such case, when creating an App Service, the options the SDK offered for selecting the tech stack and some other settings were lacking compared to what the portal offered.

Conclusion

Overall, I would say we are lucky to have many options to choose from. Which one you choose often depends on the circumstances, but I hope this list of advantages and disadvantages helps shape your decision. In my case, I try to use Terraform whenever I can; if that is not possible, I usually fall back to ARM templates.

Kubernetes basic questions answered

When getting started with Kubernetes, it can be daunting at first to grasp the basic concepts that will allow you to move forward with it. I would like to answer some basic questions about Kubernetes, including some I had in mind when I first started to learn and work with it.

If you do not have any understanding of containers and containerised applications, it becomes even harder to see where Kubernetes fits in the big picture.

Explaining what containers are is out of the scope of this article. If you still need clarity on what containers are and how they help, I suggest finding that out first before moving forward. If you feel comfortable with containers, then off we go.

What is Kubernetes?

Kubernetes is a container orchestrator. And what would that be? Well, when we containerise an application, we package the application and its environment into an image, but we still need to run it somehow. We can execute a docker run command to start a container, but when we have an update or a new image, we would manually have to kill the running container and start the new one. And what happens when the container crashes, e.g. because of an unhealthy state? Who takes care of restarting it? This is where Kubernetes fits into the big picture.

Kubernetes orchestrates the lifecycle of a container. It can deploy containers together with their dependencies, restart them if they crash, update them if the image version changes, create new instances of the image without downtime, and so on. Most of this can be automated fairly easily with the help of Kubernetes.
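To give a small taste of what this looks like in practice, here is a hedged sketch using kubectl; the image names my-registry/hello:1.0 and :1.1 are placeholders:

# Create a Deployment that runs the container and keeps it running
kubectl create deployment hello --image=my-registry/hello:1.0
# Roll out a new image version without downtime
kubectl set image deployment/hello hello=my-registry/hello:1.1
# Watch the rolling update progress
kubectl rollout status deployment/hello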

What is a Pod?

Well, in the previous paragraph I oversimplified a bit. Kubernetes does not actually deal with containers directly; it works one abstraction level higher, with a concept called Pods. A Pod can be thought of as a mini virtual machine which can run multiple containers inside. Usually Docker is used as the container engine, but this can be configured differently if needed.

Ideally, a Pod runs one main container (e.g. one application or service) and may run other sidecar containers that serve the main one. The reason is that if one of the containers reports being unhealthy, Kubernetes will kill the whole Pod and try to create a new one. Therefore, it is good practice to have one main service running in a Pod, so that if that service is not healthy, the scheduler re-creates the Pod.

What applications can I run on Kubernetes?

Anything that can be containerised. Kubernetes supports stateless as well as stateful applications, although from experience I can say that running stateless applications is easier, because managing state requires more work on our side.

Personally, I try to keep stateful software outside Kubernetes and consume it from PaaS providers; the database is one example. This leaves me more room to focus on running the in-house developed applications and spend less attention on dependencies.

What is kubectl?

kubectl is a CLI tool to query and manage Kubernetes. Kubernetes has several types of resources, such as Pods, Services, Deployments and ConfigMaps. kubectl makes it easy to find information about those resources as well as change them: one example would be reading the configuration of a Deployment, another would be scaling a Deployment up.
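A few examples, with my-app as a placeholder name:

# List the Pods in the current namespace
kubectl get pods
# Inspect the configuration and recent events of a Deployment
kubectl describe deployment my-app
# Scale the Deployment up to five replicas
kubectl scale deployment my-app --replicas=5
# Tail the logs of a specific Pod
kubectl logs -f <pod-name>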

One can get most (if not all) of these using a UI, but come on, who needs a UI nowadays ☺️.

I want to have a Kubernetes cluster, what are my options?

Starting with the most obvious option, you can get some bare-metal servers and install your own Kubernetes cluster. However, I would strongly advise against this unless you really know what you are doing. Kubernetes is a very complex system; it has several components, and a good configuration requires several servers. Just keeping the configuration safe, available and up to date is a challenge, let alone taking care of more complex topics like the security of the cluster.

Unless you are constrained here, I strongly recommend starting with one of the cloud providers that offer Kubernetes as a service, among them Azure, AWS, and DigitalOcean.

The cloud providers abstract away the management of the cluster itself and give you freedom to focus on actually building your application infrastructure.
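On Azure, for example, a managed cluster can be created with a couple of CLI commands. A sketch with placeholder names (my-rg, my-cluster):

# Create a managed Kubernetes (AKS) cluster with two nodes
az aks create --resource-group my-rg --name my-cluster --node-count 2 --generate-ssh-keys
# Fetch credentials so kubectl talks to the new cluster
az aks get-credentials --resource-group my-rg --name my-cluster
# Verify that the nodes are up
kubectl get nodes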

When is Kubernetes good for me?

If you have only one or two applications running, you are better off without it. Kubernetes offers great functionality for orchestrating containers, but it also comes with administration overhead. If you are not building several (3+) different applications or microservices that you deploy frequently (several times per month), in my opinion it is not a good option.

Kubernetes is a great helper in an environment of multiple microservices with continuous delivery. It is overkill for running 2-3 applications that get deployed a couple of times per month. You get my point.

Start small and adjust as you grow!

Conclusion

Kubernetes is one of the coolest tools of our time. It has enabled many business solutions to scale flexibly and shine. At the same time, it can be a complex beast: approach it carefully and prepare well before adopting it. Equipped with the right knowledge, it will take your DevOps processes, and with them your ability to react to change, to a whole new level.

My language learning framework

The most important lesson that I have learned in my career of almost 15 years in software development is that programming languages are just tools to solve real life or business problems. About 5 years ago I was doing full time .NET, then I jumped to Java for about two years, then I switched to Ruby and Rails, and just recently I started using Kotlin to create two microservices (don’t ask me why, the business needed it ;)).

Because of this, there is no point whatsoever in being religiously bound to a language or platform. Languages are tools; it is the core knowledge of computer science and software development practices that prevails.

These changes were driven by different rationales, sometimes influenced by the employer and sometimes by the economics of the runtime environment. For example, hosting a Java web application in 2006 was far more expensive than hosting a PHP application, so I used PHP as the primary language for my web-facing pet projects. Over my career so far, I have used Java, .NET, PHP and Ruby as my core languages and have also implemented small solutions in Kotlin, JavaScript, and Python.

To adapt to this changing environment, I noticed I had created a framework, which I call my language learning framework. It is not about the language per se but about the whole ecosystem around the language. It came from identifying the things that stay the same in any language ecosystem: no matter the language, there is a way to deal with strings, numbers or arrays, and there is some sort of collections library.

With almost every language I have used, I have also needed to learn a framework for developing web applications, a framework for testing, and other common things around the application. So, out of this experience, here is my language learning framework. It lists the things I try to learn or pay attention to when I start learning a new language.

I. Language
1. Runtime & ecosystem (general knowledge)
2. Syntax
3. Data types (if there are types)
4. Main constructs
4.1. packages/modules, classes and methods/functions
4.2. loops & conditionals
4.3. arrays
5. Core libraries
5.1. String manipulations
5.2. collections (lists, arrays, maps, sets)
5.3. math (rounding, sqrt, pow, PI)
5.4. Important packages/gems/etc. and any language/platform-specific libraries
II. Frameworks
1. Application development (evaluate popular ones and pick one)
2. Testing (unit & integration)
3. Main design patterns in that language
III. Toolset & Other
1. IDE platform specifics
2. CI/CD
3. Deployment (do Docker & Kubernetes work, or is there anything platform specific?)
4. Community
5. Important learning resources

Some of the items on this list are crystal clear, so you just follow them, but some are hard, like choosing which framework to use. How do you pick? In those cases, I use different inputs, but most often it comes down to an educated guess. I ask colleagues and friends whether they have experience with any of the candidates and get their opinions. I also look at blog posts about them and try to measure popularity: the number of contributors, how often a major version gets released, and how many job openings require that framework. Based on all these inputs, I narrow my list down to two options. Then I try both on a simple scenario, evaluate how easily I could implement it, get a feel for each, and decide. Of all the factors, how much I enjoy working with a framework and how many job vacancies require it weigh the most with me. The pleasure of working with it is very important to me, but so is having the opportunity to use it (an employer or business that actually uses the framework).

One important thing: when I jump into a new language, I try to go through all these steps in a relatively short period, at most 2-3 months, and then repeat them at least twice with 2-3 sample projects. This way my brain has a chance to retain it; otherwise, if it takes too long to go through the list, I forget things along the way and the end result is not satisfactory.

This is not a definitive list and doesn’t include everything we need to learn to be productive in one language/platform, but it certainly serves as a good starting point for me when I want to jump into a new language.

Do you have a different approach? Share with us!

Taking over a team, what should you do?

As team leaders, we are more likely to take over an existing team than to build a new one. When taking over a team, I have found it very useful to go through several steps to gather information before moving forward.

As a baseline, I first want to state what is important to me at this stage, so the next steps make more sense. I believe certain characteristics must be present in a team to make teamwork possible, and one of the most crucial for me is communication. Without open and clear communication, you open the door to all sorts of problems, from misunderstandings to not delivering any work. So let’s go through my observation points:

Communication

This is the most important one for me and the key to the team’s successes or failures. It is vital to understand and agree on the communication channels the team feels most comfortable with. Some prefer verbal communication (my preference too when the team is on site) and some prefer written communication, especially when part of the team is remote. From experience, a mix of both works pretty well: it maintains the human touch while giving absent or remote people a chance to stay informed.

Team dynamics

I believe every team has a spirit of its own: not the individuals, but the group acting together. It is like a one-way hash of the team; if you remove one person or add one, the team spirit is different, so the hash changes. Team dynamics tell you how well the currents in the team flow. How well do the team members communicate and get along with their teammates? Who dominates, who leads, who is the strong influencer? Who is quieter and needs help to be heard, and who maintains the balance of the team? Finding answers to these questions will help you determine where and how to direct your focus to get the team to an optimal performance level.

Individuals in the team

Although the team spirit is very important, the individuals in the team matter just as much. Every team member has a unique personality, so each one influences the team differently. Ignoring a single team member may break the whole team dynamic. You need to continuously foster and maintain an individual relationship with every team member. Regular 1-on-1 meetings help a lot here: you can check how they feel about the team and whether anything could be improved or changed to create better engagement or a more interesting experience.

Strengths and weaknesses of team members

To take the team to an optimal performance level, I find it of utmost importance to fine-tune the way individuals contribute to the team. Every individual has strengths and weaknesses. The point is to engage people on the things they are best at while letting their teammates cover for their weaknesses. This also makes people feel better, as they perform best when doing the things they are strongest at. One might argue that we should push our teammates to improve on their weaknesses, and yes, I agree, but remember, the focus of this post is team performance; there will be another post on helping individuals thrive 😉

Conclusion

When you take over an established team, it is not easy to figure out how best to behave so that you improve team performance rather than hurt it. My usual approach is to take my time, usually a sprint, and do nothing except observe these characteristics. Once I have formed a picture in my head, I start approaching individuals to find out more about them and complete the picture. This is when I create a list of things that need to improve and start acting on them, working towards a fluent delivery cycle.

Do you have a different experience or opinion? I’d like to hear about it. Please share it with us.

How to unleash employee creativity

At my current company, Springer Nature, we have the great benefit of being free to dedicate 10 percent of our work time to a side project, to learning something new, or to anything that helps us grow. Our employer gave us this freedom so we can develop personally and professionally, but one observation I have made during the months we have been practicing this is that it also helps unleash employee creativity.

How do we do it?

This initiative started as a Hack Day for developers. We later renamed it “10 percent time” to make it more inclusive of the other profiles in our department, such as UI & UX designers, PMs, and POs. We spend every second Friday of the month doing something other than regular work, something that in one way or another helps us learn something new. Sometimes we do an online course, test a new version of a library we use every day, evaluate a new framework or even learn a new programming language. At the beginning of the day we hold a joint stand-up where we share our plans for the day with the other participants. Sometimes someone likes somebody else’s idea and we join forces for the day to create something awesome. At the end of the day, we gather and share what we created and what we learned. Some do a demo, some showcase their code and some just summarize their learnings. During this sharing session people often get inspiration for their next hack day, or we realize that a presented idea could benefit the company as a project, and we pitch it to our colleagues and management.

What did we do during these days?

During a previous Hack Day, one of my colleagues created a simple NodeJS CRUD API, as she wanted to learn NodeJS. On the other side, since I usually do backend work, from time to time I am quite interested in learning frontend things. I had wanted to learn Vue.js for a long time, so I volunteered to build the frontend for that API. In those few hours of coding, we managed to build a simple Vue.js application implementing a frontend for the CRUD operations of that API. The code can be found in the https://github.com/acelina/books-fe GitHub repo. Of course, I didn’t become proficient in Vue.js in one day, but next time I need a frontend for an app, at least I know where to start, and I value that.

In another case, a colleague and I started a Hack Day project to improve the process of managing code challenges for our developer candidates. We worked on it over three Hack Days. The result was an application that automatically creates a GitHub repo for a candidate, puts her code challenge in it and gives her the privileges to commit to that repo. It also manages the submission workflow: when the candidate opens a PR with her finished code challenge, the application removes her from the project collaborators and notifies us in a Slack channel that a submission is ready for review. Those were three fun Hack Days for the two of us, and they resulted in a production-ready application that eliminated manual labor. Several other successful results have come out of this 10 percent time.

What did we achieve?

I understand that the projects we do during these days are rarely ready for production, but we have managed to create a culture of sharing knowledge with others and, through it, of fostering employee creativity. This 10 percent time creates space for us to experiment with things we don’t have time for during regular work days because of deadlines or priorities. It also helps us grow professionally and personally. Sometimes it results in something useful for the company as well, and most importantly it helps us unleash our creativity while having fun. As a developer, I value this a lot in a company, and I would recommend that every company start practicing it. You never know where brilliant ideas come from!

Setup your deployment using Dokku

Production deployment environments come in all sorts of variations nowadays. The configuration architecture is mainly influenced by the size of the application and the budget, but also by the process flow and how easy it is to do the deployment. Quite often, in the early phase of an application or business, you simply do not need a full-fledged cloud deployment setup. This post assumes you have small infrastructure requirements but want a smooth deployment process. For a simpler scenario where you need a single server or just a few, Dokku does a great job by making it possible to push code to production with a simple git push.

Dokku is like Heroku on your own infrastructure. It makes it easy to manage your application’s deployment workflow by running applications in Docker containers. It also uses an Nginx reverse proxy, but since its configuration is managed by Dokku itself, you barely notice it. After you deploy your application, you can scale it up and down with a single command. Through Dokku plugins, you can also create and manage database instances (e.g. Postgres or MySQL), schedule automatic backups to AWS S3, create Redis instances and configure HTTPS using Let’s Encrypt.
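To give a flavour, here is a hedged sketch of how such plugins are typically used once Dokku is installed; the app and database names are placeholders, and the exact subcommands may differ between plugin versions:

# Install the Postgres plugin, create a database and link it to an app
sudo dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres
dokku postgres:create mydb
dokku postgres:link mydb myapp
# Install the Let's Encrypt plugin and enable HTTPS for the app
sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git
dokku config:set --no-restart myapp DOKKU_LETSENCRYPT_EMAIL=you@example.com
dokku letsencrypt myapp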

Some cloud infrastructure providers, like DigitalOcean, offer a ready-made Dokku server image which you can instantiate in minutes and start using right away. In this post, I will go through the process of installing and setting up your own Dokku environment. For this write-up, I assume you already have an Ubuntu server with SSH enabled. For reference, I am using a virtual machine (VM) on VirtualBox running Ubuntu 16.04, but you can do the same on any Ubuntu instance running anywhere. Off we go…

Step 1: Installation of dokku

To do the installation, we follow these commands:

wget https://raw.githubusercontent.com/dokku/dokku/v0.11.4/bootstrap.sh

sudo DOKKU_TAG=v0.11.4 bash bootstrap.sh

The first command downloads the installation script and the second one actually performs the installation. Downloading the script is fast, but the installation itself takes a few minutes (depending on the performance of your server). Please check the Dokku documentation to find the most recent version.

Step 2: Initial setup

Once the installation has finished, open your server’s IP address in a browser to reach the dokku setup page. In the “Public Key” field, paste your public key (mine is at ~/.ssh/id_rsa.pub) and then click the Finish Setup button. With this step, our dokku instance is set up and ready.

Step 3: Creating and deploying an application

Now that we have dokku ready, we want to deploy our first app. Setting up the application is a one-time task we need to do from the server’s console. Once connected to the server over SSH, we can create and configure the application in two steps:

  • Run dokku apps:create name_of_app to create the app. We can verify that the app was created by running dokku apps:list
  • Run dokku domains:add name_of_app hello-world-app.com to configure the domain of the app.

In order to serve the app, Nginx needs a domain configured for it. If you have a real domain, go ahead and use it. For this post I will use a fake domain, which I will map to the IP address of my VM in my /etc/hosts file. For this example, I will use the hello-world-app.com domain.

Next, we need to set the git remote for the dokku environment. We do this by running git remote add remote_name dokku@dokku_server:name_of_app. Here,

  • remote_name is the name we want to give to our deployment environment, this could be anything you like e.g. production, staging, etc.
  • dokku_server is the IP address or the URL of the dokku server we just configured
  • name_of_app is the name we specified when we created the app.
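For example, with a hypothetical server IP, this could look like:

# Add the dokku server as a git remote named "production"
git remote add production dokku@192.168.56.10:name_of_app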

Now that the remote is set, we can deploy our app by running git push remote_name master. When the deployment is finished, you can test your application by visiting http://hello-world-app.com

P.S. If you don’t have an application ready to deploy, go ahead and download a hello world web application from my GitHub account.

 

Step 4 (optional): Scaling your application

Now that our web application is deployed, it is running as a single Docker container instance. For a production deployment we would probably want to scale it to more instances for better performance. A reasonable configuration is to scale the app to as many instances as we have processor cores. Assuming our server has an 8-core processor, we could scale the app to 8 instances by running

dokku ps:scale name_of_app web=8

After that, we can check the instances by running either dokku ls or sudo docker ps.

Conclusion

When setting up new applications, I try to take a pragmatic approach and keep things as simple as possible. In my opinion, Dokku is a good starting point. It makes the deployment process dead simple and gives us the flexibility to scale as we need. When the application starts to face a lot of traffic and this infrastructure has a hard time coping with it, then I start thinking about more advanced deployment workflows.

If you have followed the steps and tried your own installation, I’d like to hear about your experience. Please post a comment and share it with us.

Hope it helped!

My impression of DevOpsCon 2017 Berlin

Lately I have developed an interest in DevOps. From time to time I try to learn about new possibilities to automate and optimize my software delivery process, and I find it quite exciting to learn what DevOps offers today. I think all developers should know at least a little about system administration and DevOps, to better understand the environment their applications are deployed in and make themselves more productive. In this post, I will summarize my impressions of DevOpsCon 2017 Berlin.

Over the last two days I had the opportunity to attend the DevOpsCon 2017 conference in Berlin. The conference had a pretty busy agenda, with presentations across five tracks and four keynotes. The presentations covered broad subjects such as designing microservices, using tools like Jenkins to create container images, securing Docker containers, managing delivery pipelines, managing deployments in a polyglot environment, and more.

What I liked: most of the talks were about real-life experiences of companies and consultants, on how they implemented DevOps in their organisational structure or helped other companies do so. Some shared past failures and what they learned from them, which I found very useful.

What I missed: most of the talks were plain slide presentations with little to no demos. It felt a bit dry to just listen to people sharing their experience with a tool without a single demo of it. I was expecting a lot more hands-on demos.

What I didn’t like: some of the talks were given by conference sponsors. As they were showing use cases around their own products, the talks sometimes sounded more like a marketing pitch than an experience-sharing presentation.

My takeaway from the talks and shared experiences is that quite a lot of companies are already working towards having DevOps people on their teams, be it as a specialized position or as a shared responsibility of developers and operations people. I also understood that companies often struggle to fit these positions into their current organizational structures, and sometimes need to change the way their teams communicate. Being relatively new, the role is also often misunderstood, as the responsibilities of a DevOps person are not clearly defined in most places.

Backup your database

Backing up your database regularly is one of those tasks that is very easy to accomplish but quite often neglected until the minute you face your first disaster. Backup solutions range from very complex, carefully engineered setups to simple copy-and-paste approaches. Depending on your application, the need for a complex solution varies. So what would a minimal solution that is viable and stable look like?

The simplest solution in my opinion should contain these elements:

  • Create a dump of the database
  • Copy the dump to another location (different from the DB server)

Depending on your RDBMS and operating system, you can take different approaches to implement these two tasks. On a Linux system with MySQL or PostgreSQL, my approach would be to write a shell script that does both. Such a script might look like this:

#!/bin/bash

DATE=`date +%Y%m%d-%s`

FILE=/path_to_bckp_folder/$DATE-blog.sql

mysqldump  db_name > $FILE

The DATE variable holds the current timestamp and is used to construct the name of the backup file. The FILE variable holds the complete path of the backup file. The mysqldump command dumps the database named “db_name” to that path.

So far, this script completes the first task: backing up the database. The next task is to copy the dump somewhere else so we don’t have the database and the backup on the same disk. This “somewhere” could be another server, another disk, or cloud storage. My preferred choice is cloud storage like AWS S3. With the AWS CLI installed, copying the backup file to an S3 bucket is a one-liner:

aws s3 cp $FILE s3://bucket_path

The complete file would look like:

#!/bin/bash

DATE=`date +%Y%m%d-%s`

FILE=/path_to_bckp_folder/$DATE-blog.sql

mysqldump  db_name > $FILE

aws s3 cp $FILE s3://blog-db-bckp

Conclusion

I know many people might not agree that this is a good solution, but in my opinion it is the minimum amount of code that does the work. What you need to do next is put this script in the /etc/cron.daily folder, and it will execute daily and do the backup for you (make sure the permissions are set correctly so it can be executed). It is not the most elegant solution, but it does the job.
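A minimal sketch of that installation step, assuming the script above is saved locally as backup.sh; note that on Debian/Ubuntu, run-parts skips files whose names contain a dot, so the installed file should not keep the .sh extension:

# Install the script so cron runs it daily
sudo cp backup.sh /etc/cron.daily/db-backup
sudo chmod +x /etc/cron.daily/db-backup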

As a bonus, I would also add a “done” notification to the script, which could send an email, post to a Slack channel, or use whatever notification suits you, so you know the script has executed.
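For example, with a Slack incoming webhook this could be a single extra command at the end of the script; the webhook URL below is a placeholder for your own:

# Notify a Slack channel that the backup has finished
curl -X POST -H 'Content-type: application/json' --data "{\"text\":\"Database backup $FILE completed\"}" https://hooks.slack.com/services/XXX/YYY/ZZZ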

ElasticSearch – Getting started

Context

A few weeks ago, our team decided to use ElasticSearch for the search functionality of the project we are implementing. I had no previous experience with search engines, so I was excited to get my hands on something really new. What I am going to describe here is the process I followed to do a spike on ElasticSearch and what I learned from it.

Why should you care about it

ElasticSearch is useful in several scenarios. It is very good for searching product catalogs, building recommendation engines, aggregating and searching logs for faster debugging, data mining and pattern identification, and much more. There is a chance the product you are working on might need such a thing.

Some basic concepts first

As an entry point, I first had to learn the basic concepts of ElasticSearch. In essence, ElasticSearch is a document database which uses Lucene as its search engine. It is fast and horizontally scalable, though it is also a very memory-hungry system. It stores all documents inside so-called indices. An index can be imagined as something like a database in the relational world. Inside indices you can have types, which you might think of as database tables (not exactly the same, but a good analogy).

The data of an index is saved in one or more shards (the default is 5, but this is configurable). Every shard can have one or more replicas; a replica is just a copy of the shard and serves read performance and failover. If you don’t have millions of documents, it might be better to have fewer shards for better performance.

Setup ElasticSearch on your computer

To start experimenting, I set up an instance on my computer. Setting up ElasticSearch is pretty easy (at least at the beginning). There are numerous approaches, and they are well documented on the installation page. The approach I took was to run it as a Docker container on my local machine so I could start experimenting right away, and to have a Chef script do the installation for production.

If you already have Docker installed, getting ElasticSearch up and running on your machine is a matter of minutes. First, pull the image

docker pull docker.elastic.co/elasticsearch/elasticsearch:5.3.1

and then run the container

docker run -p 9200:9200 -e "http.host=0.0.0.0" -e "transport.host=127.0.0.1" docker.elastic.co/elasticsearch/elasticsearch:5.3.1

This command runs the container and exposes ElasticSearch on port 9200. If you open http://localhost:9200 in your browser (username: elastic, password: changeme), you will receive a response like

{
  "name": "B4VsybF",
  "cluster_name": "docker-cluster",
  "cluster_uuid": "EDkqSWH1Q7mjM3RSEV7kVw",
  "version": {
    "number": "5.3.1",
    "build_hash": "5f9cf58",
    "build_date": "2017-04-17T15:52:53.846Z",
    "build_snapshot": false,
    "lucene_version": "6.4.2"
  },
  "tagline": "You Know, for Search"
}

This shows that ElasticSearch is running inside the Docker container and you can start playing with it. You can find all these installation details on Elastic’s site.

Loading data to an index

Now that ElasticSearch is running, the next step is to populate some data so we can test the search functionality. At its simplest, we can communicate with ElasticSearch using the curl CLI tool or an HTTP tool like Postman or the Sense Chrome extension. Pushing data can be done manually one record at a time, or in bulk. As a first step, we can create an index by executing

PUT /documents
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}

This creates an index named documents with two shards and one replica per shard. The person type will be created automatically when we index the first person document (or it can be declared explicitly with a mapping).
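To verify that the index exists and see its shard configuration, one option is the _cat API (the credentials are the defaults mentioned earlier):

curl -u elastic:changeme 'http://localhost:9200/_cat/indices?v'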

To insert a person document, we can execute a POST request with the person’s data

POST /documents/person/1
{
  "name": "Arian",
  "last_name": "Celina",
  "phone": "1234567"
}

The URL segments map to {index}/{type}/{id}. If we did not specify an id, ElasticSearch would generate one for us. The response from ElasticSearch for this request would look like

{
  "_index": "documents",
  "_type": "person",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

From the response, I can see that the document was indexed into the documents index with the person type, and that the request was successful.

For the sake of simplicity, we just posted the data to ElasticSearch without defining any mapping. If we do not define one, ElasticSearch applies dynamic mapping. If you want to enforce the types of the data you upload, consider creating a mapping when creating the index. Please take a look at the mapping documentation for more details.
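As a hedged illustration, an explicit mapping for the person type could be supplied at index creation time, instead of the settings-only request shown earlier; the field types here are just an example:

curl -u elastic:changeme -XPUT 'http://localhost:9200/documents' -d '{
  "settings": { "number_of_shards": 2, "number_of_replicas": 1 },
  "mappings": {
    "person": {
      "properties": {
        "name": { "type": "text" },
        "last_name": { "type": "text" },
        "phone": { "type": "keyword" }
      }
    }
  }
}'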

Loading more data

Posting one or two documents is easy; however, that is of little help for really testing the search functionality. To make the test more realistic, what I did was export data from our PostgreSQL database to a JSON file and then post the file to ElasticSearch using a curl command

curl -XPOST 'http://localhost:9200/documents/person' -d @/path/to/file.json

If the export file is too large, you might want to split it into smaller parts (I found the right size by trial and error).
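As a side note, for larger data sets ElasticSearch also offers the _bulk endpoint, which expects newline-delimited JSON (an action line followed by a document line). A hedged sketch, reusing the index and type from above:

# bulk.ndjson contains pairs of lines like:
# {"index":{"_id":"2"}}
# {"name":"Jane","last_name":"Doe","phone":"7654321"}
curl -u elastic:changeme -XPOST 'http://localhost:9200/documents/person/_bulk' --data-binary @bulk.ndjson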

Searching

ElasticSearch offers a plethora of query combinations. The query documentation is extensive and describes all the ways to construct complex queries with filters and aggregations. In its simplest form, for the sake of demonstration, if I want to search for the person whose phone number is ‘1234567’, the query looks like

GET /documents/person/_search?q=phone:1234567

or in the more expressive body form

GET /documents/person/_search
{
  "query": {
    "match": {
      "phone": "1234567"
    }
  }
}

Either one returns the same result. For more complex queries, please consult the documentation.

Conclusion

In such a short article, I could only scratch the surface of ElasticSearch. It summarises the steps you can take to bring up an instance of ElasticSearch, load some data into it and start experimenting. Happy hacking.