(Sharing some personal suffering)

One thing that drives me mad is having to create several .csv/.txt files on my computer to perform some analysis. I personally prefer to connect directly to the RDBMS (Redshift), fetch the data in a straightforward way, and keep the query inside the Jupyter Notebook.

The main problem with this approach is that many people put their passwords inside the notebooks/scripts, and this is very unsafe. (You don't need to believe me; check it for yourself.)

I was trying to pass the environment variables in the traditional way, using export VARIABLE_NAME=xptoSomeValue, but after starting the Jupyter Notebook I got the following error:



KeyError                                  Traceback (most recent call last)
<ipython-input-13-2288aa3f6b7a> in <module>()
      2 import os
----> 4 HOST = os.environ['REDSHIFT_HOST']
      5 PORT = os.environ['REDSHIFT_PORT']
      6 USER = os.environ['REDSHIFT_USER']

/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/UserDict.pyc in __getitem__(self, key)
     38         if hasattr(self.__class__, "__missing__"):
     39             return self.__class__.__missing__(self, key)
---> 40         raise KeyError(key)
     41     def __setitem__(self, key, item): self.data[key] = item
     42     def __delitem__(self, key): del self.data[key]



For some reason, this approach didn't work. I made a small workaround: set the environment variables inline when calling the `jupyter notebook` command, like this:

env REDSHIFT_HOST='myRedshiftHost' REDSHIFT_USER='flavio.clesio' 
REDSHIFT_PASS='myVeryHardPass' jupyter notebook
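Inside the notebook, those variables can then be read with `os.environ`. A minimal sketch, assuming the variable names from the command above (the actual connection is left out, since the driver, e.g. psycopg2, depends on your setup):

```python
import os

# The variables passed on the command line are visible to the notebook
# through os.environ. Using .get() with a default avoids the KeyError
# shown in the traceback when a variable is missing.
HOST = os.environ.get('REDSHIFT_HOST', 'localhost')
PORT = int(os.environ.get('REDSHIFT_PORT', '5439'))  # 5439 is Redshift's default port
USER = os.environ.get('REDSHIFT_USER', '')
PASS = os.environ.get('REDSHIFT_PASS', '')

# Hand HOST, PORT, USER and PASS to your database driver from here;
# the credentials never appear in the notebook itself.
```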

I hope it helps!

There are countless benefits that log centralization brings to your infrastructure. In recent years, several solutions have appeared, and among them Logstash stands out.

In this article I will explain how to create a high-availability cluster for centralized logging with Logstash and Elasticsearch. With this cluster, we will be able to index 750 GB of logs per day.

Continue reading

A long time ago, in a galaxy not so far away, there lived a peaceful people who used to compete with their friends to see who had the server with the highest uptime. At the time, it was awesome to have a server with more than 1000 days of uptime, to show how good and stable Linux was.


Nowadays, when I see a server with a long uptime, my thoughts are:

“Gosh, this server must be really outdated, it must have a lot of security issues, and God knows what will happen when we need to restart it! The fsck will take forever to run, if the server even survives the reboot!”

But don’t get me wrong: uptime is still important, but not for a single machine. Now, what matters is the availability of the system as a whole. That’s why, here at Movile, we design our platforms to be distributed across multiple data centers and fault tolerant, because we know the hardware will fail sooner or later.

With this in mind, we are going to build a plugin to show the uptime of our nodes, with the ability to filter by data center and sort from highest to lowest. This will be a good opportunity to learn how to extend the Chef Server with a practical example. This plugin is also very welcome at a time when the GHOST glibc vulnerability is around, eating your time.

Continue reading

At Amazon Web Services, one of the most common practices is to divide one account into multiple sub-accounts, where each one of them has its own credentials, instances and services. This practice adds complexity when we have a big network, but it facilitates many things: each account can have people with different roles and permissions, without complex IAM rules; each cost center is entirely separated in a much easier way (without using tags, as would be the case with only one account); big projects can have their own totally independent infrastructure; and so on.

When we have multiple accounts in the same company, we usually need to link these accounts in a secure way. A great way of doing this is using VPC Peering to connect VPCs in the same region. However, when the VPCs are in different regions, how can they communicate with each other? In this case, instead of using the native VPC Peering offered by AWS, we can create EC2 instances with IPsec configured, to establish encrypted VPNs between any two networks. We call this type of VPN a site-to-site VPN.
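To make the idea concrete, such a site-to-site tunnel can be described in a strongSwan/Openswan-style ipsec.conf on each VPN instance. This is only an illustrative sketch; every IP address and CIDR below is a made-up placeholder, to be replaced by your own Elastic IPs and VPC ranges:

```
# /etc/ipsec.conf on the VPN instance in region A (placeholder values)
conn site-to-site
    type=tunnel
    authby=secret
    auto=start
    left=10.0.0.10             # private IP of this EC2 instance
    leftid=54.1.2.3            # its Elastic IP, as seen by the peer
    leftsubnet=10.0.0.0/16     # VPC CIDR in region A
    right=52.4.5.6             # Elastic IP of the peer instance in region B
    rightsubnet=172.16.0.0/16  # VPC CIDR in region B
```

With a mirrored configuration (left/right swapped) on the instance in region B, and routes in each VPC pointing the remote CIDR at the VPN instance, traffic between the two networks flows through the encrypted tunnel.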

Here at Movile we use multiple AWS accounts in many different regions. To keep the monitoring and automation services up and secure, we had to implement these mechanisms, and in this tutorial we show you how we did it.

Continue reading