How to show all options in a multi-select

For documentation purposes I needed to list all of the options in a multi-select. In other words, the height of a select with its multiple attribute enabled should be at least large enough to show every option it contains.

The fast fix is to use the browser's developer tools to manually modify the height of the select. This works fine if you have just one select to modify, but for multiple selects it becomes repetitive and boring. Let's use the JavaScript console to automate it!

The objective is to iterate over all the select tags in the current page, count the number of option tags and change the size of the select accordingly.

Step 1 is to open the developer console of the browser. In Firefox it's called Web Developer Tools and in Chrome it's called Developer Tools. On my current system both are activated with the shortcut Ctrl + Shift + i.
Step 2 is to enter the JavaScript console and paste the following code:

var selects = document.getElementsByTagName('select');
for (var i = 0; i < selects.length; i++) {
  var s = selects[i];
  if (s.multiple) {
    // Make the select at least as tall as its number of options.
    var opts = s.getElementsByTagName('option');
    s.size = Math.max(s.size, opts.length);
  }
}

All multi-selects should now display at the height necessary to see all their options.


Trampoline into AWS

Amazon Web Services started 10 years ago with the Simple Storage Service. To celebrate this, QwikLABS' on-demand computer labs were free for a whole month. At the time of writing there's less than a week left, but it might still be useful to publish a curated list of labs.

Currently there are 92 labs available. There are several ways to explore them, and the most appealing one is to do only the labs in a quest that helps you achieve a certain badge. That's right, you can show off that you know how to click through a step-by-step online instruction document! You can even share your board of badges. It seems all this really means is that there's a public URL people can view.

Another approach might be to work through the levels. Labs come in three levels that hint at difficulty, duration and complexity. In brief they are:

Introductory level – A short introduction to one service. Often it's clicking through the same steps as the introductory video of a service, but with links and some theory to read through. Most labs at this level are free.
Fundamental level – Usually titled "Working with …", these labs jump a little higher. Instead of just clicking around in the AWS console, you might have to use command line tools or an API. These will be provided, but you will need to log in via SSH or Remote Desktop.
Expert level – Expect a slightly more complex lab with a combination of services. Often what you'll learn here can be used as a reference point for real-world implementations. Consider this to be roughly at the security level of AWS' Developer Guides and not production ready at all.

So those are the two main approaches you might take: quests or working through levels. Personally, neither appeals to me, although I did use quests to try and get lots of badges. That's why I decided to list a few labs by category. I've already distributed this selection internally at my place of employment, but it might be useful for more general distribution.


Amazon Web Services offers security and data protection in the cloud and can prove it.

For instance, it complies with the following sample of laws and regulations:

The infrastructure has certain certifications, examples include:

A full list of compliance is available; there's just too much to list here. However, depending on which service you're using, you are still responsible for compliance and security at your application level, and perhaps even closer to the metal. The next few labs cover what I consider to be relevant topics on access and integration with Amazon Web Services.


Amazon offers lots of services. I made a selection of the ones I consider interesting building blocks for cloud-based applications.


Going through the individual services in labs gets you a basic feeling for how they work, but it is more exciting to see them work together. The following labs are examples of combining different services into functional applications.


DevOps: everybody has a definition, and if you're great at passing exams you can even become an AWS Certified DevOps person. The following two labs are very DevOps-ey and great starting points to explore more.

Aside from these lists, you could also do the reverse. Documentation at AWS often includes links to labs at QwikLABS.

QwikLABS’ Working with Amazon DynamoDB

Until the 31st of March 2016 all labs on QwikLABS are free, a nice opportunity to see how many badges I can get. My first try was the "Working with Amazon DynamoDB" lab.

DynamoDB is a NoSQL database service that abstracts most of the setup, maintenance and scaling behind a less complex interface. Although there is a huge amount of documentation, there are few ways to get a good understanding of what DynamoDB can do without starting a project.

The first professional experience I had with DynamoDB turned out to be a false start. The project wanted to use too many hip new technologies, but designed tables the way that is best practice for relational databases. It wasn’t possible to run DynamoDB locally, which is something that can frustrate programmers who are uncomfortable using AWS or simply don’t have enough permissions.

At the same time I wasn’t sure how to configure read and write capacity on tables. These parameters are used by the service to provision and scale the capacity of the underlying technology.
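For context, capacity is configured per table as provisioned throughput. Below is a hedged sketch of the parameters DynamoDB's CreateTable call expects; the table and attribute names here are made up for illustration, not taken from the lab.

```javascript
// Hypothetical table definition; the ProvisionedThroughput block is
// the part that controls read and write capacity.
var createTableParams = {
  TableName: 'tweets',                                  // made-up name
  KeySchema: [
    { AttributeName: 'userId',  KeyType: 'HASH' },      // partition key
    { AttributeName: 'tweetId', KeyType: 'RANGE' }      // sort key
  ],
  AttributeDefinitions: [
    { AttributeName: 'userId',  AttributeType: 'S' },
    { AttributeName: 'tweetId', AttributeType: 'S' }
  ],
  // These two numbers are what the service uses to provision and
  // scale the capacity of the underlying technology.
  ProvisionedThroughput: {
    ReadCapacityUnits: 5,
    WriteCapacityUnits: 10
  }
};

console.log(createTableParams.ProvisionedThroughput.WriteCapacityUnits); // 10
```

Changing these numbers later is an UpdateTable call on the same block, which is exactly the knob the lab has you turn.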

These issues led to a decision that NoSQL wasn’t something to do for that project at that time.

The basics

The lab guides you through the basics of creating and using DynamoDB tables with the AWS console. You create tables through a wizard and learn a little about the configuration options. Then you do some queries and move on to the example.

The example

In this lab you use credentials from a Twitter account in an application which streams tweets to a web interface and also tries to store them in several DynamoDB tables. The rate of tweets streaming through is determined by a slider: sliding right means tweets come in faster. Tweets that fail to be stored in DynamoDB are shown slightly transparent.

To avoid messing with my existing account, I made the effort of creating a new account specifically for this lab. The instructions mention at the beginning that you might want to set up the account and API access before starting the AWS part, but the instructions on how to do that come very late in the lab.

The example is pretty straightforward and the steps are easy to follow. Most of the time should go into getting all resources set up and waiting for Twitter API access. In my case, I got stuck trying to get the application to work.

After logging in with my Twitter account, the screen stayed blank. No tweets were coming in, but they should start shortly after the authentication with Twitter. It took me several minutes and working through the troubleshooting guide to discover an error in the application itself.

The single-page app is backed by a Node server. When a user logs in with their Twitter account, the API credentials are used to poll for new tweets. Redis is used to keep track of the different users and their web sockets. When a new tweet arrives, the Node app stores it in DynamoDB and also emits the data over the web socket by calling the publish method on the publisher variable with a user id and a JSON string.

For the purpose of the lab there is no need to support multiple users, so hacking the Node app into a one-user solution was enough. All I needed to do was honor the publish contract and emit to the web socket of the last user to sign in. In code it looks something like this:

// Stand-in publisher: forwards published messages to the web
// socket of the last user that signed in.
var publisher = {};

socket.on("i am", function ( ... ) {
  publisher.publish = function (userid, message) {
    socket.emit( ... , JSON.parse(message));
  };
});

Once I fixed the Node app I could play around with the slider to see the effect of changing write and read capacity on DynamoDB.

Having a constant stream of data showed me that scaling reads and writes is a process you need to tweak and monitor. Although scaling goes quite fast, at higher speeds many of the write requests couldn’t be fully processed. Depending on the application there can be a big difference in provisioning read and write capacity. In this application the write capacity was crucial, each tweet triggered a write.
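The trade-off is easy to reason about with a toy model. This is not DynamoDB's real token-bucket behaviour, and the numbers are made up, but it illustrates what happens when the incoming write rate exceeds the provisioned write capacity.

```javascript
// Toy throttling model: with a fixed provisioned write capacity,
// any writes beyond that capacity in a given second are rejected.
function simulateWrites(writesPerSecond, writeCapacityUnits, seconds) {
  var stored = 0, throttled = 0;
  for (var s = 0; s < seconds; s++) {
    var accepted = Math.min(writesPerSecond, writeCapacityUnits);
    stored += accepted;
    throttled += writesPerSecond - accepted;
  }
  return { stored: stored, throttled: throttled };
}

// 25 tweets/s against 10 provisioned write units for a minute:
console.log(simulateWrites(25, 10, 60)); // { stored: 600, throttled: 900 }
```

In the lab the throttled writes are exactly the transparent tweets: more than half the stream is lost until you either raise the write capacity or slow the slider down.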

This lab helped me understand the basics of creating and querying DynamoDB tables. The example showed the importance and impact of configuring the read and write capacity at table level to match (predicted or measured) requests to it. With capacity too low on writes (or reads) the application won’t be able to handle load properly. On the other hand, provisioning too much capacity might just be a waste of cycles and coins.

Review: processing billions of events a day with Kafka, Zookeeper and Storm

On Tuesday January 12th 2016 four guys from Server Density, a server monitoring service, held a tech chat session about event processing with Kafka, Zookeeper and Storm. They use this setup to process lots of data points, which made me believe it is a good session to view for future reference or water cooler talk.

It was the first time I followed a live webinar through Google Plus Hangouts on Air. Normally I expect to sit and wait for the webinar to start, but somehow this didn't work for me this time. After some attempts I started viewing no more than 10 minutes in. This meant I missed the introductions, but I guess not much else.

The first part consisted of questions among the guys present in the room. Most of the questions were geared toward general architecture and (personal) experience with this event processing stack. Some details might have been lost on me during the live feed because of room acoustics, microphone volume, or because I was too tired to process some pronunciations.

Once the questions from viewers started, I either paid less attention to the audio problems or they actually became less of a problem. This was also the point where all the guys participated, instead of the dialog that happened before.

One of the first questions asked (more of a suggestion, really) was to move the microphone. There weren't many questions coming in, but still more than time permitted, so I felt the need to vote on the ones I wanted answered. There weren't many people voting either, so I guess my votes did make some impact.

It surprised me when they announced the last question to be answered. One or two more of the top questions seemed interesting, but time was limited and 5 minutes is too short for two questions.

My general impression is that starting with Kafka, Zookeeper or Storm might seem difficult, but it won’t take long until you get comfortable. Although the majority of the tooling is Java or JVM based software, the interfaces for other languages are sufficient. It won’t matter if, like Server Density, most of your software is written in Python. You’ll bump your head a few times, get past some (what seem to be) quirks and you’ll realise it just works.

Key points I got from this session:

  • Redundancy prevents data loss
  • In this case data is kept no longer than 10 minutes
  • Most data is processed near real-time
  • Latency between workers is reduced by either keeping them on the same machine or in close proximity
  • You should monitor your network bandwidth usage, things might get chatty and you’ll hit the limits
  • Debugging is much easier if you keep things small, one process doing one thing. That way it’s easier to reproduce bugs.
  • Monitoring and improving the code keeps the process healthy
  • People might ask two questions in one, which makes voting difficult: do you vote when you only care about one of them?

Notes I made on giving tech talks or presentations:

  • Google Hangouts is a nice way to give a talk and have viewers ask questions
  • Test or rehearse tech talks to avoid sound problems
  • Prepare some graphics to support the story

I liked the format and content of this talk, so I will be sure to watch another session the next time it pops up in e-mail or other feed.

ECS deep dive webinar

On Wednesday 14th I attended the online webinar "Amazon EC2 Container Service Deep Dive" by Deepak Singh. Due to technical difficulties there were some outages. Overall I got a pretty good picture of a container technology that seems to be getting more popular but that I didn't know much about. It's either going to be big or fade away; I'm not sure yet which.

M202 July 2014 Final weeks

The first few videos of week 7 covered the write concern and some important changes to the basic parts of the system since version 2.4. It is good these subjects are explicitly covered, because MongoDB is going through a lot of functional upgrades. Without the emphasis, you might not realize there is now a more elegant way to do things.

My personal opinion is that write concern should hold no mystery this late in the course. Like the other parts, it's information you could find in the release notes.

Snapshot of operations

The next topic of the videos was interesting, but could have had more depth to it. A video was spent on the database command currentOp, which spews data on the operations of the current mongod process. This command generates a lot of data, but the important thing to remember is that it is a snapshot in time. The videos went over some basic scenarios on how to use the database command and the reliability of the information.

This is all nice, until your system is under heavy load and capturing the problem becomes too difficult for a (wo)man and a simple shell. That's when you need tools that can capture and process faster than most humans can.


During the course there were some videos on mtools. If you didn't know it's related to MongoDB, you'd stumble onto several other software tools and packages with that name. This gets confusing when you search your favorite package manager: you will probably not find the tool you're looking for.

The mtools mentioned in this course are a collection of tools written in Python to simplify several tasks when experimenting with MongoDB or analyzing its output. This software isn't officially supported by MongoDB Inc., but was written by a current employee and found its way across his peers.

The source and installation instructions can be found on GitHub. You might have some trouble setting up the software correctly, because some library dependencies are not pinned to a particular version or need to be forcefully updated. This can be very frustrating and give the impression the tools don't work, where installing the right versions of all dependencies will make them work as intended.

So what can mtools do? It can set up basic architectures of MongoDB systems. For instance, its interface for setting up a sharded cluster is quite nice. The course material demonstrated it and I prefer the mtools way over the mongo shell's ShardingTest object. Both mtools and the mongo shell can only start processes on the local system, which is good enough for experiments.

It can also filter, combine and plot log data. There are a lot of tools that can do that, but mtools doesn’t need tweaking to understand MongoDB’s log format and comes with some default reports that make sense. That being said, there are definitely tools that have a sleeker appearance, create more attractive reports or just integrate better with other tools.

If I were faced with an immediate problem in a MongoDB cluster, churning through the logs with mtools would be my preference. MongoDB Management Service would be a second choice, because it takes some time for data to show up and it's not as detailed. In time I would look for other tools that specifically suit my needs.

M202 July 2014 Fifth and sixth week

My first plan for this series was to write a blog post for each week. It didn't work out in the past, and because both the fifth and sixth week cover sharding, I decided to make one big post on sharding.

Sharding is used to scale horizontally. Instead of upgrading a MongoDB replica set with more computing or network capacity, you add more replica sets and divide the load among them. Each replica set will hold a shard of the entire data set. It's not mandatory to use replica sets for your shards; single servers will do. However, when you consider sharding it's very likely you want the redundancy and failover of replica sets as well. The reason is that once a shard fails or becomes unreachable, the data in that shard will be unavailable for both reads and writes. With MongoDB's failover functionality this becomes less of an availability problem.
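As a toy illustration of the idea (this is not MongoDB's actual range/chunk mechanics, just a sketch): a shard key deterministically maps each document to one of the shards, which is what makes it possible to divide the data and the load.

```javascript
// Toy shard routing: hash the shard key value and pick a shard.
// Real MongoDB splits key ranges into chunks and balances those;
// this only illustrates that the shard key decides placement.
function pickShard(shardKeyValue, shardCount) {
  var hash = 0;
  for (var i = 0; i < shardKeyValue.length; i++) {
    hash = (hash * 31 + shardKeyValue.charCodeAt(i)) | 0;
  }
  return Math.abs(hash) % shardCount;
}

// The same key always routes to the same shard:
console.log(pickShard('user42', 3) === pickShard('user42', 3)); // true
```

The important property is determinism: every router that sees the same key agrees on where the document lives, so reads and writes for that key go to one shard.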

During previous courses the basic steps on sharding databases and collections in MongoDB were already explained. The M202 course assumes you won’t need to dig deep for this knowledge, but if you do there’s always the tutorial in the MongoDB documentation. In my opinion sharding data got a little bit easier in version 2.6. The first step is to orchestrate all the pieces of the infrastructure, which could be simplified with configuration management tools. After that you need to explicitly enable sharding on databases and collections.

Mongos process

The recommended way to communicate with the cluster is through one of the mongos processes. These processes provide several tasks in the system. The most important task is to provide a proxy for connections made to other parts of the system. When clients query through a mongos process their queries will be routed to the right shards and their results will be merged before returning to the client. This makes the cluster behave as one big database, although there are limits on the amount of data a mongos process can reasonably sort and process.
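The merging role can be sketched as a scatter-gather: the query goes out to each shard, and the per-shard results are combined before returning to the client. This is a simplified in-memory sketch, not how a mongos actually streams results.

```javascript
// Simplified scatter-gather merge: each shard returns its own
// sorted batch; the router merges them into one sorted result.
function mergeShardResults(resultsPerShard, compare) {
  var merged = [].concat.apply([], resultsPerShard);
  // A real mongos merges incrementally while streaming; sorting
  // the combined batch is enough for this illustration.
  return merged.sort(compare);
}

var shardA = [{ age: 21 }, { age: 40 }];
var shardB = [{ age: 18 }, { age: 35 }];
var all = mergeShardResults([shardA, shardB], function (a, b) { return a.age - b.age; });
console.log(all.map(function (d) { return d.age; })); // [ 18, 21, 35, 40 ]
```

This also hints at why there are limits on what a mongos can reasonably sort: the merge happens in one process, outside the shards.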

Another important task of the mongos process is to provide an interface to manage the cluster. Through one of the mongos processes you are able to configure which databases and collections are sharded, what their shard key is and how data is distributed between shards. Once data is in the cluster, any mongos process is able to push it around by updating the cluster configuration and kicking off the necessary commands on the shards. In order to keep the configuration up to date these mongos processes communicate with the config servers.

Config servers

Config servers are separate mongod processes dedicated to holding metadata on the shards of a cluster. This metadata tells mongos processes which databases and collections are sharded, how they are sharded, and several other important things. Typical production environments have three config servers in the cluster. This number coincidentally is similar to the recommended size of replica sets, but config servers never form a replica set. Instead they are three independent servers synchronized by mongos processes. All instances should be available; if one of them drops, the metadata will be read-only in the cluster.

The videos put emphasis on monitoring config servers. The metadata of a cluster is kept in a special database you're never supposed to touch unless instructed to do so. The videos make this very clear, and in general it's a good idea, because a mismatch in any of the data will put that config server out of sync with the rest. If the metadata can't be synchronized to all of them, data chunks can't be balanced and the config server that's out of sync will be considered broken. Manual intervention is needed to fix a failing config server.

Using ShardingTest

One of the most useful improvements to the Mongo shell that caught my attention was the ability to set up a sharded cluster with relative ease using a ShardingTest object. Of course, this is only for testing purposes because the complete cluster will be running on your current machine. During the fifth and sixth week of the course I made a habit of using tmux with a vertical split so that on the one side I could run the test cluster and on the other I had my connection to the mongos to try out commands.

First, you need to start the cluster. Start by invoking the mongo shell without a connection to a database (mongo --nodb). Then run something similar to the code below, which I took from the course and formatted a bit, to start a small cluster for testing purposes.

config = {
  d0 : { smallfiles : "",
         noprealloc : "",
         nopreallocj: "" },
  d1 : { smallfiles : "",
         noprealloc : "",
         nopreallocj: "" },
  d2 : { smallfiles : "",
         noprealloc : "",
         nopreallocj: "" } };
cluster = new ShardingTest( { shards : config } );

Once you hit return on that last line, your screen will fill up with lots of logging, which most of the time is just the different nodes booting and connecting into a sharded cluster. You could wait until the stream slows down, but the mongos might already be running at port 30999.

When you’re done, interrupt the mongo shell where you started the cluster and it will shutdown everything. I tried scripting this and making the mongo shell execute it, but once it hits the end of the script it will shut down the cluster. I see no other way than running this while keeping the mongo shell opened.