Painlessly Increase Your EBS-Based Root Volume Without Rebuilding

You created an AWS instance and decided to make a small root volume. Maybe you were shortsighted or just being cheap. Now the root volume is nearly full and you can’t expand it because you didn’t use LVM.

[ec2-user@ip-172-16-14-78 ~]$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/xvda1 8256952 8173060 8 100% /

To add insult to injury, AWS recently cut the price of EBS storage in half. This can all be fixed with a handful of commands from the AWS CLI.

Before you begin, take note of a few important details about the instance: you need to know the instance-id and the availability-zone that your instance is running in. You can get the instance-id from the running instance by querying the instance metadata service; more details about the metadata service are available in the AWS documentation.

[ec2-user@ip-172-16-14-78 ~]$ curl http://169.254.169.254/latest/meta-data/instance-id/
i-2db6840d

Determine which availability-zone your instance is running in.

dcolon@dcolonbuntu:~$ aws ec2 describe-instances --instance-id i-2db6840d --output text | egrep ^PLACEMENT
PLACEMENT default None us-east-1a

Shut down the instance. You can’t get a reliable snapshot of a volume while it’s attached to a running instance.

dcolon@dcolonbuntu:~$ aws ec2 stop-instances --instance-ids i-2db6840d --output text
i-2db6840d
CURRENTSTATE 64 stopping
PREVIOUSSTATE 16 running
RESPONSEMETADATA e8b8f0ab-7cd1-478f-a7c1-bb5bf81b2355

To determine the volume-id of the root volume, run describe-instances:

dcolon@dcolonbuntu:~$ aws ec2 describe-instances --instance-ids i-2db6840d --output text | egrep ^EBS
EBS attached True vol-b6b98bfb 2014-02-11T20:36:00.000Z

Take a snapshot of vol-b6b98bfb:

dcolon@dcolonbuntu:~$ aws ec2 create-snapshot --volume-id vol-b6b98bfb --output text
None vol-b6b98bfb pending 8 None 2014-02-11T21:10:41.000Z snap-e76a3b25 647956677678
RESPONSEMETADATA 03f83444-a473-4bc1-b138-1065f6d5cee0

Now that you have a snapshot (snap-e76a3b25) of the root volume, you can create an exact copy of it at a larger size. If the snapshot state is still pending, you will get an error. Do not forget to specify the size, or the new volume will be the same size as the old one.
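If you want to check whether the snapshot has finished before moving on, describe-snapshots reports its state (the --query filter below is just one convenient way to pull out that single field):

aws ec2 describe-snapshots --snapshot-ids snap-e76a3b25 --query 'Snapshots[0].State' --output text
# wait until this prints "completed" before creating the new volume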

dcolon@dcolonbuntu:~$ aws ec2 create-volume --availability-zone us-east-1a --size 40 --snapshot-id snap-e76a3b25 --output text
us-east-1a standard vol-b4aa98f9 creating snap-e76a3b25 2014-02-11T21:14:51.280Z 40
RESPONSEMETADATA 2257dac5-1685-488c-b545-2f8f9417a6ac

Finally, detach the original volume (vol-b6b98bfb) and attach the newly created volume (vol-b4aa98f9).

dcolon@dcolonbuntu:~$ aws ec2 detach-volume --volume-id vol-b6b98bfb --output text
2014-02-11T20:36:00.000Z i-2db6840d vol-b6b98bfb detaching /dev/sda1
RESPONSEMETADATA ea10cbbd-2adf-43c9-b5e3-cca19c1e5c26
dcolon@dcolonbuntu:~$ aws ec2 attach-volume --volume-id vol-b4aa98f9 --instance-id i-2db6840d --device /dev/sda1 --output text
2014-02-11T21:20:00.318Z i-2db6840d vol-b4aa98f9 attaching /dev/sda1
RESPONSEMETADATA 146b0ecc-6916-49aa-931c-24c6cdbaca8c

Start your instance and verify that your root volume is now 40 GB.

dcolon@dcolonbuntu:~$ aws ec2 start-instances --instance-id i-2db6840d --output text
i-2db6840d
CURRENTSTATE 0 pending
PREVIOUSSTATE 80 stopped
RESPONSEMETADATA 9320cb43-589f-4ab5-a1ce-d23718c02297
dcolon@dcolonbuntu:~$ ssh -i .ssh/prd.pem [email protected]
ssh: connect to host 172.16.14.78 port 22: Connection refused
dcolon@dcolonbuntu:~$ ssh [email protected]
Last login: Tue Feb 11 21:06:04 2014 from

__| __|_ )
_| ( / Amazon Linux AMI
___|\___|___|

[ec2-user@ip-172-16-14-78 ~]$ df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7.9G 7.8G 92M 99% /

The root filesystem is still 8 GB. The new volume is 40 GB, but because it was created from a snapshot of the smaller volume, the filesystem on it has not grown. Run resize2fs to expand the filesystem to the full 40 GB.

[ec2-user@ip-172-16-14-78 ~]$ sudo resize2fs /dev/xvda1
resize2fs 1.42.3 (14-May-2012)
Filesystem at /dev/xvda1 is mounted on /; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 3
The filesystem on /dev/xvda1 is now 10485760 blocks long.
[ec2-user@ip-172-16-14-78 ~]$ df -h /
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 40G 7.8G 32G 20% /

For details on how to grow an EBS-based LVM filesystem, see the next article, Quick and Dirty LVM Volume Expansion Using EBS on Amazon EC2.

If you have any questions, please post them in the comments below.

Quick and Dirty LVM Volume Expansion Using EBS on Amazon EC2

Problem: I have a central rsyslog server that is filling up. I used LVM on the log partition so that I can grow it on demand. Here are the exact steps that I took to grow the partition.

I allocate fifteen 25 GB volumes and write the ec2-create-volume output to a file called disks.txt. Volumes must be allocated in the same availability zone (AZ) as the instance (server) that they will be attached to.

dcolon@dcolonbuntu:~/f$ for ((i=1; i<16; i++));
> do
>    ec2-create-volume -s 25 -z us-east-1b >> disks.txt
> done
dcolon@dcolonbuntu:~/f$ awk '{print $2}' disks.txt 
vol-61dac038
vol-0adac053
vol-3bdac062
vol-d5dac08c
vol-efdac0b6
vol-92dac0cb
vol-84dac0dd
vol-acdac0f5
vol-45dbc11c
vol-15dbc14c
vol-07dbc15e
vol-34dbc16d
vol-25dbc17c
vol-e1dbc1b8
vol-aedbc1f7

Next I attach the fifteen volumes to the instance. I found that you can't have more than 15 partitions per device letter, i.e., you can't go above /dev/sdk15.

dcolon@dcolonbuntu:~/f$ export i=1
dcolon@dcolonbuntu:~/f$ awk '{print $2}' disks.txt | while read disk
> do
>    ec2-attach-volume $disk -i i-1c4b3265 -d /dev/sdk$i
>    i=$((i+1))
> done
ATTACHMENT      vol-61dac038    i-1c4b3265      /dev/sdk1       attaching      2013-05-02T16:31:43+0000
ATTACHMENT      vol-0adac053    i-1c4b3265      /dev/sdk2       attaching      2013-05-02T16:31:45+0000
ATTACHMENT      vol-3bdac062    i-1c4b3265      /dev/sdk3       attaching      2013-05-02T16:31:48+0000
ATTACHMENT      vol-d5dac08c    i-1c4b3265      /dev/sdk4       attaching      2013-05-02T16:31:50+0000
ATTACHMENT      vol-efdac0b6    i-1c4b3265      /dev/sdk5       attaching      2013-05-02T16:31:53+0000
ATTACHMENT      vol-92dac0cb    i-1c4b3265      /dev/sdk6       attaching      2013-05-02T16:31:55+0000
ATTACHMENT      vol-84dac0dd    i-1c4b3265      /dev/sdk7       attaching      2013-05-02T16:31:58+0000
ATTACHMENT      vol-acdac0f5    i-1c4b3265      /dev/sdk8       attaching      2013-05-02T16:32:00+0000
ATTACHMENT      vol-45dbc11c    i-1c4b3265      /dev/sdk9       attaching      2013-05-02T16:32:03+0000
ATTACHMENT      vol-15dbc14c    i-1c4b3265      /dev/sdk10      attaching      2013-05-02T16:32:05+0000
ATTACHMENT      vol-07dbc15e    i-1c4b3265      /dev/sdk11      attaching      2013-05-02T16:32:09+0000
ATTACHMENT      vol-34dbc16d    i-1c4b3265      /dev/sdk12      attaching      2013-05-02T16:32:11+0000
ATTACHMENT      vol-25dbc17c    i-1c4b3265      /dev/sdk13      attaching      2013-05-02T16:32:14+0000
ATTACHMENT      vol-e1dbc1b8    i-1c4b3265      /dev/sdk14      attaching      2013-05-02T16:32:16+0000
ATTACHMENT      vol-aedbc1f7    i-1c4b3265      /dev/sdk15      attaching      2013-05-02T16:32:18+0000
dcolon@dcolonbuntu:~/f$

Up to this point, I have been working on my local Linux box. The next steps need to be done directly on the instance.

I loop through all fifteen new devices and run pvcreate on each, initializing them as physical volumes so they can be used by LVM.

[root@syslog]# for ((i=1; i<16; i++));
> do
>   pvcreate /dev/sdk$i
> done
  Writing physical volume data to disk "/dev/sdk1"
  Physical volume "/dev/sdk1" successfully created
  Writing physical volume data to disk "/dev/sdk2"
  Physical volume "/dev/sdk2" successfully created
  Writing physical volume data to disk "/dev/sdk3"
  Physical volume "/dev/sdk3" successfully created
  Writing physical volume data to disk "/dev/sdk4"
  Physical volume "/dev/sdk4" successfully created
  Writing physical volume data to disk "/dev/sdk5"
  Physical volume "/dev/sdk5" successfully created
  Writing physical volume data to disk "/dev/sdk6"
  Physical volume "/dev/sdk6" successfully created
  Writing physical volume data to disk "/dev/sdk7"
  Physical volume "/dev/sdk7" successfully created
  Writing physical volume data to disk "/dev/sdk8"
  Physical volume "/dev/sdk8" successfully created
  Writing physical volume data to disk "/dev/sdk9"
  Physical volume "/dev/sdk9" successfully created
  Writing physical volume data to disk "/dev/sdk10"
  Physical volume "/dev/sdk10" successfully created
  Writing physical volume data to disk "/dev/sdk11"
  Physical volume "/dev/sdk11" successfully created
  Writing physical volume data to disk "/dev/sdk12"
  Physical volume "/dev/sdk12" successfully created
  Writing physical volume data to disk "/dev/sdk13"
  Physical volume "/dev/sdk13" successfully created
  Writing physical volume data to disk "/dev/sdk14"
  Physical volume "/dev/sdk14" successfully created
  Writing physical volume data to disk "/dev/sdk15"
  Physical volume "/dev/sdk15" successfully created
[root@syslog]#

Next I loop through and extend the volume group varlogVG to include the new physical volumes.

[root@syslog]# for ((i=1; i<16; i++));
> do
>   vgextend varlogVG /dev/sdk$i
> done
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
  Volume group "varlogVG" successfully extended
[root@syslog]#

Finally I extend the logical volume and the filesystem.

[root@syslog]# df -h /var/log
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/varlogVG-varlogLV
                      374G  140G  235G  38% /var/log
[root@syslog]# lvextend -L+375G /dev/mapper/varlogVG-varlogLV
  Extending logical volume varlogLV to 749.00 GB
  Logical volume varlogLV successfully resized
[root@syslog]# xfs_growfs /var/log
meta-data=/dev/mapper/varlogVG-varlogLV isize=256    agcount=41, agsize=2441216 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=98041856, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=19072, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 98041856 to 196345856
[root@syslog-1b-217 servers]# df -h /var/log
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/varlogVG-varlogLV
                      749G  140G  610G  19% /var/log
[root@syslog]#

Note: because I'm using XFS, the filesystem can be grown while it is mounted read/write. An ext3 filesystem can also be grown online (mounted) using resize2fs.
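For comparison, if this logical volume held an ext3 filesystem instead of XFS, the lvextend step would be the same and the grow step would simply be the following (a sketch reusing the same logical volume name as above):

resize2fs /dev/mapper/varlogVG-varlogLV    # grows a mounted ext3 filesystem to fill the logical volume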

To learn how to increase the size of an EBS-based root filesystem without rebuilding, see the previous article, Painlessly Increase Your EBS-Based Root Volume Without Rebuilding.

GNU Screen + Byobu vs Weechat

I am an avid GNU screen user. I have been using it for about fourteen years and can't live without it. About two years ago I started using Byobu, which adds some eye candy to screen. Recently I started using IRC again, and a coworker introduced me to Weechat. If you still use IRC, check out Weechat. All three of these programs are available in the standard Debian and Ubuntu repos.

While getting used to Weechat, I discovered that the keybindings to scroll through the list of nicks in a channel are F11 and F12. I hit F12 and screen invoked lockscreen. After a lot of Googling, I found that the termcap name for F12 is F2. Digging through my .screenrc, I couldn't find any reference to F2. Then it occurred to me to check my Byobu settings. Sure enough, I found the following in ~/.byobu/keybindings:

bindkey -k F2 lockscreen           # F12 | Lock this terminal

I commented out that line. It took me a little while to figure out how to change the keybinding for my current session. There is no unbindkey command in screen. Instead you bind it to nothing.

bindkey -k F2
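You can issue that command at screen's colon prompt, or send it to a running session from another shell. A quick sketch, assuming the default Ctrl-a escape key:

# inside screen: press Ctrl-a, then : and type
bindkey -k F2

# or, from any shell on the same machine, send it to the running session
screen -X bindkey -k F2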

Hopefully this helps somebody who has a similar problem.

Change the Extension on a Large Number of Files

Here's a common problem. You untar an archive and it contains a large number of files that end in .jpeg. You need them to have a .jpg extension. At first glance, you might be tempted to try this:
mv *.jpeg *.jpg

Give it a try and you will find it doesn't work: the shell expands both globs before mv ever runs, so mv just receives a list of filenames rather than a rename rule. The correct way to change the extension is to iterate over all of the files. Let's start with the following:

dcolon@dcolonbuntu:/tmp/foobar$ ls
friday.jpeg  monday.jpeg  Saturday.jpeg  sunday.jpeg  Thursday.jpeg  Tuesday.jpeg  WEDNESDAY.jpeg
dcolon@dcolonbuntu:/tmp/foobar$

Once in the loop, you need to save the part of the filename before the extension. I use a temporary variable and awk to extract it.

Here is my solution:

dcolon@dcolonbuntu:/tmp/foobar$ for i in *.jpeg
> do
>    basefilename=$(echo "$i" | awk -F.jpeg '{print $1}')
>    mv "$i" "$basefilename.jpg"
> done
dcolon@dcolonbuntu:/tmp/foobar$ ls
friday.jpg  monday.jpg  Saturday.jpg  sunday.jpg  Thursday.jpg  Tuesday.jpg  WEDNESDAY.jpg
dcolon@dcolonbuntu:/tmp/foobar$

I use double quotes around the variable names to compensate for filenames with spaces. In the awk statement I use the entire replacement pattern as my field separator. Using -F. will fail if you have a filename like foo.bar.jpeg.
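As an aside, the same rename can be done without awk by using bash parameter expansion, which strips the shortest matching suffix. A minimal alternative sketch:

for i in *.jpeg
do
   # ${i%.jpeg} is $i with the trailing .jpeg removed
   mv "$i" "${i%.jpeg}.jpg"
done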

Bash Hostname Completion

One of the better-known features in bash is command and filename tab completion. Installing the bash-completion package builds on this, enabling hostname completion and a lot more. After installing bash-completion, add the following to your .bashrc:

. /etc/bash_completion
complete -F _known_hosts ssh
 
I also suggest adding
complete -F _known_hosts ping
complete -F _known_hosts traceroute

Hostname completion relies on the ssh known_hosts file. Most modern distributions hash the ~/.ssh/known_hosts file for security reasons. This prevents hostname completion from working. If you are comfortable turning off hostname hashing, then add the following to your ~/.ssh/config:
HashKnownHosts no

If you had to turn off hostname hashing, you will need to re-populate your known_hosts file. I suggest creating a list of all of your hosts and logging into them in a for loop:


for i in $(cat hostlist)
do
   ssh -n -o StrictHostKeyChecking=no $i "uname -a"
done
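If you would rather not log in to every host, ssh-keyscan (shipped with OpenSSH) can collect the host keys directly. A rough sketch, assuming the same hostlist file:

while read host
do
   ssh-keyscan -t rsa "$host" >> ~/.ssh/known_hosts    # fetch the RSA host key without logging in
done < hostlist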

The login loop assumes that you are using ssh keys. If you are not, you will need to type your password for each host. At this point you can use hostname tab completion. If you added the second two complete commands, the same hostname completion will work for ping and traceroute.

Copy and Paste for KeePass Under Linux

If you are like most modern Internet users, you subscribe to dozens of services and websites and need an account for each one. Using the same username and password on all of these sites is easy but a terrible idea from a security perspective. Enter KeePass. KeePass is an encrypted software safe for all of your usernames and passwords. It is multi-platform and open source, which is important to me. I use it under Linux, Mac OS X, Windows, iPhone/iPad, and BlackBerry. To use version 2.0 under Linux, you need to run it under Mono.

One problem I ran into when using KeePass under Linux is that copy/paste does not work out of the box. Google pointed out that X under Linux has two separate copy buffers (selections). This led me to the tool autocutsel and a post on Superuser. After running:

$ autocutsel &
$ autocutsel -s PRIMARY &

I can now copy/paste from KeePass.

I recommend using a very long passphrase for your master password. I synchronize my KeePass database with Dropbox so I can access it from all of my devices, anywhere, as long as I have an Internet connection.

Matching a US Telephone Number With egrep Using Regular Expressions

This is the follow-up to my post on searching for social security numbers. US telephone numbers come in a handful of formats, all of which can easily be matched with a regular expression.
(215) 555-1212
215-555-1212
215 555 1212
215.555.1212
2155551212

The phone number can be broken down into a series of character classes. Using egrep, character classes are written inside of square brackets. The character class [0-9] represents a single digit from 0-9. You can expand upon this and match a series of digits from the character class by following it with a number inside of curly braces: [3-7]{3} matches exactly three digits, each in the range of 3 through 7. We will use this notation to build the three parts of the phone number.
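You can see this behavior quickly from the shell:

echo "456" | egrep "[3-7]{3}"    # prints 456: three digits, all between 3 and 7
echo "129" | egrep "[3-7]{3}"    # prints nothing: no run of three digits in that range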

You can also build character classes containing specific characters or symbols. After the first three digits of the phone number there are a few possibilities for what comes next: a right paren, a hyphen, a space, a period, or none of these. The ? operator matches zero or one instance of whatever precedes it. Putting these two concepts together, we allow an optional right paren followed by an optional separator: [)]?[-. ]? (the hyphen is listed first inside the brackets so it is treated as a literal character rather than a range).

OK, let's combine all of these concepts to build our regex. This is just one solution that I've come up with to match a phone number. The beauty of Unix is that there are many other correct solutions, some of which are probably better than mine.

egrep "[(]?[2-9]{1}[0-9]{2}[)]?[-. ]?[2-9]{1}[0-9]{2}[-. ]?[0-9]{4}" filename

This reads: zero or one left paren, followed by a single digit in the range two through nine, followed by two digits in the range zero through nine, followed by zero or one right paren, followed by zero or one hyphen, period, or space, followed by a single digit in the range two through nine, followed by two digits in the range zero through nine, followed by zero or one hyphen, period, or space, followed by four digits in the range zero through nine.
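A quick way to sanity-check the pattern is to drop the five sample numbers from the top of this post into a file and run the command against it (phones.txt is just a made-up name for that file):

egrep "[(]?[2-9]{1}[0-9]{2}[)]?[-. ]?[2-9]{1}[0-9]{2}[-. ]?[0-9]{4}" phones.txt
# all five sample formats should be printed back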

Until the explosion of cell phones, US area codes followed the format: a digit from 2-9, a 0 or 1, followed by a digit from 0-9. When additional area codes were needed to accommodate the growing number of phone numbers, the requirement that the middle digit be a 0 or 1 was dropped.

If you have any questions about this article, please post in the comments section.

Update:
I updated the regex thanks to feedback from Boris. The first and fourth digits cannot be a zero or a one, so I created two separate character classes to accommodate that requirement.

Searching for social security numbers in a file using a regular expression and egrep

egrep is a version of grep that supports extended regular expressions. egrep can be used to find all social security numbers in a file using a basic regex. All US social security numbers have the format 123-45-6789. This can be broken into a regex containing a character class of three numeric digits, a dash, a character class of two numeric digits, a dash, and finally a character class of four numeric digits.


egrep "[0-9]{3}[-][0-9]{2}[-][0-9]{4}" file

If the file contains social security numbers without dashes, this regex will not match. To improve upon the regex, you can use the ? operator, which matches zero or one occurrence of the preceding character class.

egrep "[0-9]{3}[-]?[0-9]{2}[-]?[0-9]{4}" file

Placing the dash in its own character class is not required. I find that it makes the regex easier to read.
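For example, the same pattern written with bare dashes behaves identically:

egrep "[0-9]{3}-?[0-9]{2}-?[0-9]{4}" file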

My next post will show you how to match a US telephone number using a slightly more complicated regular expression.

Display the contents of a Red Hat RPM or Debian deb package

Displaying the contents of an RPM or deb package is simple. Each of these can be thought of as an archive of files plus an install script. To view the files in the archive, execute the following:

Red Hat RPM:


rpm -qlp package.rpm

Debian deb:

dpkg -c package.deb

For Debian/Ubuntu systems, I use a program called apt-file, which lets you search for files provided by any package available to your system, even if that package is not installed. This comes in handy if you are building a program from source and need to track down the libraries it depends on; finding a library is not as easy as finding a program with aptitude.

I downloaded the traceroute deb and displayed the contents using the command above. It provides a library called libsupp.a. If I were building an application that depends on libsupp.a, I wouldn't easily find it using aptitude, but apt-file shows that it's provided by traceroute.


dcolon@gold:~$ aptitude search libsupp.a
dcolon@gold:~$
dcolon@gold:~$ apt-file search libsupp.a
traceroute: /usr/lib/libsupp.a
dcolon@gold:~$
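Note that apt-file searches a locally cached index, so it needs to be installed and its cache updated before the search above will return anything:

sudo apt-get install apt-file
sudo apt-file update    # build/refresh the local file index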

Logical AND and OR Using AWK

Before learning about AWK's logical AND operator, I used to string a pair of grep commands together to find two search terms:

grep abc filename | grep def

to find lines with both abc and def. This can be shortened into a single AWK command:

awk '/abc/&&/def/' filename

A logical OR is also provided, using a pair of pipes (||):

awk '/abc/||/def/' filename

You can also use the equivalent egrep command:

egrep 'abc|def' filename

To get comfortable with AWK, try using it instead of grep for a week. AWK has many more features than just printing fields from a file.
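For example, the pattern matches can be combined with an action in the same command, which is where AWK starts to pull ahead of grep: keep only lines containing both terms and print just the fields you care about.

awk '/abc/ && /def/ {print $1, $3}' filename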