Linux Fireball: February 2012

Tuesday, February 28, 2012

Rails scaffolding types

These types are valid since Rails version 2:

string
text (long text, up to 64k on MySQL, often used for text areas)
datetime
date
integer
binary
boolean
float
decimal (for financial data)
time
timestamp

Here is an example of using Rails 3.x scaffolding with data types, run from the Rails application root directory:

ruby script/rails generate scaffold Modelname name:string title:string employed_on:date remarks:text

Find processes attached to TCP ports [classic]

Use fuser to find processes attached to TCP ports

To see all processes, run fuser as root or with sudo.

To list all processes (with owner of the process) connected to SSH, TCP port 22:

fuser -u -n tcp 22

Saturday, February 25, 2012

Finding IPs connected to your web server [classic]

Get all IPs connected to your web server

netstat -ntu | sed -e 's/::ffff://g' | awk '{print $5}' | cut -d : -f1 | sort -n

Get all unique IPs connected to your web server

netstat -ntu | sed -e 's/::ffff://g' | awk '{print $5}' | cut -d : -f1 | sort | uniq -c | sort -n
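The pipelines above also pass the two netstat header lines through awk, so stray words such as "Address" can show up in the results. A minimal sketch of a wrapper that keeps only tcp and udp rows follows; count_remote_ips is a hypothetical helper name, and like the original commands it assumes IPv4 (or IPv4-mapped) addresses, since cutting on ":" truncates pure IPv6 addresses.

```shell
#!/bin/bash
# Count connections per remote IP from `netstat -ntu` output read on stdin.
# count_remote_ips is a hypothetical helper name, not part of netstat.
# The awk filter keeps only rows whose first column starts with tcp or udp,
# which drops the two header lines netstat prints.
count_remote_ips() {
    sed -e 's/::ffff://g' |
        awk '$1 ~ /^(tcp|udp)/ {print $5}' |
        cut -d : -f1 |
        sort | uniq -c | sort -n
}

# Usage: netstat -ntu | count_remote_ips
```

The busiest remote addresses end up at the bottom of the list, as in the second command above.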

Dumping a PostgreSQL database remotely with SSH

I recently ran into a rare problem with a large PostgreSQL database that was filling up the local disks of a server. The database was over 100 GB, with about 300 million records. There was a lot of churn and it had not been vacuumed in a long time. When I ran a vacuum manually, there was not enough working disk space to complete the operation, which left me in a bind.

What I decided to do instead of using vacuum was to dump it to a remote backup location, then drop the database and restore it from the remote dump. I used SSH to run the remote commands.

Dump a remote PostgreSQL database to the local machine

ssh user@remote-database-server 'pg_dump database-name -t table-name' > table-name.sql

Restore a remote PostgreSQL database dump to the local database server

ssh user@backup-machine 'cat table-name.sql' | psql -d database-name

Note that the dump command is run from the backup machine, while the restore command is run from the database server. Also note the single quotes around certain parts of the command.
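For a dump of this size, compressing the data before it crosses the network may save both time and backup space. The following is only a sketch under the assumption that gzip is installed on both machines; remote_dump_gz and the SSH_CMD variable are hypothetical names introduced for illustration, and database or table names containing spaces would need extra quoting.

```shell
#!/bin/bash
# Stream a gzip-compressed dump of one table from a remote database server.
# remote_dump_gz is a hypothetical helper; SSH_CMD is an assumption that lets
# the transport command be swapped out (it defaults to plain ssh).
remote_dump_gz() {
    local host=$1 db=$2 table=$3
    # The quoted string runs on the remote side, so compression happens there.
    "${SSH_CMD:-ssh}" "$host" "pg_dump $db -t $table | gzip -c"
}

# Usage, run from the backup machine:
#   remote_dump_gz user@remote-database-server database-name table-name > table-name.sql.gz
# Matching restore, run from the database server:
#   ssh user@backup-machine 'cat table-name.sql.gz' | gunzip -c | psql -d database-name
```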

Automating FTP with shell scripts [classic]

People frequently need to automate FTP sessions to upload or download files.
Most command line FTP clients, including the FTP client on the Mac, can log in to an FTP server automatically by reading the .netrc file in the user's home directory. Note that the name of the FTP auto-login file starts with a dot (dot netrc).

Syntax of the $HOME/.netrc

The .netrc file can contain more than one auto-login configuration. Each FTP server has a set of commands, the minimum being the login name and password. You can create as many machine sections as you need. Here is a generic example:

machine ftp.server.com
login myuserID
password mypassword

Very Important: .netrc permissions!
Since user IDs and passwords are stored in the .netrc file, the FTP client enforces permission checking on it. It must be set so that neither the group nor other users can read or write it. Once the file is created, you can set the permissions with this command from the Terminal (from your home directory):
chmod 600 .netrc
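If the file does not exist yet, it can be created with owner-only permissions from the start, so the password is never briefly readable by others. This is a minimal sketch; secure_netrc is a hypothetical helper name, and it uses mode 600 (read and write for the owner only), which is enough because the file never needs to be executable.

```shell
#!/bin/bash
# Create the netrc file if it is missing and restrict it to the owner.
# secure_netrc is a hypothetical helper; with no argument it uses ~/.netrc.
secure_netrc() {
    local file=${1:-$HOME/.netrc}
    # The subshell keeps the umask change from leaking into the caller.
    ( umask 077 && touch "$file" ) && chmod 600 "$file"
}

# Usage: secure_netrc        # then edit ~/.netrc and add the machine entries
```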

Adding FTP commands in a BASH script
You can embed FTP commands in a BASH script to upload and download files.
For example, you could create a script file named ftpupload.sh:

#!/bin/bash
# upload a file
/usr/bin/ftp -i ftp.server.com <<ENDOFCOMMANDS
cd backupdir
cd subdir
put datafile
quit
ENDOFCOMMANDS

In this example, I added the -i switch when running FTP to prevent it from prompting on multiple file uploads/downloads, even though it is only uploading one file in the example. I also use the BASH HERE document feature to send commands to FTP. When the script is run, it will auto-login using the information in the .netrc file, change to the right remote directory and upload the datafile.

Scheduling the script with Cron
The last step is to get the BASH script to run unattended, say every day at 5:00 am. The old school UNIX way is to use Cron, but the fancy new Apple way is to use a launchd XML configuration. As long as cron is supported in OS X, I'll stick to the old school way. I leave the launchd configuration as an exercise for the reader.

Add these lines with the command "crontab -e", then save:

# automated FTP upload
0 5 * * * /Users/username/ftpupload.sh
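The crontab entry can also be added without opening an editor, which helps when setting up several machines. The sketch below is one possible approach; add_cron_line is a hypothetical helper that reads the current crontab on stdin and prints it back with the new entry appended only if it is not already there.

```shell
#!/bin/bash
# Print the crontab text from stdin, adding the given entry if it is missing.
# add_cron_line is a hypothetical helper; it never touches the real crontab
# itself, it only produces the new text.
add_cron_line() {
    local line=$1 current
    current=$(cat)
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    # grep -x matches whole lines, -F treats the entry as a fixed string.
    if ! printf '%s\n' "$current" | grep -qxF -- "$line"; then
        printf '%s\n' "$line"
    fi
}

# Usage:
#   crontab -l 2>/dev/null | add_cron_line '0 5 * * * /Users/username/ftpupload.sh' > /tmp/crontab.new
#   crontab /tmp/crontab.new
```

Writing to a temporary file first, rather than piping straight into crontab, leaves a chance to inspect the result before it replaces the existing schedule.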

Friday, February 24, 2012

Getting by in Git

Git is the source code management system used for the Linux kernel and many other highly complex projects. It was written by Linus Torvalds after a controversy over BitKeeper, the proprietary program previously used to manage the Linux kernel source.

I've needed to upgrade my skills recently to use git in place of subversion because that is what my shop decided to use. I've moved all my Rails code into a remote git server and so far, so good. In one project, I was asked to track large video files with git and ran into a problem. When pushing files to a remote repository, git tries to compress all files in memory before sending. I got out of memory errors and had to commit a few files at a time.

One improvement is that git leaves fewer "droppings" than subversion. There is no hidden .svn directory in each directory with source code, only a single .git directory at the root of the project, plus a .gitignore file for files you don't want git to track.

Initialize project tracking

git init

Check out an existing project from remote server

git clone ssh://server/git/project

Add a file for git to track

git add /path/to/file

Add all files from this directory and below for git to track

git add .

Commit all files to local repository

git commit -a -m "message"

Undo changes to a file (re-check out from repository)

git checkout -- /path/to/file

Undo the most recent commit

git reset --soft HEAD~1

Pull files from remote repository and merge with local repository

git pull

Push files to remote repository (must commit first)

git push

Move file or directory to new location

git mv path destination

Remove file or directory from the index (stops tracking, but doesn't delete)

git rm --cached /path/to/file

Remove file or directory from the working tree (deletes the file)

git rm /path/to/file

Creating a remote repository from an existing project takes several steps

cd /tmp

git clone --bare /path/to/project (creates a /tmp/project.git directory)

scp -r project.git server:/git/project

cd /path/to/project

git remote add origin ssh://server/git/project

git config branch.master.remote origin

git config branch.master.merge refs/heads/master
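The steps above can be sketched as a single function. This is only an illustration under some assumptions: publish_project is a hypothetical name, a local path stands in for the remote server location, and it needs a git version that supports the -C option. It sets up tracking for whichever branch is checked out rather than hard-coding master.

```shell
#!/bin/bash
# Publish an existing project to a bare repository and set up tracking.
# publish_project and its arguments are hypothetical; in real use the bare
# repository would be copied to the server and the URL would be ssh://...
publish_project() {
    local project=$1 bare=$2 branch
    git clone --quiet --bare "$project" "$bare" || return 1
    git -C "$project" remote add origin "$bare" || return 1
    branch=$(git -C "$project" symbolic-ref --short HEAD) || return 1
    # Same effect as the two "git config branch.master.*" commands in the
    # text, applied to the currently checked-out branch.
    git -C "$project" config "branch.$branch.remote" origin
    git -C "$project" config "branch.$branch.merge" "refs/heads/$branch"
}

# Usage: publish_project /path/to/project /tmp/project.git
```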

Branches

Create a new branch

git checkout -b mybranch

After committing all changes to the new branch, you eventually need to merge them back into the master branch and push the result to the master repository (usually origin) with the following four steps.

Switch back to the master branch

git checkout master

Merge in the new branch with a merge commit (no fast forward)

git merge --no-ff mybranch

Delete the branch

git branch -d mybranch

Push to master repository

git push origin
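The local part of this sequence (switch, merge, delete) can be wrapped in a small function. This is a sketch, not a fixed recipe: merge_topic is a hypothetical name, the target branch defaults to master as an assumption, and the push is left to the caller because it needs a configured remote.

```shell
#!/bin/bash
# Merge a finished topic branch into a target branch with a merge commit,
# then delete the topic branch. Run it from inside the working tree.
# merge_topic is a hypothetical helper; the target defaults to master.
merge_topic() {
    local topic=$1 target=${2:-master}
    git checkout --quiet "$target" &&
        git merge --quiet --no-ff -m "Merge branch '$topic'" "$topic" &&
        git branch -d "$topic"
}

# Usage: merge_topic mybranch && git push origin
```

Because the steps are chained with &&, the branch is deleted only if the merge succeeded.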

This is a good article on branch and release management.

Another simple guide on the basics.

Updating the master repo after a hard reset

git clone ssh://repo-server/repos/project.git .

git reset --hard 89480e60

touch dummy.txt

git add .

git commit -m "Rolled back (added dummy.txt, can be removed later)"

git push --force

Tuesday, February 21, 2012

Yum indigestion

Yum (Yellowdog Updater, Modified) has become a standard tool in Red Hat (and CentOS) Linux. It replaced the aging up2date tool for installing, updating, and deleting packages.

While yum brought a number of improvements, it also brought a hidden problem: the dbcache.

Yum archives downloaded packages/updates and tracks them with SQLite files in /var/cache/yum. Downloaded packages are not automatically deleted, and neither are the cache files, which can chew up a lot of disk space over time. The remedy is to periodically delete the cache files with:

yum clean dbcache

To remove downloaded packages, use:

yum clean packages

To remove cache, packages, and metadata (must be downloaded again the next time yum is run), use:

yum clean all
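To decide when a cleanup is worth running, the size of the cache directory can be checked first. The sketch below assumes the usual cache location of /var/cache/yum; cache_over_limit and the 1 GB default threshold are hypothetical choices for illustration.

```shell
#!/bin/bash
# Succeed (exit status 0) when the cache directory uses more than limit_kb.
# cache_over_limit is a hypothetical helper; the directory defaults to
# /var/cache/yum and the limit to 1048576 KB (1 GB), both assumptions.
cache_over_limit() {
    local dir=${1:-/var/cache/yum} limit_kb=${2:-1048576} used_kb
    # du -sk reports the total size of the directory in kilobytes.
    used_kb=$(du -sk "$dir" 2>/dev/null | awk '{print $1}')
    [ "${used_kb:-0}" -gt "$limit_kb" ]
}

# Usage (as root): cache_over_limit && yum clean packages
```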