Linux Fireball: 2012

Wednesday, December 5, 2012

Ruby one liners

One liner to edit a file in place

This command edits a file in place, performing a global text search/replace. The -p switch tells ruby to place your code in the loop while gets; ...; print; end. The -i tells ruby to edit files in place, and -e indicates that a one line program follows.

ruby -p -i -e '$_.gsub!(/248/,"949")' file.txt

One liner to remove DOS line endings from a file

A Unix or Mac text file uses a single line feed character to mark the end of a line (hex 0A). Files created on DOS or Windows use two characters, carriage return and line feed (hex 0D 0A). This command edits a file in place, removing the carriage returns.

ruby -p -i -e '$_.gsub!(/\x0D/,"")' file.txt
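If Ruby is not installed, the same cleanup can be roughly sketched with tr. This is only an illustration: the scratch directory and the file name dosfile.txt are made up, and because tr cannot edit in place, the sketch writes a cleaned copy and renames it.

```shell
# Sketch: strip carriage returns with tr (file names are hypothetical).
workdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$workdir/dosfile.txt"

# tr cannot edit in place, so write a cleaned copy and rename it.
tr -d '\r' < "$workdir/dosfile.txt" > "$workdir/dosfile.tmp"
mv "$workdir/dosfile.tmp" "$workdir/dosfile.txt"

# The file started at 20 bytes; 18 means both carriage returns are gone.
wc -c < "$workdir/dosfile.txt"
```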

One liner to edit files in place and backup each one to *.old

ruby -p -i.old -e '$_.gsub!(/248/,"949")' *.txt
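For comparison, sed accepts a similar -i.old switch. The sketch below assumes a scratch directory and a made-up file named phones.txt; it is a rough equivalent of the Ruby one-liner, not a replacement for it.

```shell
# Sketch: in-place edit with a backup using sed (file name is hypothetical).
workdir=$(mktemp -d)
printf 'area code 248\n' > "$workdir/phones.txt"

# -i.old edits the file in place and keeps the original as phones.txt.old.
sed -i.old 's/248/949/g' "$workdir/phones.txt"

cat "$workdir/phones.txt"       # edited copy
cat "$workdir/phones.txt.old"   # untouched backup
```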

Saturday, November 3, 2012

Finding files owned by a user or group

To find all files from the current directory owned by user "john":

find . -user john -print

To find all files from the current directory owned by group "john":

find . -group john -print

You can also use numeric IDs (-uid or -gid) instead. The find command searches hidden files and directories automatically (those with names that start with a period).
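Here is a small sketch of the numeric form. It assumes a scratch directory with two made-up files, one of them hidden, and uses id -u to look up the numeric ID of whoever runs it.

```shell
# Sketch: find files by numeric user ID, including hidden ones.
workdir=$(mktemp -d)
touch "$workdir/visible.txt" "$workdir/.hidden.txt"

# id -u prints the numeric ID of the current user.
found=$(find "$workdir" -type f -uid "$(id -u)" | sort)
echo "$found"
```

Both files should be listed, since find does not skip names that start with a period.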

Saturday, August 11, 2012

Finding the size of directories and subdirectories

I can't take credit for this sequence of commands. I found several variations through search engines. I apologize for not linking to the site(s) where I found it. I can't find them. However, it is useful enough to document for posterity.


ls -AF | grep \/ | sed 's/\ /\\\ /g' | xargs du -sh
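A possible alternative that avoids parsing ls output is to let find hand the directory names straight to du. This is a sketch under my own assumptions, not the command I originally found: the scratch directory and its subdirectory names are invented for the demo.

```shell
# Sketch: per-directory sizes without parsing ls output.
workdir=$(mktemp -d)
mkdir "$workdir/plain" "$workdir/with space" "$workdir/.hidden"
echo data > "$workdir/with space/file.txt"

# -mindepth 1 -maxdepth 1 limits find to the immediate subdirectories;
# -exec ... {} + passes the names to du safely, spaces and all.
sizes=$(cd "$workdir" && find . -mindepth 1 -maxdepth 1 -type d -exec du -sh {} +)
echo "$sizes"
```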

Monday, July 9, 2012

Installing Python modules

Installing a Python module

Most modules come with a setup.py program that uses the distutils module to install a new module. Use this command in the module directory to install it:
python setup.py install

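Successful installation copies the module into Python's site-packages directory (for example, /usr/lib/pythonX.Y/site-packages/) and places any command line scripts in either /usr/bin/ or /usr/local/bin/.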

Listing installed modules

pydoc modules

Saturday, June 9, 2012

Crontab [classic]

Crontab fields
Here is a typical crontab entry and the definition of cron fields:
00 03 * * * /tmp/zing.sh

minute (00-59)
hour (00-23)
day of the month (01-31)
month of the year (01-12)
day of the week (0-6 with 0=Sunday)
program or command to run

An asterisk (*) in any field means run every time for this field.
Ranges (X-Y) and steps (X-Y/Z) can also be defined.
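To make the field order concrete, here is a sketch that splits the sample entry into named fields with the shell's read builtin. The variable names are my own, and the extra schedules in the trailing comments are illustrative placeholders rather than entries from a real crontab.

```shell
# Sketch: split a crontab entry into its six fields with read.
# The last variable receives the rest of the line, so commands with
# arguments stay intact.
read -r minute hour dom month dow command <<'ENTRY'
00 03 * * * /tmp/zing.sh
ENTRY

echo "minute=$minute hour=$hour day-of-month=$dom month=$month day-of-week=$dow"
echo "command=$command"

# A few more schedules, using ranges and steps (commands are placeholders):
#   */15 * * * *      /tmp/zing.sh   every 15 minutes
#   0 9-17 * * 1-5    /tmp/zing.sh   hourly from 09:00 to 17:00, Monday to Friday
#   30 2 1-31/2 * *   /tmp/zing.sh   02:30 on odd-numbered days of the month
```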

Special @reboot option
Recent versions of cron support @reboot instead of time and day settings to run a command after boot. For example:
@reboot /usr/bin/command-to-run

I am not sure if I like this option, compared to using system start up scripts, like /etc/rc.d/rc.local.

User crontabs (including the root user)
To edit a user crontab (including root):
crontab -e
To delete a crontab:
crontab -r

User crontabs are stored as text files in /var/spool/cron/ (Debian and Ubuntu use /var/spool/cron/crontabs/).

System crontab
The system crontab is stored in /etc/crontab. It can be changed (as root) with a text editor.
The system crontab has one extra field before the program to run, which is the user to run the command as (usually root).

Thursday, June 7, 2012

Rsync [classic]

Rsync is a file and directory synchronization tool. It can be used to keep local and/or remote files synchronized. What makes rsync powerful is that it can detect changes within large files and only transfer the parts that are different. This is much more efficient than transferring entire files, especially as the file sizes get bigger.

Note: the examples that use a remote shell rely on ssh with public key authentication.

To synchronize a local directory with a remote one (pull), use:

rsync -r -a -v -e "ssh -l username" --delete hostname:/remote/dir/ /local/dir/

To synchronize a remote directory with a local one (push), use:

rsync -r -a -v -e "ssh -l username" --delete /local/dir/ hostname:/remote/dir/

To synchronize a local file with a remote one, use:

rsync -a -v -e "ssh -l username" hostname:/filename /local/filename

To synchronize a remote file with a local one, use:

rsync -a -v -e "ssh -l username" /local/filename hostname:/filename

To synchronize a local directory with a remote rsync server:

rsync -r -a -v --delete rsync://rsync-server.com/stage/ /home/stage/

To synchronize a local directory with a local directory (make a backup), use:

rsync -r -a -v --delete /local/dir/ /backup/dir/

Tuesday, May 22, 2012

PostgreSQL sorting NULLS

Here is a handy feature I didn't know existed until today.

You can sort null values to the top or bottom of a list by adding "NULLS FIRST" or "NULLS LAST" at the end of the ORDER BY clause.

For a real world example,


SELECT DISTINCT p.permit_id,city_permit_num,street_no,
             street, name, phone1, issue_date,
             pl1.notes as work_description,
             pl2.notes as work_type
             FROM building.permit p
             LEFT JOIN public.location l ON p.location_id = l.location_id
             LEFT JOIN building.contact_permit cp ON cp.permit_id = p.permit_id
             LEFT JOIN public.contact c ON c.contact_id = cp.contact_id
             JOIN building.permit_line pl1 ON pl1.permit_id = p.permit_id AND pl1.status = 'A'
             JOIN building.item i1 ON pl1.item_id = i1.item_id
               AND i1.description = 'Work Description'
             JOIN building.permit_line pl2 ON pl2.permit_id = p.permit_id
               AND pl2.status = 'A'
             JOIN building.item i2 ON pl2.item_id = i2.item_id
               AND i2.description ILIKE $1
             WHERE street = $2 AND street_no = $3
             ORDER BY issue_date DESC NULLS LAST

We get a nice date descending result with null dates at the bottom of the list!
Yes, this is a real query in one of my apps. Sigh.

Wednesday, May 16, 2012

PostgreSQL Psql

PostgreSQL is a robust and powerful open source database. It has more advanced features than any other open source database and scales well with huge datasets and high traffic loads.

By default, PostgreSQL listens on TCP port 5432.

Dump all databases

pg_dumpall --clean > databases.sql

Dump a database with compression (-Fc)

pg_dump -Fc --file=database.sql --clean database

Dump a database, plain text, one schema only (-n)
pg_dump -Fp --file=filename.sql -n schema --clean database

Dump a single table

Specify the schema along with the table name (if applicable):
pg_dump -t schema.table database

Dump a table definition (no data)

pg_dump -s -t schema.table database

Restore a database from a dump file

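pg_restore -Fc -d database database.sql
note: the database to restore to is after the -d switch; without it, pg_restore prints the SQL to standard output instead of restoring anything.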

Restore a single table from a dump file

pg_restore -v -e -Ft -d database -n schema -t tablename dumpfile.tar
note: in this case, the dump file is in tar format, the database to restore to is after the -d switch and the table to restore is after the -t switch.

Copy data from a file into a table (from the psql client)

COPY table-name FROM '/path/to/filename' DELIMITER 'delimiter';
note: COPY reads the file on the database server, so it must be readable by the postgresql server process (for example, chmod 644); the default delimiter is tab.

Copy data from a table to a file (from the psql client)

COPY table-name TO '/path/to/filename' DELIMITER 'delimiter';
note: the directory and file must be writable by postgresql, the default delimiter is tab.

List all schemas
select schema_name from information_schema.schemata

Start the PostgreSQL interactive terminal

psql

Psql - show a list of databases

\l
Lowercase L, not the number 1

Psql - show all users

select * from pg_user;

Psql - show all tables (including system tables)

select * from pg_tables;

Psql - show tables in the current context (database/schema)

\d

Psql - show description of tablename

\d tablename

Psql - show description of tablename, along with constraints, rules, and triggers

\d+ tablename

Psql - change current database

\c database;

Psql - show all schemas in the current database

\dn

Psql - Grant permissions on a schema to a user

GRANT ALL ON SCHEMA myschema TO username;

Psql - quit psql

\q

Psql - show help

\?

Psql - copy a table to a tab delimited file

COPY table TO 'table.txt';

Psql - load a table from a tab delimited file

COPY table FROM 'table.txt';

Psql - show permissions on database objects

\z [object]

r -- SELECT ("read")
w -- UPDATE ("write")
a -- INSERT ("append")
d -- DELETE
R -- RULE
x -- REFERENCES (foreign keys)
t -- TRIGGER
X -- EXECUTE
U -- USAGE
C -- CREATE
T -- TEMPORARY
arwdRxt -- ALL PRIVILEGES (for tables)
* -- grant option for preceding privilege
/yyyy -- user who granted this privilege

Psql - getting or setting sequence values
Get next value of a sequence:
SELECT nextval('this_id_seq');

Set current value of a sequence to 1000:
SELECT setval('this_id_seq', 1000);

Grant access to all tables in a schema

GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO username;

Run the vacuum utility (for version less than 9.0)

vacuumdb --verbose --analyze --all
Note: vacuum reclaims space from deleted rows, and the --analyze switch refreshes the statistics used by the query planner. On older versions it should be set up in cron; autovacuum has been enabled by default since PostgreSQL 8.3.

Increase performance with shared memory

One effective performance tuning tip for PostgreSQL is to increase the shared memory buffers. This might require adding RAM to the server. Many Linux distros default to 32MB of shared memory, controlled by two kernel parameters:
/proc/sys/kernel/shmmax
/proc/sys/kernel/shmall

These values can be changed at run time, but it is better to set them at boot using the /etc/sysctl.conf file. This increases shared memory to 1GB:
# increase shared buffers for postgres at boot
kernel.shmmax=1073741824
kernel.shmall=2097152

Then, tell PostgreSQL to use 768MB of the 1GB available in the /var/lib/pgsql/data/postgresql.conf file:
shared_buffers = 98304 # min 16, at least max_connections*2, 8KB each

Restart PostgreSQL for the change to take effect.
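The numbers above come from simple unit conversions, which can be sketched with shell arithmetic. The variable names are mine; the only facts assumed are the ones stated above, that shared_buffers is counted in 8KB buffers and kernel.shmmax is given in bytes.

```shell
# Sketch: derive the settings used above from sizes in MB.
# shared_buffers is counted in 8KB buffers; kernel.shmmax is in bytes.
shared_mb=768
shmmax_mb=1024

shared_buffers=$(( shared_mb * 1024 / 8 ))
shmmax=$(( shmmax_mb * 1024 * 1024 ))

echo "shared_buffers = $shared_buffers"
echo "kernel.shmmax=$shmmax"
```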

Tuesday, April 3, 2012

LVM Basics

I've just spent a few hours with EqualLogic iSCSI SAN disks and the Linux Logical Volume Manager (LVM). The abstraction is even deeper than that, because Linux at work is running under VMware, so it is really VMware talking to the SAN and presenting a SCSI disk to Linux. Since I only get into the LVM weeds a few times a year, I thought it would be helpful to list the steps to get usable disk space under Linux starting with the raw disk space.

Step One - create a new partition

Create a new partition with fdisk or parted. Mark the partition type hex 8E for LVM. In my case, the SCSI disk appeared as /dev/sdb and the partition using all space became /dev/sdb1. LVM is capable of using an unpartitioned raw device, but I stayed in familiar partitioning territory.

Step Two - create LVM physical volume

pvcreate /dev/sdb1

Step Three - create LVM volume group in the physical volume

vgcreate new_volume_group /dev/sdb1

Step Four - create LVM logical volume in the volume group

lvcreate --name new_logical_volume --size 100G new_volume_group

Step Five - create a file system on the logical volume

mkfs -t ext4 /dev/mapper/new_volume_group-new_logical_volume
Note: Linux device mapper automatically creates a symlink to the disk in /dev/mapper using the volume group and logical volume names. If you choose more meaningful names than the example, the name won't look so awful.

Step Six - turn off automatic file system checks (optional)

tune2fs -c 0 /dev/mapper/new_volume_group-new_logical_volume

Step Seven - add mount point in /etc/fstab

Once the mount point is listed in fstab, mount it manually and it is ready to use.
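As a sketch of what the fstab entry could look like, the snippet below only builds and prints the line; it does not touch /etc/fstab. The /data mount point is hypothetical, and the defaults option with 1 2 for the dump and fsck fields is a common choice for an ext4 data volume, not a requirement.

```shell
# Sketch: build the fstab line for the new logical volume.
# /data is a hypothetical mount point; "defaults" and "1 2" are common
# choices for an ext4 data volume, not requirements.
device=/dev/mapper/new_volume_group-new_logical_volume
mount_point=/data

fstab_line=$(printf '%s\t%s\text4\tdefaults\t1 2' "$device" "$mount_point")
echo "$fstab_line"

# As root, after adding that line to /etc/fstab:
#   mkdir -p /data
#   mount /data
```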

Saturday, March 31, 2012

File Timestamps [classic]

Each file has three timestamps associated with it (stored as the number of seconds since the Epoch, Jan 1, 1970). The three timestamps are:

Access time (atime) - the last time the file was read
Modify time (mtime) - the last time the file contents were changed
Change time (ctime) - the last time the file's inode changed, such as its permissions, ownership, or link count (writing to the file updates it too)

In a long directory listing, the timestamp shown is the Modify time (mtime). To see all timestamps and a lot of other useful information, use the stat program. On Mac OS X and the BSDs, add the verbose option (-x); on Linux, plain stat filename prints the same details:
stat -x filename

Here is sample output from stat:
keithw$ stat -x "Mona Lisa Overdrive.mp3"
File: "Mona Lisa Overdrive.mp3"
Size: 6853358 FileType: Regular File
Mode: (0644/-rw-r--r--) Uid: (501/ keithw) Gid: (501/ keithw)
Device: 14,9 Inode: 10208 Links: 1
Access: Fri May 25 11:46:30 2007
Modify: Fri Dec 8 16:38:54 2006
Change: Fri Dec 8 16:38:54 2006
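The difference between mtime and ctime can be sketched as follows. The file name is made up, and the sketch assumes either GNU stat (which takes -c) or BSD stat (which takes -f); it sets mtime back to a date in 2006 with touch, then shows that ctime still reflects the moment the inode was last touched.

```shell
# Sketch: show that mtime can be set back while ctime keeps tracking
# the most recent inode change.
workdir=$(mktemp -d)
file="$workdir/sample.txt"
echo hello > "$file"

# Push mtime (and atime) back to 8 Dec 2006, 16:38:54 local time.
touch -t 200612081638.54 "$file"
chmod 600 "$file"

if stat -c '%Y' "$file" >/dev/null 2>&1; then
    mtime=$(stat -c '%Y' "$file")   # GNU: seconds since the Epoch
    ctime=$(stat -c '%Z' "$file")
else
    mtime=$(stat -f '%m' "$file")   # BSD equivalents
    ctime=$(stat -f '%c' "$file")
fi

echo "mtime=$mtime ctime=$ctime"
```

The ctime value should be much larger than mtime, because touch and chmod both update the inode.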

Deleting files with bad names [classic]

If a file with a bad name gets accidentally created, such as a name that begins with a hyphen "-", it can't be deleted with a normal remove command (rm). Use the "--" option to tell rm that no more options follow, then it can delete the file.

To delete a file whose file name begins with "-":

rm -- -bad-file-name
Or
rm ./-bad-file-name

To delete a file with non-printable characters in the name:

Use shell wildcards, '?' for one character and '*' for zero or more characters. For example, if the file name "bad file name" can't be deleted, one of the spaces may in fact be a non-printable character. Try:
rm bad?file?name

caution: run ls bad?file?name first to make sure you are not matching more files than you think with wildcards before deleting them.
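Both techniques can be tried safely in a scratch directory. In this sketch the awkward names are created on purpose (the second one hides a tab where a space appears to be), so nothing outside the temporary directory is touched.

```shell
# Sketch: create and remove awkwardly named files in a scratch directory.
workdir=$(mktemp -d)
startdir=$PWD
cd "$workdir" || exit 1

# A name that starts with a hyphen; "--" ends option parsing.
touch -- -bad-file-name
rm -- -bad-file-name

# A name with a tab where a space appears to be; "?" matches it.
touch "$(printf 'bad\tfile name')"
ls bad?file?name        # check what the wildcard matches first
rm bad?file?name

remaining=$(ls -A | wc -l | tr -d ' ')
echo "files left: $remaining"
cd "$startdir" || exit 1
```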

Thursday, March 29, 2012

Asus 1001-PXD Project: Keyboard Replacement

After I got Easy Peasy running, I ordered a replacement keyboard on eBay.

The keyboard that came with it was warped and bulging a little in spots, so I wanted to install a fresh one.

I found some instructions online that described the steps involved, but this YouTube video was very helpful.

You can use a spudger or a small standard screwdriver to make the switch. The key to getting the top part of the keyboard out is to push the small connectors at the top back a little, allowing the keyboard to pop out. I mangled one of the keys on the old keyboard trying to find the right angle. No permanent damage.

The keyboard I bought was for a 1001HA, but it worked perfectly in the 1001PXD. In fact, the colorized icons look even better than the original. The netbook is working like a dream.

Saturday, March 24, 2012

Asus 1001-PXD Project: Installation

I bought an Asus 1001PXD netbook on eBay to create a custom Linux "couch computer". The idea was to have something light, portable, and secure that didn't cost as much as an iPad or Galaxy tablet. I had done a little research on which netbooks were Linux compatible and on the various netbook oriented distributions, but it turned out to be a little more trouble than I expected.

Updating the BIOS for USB boot

The netbook came with Windows 7 starter edition. The first step was to update the BIOS so it supported booting from a USB flash drive. One of the guides I referenced was this one for Crunch Bang Linux.

One of the early steps to get ready for the BIOS update is:

prepare your USB flash drive with a: 16 mb FAT16 partition at the start of the disk

I first tried using a FAT32 USB drive, but that was never recognized by the Asus BIOS update utility. Next, I tried to partition the USB flash drive from Windows 7 and that was not supported. I ended up creating a 4MB FAT16 partition on a small USB flash drive and that didn't work. Finally, I used a program called BootIce to create a 1MB FAT16 partition on the same flash drive and at last it was recognized and updated the BIOS.

Installing Easy Peasy

After looking through some choices of netbook optimized distributions, I decided to try Easy Peasy. The interface is tablet like, one maximized window at a time, big icons, easy navigation, etc. I downloaded the ISO and unetbootin to create the USB boot image. The one trick here is to first reformat the USB flash drive partition back to FAT32 for booting. I tried leaving it as FAT16 and it didn't work. When booting, I hit ESC to bring up the boot menu, then chose USB and once it was up, chose to install on the hard drive with default options.

Out of the box, everything appears to be working: wireless networking, suspend/resume, camera, package installation and updates, etc. The online documentation for Easy Peasy seems a bit sparse, and the wiki had some signs of spam vandals, but I am pleased with the choice so far.

Thursday, March 22, 2012

Dusting off nano

For a number of years, nano was my favorite text editor. It was easy, modeless, and had most of the features you need. It doesn't compete with the heavyweights, vi and emacs, but is simple and elegant.

When I am wearing my programming hat, there are two killer features I need: brace matching and syntax highlighting. I was recently jarred by how much I had come to depend on syntax highlighting when I opened a source file on a foreign computer where highlighting was not configured. I thought, how am I supposed to make sense of this? Then I remembered that I didn't always have highlighting and managed to get by.

Anyway, I dug into the current release of nano and found brace matching can be enabled with this .nanorc setting:
set matchbrackets "(<[{)>]}"

A matching brace can be found by putting the cursor on one and using Alt-].

Syntax highlighting has to be defined in the .nanorc file for each type of file. In Red Hat Linux and CentOS, a lot of languages are available to be included in the .nanorc file in /usr/share/nano/. For example,

## Ruby
include "/usr/share/nano/ruby.nanorc"

Go nano!

Tuesday, March 13, 2012

Vi[m] [classic]

While in command mode (case sensitive)

move the cursor with arrow keys; if there aren't any arrow keys, use j,k,h,l
i - change to insert mode (before cursor)
a - change to insert mode (after cursor)
A - change to insert mode (at end of line)
r - replace one character
R - overwrite text
x - delete one character
dd - delete one line
yy - yank line (copy)
p - paste deleted or yanked text after cursor
P - paste deleted or yanked text before cursor
G - go to end of the file
1G - go to top of the file
J - merge next line with this one
/ - search, follow / with text to find
:wq - write file and quit
:q! - quit without saving
:%s/old/new/g - substitute; replace "old" with "new" on all lines
:g/pattern/d - delete all lines that match the pattern

While in insert mode

ESC - change to command mode
any text typed is entered at the cursor

If you gave vi a whirl and don't dig it, give the nano editor a try.

Friday, March 9, 2012

Simple Ajax Update in Rails 3

There have been several updates to the way Rails 3 interacts with Ajax.

One big change was the switch from Prototype to jQuery as the standard JavaScript library. Having spent a short time with jQuery, I can see how it is much more elegant and easier to understand, at least superficially. I am not a JavaScript expert, but I've done a fair bit of "raw" coding accessing and manipulating the DOM, enough to appreciate the power of jQuery.

Here are the minimal code snippets to update an element in a web page with an Ajax call using the new syntax.

In the view:
<%= link_to "test", { :action => 'testajax' }, :remote => true %>
<div id='testdiv'></div>

The :remote => true indicates an Ajax link. We are going to update the div testdiv.

In the controller, define a function to handle the call:

def testajax

  @testdata = "ajaxy data"

end

Create a new view file for the javascript response (in this case named testajax.js.erb):
$("#testdiv").html("<%= @testdata %>");

This is jQuery JavaScript code, but you can use embedded Ruby to pull in instance variables from the controller. We are selecting the element with ID testdiv, then replacing its HTML content with the data set in the controller, "ajaxy data".

Tuesday, February 28, 2012

Rails scaffolding types

These types are valid since Rails version 2:

string
text (long text, up to 64k, often used for text areas)
datetime
date
integer
binary
boolean
float
decimal (for financial data)
time
timestamp

Here is an example of using rails 3.x scaffolding with data types, run from the Rails application root directory:

ruby script/rails generate scaffold Modelname name:string title:string employed_on:date remarks:text

Find processes attached to TCP ports [classic]

Use fuser to find processes attached to TCP ports

To see all processes, run fuser as root or with sudo.

To list all processes (with owner of the process) connected to SSH, TCP port 22:

fuser -u -n tcp 22

Saturday, February 25, 2012

Finding IPs connected to your web server [classic]

Get all IPs connected to your web server

netstat -ntu | sed -e 's/::ffff://g' | awk '{print $5}' | cut -d : -f1 | sort -n

Get all unique IPs connected to your web server

netstat -ntu | sed -e 's/::ffff://g' | awk '{print $5}' | cut -d : -f1 | sort | uniq -c | sort -n
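To see what each stage of the pipeline does, it can be run on canned input instead of live netstat output. The three connection lines below are invented sample data using documentation address ranges; only the filter stages come from the command above.

```shell
# Sketch: run the filter stages on canned input instead of live netstat.
# The addresses below are invented sample data (documentation ranges).
sample='tcp 0 0 192.0.2.10:80 198.51.100.7:51234 ESTABLISHED
tcp 0 0 192.0.2.10:80 198.51.100.7:51240 ESTABLISHED
tcp 0 0 ::ffff:192.0.2.10:80 ::ffff:203.0.113.5:40022 ESTABLISHED'

# Stage by stage: strip the ::ffff: prefix, keep column 5 (the foreign
# address), drop the port, then count connections per address.
counts=$(printf '%s\n' "$sample" | sed -e 's/::ffff://g' | awk '{print $5}' |
    cut -d : -f1 | sort | uniq -c | sort -n)
echo "$counts"
```

The result has one line per remote address, with the busiest address last.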

Dumping a Postgresql database remotely with SSH

I ran into a rare problem recently with a large Postgresql database that was filling up the local disks of a server. The database was large, over 100 GB and about 300 million records. There was a lot of churn and it had not been vacuumed in a long time. When I manually ran a vacuum on it, there was not enough working disk space to complete the operation, creating a bind.

What I decided to do instead of using vacuum was to dump it to a remote backup location, then drop the database and restore it from the remote dump. I used SSH to run the remote commands.

Dump a remote Postgresql database to the local machine

ssh user@remote-database-server 'pg_dump database-name -t table-name' > table-name.sql

Restore a remote Postgresql database dump to the local database server

ssh user@backup-machine 'cat table-name.sql' | psql -d database-name

Note that the dump command is run from the backup machine, while the restore command is run from the database server. Also note the single quotes around certain parts of the command.

Automating FTP with shell scripts [classic]

People frequently need to automate FTP sessions to upload or download files.
Most command line FTP clients, including the FTP client on the Mac, can automatically login to an FTP server by reading the .netrc file in the user home directory. Note that the FTP auto-login file starts with a dot in front of the name (dot netrc).

Syntax of the $HOME/.netrc

The .netrc file can contain more than one auto-login configuration. Each FTP server has a set of commands, the minimum being the login name and password. You can create as many machine sections as you need. Here is a generic example:

machine ftp.server.com
login myuserID
password mypassword

Very Important: .netrc permissions!
Since user IDs and passwords are stored in the .netrc file, the FTP client enforces permission checking on it. It must be set so that no groups and no other users can read or write to it. You can set the permissions on it with this command from the Terminal (from your home directory) once the file is created:
chmod 600 .netrc
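Since .netrc is a plain data file, read and write access for the owner (mode 600) is all it needs. The sketch below creates one in a scratch directory that stands in for the home directory, using the placeholder credentials from the generic example, and then checks the resulting permissions.

```shell
# Sketch: create a .netrc that only its owner can read or write.
# A scratch directory stands in for $HOME; the credentials are the
# placeholder values from the generic example.
fakehome=$(mktemp -d)
netrc="$fakehome/.netrc"

# umask 077 keeps the file private from the moment it is created.
( umask 077; printf 'machine ftp.server.com\nlogin myuserID\npassword mypassword\n' > "$netrc" )
chmod 600 "$netrc"

perms=$(ls -l "$netrc" | cut -c1-10)
echo "$perms"
```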

Adding FTP commands in a BASH script
You can embed FTP commands in a BASH script to upload and download files.
For example, you could create a script file named ftpupload.sh:

#!/bin/bash
# upload a file
/usr/bin/ftp -i ftp.server.com <<ENDOFCOMMANDS
cd backupdir
cd subdir
put datafile
quit
ENDOFCOMMANDS

In this example, I added the -i switch when running FTP to prevent it from prompting on multiple file uploads/downloads, even though it is only uploading one file in the example. I also use the BASH HERE document feature to send commands to FTP. When the script is run, it will auto-login using the information in the .netrc file, change to the right remote directory and upload the datafile.

Scheduling the script with Cron
The last step is to get the BASH script to run unattended, say every day at 5:00 am. The old school UNIX way is to use Cron, but the fancy new Apple way is to use a launchd XML configuration. As long as cron is supported in OS X, I'll stick to the old school way. I leave the launchd configuration as an exercise for the reader.

Add these lines with the command "crontab -e", then save:

# automated FTP upload
0 5 * * * /Users/username/ftpupload.sh

Friday, February 24, 2012

Getting by in Git

Git is the source code management system used for the Linux kernel and many other highly complex projects. It was written by Linus Torvalds after some controversy over the proprietary BitKeeper program that was used to manage the Linux kernel.

I've needed to upgrade my skills recently to use git in place of subversion because that is what my shop decided to use. I've moved all my Rails code into a remote git server and so far, so good. In one project, I was asked to track large video files with git and ran into a problem. When pushing files to a remote repository, git tries to compress all files in memory before sending. I got out of memory errors and had to commit a few files at a time.

One improvement is it has fewer "droppings" than subversion. There is no hidden .svn directory in each directory with source code. Only a single .git directory at the root of the project, plus a .gitignore file for files you don't want git to track.

Initialize project tracking

git init

Check out an existing project from remote server

git clone ssh://server/git/project

Add a file for git to track

git add filename

Add all files from this directory and below for git to track

git add .

Commit all files to local repository

git commit -a -m "message"

Undo changes to a file (re-check out from repository)

git checkout -- filename

Undo the most recent commit

git reset --soft HEAD~1

Pull files from remote repository and merge with local repository

git pull

Push files to remote repository (must commit first)

git push

Move file or directory to new location

git mv path destination

Stop tracking a file or directory (removes it from the index, but doesn't delete it from the working tree)

git rm --cached /path/to/file

Remove file or directory from the repository and the working tree (deletes the file)

git rm /path/to/file

To create a remote repository from an existing project takes several steps

cd /tmp

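git clone --bare /path/to/project
(creates a /tmp/project.git directory)

scp -r project.git server:/git/
(copies the bare repository to the remote server; the /git/ destination is assumed from the ssh://server/git/project URL used in these examples)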

cd /path/to/project

git remote add origin ssh://server/git/project

git config branch.master.remote origin

git config branch.master.merge refs/heads/master

Branches

Create a new branch

git checkout -b mybranch

After committing all changes to the new branch, you eventually need to merge those back into the master branch (and from there push to origin/master) with the following four steps.

Switch back to the master branch

git checkout master

Merge in the new branch as one commit (no fast forward)

git merge --no-ff mybranch

Delete the branch

git branch -d mybranch

Push to master repository

git push origin master
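The branch and merge steps can be rehearsed in a throwaway repository. Everything in this sketch is made up for the demo (the scratch path, file.txt, the demo identity), and since there is no remote, the final push appears only as a comment.

```shell
# Sketch: the merge steps in a throwaway repository.
# Repository path, file names and identity are made up for the demo;
# there is no remote here, so the final push is only shown as a comment.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "initial commit"

git checkout -q -b mybranch
echo two >> file.txt
git commit -q -a -m "work on mybranch"

git checkout -q master
git merge -q --no-ff -m "merge mybranch" mybranch
git branch -q -d mybranch
# git push origin master

git log --oneline --graph
```

The log should show a separate merge commit, which is the point of --no-ff: the branch stays visible in the history even after it is deleted.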

This is a good article on branch and release management.

Another simple guide on the basics.

Updating the master repo after a hard reset

git clone ssh://repo-server/repos/project.git .
git reset --hard 89480e60
touch dummy.txt
git add .
git commit -m "Rolled back (added dummy.txt, can be removed later)"
git push --force

Tuesday, February 21, 2012

Yum indigestion

Yum (Yellowdog Updater, Modified) has become a standard tool in Red Hat (and CentOS) Linux. It replaced the aging up2date tool for installing, updating, and removing packages.

While yum brought a number of improvements, it also brought a hidden problem: the dbcache.

Yum archives downloaded packages/updates and tracks them with sqlite files in /var/cache/yum. Downloaded packages are not automatically deleted, and neither are the cache files, which can chew up a lot of disk space over time. The remedy is to periodically delete the cache files with:

yum clean dbcache

To remove downloaded packages, use:

yum clean packages

To remove cache, packages, and metadata (must be downloaded again the next time yum is run), use:

yum clean all