Recently in Software Category

Unless you've got the correct packages installed, you'll likely get an error with some missing dependencies. You can find the missing packages at:

libicu38
libboost-filesystem1.34.1
libboost-regex1.34.1
libboost-thread1.34.1
libboost-iostreams1.34.1
libboost-signals1.34.1
libboost-date-time1.34.1

Click the above links, choose your architecture, choose a mirror, then download and open with GDebi.

Then install the Amazon MP3 downloader as normal.

When performing a find and execing a grep on the results (scenario: searching for a string in all files in a directory and all subdirectories recursively) on Solaris, it has always frustrated me that the output shows matches but doesn't identify the filename that the match occurred in.

$ find . -type f -exec grep string {} \;

horse
horsefeathers

I'd worked out a couple of solutions that were cumbersome and kludgy before I recently came across a simple, direct solution to the problem.

In Solaris, when you grep for a pattern in one file, the grep command doesn't output the name of the file a match is found in. There is no option to force the display of filenames (like -H in Linux) so you are left to engineer a solution yourself. The best solution I've worked out involves providing a second filename (/dev/null) to the exec'd grep command, forcing it to print out the filename a match occurred in, when a match occurs.

$ find . -type f -exec grep string {} /dev/null \;

./filename1:horse
./filename2:horsefeathers

Problem solved and the solution is cross-platform.
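
A minimal sketch for trying the /dev/null trick on a throwaway directory. The file names and contents below are made up for the demonstration, and `mktemp -d` is assumed to be available on your system.

```shell
# Build a small scratch tree to search (names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'horse\ncow\n' > "$demo/filename1"
printf 'horsefeathers\n' > "$demo/sub/filename2"

# grep is run once per file; /dev/null is the second operand that makes
# grep prefix each match with the name of the file it came from.
matches=$(cd "$demo" && find . -type f -exec grep horse {} /dev/null \; | sort)
echo "$matches"

rm -rf "$demo"
```

Because /dev/null never contains a match, it only changes how grep labels its output, not what it finds.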

My hardware:
  • Canon Rebel XTi (uses Compact Flash (CF) memory)
  • Inland USB multi-format SD/CF card reader ($12 at Micro Center)
  • Dell PowerEdge 1800 running Fedora Core 11
My goal is to start out with files like this on the Compact Flash card:
[CF Card]/dcim/102canon/img_9999.CR2
[CF Card]/dcim/102canon/img_9999.JPG
[CF Card]/dcim/103canon/img_0000.cr2
[CF Card]/dcim/103canon/img_0000.jpg
run the script, and end up with the same files, transferred to the server and renamed like this:
[server photo repository]/2009/200902/20090215/img_1029999.cr2
[server photo repository]/2009/200902/20090215/img_1029999.jpg
[server photo repository]/2009/200902/20090215/img_1030000.cr2
[server photo repository]/2009/200902/20090215/img_1030000.jpg
Removal of files from CF card is performed manually by the Format function on the camera, once I'm sure all the files have survived the trip from CF to hard drive.

The (re)naming convention assures unique file names for photos coming from my camera until I get to 10,000,000 photos. At my current rate of taking pictures, that would be several thousand years. It also allows me to easily manage my photos by year, month, or day (and eventually decade) as I choose.
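
The naming rule on its own can be sketched as a small shell function. This is only an illustration of the convention described above, not the script that follows; the function name `newname` is hypothetical.

```shell
# Hypothetical helper: print the new file name for one file on the card.
# $1 is a path such as dcim/102canon/img_9999.CR2
newname() {
    dir=$(basename "$(dirname "$1")")          # e.g. 102canon
    num=$(printf '%s' "$dir" | tr -cd '0-9')   # keep only the digits: 102
    base=$(basename "$1" | tr 'A-Z' 'a-z')     # lowercase: img_9999.cr2
    # Insert the folder number right after the img_ prefix.
    printf '%s\n' "$base" | sed "s/^img_/img_$num/"
}

newname dcim/102canon/img_9999.CR2
newname dcim/103canon/img_0000.jpg
```

The folder number supplies the three leading digits, which is what keeps names unique across folders.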

I've written a script to automatically copy and rename my files from the CF card to an appropriate directory on the server when it is run.

#!/bin/ksh
startdir=`pwd`
cfsourcedir="/mnt/usb"

# make sure to use your device name here, check output of 'dmesg'
# on your server with card reader connected.
cfdeviceid="/dev/sdc1"
canondir="${cfsourcedir}/dcim"
datepath="/photos/raw/`/bin/date +%Y/%Y%m/%Y%m%d`"

sudo umount $cfsourcedir

sudo mount $cfdeviceid $cfsourcedir
result=$?;

if [ $result -eq 0 ];then

   if [ -d $datepath ]; then
      echo "Directory $datepath exists"
   else
      mkdir -p $datepath
   fi
   #echo "canondir: $canondir"
   #echo "datepath: $datepath"

   rsync -az $canondir/* $datepath --stats --progress | \
   tee -a /tmp/loadfromcf.out
   cd $datepath
   for i in `find -type d |sed 's/^.\///g' |grep -v ^\.$`
   do
      cd $i
      directorynumber=`echo $i | sed 's/CANON//g' |sed 's/canon//g'`
      for j in `ls -1 *`
      do 
         k=`echo $j |sed "s/img_/img_$directorynumber/g" |\
   sed "s/IMG_/IMG_${directorynumber}/g" |sed 's/JPG/jpg/g' |sed 's/CR2/cr2/g'`
         mv $j ../$k
         chown speed:speed ../$k
         chmod 0544 ../$k
      done
      cd ..
      rmdir $i 
   done
fi

sudo umount $cfsourcedir

if [[ $result = 0 ]];then
   grep -i Number /tmp/loadfromcf.out | tail -2
   pwd
fi
I've also contemplated modifying this script so it would automatically (cron) check for a CF card in the reader, then automatically start the copying process. If I make this modification, I'll post it here as well.

I'll be adding code/comments as I improve this script.

I've been working with variations of Unix for a long time now and thought I'd jot down some of my favorite tips and tricks. They are mostly OS/distribution/shell/language independent (unless I indicate otherwise...)

  1. Get rid of blank lines in a file
    grep . inputfile > outputfile
    This matches (and thus prints) only lines that contain some text, not blank (empty) lines.

  2. comm
    Many people never cross paths with the comm command, but it is very useful. It works similarly to diff, but outputs the contents of two compared files into three columns. The first column is content only in the first file, the second column is content only in the second file, and the third column is content that is in both files (matches between the two files.) While this may not seem useful at first, you can select which columns to output, so if you only want to know what is in both file1 and file2 (column 3) you'd suppress columns 1 and 2 by running:
    comm -12 file1 file2
    Don't forget that your input files must be sorted.

  3. paste
    Systems administrators frequently use the cut command to parse files, but many people I run into have never used the paste command. The paste command will concatenate two files line by line (as opposed to file by file, like cat.)

  4. less instead of more
    This is not available on all Unix-based OSes, but the less command works much like more while letting you move through a file forwards and backwards more easily. Want to jump to the end of the file? Type Shift-G. Depending on the version of less you are running, it will provide context highlighting when you search for a pattern.

  5. Jump to vi from more
    While paging through a file in more, press "v" to jump to editing the file in vi at the current position in the file.

  6. Jump to a line number when editing a file with vi
    vi +linenumber filename
    will open up the file with the cursor automatically moved down to the specified line. This is useful when you get an error that indicates "syntax error on line 2047." You can jump straight to the problem without fumbling around.

  7. Invisible characters become visible
    Sometimes you'll end up with carriage returns on each line in a file originally created on a DOS/Windows system, or filenames with spaces, tab, or other control characters in them, but you can't see them typically.
    The cat command provides three useful options -v, -e, and -t that will let you understand these invisible characters
    -v (displays non-printing characters)
    -e (prints a "$" at the end of each line to indicate a NL character)
    -t (prints "^I" for each Tab in the file)
    cat -vet filename |more

  8. Remove DOS ^M from ends of lines
    The "^M" characters are visible when editing in vi but here are two approaches to remove the characters.
    sed 's/<Ctrl-V><Ctrl-M>//g' -i filename

    or in vi: <Esc>:%s/<Ctrl-V><Ctrl-M>//g
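
Tips 1 through 3 are easy to try on scratch files. The sketch below assumes `mktemp -d` is available; all file names and contents are invented for the demonstration.

```shell
work=$(mktemp -d)

# Tip 1: "grep ." keeps only lines that contain at least one character.
printf 'one\n\ntwo\n\n' > "$work/inputfile"
grep . "$work/inputfile" > "$work/outputfile"
noblank=$(cat "$work/outputfile")

# Tip 2: comm wants sorted input; -12 suppresses columns 1 and 2,
# leaving column 3 (lines present in both files).
printf 'apple\nbanana\ncherry\n' > "$work/file1"
printf 'banana\ncherry\ndate\n' > "$work/file2"
common=$(comm -12 "$work/file1" "$work/file2")

# Tip 3: paste joins the files line by line, separated by a tab by default.
printf 'a\nb\n' > "$work/left"
printf '1\n2\n' > "$work/right"
pasted=$(paste "$work/left" "$work/right")

printf '%s\n' "$noblank" "$common" "$pasted"
rm -rf "$work"
```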

If you ever do end up with me interviewing you, I'll likely work one or two of these into the discussion to explore your level of knowledge about Unix. If you need more information, remember: man pages are your friend.
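
Tips 7 and 8 can also be checked on a scratch file. Note that this sketch swaps in `tr -d '\r'` as an alternative to the sed and vi commands in tip 8, since it avoids typing a literal Ctrl-M. The file names are made up.

```shell
work=$(mktemp -d)

# A two-line file with DOS line endings (carriage return + newline).
printf 'first\r\nsecond\r\n' > "$work/dosfile"

# cat -vet shows each carriage return as ^M and marks each line end with $.
visible=$(cat -vet "$work/dosfile")
echo "$visible"

# tr -d deletes every carriage return from the stream.
tr -d '\r' < "$work/dosfile" > "$work/unixfile"
cleaned=$(cat -vet "$work/unixfile")
echo "$cleaned"

rm -rf "$work"
```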

In a product as carefully developed and mature as Microsoft Office Excel 2003 SP2, you'd think there'd be a nice way of intuitively removing multiple hyperlinks from a set of cells. The general thought process for designing this functionality would be something like:

  1. Select range of cells from which you want to remove hyperlinks,
  2. Right-click, then select "Remove Hyperlink(s)" option,
  3. Done.
This seems intuitive, heck, that's how you remove a hyperlink from one cell, why wouldn't multiple removes work the same way?

I recently received a spreadsheet with several hundred hyperlinks included on a key field that had me accidentally linking to a web page (along with being prompted for authentication every time) whenever I selected the hyperlinked field. Useful for someone, a major annoyance to me (click-Hold, BTW is the correct way to select a hyperlinked field.) I tried the approach detailed above several times with no success. The choice to "Remove Hyperlink" disappears when multiple cells are selected. I gave in and asked Microsoft Office Help how to "remove hyperlink" and got the following procedure:

  1. Type the number 1 into a blank cell,
  2. Select and copy it (right-click, select "Copy" option),
  3. Ctrl-Select the fields from which you want to remove hyperlinks (Shift-select or click-drag are also acceptable), release Ctrl or Shift keys if you are using them,
  4. Select the Edit menu, then the "Paste Special" menu item,
  5. When the Paste Special window opens, select "Multiply" under the "Operation" section, then choose "OK."
Who at Microsoft thought this was a useful, intuitive, or even understandable way of removing hyperlinks from multiple cells? It's almost as if some programmer's hack made it into the end-user's experience and no one noticed. Someone had to do functional test on this in Quality Control, right? Did it seem right then? I haven't used the newest Microsoft Office Excel offering, but I sure hope they've improved this bad user interface design. I'll add to this post tonight to comment on how OpenOffice deals with this task.

So I've been having a problem with my "Send To Flickr" Bookmarklet functionality on Flickr, and been in a foul mood about it. I finally got to the right person at Flickr, who set me in the right direction and gave me permission to post the JavaScript contents for the bookmarklet here.

"Send To Flickr" Bookmarklet Code:

<20050906 Edit>How Do I Use This?

Highlight the text in this box and drag it onto your "Bookmarks Toolbar" (Netscape/Mozilla) or "Personal Toolbar" (Opera.) Optionally, you can manually make a bookmark. For the 'Location' field (Mozilla) or 'Address' field (Opera) paste in the javascript from above. (It needs to be copy-and-pasted as one line.) Then go load the page with the picture you want to upload to Flickr. Click on the bookmark(let) and it will list all images on that page. Click the image you want to upload. Log in to Flickr if it asks you to. Choose the "Upload To Flickr" button. Voila! Your file should be uploaded into your account on Flickr.

To make this really useful, add the bookmark to your "Bookmarks Toolbar" (Mozilla) or "Personal Toolbar" (Opera) under View->Toolbars, so you can use your bookmarklet easily while browsing. If at this point, you can't figure it out, please wait for Flickr's new Bookmarklet coming to a browser near you soon.

</20050906 Edit>

I believe this is the exact JavaScript from Flickr (they wrote the code, not me), but I can't guarantee it. It works correctly and YOU can scan through the JavaScript, there doesn't appear to be anything untoward in the code. Flickr is reworking their bookmarklet upload function currently, so keep an eye on their Upload Tools page (you must be logged in to use that link) and be sure to use their new bookmarklet when they post it. (I'll modify this blog entry when they do...)

</20070323 Edit>

I've stopped using YahooFlickr, so I can't verify whether or not this still works. I'll leave it here for history's sake, and in case anyone finds it useful. As before, this code came from Flickr, I do not support it or provide any warranty to it working or working correctly for any of your needs. Your mileage may vary, use at your own risk/benefit.

My Digital Photo Workflow Notes

I am currently using a Canon Rebel XTi DSLR camera that records onto Compact Flash media. I'm primarily interested in filing, categorizing and annotating my image files. Post-processing (raw conversion, adjustments, etc.) is usually minimal unless I really messed up my settings or am working commercially.

  1. Transfer the images, as taken, to fileserver in a working directory. This is done via USB-connected Compact Flash reader attached to my fileserver (I'm considering adding a built-in card reader if my motherboard provides USB connections internally, need to investigate.) The files are written to a dedicated RAID 1 share on my Fedora Core 6 (currently) Linux fileserver.
  2. Set the Compact Flash card aside, just in case...
  3. From Unix shell, check that the file ownership and permissions are set correctly on the copied files. File permissions are all changed to 0544 for my purposes, so the files are read-only, but my automated system can access the files.
  4. If necessary, renumber (rename) files. Canon stores 10,000 photos per folder, then automatically increments to the next folder. I prefer to append the folder number to the image number to form individual file names like so:
    DCIM/CANON101/IMG_2345.JPG becomes IMG_1012345.JPG. The little shell script I use to rename my files looks like this:
    for i in `ls -1`
    do
       j=`echo $i |sed 's/IMG_/IMG_101/g'`
       mv $i $j
       chmod 0544 $j
    done
    
    I currently keep all my photos in one directory (may rethink this later) so it is important for me to keep filenames unique.
  5. Record the filenames of the new files in the work directory, move them to the production directory.
  6. Process the photos through raw conversion using Bibble Pro
  7. Process the new files (recorded in the previous step) through the script that loads the image records, vital statistics and meta data (EXIF) into the database:
    • sample the image and record the average color in HEX
    • record default values (photographer, photographer e-mail, photographer URL, file directory, filename, URL)
    • extract EXIF data with jhead and load that data
    • create appropriate .html files which call scripts via SSI and verify new .html file permissions
    • categorize photos where I can by date/time and other extracted information
  8. Create test lightbox view of thumbnails
  9. Correct any images that need their orientation (rotation) adjusted. Usually this is a rotate operation -90 degrees to counteract the camera being used vertically instead of horizontally. Now done in Bibble Pro, so not an issue.
  10. Rebuild thumbnail files for new images. Other than thumbnails, all images are resized on-demand, real-time, and fed as a BLOB from my server to your browser. Neat tricks with Perl, Imagemagick, and Apache.
  11. Rebuild category files to account for new files assigned to categories.
  12. Verify that everything works correctly and looks sane/sober/kosher.
  13. Format Compact Flash media.
At this point, I've got all the information about the image into the database and appropriate web pages have been built to index and display the photos on my online photo galleries. Now I can sleep restfully. Additional notation, categorization, sorting, filing, etc. is an on-going task (with 20,500+ photos and growing, it may never be finished.) The photo files and databases are backed up on a RAID 1 partition locally (to protect against hardware failure) and are also mirrored remotely (using rsync) so I'll always have a copy off-site (in case of local catastrophe (weather, fire, etc.))
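
The permission check in step 3 can be sketched with find. This is a minimal illustration, assuming a scratch directory standing in for the working directory; the two file names are hypothetical.

```shell
workdir=$(mktemp -d)   # stand-in for the working directory
touch "$workdir/IMG_1012345.JPG" "$workdir/IMG_1012346.JPG"
chmod 0544 "$workdir/IMG_1012345.JPG"
chmod 0644 "$workdir/IMG_1012346.JPG"

# List regular files whose mode is not exactly 0544.
wrong=$(find "$workdir" -type f ! -perm 0544)
echo "$wrong"

# Correct them, then confirm nothing is left over.
find "$workdir" -type f ! -perm 0544 -exec chmod 0544 {} \;
remaining=$(find "$workdir" -type f ! -perm 0544)

rm -rf "$workdir"
```

Listing the mismatches before changing anything makes it easy to see whether a copy went wrong or the files simply arrived with the card's default permissions.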

I'm in a constant development process with this project, right now. The to-do list is long. Here are some details of the current implementation.

All photos have appropriate copyright notation and all rights are reserved. All photos are available for licensing. All photos are available in their original, unmodified resolution and format. Please contact me for specific information about licensing.

VLC Media Player

Anyone else out there tired of messing with junk video players that do everything but play the video you want to watch? They all work 90% of the time and leave you hanging the rest.

I've found the Golden Video Player: VLC Media Player
Get VLC media player

This thing has a library of useful features and supports Windows, WinCE, BeOS, and every version of Unix (Mac OS/X, Linux, Solaris, BSDs) you could ever want. It plays almost every format on one player, across multiple platforms, reliably. Wow! Did I mention it's free and the source code is available! As if this wasn't enough, it can also be used as a video streaming server as well.

The install on Windows platforms is absolutely painless. My install on Fedora Core 4 was a bit more challenging, but I finally got all the pieces together. I highly recommend using apt-get and making sure you've got the "dag" repository in your /etc/apt/sources.list file, then, as root, issue the command:
apt-get install videolan-client
(Note: not 'vlc'.) There will be a bunch of dependent packages that will tag along for the install automatically.

This is what Free Software is all about. This project sits on the shoulders of many other quality free and open source software efforts. No contribution is required, but if you'd like to contribute your time, materials, skills, or anything else to the VideoLAN team, I'm sure they would appreciate it.

Okay, so I've got all these fancy tools to write, link, and otherwise manage my blog postings. Great for me. How are all my readers supposed to keep up with my updates (or, more likely, all the updates to all the blogs they read?)

Unless you are already using a news reader/aggregator, somebody has already taken a couple of steps ahead of you. When I publish a new blog entry (or post a new photo on Flickr, for example) the software managing the site updates a list of meta-data that it publishes. This is usually done in a standard format called RSS (Really Simple Syndication) or ATOM. Both formats are XML, so if you really wanted to, you could look at the file and probably understand it. It simply provides a list of the last "n" changes to my blog (or site, photostream, podcast, etc.) This all sounds complicated, but remember, that part is automated.

Then, I publish my RSS feed (a link, for example the RSS feed for my blog is http://blog.transmit.net/atom.xml) somewhere where people can find it. If you take that URL (address) and put it into your favorite news reader/aggregator as a subscription, the news reader application will automatically track and collect changes on my blog and notify you of any changes I make to my syndicated content. Now multiply this across 10 or 50 websites and you start to see the power. Instead of loading each website and trying to determine what is new (10 or 50 times) you only see the updates. Once you start to exploit the RSS/ATOM data that is published out there, your web exploring will turn a corner.
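
To make the "it's just XML" point concrete, here is a rough sketch that pulls the titles out of a tiny Atom-style feed. The feed content is invented for the example, and the line-based sed extraction is a shortcut that only works because each title sits on its own line; a real news reader uses a proper XML parser.

```shell
feed=$(mktemp)

# A made-up, minimal Atom feed with two entries.
cat > "$feed" <<'EOF'
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <entry>
    <title>First post</title>
    <updated>2006-03-01T12:00:00Z</updated>
  </entry>
  <entry>
    <title>Second post</title>
    <updated>2006-03-02T12:00:00Z</updated>
  </entry>
</feed>
EOF

# Print whatever sits between <title> and </title> on each matching line.
titles=$(sed -n 's:.*<title>\(.*\)</title>.*:\1:p' "$feed")
echo "$titles"

rm -f "$feed"
```

The first title is the feed's own name; the rest are the individual entries a reader would show as updates.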

There are all sorts of resources on the web that provide RSS feeds, from newspapers and blogs to photo sharing sites (Flickr) and new horizons in push marketing (Amazon and others.) The opportunity

I know, I know, what's a news reader/aggregator and where can I get one, right? The good news is that they are available in various flavors for just about any platform. There are client side news aggregators for all platforms and web-based aggregators to meet just about anyone's needs. Here are a few to get you started:

Non-technical description of RSS
Google Personalized Homepage (just paste a feed URL in under "Add Content")
BlogLines (web-based news aggregator/reader)
Mozilla Thunderbird (Windows/Mac OS/X/Linux e-mail and news client, free and open source)
Straw (Linux, free Gnome-based news aggregator)
SharpReader (Simple Windows news aggregator, requires .NET, donationware)
Net News Wire (Mac OS/X news aggregator)
Google Reader (Web-based news aggregator/reader)

If you find the list I provided doesn't provide enough variety, you can go lose yourself in the Wikipedia News Aggregator article.

Why aren't you using RSS?

I've had my own domain since 1994 and always ran my own personal web and e-mail services on a personal server. Running a public server on the Internet requires some technical expertise and a whole lot of patience. You will be bombarded with spam, you will face script kiddies scanning your services trying to 'hack' your passwords, you will see the underbelly of the Internet that most ISPs effectively hide from their users. I particularly enjoyed running my own e-mail server and quickly implemented RBLs, spamassassin, pyzor and razor to keep the good mail flowing and the spam headed to /dev/null.

All had gone well for several years until my ISP suddenly decided to block incoming traffic on port 25 (SMTP) to my connection. Everything froze: all my personal e-mail was blocked from my server, and all my friends' e-mail traffic to the domains I was hosting for them stopped. I don't run a high-volume e-mail server, but nonetheless the messages traversing it are very important to me. I never was given a good explanation for why my ISP suddenly decided to do this, but their fix was that I'd have to upgrade to a commercial account ($70+/month instead of ~$35/month) in order to run my e-mail server on my Internet connection. I capitulated to their extortion while looking for an alternative e-mail hosting solution. This is strictly personal e-mail, not commercial/business related in any way, so the $20/month to $50+/month solutions I was finding really didn't fit my needs. Recently, I caught wind of a hosted e-mail service that Google has been developing and would be beta testing. Bingo! Sign me up!

I've been beta testing Gmail for your domain (Google's hosted e-mail offering) for a little over a month now on my personal domain, and am very pleased to say it works great. It's based on their Gmail infrastructure and interface, so Gmail users have almost no learning curve. It offers the same great feature set (excellent spam blocking (it has caught 1,000 spams in the last two weeks, with no false positives), easy e-mail content searching, rich text composing, AJAX interface, the Web Clip bar (which can be disabled)) as Gmail with some capabilities to customize the GUI for your users (by inserting your logo instead of the Gmail logo.)

Setup was very easy. Some changes are required to the DNS MX records for your domain (remember the password for your account with your registrar?) to point e-mail to Google's servers. Once this has been completed (and propagated) you can log into the management interface, set up e-mail accounts and e-mail lists (think aliases), and make some minor changes to the interface (choose a background color, include your logo.) Once you've created the accounts everything is ready to go; it works just like Gmail. If you have a larger domain, you can use the account import feature to batch create your user accounts by uploading a CSV spreadsheet of your usernames and passwords.

The Gmail model has been to offer free e-mail service by displaying relevant(?) ads on the right hand side of your screen according to the contents of the e-mail you are currently reading. They still do this, and I still find it minimally intrusive personally. I'd like to know what kind of revenue Google gets for clicked-through Gmail ads in a year (click fraud notwithstanding.) When you open your Spam mailbox, the Web Clip bar displays Spam Recipe links. French Fry Spam Casserole, Spam Breakfast Burritos, Spicy Spam Kabobs, Spam Quiche. I don't touch the stuff, but I like it when a corporate giant like Google doesn't take themselves too seriously and can still have a little fun. When in the Trash, the Web Clip bar displays recycling tips...

Wish List for "Gmail for your domain":

  • For the beta test, each mailbox is frozen at 2GB. That's a lot of space, but why not let the mailbox size grow like other Gmail accounts? That is one of your strongest features.
  • Allow for pattern definitions in e-mail addresses. (I prefer to use "speed-[sitename]@example.com" when registering an e-mail with a third-party website. Right now I'd have to define each and every e-mail address I create instead of saying deliver "speed-*@example.com" mail to my Inbox.) This is one of the benefits of having your own domain name. Google has mentioned they are looking at doing this.
  • Allow for a "catch all" bucket for all those "other" e-mail addresses that I have used over the years. Mail coming to my domain, unless it is determined to be spam, should be delivered to me. As it stands now, if I don't specify an e-mail address in my Google configuration, mail sent to the unspecified address (@mydomain) bounces. "Gmail for your domain" now offers a "catch all" e-mail address for your domain, an excellent example of the responsiveness of Google to their users' needs. It can be configured under "Domain Settings" once logged in to your administrative account.
  • Especially during the beta test: put a link to the support e-mail on every page! (I had to go back and re-read my beta testing agreement to figure out what address to post feedback to. Isn't this the point of beta testing?)
  • Announce your pricing structure for the future. It's a wonderful offering at its current free price level (while beta testing.) How much will you charge afterwards? Keep in mind some of us are hobbyists, not commercial entities with big budgets, please.
  • Put back the "invite a friend" links on your hosted e-mail offering
  • Settle the issue of whether or not you will give user account information out to third parties and on what terms.

Right now Google's offering is in beta testing. They do have an "I'm interested" link at the bottom of their Google Hosted E-mail FAQ page. I've had the pleasure of telling my ISP where to stick their pricing and I've dropped back down into the reasonable price bracket (as opposed to their $70+/month commercial-account-because-you-host-your-own-email pricing.)

Overall, the offering looks excellent to me. It's been easy to use, as reliable as Gmail, and Google has been responsive to improvement suggestions. There are still some privacy concerns as to how and with whom Google shares their users' information. If you run your own domain and are looking for e-mail hosting specifically, check out Gmail for your domain. Consider setting it up as a subdomain if you just want to test the service, so you can just direct mail from users@subdomain.example.com to "Gmail for your domain" while testing. This may necessitate signing up for the service twice if you decide to use it for your entire domain later, but gives you flexibility.
