<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>I Can Has Linux? &#187; awk</title>
	<atom:link href="http://icanhaslinux.com/category/awk/feed/" rel="self" type="application/rss+xml" />
	<link>http://icanhaslinux.com</link>
	<description>Invisible Patent Infringement!</description>
	<lastBuildDate>Mon, 29 Aug 2011 13:37:26 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>basics in awk</title>
		<link>http://icanhaslinux.com/2008/08/13/basics-in-awk/</link>
		<comments>http://icanhaslinux.com/2008/08/13/basics-in-awk/#comments</comments>
		<pubDate>Wed, 13 Aug 2008 13:48:59 +0000</pubDate>
		<dc:creator>LightningCrash</dc:creator>
				<category><![CDATA[awk]]></category>
		<category><![CDATA[ihatereadingbooks]]></category>

		<guid isPermaLink="false">http://icanhaslinux.com/2008/08/13/basics-in-awk/</guid>
		<description><![CDATA[awk is a very, very useful command-line program that any Linux/Unix ninja should be familiar with. Awk is specifically geared towards processing text, and it was actually a combination of awk and sed that were an inspiration for Perl. To start with, awk has three major elements that you need to be aware of when [...]]]></description>
			<content:encoded><![CDATA[<p>awk is a very, very useful command-line program that any Linux/Unix ninja should be familiar with. Awk is specifically geared towards processing text, and it was actually a combination of awk and sed that were an inspiration for Perl.</p>
<p>To start with, awk has three major elements that you need to be aware of when you&#8217;re working with it. These are the field separator, the pattern, and the action for the pattern.</p>
<p>Your field separator is whatever sits between the text elements you want to work with. If you open up a terminal and type &#8216;ps -elf&#8217;, you&#8217;ll see that this would just be spaces. Some files, like CSV files, have commas as the separator. Awk can be told what to look for via the -F option on the command-line, or in the program itself. For one-off piping, I prefer to do it via the -F option.</p>
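<p>For the &#8220;in the program itself&#8221; route, the usual place to set the separator is a BEGIN block, which runs before the first line of input is read. A minimal sketch follows; the sample data and the variable names are made up for illustration.</p>

```shell
# Made-up comma-separated sample data: name, age, city.
sample='alice,30,Tulsa
bob,25,Austin'

# BEGIN runs before any input is read, so FS is already a comma
# when awk splits the very first line.
names=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = "," } { print $1 }')
printf '%s\n' "$names"
```

<p>This should behave the same as passing -F&#8217;,&#8217; on the command line; the BEGIN form is mostly handy once the program lives in its own file.</p>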
<p>The pattern is much like an  &#8216;if &#8230; then&#8217; statement in other programming languages. If there isn&#8217;t a pattern, the action specified will be applied to all rows of input.</p>
<p>What makes awk handy is that it gives you capabilities that the `cut` command simply can&#8217;t provide.  For instance, if I have a twenty-column CSV and I would like to spit out the third and eleventh column, I can execute the following:</p>
<p><code>awk -F','    '{print $3 FS $11}' file.input</code></p>
<p>The -F&#8217;,&#8217; tells awk that the input fields will be separated by commas. The area enclosed in the braces is the action I talked about earlier. I didn&#8217;t specify a pattern before the action, so the action was applied to every line of input. &#8220;<code>print $3 FS $11</code>&#8221; tells awk to print to the screen the third field of input, the field separator (which we defined as a comma with the -F&#8217;,&#8217;), and the eleventh field of input.</p>
<p>If I wanted to do the same, but only print lines where the third field was over a number, say, 110, I could execute the following:</p>
<p><code>awk -F','    '$3 &gt; 110 {print $3 FS $11}' file.input</code></p>
<p>The pattern before the braces functions much like an &#8220;if &#8230; then&#8221;. If the third field is over 110, awk prints out the third field, the field separator, and the eleventh field.</p>
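<p>Here is a small sketch of the pattern idea on a three-column sample instead of a twenty-column file. The data and the variable names are made up; only the awk syntax comes from the examples above.</p>

```shell
# Made-up three-column data: host, user, and a numeric value.
sample='web1,alice,95
web2,bob,140
web3,carol,110'

# Pattern only: with no action, awk prints the whole matching line.
over=$(printf '%s\n' "$sample" | awk -F',' '$3 > 110')

# Pattern plus action: print selected fields from matching lines only.
picked=$(printf '%s\n' "$sample" | awk -F',' '$3 > 110 { print $1 FS $3 }')

printf '%s\n' "$over" "$picked"
```

<p>Only the web2 line passes the test, since 110 is not over 110.</p>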
<p>There is much, much more that you can do with awk, but this should be enough to hint you in the right direction. I know I use awk daily for various tasks related to command-line mischief.  A common thing I use awk for is to manipulate /etc/passwd, where some user account information is stored.</p>
<p>Keep in mind that awk does not guess the field separator for you. By default it splits fields on runs of spaces and tabs, so for /etc/passwd, which is separated by a colon &#8220;:&#8221;, you need to pass -F&#8217;:&#8217; or set FS in the program. It&#8217;s also worth noting that on some other systems without GNU utilities, awk may behave in ways that you don&#8217;t anticipate.</p>
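<p>A sketch of the /etc/passwd case, with the colon given explicitly via -F, which is the safe choice on any awk. The sample lines are made up in /etc/passwd format so the example does not depend on a real system file; the cutoff of 1000 for regular accounts is an assumption that varies between distributions.</p>

```shell
# Made-up lines in /etc/passwd format (seven colon-separated fields).
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# Print login name and shell for accounts whose UID (field 3) is 1000 or more.
# The comma in print separates the two values with a single space.
users=$(printf '%s\n' "$sample" | awk -F':' '$3 >= 1000 { print $1, $7 }')
printf '%s\n' "$users"
```

<p>To run it against the real file, replace the printf pipe with the file name as the last argument to awk.</p>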
<p>That&#8217;s it for the moment, just some small tips to get you moving. I&#8217;d recommend picking up a copy of &#8220;The AWK Programming Language&#8221; by Aho, Kernighan and Weinberger. It only makes sense, since they are the creators of AWK. I have also been told that the O&#8217;Reilly AWK book is very good. In addition, GNU awk is well-documented all over the Internet, so you shouldn&#8217;t be lacking in study material if you put some effort into it.</p>
<p>Until next time!</p>
<p>-LightningCrash</p>
]]></content:encoded>
			<wfw:commentRss>http://icanhaslinux.com/2008/08/13/basics-in-awk/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>De-RIAAing my music collection</title>
		<link>http://icanhaslinux.com/2007/10/05/de-riaaing-my-music-collection/</link>
		<comments>http://icanhaslinux.com/2007/10/05/de-riaaing-my-music-collection/#comments</comments>
		<pubDate>Fri, 05 Oct 2007 15:34:00 +0000</pubDate>
		<dc:creator>LightningCrash</dc:creator>
				<category><![CDATA[awk]]></category>
		<category><![CDATA[findutils]]></category>
		<category><![CDATA[grep]]></category>
		<category><![CDATA[music]]></category>
		<category><![CDATA[riaa]]></category>
		<category><![CDATA[sed]]></category>
		<category><![CDATA[xargs]]></category>

		<guid isPermaLink="false">http://icanhaslinux.com/2007/10/05/de-riaaing-my-music-collection/</guid>
		<description><![CDATA[I recently decided that I won&#8217;t own any music from an artist that is represented by the RIAA. Now, how do I go about De-RIAAing my ripped albums? RIAA Radar has a website that will let you search for artists, albums, keywords, etc and it will give you information as to whether or not an [...]]]></description>
			<content:encoded><![CDATA[<p align="left">I recently decided that I won&#8217;t own any music from an artist that is represented by the RIAA. Now, how do I go about De-RIAAing my ripped albums?</p>
<p align="left"><a href="http://www.riaaradar.com/">RIAA Radar</a> has a website that will let you search for artists, albums, keywords, etc and it will give you information as to whether or not an album was released under the RIAA.</p>
<p align="left">So I did a view-source on their search page and determined that there are only three variables that you need to POST in order to search: searchtype, keyword, and submit.</p>
<p align="left">I can use wget to grab the file, like so:<br />
<code>wget http://www.riaaradar.com/search.asp --post-data "searchtype=ArtistSearch&amp;keyword=Audioslave&amp;submit=Go\!" -O Audioslave</code></p>
<p align="left">This saves the file as Audioslave. Audioslave IS represented by the RIAA, by the way.</p>
<p align="left">Now, how do I take my ripped albums and compare them to the RIAA Radar site?</p>
<p align="left"><span id="more-59"></span><br />
In my album collection, all of the albums are formatted the same: Artist &#8211; Album Name<br />
This little bit of effort a long time ago makes it easy for me to separate these now.<br />
I simply cd into my albums directory and do the following:<br />
<code>ls|awk -F' - ' '{print $1}'|uniq &gt;&gt; artists</code></p>
<p align="left">I now have a file called artists in my album collection that contains unique artist names for every album in the collection.</p>
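<p align="left">A sketch of the artist extraction on a few made-up directory names. It assumes the names really use a hyphen with a space on each side, and it passes the separator with -F so that it is already in effect for the first line. sort -u is swapped in for uniq here so duplicates are removed even if the input is not sorted.</p>

```shell
# Made-up directory names in "Artist - Album Name" form.
albums='Audioslave - Audioslave
Audioslave - Out of Exile
Jimi Hendrix - Are You Experienced'

# Split on " - " so the artist name carries no trailing space,
# then sort and drop duplicate artists.
artists=$(printf '%s\n' "$albums" | awk -F' - ' '{ print $1 }' | sort -u)
printf '%s\n' "$artists"
```

<p align="left">An artist or album name that itself contains a spaced hyphen would still be cut at the first one, so the result is worth a quick look before using it.</p>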
<p align="left">Now, to find out if they&#8217;re represented by the RIAA:</p>
<p align="left"><code>cat artists|tr " " "+"|xargs -i wget http://www.riaaradar.com/search.asp --post-data "searchtype=ArtistSearch&amp;keyword={}&amp;submit=Go\!" -O radarresults{}.html</code></p>
<p align="left">This will pull down the search result for every artist in my album list, and save it in a file formatted the way I want.<br />
For instance, Jimi Hendrix would be saved as radarresultsJimi+Hendrix.html</p>
<p align="left">I browse this with lynx and see that the text &#8220;Warning!&#8221; would be pretty good to search on.</p>
<p align="left"><code>grep 'Warning!' radarresults*|sed -e 's/&lt;[^&lt;&gt;]*&gt;//g'|tr "+" " "|cut -c 13-|uniq|awk -F'[.]html' '{print $1}' &gt;&gt; riaapunks.txt</code></p>
<p align="left">Explanation: grep searches the files for Warning!, then sed strips out the html. tr converts those + signs to spaces, cut trims off the radarresults portion of the output, uniq filters out duplicates, awk cuts off everything after and including .html, then it all gets dumped to a file.</p>
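<p align="left">A shorter variant is sketched below. It is not the pipeline above: it uses grep -l, which prints only the names of matching files, so there is no HTML to strip. The two result files and their contents are made up, and the work happens in a throwaway directory.</p>

```shell
# Work in a throwaway directory with two made-up result files.
tmp=$(mktemp -d)
printf 'Warning! made-up sample text\n' > "$tmp/radarresultsExample+Artist.html"
printf 'Nothing to see here\n' > "$tmp/radarresultsOther+Band.html"

# grep -l lists matching file names; sed strips the prefix and the suffix,
# and tr turns the + signs back into spaces.
flagged=$(cd "$tmp" || exit 1
  grep -l 'Warning!' radarresults*.html |
    sed -e 's/^radarresults//' -e 's/[.]html$//' | tr '+' ' ')
printf '%s\n' "$flagged"

rm -rf "$tmp"
```

<p align="left">Because each file name appears at most once, no uniq step is needed in this form.</p>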
<p align="left">Now I&#8217;ve got a nice list of everyone who is represented by the RIAA, in a file called riaapunks.txt</p>
<p align="left">Now I get to have fun with it!<br />
<code>cat riaapunks.txt|xargs --verbose -i find ./ -name '*{}*'</code></p>
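<p align="left">The same dry run can be sketched with a plain read loop, which keeps artist names with spaces in one piece and quotes the pattern so the shell never expands the wildcards itself. The directory names and the list contents are made up, and nothing is deleted.</p>

```shell
# Throwaway directory with two made-up albums and a one-line artist list.
tmp=$(mktemp -d)
mkdir "$tmp/Example Artist - First Album" "$tmp/Other Band - Debut"
printf 'Example Artist\n' > "$tmp/riaapunks.txt"

# Read one artist per line and hand find a quoted pattern.
# This only prints the matches.
matches=$(cd "$tmp" || exit 1
  cat riaapunks.txt | while IFS= read -r artist; do
    find . -name "*${artist}*" -print
  done)
printf '%s\n' "$matches"

rm -rf "$tmp"
```

<p align="left">Reviewing this list first is the cheap way to catch an artist name that happens to match more than intended.</p>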
<p align="left">Output looks good. Now for the coup de grace:<br />
<code>cat riaapunks.txt|xargs --verbose -i find ./ -name '*{}*' -delete</code></p>
<p align="left">Buh-bye RIAA music!</p>
]]></content:encoded>
			<wfw:commentRss>http://icanhaslinux.com/2007/10/05/de-riaaing-my-music-collection/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>How I loathe regexps&#8230;.but wait&#8230;.</title>
		<link>http://icanhaslinux.com/2007/09/12/how-i-loathe-regexpsbut-wait/</link>
		<comments>http://icanhaslinux.com/2007/09/12/how-i-loathe-regexpsbut-wait/#comments</comments>
		<pubDate>Wed, 12 Sep 2007 23:40:07 +0000</pubDate>
		<dc:creator>LightningCrash</dc:creator>
				<category><![CDATA[awk]]></category>
		<category><![CDATA[egrep]]></category>
		<category><![CDATA[regexp]]></category>
		<category><![CDATA[sed]]></category>

		<guid isPermaLink="false">http://icanhaslinux.com/2007/09/12/how-i-loathe-regexpsbut-wait/</guid>
		<description><![CDATA[Well, I got frustrated with having to refer to documentation every time I wanted to do something with regexps, so I decided to find a cheat sheet. I hate regexps, but I love them too, you know? Thankfully, www.ilovejackdaniels.com has a cheat sheet I don&#8217;t mind having. It&#8217;s more thorough than the others I&#8217;ve found [...]]]></description>
			<content:encoded><![CDATA[<p>Well, I got frustrated with having to refer to documentation every time I wanted to do something with regexps, so I decided to find a cheat sheet. I hate regexps, but I love them too, you know?</p>
<p>Thankfully, www.ilovejackdaniels.com has a cheat sheet I don&#8217;t mind having. It&#8217;s more thorough than the others I&#8217;ve found and it comes in PDF and PNG formats. I had to print the PNG one, since evince printed the greys as solid black in the PDF version.</p>
<p>Anyway, check it out <a href="http://www.ilovejackdaniels.com/cheat-sheets/regular-expressions-cheat-sheet/" target="_blank">here</a>.</p>
<p>I&#8217;ve been using regexps so much lately that I contemplated taping it over my second monitor. I settled for hanging it from the cube wall right next to it.</p>
<p>Until next time!</p>
<p>-LightningCrash</p>
]]></content:encoded>
			<wfw:commentRss>http://icanhaslinux.com/2007/09/12/how-i-loathe-regexpsbut-wait/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

