Lightning Talk - Geekfest
As part of Geekfest, a weekly Chicago-area meetup, I gave a 5-minute lightning talk on some basic command-line shortcuts. My outline and the commands are below.
#Introduction
#I'm Philip Corliss, a Software Developer here at Groupon. While pairing, when my partner has the keyboard, I'll often throw out requests for them to tab-complete, pipe to grep, or use ESC-. They usually look at me like I'm crazy. This talk quickly walks through a common scenario: parsing a log file or some other formatted file.
#What's this file?
ls
#Whoa, watch this. If you're not familiar with tab completion you should be; it will make your life ridiculously easy. I, for one, can only spell the first three letters of most words at this point.
less post....
#Let's take a look. Binary, huh? Weird.
less postal_codes.csv.gz
#Oh yeah, it's a gzip file, meaning it's compressed. We could decompress it to disk, but for this example let's pretend the compressed file is 1 GB; at a 10:1 compression ratio that's 10 GB of data, and we don't want to wait for that. So let's just look at it inline. Way faster.
gzip ...
#Decompresses directly to STDOUT, great for working with the file
gzip -dc postal_codes.csv.gz
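#A minimal sketch of the same step on a throwaway file, assuming a made-up two-row sample (sample.csv and its contents are illustrative, not the real postal_codes data). It shows that -dc prints the decompressed data and leaves the .gz file on disk untouched.

```shell
# Build a tiny stand-in for postal_codes.csv.gz in a scratch directory.
# The file name and rows are hypothetical.
workdir=$(mktemp -d)
printf 'A1A|47.5|Toronto|ON\nH2X|45.5|Montreal|QC\n' > "$workdir/sample.csv"
gzip "$workdir/sample.csv"   # replaces sample.csv with sample.csv.gz

# -d decompresses, -c writes the result to STDOUT instead of to a file.
decompressed=$(gzip -dc "$workdir/sample.csv.gz")
printf '%s\n' "$decompressed"

# Nothing was written back to disk: the compressed file is still the only one.
remaining=$(ls "$workdir")
printf '%s\n' "$remaining"
```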
#Whoa, too much data, that doesn't really help us
^C
#What's that? It's a pipe: it sends STDOUT from one program to the STDIN of another.
gzip -dc postal_codes.csv.gz |
#First 15 lines, then it exits. No extra resources or I/O consumed, and we get a quick sample of the data.
gzip -dc postal_codes.csv.gz | head -15
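#A rough illustration of why this is cheap, using seq as a stand-in for a large decompressed stream (the million-line input is an assumption for the demo). head exits after 15 lines, and the producer stops shortly afterwards because its pipe has been closed.

```shell
# seq plays the role of `gzip -dc` on a big file: it would emit a million lines.
first_lines=$(seq 1 1000000 | head -15)
printf '%s\n' "$first_lines"

# Only the 15 requested lines made it through the pipe.
line_count=$(printf '%s\n' "$first_lines" | wc -l | tr -d ' ')
printf 'lines kept: %s\n' "$line_count"
```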
#Let's cut this up. An alternative in awk, which also handles multi-character delimiters: awk -F'|' '{ print $3 ":" $4 }'
gzip -dc postal_codes.csv.gz | head -15 | cut -d'|' -f3,4
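#A small side-by-side of cut and awk on one made-up row (the column layout here is an assumption, with city and province in fields 3 and 4). cut keeps the original delimiter between the fields it selects, while awk lets you choose the output separator and split on a multi-character delimiter.

```shell
# One hypothetical row in a pipe-delimited layout.
row='A1A|47.5|Toronto|ON'

# cut selects fields 3 and 4 and joins them with the input delimiter.
cut_out=$(printf '%s\n' "$row" | cut -d'|' -f3,4)
printf '%s\n' "$cut_out"

# awk selects the same fields but prints them with a separator of our choosing.
awk_out=$(printf '%s\n' "$row" | awk -F'|' '{ print $3 ":" $4 }')
printf '%s\n' "$awk_out"

# awk also splits on a multi-character delimiter such as "::".
multi_out=$(printf 'A1A::47.5::Toronto::ON\n' | awk -F'::' '{ print $3 ":" $4 }')
printf '%s\n' "$multi_out"
```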
#Let's sort it, count the unique values, and then sort the results numerically.
gzip -dc postal_codes.csv.gz | head -15 | cut -d'|' -f3,4 | sort | uniq -c | sort -n
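#Why the first sort matters, sketched on five made-up city/province pairs: uniq only merges duplicates that sit next to each other, so unsorted input gets split into several partial counts.

```shell
# Hypothetical input with duplicates that are not all adjacent.
pairs='Toronto|ON
Montreal|QC
Toronto|ON
Toronto|ON
Montreal|QC'

# Without sort, uniq sees four separate runs instead of two distinct pairs.
unsorted_groups=$(printf '%s\n' "$pairs" | uniq -c | wc -l | tr -d ' ')
printf 'groups without sort: %s\n' "$unsorted_groups"

# With sort first, each pair is counted once; sort -n then orders by count.
counted=$(printf '%s\n' "$pairs" | sort | uniq -c | sort -n)
printf '%s\n' "$counted"
```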
#Top 30 city/province pairs for Canadian postal codes. It takes some time, so we'll cache it to a file.
gzip -dc postal_codes.csv.gz | cut -d'|' -f3,4 | sort | uniq -c | sort -n | tail -30
#Let's see how long it takes, and let's write it to a file.
time gzip -dc postal_codes.csv.gz | cut -d'|' -f3,4 | sort | uniq -c | sort -n > ~/my_sweet_report.txt
#Use ESC-. to bring up the last argument of the previous command.
less ~/my_sweet_report.txt