
Showing posts with label shell.

Wednesday, June 5, 2013

sarge, a wrapper for Python's subprocess module

Welcome to sarge’s documentation! — Sarge 0.1.1 documentation

sarge provides a somewhat more user-friendly interface to the subprocess module from Python's standard library. It lets your Python program talk to external commands.

sarge's features include easier usage than raw subprocess, some protection against shell injection attacks, the ability to capture the subprocess's standard output and/or standard error, support for I/O redirection and pipes, and even some support for interacting with the subprocess, somewhat like the Unix Expect tool does.
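
Here is a minimal sketch of how sarge can be used, based on its tutorial (check the sarge docs for the exact, current API):

# A minimal sarge sketch, based on its tutorial; see the docs for details.
from sarge import run, capture_stdout

# Run a command line; sarge itself handles the pipe, without /bin/sh.
run('echo foo | grep foo')

# Capture the standard output of a command and read it as text.
p = capture_stdout('echo hello')
print(p.stdout.text)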

Overall, it looks worth checking out for those with such needs.

sarge is by Vinay Sajip, who is also the developer of Py, the Python launcher for Windows (with support for both Python 2 and Python 3), and of the logging module in the Python standard library.

Related links:

sarge.readthedocs.org/en/latest/tutorial.html

http://docs.python.org/2/library/subprocess.html

http://jugad2.blogspot.com/2013/04/py-python-launcher-for-windows.html

blog.python.org/2011/07/python-launcher-for-windows_11.html

For background on pipes and I/O redirection, see this article by me on IBM developerWorks:

Developing a Linux command-line utility:
http://www.ibm.com/developerworks/library/l-clutil/

Expect is a powerful Unix tool:

http://en.m.wikipedia.org/wiki/Expect

Monday, May 20, 2013

HuffShell suggests aliases for your Unix commands

paulmars/huffshell · GitHub

huffshell is a Ruby gem that suggests optimized aliases for your frequently used Unix commands. It looks at your command history to do that.
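
huffshell itself is Ruby, but the core idea is simple enough to sketch in a few lines of Python: count the commands in your shell history and list the most frequent ones. This is a rough illustration of the idea, not huffshell's actual logic, and it assumes a bash history file:

# A rough Python sketch of the idea (not huffshell's code): count the
# commands in your bash history and show the most frequently used ones.
import collections
import os

hist_file = os.path.expanduser('~/.bash_history')  # assumes bash
counter = collections.Counter()
with open(hist_file) as f:
    for line in f:
        parts = line.split()
        if parts:
            counter[parts[0]] += 1

# The most frequent commands are the best candidates for short aliases.
for cmd, count in counter.most_common(10):
    print('%6d  %s' % (count, cmd))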

Seen via this Hacker News post, which is interesting too:

What are your top 100 unix commands? (for science):

https://news.ycombinator.com/item?id=5733426

- Vasudev Ram
dancingbison.com

Sunday, February 17, 2013

PAWK, a Python tool like AWK

http://pypi.python.org/pypi/pawk/0.3

pawk gives you some of the features of the AWK programming language (which is a powerful Unix tool), but in Python.

The pawk link above has some examples of awk commands and the equivalent pawk ones.

In some cases, the pawk commands are shorter - except for the p in pawk; heh, reminds me of the anecdote about the Unix creat system call :-)
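
To give a flavour of the AWK model that pawk builds on, here is a plain-Python sketch (an illustration of the idea only, not pawk's own code) of the classic awk one-liner that sums the first field of each input line:

# A plain-Python sketch of the AWK model (not pawk's code): read lines,
# split them into fields, and run a small action per line - here, the
# equivalent of: awk '{ s += $1 } END { print s }'
import sys

total = 0
for line in sys.stdin:       # awk's implicit per-line loop
    fields = line.split()    # $1..$NF become fields[0..]
    if fields:
        total += int(fields[0])
print(total)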

Wednesday, January 23, 2013

What the shell? pyxshell :)

nojhan/pyxshell · GitHub

Excerpt:

[
Pyxshell aims to bring text stream manipulation commands with pipelines, like in Unix shells, but in pure Python.
A short example:
>>> from pyxshell.common import grep,glue
>>> pl = []
>>> ['python', 'ruby', 'jython'] | grep(r'yt') > pl
>>> pl | glue("\n") > sys.stdout
python
jython
]

Looks interesting.

As I said before, there are many ways to skin a pipe, er, cat, in Python:

http://jugad2.blogspot.com/2011/09/some-ways-of-doing-unix-style-pipes-in.html

And:

http://jugad2.blogspot.com/2012/10/swapping-pipe-components-at-runtime.html

- Vasudev Ram
www.dancingbison.com

Saturday, November 24, 2012

pinger utilities have multiple uses

Python | Host/Device Ping Utility for Windows

Saw the above pinger utility written by Corey Goldberg a while ago. It is in Python and is multi-threaded.

Seeing it reminded me of writing a pinger utility some years ago for a company I worked for at the time; it was for Unix systems, and was not multi-threaded.

It was written in a combination of Perl (for regex usage), shell and C.

The C part (a program that was called from the controlling shell script) was used to overcome an interesting issue: a kind of "drift" in the times at which the ping command would get invoked.

The users wanted it to run exactly on the minute, every n minutes, but it would sometimes run a few seconds later.

I used custom C code to solve the issue.

Later I learned (by reading more docs :) that the issue could probably have been solved by calling a Unix system call or two (like gettimeofday or getitimer, I forget exactly which right now) from my C program.
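
In Python, one way to avoid that kind of drift is to sleep until the next interval boundary instead of sleeping a fixed amount each time. Here is a rough sketch of that idea (an illustration only, not the original Perl/shell/C tool):

# A rough Python sketch of the drift-avoidance idea (not the original tool):
# sleep until the next minute boundary instead of sleeping a fixed interval.
import time

INTERVAL = 60  # run on the minute, every minute

while True:
    now = time.time()
    # Sleep for exactly the time remaining until the next boundary.
    time.sleep(INTERVAL - (now % INTERVAL))
    print(time.strftime("%H:%M:%S") + ": invoking ping here...")
    # ... run the ping command and record the result here ...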

Anyway, the tool ended up being used to monitor the uptime of many Unix servers at the company. The sysadmins (who had asked me to create the tool) said that it was useful.

As Corey says in his post, pinger tools can be used to monitor network latency, check whether hosts and devices are alive, and log their uptime over an extended period, both for reporting and for taking corrective action (which is how my utility was used).

Also check out pingdom.com for an example of a business built on these concepts. Site24x7.com is another such business; it is part of the same company as Zoho.com, which was (and still is) in the network monitoring / management space before it created the Zoho suite of web apps.

I have used both Pingdom and Site24x7 on my business web site www.dancingbison.com for over a year now to monitor its uptime, and both are fairly good at it.

Sunday, July 15, 2012

sed and awk one-liners - two good pages

By Vasudev Ram


I saw these sed and awk one-liner pages via this post on Rajiv Eranki's blog (he works at Dropbox):

Scaling lessons learned at Dropbox, part 1

(The post is about scaling at Dropbox and is interesting in itself.)

The sed and awk one-liner pages:

sed one-liners page

awk one-liners page

For a one-liner that happens to use both sed and awk, check this older post of mine:

UNIX one-liner to kill a hanging Firefox process

The comments on it relating to UNIX processes (orphan processes, etc.) may be of interest.



- Vasudev Ram - Dancing Bison Enterprises

The Bentley-Knuth problem and solutions

By Vasudev Ram


I recently saw this post about an interesting programming problem on the Web (apparently first posed by Jon Bentley to Donald Knuth).

For lack of a better term (and also because the name is somewhat memorable), I'm calling it the Bentley-Knuth problem: More shell, less egg

The problem description, from the above post:

[
The program Bentley asked Knuth to write is one that’s become familiar to people who use languages with serious text-handling capabilities: Read a file of text, determine the n most frequently used words, and print out a sorted list of those words along with their frequencies.
]

The post is interesting in itself - read it. For fun, I decided to write solutions to the problem in Python and also in UNIX shell.

My initial Python solution is below. The code is not very Pythonic / refactored / tested, but it works, and does have some minimal error checking. See this Python sorting HOWTO page for some ways it could be improved. UNIX shell solution coming in a while.

UPDATE: Unix shell solution added below the Python one.

Note: neither my Python solution nor my UNIX shell solution works exactly the same as McIlroy's shell solution: his converts upper-case letters to lower case and uses a strict, dictionary-style definition of a "word" (only alphabetic characters), whereas both of my solutions treat a word as a sequence of non-blank characters, as is more common when parsing computer programs. Adding McIlroy's two tr invocations to the front of my shell pipeline would give the same result as his.

# bentley_knuth.py
# Author: Vasudev Ram - http://www.dancingbison.com
# Version: 0.1

# The problem this program tries to solve is from the page:
# http://www.leancrew.com/all-this/2011/12/more-shell-less-egg/

# Description: The program Bentley asked Knuth to write:

# Read a file of text, determine the n most frequently 
# used words, and print out a sorted list of those words 
# along with their frequencies.

import sys

sys_argv = sys.argv

def usage():
    sys.stderr.write("Usage: %s n file\n" % sys_argv[0])
    sys.stderr.write("where n is the number of most frequently\n")
    sys.stderr.write("used words you want to find, and\n")
    sys.stderr.write("file is the name of the file in which to look.\n")

if len(sys_argv) < 3:
    usage()
    sys.exit(1)

try:
    n = int(sys_argv[1])
except ValueError:
    sys.stderr.write("%s: Error: %s is not a decimal numeric value\n" %
        (sys_argv[0], sys_argv[1]))
    sys.exit(1)

print "n =", n
if n < 1:
    sys.stderr.write("%s: Error: %s is not a positive value\n" %
        (sys_argv[0], sys_argv[1]))
    sys.exit(1)

in_filename = sys_argv[2]
print "%s: Finding %d most frequent words in file %s" % \
    (sys_argv[0], n, in_filename)

try:
    fil_in = open(in_filename)
except IOError:
    sys.stderr.write("%s: ERROR: Could not open in_filename %s\n" %
        (sys_argv[0], in_filename))
    sys.exit(1)

# Count the frequency of each word in the file.
word_freq_dict = {}
for lin in fil_in:
    for word in lin.split():
        if word in word_freq_dict:
            word_freq_dict[word] += 1
        else:
            word_freq_dict[word] = 1

# Sort the (word, frequency) pairs by decreasing frequency.
word_freq_list = list(word_freq_dict.items())
wfl = sorted(word_freq_list, key=lambda item: item[1], reverse=True)

print "The %d most frequent words sorted by decreasing frequency:" % n
len_wfl = len(wfl)
if n > len_wfl:
    print "n = %d, file has only %d unique words," % (n, len_wfl)
    print "so printing %d words" % len_wfl
print "Word: Frequency"
m = min(n, len_wfl)
for i in range(m):
    print wfl[i][0], ": ", wfl[i][1]

fil_in.close()


And here is my initial solution in UNIX shell:

# bentley_knuth.sh

# Usage:
# ./bentley_knuth.sh n file
# where "n" is the number of most frequent words 
# you want to find in "file".

awk '
    {
        for (i = 1; i <= NF; i++)
            word_freq[$i]++
    }
END     {
            for (i in word_freq)
                print i, word_freq[i]
        }
' < "$2" | sort -k2,2 -nr | sed "${1}q"
- Vasudev Ram - Dancing Bison Enterprises