Knit: an Integrated Development Environment for AWK
===================================================

This paper introduces Knit, an IDE for AWK that supports source code libraries, literate programming, and test-driven development. The entire system is less than 400 lines of code:

+ About 100 lines of GNU make;
+ Several short AWK scripts (dozens of lines, or fewer);
+ One larger AWK script that implements Markdown (about 150 lines).

Knit is (nearly) generic to multiple languages: only ten lines are specific to calling AWK code (and these could easily be customised for other languages). So while IDEs like Eclipse are certainly valuable, I offer Knit as an example of how a lazy person might quickly roll their own particular IDE.

Introduction
------------

Henry Spencer [wrote in 1991](http://awk.info/?awksys) that:

+ There is no fundamental reason why AWK programs have to be small "glue" programs.
+ Even the "old" AWK is a powerful programming language in its own right.
+ Effective use of its data structures and its stream-oriented structure takes some adjustment for C programmers, but the results can be quite striking.

To prove the point, he offers an AWK version of the [Bell Labs NROFF tool](http://awk.info/?tools/awf).

Sadly, apart from Spencer's work there are suspiciously few examples of large AWK programs. Since early 2009, I've been the [awk.info](http://AWK.info) webmaster. Based on that experience I diagnose the problem as follows. Certain key tools are missing from the AWK-verse. Specifically, there is:

1. No standard for source code library management. Without a library of parts, you can't build larger assemblies;
2. No standard for unit tests and regression tests. Without a way to exercise the code, you are reluctant to work as a team since you can't tell if someone else's latest change breaks your code. Worse, outsiders can't understand the code since you can't show off what it can do.
3. No standard for documentation.
Without documentation tools, no one outside the team can see what you've done.

Various tools address point (1), but there are no AWK tools that handle (1) and (2) and (3) in an integrated manner. Hence, I built _Knit_. I won't presume to call Knit a "standard", yet. But I plan to use it for some AWK documentation projects and, if it catches on, it would then become a "candidate standard".

Knit is based around a new type of file. These _*.tim_ files are a mixture of Markdown and AWK code. Knit converts _*.tim_ files to executable AWK and browseable HTML:

    AWK <---- tim ----> html

Also, Knit can find the support code used by some _main_ file and bundle it all together into one standalone executable:

    main.tim uses -----> AWK/main.AWK --\
        1. a.tim  -----> AWK/a.AWK    ---\----> app/main
        2. b.tim  -----> AWK/b.AWK    --/

While I was about it, I also added:

+ Support for literate programming (mixing code and documentation in one file);
+ Hyperlink navigation around the source code (just click on a function name, and you can jump to its definition);
+ Some SUBVERSION support (no more messy shared directories);
+ Quick application of GAWK's debugging tools (profiling, finding stray local variables);
+ Simple specification of both usage strings and defaults for command-line options (these are all automatically extracted from the documentation).

The rest of this paper describes how to use Knit. If this looks interesting to you, I invite you to join the "Cooking with AWK" project that uses Knit to document standard solutions to common programming problems.

A Quick Overview
----------------

.tim Files
__________

In a manner similar to Perl POD files, Knit lets the code author mix up source code and documentation:

+ _Code_ paragraphs start with exactly one whitespace character;
+ _Verbatim_ paragraphs start with two or more whitespace characters;
+ _Text_ paragraphs, which is everything else, start with no whitespace.
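
The three paragraph rules above can be sketched in a few lines of AWK wrapped in a shell function. This is only an illustration under stated assumptions, not Knit's own code: the name `classify_tim` is hypothetical, and the sketch labels individual lines by their leading whitespace rather than grouping them into whole paragraphs.

```shell
#!/bin/sh
# classify_tim (hypothetical name): label each non-blank input line
# as "code", "verbatim" or "text", based only on its leading whitespace.
# Output format: label, a tab, then the original line.
classify_tim() {
  awk '
    /^[ \t]*$/    { next }                         # skip blank lines
    /^[ \t][ \t]/ { print "verbatim\t" $0; next }  # two or more leading blanks
    /^[ \t]/      { print "code\t" $0; next }      # exactly one leading blank
                  { print "text\t" $0 }            # no leading whitespace
  ' "$@"
}
```

Called with a file name (for example `classify_tim main.tim`) or with text on standard input, it prints one labelled line per non-blank input line. The two-blank rule is tested first so that a verbatim line is never mistaken for code.
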

Two other special paragraphs are:

+ _Usage_ paragraphs are those that follow a _Usage_ heading.
+ _Uses_ paragraphs contain lines of the form _@uses x.tim_. These lines list the libraries required to execute the current file.

Converting from .tim to .AWK
----------------------------

To build executable _*.AWK_ files, Knit converts a _*.tim_ file as follows:

+ _Verbatim_ lines and _text_ paragraphs are commented out;
+ The AWK source _code_ paragraphs are left uncommented;
+ All the _@uses_ files are loaded before loading this file;
+ The "usage" paragraphs are converted to an AWK function that can return a usage string;
+ The "usage" paragraph is parsed to define default values for command-line arguments.

Converting from .tim to .html
-----------------------------

In order to support documentation, Knit can also generate _*.html_ files from _*.tim_ files as follows:

+ _Verbatim_ lines and _code_ paragraphs are wrapped in `<pre>...</pre>`;
+ The _text_ paragraphs are formatted as HTML code using the Markdown conventions.

The conversion process collects together all the headings and prints them at the end of the file inside a special div. Note that each of these entries links back to the original heading in the file:

    <div id="AWKtoc">
    <h1> <a href=#1>Download</a></h1>
    <h2> <a href=#2>General Information</a></h2>
    </div>

To turn this end-of-page div into a top-of-page table of contents:

+ Reduce the font sizes for the headings in _AWKtoc_;
+ Position the div at top of page.

This can be done using CSS; e.g.

    #AWKtoc    { position:relative; top:10px; }
    #AWKtoc h1 { color: black; font-size: small; font-weight: normal;
                 margin: 0 0 0 0; margin-left: 3px;}

Building Stand-alone Executables
--------------------------------

Knit requires the standard Unix tools (make, sed, chmod, ls, etc) and can be used on LINUX, OS/X or Windows (once [Cygwin](http://cygwin.com) is installed).
In order to support other platforms, Knit can generate one _*.AWK_ file that bundles up the _@uses_ files into one file beginning with

    #!/usr/bin/gawk -f

so it can be very simply downloaded and executed as one file.

Testing and Debugging
---------------------

Before code is bundled, it needs to be debugged.

+ Knit allows the programmer to quickly enable the GAWK debugging tools (profiling, looking for stray globals).
+ The Knit test engine supports test case definition, execution and scoring.

Editing Code
------------

Knit is integrated with the VIM editor:

+ Knit tells VIM to edit _*.tim_ files using the same syntax highlighting as _*.AWK_ files.
+ Knit enables _ctags_ which turns VIM into a hyperlinked browser of the source code. If you click on a function name, then _Control-]_ jumps you to the definition (and _Control-t_ takes you back).

(I'm not an EMACS expert but, according to [Sourceforge](http://ctags.sourceforge.net/ctags.html#HOW%20TO%20USE%20WITH%20GNU%20EMACS), the same hyperlink trick is available in EMACS: `M-.` finds the definition of the identifier under the cursor and `M-*` pops back to where you previously invoked `M-.`.)

SUBVERSION Support
------------------

Knit stops VIM from writing backup files into the same directories as the source code. This means that code repositories do not get messed up with temporary files.

Also, Knit uses a "var" sub-directory to hold its auto-generated code. Knit takes care to tell SUBVERSION to ignore this directory.

(BTW, I'd be interested in working with, say, a MERCURIAL or a GIT guru to make this work for other version control systems.)

Features
--------

Documentation
_____________

Knit extends the standard

Source Code Library Management
______________________________

Borrowing from

    @uses file.tim

Installation
------------

Before you begin
________________

You need a working version of:

+ GAWK, PGAWK
+ CTAGS
+ SVN
+ VIM
+ standard UNIX shell tools (sed, echo, chmod, ls, make,...)
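
A quick way to confirm that these prerequisites are on your PATH is sketched below. It is not part of Knit: the function name `missing_tools` is hypothetical, and the command names in the usage comment (gawk, pgawk, ctags, svn, vim) are assumptions about how the tools are installed on your system.

```shell
#!/bin/sh
# missing_tools (hypothetical helper): print the name of every command
# given as an argument that cannot be found on the current PATH.
missing_tools() {
  for tool in "$@"; do
    # "command -v" succeeds only when the shell can resolve the name
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example call (prints nothing when everything is installed):
# missing_tools gawk pgawk ctags svn vim make sed chmod ls
```

Any name it prints still needs to be installed before going further.
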

Getting the Code
________________

Create a working directory (e.g.)

    mkdir ~/svns # for example

Change to that directory and export the code:

    cd ~/svns
    svn export http://unbox.org/lawker/block/timm/tim tim

(But if you have write access to _LAWKER_ you might replace the last line with:)

    svn checkout http://unbox.org/lawker/block/timm/tim tim --username yourUserName

Then edit `~/.bashrc` and `~/.vimrc`.

Edits to ~/.bashrc
__________________

    PATH="$PATH:$HOME/svns/tim/var/bash"

Edits to ~/.vimrc
_________________

This line enables hypertext navigation of the source code:

    set tags=~/.vim/tags/tim

These lines enable syntax highlighting:

    set background=light
    set syntax=on
    syntax enable

These lines enable source code indentation:

    set smarttab
    set noexpandtab
    set tabstop=4
    set shiftwidth=4

When sharing files in a version control system, it is a bad idea to have auto-generated files in directories that might be used by multiple users (since then the version control system will declare conflicts on those files). So it is a _very_ good idea to save temporary files to other directories:

    set backup
    set backupdir=~/tmp  "where to store backups

The following lines are not required for _Knit_, but I still swear by them:

    set mouse=a      "ascii mouse
    set title        "place buffer name into window title
    set number       "show line numbers
    "auto-change to the file's directory
    autocmd BufEnter * cd %:p:h
    set showmatch    "show matching brackets
    set matchtime=15
    "----- stuff for incremental search
    set ignorecase
    set incsearch
    set smartcase

Author
------

Tim Menzies did most of the coding. Knit uses [Jesus Galan's implementation of Markdown](http://awk.info/?dsl/markdown), with several modifications.

To do
-----

+ scoring over all wants is kinda broken
+ use %.got to control the looping
+ write tiny tutorials (hiding the details) on making AWK; unit tests in AWK; debugging with the AWK tools (and my global conventions); optional globals.
+ make the .vimrc stuff easy to cut and paste
+ there is a bug in multi-line list entries
+ get the usage parser going
+ softlink vim tags to a local file called tags.
+ explain why everything is in one directory (no package system, accept it)
+ define menu tools for finding all the XXX of thing
+ add a line to demos #demo make me laugh