CWEB Experience In 2022

Published on May 28, 2022

This blog post is about CWEB and how to use it more comfortably with XeLaTeX and Emacs.

A Long, Long History With Literate Programming

The CWEB documentation gives a brief history of CWEB, a program first created in 1987. Even the documentation was generated in 2002, which is 20 years ago now. Many people like to follow the popular NEW things, but I prefer "The Old New Thing" way.

I first came across the idea of literate programming back in 2011, when I was a Python script boy. I heard about it from a famous Python user whose name is zoomquiet, but I hadn't dived deeper into it since then. A month or two ago, I got the book 21st Century C from one of my friends, and it talks about CWEB. A week or two ago, I read another book from one of my friends, Advanced Algorithms and Data Structures. I'd like to type these algorithms in C, and in a literate way. So here comes this blog post.

One may think that software created decades ago is hard to use, inefficient, and outdated. That's true, but not always. Reading through CWEB's documentation, you will find the idea is simple and beautiful.

Anyway, I decided to use this ancient software for literate programming.

The Bad

Figure 1: CWEB file in vim

The first time I sat down to program in CWEB, I found myself missing syntax highlighting and structural editing.

Vim claims to support CWEB, and Emacs says it supports CWEB as well, but both really only support the LaTeX part, not the C inside LaTeX.

The only editor that does support it is NEdit, which I had never heard of before.

As a long-time Emacs user, I just wanted to find support for Emacs.

Despite CWEB's decades of history, there is nearly NOTHING usable for me, or at least nothing that provides what I want out of the box.

These are my needs:

  1. Write C easily, with syntax highlighting and reasonable indentation for the C code.
  2. Navigate easily inside the CWEB file, even though one can actually fall back to the "grep" way.
  3. Chinese support for PDF export.
  4. Work well with toolchains, such as line numbers for gdb or integrated support in make.

The Attempt

It's a long dive into the long, long history of CWEB, with much of the endeavor contributed by many people.

cwebbin

To my surprise, I found a modern fork of CWEB on GitHub, cwebbin. It is not only still alive, but carefully and actively maintained and improved by a good German user whose name is Andreas Scherer. Great respect to him/her! I even noticed a file in the repo: cweb.el, wow…

Integrating it into Doom Emacs is quite easy:

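;; In packages.el: fetch cweb.el from the GitHub repository.
(package! cweb
  :recipe (:host github :repo "ascherer/cweb"))
;; In config.el: open *.w files in cweb-mode.
(use-package cweb
  :mode ("\\.w\\'" . cweb-mode))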

However, this major mode is mostly an aid for editing the TeX side.

;; OK, here's part (1): some WEB-oriented functions whose main purpose is
; to maintain a stack of module names that are "pending" as you are writing
; a program. When you first think of a module that needs to be written later,
; put it into the pending list (by typing CTL-Z instead of @> after the
; name).
; ...
;; OK, here's part (2): Changes to TeX mode.
; The WEB modes below are very much like TeX mode, but some improvements were
; desirable in TeX mode:
; I made newline act as it does in indented-text mode, since this
; works nicely for both TeX and WEB (Pascal or C code).
; I made RET check for unmatched delimiters if it ends a paragraph.
; Otherwise TeX mode remains as it was before.

It provides a way to temporarily remember a module name to be written later, and it makes some changes to TeX mode.

It may be good for Don Knuth, but it is not as good for me.

; Contributed by Don Knuth, July 1990

In particular, it maps typing " to TeX-style quotes, which is annoying when typing a C program.

And his keymap is strange to me…

I do not really need a stack to keep notes on my ideas that way. Moreover, what I really care about is a literate programming experience as good as the one I have in Org Mode.

mmm-mode

Yes, I'm an Emacs user, and much of the reason is that I am an Org Mode user.

Even this blog, this post, the whole site is written in Org Mode. Org Babel gives me a good way to mix programs with literate text.

So I know it is possible for a CWEB major mode to use different major modes for different parts of a text, as Org Babel does.

I came across mmm-mode, which dates back more than 22 years…

Not surprisingly, it ships with CWEB support, batteries included!

According to the ChangeLog history, a man named Alan Shutko contributed CWEB support to mmm-mode in 2001.

Integrating mmm-mode into Doom Emacs and configuring it to work with CWEB is easy.

(package! mmm-mode)
(use-package mmm-mode
  :config
  (setq mmm-global-mode 'maybe)
  (mmm-add-mode-ext-class nil "\\.w\\'" 'cweb)
  )

However, it just does not work at all.

I do not know why. After skimming through mmm-mode's info page and tweaking a little, I just gave up. c-mode did not work well; it hung, and I don't know why. Maybe that is part of the experience of ancient software.

polymode to the rescue

EmacsWiki lists several alternatives to mmm-mode, or I could dig into how Org Mode handles syntax highlighting of blocks with Emacs overlays. I once got the book Mastering Emacs from one of my friends, I do not remember whom. The author posted a clear illustration of how to use polymode, so I tried it. It was worth it.

Integrating polymode into Doom Emacs is quite easy.

(package! polymode)

Following this link, setting up a C mode inside a LaTeX mode for CWEB files is easy.

(use-package polymode
  :mode ("\\.w\\'" . poly-cweb-c-mode)
  :config

  ;;(pm-debug-mode)
  ;;(setq flycheck-global-modes '(not LaTeX-mode latex-mode))
  (define-derived-mode cweb-latex-mode latex-mode "CWEB"
    "Major mode for LaTeX in CWEB syntax."

    ;; Highlight CWEB @<section name@>.
    (font-lock-add-keywords nil '(("@<[^@]+@>" . font-lock-warning-face)))
    (font-lock-add-keywords nil '(("^@\\*[^.]+.$" . font-lock-keyword-face)))
    ;; Imenu
    (setq imenu-generic-expression
          '(
            ("Section" "\\(@[(<][^@]+@>=\\)" 1)
            ("Group" "^\\(@\\*[^*]*\\.\\)$" 1)
            )
          )
    )

  (define-derived-mode cweb-c-mode c-mode "CWEB-C"
    "Major mode for C in CWEB syntax."

    ;; Highlight CWEB @<section name@>.
    (font-lock-add-keywords nil '(("@<[^@]+@>" . font-lock-warning-face)))
    )

  (define-hostmode poly-cweb-hostmode :mode 'cweb-latex-mode)

  ; code block
  (define-innermode poly-cweb-c-code-innermode
    :mode 'cweb-c-mode
    :head-matcher (rx bol (zero-or-one "@ ") "@c" eol)
    :tail-matcher (rx bol "@" (any " " "*"))
    :head-mode 'host
    :tail-mode 'host)

  ; section definition
  (define-innermode poly-cweb-c-part-innermode
    :mode 'cweb-c-mode
    :head-matcher (rx bol (zero-or-one "@ ") (group "@" (any "(" "<") (zero-or-more (not "@")) "@>="))
    :tail-matcher (rx bol "@")
    :head-mode 'host
    :tail-mode 'host)

  (define-polymode poly-cweb-c-mode
    :hostmode 'poly-cweb-hostmode
    :innermodes '(poly-cweb-c-code-innermode poly-cweb-c-part-innermode)
    )
  )

Then comes the aha moment.

Figure 2: Awesome polymode with multiple major modes.

polymode is awesome!

However, there are still some improvements that can be easily applied.

  1. Derive from c-mode, because I don't want tons of features like flycheck or LSP for C, which do not really work in CWEB, shouting errors all the time.
  2. Highlight CWEB-specific syntax in C code. The default c-mode does not help.
  3. Have imenu help me quickly navigate to the desired section definition or group name, instead of always resorting to grep.

Here is the final config. Such a simple config speaks for itself, and it is built on top of the great efforts of many Emacs hackers.

;;(add-to-list 'load-path "~/.doom.d/snippets")
;;(require 'cweb)
(use-package polymode
  :mode ("\\.w\\'" . poly-cweb-c-mode)
  :config

  ;;(pm-debug-mode)
  ;;(setq flycheck-global-modes '(not LaTeX-mode latex-mode))
  (define-derived-mode cweb-c-mode c-mode "CWEB-C"
    "Major mode for C in CWEB syntax."

    ;; Highlight CWEB @<section name@>.
    (font-lock-add-keywords nil '(("@<[^@]+@>" . font-lock-warning-face)))
    ;; Imenu
    (setq imenu-generic-expression
          '(
            ("Section" "\\(@[(<][^@]+@>=\\)" 1)
            ("Group" "^\\(@\\*[^*]*\\.\\)$" 1)
            )
          )
    )

  (define-hostmode poly-cweb-hostmode :mode 'latex-mode)

  ; code block
  (define-innermode poly-cweb-c-code-innermode
    :mode 'cweb-c-mode
    :head-matcher (rx bol (zero-or-one "@ ") "@c" eol)
    :tail-matcher (rx bol "@" (any " " "*"))
    :head-mode 'host
    :tail-mode 'host)

  ; section definition
  (define-innermode poly-cweb-c-part-innermode
    :mode 'cweb-c-mode
    :head-matcher (rx bol (zero-or-one "@ ") (group "@" (any "(" "<") (zero-or-more (not "@")) "@>="))
    :tail-matcher (rx bol "@")
    :head-mode 'host
    :tail-mode 'host)

  ; c code in ||
  (define-innermode poly-cweb-c-embed-innermode
    :mode 'cweb-c-mode
    :head-matcher (rx "|")
    :tail-matcher (rx "|")
    :head-mode 'host
    :tail-mode 'host)

  (define-polymode poly-cweb-c-mode
    :hostmode 'poly-cweb-hostmode
    :innermodes '(poly-cweb-c-code-innermode poly-cweb-c-part-innermode poly-cweb-c-embed-innermode)
    )
  )

Here is how imenu looks; just type M-SPC s i in Doom Emacs.

Figure 3: Imenu index groups and sections.

LaTeX, even XeLaTeX…

The last thing I explored is how to write a CWEB program in Chinese.

If one wants to weave a CWEB file into a TeX file, he/she may do this:

cweave heap.w - heap.tex

Then he/she can translate the TeX file into a PDF file using pdftex.

pdftex heap.tex

Getting all of this to work is not as easy as it sounds. In fact, as a TeX user for a long, long time, I still found it hard to get it to really work well.

Luckily, Paul Batchelor really uses CWEB to maintain a real project, and some users complained about the heavy dev dependencies for CWEB, so I knew I must install more TeX utilities.

apt install texlive-extra-utils texlive-formats-extra

Interestingly enough, Paul Batchelor is also an Org Mode user, more specifically an Org Babel user. He even implemented a tangler targeted at Org Mode and NOWEB to facilitate his literate programming life. It looks really cool, but let us not digress too much.

In 2018, Paul Batchelor wrote a post, Initial Impressions using Org-Babel for Literate Programming, which compared CWEB with Org Babel. I found it informative! But he said:

CWEB, on the other hand, has no fancy integrations or syntax highlighting, which could make navigation a real drag on larger documents.

Yeah, that's what I tried to partially mitigate in the sections above.

Even though the Web is dominant nowadays, I still prefer reading a PDF or plain text. Most of my reading is done on my E-Ink tablets, and really, even at work I use an E-Ink monitor most of the time. So TeX is still interesting to me, and reading code in a well-printed form sounds interesting. For years, there has been no good solution for reading code on E-Ink tablets.

Let's come back to the topic of how to use LaTeX and even XeLaTeX.

Tobi Lehman wrote a guiding blog post named CWEB LaTeX Experiment about how to use LaTeX with CWEB. Again, software that looks "outdated" still works well. And interestingly enough, he/she is an Emacs and Org Mode user as well. I even found an article about why PDF is preferable.

Fortunately, all this is exquisitely documented, albeit in a document that is 27 years old: cweb-user.pdf

I added these lines, as the well-crafted documentation of the LaTeX cweb class suggested:

% preamble (limbo)
\documentclass{cweb}
% use package
\usepackage{hyperref}
\hypersetup{
    colorlinks,
    citecolor=black,
    filecolor=black,
    linkcolor=black,
    urlcolor=black
}

\begin{document}
\title{D-Way Heap}
\author{reverland}
\maketitle
\tableofcontents % if you want

... my content

@
\end{document}

I even tried to use CTeX to get Chinese typesetting for CWEB. It's not that hard.

% preamble (limbo)
\documentclass{cweb}
% use package
\usepackage[UTF8]{ctex}
\usepackage{hyperref}
\hypersetup{
    colorlinks,
    citecolor=black,
    filecolor=black,
    linkcolor=black,
    urlcolor=black
}

\begin{document}
\title{D-Way Heap, C实现}
\author{尼古拉斯 赵四}
\maketitle
\tableofcontents % if you want

@* D-way heap.

这是一篇关于 dway heap 实现的小文章。使用 \LaTeX 和 CWEB.

@*2 介绍.
D-way heap是一种数据结构。 这种数据结构本质上是一种保留部分顺序的树。

@c
...
Figure 4: CWEB in Chinese

It can be compiled to PDF, but things become a little clunky.

  1. I found I have to use . to mark the end of a group title, which is not so Chinese.
  2. I had to add a space after some characters or the compilation would fail. It looks like the lexer gets stuck on UTF-8. Consider that CWEB was born more than 30 years ago, when UTF-8 did not exist at all…
  3. Some references are not really in Chinese; I may need to tweak the CWEB source code to translate them.
  4. Last but not least, it looks weird to mix code with Chinese.

In fact, there are still some issues with UTF-8 and Unicode in CWEB. Igor Liferenko created a CWEB fork with UTF-8 support here: UTF-8 installations of CWEB. Yes, the old program is actively developed! Maybe one day, one can do literate programming in C in any natural language.

But for now, let's just stick with English.

Summary

In this article, I talked about how I tried to use CWEB in Emacs, how to configure it to get some sugar in Emacs, and how to use LaTeX with CWEB.

In the end, we can have a decent experience with decades-old software.

Emacs itself is decades-old software as well, and it is versatile enough to achieve more, like adding dumb-jump for CWEB files for better completion and navigation. But this post will stop here.