I thought I had posted a thread or issue about this before, as this has been on my roadmap for a few years now. However, my searches return nothing, so forgive me if I’m posting again.
I’m the author of bevry/dorothy, which has hundreds of terminal commands. So far the tests verify exit status, stdout, and stderr, but nothing verifies the overall experience.
I’d like to use asciinema to record the expected experience when everything works, then during development and testing replay the same experience and check for divergences, failing if there are any. This seems plausible right now by re-recording the experience and comparing the .cast files; however, I would first have to remove the timestamp and delay data from the .cast files.
Here is an example of such a technique, showcasing a bug in Dorothy’s ask command that would have been detected by such a use of asciinema. The ANSI escape codes were meant to clear a few lines, but instead an ANSI escape code for going back to the default TTY was sent in error. These ANSI escape codes were being sent to /dev/tty, making the bug impractical to catch with our existing test framework (which only captures stdout and stderr).
> fs-diff -- *.cast
2025-01-24-13-48-33-89136.cast ⟶ 2025-01-24-13-48-47-89392.cast
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
───┐
1: │
───┘
│ 1 │{"version":2,"width":202,"height":40,"timestamp":1737697713,"command":"ask q d","env":{"SHELL"↵│ 1 │{"version":2,"width":202,"height":40,"timestamp":1737697727,"command":"ask q d","env":{"SHELL"
│ │:"/opt/homebrew/bin/fish","TERM":"xterm-256color"},"theme":{"fg":"#ffffff","bg":"#1e1e1e","pal↵│ │:"/opt/homebrew/bin/fish","TERM":"xterm-256color"},"theme":{"fg":"#ffffff","bg":"#1e1e1e","pal
│ │ette":"#000000:#990000:#00a600:#999900:#0000b3:#b300b3:#00a6b3:#bfbfbf:#666666:#e60000:#00d900→│ │ette":"#000000:#990000:#00a600:#999900:#0000b3:#b300b3:#00a6b3:#bfbfbf:#666666:#e60000:#00d900
│ 2 │[0.351981, "o", "\u001b[?2004h\u001b[1m\u001b[4mq\u001b[22m\u001b[24m\r\n\u001b[2md\u001b[22m\↴│ 2 │[0.333133, "o", "\u001b[?2004h\u001b[1m\u001b[4mq\u001b[22m\u001b[24m\r\n\u001b[2md\u001b[22m\
│ │ …r\n> "]│ │ …r\n> "
│ 3 │[0.965807, "i", "a"] │ 3 │[0.932957, "i", "a"]
│ 4 │[0.965984, "o", "a"] │ 4 │[0.93314, "o", "a"]
│ 5 │[1.129581, "i", "n"] │ 5 │[1.578224, "i", "n"]
│ 6 │[1.129659, "o", "n"] │ 6 │[1.578295, "o", "n"]
│ 7 │[1.190577, "i", "s"] │ 7 │[1.728916, "i", "s"]
│ 8 │[1.190654, "o", "s"] │ 8 │[1.729008, "o", "s"]
│ 9 │[1.429664, "i", "w"] │ 9 │[1.937808, "i", "w"]
│ 10 │[1.429755, "o", "w"] │ 10 │[1.937879, "o", "w"]
│ 11 │[1.823403, "i", "e"] │ 11 │[2.087957, "i", "e"]
│ 12 │[1.823765, "o", "e"] │ 12 │[2.088104, "o", "e"]
│ 13 │[1.896892, "i", "r"] │ 13 │[2.178357, "i", "r"]
│ 14 │[1.896983, "o", "r"] │ 14 │[2.178443, "o", "r"]
│ 15 │[2.105498, "i", "\r"] │ 15 │[2.41771, "i", "\r"]
│ 16 │[2.105844, "o", "\r\n\u001b[?2004l\r"] │ 16 │[2.417763, "o", "\r\n"]
│ 17 │[2.24255, "o", "\u001b[?1049l"] │ 17 │[2.417764, "o", "\u001b[?2004l\r"]
│ 18 │[2.243033, "o", "answer\r\n"] │ 18 │[2.550842, "o", "\u001b[3F\u001b[J"]
│ │ │ 19 │[2.551235, "o", "answer\r\n"]
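A minimal sketch of that comparison in Python, assuming the asciicast v2 layout visible in the diff above (a JSON header line, then one [time, type, data] array per line). The function names normalize_cast and first_divergence are hypothetical and not part of asciinema. Besides dropping the timing data, it merges adjacent events of the same type, because the diff shows the same output arriving in differently sized chunks between runs (lines 16 and 17).

```python
import json


def normalize_cast(cast_text):
    """Return a list of (event_type, data) pairs with all timing removed.

    Assumes asciicast v2: a JSON header line followed by one
    [time, type, data] JSON array per line.
    """
    lines = [line for line in cast_text.split("\n") if line.strip()]
    events = []
    for line in lines[1:]:  # skip the header, which carries the timestamp
        _time, kind, data = json.loads(line)
        if events and events[-1][0] == kind:
            # Merge adjacent chunks of the same type, since chunk
            # boundaries vary from run to run.
            events[-1] = (kind, events[-1][1] + data)
        else:
            events.append((kind, data))
    return events


def first_divergence(expected_text, actual_text):
    """Return (index, expected_event, actual_event), or None if identical."""
    expected = normalize_cast(expected_text)
    actual = normalize_cast(actual_text)
    for index, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return index, want, got
    if len(expected) != len(actual):
        # One recording ended early; report the first unmatched event.
        index = min(len(expected), len(actual))
        want = expected[index] if index < len(expected) else None
        got = actual[index] if index < len(actual) else None
        return index, want, got
    return None
```

Applied to the two recordings above, the timing and chunking differences would disappear, and the only reported divergence would be the final output event, where one side contains \u001b[?1049l and the other \u001b[3F\u001b[J.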
Furthermore, this would have the added advantage of producing recordings of each command in its working state, which would assist with publicity and documentation of the project.
Ideally, I would like just an asciinema rec for the initial experience, then, say, an asciinema verify/replay that attempts to replay the experience (replaying the recorded inputs and delays against the same command). If a divergence is encountered, it fails and reports exactly what diverged; if no divergence occurred, it returns exit status 0, and exits without any success message if --quiet was provided. A divergence is any change in stdout, stderr, TTY, or exit status. Varying delays in processing are okay for the proof of concept; perhaps later any significant variance in delay should also trigger a divergence failure.
Building this functionality directly into asciinema, besides providing a superior divergence reporting experience, would also provide a superior replay experience, as I wouldn’t have to write a custom parser for the .cast file to grab the inputs and delays to replay them.
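For reference, the custom parser I would otherwise have to write could look roughly like the following sketch. It assumes the asciicast v2 layout and that the recording captured input ("i") events, as the example above did. The name input_schedule is hypothetical, and measuring each delay from the previous input event is my own choice for illustration.

```python
import json


def input_schedule(cast_text):
    """Return [(delay_seconds, data), ...] for each recorded input event.

    Assumes asciicast v2, where every line after the JSON header is a
    [time, type, data] array and type "i" marks captured input.
    delay_seconds is the wait since the previous input event, or since
    the start of the recording for the first input.
    """
    schedule = []
    previous_time = 0.0
    for line in cast_text.split("\n")[1:]:  # the first line is the header
        if not line.strip():
            continue
        time, kind, data = json.loads(line)
        if kind != "i":
            continue  # only inputs are replayed; outputs are for comparison
        schedule.append((time - previous_time, data))
        previous_time = time
    return schedule
```

A replay would then wait for each delay and write the data to the command’s terminal, while capturing a fresh recording to compare against the original.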