After a decade or more where Single-Page-Applications generated by
JavaScript frameworks have
become the norm, we see that server-side rendered HTML is becoming
popular again, also thanks to libraries such as HTMX or Turbo. Writing a rich web UI in a
traditionally server-side language like Go or Java is now not just possible,
but a very attractive proposition.
We then face the problem of how to write automated tests for the HTML
parts of our web applications. While the JavaScript world has evolved powerful and sophisticated ways to test the UI,
ranging in size from unit-level to integration to end-to-end, in other
languages we do not have such a richness of tools available.
When writing a web application in Go or Java, HTML is commonly generated
through templates, which contain small fragments of logic. It is certainly
possible to test them indirectly through end-to-end tests, but those tests
are slow and expensive.
We can instead write unit tests that use CSS selectors to probe the
presence and correct content of specific HTML elements within a document.
Parameterizing these tests makes it easy to add new tests and to clearly
indicate what details each test is verifying. This approach works with any
language that has access to an HTML parsing library that supports CSS
selectors; examples are provided in Go and Java.
Level 1: checking for sound HTML
The number one thing we want to check is that the HTML we produce is
basically sound. I don’t mean to check that HTML is valid according to the
W3C; it would be cool to do it, but it’s better to start with much simpler and faster checks.
For instance, we want our tests to
break if the template generates something like
<div>foo</p>
Let’s see how to do it in stages: we start with the following test that
tries to compile the template. In Go we use the standard html/template
package.
Go
func Test_wellFormedHtml(t *testing.T) {
	templ := template.Must(template.ParseFiles("index.tmpl"))
	_ = templ
}
In Java, we use jmustache because it’s very simple to use; Freemarker or
Velocity are other common choices.
Java
@Test
void indexIsSoundHtml() {
    var template = Mustache.compiler().compile(
            new InputStreamReader(
                    getClass().getResourceAsStream("/index.tmpl")));
}
If we run this test, it will fail, because the index.tmpl file does
not exist. So we create it, with the above broken HTML. Now the test should pass.
Then we create a model for the template to use. The application manages a todo-list, and
we can create a minimal model for demonstration purposes.
Go
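func Test_wellFormedHtml(t *testing.T) {
	templ := template.Must(template.ParseFiles("index.tmpl"))
	model := todo.NewList()
	_ = templ
	_ = model
}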
Java
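@Test
void indexIsSoundHtml() {
    var template = Mustache.compiler().compile(
            new InputStreamReader(
                    getClass().getResourceAsStream("/index.tmpl")));
    var model = new TodoList();
}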
Now we render the template, saving the results in a bytes buffer (Go) or as a String
(Java).
Go
func Test_wellFormedHtml(t *testing.T) {
	templ := template.Must(template.ParseFiles("index.tmpl"))
	model := todo.NewList()

	// render the template into a buffer
	var buf bytes.Buffer
	err := templ.Execute(&buf, model)
	if err != nil {
		panic(err)
	}
}
Java
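@Test
void indexIsSoundHtml() {
    var template = Mustache.compiler().compile(
            new InputStreamReader(
                    getClass().getResourceAsStream("/index.tmpl")));
    var model = new TodoList();

    var html = template.execute(model);
}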
At this point, we want to parse the HTML and we expect to see an
error, because in our broken HTML there is a div element that
is closed by a p element. There is an HTML parser in the Go
standard library, but it is too lenient: if we run it on our broken HTML, we don’t get an
error. Luckily, the Go standard library also has an XML parser that can be
configured to parse HTML (thanks to this Stack Overflow answer).
Go
func Test_wellFormedHtml(t *testing.T) {
	templ := template.Must(template.ParseFiles("index.tmpl"))
	model := todo.NewList()

	// render the template into a buffer
	var buf bytes.Buffer
	err := templ.Execute(&buf, model)
	if err != nil {
		panic(err)
	}

	// check that the template can be parsed as (lenient) XML
	decoder := xml.NewDecoder(bytes.NewReader(buf.Bytes()))
	decoder.Strict = false
	decoder.AutoClose = xml.HTMLAutoClose
	decoder.Entity = xml.HTMLEntity
	for {
		_, err := decoder.Token()
		switch err {
		case io.EOF:
			return // We're done, it's valid!
		case nil:
			// do nothing
		default:
			t.Fatalf("Error parsing html: %s", err)
		}
	}
}
This code configures the XML parser to have the right level of leniency
for HTML, and then parses the HTML token by token. Indeed, we see the error
message we wanted:
--- FAIL: Test_wellFormedHtml (0.00s)
    index_template_test.go:61: Error parsing html: XML syntax error on line 4: unexpected end element </p>
In Java, a versatile library to use is jsoup:
Java
@Test
void indexIsSoundHtml() {
    var template = Mustache.compiler().compile(
            new InputStreamReader(
                    getClass().getResourceAsStream("/index.tmpl")));
    var model = new TodoList();

    var html = template.execute(model);

    var parser = Parser.htmlParser().setTrackErrors(10);
    Jsoup.parse(html, "", parser);
    assertThat(parser.getErrors()).isEmpty();
}
And we see it fail:
java.lang.AssertionError: Expecting empty but was:<[<1:13>: Unexpected EndTag token [</p>] when in state [InBody],
Success! Now if we copy over the contents of the TodoMVC
template to our index.tmpl file, the test passes.
The test, however, is too verbose: we extract two helper functions, in
order to make the intention of the test clearer, and we get
Go
func Test_wellFormedHtml(t *testing.T) {
	model := todo.NewList()

	buf := renderTemplate("index.tmpl", model)

	assertWellFormedHtml(t, buf)
}
Java
@Test
void indexIsSoundHtml() {
    var model = new TodoList();

    var html = renderTemplate("/index.tmpl", model);

    assertSoundHtml(html);
}
Level 2: testing HTML structure
What else should we test?
We know that the looks of a page can only be tested, ultimately, by a
human looking at how it is rendered in a browser. However, there is often
logic in templates, and we want to be able to test that logic.
One might be tempted to test the rendered HTML with string equality,
but this technique fails in practice, because templates contain a lot of
details that make string equality assertions impractical. The assertions
become very verbose, and when reading the assertion, it becomes difficult
to understand what it is that we’re trying to prove.
What we need is a technique to assert that some parts of the rendered HTML
correspond to what we expect, and to ignore all the details we don’t
care about. One way to do this is by running queries with the CSS selector language:
it is a powerful language that allows us to select the
elements that we care about from the whole HTML document. Once we have
selected those elements, we check (1) that the number of elements returned
is what we expect, and (2) that they contain the text or other content
that we expect.
The UI that we are supposed to generate looks like this:
There are several details that are rendered dynamically:
- The number of items and their text content change, obviously
- The style of the todo-item changes when it’s completed (e.g., the
second)
- The “2 items left” text will change with the number of non-completed
items
- One of the three buttons “All”, “Active”, “Completed” will be
highlighted, depending on the current url; for instance if we decide that the
url that shows only the “Active” items is /active, then when the current url
is /active, the “Active” button should be surrounded by a thin red
rectangle
- The “Clear completed” button should only be visible if any item is
completed
Each of these concerns can be tested with the help of CSS selectors.
This is a snippet from the TodoMVC template (slightly simplified). I
have not yet added the dynamic bits, so what we see here is static
content, provided as an example:
index.tmpl
<section class="todoapp">
  <ul class="todo-list">
    <!-- These are here just to show the structure of the list items -->
    <!-- List items should get the class `completed` when marked as completed -->
    <li class="completed">  ②
      <div class="view">
        <input class="toggle" type="checkbox" checked>
        <label>Taste JavaScript</label>  ①
        <button class="destroy"></button>
      </div>
    </li>
    <li>
      <div class="view">
        <input class="toggle" type="checkbox">
        <label>Buy a unicorn</label>  ①
        <button class="destroy"></button>
      </div>
    </li>
  </ul>
  <footer class="footer">
    <!-- This should be `0 items left` by default -->
    <span class="todo-count"><strong>0</strong> item left</span>  ⓷
    <ul class="filters">
      <li>
        <a class="selected" href="#/">All</a>  ④
      </li>
      <li>
        <a href="#/active">Active</a>
      </li>
      <li>
        <a href="#/completed">Completed</a>
      </li>
    </ul>
    <!-- Hidden if no completed items are left ↓ -->
    <button class="clear-completed">Clear completed</button>  ⑤
  </footer>
</section>
By looking at the static version of the template, we can deduce which
CSS selectors can be used to identify the relevant elements for the 5 dynamic
features listed above:
| | Feature | CSS selector |
|---|---|---|
| ① | All the items | ul.todo-list li |
| ② | Completed items | ul.todo-list li.completed |
| ⓷ | Items left | span.todo-count |
| ④ | Highlighted navigation link | ul.filters a.selected |
| ⑤ | Clear completed button | button.clear-completed |
We can use these selectors to focus our tests on just the things we want to test.
Testing HTML content
The first test will look for all the items, and prove that the data
set up by the test is rendered correctly.
func Test_todoItemsAreShown(t *testing.T) {
	model := todo.NewList()
	model.Add("Foo")
	model.Add("Bar")

	buf := renderTemplate("index.tmpl", model)
	_ = buf // will be used by the assertions sketched below

	// assert there are two <li> elements inside the <ul class="todo-list">
	// assert the first <li> text is "Foo"
	// assert the second <li> text is "Bar"
}
We need a way to query the HTML document with our CSS selector; a good
library for Go is goquery, which implements an API inspired by jQuery.
In Java, we keep using the same library we used to test for sound HTML, namely
jsoup. Our test becomes:
Go
func Test_todoItemsAreShown(t *testing.T) {
	model := todo.NewList()
	model.Add("Foo")
	model.Add("Bar")

	buf := renderTemplate("index.tmpl", model)

	// parse the HTML with goquery
	document, err := goquery.NewDocumentFromReader(bytes.NewReader(buf.Bytes()))
	if err != nil {
		// if parsing fails, we stop the test here with t.Fatalf
		t.Fatalf("Error parsing HTML: %s", err)
	}

	// assert there are two <li> elements inside the <ul class="todo-list">
	selection := document.Find("ul.todo-list li")
	assert.Equal(t, 2, selection.Length())

	// assert the first <li> text is "Foo"
	assert.Equal(t, "Foo", text(selection.Nodes[0]))

	// assert the second <li> text is "Bar"
	assert.Equal(t, "Bar", text(selection.Nodes[1]))
}

func text(node *html.Node) string {
	// A little mess due to the fact that goquery has
	// a .Text() method on Selection but not on html.Node
	sel := goquery.Selection{Nodes: []*html.Node{node}}
	return strings.TrimSpace(sel.Text())
}
Java
@Test
void todoItemsAreShown() throws IOException {
    var model = new TodoList();
    model.add("Foo");
    model.add("Bar");

    var html = renderTemplate("/index.tmpl", model);

    // parse the HTML with jsoup
    Document document = Jsoup.parse(html, "");

    // assert there are two <li> elements inside the <ul class="todo-list">
    var selection = document.select("ul.todo-list li");
    assertThat(selection).hasSize(2);

    // assert the first <li> text is "Foo"
    assertThat(selection.get(0).text()).isEqualTo("Foo");

    // assert the second <li> text is "Bar"
    assertThat(selection.get(1).text()).isEqualTo("Bar");
}
If we still haven’t changed the template to populate the list from the
model, this test will fail, because the static template
todo items have different text:
Go
--- FAIL: Test_todoItemsAreShown (0.00s)
    index_template_test.go:44: First list item: want Foo, got Taste JavaScript
    index_template_test.go:49: Second list item: want Bar, got Buy a unicorn
Java
IndexTemplateTest > todoItemsAreShown() FAILED
    org.opentest4j.AssertionFailedError:
    Expecting:
     <"Taste JavaScript">
    to be equal to:
     <"Foo">
    but was not.
We fix it by making the template use the model data:
Go
<ul class="todo-list">
  {{ range .Items }}
    <li>
      <div class="view">
        <input class="toggle" type="checkbox">
        <label>{{ .Title }}</label>
        <button class="destroy"></button>
      </div>
    </li>
  {{ end }}
</ul>
Java – jmustache
<ul class="todo-list">
  {{ #allItems }}
  <li>
    <div class="view">
      <input class="toggle" type="checkbox">
      <label>{{ title }}</label>
      <button class="destroy"></button>
    </div>
  </li>
  {{ /allItems }}
</ul>
Test both content and soundness at the same time
Our test works, but it is a bit verbose, especially the Go version. If we’re going to have more
tests, they will become repetitive and difficult to read, so we make it more concise by extracting a helper function for parsing the HTML. We also remove the
comments, as the code should be clear enough.
Go
func Test_todoItemsAreShown(t *testing.T) {
	model := todo.NewList()
	model.Add("Foo")
	model.Add("Bar")

	buf := renderTemplate("index.tmpl", model)

	document := parseHtml(t, buf)
	selection := document.Find("ul.todo-list li")
	assert.Equal(t, 2, selection.Length())
	assert.Equal(t, "Foo", text(selection.Nodes[0]))
	assert.Equal(t, "Bar", text(selection.Nodes[1]))
}

func parseHtml(t *testing.T, buf bytes.Buffer) *goquery.Document {
	document, err := goquery.NewDocumentFromReader(bytes.NewReader(buf.Bytes()))
	if err != nil {
		// if parsing fails, we stop the test here with t.Fatalf
		t.Fatalf("Error parsing HTML: %s", err)
	}
	return document
}
Java
@Test
void todoItemsAreShown() throws IOException {
    var model = new TodoList();
    model.add("Foo");
    model.add("Bar");

    var html = renderTemplate("/index.tmpl", model);

    var document = parseHtml(html);
    var selection = document.select("ul.todo-list li");
    assertThat(selection).hasSize(2);
    assertThat(selection.get(0).text()).isEqualTo("Foo");
    assertThat(selection.get(1).text()).isEqualTo("Bar");
}

private static Document parseHtml(String html) {
    return Jsoup.parse(html, "");
}
Much better! At least in my opinion. Now that we extracted the parseHtml
helper, it’s a good idea to check for sound HTML in the helper:
Go
func parseHtml(t *testing.T, buf bytes.Buffer) *goquery.Document {
	assertWellFormedHtml(t, buf)
	document, err := goquery.NewDocumentFromReader(bytes.NewReader(buf.Bytes()))
	if err != nil {
		// if parsing fails, we stop the test here with t.Fatalf
		t.Fatalf("Error parsing HTML: %s", err)
	}
	return document
}
Java
private static Document parseHtml(String html) {
    var parser = Parser.htmlParser().setTrackErrors(10);
    var document = Jsoup.parse(html, "", parser);
    assertThat(parser.getErrors()).isEmpty();
    return document;
}
And with this, we can get rid of the first test that we wrote, as we are now testing for sound HTML all the time.
The second test
Now we are in a good position for testing more rendering logic. The
second dynamic feature in our list is “List items should get the class
completed when marked as completed”. We can write a test for this:
Go
func Test_completedItemsGetCompletedClass(t *testing.T) {
	model := todo.NewList()
	model.Add("Foo")
	model.AddCompleted("Bar")

	buf := renderTemplate("index.tmpl", model)

	document := parseHtml(t, buf)
	selection := document.Find("ul.todo-list li.completed")
	// require stops the test when nothing matches, so that we never index an empty slice
	require.Equal(t, 1, selection.Length())
	assert.Equal(t, "Bar", text(selection.Nodes[0]))
}
Java
@Test
void completedItemsGetCompletedClass() {
    var model = new TodoList();
    model.add("Foo");
    model.addCompleted("Bar");

    var html = renderTemplate("/index.tmpl", model);

    var document = parseHtml(html);
    var selection = document.select("ul.todo-list li.completed");
    assertThat(selection).hasSize(1);
    assertThat(selection.text()).isEqualTo("Bar");
}
And this test can be made green by adding this bit of logic to the
template:
Go
<ul class="todo-list">
  {{ range .Items }}
    <li class="{{ if .IsCompleted }}completed{{ end }}">
      <div class="view">
        <input class="toggle" type="checkbox">
        <label>{{ .Title }}</label>
        <button class="destroy"></button>
      </div>
    </li>
  {{ end }}
</ul>
Java – jmustache
<ul class="todo-list">
  {{ #allItems }}
  <li class="{{ #isCompleted }}completed{{ /isCompleted }}">
    <div class="view">
      <input class="toggle" type="checkbox">
      <label>{{ title }}</label>
      <button class="destroy"></button>
    </div>
  </li>
  {{ /allItems }}
</ul>
So little by little, we can test and add the various dynamic features
that our template should have.
Make it easy to add new tests
The first of the 20 tips from the excellent talk by Russ Cox on Go
Testing is “Make it easy to add new test cases”. Indeed, in Go there
is a tendency to make most tests parameterized, for this very reason.
On the other hand, while Java has good support
for parameterized tests with JUnit 5, they don’t seem to be used as much.
Since our current two tests have the same structure, we
could factor them into a single parameterized test.
A test case for us will consist of:
- A name (so that we can produce clear error messages when the test
fails)
- A model (in our case a todo.List)
- A CSS selector
- A list of text matches that we expect to find when we run the CSS
selector on the rendered HTML.
So this is the data structure for our test cases:
Go
var testCases = []struct {
	name     string
	model    *todo.List
	selector string
	matches  []string
}{
	{
		name: "all todo items are shown",
		model: todo.NewList().
			Add("Foo").
			Add("Bar"),
		selector: "ul.todo-list li",
		matches:  []string{"Foo", "Bar"},
	},
	{
		name: "completed items get the 'completed' class",
		model: todo.NewList().
			Add("Foo").
			AddCompleted("Bar"),
		selector: "ul.todo-list li.completed",
		matches:  []string{"Bar"},
	},
}
Java
record TestCase(String name,
                TodoList model,
                String selector,
                List<String> matches) {
    @Override
    public String toString() {
        return name;
    }
}

public static TestCase[] indexTestCases() {
    return new TestCase[]{
            new TestCase(
                    "all todo items are shown",
                    new TodoList()
                            .add("Foo")
                            .add("Bar"),
                    "ul.todo-list li",
                    List.of("Foo", "Bar")),
            new TestCase(
                    "completed items get the 'completed' class",
                    new TodoList()
                            .add("Foo")
                            .addCompleted("Bar"),
                    "ul.todo-list li.completed",
                    List.of("Bar")),
    };
}
And this is our parameterized test:
Go
func Test_indexTemplate(t *testing.T) {
	for _, test := range testCases {
		t.Run(test.name, func(t *testing.T) {
			buf := renderTemplate("index.tmpl", test.model)

			assertWellFormedHtml(t, buf)
			document := parseHtml(t, buf)
			selection := document.Find(test.selector)
			require.Equal(t, len(test.matches), len(selection.Nodes), "unexpected # of matches")
			for i, node := range selection.Nodes {
				assert.Equal(t, test.matches[i], text(node))
			}
		})
	}
}
Java
@ParameterizedTest
@MethodSource("indexTestCases")
void testIndexTemplate(TestCase test) {
    var html = renderTemplate("/index.tmpl", test.model);

    var document = parseHtml(html);
    var selection = document.select(test.selector);
    assertThat(selection).hasSize(test.matches.size());
    for (int i = 0; i < test.matches.size(); i++) {
        assertThat(selection.get(i).text()).isEqualTo(test.matches.get(i));
    }
}
We can now run our parameterized test and see it pass:
Go
$ go test -v
=== RUN   Test_indexTemplate
=== RUN   Test_indexTemplate/all_todo_items_are_shown
=== RUN   Test_indexTemplate/completed_items_get_the_'completed'_class
--- PASS: Test_indexTemplate (0.00s)
    --- PASS: Test_indexTemplate/all_todo_items_are_shown (0.00s)
    --- PASS: Test_indexTemplate/completed_items_get_the_'completed'_class (0.00s)
PASS
ok      tdd-html-templates      0.608s
Java
$ ./gradlew test

> Task :test

IndexTemplateTest > testIndexTemplate(TestCase) > [1] all todo items are shown PASSED
IndexTemplateTest > testIndexTemplate(TestCase) > [2] completed items get the 'completed' class PASSED
Note how, by giving a name to our test cases, we get very readable test output, both on the terminal and in the IDE:
Having rewritten our two old tests in table form, it’s now super easy to add
another. This is the test for the “x items left” text:
Go
{
	name: "items left",
	model: todo.NewList().
		Add("One").
		Add("Two").
		AddCompleted("Three"),
	selector: "span.todo-count",
	matches:  []string{"2 items left"},
},
Java
new TestCase(
        "items left",
        new TodoList()
                .add("One")
                .add("Two")
                .addCompleted("Three"),
        "span.todo-count",
        List.of("2 items left")),
And the corresponding change in the html template is:
Go
<span class="todo-count"><strong>{{len .ActiveItems}}</strong> items left</span>
Java – jmustache
<span class="todo-count"><strong>{{activeItemsCount}}</strong> items left</span>
The above change in the template requires a supporting method in the model:
Go
type Item struct {
	Title       string
	IsCompleted bool
}

type List struct {
	Items []*Item
}

func (l *List) ActiveItems() []*Item {
	var result []*Item
	for _, item := range l.Items {
		if !item.IsCompleted {
			result = append(result, item)
		}
	}
	return result
}
Java
public class TodoList {
    private final List<TodoItem> items = new ArrayList<>();
    // ...
    public long activeItemsCount() {
        return items.stream().filter(TodoItem::isActive).count();
    }
}
We’ve invested a little effort in our testing infrastructure, so that adding new
test cases is easier. In the next section, we’ll see that the requirements
for the next test cases will push us to refine our test infrastructure further.
Making the table more expressive, at the expense of the test code
We will now test the “All”, “Active” and “Completed” navigation links at
the bottom of the UI (see the picture above),
and these depend on which url we are visiting, which is
something that our template has no way to find out.
Currently, all we pass to our template is our model, which is a todo-list.
It’s not correct to add the currently visited url to the model, because that is
user navigation state, not application state.
So we need to pass more information to the template beyond the model. An easy way
is to pass a map, which we construct in our renderTemplate function:
Go
func renderTemplate(model *todo.List, path string) bytes.Buffer {
	templ := template.Must(template.ParseFiles("index.tmpl"))
	var buf bytes.Buffer
	data := map[string]any{
		"model": model,
		"path":  path,
	}
	err := templ.Execute(&buf, data)
	if err != nil {
		panic(err)
	}
	return buf
}
Java
private String renderTemplate(String templateName, TodoList model, String path) {
    var template = Mustache.compiler().compile(
            new InputStreamReader(
                    getClass().getResourceAsStream(templateName)));
    var data = Map.of(
            "model", model,
            "path", path
    );
    return template.execute(data);
}
And correspondingly our test cases table has one more field:
Go
var testCases = []struct {
	name     string
	model    *todo.List
	path     string
	selector string
	matches  []string
}{
	{
		name: "all todo items are shown",
		model: todo.NewList().
			Add("Foo").
			Add("Bar"),
		selector: "ul.todo-list li",
		matches:  []string{"Foo", "Bar"},
	},
	// ... the other cases
	{
		name:     "highlighted navigation link: All",
		path:     "/",
		selector: "ul.filters a.selected",
		matches:  []string{"All"},
	},
	{
		name:     "highlighted navigation link: Active",
		path:     "/active",
		selector: "ul.filters a.selected",
		matches:  []string{"Active"},
	},
	{
		name:     "highlighted navigation link: Completed",
		path:     "/completed",
		selector: "ul.filters a.selected",
		matches:  []string{"Completed"},
	},
}
Java
record TestCase(String name,
                TodoList model,
                String path,
                String selector,
                List<String> matches) {
    @Override
    public String toString() {
        return name;
    }
}

public static TestCase[] indexTestCases() {
    return new TestCase[]{
            new TestCase(
                    "all todo items are shown",
                    new TodoList()
                            .add("Foo")
                            .add("Bar"),
                    "/",
                    "ul.todo-list li",
                    List.of("Foo", "Bar")),
            // ... the previous cases
            new TestCase(
                    "highlighted navigation link: All",
                    new TodoList(),
                    "/",
                    "ul.filters a.selected",
                    List.of("All")),
            new TestCase(
                    "highlighted navigation link: Active",
                    new TodoList(),
                    "/active",
                    "ul.filters a.selected",
                    List.of("Active")),
            new TestCase(
                    "highlighted navigation link: Completed",
                    new TodoList(),
                    "/completed",
                    "ul.filters a.selected",
                    List.of("Completed")),
    };
}
We notice that for the three new cases, the model is irrelevant;
while for the previous cases, the path is irrelevant. The Go syntax allows us
to initialize a struct with just the fields we’re interested in, but Java does not have
a similar feature, so we’re pushed to pass extra information, and this makes the test cases
table harder to understand.
A developer might look at the first test case and wonder if the expected behavior depends
on the path being set to “/”, and might be tempted to add more cases with
a different path. In the same way, when reading the
highlighted navigation link test cases, the developer might wonder if the
expected behavior depends on the model being set to an empty todo list. If so, one might
be led to add irrelevant test cases for the highlighted link with non-empty todo-lists.
We want to optimize for the time of the developers, so it’s worthwhile to avoid adding irrelevant
data to our test case. In Java we might pass null for the
irrelevant fields, but there’s a better way: we can use
the builder pattern, popularized by Joshua Bloch.
We can quickly write one for the Java TestCase record this way:
Java
record TestCase(String name,
                TodoList model,
                String path,
                String selector,
                List<String> matches) {
    @Override
    public String toString() {
        return name;
    }

    public static final class Builder {
        String name;
        TodoList model;
        String path;
        String selector;
        List<String> matches;

        public Builder name(String name) {
            this.name = name;
            return this;
        }

        public Builder model(TodoList model) {
            this.model = model;
            return this;
        }

        public Builder path(String path) {
            this.path = path;
            return this;
        }

        public Builder selector(String selector) {
            this.selector = selector;
            return this;
        }

        public Builder matches(String ... matches) {
            this.matches = Arrays.asList(matches);
            return this;
        }

        public TestCase build() {
            return new TestCase(name, model, path, selector, matches);
        }
    }
}
Hand-coding builders is a little tedious, but doable, though there are
automated ways to write them.
Now we can rewrite our Java test cases with the Builder, to
achieve greater clarity:
Java
public static TestCase[] indexTestCases() {
    return new TestCase[]{
            new TestCase.Builder()
                    .name("all todo items are shown")
                    .model(new TodoList()
                            .add("Foo")
                            .add("Bar"))
                    .selector("ul.todo-list li")
                    .matches("Foo", "Bar")
                    .build(),
            // ... other cases
            new TestCase.Builder()
                    .name("highlighted navigation link: Completed")
                    .path("/completed")
                    .selector("ul.filters a.selected")
                    .matches("Completed")
                    .build(),
    };
}
So, where are we with our tests? At present, they fail for the wrong reason: null-pointer exceptions
due to the missing model and path values.
In order to get our new test cases to fail for the right reason, namely that the template does
not yet have logic to highlight the correct link, we must
provide default values for model and path. In Go, we can do this
in the test method:
Go
func Test_indexTemplate(t *testing.T) {
	for _, test := range testCases {
		t.Run(test.name, func(t *testing.T) {
			if test.model == nil {
				test.model = todo.NewList()
			}
			buf := renderTemplate(test.model, test.path)
			// ... same as before
		})
	}
}
In Java, we can provide default values in the builder:
Java
public static final class Builder {
    String name;
    TodoList model = new TodoList();
    String path = "/";
    String selector;
    List<String> matches;
    // ...
}
With these changes, we see that the last two test cases, the ones for the highlighted link Active
and Completed fail, for the expected reason that the highlighted link does not change:
Go
=== RUN   Test_indexTemplate/highlighted_navigation_link:_Active
    index_template_test.go:82:
        Error Trace: .../tdd-templates/go/index_template_test.go:82
        Error:       Not equal:
                     expected: "Active"
                     actual  : "All"
=== RUN   Test_indexTemplate/highlighted_navigation_link:_Completed
    index_template_test.go:82:
        Error Trace: .../tdd-templates/go/index_template_test.go:82
        Error:       Not equal:
                     expected: "Completed"
                     actual  : "All"
Java
IndexTemplateTest > testIndexTemplate(TestCase) > [5] highlighted navigation link: Active FAILED
    org.opentest4j.AssertionFailedError:
    Expecting:
     <"All">
    to be equal to:
     <"Active">
    but was not.

IndexTemplateTest > testIndexTemplate(TestCase) > [6] highlighted navigation link: Completed FAILED
    org.opentest4j.AssertionFailedError:
    Expecting:
     <"All">
    to be equal to:
     <"Completed">
    but was not.
To make the tests pass, we make these changes to the template:
Go
<ul class="filters">
  <li>
    <a class="{{ if eq .path "/" }}selected{{ end }}" href="#/">All</a>
  </li>
  <li>
    <a class="{{ if eq .path "/active" }}selected{{ end }}" href="#/active">Active</a>
  </li>
  <li>
    <a class="{{ if eq .path "/completed" }}selected{{ end }}" href="#/completed">Completed</a>
  </li>
</ul>
Java – jmustache
<ul class="filters">
  <li>
    <a class="{{ #pathRoot }}selected{{ /pathRoot }}" href="#/">All</a>
  </li>
  <li>
    <a class="{{ #pathActive }}selected{{ /pathActive }}" href="#/active">Active</a>
  </li>
  <li>
    <a class="{{ #pathCompleted }}selected{{ /pathCompleted }}" href="#/completed">Completed</a>
  </li>
</ul>
Since the Mustache template language does not allow for equality testing, we must change the
data passed to the template so that we execute the equality tests before rendering the template:
Java
private String renderTemplate(String templateName, TodoList model, String path) {
    var template = Mustache.compiler().compile(
            new InputStreamReader(
                    getClass().getResourceAsStream(templateName)));
    var data = Map.of(
            "model", model,
            "pathRoot", path.equals("/"),
            "pathActive", path.equals("/active"),
            "pathCompleted", path.equals("/completed")
    );
    return template.execute(data);
}
And with these changes, all of our tests now pass.
To recap this section, we made the test code a little bit more complicated, so that the test
cases are clearer: this is a very good tradeoff!
Level 3: testing HTML behaviour
In the story so far, we tested the behaviour of the HTML
templates, by checking the structure of the generated HTML.
That’s good, but what if we wanted to test the behaviour of the HTML
itself, plus any CSS and JavaScript it may use?
The behaviour of HTML by itself is usually pretty obvious, because
there is not much of it. The only elements that can interact with the
user are the anchor (<a>), <form> and <input> elements, but the picture
changes completely when we add CSS, which can hide, show, move things
around and lots more, and JavaScript, which can add any behaviour to a page.
In an application that is primarily rendered server-side, we expect
that most behaviour is implemented by returning new HTML with a
round-trip to the user, and this can be tested adequately with the
techniques we’ve seen so far, but what if we wanted to speed up the
application behaviour with a library such as HTMX? This library works through special
attributes that are added to elements to add Ajax behaviour. These
attributes are in effect a DSL that we might want to
test.
How can we test the combination of HTML, CSS and JavaScript in
a unit test?
Testing HTML, CSS and JavaScript requires something that is able to
interpret and execute their behaviours; in other words, we need a
browser! It is customary to use headless browsers in end-to-end tests;
can we use them for unit tests instead? I think this is possible,
using the following techniques, although I must admit I have yet to try
this on a real project.
We will use the Playwright library, which is available for both Go and
Java. The tests we
are going to write will be slower, because we will have to wait a few
seconds for the headless browser to start, but will retain some of the
important characteristics of unit tests, primarily that we are testing
just the HTML (and any associated CSS and JavaScript), in isolation from
any other server-side logic.
Continuing with the TodoMVC
example, the next thing we might want to test is what happens when the
user clicks on the checkbox of a todo item. What we’d like to happen is
that:
- A POST call to the server is made, so that the application knows
that the state of a todo item has changed
- The server returns new HTML for the dynamic part of the page,
namely all of the section with class “todoapp”, so that we can show the
new state of the application including the count of remaining “active”
items (see the template above)
- The page replaces the old contents of the “todoapp” section with
the new ones.
Loading the page in the Playwright browser
We start with a test that will just load the initial HTML. The test
is a little involved, so I show the complete code here, and then I will
comment on it bit by bit.
Go
func Test_toggleTodoItem(t *testing.T) {
	// render the initial HTML
	model := todo.NewList().
		Add("One").
		Add("Two")
	initialHtml := renderTemplate("index.tmpl", model, "/")

	// open the browser page with Playwright
	page := openPage()
	defer page.Close()
	logActivity(page)

	// stub network calls
	err := page.Route("**", func(route playwright.Route) {
		if route.Request().URL() == "http://localhost:4567/index.html" {
			// serve the initial HTML
			stubResponse(route, initialHtml.String(), "text/html")
		} else {
			// avoid unexpected requests
			panic("unexpected request: " + route.Request().URL())
		}
	})
	if err != nil {
		t.Fatal(err)
	}

	// load initial HTML in the page
	response, err := page.Goto("http://localhost:4567/index.html")
	if err != nil {
		t.Fatal(err)
	}
	if response.Status() != 200 {
		t.Fatalf("unexpected status: %d", response.Status())
	}
}
Java
public class IndexBehaviourTest {
    static Playwright playwright;
    static Browser browser;

    @BeforeAll
    static void launchBrowser() {
        playwright = Playwright.create();
        browser = playwright.chromium().launch();
    }

    @AfterAll
    static void closeBrowser() {
        playwright.close();
    }

    @Test
    void toggleTodoItem() {
        // Render the initial html
        TodoList model = new TodoList()
                .add("One")
                .add("Two");
        String initialHtml = renderTemplate("/index.tmpl", model, "/");

        try (Page page = browser.newPage()) {
            logActivity(page);

            // stub network calls
            page.route("**", route -> {
                if (route.request().url().equals("http://localhost:4567/index.html")) {
                    // serve the initial HTML
                    route.fulfill(new Route.FulfillOptions()
                            .setContentType("text/html")
                            .setBody(initialHtml));
                } else {
                    // we don't want unexpected calls
                    fail(String.format("Unexpected request: %s %s", route.request().method(), route.request().url()));
                }
            });

            // load initial html
            page.navigate("http://localhost:4567/index.html");
        }
    }
}
At the start of the test, we initialize the model with two todo
items “One” and “Two”, then we render the template as before:
Go
model := todo.NewList().
	Add("One").
	Add("Two")
initialHtml := renderTemplate("index.tmpl", model, "/")
Java
TodoList model = new TodoList()
        .add("One")
        .add("Two");
String initialHtml = renderTemplate("/index.tmpl", model, "/");
Then we open the Playwright “page”, which will start a headless
browser:
Go
page := openPage()
defer page.Close()
logActivity(page)
Java
try (Page page = browser.newPage()) {
    logActivity(page);
The openPage function in Go returns a Playwright Page object,
Go
func openPage() playwright.Page {
	pw, err := playwright.Run()
	if err != nil {
		log.Fatalf("could not start playwright: %v", err)
	}
	browser, err := pw.Chromium.Launch()
	if err != nil {
		log.Fatalf("could not launch browser: %v", err)
	}
	page, err := browser.NewPage()
	if err != nil {
		log.Fatalf("could not create page: %v", err)
	}
	return page
}
and the logActivity function provides feedback on what the page is doing:
Go
func logActivity(page playwright.Page) {
	page.OnRequest(func(request playwright.Request) {
		log.Printf(">> %s %s\n", request.Method(), request.URL())
	})
	page.OnResponse(func(response playwright.Response) {
		log.Printf("<< %d %s\n", response.Status(), response.URL())
	})
	page.OnLoad(func(page playwright.Page) {
		log.Println("Loaded: " + page.URL())
	})
	page.OnConsole(func(message playwright.ConsoleMessage) {
		log.Println("! " + message.Text())
	})
}
Java
private void logActivity(Page page) {
    page.onRequest(request -> System.out.printf(">> %s %s%n", request.method(), request.url()));
    page.onResponse(response -> System.out.printf("<< %s %s%n", response.status(), response.url()));
    page.onLoad(page1 -> System.out.println("Loaded: " + page1.url()));
    page.onConsoleMessage(consoleMessage -> System.out.println("! " + consoleMessage.text()));
}
Then we stub all network activity that the page might try to do:
Go
err := page.Route("**", func(route playwright.Route) {
	if route.Request().URL() == "http://localhost:4567/index.html" {
		// serve the initial HTML
		stubResponse(route, initialHtml.String(), "text/html")
	} else {
		// avoid unexpected requests
		panic("unexpected request: " + route.Request().URL())
	}
})
Java
// stub network calls
page.route("**", route -> {
    if (route.request().url().equals("http://localhost:4567/index.html")) {
        // serve the initial HTML
        route.fulfill(new Route.FulfillOptions()
                .setContentType("text/html")
                .setBody(initialHtml));
    } else {
        // we don't want unexpected calls
        fail(String.format("Unexpected request: %s %s", route.request().method(), route.request().url()));
    }
});
and we ask the page to load the initial HTML:
Go
response, err := page.Goto("http://localhost:4567/index.html")
Java
page.navigate("http://localhost:4567/index.html");
With all this machinery in place, we run the test; it succeeds and
it logs the stubbed network activity on standard output:
Go
=== RUN Test_toggleTodoItem
>> GET http://localhost:4567/index.html
<< 200 http://localhost:4567/index.html
Loaded: http://localhost:4567/index.html
--- PASS: Test_toggleTodoItem (0.89s)
Java
IndexBehaviourTest > toggleTodoItem() STANDARD_OUT
>> GET http://localhost:4567/index.html
<< 200 http://localhost:4567/index.html
Loaded: http://localhost:4567/index.html
IndexBehaviourTest > toggleTodoItem() PASSED
So with this test we are now able to load arbitrary HTML in a
headless browser. In the next sections we’ll see how to simulate user
interaction with elements of the page, and observe the page’s
behaviour. But first we need to solve a problem with the lack of
identifiers in our domain model.
Identifying todo items
Now we want to click on the “One” checkbox. The problem we have is
that at present, we have no way to identify individual todo items, so
we introduce an Id field in the todo item:
Go – updated model with Id
type Item struct {
	Id          int
	Title       string
	IsCompleted bool
}

func (l *List) AddWithId(id int, title string) *List {
	item := Item{
		Id:    id,
		Title: title,
	}
	l.Items = append(l.Items, &item)
	return l
}

// Add creates a new todo.Item with a random Id
func (l *List) Add(title string) *List {
	item := Item{
		Id:    generateRandomId(),
		Title: title,
	}
	l.Items = append(l.Items, &item)
	return l
}

func generateRandomId() int {
	return abs(rand.Int())
}
Java – updated model with Id
public class TodoList {
    private final List<TodoItem> items = new ArrayList<>();

    public TodoList add(String title) {
        items.add(new TodoItem(generateRandomId(), title, false));
        return this;
    }

    public TodoList addCompleted(String title) {
        items.add(new TodoItem(generateRandomId(), title, true));
        return this;
    }

    public TodoList add(int id, String title) {
        items.add(new TodoItem(id, title, false));
        return this;
    }

    private static int generateRandomId() {
        return new Random().nextInt(0, Integer.MAX_VALUE);
    }
}

public record TodoItem(int id, String title, boolean isCompleted) {
    public boolean isActive() {
        return !isCompleted;
    }
}
And we update the model in our test to add explicit Ids:
Go – adding Id in the test data
func Test_toggleTodoItem(t *testing.T) {
	// render the initial HTML
	model := todo.NewList().
		AddWithId(101, "One").
		AddWithId(102, "Two")
	initialHtml := renderTemplate("index.tmpl", model, "/")
	// ...
}
Java – adding Id in the test data
@Test
void toggleTodoItem() {
    // Render the initial html
    TodoList model = new TodoList()
            .add(101, "One")
            .add(102, "Two");
    String initialHtml = renderTemplate("/index.tmpl", model, "/");
}
We are now ready to test user interaction with the page.
Clicking on a todo item
We want to simulate user interaction with the HTML page. It might be
tempting to continue to use CSS selectors to identify the specific
checkbox that we want to click, but there’s a better way: there is a
consensus among front-end developers that the best way to test
interaction with a page is to use it the same way that users do. For
instance, you don’t look for a button through a CSS locator such as
button.buy; instead, you look for something clickable with the label
“Buy”. In practice, this means identifying parts of the page through
their ARIA roles.
To this end, we add code to our test to look for a checkbox labelled
“One”:
Go
func Test_toggleTodoItem(t *testing.T) {
	// ...
	// click on the "One" checkbox
	checkbox := page.GetByRole(*playwright.AriaRoleCheckbox, playwright.PageGetByRoleOptions{Name: "One"})
	if err := checkbox.Click(); err != nil {
		t.Fatal(err)
	}
}
Java
@Test
void toggleTodoItem() {
    // ...
        // click on the "One" checkbox
        var checkbox = page.getByRole(AriaRole.CHECKBOX, new Page.GetByRoleOptions().setName("One"));
        checkbox.click();
    }
}
We run the test, and it fails:
Go
>> GET http://localhost:4567/index.html
<< 200 http://localhost:4567/index.html
Loaded: http://localhost:4567/index.html
--- FAIL: Test_toggleTodoItem (32.74s)
index_behaviour_test.go:50: playwright: timeout: Timeout 30000ms exceeded.
Java
IndexBehaviourTest > toggleTodoItem() STANDARD_OUT
>> GET http://localhost:4567/index.html
<< 200 http://localhost:4567/index.html
Loaded: http://localhost:4567/index.html
IndexBehaviourTest > toggleTodoItem() FAILED
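com.microsoft.playwright.TimeoutError: Error {
message="...
Playwright waits 30 seconds for the checkbox to appear and then gives up: it cannot find a checkbox labelled “One”, because our HTML does not link the label to the checkbox properly:
generated HTML with bad accessibility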
<li>
<div class="view">
<input class="toggle" type="checkbox">
<label>One</label>
<button class="destroy"></button>
</div>
</li>
We fix it by using the for attribute in the template,
index.tmpl – Go
<li>
<div class="view">
<input id="checkbox-{{.Id}}" class="toggle" type="checkbox">
<label for="checkbox-{{.Id}}">{{.Title}}</label>
<button class="destroy"></button>
</div>
</li>
index.tmpl – Java
<li>
<div class="view">
<input id="checkbox-{{ id }}" class="toggle" type="checkbox">
<label for="checkbox-{{ id }}">{{ title }}</label>
<button class="destroy"></button>
</div>
</li>
so that it generates proper, accessible HTML:
generated HTML with better accessibility
<li>
<div class="view">
<input id="checkbox-101" class="toggle" type="checkbox">
<label for="checkbox-101">One</label>
<button class="destroy"></button>
</div>
</li>
We run the test again, and it passes.
In this section we saw how testing the HTML in the same way as users
interact with it led us to use ARIA roles, which led to improving
accessibility of our generated HTML. In the next section, we will see
how to test that the click on a todo item triggers a remote call to the
server, that should result in swapping a part of the current HTML with
the HTML returned by the XHR call.
Round-trip to the server
Now we will extend our test. We tell the test that if a call to
POST /toggle/101 is received, it should return some stubbed HTML.
Go
} else if route.Request().URL() == "http://localhost:4567/toggle/101" && route.Request().Method() == "POST" {
	// we expect that a POST /toggle/101 request is made when we click on the "One" checkbox
	const stubbedHtml = `
		<section class="todoapp">
			<p>Stubbed html</p>
		</section>`
	stubResponse(route, stubbedHtml, "text/html")
Java
} else if (route.request().url().equals("http://localhost:4567/toggle/101") && route.request().method().equals("POST")) {
    // we expect that a POST /toggle/101 request is made when we click on the "One" checkbox
    String stubbedHtml = """
        <section class="todoapp">
            <p>Stubbed html</p>
        </section>
        """;
    route.fulfill(new Route.FulfillOptions()
            .setContentType("text/html")
            .setBody(stubbedHtml));
And we stub the loading of the HTMX library, which we load from a
local file:
Go
} else if route.Request().URL() == "https://unpkg.com/htmx.org@1.9.12" {
// serve the htmx library
stubResponse(route, readFile("testdata/htmx.min.js"), "application/javascript")
Java
} else if (route.request().url().equals("https://unpkg.com/htmx.org@1.9.12")) {
    // serve the htmx library
    route.fulfill(new Route.FulfillOptions()
            .setContentType("application/javascript")
            .setBody(readFile("/htmx.min.js")));
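As the chain of else-if branches grows, one option is to keep the stubs in a table and look them up by method and URL. This is only a sketch of that alternative, not the structure the article's tests use; the stub type and the findStub function are hypothetical names introduced here, and the sketch is independent of Playwright so that it can run on its own.

```go
package main

import "fmt"

// stub (hypothetical) describes one canned response for an expected request.
type stub struct {
	method, url, contentType, body string
}

// findStub returns the stub registered for the given request,
// or an error when the request was not expected by the test.
func findStub(stubs []stub, method, url string) (stub, error) {
	for _, s := range stubs {
		if s.method == method && s.url == url {
			return s, nil
		}
	}
	return stub{}, fmt.Errorf("unexpected request: %s %s", method, url)
}

func main() {
	stubs := []stub{
		{"GET", "http://localhost:4567/index.html", "text/html", "<html>...</html>"},
		{"POST", "http://localhost:4567/toggle/101", "text/html",
			`<section class="todoapp"><p>Stubbed html</p></section>`},
	}

	s, err := findStub(stubs, "POST", "http://localhost:4567/toggle/101")
	fmt.Println(s.body, err)

	_, err = findStub(stubs, "GET", "http://localhost:4567/other")
	fmt.Println(err)
}
```

Inside the page.Route callback, the test could call findStub with route.Request().Method() and route.Request().URL(), pass the result to stubResponse, and fail the test when an error comes back.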
Finally, we add the expectation that, after we click the checkbox,
the section of the HTML that contains most of the application is
reloaded.
Go
// click on the "One" checkbox
checkbox := page.GetByRole(*playwright.AriaRoleCheckbox, playwright.PageGetByRoleOptions{Name: "One"})
if err := checkbox.Click(); err != nil {
	t.Fatal(err)
}
// check that the page has been updated
document := parseHtml(t, content(t, page))
elements := document.Find("body > section.todoapp > p")
assert.Equal(t, "Stubbed html", elements.Text(), must(page.Content()))
Java
// click on the "One" checkbox
var checkbox = page.getByRole(AriaRole.CHECKBOX, new Page.GetByRoleOptions().setName("One"));
checkbox.click();
// check that the page has been updated
var document = parseHtml(page.content());
var elements = document.select("body > section.todoapp > p");
assertThat(elements.text())
        .describedAs(page.content())
        .isEqualTo("Stubbed html");
We run the test, and it fails, as expected. In order to understand
why exactly it fails, we add to the error message the whole HTML
document.
Go
assert.Equal(t, "Stubbed html", elements.Text(), must(page.Content()))
Java
assertThat(elements.text())
.describedAs(page.content())
.isEqualTo("Stubbed html");
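The Go assertion wraps page.Content() in a must helper whose definition is not shown in this section. A plausible definition, assuming it simply panics when the wrapped call returns an error, is sketched below; fakeContent is a hypothetical stand-in for page.Content() so that the example runs without a browser.

```go
package main

import (
	"errors"
	"fmt"
)

// must returns v unchanged, or panics if err is non-nil.
// It lets a call that returns (value, error) be used inline as an argument.
func must[T any](v T, err error) T {
	if err != nil {
		panic(err)
	}
	return v
}

// fakeContent (hypothetical) mimics the shape of page.Content():
// it returns the document as a string, plus an error.
func fakeContent(fail bool) (string, error) {
	if fail {
		return "", errors.New("page closed")
	}
	return "<html></html>", nil
}

func main() {
	fmt.Println(must(fakeContent(false)))
}
```

Panicking is acceptable here because the value is only used to enrich a failure message; an error at that point means the test is already broken.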
The error message is very verbose, but we see that the reason it
fails is that we don’t see the stubbed HTML in the output. This means
that the page did not make the expected XHR call.
Go – Java is similar
--- FAIL: Test_toggleTodoItem (2.75s)
=== RUN Test_toggleTodoItem
>> GET http://localhost:4567/index.html
<< 200 http://localhost:4567/index.html
Loaded: http://localhost:4567/index.html
index_behaviour_test.go:67:
Error Trace: .../index_behaviour_test.go:67
Error: Not equal:
expected: "Stubbed html"
actual : ""
...
Test: Test_toggleTodoItem
Messages: <!DOCTYPE html><html lang="en"><head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Template • TodoMVC</title>
<script src="https://unpkg.com/htmx.org@1.9.12"></script>
<body>
<section class="todoapp">
...
<li class="">
<div class="view">
<input id="checkbox-101" class="toggle" type="checkbox">
<label for="checkbox-101">One</label>
<button class="destroy"></button>
</div>
</li>
...
We can make this test pass by changing the HTML template to use HTMX
to make an XHR call back to the server. First we load the HTMX
library:
index.tmpl
<title>Template • TodoMVC</title>
<script src="https://unpkg.com/htmx.org@1.9.12"></script>
Then we add the HTMX attributes to the checkboxes:
index.tmpl
<input data-hx-post="/toggle/{{.Id}}" data-hx-target="section.todoapp" id="checkbox-{{.Id}}" class="toggle" type="checkbox">
The data-hx-post annotation will make HTMX do a POST call to the
specified URL. The data-hx-target annotation tells HTMX to copy the HTML
returned by the call into the element specified by the section.todoapp
CSS locator.
We run the test again, and it still fails!
Go – Java is similar
--- FAIL: Test_toggleTodoItem (2.40s)
=== RUN Test_toggleTodoItem
>> GET http://localhost:4567/index.html
<< 200 http://localhost:4567/index.html
>> GET https://unpkg.com/htmx.org@1.9.12
<< 200 https://unpkg.com/htmx.org@1.9.12
Loaded: http://localhost:4567/index.html
>> POST http://localhost:4567/toggle/101
<< 200 http://localhost:4567/toggle/101
index_behaviour_test.go:67:
Error Trace: .../index_behaviour_test.go:67
Error: Not equal:
expected: "Stubbed html"
actual : ""
...
Test: Test_toggleTodoItem
Messages: <!DOCTYPE html><html lang="en"><head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Template • TodoMVC</title>
<script src="https://unpkg.com/htmx.org@1.9.12"></script>
...
<body>
<section class="todoapp"><section class="todoapp">
<p>Stubbed html</p>
</section></section>
...
</body></html>
The log lines show that the POST call happened as expected, but
examination of the error message shows that the HTML structure we
expected is not there: we have a section.todoapp nested inside another.
This means that we are not using the HTMX annotations correctly, and
shows why this kind of test can be valuable. We add the missing
annotation:
index.tmpl
<input
data-hx-post="/toggle/{{.Id}}"
data-hx-target="section.todoapp"
data-hx-swap="outerHTML"
id="checkbox-{{.Id}}"
class="toggle"
type="checkbox">
The default behaviour of HTMX is to replace the inner HTML of the
target element. The data-hx-swap="outerHTML" annotation tells HTMX to
replace the outer HTML instead.
We test again, and this time it passes!
Go
=== RUN Test_toggleTodoItem
>> GET http://localhost:4567/index.html
<< 200 http://localhost:4567/index.html
>> GET https://unpkg.com/htmx.org@1.9.12
<< 200 https://unpkg.com/htmx.org@1.9.12
Loaded: http://localhost:4567/index.html
>> POST http://localhost:4567/toggle/101
<< 200 http://localhost:4567/toggle/101
--- PASS: Test_toggleTodoItem (1.39s)
Java
IndexBehaviourTest > toggleTodoItem() STANDARD_OUT
>> GET http://localhost:4567/index.html
<< 200 http://localhost:4567/index.html
>> GET https://unpkg.com/htmx.org@1.9.12
<< 200 https://unpkg.com/htmx.org@1.9.12
Loaded: http://localhost:4567/index.html
>> POST http://localhost:4567/toggle/101
<< 200 http://localhost:4567/toggle/101
IndexBehaviourTest > toggleTodoItem() PASSED
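The difference between the two swap strategies explains the nested section seen in the earlier failure. The following toy model, written for illustration only, represents an element as three strings and shows what each strategy produces; it is not how HTMX is implemented, and the element type and the swapInner and swapOuter functions are hypothetical names.

```go
package main

import "fmt"

// element is a toy model of a DOM element: opening tag, inner markup, closing tag.
type element struct {
	openTag, inner, closeTag string
}

// outerHTML returns the element's full markup, tags included.
func (e element) outerHTML() string {
	return e.openTag + e.inner + e.closeTag
}

// swapInner models the default innerHTML swap: the target keeps its own tags
// and only its children are replaced by the response.
func swapInner(target element, response string) string {
	return target.openTag + response + target.closeTag
}

// swapOuter models the outerHTML swap: the whole target, tags included,
// is replaced by the response.
func swapOuter(_ element, response string) string {
	return response
}

func main() {
	target := element{`<section class="todoapp">`, `<ul>...</ul>`, `</section>`}
	response := `<section class="todoapp"><p>Stubbed html</p></section>`

	fmt.Println("before:   ", target.outerHTML())
	fmt.Println("innerHTML:", swapInner(target, response))
	fmt.Println("outerHTML:", swapOuter(target, response))
}
```

Because the server returns the whole section.todoapp element, the inner swap wraps it in the existing section, while the outer swap leaves exactly one.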
In this section we saw how to write a test for the behaviour of our
HTML that, while using the complicated machinery of a headless browser,
still feels more like a unit test than an integration test. It is in
fact testing just an HTML page with any associated CSS and JavaScript,
in isolation from other parts of the application such as controllers,
services or repositories.
The test costs 2-3 seconds of waiting time for the headless browser to come up, which is usually too much for a unit test; however, like a unit test, it is stable rather than flaky, and its failures are documented with a relatively clear error message.
See the final version of the test in Go and in Java.
Bonus level: Stringly asserted
Esko Luontola, TDD expert and author of the online course tdd.mooc.fi, suggested an alternative to testing HTML with CSS selectors: the idea is to transform HTML into a human-readable canonical form.
Let’s take for example this snippet of generated HTML:
<ul class="todo-list">
  <li class="">
    <div class="view">
      <input id="checkbox-100" class="toggle" type="checkbox">
      <label for="checkbox-100">One</label>
      <button class="destroy"></button>
    </div>
  </li>
  <li class="">
    <div class="view">
      <input id="checkbox-200" class="toggle" type="checkbox">
      <label for="checkbox-200">Two</label>
      <button class="destroy"></button>
    </div>
  </li>
  <li class="completed">
    <div class="view">
      <input id="checkbox-300" class="toggle" type="checkbox">
      <label for="checkbox-300">Three</label>
      <button class="destroy"></button>
    </div>
  </li>
</ul>
We could visualize the above HTML by:
- deleting all HTML tags
- reducing every sequence of whitespace characters to a single blank
to arrive at:
One Two Three
This, however, removes too much of the HTML structure to be useful. For instance, it does not let us distinguish between active and completed items. Some HTML elements represent visible content: for instance
<input value="foo" />
shows a text box with the word “foo” that is an important part of the way we perceive HTML. To visualize those elements, Esko suggests adding a data-test-icon attribute that supplies some text to be used in place of the element when visualizing it for testing. With this,
<input value="foo" data-test-icon="[foo]" />
the input element is visualized as [foo], with the square brackets hinting that the word “foo” sits inside an editable text box. Now if we add test-icons to our HTML template,
Go — Java is similar
<ul class="todo-list">
  {{ range .model.AllItems }}
  <li class="{{ if .IsCompleted }}completed{{ end }}">
    <div class="view">
      <input data-hx-post="/toggle/{{ .Id }}"
             data-hx-target="section.todoapp"
             data-hx-swap="outerHTML"
             id="checkbox-{{ .Id }}"
             class="toggle"
             type="checkbox"
             data-test-icon="{{ if .IsCompleted }}✅{{ else }}⬜{{ end }}">
      <label for="checkbox-{{ .Id }}">{{ .Title }}</label>
      <button class="destroy" data-test-icon="❌️"></button>
    </div>
  </li>
  {{ end }}
</ul>
we can assert against its canonical visual representation like this:
Go
func Test_visualize_html_example(t *testing.T) {
	model := todo.NewList().
		Add("One").
		Add("Two").
		AddCompleted("Three")

	buf := renderTemplate("todo-list.tmpl", model, "/")

	expected := `
		⬜ One ❌️
		⬜ Two ❌️
		✅ Three ❌️
		`
	assert.Equal(t, normalizeWhitespace(expected), visualizeHtml(buf.String()))
}
Java
@Test
void visualize_html_example() {
    var model = new TodoList()
            .add("One")
            .add("Two")
            .addCompleted("Three");

    var html = renderTemplate("/todo-list.tmpl", model, "/");

    assertThat(visualizeHtml(html))
            .isEqualTo(normalizeWhitespace("""
                    ⬜ One ❌️
                    ⬜ Two ❌️
                    ✅ Three ❌️
                    """));
}
Here is Esko Luontola’s Java implementation of the two functions that make this possible, and my translation to Go of his code.
Go
func visualizeHtml(html string) string {
	// custom visualization using data-test-icon attribute
	html = replaceAll(html, "<[^<>]+\\bdata-test-icon=\"(.*?)\".*?>", " $1 ")
	// strip all HTML tags: inline elements
	html = replaceAll(html, "</?(a|abbr|b|big|cite|code|em|i|small|span|strong|tt)\\b.*?>", "")
	// strip all HTML tags: block elements
	html = replaceAll(html, "<[^>]*>", " ")
	// replace HTML character entities
	html = replaceAll(html, "&nbsp;", " ")
	html = replaceAll(html, "&lt;", "<")
	html = replaceAll(html, "&gt;", ">")
	html = replaceAll(html, "&quot;", "\"")
	html = replaceAll(html, "&apos;", "'")
	html = replaceAll(html, "&amp;", "&")
	return normalizeWhitespace(html)
}

func normalizeWhitespace(s string) string {
	return strings.TrimSpace(replaceAll(s, "\\s+", " "))
}

func replaceAll(src, regex, repl string) string {
	re := regexp.MustCompile(regex)
	return re.ReplaceAllString(src, repl)
}
Java
public static String visualizeHtml(String html) {
    // custom visualization using data-test-icon attribute
    html = html.replaceAll("<[^<>]+\\bdata-test-icon=\"(.*?)\".*?>", " $1 ");
    // strip all HTML tags
    html = html.replaceAll("</?(a|abbr|b|big|cite|code|em|i|small|span|strong|tt)\\b.*?>", "") // inline elements
            .replaceAll("<[^>]*>", " "); // block elements
    // replace HTML character entities
    html = html.replaceAll("&nbsp;", " ")
            .replaceAll("&lt;", "<") // must be after stripping HTML tags, to avoid creating accidental elements
            .replaceAll("&gt;", ">")
            .replaceAll("&quot;", "\"")
            .replaceAll("&apos;", "'")
            .replaceAll("&amp;", "&"); // must be last, to avoid creating accidental character entities
    return normalizeWhitespace(html);
}

public static String normalizeWhitespace(String s) {
    return s.replaceAll("\\s+", " ").trim();
}
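The comments in the Java version say that the ampersand entity must be replaced last, to avoid creating accidental character entities. A small sketch makes the reason concrete: text that contains an escaped ampersand followed by "lt;" is decoded twice if the ampersand is handled first. The two function names below are made up for this comparison, and only three entities are handled to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// decodeAmpLast replaces &amp; after the other entities, as visualizeHtml does.
func decodeAmpLast(s string) string {
	s = strings.ReplaceAll(s, "&lt;", "<")
	s = strings.ReplaceAll(s, "&gt;", ">")
	s = strings.ReplaceAll(s, "&amp;", "&")
	return s
}

// decodeAmpFirst replaces &amp; before the other entities.
// The "&" it produces can combine with the following text into a new entity,
// which the later replacements then decode a second time.
func decodeAmpFirst(s string) string {
	s = strings.ReplaceAll(s, "&amp;", "&")
	s = strings.ReplaceAll(s, "&lt;", "<")
	s = strings.ReplaceAll(s, "&gt;", ">")
	return s
}

func main() {
	// Escaped form of the visible text: Use &lt; to write <
	escaped := "Use &amp;lt; to write &lt;"

	fmt.Println(decodeAmpLast(escaped))
	fmt.Println(decodeAmpFirst(escaped))
}
```

With the ampersand handled last, the literal text "&lt;" survives as the user would see it; with the ampersand handled first, it collapses into "<" and the visualization no longer matches the page.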
In this section, we have seen a technique for asserting HTML content that is an alternative to the CSS selector-based technique used in the rest of the article. Esko Luontola has reported great success with it, and I hope readers have success with it too!
This technique of asserting against large, complicated data structures such as HTML pages by reducing them to a canonical string version has no name that I know of. Martin Fowler suggested “stringly asserted”, and from his suggestion comes the name of this section.