What is the Human Interaction Interface (HII)? It is a term that I invented to describe a potentially useful extra layer on top of the representation layer (MVC) on the client side. The concept is similar to Google’s Material Design, a layer for separating style from design. HII, in turn, is about separating user interaction from the user interface (i.e. the representation).
O
-+-
|
/ \
---
HII
---
MV*
---
API
---
The User Interface (UI) does not cover the interaction itself, by which I mean the true physical interaction between a person and software running on a piece of hardware. We need a way to model this interaction between the physical human world and the digital computer world. You can argue that HTML forms, for example, already provide the means for this (with mouse and keyboard), but they have certain problems. First and foremost, HTML is not easy to test. Improvements in this domain would definitely be beneficial.
Although modern web browsers implement many Web APIs, they could benefit from an Interaction API. There are existing efforts in UI markup languages that abstract application development using XML. For example, VoiceXML and JavaFX (through FXML) describe applications in XML, and they could be rendered into HTML as well. Another way of looking at this is the visual flows in BPMN frameworks: only the inputs and outputs are relevant. In any case, an Interaction API would allow getting rid of HTML when defining application behavior.
When writing automated tests for a React web application, I found myself writing HTML selectors (for Selenium) in Katalon and Robot Framework. The selectors themselves are pretty brittle, but most importantly the development flow goes backwards: I was writing selectors and adding element ids after I had already finished writing the app itself. There is nothing wrong with refactoring code to make it more testable per se, but the changes I made added more code and made it messier, which is definitely not the right direction. And automated testing is a basic requirement for a high-quality app.
My tests follow the usual Arrange-Act-Assert (AAA) pattern, and they are mostly automated end-to-end tests, which allow effective testing that targets end-user value. But again, because of how web apps are coded, the tests themselves became cluttered. I noticed that I would definitely benefit from being able to clearly define the inputs required from the user and the outputs produced by the system. For example, a user registration flow could require the user’s email address, and the system would generate a user name for the registered user.
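The registration example can be written down as plain data. The following is only a minimal sketch of how such a declaration might look; `FlowSpec`, `missing_inputs` and the field names are hypothetical and do not belong to any existing framework.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FlowSpec:
    """Hypothetical description of one user flow: what goes in, what comes out."""

    name: str
    inputs: Tuple[str, ...]   # values the user must provide
    outputs: Tuple[str, ...]  # values the system must produce

    def missing_inputs(self, provided: Dict[str, str]) -> List[str]:
        """Return the required inputs that have not been supplied yet."""
        return [key for key in self.inputs if key not in provided]


# The registration flow from the text: email in, user name out.
REGISTRATION = FlowSpec(
    name="user_registration",
    inputs=("email",),
    outputs=("username",),
)

print(REGISTRATION.missing_inputs({}))                          # ['email']
print(REGISTRATION.missing_inputs({"email": "a@example.com"}))  # []
```

A test could arrange its data from `inputs` and assert against `outputs` without touching a single HTML selector.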
But what is the Human Interaction Interface (HII) then? It sits between the user and the hardware that is used for controlling both the inputs and the software. The interface defines the inputs and outputs that specific flows require. In Katalon, for instance, I end my test with an assertion which checks that all outputs for the end-user have been given. Please note that this is a final check at the end: tests often just wait for a certain (fixed) output from the system, but in my case the assertions cover the whole end-user path.
Before going any further, I want to stress that figuring out user inputs and system outputs, creating arrangements and checking assertions against outcomes is REALLY painful when you do it by selecting HTML elements or attributes and checking their values. Automated testing is painful when HTML is present. A cleaner representation leads to easier testing.
If you have ever coded VoiceXML (VXML), you will have an easier time understanding where I am going with this. Since the “computer” is a telephone and the “keyboard” is the keypad, it is easier to realize that the “computer” awaits the user’s input at a specific position. It is in a way “aware” of itself, or better said, aware of its state. Typically internet apps are controlled by the user, and thus browsers are not aware of their state. HII makes it possible to define the sequence of interaction with the computer. It also stores the relevant user inputs and system outputs of the flow. And because it is aware of its state, it knows what input is expected from the end-user. This (meta)data can be used for automated testing!
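To make the idea of a state-aware flow concrete, here is a rough sketch of a flow object that always knows which input it is waiting for and records what passes through it. The class `InteractionFlow` and its methods are invented for illustration; this is not how VXML or any browser API works.

```python
from typing import Dict, List, Optional


class InteractionFlow:
    """Hypothetical state-aware flow: it always knows which input it awaits."""

    def __init__(self, steps: List[str]) -> None:
        self._steps = list(steps)  # ordered names of the expected inputs
        self._position = 0         # index of the input currently awaited
        self.inputs: Dict[str, str] = {}   # user inputs recorded so far
        self.outputs: Dict[str, str] = {}  # system outputs recorded so far

    def expected_input(self) -> Optional[str]:
        """Name of the input the flow is waiting for, or None when done."""
        if self._position < len(self._steps):
            return self._steps[self._position]
        return None

    def provide(self, name: str, value: str) -> None:
        """Accept an input only if it is the one currently expected."""
        expected = self.expected_input()
        if name != expected:
            raise ValueError(f"expected {expected!r}, got {name!r}")
        self.inputs[name] = value
        self._position += 1

    def emit(self, name: str, value: str) -> None:
        """Record an output that the system produced for the end-user."""
        self.outputs[name] = value


flow = InteractionFlow(["email", "password"])
print(flow.expected_input())  # email
flow.provide("email", "a@example.com")
print(flow.expected_input())  # password
```

An automated test could query `expected_input()` to decide what to feed next, and read `inputs` and `outputs` at the end for its assertions.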
So how does my idea differ from VXML or similar? Well, I would borrow keywords and combine them into my idea. By keywords I mean keywords as they are defined and used in automated testing frameworks like Robot Framework or Katalon. What I really like about keywords is that they provide a quite abstract, high-level means of describing WHAT. Keywords can contain variables which can be resolved and populated dynamically, like in Ansible!
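As an illustration of keywords with dynamically resolved variables, the sketch below uses Python’s standard `string.Template`, which happens to accept the `${name}` placeholder style. The keywords `Provide Input` and `Expect Output` are hypothetical examples, not built-in keywords of Robot Framework or Katalon.

```python
from string import Template
from typing import Dict, List


def resolve_keyword(keyword: str, variables: Dict[str, str]) -> str:
    """Fill the ${...} placeholders of one keyword line with concrete values."""
    # substitute() raises KeyError if a variable is missing, so gaps surface early.
    return Template(keyword).substitute(variables)


# Hypothetical keywords that say WHAT happens, not which HTML element is touched.
FLOW_KEYWORDS: List[str] = [
    "Provide Input    email    ${email}",
    "Expect Output    username    ${username}",
]

variables = {"email": "a@example.com", "username": "user-1"}
resolved = [resolve_keyword(line, variables) for line in FLOW_KEYWORDS]

for line in resolved:
    print(line)
```

The same keyword lines can be reused with different variable sets, which keeps the flow description separate from the test data.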
But there is even more potential when you consider accessibility (A11Y): many HTML websites lack logical flow structures, which is not the case with, for instance, VXML. When the logical flow is present in HTML pages, it also makes them accessible, for example when navigating the site using a keyboard and a screen reader. Another way of putting it is that websites are often missing semantic structure, which is partially solved by accessibility work.
P.S. Retrofitting this kind of feature into HTML5 is difficult. I would suggest trying to store some of the flow data in the HTML tree, where it can be accessed with normal testing selectors… Inspiration could be taken from the Page Object Model (POM).
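One possible way to store flow data in the HTML tree is HTML5 custom `data-*` attributes. The sketch below assumes made-up attribute names (`data-hii-flow`, `data-hii-input`, `data-hii-output`) and reads them back with Python’s standard `html.parser`; a Selenium test could reach the same attributes with an ordinary attribute selector.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class HiiAttributeCollector(HTMLParser):
    """Collect the hypothetical data-hii-* attributes from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.inputs: List[str] = []
        self.outputs: List[str] = []

    def handle_starttag(
        self, tag: str, attrs: List[Tuple[str, Optional[str]]]
    ) -> None:
        for name, value in attrs:
            if value is None:
                continue  # attribute without a value carries no flow data
            if name == "data-hii-input":
                self.inputs.append(value)
            elif name == "data-hii-output":
                self.outputs.append(value)


# A made-up registration page annotated with flow metadata.
PAGE = """
<form data-hii-flow="user_registration">
  <input type="email" data-hii-input="email">
  <span data-hii-output="username"></span>
</form>
"""

collector = HiiAttributeCollector()
collector.feed(PAGE)
print(collector.inputs)   # ['email']
print(collector.outputs)  # ['username']
```

The markup can then be restyled or restructured freely, as long as the annotated inputs and outputs stay in place.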